Apache Qpid : Cluster Failover Modes
This page last changed on Feb 25, 2010 by aconway.
Qpid cluster failure modes

This section describes failure modes and techniques to deal with them, the following [...]

Broker process terminated
E.g. broker killed.
Clients: disconnected immediately, can fail over to another broker in [...]
Multicast group: broker is automatically removed from the multicast group.
The broker needs to be manually restarted.

Broker host crash
E.g. power failure, hardware failure.
Clients: may not detect loss of connection until a long TCP timeout is [...]
Multicast group: broker is automatically removed from the multicast [...]

Broker freeze
E.g. using kill -STOP.
Clients: disconnected after TCP timeout; use heartbeats to disconnect quicker.
Multicast group: broker is not automatically [...] can eventually hold up [...]
The broker needs to be manually restarted.

Client-broker network failure
Clients: disconnected after TCP timeout; use heartbeats to disconnect quicker.
Broker: cleans up client resources (e.g. auto-delete queues) when [...]

Broker-broker multicast network failure
A failure in the multicast network creates a "partition" creating two [...]
To deal with this situation, you need cman's quorum service. In the [...]
Alternatively, to avoid partitions entirely you can use the [...]
Brokers that shut down need to be manually restarted.

Broker-broker update network failure
New brokers joining the cluster receive an initial state snapshot from [...]
The broker must be manually restarted.
Note: as of qpid 0.6 the update connections are made using the same URL [...]

Client crash
Broker: client resources such as auto-delete queues are reclaimed [...]

Client host crash
Broker: client resources such as auto-delete queues are reclaimed after the TCP time-out.
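Several of the failure modes above recommend heartbeats so that a dead peer is noticed sooner than the TCP timeout allows. The idea can be sketched in a few lines of C++. This is an illustration only and not part of the Qpid client API: the HeartbeatMonitor class, its method names, and the choice of two tolerated idle intervals are all assumptions made for the example.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper for illustration; NOT part of the Qpid API.
// Tracks when traffic was last seen from a peer and reports the peer
// as dead once it has been silent for a number of heartbeat intervals.
class HeartbeatMonitor {
public:
    using Clock = std::chrono::steady_clock;

    // interval:      heartbeat interval agreed for the connection
    // allowedMisses: idle intervals tolerated before giving up
    //                (the value is the caller's choice, an assumption here)
    HeartbeatMonitor(std::chrono::seconds interval, int allowedMisses,
                     Clock::time_point now)
        : interval_(interval), allowedMisses_(allowedMisses), lastTraffic_(now) {
        assert(interval.count() > 0 && allowedMisses > 0);
    }

    // Call whenever any frame (data or heartbeat) arrives from the peer.
    void onTraffic(Clock::time_point now) { lastTraffic_ = now; }

    // True once the peer has been silent for allowedMisses whole intervals.
    bool peerDead(Clock::time_point now) const {
        return now - lastTraffic_ >= interval_ * allowedMisses_;
    }

private:
    std::chrono::seconds interval_;
    int allowedMisses_;
    Clock::time_point lastTraffic_;
};
```

With a one-second interval and two tolerated misses, a frozen or unreachable peer is reported after two seconds of silence, whereas a TCP timeout can take far longer. Passing the current time in explicitly keeps the sketch easy to exercise without real sleeps.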
Configuration

Separate client/multicast networks
For best performance use a separate network for clients and the [...]

openais.conf/corosync.conf
totem.token: timeout in milliseconds until host crash or network [...]
Redundant ring protocol (RRP): uses two physically separate networks [...]
active: Active replication can offer slightly lower latency in faulty network environments; however, it can reduce throughput.
To enable RRP make the following changes to corosync.conf (for RHEL6) or openais.conf (for RHEL5):
1. In the totem section, add rrp_mode=active or rrp_mode=passive

qpidd configuration options
cluster-url: specify addresses that clients will use to connect. Can [...]
Note: a future release will provide cluster-update-url to allow updates [...]

watchdog plugin
The watchdog plug-in will kill the qpidd broker process if it [...]
If the watchdog plugin is loaded and the --watchdog-interval=N [...]
The watchdog process runs a very simple program that starts a timer [...]
This is useful in a cluster setting because in some instances [...]

cman configuration
Note: when using cman, do not start the openais/corosync service. It [...]
Only basic cman configuration (cluster.conf) is required. Other [...]

Enabling heartbeats
In C++ clients, heartbeat is disabled by default. You can enable [...]

    ConnectionSettings settings;
    [...]

In a JMS client, heartbeat is set using the idle_timeout property of [...]

    connectionfactory.qpidConnectionfactory = amqp://guest:guest@clientid/test?brokerlist='tcp://localhost:5672',idle_timeout=3

Heartbeats are enabled in both directions, the connection can be [...]

References
Cman configuration: See chapters 3 & 5 of http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5.4/html/Cluster_Administration/index.html
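As an illustration of the totem settings discussed under openais.conf/corosync.conf, a totem section with a token timeout and RRP over two rings might look roughly like the fragment below. It is a sketch under stated assumptions, not a tested configuration: the network addresses, multicast addresses, port numbers and the token value are placeholders chosen for the example, and the second interface block reflects general corosync usage rather than anything stated on this page. Note that the file itself uses the "key: value" form.

```
totem {
    version: 2

    # Timeout in milliseconds (illustrative value, an assumption)
    token: 1000

    # Redundant ring protocol: active or passive
    rrp_mode: passive

    # First ring (placeholder addresses)
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.1.0
        mcastaddr: 226.94.1.1
        mcastport: 5405
    }

    # Second ring on a physically separate network (placeholder addresses)
    interface {
        ringnumber: 1
        bindnetaddr: 10.0.0.0
        mcastaddr: 226.94.1.2
        mcastport: 5405
    }
}
```

Check the corosync.conf or openais.conf manual page shipped with your release before using any of these values.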
Document generated by Confluence on May 26, 2010 10:31