Apache Qpid : AMQP breakdown for clustering
This page last changed on Jul 14, 2008 by aconway.
AmqpBreakdownBreakdown of AMQP 0-10 commands and controls for clustering.DefinitionsReplica: broker member of the cluster. Connected replica: replica directly connected to the client. Connected session: Attahced session on the connected replica. Shadow session: Backup of active session state on non-connected replica. Detached session: Session not attached to any replica, awaiting timeout. Replicate: Multicast to cluster, defer completing some action till cluster responds. Inferred: Change on the connected replica that can be inferred by other replicas because it is a deterministic response to some replicated change (e.g. responding to a query command) Types of state.
Cluster qualities of service:
We examine the impact on cluster state of each AMQP control & command, and the implications for what needs replication. Connection controlsNone of the connection controls MUST be replicated, they do not affect state. Connection creation/destruction MAY be replicated if using shadow connections to organize sessions. Session controlsPerformance note: Session controls other than completed are not sent on a per-message basis so are not critical path. Failover requires that all replicas know:
This means:
Recieveattach,detach: MUST replicate. Replicas must know all sessions & attachment. request-timeout: MUST replicate - replicas should respect timeout. See other events below. command-point, gap, expected: MUST replicate for consistent command numbering. confirmed: Can be ignored as per spec. flush: No need to replicate, only attached replica need respond. known-completed: MUST replicate so replicas can avoid wrap-around. on their unknown-complete set. completed: Replicas need client completions to bound their replay list. They do not need immediate replication since completions will be re-sent as part of failover. They MUST known of completions before sending a known-complete to client since the client will no longer notify completion of known-complete commands. So we can do one of
Sendattached, detached, timeout, command-point, expected,confirmed, flush: No effect on replicated state. known-completed: See receive completed above. gap: Not used. completed: Only need replication for persistent commands, i.e. commands that change persistent shared state. Persistent commands and the async store.Persistent commands guarantee that once completed their effects are stored persistently and will survive total broker shutdown. For the strongest guarantee it must be persisted on all persistent replicas in the cluster:
A more performant implementation with a weaker guarantee would send completed when the local async store completes with no async notifications from the cluster. The risk: client receives completed, local disk is destroyed, rest of cluster shuts down before storing, message is lost. Need to determine if that's an acceptable risk in general, or perhaps offer configurable choice. Detached session timeouts.Having each replica indepenedently destroy timed-out sessions creates a race: a client could resume a session on ne replica concurrently with the timeout expiring and session state being destroyed on other replicas. To avoid this we choose an arbitrary cluster member to mcast "session timeout" events when sessions time out. This woudl be the primary in active/passive mode or an arbitrarily chosen member (e.g. the oldest member) in active-active mode. Execution commandsReceivesync: No need to replicate, no effect on replicated state. exception: MUST replicate, causes destruction of session state. transfer: MUST replicate before sending completed. Sendsync,result, exception: inferred. Message commandsReceiveDefinition: an incoming message transfer is finished when:
Before a message is finished it can be re-queued in the event of a client disconnect. transfer: MUST replicate, update queue content. acquire, release, accept, reject: See dequeue management below. resume: Not implemented. subscribe, cancel, set-flow-mode, flow, flush, stop: MUST replicate subscription state for replay. Sendtransfer: MUST replicate for shared state & replay. Need not replicate content if it can be inferred. See "Persistent commands and the async store" above and "Dequeue management" note below. acquire, release, accept, reject: See dequeue management below. stop: Dequeue management.Enqueue is straightforward: incoming transfers are replicated. Dequeue provides more options:
Active-active no owners: Replicate all incoming message commands. Replicate dequeue descisions that cannot be inferred. In our current IO-driven model this means all dequeues. Active/passive: Active broker does dequeues. Can avoid/defer replication till of incoming acquire, release, accept, reject and outgoing transfer till messages are finished - i.e. till sending completion for accept or for transfer in implicit-accept mode. At this point it may be possible to batch the events. Note that events must still be sent even if batched: all replicas need to know which messages were removed from which queues, and the order to replay them in the event of fail-over. Tx commandsselect, commit, rollback: MUST replicate. All replicas must commit/rollback consistently. Dtx commandsselect,start,end,commit,forget,get-timeout,prepare,recover,rollback,set-timeout: MUST replicate so all replicas know which commands are within transactions. All replicas must join the DTX so all will commit/fail under control of DTX manager. Exchange commandsdeclare,delete,bind,unbind: MUST replicate, update wiring. bound,query: MUST replicate replicas can respond in the event of failover. Queue commandsdeclare,delete,purge: MUST replicate, update wiring/queue content. query: MUST replicate replicas can respond in the event of failover. File & Stream commands not implemented. |
![]() |
Document generated by Confluence on May 26, 2010 10:31 |