---
# Copyright (C) 2009, Progress Software Corporation and/or its
# subsidiaries or affiliates. All rights reserved.
title: Network Design
in_menu: true
sort_info: 2
--- name:overview pipeline:haml,tags

%h1 A Scalable Network Design
%p
  .author{:style=>"font-size:10pt;"}
    %b Author:
    %a{:href=>"mailto:hiram@hiramchirino.com"} Hiram Chirino
  An approach to scaling a network of brokers, also known as a federation of brokers.

--- name:content pipeline:haml,tags

.left
  %h1 The Problem
.right
  %p
    The network of brokers implementation in ActiveMQ 5.x cannot scale to a large number of broker nodes, destinations, or consumer endpoints. The advisory model of propagating consumer demand does not scale well in large networks.

.left
  %h1 The Solution
.right
  %p
    The solution proposed in this document is to partition the destinations over the brokers in the network. In this scheme, every destination has one broker designated as its master. Every other broker on the network will send to or subscribe to the destination's master if it has producers or consumers attached to the destination (respectively).
  %p
    By using a master broker, all message routing for a destination is centralized on one well-known broker, so it does not require advisory messages to implement. It only needs consistent hashing, which has been a proven approach to scaling web infrastructure and caching systems like memcached.

.left
  %h1 Example Topology
.right
  %p
    The diagram to the right illustrates an example topology of the solution. Like in ActiveMQ 5.x, a named queue can exist on multiple brokers. Only one of the queues will be the master and the others can be considered slaves. The diagram highlights the master queues in green.
    %img{:src=>"images/diagram-1.png", :style=>"float:right"}
    For example, the
    %em queue://A
    master is on
    %em Broker 2
    but there are slaves on both
    %em Broker 1
    and
    %em Broker 3.
    Note that the master queues are evenly distributed across all brokers in the network.
  %p
    Forwarding bridges will be established by the network connectors on the brokers with the slave queues. When a slave queue does not have any subscriptions but is not empty, it should create a forwarding bridge to send its messages to the master. When a slave queue has subscriptions, it should create a forwarding bridge to receive messages from the master. The arrows to the left of the brokers in the diagram illustrate the flow of messages across the bridges. The network connector on nodes with a slave queue that has a consumer attached should create a bridging subscription to the master queue.

.left
  %h1 Enter the Hash Ring
.right
  %p
    This solution relies on all the brokers in the network picking the same broker as the master of a destination. Since we also need to scale to a large number of destinations, using a registry to track which broker is the master of each destination would also not scale well, not to mention the problems associated with notifying brokers of master changes for each destination.
  %p
    Luckily, brokers can use a consistent hashing algorithm to uniformly distribute and map destinations to broker nodes. The algorithm can be easily visualized by drawing a ring that represents all the possible integer values. Nodes and resources (like destinations) get hashed onto positions on the ring. The master
    %img{:src=>"images/diagram-2.png", :style=>"float:right"}
    node for a resource is the next node on the ring. The blue section of the ring in the diagram to the right illustrates the hash values that would map to the
    %em Broker 2
    node.
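  %p
    To make the mapping concrete, here is a minimal Java sketch of a consistent hash ring. It is only an illustration of the technique; the class name, method names, and hash function are assumptions made for this example, not actual broker code.
  %pre
    :preserve
      import java.util.ArrayList;
      import java.util.Iterator;
      import java.util.TreeMap;

      // Minimal, illustrative consistent hash ring. Broker names are hashed onto
      // a ring of integer positions; the master broker for a destination is the
      // first broker found at or after the destination's position, wrapping
      // around the ring. Names and the hash function are assumptions for this
      // sketch, not the actual HashRing implementation.
      public class ConsistentHashRing {

          // Maps a ring position to the broker name hashed to that position.
          private final TreeMap ring = new TreeMap();

          public void addBroker(String brokerName) {
              ring.put(hash(brokerName), brokerName);
          }

          public void removeBroker(String brokerName) {
              ring.remove(hash(brokerName));
          }

          // The master broker for a destination is the next broker on the ring.
          public String getMaster(String destinationName) {
              return getMasters(destinationName, 1)[0];
          }

          // Walk the ring from the destination's position to collect the master
          // plus any additional masters or backups.
          public String[] getMasters(String destinationName, int count) {
              ArrayList result = new ArrayList();
              Integer start = hash(destinationName);
              collect(ring.tailMap(start).values().iterator(), result, count);
              collect(ring.headMap(start).values().iterator(), result, count);
              return (String[]) result.toArray(new String[result.size()]);
          }

          private void collect(Iterator brokers, ArrayList result, int count) {
              while (brokers.hasNext()) {
                  if (result.size() == count) {
                      return;
                  }
                  result.add(brokers.next());
              }
          }

          // String.hashCode keeps the sketch short; a stronger hash of the name
          // gives a more uniform spread of positions around the ring.
          private Integer hash(String name) {
              return Integer.valueOf(name.hashCode());
          }
      }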
  %p
    This algorithm has been implemented in the activemq-util module as the
    %a{:href=>"http://fisheye6.atlassian.com/browse/activemq/sandbox/activemq-flow/activemq-util/src/main/java/org/apache/activemq/util/HashRing.java?r=HEAD"}
      %em HashRing
    class.
  %p
    The hash ring only needs to be provided with a list of all the broker nodes, and it can then determine which node is the master for any given destination.
  %p
    There are several important attributes to note about how a hash ring maps destinations to broker nodes. Firstly, by default destinations will be uniformly partitioned across all the broker nodes. Furthermore, a node can be 'weighted' so that it is assigned fewer or more destinations. Adding or removing a node will only change the master of 1/N of all the destinations, N being the number of nodes in the network. The ring can also be iterated to get backup nodes to the master.

.left
  %h1 Non-Uniform Load
.right
  %p
    A hash ring will uniformly distribute destinations across the brokers, and if all destinations were uniformly utilized then the brokers would also be uniformly utilized. Unfortunately, some destinations will move a higher load of messages than others. Therefore, it will be desirable to associate high-load destinations with a different list of brokers which are dedicated to handling the high load.
  %p
    In more extreme cases the load on a single destination will be higher than a single broker can handle. For these cases, multiple masters will be needed for the destination. Slave destinations creating network bridges to and from the masters will have to distribute themselves evenly across the masters so that the load can be parallelized. This of course assumes that there are multiple producing and consuming slave destinations.
  %p
    A possible implementation approach would be to configure a destination as having N masters. When that's the case, we use the hash ring iteration feature to pick the first N nodes for the destination as the masters.
    TODO: Load balancing the slave brokers across the master brokers is very important. They could just randomly pick a master, but something more deterministic may be better.

.left
  %h1 Reducing Broker Hops
.right
  %p
    Implementing destination partitioning at the network connector abstraction layer poses the problem that, in the worst case, a message has to traverse 4 network hops to get from the producer client to the consumer client.
  %p
    If latency is a major factor, using embedded brokers and attaching them to the network will reduce the worst case by 2 hops. The downside is that the topology is more complex to manage, since the brokers and the application are collocated in the same JVM.
  %p
    In theory, clients could directly connect to the master brokers for the destinations that will be used. If this is done, then the max hop count goes down by 2 once again. The one complication is that if transactions are being used, then all the destinations in the transaction will have to go to 1 broker even if it is not the master for all the destinations involved in the transaction; otherwise the client would have to coordinate a distributed transaction.

.left
  %h1 WAN Considerations
.right
  %p
    WAN broker setups are typically set up with a central broker or cluster in the main data center and a remote broker per remote office/location, which is accessed over a slow network connection.
  %p
    To support this topology, we would need to be able to run multiple brokers in the data center and have only those brokers in the list of candidate destination masters. The remote brokers would then never be elected as masters for a destination, and therefore the only data sent and received by the remote brokers would be the data needed by the remote endpoints.
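  %p
    As an illustration of that idea (reusing the hypothetical ring sketch from above, not actual broker configuration), only the data center brokers would ever be added to the ring:
  %pre
    :preserve
      // Illustrative only: the data-center brokers are the only nodes added to
      // the ring, so a remote office broker can never be elected as a
      // destination master. Broker and destination names here are made up.
      public class WanMasterListSketch {
          public static void main(String[] args) {
              ConsistentHashRing masters = new ConsistentHashRing();
              masters.addBroker("dc-broker-1");
              masters.addBroker("dc-broker-2");
              masters.addBroker("dc-broker-3");

              // A remote office broker consults the same ring to decide where to
              // bridge each destination, but is never a member of the ring itself.
              System.out.println(masters.getMaster("queue://ORDERS"));
          }
      }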
.left
  %h1 Topic Complexities
.right
  %p
    In the case of topic destinations, if there are multiple master nodes for a destination, then the network connector should broadcast the message to all the master nodes. But to efficiently support the WAN case, a remote broker should be able to send the message to 1 master and request that it broadcast the message to the other master nodes. This option would in effect increase the max hop count by 1.
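  %p
    A hedged sketch of the two forwarding modes (hypothetical names, reusing the earlier ring sketch; not an existing broker API) might look like this:
  %pre
    :preserve
      // Illustrative only: how a network connector on a slave or remote broker
      // might forward a topic message when the destination has several master
      // nodes. Class and method names are assumptions made for this sketch.
      public class TopicForwardingSketch {

          // LAN case: broadcast the message directly to every master node.
          static void broadcastToMasters(ConsistentHashRing ring, String topic, int masterCount, Object message) {
              for (String master : ring.getMasters(topic, masterCount)) {
                  send(master, message, false);
              }
          }

          // WAN case: send the message to a single master and ask it to relay
          // the message to the other masters, trading one extra hop for less
          // traffic over the slow WAN link.
          static void forwardViaOneMaster(ConsistentHashRing ring, String topic, Object message) {
              send(ring.getMaster(topic), message, true);
          }

          // Placeholder for whatever transport the forwarding bridge would use;
          // the boolean asks the receiving master to rebroadcast to the others.
          static void send(String brokerName, Object message, boolean relayToOtherMasters) {
              // intentionally left empty in this sketch
          }
      }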