Helix Tutorial
In this tutorial, we will cover the roles of a Helix-managed cluster, and show the code you need to write to integrate with it. In many cases, there is a simple default behavior that is often appropriate, but you can also customize the behavior.
Convention: we first cover the basic approach, which is the easiest to implement. Then, we'll describe advanced options, which give you more control over the system behavior, but require you to write more code.
Prerequisites
- Read Concepts/Terminology and Architecture
- Read the Quickstart guide to learn how Helix models and manages a cluster
- Install Helix source. See: Quickstart for the steps.
Tutorial Outline
- Participant
- Spectator
- Controller
- Rebalancing Algorithms
- User-Defined Rebalancing
- State Machines
- Messaging
- Customized health check
- Throttling
- Application Property Store
- Admin Interface
- YAML Cluster Setup
- Helix Agent (for non-JVM systems)
- Task Framework
- Helix REST Service 2.0
- Helix UI Setup
- Helix Customized View
- Helix Cloud Support
- Helix Distributed Lock
Preliminaries
First, we need to set up the system. Let's walk through the steps in building a distributed system using Helix.
Start ZooKeeper
This starts a zookeeper in standalone mode. For production deployment, see Apache ZooKeeper for instructions.
./start-standalone-zookeeper.sh 2199 &
Create a Cluster
Creating a cluster will define the cluster in appropriate znodes on ZooKeeper.
Using the Java API:
// Create setup tool instance
// Note: ZK_ADDRESS is the host:port of Zookeeper
String ZK_ADDRESS = "localhost:2199";
admin = new ZKHelixAdmin(ZK_ADDRESS);
String CLUSTER_NAME = "helix-demo";
//Create cluster namespace in zookeeper
admin.addCluster(CLUSTER_NAME);
OR
Using the command-line interface:
./helix-admin.sh --zkSvr localhost:2199 --addCluster helix-demo
Configure the Nodes of the Cluster
First we'll add new nodes to the cluster, then configure the nodes in the cluster. Each node in the cluster must be uniquely identifiable. The most commonly used convention is hostname:port.
String CLUSTER_NAME = "helix-demo";
int NUM_NODES = 2;
String hosts[] = new String[]{"localhost","localhost"};
String ports[] = new String[]{"7000","7001"};
for (int i = 0; i < NUM_NODES; i++)
{
InstanceConfig instanceConfig = new InstanceConfig(hosts[i]+ "_" + ports[i]);
instanceConfig.setHostName(hosts[i]);
instanceConfig.setPort(ports[i]);
instanceConfig.setInstanceEnabled(true);
//Add additional system specific configuration if needed. These can be accessed during the node start up.
instanceConfig.getRecord().setSimpleField("key", "value");
admin.addInstance(CLUSTER_NAME, instanceConfig);
}
Configure the Resource
A resource represents the actual task performed by the nodes. It can be a database, index, topic, queue or any other processing entity. A resource can be divided into many sub-parts known as partitions.
Define the State Model and Constraints
For scalability and fault tolerance, each partition can have one or more replicas. The state model allows one to declare the system behavior by first enumerating the various STATES, and the TRANSITIONS between them. A simple model is ONLINE-OFFLINE where ONLINE means the task is active and OFFLINE means it's not active. You can also specify how many replicas must be in each state, these are known as constraints. For example, in a search system, one might need more than one node serving the same index to handle the load.
The allowed states:
- LEADER
- STANDBY
- OFFLINE
The allowed transitions:
- OFFLINE to STANDBY
- STANDBY to OFFLINE
- STANDBY to LEADER
- LEADER to STANDBY
The constraints:
- no more than 1 LEADER per partition
- the rest of the replicas should be STANDBYs
The following snippet shows how to declare the state model and constraints for the LEADER-STANDBY model.
String STATE_MODEL_NAME = "LeaderStandby";
StateModelDefinition.Builder builder = new StateModelDefinition.Builder(STATE_MODEL_NAME);
// Define your own states: those are opaque strings to Helix
// Only the topology of the state machine (initial state, transitions, priorities, final DROPPED state) is meaningful to Helix
String LEADER = "LEADER";
String STANDBY = "STANDBY";
String OFFLINE = "OFFLINE";
// Add states and their rank to indicate priority. A lower rank corresponds to a higher priority
builder.addState(LEADER, 1);
builder.addState(STANDBY, 2);
builder.addState(OFFLINE);
// Note the special inclusion of the DROPPED state (REQUIRED)
builder.addState(HelixDefinedState.DROPPED.name());
// Set the initial state when the node starts
builder.initialState(OFFLINE);
// Add transitions between the states.
builder.addTransition(OFFLINE, STANDBY);
builder.addTransition(STANDBY, OFFLINE);
builder.addTransition(STANDBY, LEADER);
builder.addTransition(LEADER, STANDBY);
// There must be a path to DROPPED from each state (REQUIRED)
builder.addTransition(OFFLINE, HelixDefinedState.DROPPED.name());
// set constraints on states
// static constraint: upper bound of 1 LEADER
builder.upperBound(LEADER, 1);
// dynamic constraint: R means it should be derived based on the replication factor for the cluster
// this allows a different replication factor for each resource without
// having to define a new state model
builder.dynamicUpperBound(STANDBY, "R");
StateModelDefinition myStateModel = builder.build();
admin.addStateModelDef(CLUSTER_NAME, STATE_MODEL_NAME, myStateModel);
Assigning Partitions to Nodes
The final goal of Helix is to ensure that the constraints on the state model are satisfied. Helix does this by assigning a state to a partition (such as LEADER, STANDBY), and placing it on a particular node.
There are 3 assignment modes Helix can operate in:
- FULL_AUTO: Helix decides the placement and state of a partition.
- SEMI_AUTO: Application decides the placement but Helix decides the state of a partition.
- CUSTOMIZED: Application controls the placement and state of a partition.
For more information on the assignment modes, see the Rebalancing Algorithms section of this tutorial.
String RESOURCE_NAME = "MyDB";
int NUM_PARTITIONS = 6;
String STATE_MODEL_NAME = "LeaderStandby";
String MODE = "SEMI_AUTO";
int NUM_REPLICAS = 2;
admin.addResource(CLUSTER_NAME, RESOURCE_NAME, NUM_PARTITIONS, STATE_MODEL_NAME, MODE);
admin.rebalance(CLUSTER_NAME, RESOURCE_NAME, NUM_REPLICAS);