~~ Licensed to the Apache Software Foundation (ASF) under one or more
~~ contributor license agreements.  See the NOTICE file distributed with
~~ this work for additional information regarding copyright ownership.
~~ The ASF licenses this file to You under the Apache License, Version 2.0
~~ (the "License"); you may not use this file except in compliance with
~~ the License.  You may obtain a copy of the License at
~~
~~     http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License.

Introduction

  Apache Ambari™ is a monitoring, administration, and lifecycle management
  project for Apache Hadoop™ clusters. Hadoop clusters require many
  inter-related components that must be installed, configured, and managed
  across the entire cluster. The set of components currently supported by
  Ambari includes:

  * {{{http://hbase.apache.org} Apache HBase™}}

  * {{{http://incubator.apache.org/hcatalog} Apache HCatalog™}}

  * {{{http://hadoop.apache.org/hdfs} Apache Hadoop HDFS™}}

  * {{{http://hive.apache.org} Apache Hive™}}

  * {{{http://hadoop.apache.org/mapreduce} Apache Hadoop MapReduce™}}

  * {{{http://pig.apache.org} Apache Pig™}}

  * {{{http://zookeeper.apache.org} Apache ZooKeeper™}}

  []

  Ambari's audience is operators responsible for managing Hadoop clusters.
  It allows them to:

  * Deploy and configure Hadoop

    * Define a set of nodes as a cluster

    * Assign roles to particular nodes, or let Ambari pick a mapping for them

    * Override the default versions of components or configure particular
      values

  * Upgrade a cluster

    * Modify the versions or configuration of each component

    * Upgrade easily without losing data

  * Perform monitoring and other maintenance tasks

    * Check which servers are currently running across the cluster

    * Start and stop Hadoop services (such as HDFS, MapReduce, and HBase)

  * Integrate with other tools

    * Provide a REST interface for defining or manipulating clusters

  []

  Ambari provides a REST interface, a command line interface, and a graphical
  interface. The command line and graphical interfaces are implemented on top
  of the REST interface, and all three offer the same functionality. The
  graphical interface is browser-based, using JSON and JavaScript.

  Ambari requires that the base operating system has already been deployed
  and is managed via existing tools, such as Chef or Puppet. Ambari is solely
  focused on simplifying the configuration and management of the Hadoop
  stack. Ambari does support adding third-party software packages to be
  deployed as part of the Hadoop cluster.

Key concepts

  * <Nodes> are machines in the datacenter that are managed by Ambari to run
    Hadoop clusters.

  * <Components> are the individual software products that are installed to
    create a complete Hadoop cluster. Some components are active and include
    servers, such as HDFS, and some are passive libraries, such as Pig. The
    servers of active components provide a <service>.

  * Components consist of <roles> that represent the different configurations
    required by the component. Components have a client role and a role for
    each server. HDFS roles, for example, are 'client,' 'namenode,'
    'secondary namenode,' and 'datanode.' The client role installs the client
    software and configuration, while each server role installs the
    appropriate software and configuration.

  * <Stacks> define the software and configuration for a cluster. Stacks can
    inherit from each other and only need to specify the parts that differ
    from their parent. Thus, although stacks can specify the version for each
    component, most will not.

  * A <cluster definition> uses a stack and a set of nodes to form a cluster.
    When a cluster is defined, the user may specify the nodes for each role
    or let Ambari automatically assign the roles based on the nodes'
    characteristics. A cluster's state can be active, inactive, or retired.
    Active clusters will be started; inactive clusters keep their nodes
    reserved but will be stopped; retired clusters keep their definition, but
    their nodes are released.

  []

Configuration

  Ambari abstracts cluster configuration into groups of string key/value
  pairs. This abstraction lets us manage and manipulate the configurations in
  a consistent and component-agnostic way. The groups are named for the file
  that they end up in, and the set of groups is defined by the set of
  components. For Hadoop, the groups are:

  * hadoop/hadoop-env

  * hadoop/capacity-scheduler

  * hadoop/core-site

  * hadoop/hdfs-site

  * hadoop/log4j.properties

  * hadoop/mapred-queue-acl

  * hadoop/mapred-site

  * hadoop/metrics2.properties

  * hadoop/task-controller

  []

* Configuration example

  Although users will typically define configurations via the web UI, it is
  useful to examine a sample JSON expression that would define a
  configuration in the REST API.

------
{
  "hadoop/hadoop-env": {
    "HADOOP_CONF_DIR": "/etc/hadoop",
    "HADOOP_NAMENODE_OPTS": "-Dsecurity.audit.logger=INFO,DRFAS",
    "HADOOP_CLIENT_OPTS": "-Xmx128m"
  },
  "hadoop/core-site": {
    "fs.default.name": "hdfs://${namenode}:8020/",
    "hadoop.tmp.dir": "/grid/0/hadoop/tmp",
    "hadoop.security.authentication": "kerberos"
  },
  "hadoop/hdfs-site": {
    "hdfs.user": "hdfs"
  }
}
------
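  The <<hdfs://${namenode}:8020/>> value above shows how a configuration
  value can refer to the node that holds a particular role; the reference is
  filled in when the cluster is configured (see the Stacks section below).
  The following is a minimal sketch of that substitution step, assuming a
  hypothetical <<ConfigExpander>> helper and a simple role-to-host map;
  neither is part of Ambari's published interface.

------
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: expand ${role} references in configuration values. */
public class ConfigExpander {
  private static final Pattern REF = Pattern.compile("\\$\\{([^}]+)\\}");

  /** Replace each ${role} reference with the host assigned to that role. */
  public static String expand(String value, Map<String, String> roleToHost) {
    Matcher m = REF.matcher(value);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String host = roleToHost.get(m.group(1));
      // Leave unknown references untouched rather than guessing.
      m.appendReplacement(out,
          Matcher.quoteReplacement(host != null ? host : m.group()));
    }
    m.appendTail(out);
    return out.toString();
  }

  public static void main(String[] args) {
    Map<String, String> roles = new HashMap<String, String>();
    roles.put("namenode", "node000.example.com");
    // Prints: hdfs://node000.example.com:8020/
    System.out.println(expand("hdfs://${namenode}:8020/", roles));
  }
}
------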
Stacks

  Stacks form the basis of defining what software needs to be installed and
  run, and the configuration for that software. Rather than having the
  administrator define the entire stack from scratch, stacks inherit most of
  their properties from their parent. This allows the administrator to take a
  default stack and modify only the properties that need to be changed,
  without dealing with a lot of boilerplate.

  Stacks include a list of repositories that contain the rpms or tarballs.
  The repositories will be searched in the given order; if the required
  component versions are not found, the next one will be searched. If the
  required file still isn't found, the parent stack's repository list will be
  searched, and so on.

  Stacks define the version of each component that they need. Most of the
  versions will come from the stack, but the operator can override the
  version as needed.

  The stack defines the configuration parameters to be used by this stack. To
  keep the stacks generic, the configuration values may refer to the nodes
  that hold a particular role. Thus, <<fs.default.name>> may be configured to
  <<hdfs://${namenode}:8020/>> and the name of the namenode will be filled in
  during configuration. A few configuration settings need to be set
  exclusively for particular roles. For example, the NameNode needs to enable
  the https security option.

* Stack example

  Here's an example JSON expression for defining a stack.

------
{
  "parent": "site", /* declare parent as site, r42 */
  "parent-revision": "42",
  "repositories": {
    "yum": ["http://incubator.apache.org/ambari/stack/yum"],
    "tar": ["http://incubator.apache.org/ambari/stack/tar"]
  },
  "configuration": { /* define the general configuration */
    "hadoop/hadoop-env": {
      "HADOOP_CONF_DIR": "/etc/hadoop",
      "HADOOP_NAMENODE_OPTS": "-Dsecurity.audit.logger=INFO,DRFAS",
      "HADOOP_CLIENT_OPTS": "-Xmx128m"
    },
    "hadoop/core-site": {
      "fs.default.name": "hdfs://${namenode}:8020/",
      "hadoop.tmp.dir": "/grid/0/hadoop/tmp",
      "hadoop.security.authentication": "kerberos"
    },
    "hadoop/hdfs-site": {
      "hdfs.user": "hdfs"
    }
  },
  "components": {
    "common": {
      "version": "0.20.204.1", /* define a new version for common */
      "arch": "i386"
    },
    "hdfs": {
      "roles": {
        "namenode": { /* override one value on the namenode */
          "hadoop/hdfs-site": {
            "dfs.https.enable": "true"
          }
        }
      }
    },
    "pig": {
      "version": "0.9.0"
    }
  }
}
------
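  Because a stack only specifies what differs from its parent, resolving a
  stack's effective configuration amounts to layering each child's settings
  over its parent's. The sketch below illustrates the idea; the <<Stack>>
  shape used here is an assumption for illustration, not Ambari's actual
  internal representation.

------
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: resolve a stack's effective configuration. */
public class StackResolver {
  /** A stack is a parent pointer plus the settings it overrides. */
  static class Stack {
    Stack parent; // null for the root stack
    Map<String, Map<String, String>> configuration = new HashMap<>();
  }

  /** Resolve the parent first, then layer this stack's groups over it. */
  static Map<String, Map<String, String>> resolve(Stack stack) {
    Map<String, Map<String, String>> result = (stack.parent == null)
        ? new HashMap<>()
        : resolve(stack.parent);
    for (Map.Entry<String, Map<String, String>> group
        : stack.configuration.entrySet()) {
      result.computeIfAbsent(group.getKey(), k -> new HashMap<>())
            .putAll(group.getValue()); // child values win over parent values
    }
    return result;
  }
}
------

  Repository lookup would follow the same pattern: the stack's own repository
  list is searched in order before falling back to the parent's list.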
------ { "parent": "site", /* declare parent as site, r42 */ "parent-revision": "42", "repositories": { "yum": ["http://incubator.apache.org/ambari/stack/yum"], "tar": ["http://incubator.apache.org/ambari/stack/tar"] }, "configuration": { /* define the general configuration */ "hadoop/hadoop-env": { "HADOOP_CONF_DIR": "/etc/hadoop", "HADOOP_NAMENODE_OPTS": "-Dsecurity.audit.logger=INFO,DRFAS", "HADOOP_CLIENT_OPTS": "-Xmx128m" }, "hadoop/core-site": { "fs.default.name" : "hdfs://${namenode}:8020/", "hadoop.tmp.dir" : "/grid/0/hadoop/tmp", "hadoop.security.authentication" : "kerberos", } "hadoop/hdfs-site": { "hdfs.user": "hdfs" } } "components": { "common": { "version": "0.20.204.1" /* define a new version for common */ "arch": "i386" }, "hdfs": { "roles": { "namenode": { /* override one value on the namenode */ "hadoop/hdfs-site": { "dfs.https.enable": "true" } } } }, "pig": { "version": "0.9.0" } } } ------ Component Definitions We are designing the Ambari infrastructure with a generic interface for defining components. The current version of Ambari doesn't publicize the interface, but the intention is to open it up to support thirrd party components. Ambari will search the configured repositories for the component definition and use that definition to install, manage, run, and remove the component. To have consistency in the architecture, the standard Hadoop services will also be plugged in to Ambari using the same mechanism. The component definitions are written as a text file that provides the commands to perform each kind of action, such as install, start, stop, or remove. There will be well defined environment that the commands run in to provide consistency between platforms. Clusters Defining a cluster, involves picking a stack and assigning nodes to the cluster. Clusters have a goal state, which can be one of three values: * <> -- the user wants the cluster to be started * <> -- the user wants the cluster to be stopped * <> -- the user wants the cluster to be stopped, the nodes released, and the data deleted. This is useful, if the user expects to recreate the cluster eventually, but wants to release the nodes. [] Clusters also have a list of active components that should be running. This overrides the stack and provides a mechanism for the administrator to shutdown a service temporarily. * Cluster example ------ { "description": "alpha cluster", "stack": "kryptonite", "nodes": ["node000-999", "gateway0-1"], "goal": "active", "services": ["hdfs", "mapreduce"], "roles": { "namenode": ["node000"], "jobtracker": ["node001"], "secondary-namenode": ["node002"], "gateway": ["gateway0-1"], } } ------ Stack Deployment Ambari will deploy the software for its clusters from either OS-specific packages (rpms and debs) or tarballs. Rpms have the advantage of putting the software in a user-convenient location, such as <<>>, but they are specific to an OS and don't support having multiple versions installed at once, while tarballs require rebuilding the entire deployment to change one component version. The layout on the nodes looks like: ------ ${ambari}/clusters/${cluster}-${role}/stack/ /logs/ /data/disk-${0 to N}/ /pkgs/ ------ The software and configuration for the role are installed in <<>>. The logs for the managed cluster are put into <<>>. The cluster's data is in <<>> with symlinks to each of the disks that machine should use. Finally, the component tarballs are placed in the <<>> directory to be installed by the component. 
Roadmap

  In the future, Ambari will integrate with and use existing datacenter
  management and monitoring infrastructure, such as Nagios. Another area of
  focus for Ambari is a store for metrics data; HBase is a likely candidate
  for such a store.

  We also need to support adding and removing nodes from a running cluster
  without bringing it down first. This will require decommissioning nodes
  before they are removed.

  A lot of support needs to be added for secure clusters, including a single
  interface to manage access control lists for the cluster. Ambari would also
  host a KDC, especially so that servers like the tasktracker and the
  datanode can have their own keytabs generated and deployed by Ambari. The
  native KDC could optionally hook up to the corporate KDC for user
  management, or host user management within itself. Continuing with the
  security aspects, Ambari would also provide a convenient way for
  administrators to specify ACLs for services and queues.

  We plan to add an SNMP interface for integration with other cluster
  management tools.