Introduction
Apache Ambari is a web-based tool for installing, managing, and monitoring Apache Hadoop clusters. The set of Hadoop components that are currently supported by Ambari includes:
Ambari's primary audience is system administrators responsible for managing Hadoop clusters.
Ambari allows them to:
- Easily Install a Hadoop Cluster
  - Ambari provides an easy-to-use, step-by-step wizard for installing Hadoop services across any number of nodes.
  - Ambari leverages Puppet to perform installation and configuration of Hadoop services for the cluster.
- Manage a Hadoop Cluster
  - Ambari provides central management for starting, stopping, and reconfiguring Hadoop services across the entire cluster.
- Monitor a Hadoop Cluster
  - Ambari provides a dashboard for monitoring the health and status of the Hadoop cluster. Ambari leverages Ganglia to collect system metrics.
  - Ambari sends email alerts when your attention is needed (e.g., a node goes down or remaining disk space is low). Ambari leverages Nagios to monitor and trigger alerts.
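To make the alerting idea concrete, here is a minimal Python sketch of the kind of low-disk-space check a monitoring tool could run before raising an alert. It is not Ambari or Nagios code: the function name `disk_space_alert`, the 10% default threshold, and the message format are all assumptions made for illustration.

```python
import shutil


def disk_space_alert(path="/", min_free_fraction=0.10, usage=None):
    """Return an alert message if free space on `path` is below the
    threshold, otherwise None.

    `usage` may be supplied for testing; by default it is read from the
    filesystem with shutil.disk_usage (fields: total, used, free).
    """
    if usage is None:
        usage = shutil.disk_usage(path)
    free_fraction = usage.free / usage.total
    if free_fraction < min_free_fraction:
        return f"LOW DISK: {path} has {free_fraction:.1%} free"
    return None
```

A real deployment would hand a non-empty result to whatever notification channel is configured (email, in the case described above) instead of just returning a string.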
In the near future, Ambari will allow third-party tool developers to integrate Hadoop cluster management and monitoring capabilities via its RESTful interface.
Roadmap
- Support for Hadoop Security
- Support for various operating systems
  - Ambari currently supports 64-bit RHEL/CentOS 5.* and 6.*
  - Support for other operating systems is in progress (SLES 11.* support is coming soon)
- RESTful API for third-party integration
  - Ambari will expose a unified, RESTful API to enable third-party applications to integrate Hadoop cluster management and monitoring capabilities. This is an area of active development. We will publish the API docs soon.
- Granular configurations
  - Ambari currently applies configurations at the cluster level. To allow more flexibility, Ambari will support applying configurations at a finer granularity (e.g., applying a set of configurations to a specific group of nodes).
- Security
  - Easy installation of secure Hadoop clusters (Kerberos-based)
  - Role-based user authentication, authorization, and auditing
  - Support for LDAP and Active Directory
- Visualization
  - Interactive visualization of current and historical states of the cluster for a number of key metrics
  - Interactive visualization of Pig, Hive, and MapReduce jobs
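As an illustration of the granular configurations item above, the following Python sketch shows one way cluster-level settings could be combined with overrides for a specific group of nodes. It is an assumption about how such layering might work, not Ambari's design: the function `effective_config`, the group name `large-disk-nodes`, the host names, and the property values are all hypothetical.

```python
def effective_config(cluster_config, group_overrides, host_groups, host):
    """Return the settings that apply to `host`: cluster-level defaults,
    overlaid with the overrides of the host's group (if it has one)."""
    config = dict(cluster_config)  # copy so the cluster defaults are not mutated
    group = host_groups.get(host)
    if group is not None:
        config.update(group_overrides.get(group, {}))
    return config


# Hypothetical example data (values chosen only for illustration).
cluster_config = {
    "dfs.datanode.du.reserved": "1073741824",
    "io.file.buffer.size": "131072",
}
group_overrides = {
    "large-disk-nodes": {"dfs.datanode.du.reserved": "10737418240"},
}
host_groups = {"node7.example.com": "large-disk-nodes"}
```

With this data, `effective_config(cluster_config, group_overrides, host_groups, "node7.example.com")` picks up the group's larger reserved-space value, while a host that belongs to no group simply receives the cluster-level settings.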