Managing Hadoop with Apache Ambari
With Hadoop's increasing popularity, the question of a proper management platform is moving to the forefront. Several commercial Hadoop management platforms already exist, such as Cloudera Enterprise Manager, but Apache Ambari is the first open source implementation of such a system. It is a web-based tool for provisioning, managing, and monitoring Apache Hadoop clusters. Ambari currently supports the majority of Hadoop components, including HDFS, MapReduce, Hive, Pig, HBase, ZooKeeper, Sqoop, and HCatalog.
In his new blog post, "Apache Ambari: Hadoop Operations, Innovation, and Enterprise Readiness," Hortonworks Vice President of Corporate Strategy Shaun Connolly highlights the following main achievements of Ambari over the past year:
- Simplified cluster provisioning with a step-by-step installation wizard
- Pre-configured key operational metrics for instant insight into the health of Hadoop Core (Hadoop Distributed File System and MapReduce) and related projects such as HBase, Hive and HCatalog
- Visualization and analysis of job and task execution to gain a better view into dependencies and performance
- A complete RESTful API for exposing monitoring information and integrating with existing operational tools
- An intuitive user interface that makes viewing information and controlling a cluster easy and productive
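The RESTful API mentioned above exposes cluster state over plain HTTP. As a minimal sketch, the snippet below builds an authenticated request for the status of a cluster's services; the host name, cluster name, and admin/admin credentials are placeholder assumptions, and the `/api/v1/clusters/<name>/services` path and default port 8080 reflect Ambari's documented API conventions.

```python
import base64
import urllib.request

# Placeholder values -- substitute your own Ambari host and cluster name.
AMBARI_HOST = "ambari.example.com"
CLUSTER = "MyCluster"

def service_status_request(host, cluster, user="admin", password="admin"):
    """Build a GET request for the status of all services in a cluster.

    Ambari's server listens on port 8080 by default and uses HTTP
    Basic authentication; admin/admin is only the out-of-the-box default.
    """
    url = f"http://{host}:8080/api/v1/clusters/{cluster}/services"
    req = urllib.request.Request(url)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    return req

req = service_status_request(AMBARI_HOST, CLUSTER)
print(req.full_url)
```

The same URL scheme extends to other resources (hosts, individual services, request history), which is what makes integration with existing operational tools straightforward.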
Ambari leverages Ganglia for metrics collection and Nagios for system alerting, sending emails when the administrator's attention is needed (e.g., a node goes down or remaining disk space is low).
Additionally, Ambari supports Hadoop security: it can install secure (Kerberos-based) Hadoop clusters, provides role-based user authentication, authorization, and auditing, and integrates with LDAP and Active Directory for user management.
Apache Ambari is currently one of the six top open source Hadoop management tools. According to Connolly, Ambari is an important part of the Hadoop ecosystem because "stability and ease of management are two key requirements for enterprise adoption of Hadoop".