Cloudbreak, New Hadoop as a Service API, Enters Open Beta
Cloudbreak, a new open-source and cloud-agnostic Hadoop as a Service API, is now open for beta access to application developers and enterprises. SequenceIQ, Cloudbreak's maker, claims that its freely available product will make it easier to manage and monitor on-demand Hadoop clusters while also abstracting their provisioning.
The Cloudbreak API can provision an arbitrary number of Hadoop nodes: it takes care of spinning up the cluster and configuring the networks and the selected Hadoop services. As the workload changes, the API allows nodes to be added or removed on the fly. The Hadoop cluster stack definition and component layout are managed through the concept of blueprints, which are supported for different Hadoop distributions.
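To make the blueprint idea more concrete, here is a minimal sketch in Python of what a declarative stack definition might look like. It assumes the general layout used by Apache Ambari blueprints (a `Blueprints` header plus a list of `host_groups`); the helper `make_blueprint`, the blueprint name, and the stack name and version are hypothetical and chosen only for illustration, not taken from Cloudbreak itself.

```python
import json


def make_blueprint(name, stack_name, stack_version, host_groups):
    """Build a blueprint-style dict (hypothetical helper).

    host_groups maps a group name to a (components, cardinality) pair.
    The field names follow the Ambari blueprint layout as an assumption.
    """
    return {
        "Blueprints": {
            "blueprint_name": name,
            "stack_name": stack_name,
            "stack_version": stack_version,
        },
        "host_groups": [
            {
                "name": group,
                "components": [{"name": c} for c in components],
                "cardinality": str(cardinality),
            }
            for group, (components, cardinality) in host_groups.items()
        ],
    }


# Illustrative values only: one master group and three worker nodes.
blueprint = make_blueprint(
    name="example-multi-node",
    stack_name="HDP",
    stack_version="2.1",
    host_groups={
        "master": (["NAMENODE", "RESOURCEMANAGER"], 1),
        "worker": (["DATANODE", "NODEMANAGER"], 3),
    },
)

# Serialize the definition as JSON, the form a REST API would typically accept.
print(json.dumps(blueprint, indent=2))
```

The point of the declarative form is that the same document describes which services run where, independently of the cloud provider that eventually hosts the nodes.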
Cloudbreak aims to satisfy several criteria, says SequenceIQ:
- Being 100% open source under the Apache 2 license.
- Providing the ability to quickly launch arbitrarily sized Hadoop clusters.
- Being cloud provider agnostic.
- Supporting different Hadoop services and configurations in a declarative way.
- Being elastic and flexible, with the ability to resize running clusters.
- Enforcing security.
Cloudbreak is built on the foundations provided by Docker containers. Docker adds a high-level API providing lightweight virtualization on top of Linux Containers, an operating system–level virtualization method for running multiple isolated Linux systems (containers) on a single control host. Unlike a traditional virtual machine, a Docker container does not include a separate operating system; it relies on the kernel functionality provided by the underlying Linux Container.
Docker containers allow Cloudbreak to be cloud-agnostic, since all the Hadoop services are installed and run inside them, and the containers can be shipped between different cloud vendors. The use of Docker brings further advantages, says SequenceIQ, such as a reproducible and testable environment, versioning, and sandbox isolation, among others.
Besides Docker containers, Cloudbreak rests on other open-source technologies:
- Apache Ambari: a project aimed at making Hadoop management simpler by providing an intuitive, easy-to-use Hadoop management web UI backed by a RESTful API.
- Serf: a tool from HashiCorp providing cluster membership, failure detection, and cluster orchestration that is decentralized, fault-tolerant, and highly available.
SequenceIQ offers a hosted Cloudbreak UI on its own servers, which can be used to launch on-demand Hadoop clusters on a cloud hosting provider managed by the user. Alternatively, Cloudbreak can be hosted within a private cloud and accessed through its REST API.
According to SequenceIQ, Cloudbreak is still under development, although the codebase is considered stable for deployments.
Stephanie Davis (nee Stewart) Dec 21, 2014