EC2 users can now automate the deployment of Apache Mesos, an open-source tool to share cluster resources between multiple data processing frameworks, at scale through a new web service called Elastic Mesos provided by Big Data startup Mesosphere.
This service is similar in nature to Amazon's Elastic MapReduce: it installs all the Mesos dependencies on Amazon EC2 instances, including ZooKeeper and HDFS, and delivers a ready-to-go cluster. Elastic Mesos itself carries no fee, so you pay only for your EC2 usage as you go. It currently offers just two cluster sizes, six or 18 m1.large instances, in the us-east-1 region, billed at the on-demand instance price.
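Since the only cost is EC2 usage, the bill for a cluster is simply the number of instances multiplied by the hourly on-demand rate and the running time. The short Python sketch below illustrates that arithmetic for the two cluster sizes mentioned above. The function name `estimate_cluster_cost` and the hourly rate used in the example are hypothetical placeholders, not Mesosphere's API or Amazon's actual pricing, so check the current m1.large on-demand price before relying on any figure.

```python
# Rough cost estimate for an Elastic Mesos cluster.
# Assumption: cost = instances * hourly on-demand rate * hours, with no
# other charges (storage, data transfer) taken into account.

# The two cluster sizes mentioned in the article.
ALLOWED_CLUSTER_SIZES = (6, 18)


def estimate_cluster_cost(instance_count, hourly_rate_usd, hours):
    """Return the estimated EC2 cost in USD for running the cluster."""
    if instance_count not in ALLOWED_CLUSTER_SIZES:
        raise ValueError(
            "cluster size must be one of %s" % (ALLOWED_CLUSTER_SIZES,)
        )
    if hourly_rate_usd < 0 or hours < 0:
        raise ValueError("rate and hours must be non-negative")
    return instance_count * hourly_rate_usd * hours


if __name__ == "__main__":
    # Hypothetical placeholder rate, NOT the real m1.large price.
    placeholder_rate_usd = 0.25
    for size in ALLOWED_CLUSTER_SIZES:
        cost = estimate_cluster_cost(size, placeholder_rate_usd, hours=24)
        print("%d instances for 24 hours: $%.2f" % (size, cost))
```

Passing the rate as a parameter keeps the sketch independent of any specific price, which changes over time and by region.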
You can deploy a Mesos cluster through the Elastic Mesos UI in three steps: specify the desired cluster size, your EC2 credentials and an email address for notifications. Nothing more is required. The completion time depends mostly on how long EC2 takes to provision the instances, which can vary; Mesosphere estimates about 20 minutes before your cluster is ready.
Initially developed at UC Berkeley as a research project, Mesos was quickly turned into a full-featured platform by Twitter to handle its explosive growth. As Twitter’s SVP of Engineering Christopher Fry puts it, “Mesos is our version of elastic compute.” Mesos is now a top-level Apache project. Developing an offering around Mesos makes a lot of sense for Mesosphere, whose founders Flo Leibert and Tobi Knaup previously worked at Twitter and Airbnb, the two biggest adopters of Mesos. The list of Mesos supporters is growing every month, and it now includes other big names such as Vimeo, OpenTable and UC Berkeley.
Elastic Mesos is similar to Apache Whirr, an open-source library for running and managing services in the cloud, but Whirr doesn’t yet have Mesos support. The two also target different users: Elastic Mesos is a self-contained service, whereas Whirr is geared towards system administrators who want more control over the full lifecycle of a cluster.
Elastic Mesos is the first big step towards the mainstream adoption of Mesos, a project often confused with Hadoop’s YARN. Both projects have the same goal, which is to make sharing clusters seamless and efficient, but so far YARN has seen much wider adoption because it ships as the de facto scheduler in Hadoop 2.
The community response on Twitter was mostly positive, although it remains to be seen how much Elastic Mesos will drive adoption. A tutorial at the Spark Summit in December 2013 already detailed how to run Spark on Elastic Mesos, and it got a favorable response from the Spark community.