Hadoop-as-a-Service Provider Qubole Now Runs on Google Compute Engine

by Michael Hausenblas on Dec 28, 2013. Estimated reading time: 2 minutes

Qubole, a managed Hadoop-as-a-Service offering, is now available on Google Compute Engine (GCE). Until now, Qubole was only available on Amazon’s AWS; the announcement comes only a few days after Google released GCE into general availability.

Community reactions were by and large positive, and it seems people consider Big Data a potential killer app for GCE. Alex Popescu of DataStax puts it like this:

If you look at these, you’ll notice a theme: covering data from every angle; Cassandra/DSE from DataStax for OLTP, DataTorrent for stream processing, Qubole for Hadoop, MapR for their Hadoop-like solution. I can see this continuing for a while and making Google Compute Engine a strong competitor for Amazon Web Services.

With Hadoop-as-a-Service (HaaS, also known as Hadoop in the cloud) come different options:

  • Rolling your own deployment, that is, installing Apache Hadoop or one of the distributions (Cloudera, Hortonworks, MapR) in an IaaS offering, such as GCE or EC2. This allows for fine-grained control over what is running but also comes with deployment and management complexity.
  • Pre-packaged services such as Amazon’s EMR or Savvis’ Big Data offering, which reduce deployment complexity and offer mid-level control over installed services.
  • Managed HaaS such as Qubole or Mortar, promising reduced deployment and management complexity.

The key differences between HaaS and on-premise deployments concern elasticity, spot pricing, separation between compute and storage (for example, eventually consistent object stores such as Amazon’s S3 or Google’s Cloud Storage), and enhanced security standards. Managed HaaS offerings such as Qubole are often used for development, for evaluation and testing, for short-running analysis jobs and to realise hybrid cloud setups. They do, however, also come with their own limitations:

  • Getting data into the cloud and getting it out again has its own price tag.
  • There may be privacy and data protection issues stemming from legal requirements that prevent or limit the use cases.
  • The TCO of a 24/7 operation has to be calculated on a case-by-case basis.
  • There is a general mismatch between Hadoop, Hive, etc., which expect file system semantics, on the one hand and the eventually consistent object stores on the other.
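
To make the last point more concrete, here is a toy sketch in Python. It is not taken from Qubole, Hadoop or any real object store API: the class EventuallyConsistentStore and the function collect_outputs are hypothetical names, and the sketch assumes that the inconsistency shows up as listings lagging behind writes, which is only one way such a store can behave. It shows why a job step that discovers its inputs by listing a directory can silently miss data.

```python
class EventuallyConsistentStore:
    """Toy object store (hypothetical): objects are readable by key at once,
    but they only show up in listings after the index has caught up."""

    def __init__(self):
        self._objects = {}    # key -> data, readable immediately by key
        self._listed = set()  # keys that are already visible to list()

    def put(self, key, data):
        self._objects[key] = data  # stored, but not yet visible in listings

    def get(self, key):
        return self._objects[key]

    def list(self, prefix):
        return sorted(k for k in self._listed if k.startswith(prefix))

    def settle(self):
        """Simulate the listing index catching up with all earlier writes."""
        self._listed.update(self._objects)


def collect_outputs(store, prefix):
    """Discover task outputs by listing a directory, as a file-system-oriented
    job step would, then read each object that the listing returned."""
    return [store.get(key) for key in store.list(prefix)]


store = EventuallyConsistentStore()
store.put("job-1/part-00000", "a")
store.put("job-1/part-00001", "b")

before = collect_outputs(store, "job-1/")  # listing has not caught up yet
store.settle()
after = collect_outputs(store, "job-1/")   # listing now reflects both writes

print(before, after)
```

On a file system with immediately consistent listings, both calls would return the same result. In the toy store the first call returns nothing although both objects were written, which is the kind of behaviour a HaaS offering has to work around.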

Ashish Thusoo and Joydeep Sen Sarma gathered experience running Hadoop and Hive during their tenure at Facebook, where they ran a data infrastructure team. In June 2012 they launched Qubole, which completed a $7 million Series A funding round in April 2013. Joydeep gave a deep-dive on the challenges they faced implementing their HaaS offering and provided insights on the internals in his Hive London Meetup talk Cloud Friendly Hadoop & Hive. Further, Christian Prokopp (Data Scientist at Rangespan) recently wrote up a detailed rundown and comparison of Qubole and EMR.
