
Google Launches Cloud Datalab Beta

At the recent Google Cloud Platform Next experience event in Paris, Google announced a beta data visualization service called Cloud Datalab. Cloud Datalab allows developers to explore and analyze their data through an interactive web-browser experience.

Greg DeMichillie, director of product management at Google, describes the service as a “tool that allows you to get insights from your raw data and explore, share, and publish reports in a fast, simple and cost-effective way.”

DeMichillie lists Cloud Datalab's core capabilities as:

  • Explore, transform, visualize and process data in Google Cloud Platform. Datasets can be megabytes or gigabytes in size.
  • Combine code from multiple languages seamlessly: Python, SQL and JavaScript (BigQuery UDF).
  • Build and test data pipelines for deployment to Google BigQuery.
  • Create, tune, and deploy Machine Learning models.
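
To give a sense of what combining SQL and Python in a single notebook cell looks like, here is a minimal sketch. It does not use Cloud Datalab's own API: Python's built-in sqlite3 module acts as a local stand-in for BigQuery, and the table and column names are hypothetical.

```python
import sqlite3

# Local in-memory database standing in for a cloud data warehouse.
# The "requests" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (endpoint TEXT, latency_ms REAL)")
conn.executemany(
    "INSERT INTO requests VALUES (?, ?)",
    [("/home", 120.0), ("/home", 80.0), ("/search", 300.0)],
)

# SQL handles the aggregation...
rows = conn.execute(
    "SELECT endpoint, AVG(latency_ms) FROM requests "
    "GROUP BY endpoint ORDER BY endpoint"
).fetchall()

# ...and Python post-processes the result set.
slow_endpoints = [endpoint for endpoint, avg_ms in rows if avg_ms > 200]

print(rows)
print(slow_endpoints)
```

In a hosted notebook the query would run against the managed warehouse rather than a local file, but the division of labor would be similar: SQL for set-based work, Python for everything after it.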

Since Cloud Datalab is a managed service, developers and data scientists can expect a low barrier to entry thanks to a configuration-driven, wizard-based setup process. To use Cloud Datalab, a developer must deploy the service as a Google App Engine application. Datalab in turn leverages both Google BigQuery and Cloud Storage as underlying services.

Cloud Datalab also uses Jupyter, so developers can store their scripts, documentation, visualizations and results together in a notebook. Developers can take advantage of existing Jupyter packages, including statistical and machine learning libraries. Users of the service can also share their notebooks through non-Google source control repositories such as GitHub and Bitbucket.
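
Notebooks fit source control because a Jupyter notebook is a JSON document. The sketch below builds a minimal one with only the Python standard library, following the nbformat 4 layout. The helper names and cell contents are hypothetical, and nothing here is specific to Cloud Datalab.

```python
import json


def markdown_cell(text):
    """Return a documentation cell."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(code):
    """Return an unexecuted code cell (no outputs recorded yet)."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": code,
    }


def make_notebook(cells):
    """Wrap cells in the top-level structure used by nbformat 4."""
    return {"nbformat": 4, "nbformat_minor": 0, "metadata": {}, "cells": cells}


notebook = make_notebook([
    markdown_cell("# Latency exploration"),
    code_cell("print('hello from a notebook')"),
])

# Stable key order and indentation keep version-control diffs readable.
serialized = json.dumps(notebook, indent=1, sort_keys=True)
print(serialized)
```

Saving `serialized` to a file with an `.ipynb` extension gives a file that can be committed to a Git repository like any other text file.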

The following image illustrates a pre-existing notebook that can be leveraged by new users and GitHub integration that exists within the service.

From a pricing perspective, Google has indicated that you pay only for the cloud resources that the underlying App Engine application consumes, including BigQuery and Cloud Storage. Google is also open sourcing Cloud Datalab: the project is hosted on GitHub, where developers can fork it or submit pull requests.

Google faces competition in the cloud data exploration and visualization space from familiar rivals, including Amazon and Microsoft. Amazon positions its QuickSight business intelligence tool as a low-barrier, configuration-driven tool that lets customers start visualizing their data quickly through a web browser. Like Google, Amazon layers its visualization platform on top of other first-party services such as Amazon RDS and Amazon DynamoDB. Microsoft's Power BI is its prominent BI tool; it allows end users and developers to consume data from a variety of on-premises and cloud-based services and visualize it through a web browser or mobile device. To drive adoption, Microsoft has built in many Excel-like features that end users are already comfortable with.