Domino: Datascience-as-a-Service

by Michael Hausenblas on Mar 11, 2014

Domino, a Platform-as-a-Service for data science, enables people to do analytical work using languages such as Python or R in the cloud.

According to Nick Elprin (co-founder of Domino), Domino allows data scientists to focus on their analysis, not their infrastructure:

“As data volumes have increased and analytical techniques have become more sophisticated, we believe the tools required to do modern data analysis have lagged in their ease of use and have unnecessarily restricted access to the field of data science.”

The Domino platform rests on three pillars of functionality:

  1. Direct-to-cloud deployment and execution: Domino lets users run existing code (Python, R, Matlab, Julia, shell scripts, etc.) on EC2 in order to off-load long-running or resource-intensive tasks. The system also takes care of all the plumbing under the hood to make this happen: AMI management, starting and stopping machines, securely transferring data onto a machine, and securely transferring results back.
  2. Version control for data science: The Domino folks figured that tools like Git are insufficient for analytical workflows, as they can’t handle large data sets and don’t create a link between the inputs and the results (e.g., charts, figures). Domino automatically keeps snapshots of the entire project, currently up to 40GB, making it easy to trace the history of the work in its entirety, including code, data, and results.
  3. Collaboration: Like a GitHub project, Domino projects can have collaborators who can view, edit, and run a project. Domino detects conflicts, sends notifications with updated results after runs finish, and has an internal notebook to facilitate discussion as the team’s work progresses.
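
To make the second pillar more concrete, here is a minimal Python sketch of whole-project snapshotting: every file (code, data, and results alike) is hashed, and the snapshot id is derived from the full manifest, so a result file is always tied to the exact inputs that sat next to it. This is only an illustration of the general idea; the article does not describe how Domino implements snapshots, and the names `file_digest`, `snapshot`, and `changed_files` are hypothetical, not part of any Domino API.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(project_dir: Path) -> dict:
    """Map each relative file path to its content hash (hypothetical helper).

    The snapshot id is a hash of the sorted manifest, so identical project
    states always yield the same id.
    """
    files = {
        p.relative_to(project_dir).as_posix(): file_digest(p)
        for p in sorted(project_dir.rglob("*"))
        if p.is_file()
    }
    manifest = json.dumps(files, sort_keys=True).encode("utf-8")
    return {"id": hashlib.sha256(manifest).hexdigest(), "files": files}


def changed_files(old: dict, new: dict) -> list:
    """List paths added, removed, or modified between two snapshots."""
    paths = set(old["files"]) | set(new["files"])
    return sorted(p for p in paths if old["files"].get(p) != new["files"].get(p))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "analysis.py").write_text("print('hello')\n")
        (root / "data.csv").write_text("x,y\n1,2\n")
        before = snapshot(root)
        # Simulate a run that writes a result file into the project.
        (root / "chart.txt").write_text("result of run 1\n")
        after = snapshot(root)
        print(changed_files(before, after))  # prints ['chart.txt']
```

Hashing the manifest rather than storing diffs keeps the sketch short; a real system handling tens of gigabytes would presumably also deduplicate unchanged files between snapshots.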

With a pay-as-you-go approach, Domino’s pricing ranges from introductory free accounts to monthly subscriptions. As InfoQ learned from Nick Elprin, despite its early days the platform is already being used across the spectrum of data science practitioners: from academics, such as an ecologist who analyzes thousands of images for her research, to data science consultants working on Kaggle competitions, to marketing firms that use it, for example, to help their clients better target mailings.
