
Databricks Integrates Spark and TensorFlow for Deep Learning

by Dylan Raithel on Mar 12, 2016. Estimated reading time: 2 minutes

Since Google's announcement late last year that it was open-sourcing TensorFlow, its machine-learning library, previously covered at InfoQ, the data-science community has had an opportunity to try out TensorFlow on its own projects.

Databricks’ Tim Hunter demonstrated model selection over TensorFlow-generated neural networks and at-scale neural-network processing with Spark.

Hunter described an artificial neural network as mimicking the neurons in the visual cortex of the human brain; when adequately trained, it can be used to process complex input data like imagery or audio.

Hunter detailed how he ran TensorFlow on various Spark configurations to parallelize hyperparameter tuning. Hunter stated that TensorFlow, currently available with Python and C++ support, helped “automate the creation of training algorithms for neural networks of various shapes and sizes” for the purpose of training a neural network to process large amounts of data with high accuracy and optimal runtime performance.

Among the hyperparameters Hunter noted are the number of neurons in each layer and the learning rate; these are settings of the training process, separate from the training algorithms used on the neural network itself.

How well tuned the hyperparameters are for a given algorithm affects run time and model accuracy. The hyperparameter settings were compared against each other to correlate variables like the number of neurons in each layer and test error. Hunter noted:

The learning rate is critical: if it is too low, the neural network does not learn anything (high test error). If it is too high, the training process may oscillate randomly and even diverge in some configurations.
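The hyperparameter search space described above can be sketched as a simple grid. The specific values below are illustrative, not the ones Databricks actually tested:

```python
from itertools import product

# Hypothetical hyperparameter grid: the values are illustrative only.
learning_rates = [0.001, 0.01, 0.1]   # too low -> no learning; too high -> may diverge
hidden_layer_sizes = [64, 128, 256]   # number of neurons per layer

# Each combination is one candidate configuration to train and evaluate.
param_grid = [
    {"learning_rate": lr, "hidden_size": n}
    for lr, n in product(learning_rates, hidden_layer_sizes)
]
print(len(param_grid))  # 9 combinations
```

Each entry in this grid becomes one independent training run, which is what makes the search amenable to parallelization.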

Databricks set up an experiment to measure the effects of Spark-based TensorFlow training algorithms on neural-network accuracy and runtime performance. The experiment consisted of a default hyperparameter group, a number of hyperparameter permutations, a test data set, and single-node, two-node, and 13-node Spark clusters.

To discover the optimal hyperparameter set, Hunter employed Spark to distribute the TensorFlow-generated candidate configurations across the cluster and test model efficacy concurrently.

Hunter described using Spark as a way

to broadcast the common elements such as data and model description, … then scheduling the individual repetitive computations across a cluster of machines in a fault-tolerant manner.

Hunter noted the improvements in model accuracy and runtime with Spark integration, stating:

Even though the neural network framework we used itself only works in a single-node, we can use Spark to distribute the hyperparameter tuning process and model deployment …

Distributing algorithm selection cut down training time, improved accuracy by 34 percent over the control hyperparameter set, and gave Databricks a better understanding of the various hyperparameters’ sensitivity.

It sped up model validation seven times over single-node model validation by allowing multiple algorithms generated by TensorFlow to be tested concurrently, or, as Hunter noted, “[in an] embarrassingly parallel” manner across the nodes of each of the three Spark clusters mentioned.

Once the best-fit model was selected and the neural network trained, the neural network was then deployed to Spark to run on large data sets.

Databricks didn't note the specific hardware implementation, but some indication can be gleaned from the IPython notebooks Hunter made available for reproducing the experiments, as well as Databricks’ cluster options for customers.

The ability to scale model selection and neural-network tuning by adopting tools like Spark and TensorFlow may be a boon for the data-science and machine-learning communities, given the increasing availability of cloud computing and parallel resources to a wider range of engineers.
