Google Cloud Machine Learning and TensorFlow Alpha Release

Late last month Google released an alpha version of its TensorFlow (TF) integrated cloud machine learning service, responding to a growing need to run the TensorFlow library at scale on the Google Cloud Platform (GCP).

Google describes in detail several new feature sets aimed at scaling TF usage by integrating pieces of the GCP such as Dataproc, a managed Hadoop and Spark service. The integration of the open-source TF project with GCP “allows users to run custom distributed learning algorithms” on managed cloud services made generally available as part of the announcement, tightly coupling Google’s big-data and analytics platforms with the machine learning space.

The PaaS hinges on TF and a number of pre-trained models for language translation, image recognition and speech recognition. TF Serving now integrates with Kubernetes to take advantage of its external load balancer and its pod-based orchestration of Docker containers to “scale training to thousands of cores”. The example given runs a trained Inception-v3 model, which has “over 27 million parameters and 5.7 billion floating point operations per inference”, by replicating the packaged TF resources, including the TF Serving-based gRPC server and the model attributes, across pods in a cluster.
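The quoted Inception-v3 figures lend themselves to some back-of-the-envelope arithmetic. The sketch below is a rough estimate only: the 4-byte float32 weight size, the sustained throughput per pod and the pod count are illustrative assumptions, not numbers from the announcement.

```python
# Figures quoted in the announcement for Inception-v3.
PARAMETERS = 27_000_000        # "over 27 million parameters"
FLOPS_PER_INFERENCE = 5.7e9    # "5.7 billion floating point operations per inference"

# Assumptions for illustration only (not from the announcement).
BYTES_PER_PARAMETER = 4        # assumes float32 weights
SUSTAINED_FLOPS_PER_POD = 50e9 # hypothetical sustained FLOP/s of one pod
POD_COUNT = 12                 # hypothetical number of replicated pods


def model_size_megabytes(parameters: int, bytes_per_parameter: int) -> float:
    """Approximate size of the model weights in megabytes (10^6 bytes)."""
    return parameters * bytes_per_parameter / 1e6


def inferences_per_second(flops_per_pod: float, pods: int,
                          flops_per_inference: float) -> float:
    """Upper bound on cluster throughput, ignoring network and batching overhead."""
    return flops_per_pod * pods / flops_per_inference


if __name__ == "__main__":
    size = model_size_megabytes(PARAMETERS, BYTES_PER_PARAMETER)
    rate = inferences_per_second(SUSTAINED_FLOPS_PER_POD, POD_COUNT,
                                 FLOPS_PER_INFERENCE)
    print(f"Approximate weight size: {size:.0f} MB")
    print(f"Upper-bound throughput:  {rate:.1f} inferences/s")
```

Under these assumptions each pod carries roughly a hundred megabytes of weights, and throughput grows linearly with the number of replicas, which is the property the load-balanced pod setup relies on.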

The Docker integration also enables packaging and deploying new model versions as part of a continuous training pipeline, a TF Serving goal announced earlier this year. An available TF and Kubernetes tutorial lets readers recreate the Inception-based pipeline for demonstration and learning purposes.

The announcement also noted Apache Beam, a newly proposed data processing pipeline project that could satisfy the need of TF at scale for a pipeline to deliver learning data sets. Beam provides batch and stream processing options while remaining compatible with Spark, Flink and Dataflow based systems. Google also provided analyses of the maintainability and technical debt of tightly coupled ML systems over time; these could point to further standardization possibilities for ML and its adoption, something to watch as TF becomes increasingly ubiquitous across the GCP.
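The appeal of offering batch and stream processing behind one pipeline can be illustrated in plain Python. The sketch below does not use the Beam API; it only shows the idea of applying one transform to both a bounded and an unbounded source. The record format and every function name here are hypothetical.

```python
import itertools
from typing import Iterable, Iterator, List, Tuple

Example = Tuple[str, float]


def parse_example(line: str) -> Example:
    """Turn a raw 'label, value' line into a (label, value) training example."""
    label, value = line.split(",")
    return label.strip(), float(value)


def pipeline(source: Iterable[str]) -> Iterator[Example]:
    """Lazily apply the same transform to any source, bounded or unbounded."""
    return (parse_example(line) for line in source if line.strip())


def batch_source() -> List[str]:
    """A bounded data set, as a batch job would read it (hypothetical data)."""
    return ["cat, 0.9", "dog, 0.4", ""]


def stream_source() -> Iterator[str]:
    """An unbounded source that keeps producing records (hypothetical data)."""
    for i in itertools.count():
        yield f"label{i % 2}, {i / 10}"


if __name__ == "__main__":
    # Batch: the whole data set is consumed.
    print(list(pipeline(batch_source())))
    # Stream: only as many records as requested are pulled from the source.
    print(list(itertools.islice(pipeline(stream_source()), 3)))
```

Because the pipeline is defined once and evaluated lazily, the training code downstream does not need to know whether its examples come from a stored data set or a live feed.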
