Google Open-Sources AutoML Algorithm Model Search

A team from Google Research has open-sourced Model Search, an automated machine learning (AutoML) platform for designing deep-learning models. Experimental results show that the system produces models that outperform the best human-designed models, with fewer training iterations and model parameters.

Researchers Hanna Mazzawi and Xavi Gonzalvo described the system in a recent blog post. Model Search is implemented in the TensorFlow deep-learning framework and composes a deep neural network (DNN) from a set of component blocks such as Transformers or long short-term memory (LSTM) layers. The system trains and evaluates a set of candidate models, each composed of several blocks; a search algorithm then selects the top-performing model and "mutates" it. This process repeats, refining the best model found so far. The Google team used Model Search to create deep-learning systems for speech processing that outperformed state-of-the-art human-designed models while requiring approximately 60% as many parameters. According to Mazzawi and Gonzalvo,

"By building upon previous knowledge for a given domain, we believe that this framework is powerful enough to build models with the state-of-the-art performance on well studied problems when provided with a search space composed of standard building blocks."

AutoML is a research area devoted to automating the manual tasks associated with machine learning. Of particular interest is automating the design of deep-learning models, often called neural architecture search (NAS). These systems use techniques such as genetic algorithms (GA) or reinforcement learning (RL) to explore a search space efficiently. However, because NAS requires that many deep-learning models be trained and evaluated, it can be computationally expensive and time-consuming.

Google's Model Search system addresses this problem using a two-phased approach that resembles RL's "exploration vs. exploitation" tradeoff. In the exploration phase, a greedy search algorithm looks for a high-performing DNN architecture; being greedy, it is not guaranteed to find the optimal one. This algorithm generates several candidate architectures, then performs random mutations on them: adding blocks, or setting parameters to random values. The system then transfers knowledge to these new candidates from previously trained candidates using parameter sharing. Finally, the candidate models are trained and evaluated; the top-scoring models are used to seed the next exploration phase.
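The mutate, evaluate, and select loop described above can be illustrated with a toy sketch. This is not Model Search's code or API: the block names, the `mutate` and `toy_score` helpers, and all parameter values are hypothetical, and the scoring function merely stands in for training and evaluating a real model. Parameter sharing between candidates is omitted.

```python
import random

# Hypothetical block vocabulary; the article names Transformers and LSTM layers.
BLOCKS = ["lstm", "transformer", "conv", "dense"]


def mutate(arch, rng):
    """Return a copy of `arch` with one randomly chosen block appended."""
    return arch + [rng.choice(BLOCKS)]


def toy_score(arch):
    """Stand-in for 'train and evaluate'; purely illustrative, not a real metric.

    Rewards block variety and penalizes depth so the search faces a trade-off.
    """
    return len(set(arch)) - 0.1 * len(arch)


def greedy_search(rounds=5, candidates_per_round=4, keep=2, seed=0):
    """Greedy search: mutate the survivors, score everything, keep the top few."""
    rng = random.Random(seed)
    # Start from single-block architectures.
    population = [[block] for block in BLOCKS[:keep]]
    for _ in range(rounds):
        # Mutate each surviving architecture several times.
        candidates = [
            mutate(parent, rng)
            for parent in population
            for _ in range(candidates_per_round)
        ]
        # Parents stay in the pool, so the best score never regresses.
        pool = population + candidates
        pool.sort(key=toy_score, reverse=True)
        # The top-scoring architectures seed the next round.
        population = pool[:keep]
    return population[0]


best = greedy_search()
print(best, toy_score(best))
```

Because each round only builds on the current top scorers, the search is cheap but can settle on a local optimum, which is the usual trade-off of a greedy strategy.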

For exploitation, Model Search uses an ensembling strategy. A candidate model is replicated several times, and each replica is trained from scratch with a random shuffling of the training data and different initial parameters. At inference time, a weighted average of the replicas' outputs produces the final result. This often improves model accuracy, though at the cost of additional parameters: in Google's language-identification results, the ensembled model had twice as many parameters as the non-ensembled one.
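As a minimal sketch of the weighted-average step, assuming each replica emits a vector of class probabilities: the function name `ensemble_predict`, the example outputs, and the weights below are all hypothetical, and the article does not say how Model Search chooses its weights.

```python
def ensemble_predict(replica_outputs, weights):
    """Return the weighted average of per-replica class-probability vectors."""
    if len(replica_outputs) != len(weights):
        raise ValueError("need exactly one weight per replica")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(replica_outputs[0])
    # Normalize by the weight total so the result is a true weighted mean.
    return [
        sum(w * out[c] for w, out in zip(weights, replica_outputs)) / total
        for c in range(n_classes)
    ]


# Hypothetical outputs from three replicas for one example with three classes.
outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
]
weights = [0.5, 0.2, 0.3]  # illustrative values only

averaged = ensemble_predict(outputs, weights)
predicted_class = max(range(len(averaged)), key=averaged.__getitem__)
print(averaged, predicted_class)
```

In this example the second replica disagrees with the other two, but its lower weight means the averaged prediction still favors the first class.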

Google used Model Search to automate the development of DNN models for spoken language identification and keyword spotting from audio signals. For language identification, the current best human-designed production model contains 5M parameters and achieves 60.3% accuracy; without ensembling, Model Search created a model of 2.7M parameters with 59% accuracy. With ensembling, Model Search achieved 62.77% accuracy with 5.4M parameters. For the keyword spotting problem, Model Search created a model with 184k parameters that achieved 97.04% accuracy, compared with a human-designed model that achieved 96.7% accuracy with about 1.7 times as many parameters (315k).
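To make these comparisons easier to read, the short sketch below recomputes each Model Search result relative to its human-designed baseline. The figures are the ones reported above; the `RESULTS` layout and the `compare` helper are only illustrative.

```python
# Figures as reported in the article: (parameter count, accuracy in percent).
RESULTS = {
    "language identification": {
        "human-designed": (5_000_000, 60.3),
        "Model Search, no ensembling": (2_700_000, 59.0),
        "Model Search, ensembling": (5_400_000, 62.77),
    },
    "keyword spotting": {
        "human-designed": (315_000, 96.7),
        "Model Search": (184_000, 97.04),
    },
}


def compare(task):
    """Return {model: (parameter ratio, accuracy delta)} vs. the human baseline."""
    base_params, base_acc = RESULTS[task]["human-designed"]
    return {
        name: (params / base_params, acc - base_acc)
        for name, (params, acc) in RESULTS[task].items()
        if name != "human-designed"
    }


for task in RESULTS:
    for name, (ratio, delta) in compare(task).items():
        print(f"{task} | {name}: {ratio:.0%} of baseline parameters, "
              f"{delta:+.2f} points of accuracy")
```

By this arithmetic, the keyword-spotting model uses roughly 58% of the baseline's parameters, while the language-identification models use 54% (no ensembling) and 108% (ensembling).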

Model Search is only the latest of a growing number of open-source AutoML and NAS tools. The TensorFlow project maintains a lightweight AutoML framework called AdaNet, and the Keras team has adopted AutoKeras from an academic laboratory. Microsoft recently released version 2.0 of its Neural Network Intelligence (NNI) AutoML toolkit, which includes several NAS algorithms and supports popular deep-learning frameworks, including TensorFlow and PyTorch.

The Model Search code is available on GitHub.
