
Uber's Synthetic Training Data Speeds Up Deep Learning by 9x


Uber AI Labs has developed an algorithm called Generative Teaching Networks (GTN) that produces synthetic training data which allows neural networks to be trained faster than with real data. Using this synthetic data, Uber sped up its neural architecture search (NAS) deep-learning optimization process by 9x.

In a paper published on arXiv, the team described the system and a series of experiments. GTN is motivated by the problem of neural architecture search (NAS), which trains many different deep-learning model structures and selects the one that performs best on a set of test data. The typical approach trains each candidate model on the full dataset for multiple iterations (or epochs), which is time-consuming and expensive. Instead, models can be trained on GTN's synthetic data for a short time to produce an estimate of how well they would perform if fully trained on real data; this lets candidates be evaluated quickly and reduces the overall search time. According to the researchers:

GTN-neural architecture search (GTN-NAS) is competitive with the state of the art NAS approaches that achieve top performance while using orders of magnitude less computation than typical NAS methods.

Neural architecture search is an active research area in automated machine learning (AutoML). One drawback of NAS is that it requires training many deep-learning models to determine which performs best. Much of the research focuses on exploring the search space more efficiently, so that the system trains fewer models. Uber's system instead produces a new dataset that lets each model be trained for fewer iterations, so the system can experiment with many more models in the same amount of time (a rough illustration follows below).
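As a rough illustration of this idea, the sketch below runs random-search NAS in which each candidate is trained for only a few steps on a small synthetic set and then scored on real test data. This is not Uber's code: PyTorch is assumed, and the data, candidate widths, and step counts are placeholders.

```python
import torch
import torch.nn as nn

def make_candidate(hidden_units):
    # Candidate architecture: a small MLP whose width is the searched parameter.
    return nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, hidden_units),
                         nn.ReLU(), nn.Linear(hidden_units, 10))

def proxy_score(model, synth_x, synth_y, test_x, test_y, steps=32):
    # Train briefly on the synthetic set, then measure accuracy on real test data.
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(synth_x), synth_y).backward()
        opt.step()
    with torch.no_grad():
        return (model(test_x).argmax(dim=1) == test_y).float().mean().item()

# Toy stand-ins for GTN-generated samples and real held-out data.
synth_x, synth_y = torch.randn(256, 1, 28, 28), torch.randint(0, 10, (256,))
test_x, test_y = torch.randn(512, 1, 28, 28), torch.randint(0, 10, (512,))

# Random-search NAS: evaluate many candidates cheaply, keep the best scorer.
candidates = [32, 64, 128, 256]
best = max(candidates, key=lambda h: proxy_score(make_candidate(h),
                                                 synth_x, synth_y, test_x, test_y))
print("best hidden width:", best)
```

Because each candidate sees only a handful of training steps, the cost of scoring one architecture drops sharply, which is what allows many more architectures to be tried in the same compute budget.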

The problem with training a model for fewer iterations is that in the very early stages of training, most models perform poorly, and many iterations are necessary to determine a model's true performance. However, research shows that not all training samples are created equal, and training can be sped up by carefully choosing input samples. Uber's insight was to use meta-learning to generate those training samples. Similar to a generative adversarial network (GAN), Uber's GTN trains a generator neural network to produce training samples for a learner network. The learner is then evaluated on real test data to produce a "meta-loss," and the gradient of the meta-loss is used to update the generator. Using this technique, Uber created a generator that produced samples for training a computer-vision (CV) system to recognize digits from the MNIST dataset; the CV system achieved 98.9% accuracy with only 32 training steps. In a similar experiment on the CIFAR10 dataset, Uber showed that it could predict model performance with 128 training steps using synthetic data, compared to 1,200 steps using real data, a speedup of roughly 9x.
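The core training loop can be sketched roughly as follows. This is a hypothetical illustration rather than Uber's implementation: PyTorch is assumed, the learner is a simple linear classifier, and the "real" data is a random stand-in. The key point is that the learner's updates on synthetic data are kept differentiable, so the meta-loss measured on real data can be backpropagated all the way into the generator.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

INPUT_DIM, NUM_CLASSES, BATCH = 20, 10, 64

class Generator(nn.Module):
    # Maps a class label plus noise to a synthetic input for that class.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(NUM_CLASSES + 8, 64), nn.ReLU(),
                                 nn.Linear(64, INPUT_DIM))

    def forward(self, labels):
        noise = torch.randn(labels.shape[0], 8)
        onehot = F.one_hot(labels, NUM_CLASSES).float()
        return self.net(torch.cat([onehot, noise], dim=1))

generator = Generator()
meta_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)

# Stand-in for the real data on which the meta-loss is computed.
real_x, real_y = torch.randn(512, INPUT_DIM), torch.randint(0, NUM_CLASSES, (512,))

for meta_step in range(100):
    # Fresh learner: a linear classifier kept as plain tensors so its
    # updates stay differentiable with respect to the generator.
    w = torch.zeros(INPUT_DIM, NUM_CLASSES, requires_grad=True)
    b = torch.zeros(NUM_CLASSES, requires_grad=True)
    params = [w, b]

    # Inner loop: a handful of SGD steps on generator-produced samples.
    for _ in range(5):
        labels = torch.randint(0, NUM_CLASSES, (BATCH,))
        synth_x = generator(labels)
        loss = F.cross_entropy(synth_x @ params[0] + params[1], labels)
        grads = torch.autograd.grad(loss, params, create_graph=True)
        params = [p - 0.1 * g for p, g in zip(params, grads)]

    # Outer loop: meta-loss on real data, backpropagated into the generator.
    meta_loss = F.cross_entropy(real_x @ params[0] + params[1], real_y)
    meta_opt.zero_grad()
    meta_loss.backward()
    meta_opt.step()
```

The `create_graph=True` flag is what makes the inner SGD steps part of the computation graph, so the generator learns to emit samples that make a freshly initialized learner improve quickly, as judged by real data.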

Paper co-author Jeff Clune tweeted a picture of synthetic image data produced by the system, describing it as "alien and unrealistic." He also said:

GTN-generated data is a drop-in replacement for real data in neural architecture search, & can thus dramatically speed up any NAS algorithm. We've only shown that so far for Random Search-NAS (+some bells and whistles), but we'd love to see others try it with fancier NAS methods!

 
