Google Researcher Invented New Technology to Bring Neural Networks to Mobile Devices

Many companies have recently released applications that use deep neural networks. Because neural networks can require a large amount of computation, they often run as software-as-a-service applications on GPU-powered servers. For applications that must run without internet access, must be fast and responsive, or handle privacy-sensitive data, running networks on servers is not an option.

Lately, multiple companies have announced that they are working on putting neural networks on mobile devices. Apple announced its Core ML platform at WWDC 2017. Google is working on a version of its popular TensorFlow toolkit for mobile devices, called "TensorFlow Lite". Google has also released several pre-trained image recognition models that let developers choose their own tradeoff between efficiency and accuracy.

Although developers can run their networks on mobile devices, they still have limited options for making neural network applications faster. One option is reducing the size of the network, which often comes with a loss of accuracy. Another option is training a full network and reducing the floating-point precision after training. With this option, it is difficult to estimate in advance how much the reduced precision will affect the network's results. There are also some older techniques such as "Optimal Brain Damage", invented by Yann LeCun, director of AI research at Facebook. None of these methods for optimising network inference has become very popular.
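To make the second option concrete, here is a minimal sketch of reducing precision after training. It assumes a simple symmetric scheme with one shared scale for all weights. The function names and the example weights are hypothetical and do not come from any particular toolkit.

```python
def quantize(weights, bits=8):
    """Map float weights to signed integers using one shared scale."""
    qmax = 2 ** (bits - 1) - 1                      # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # float value of one integer step
    # Round to the nearest step and clamp to the representable range.
    quantized = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers."""
    return [q * scale for q in quantized]


# Hypothetical weights standing in for one layer of a trained network.
weights = [0.82, -0.41, 0.07, -1.27, 0.33]
quantized, scale = quantize(weights, bits=8)
restored = dequantize(quantized, scale)

# The rounding error per weight is at most half a quantization step.
max_error = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))
```

The per-weight error is easy to bound, but how those small errors add up across layers is harder to predict. That is why the effect on the final results usually has to be measured rather than estimated.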

Google researcher Sujith Ravi came up with a novel idea: co-training two neural networks. One is a full neural network, called the trainer network. The other, called the projection network, tries to represent the input and the intermediate representations of the trainer network in a low-memory form, using efficient functions to do so. Both networks are trained at the same time and share the same loss function, so the projection network learns from the trainer network. When both networks are ready to be used, the large network can remain on the server while users download the small, efficient network to their smartphones.
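The idea can be illustrated with a rough sketch. The article does not name the efficient functions or spell out the loss, so two assumptions are made here: the low-memory representation is produced by random-hyperplane hashing (each bit is the sign of a dot product), and the shared loss is a weighted sum of three cross-entropy terms. All function names, weights, and numbers are hypothetical.

```python
import math
import random


def make_hyperplanes(input_dim, num_bits, seed=0):
    """Fixed random hyperplanes (assumption: they are not trained)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(input_dim)]
            for _ in range(num_bits)]


def project_to_bits(x, hyperplanes):
    """Compact representation: one bit per hyperplane, from the sign of the dot product."""
    return [1 if sum(h_i * x_i for h_i, x_i in zip(h, x)) >= 0 else 0
            for h in hyperplanes]


def cross_entropy(target, predicted):
    """Cross-entropy between two probability vectors."""
    return -sum(t * math.log(p + 1e-12) for t, p in zip(target, predicted))


def joint_loss(label, trainer_probs, projection_probs,
               w_trainer=1.0, w_mimic=1.0, w_projection=1.0):
    """One loss for both networks (assumed three-term form)."""
    trainer_term = cross_entropy(label, trainer_probs)            # trainer vs. labels
    mimic_term = cross_entropy(trainer_probs, projection_probs)   # projection vs. trainer
    projection_term = cross_entropy(label, projection_probs)      # projection vs. labels
    return (w_trainer * trainer_term
            + w_mimic * mimic_term
            + w_projection * projection_term)


hyperplanes = make_hyperplanes(input_dim=4, num_bits=8)
bits = project_to_bits([0.5, -1.2, 3.0, 0.1], hyperplanes)

label = [0.0, 1.0]
loss_agree = joint_loss(label, [0.1, 0.9], [0.15, 0.85])
loss_disagree = joint_loss(label, [0.1, 0.9], [0.9, 0.1])
```

In this sketch the loss is lower when the projection network's predictions agree with the trainer network's, which is the sense in which the small network learns from the large one. The number of bits controls how much memory the compact representation needs.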

Sujith Ravi's paper is available on arXiv. It also discusses how many bits are needed to score well on several well-known data sets.
