Google Researcher Invented New Technology to Bring Neural Networks to Mobile Devices

Many companies have recently released applications that use deep neural networks. Neural networks can require a large amount of computation, so they often run as software-as-a-service applications on GPU-powered servers. For applications that must run without internet access, must be fast and responsive, or in which privacy is a concern, running networks on servers is not an option.

Lately, multiple companies have announced that they are working on putting neural networks on mobile devices. Apple announced its Core ML platform at WWDC 2017. Google is working on a version of its popular TensorFlow toolkit for mobile devices called "TensorFlow Lite". Google also released several pre-trained image recognition models that let developers choose their own tradeoff between efficiency and accuracy.

Although developers can run their networks on mobile devices, they still have limited options for building faster applications with neural networks. One option is reducing the size of the network, which often comes with an accuracy loss. Another option is training a full network and reducing the floating-point precision after training. With this option, it is difficult to estimate in advance what effect the lower precision will have on the network's predictive performance. There are also some older techniques, such as "Optimal Brain Damage", co-invented by Yann LeCun, director of AI research at Facebook. None of these methods for optimising network inference has become very popular.
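To make the second option concrete, here is a minimal sketch of reducing precision after training. It assumes a simple symmetric scheme with one shared scale per weight list. The helper names (`quantize`, `dequantize`, `max_error`) and the random stand-in weights are illustrative only and do not come from any toolkit mentioned in this article.

```python
import random

def quantize(weights, num_bits=8):
    """Map float weights to signed integers using one shared scale (assumed scheme)."""
    max_abs = max(abs(w) for w in weights)
    levels = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8 bits
    scale = max_abs / levels if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the integer representation."""
    return [q * scale for q in quantized]

# Stand-in for the weights of a trained layer (hypothetical data).
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(1000)]

def max_error(num_bits):
    """Largest absolute difference between original and restored weights."""
    quantized, scale = quantize(weights, num_bits)
    restored = dequantize(quantized, scale)
    return max(abs(a - b) for a, b in zip(weights, restored))

errors = {bits: max_error(bits) for bits in (8, 4, 2)}
for bits, err in errors.items():
    print(f"{bits}-bit weights: max rounding error {err:.4f}")
```

The rounding error on the weights themselves is easy to measure this way. How that error propagates to the accuracy of a full network is the part that is hard to predict without re-evaluating the network.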

Google researcher Sujith Ravi came up with a novel idea: co-train two neural networks. One is a full neural network, called the trainer network. The other, called the projection network, tries to represent the input and the intermediate representations of the trainer network in a low-memory form, using computationally cheap functions to do so. Both networks are trained at the same time and share the same loss function, so the projection network learns from the trainer network. Once both networks are trained, the large network can remain on the server while users download the small, efficient network to their smartphones.
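The two ideas in this paragraph can be illustrated with a small sketch. It is not the method from the paper. It makes two labeled assumptions: the low-memory representation is produced by random-hyperplane hashing (one bit per hyperplane), and the shared loss is a weighted sum of three cross-entropy terms. All function names and weights (`project_to_bits`, `joint_loss`, `w_trainer`, `w_mimic`, `w_projection`) are hypothetical.

```python
import math
import random

def project_to_bits(x, hyperplanes):
    """Hash a float vector to bits: one bit per random hyperplane (assumed scheme)."""
    return [1 if sum(h_i * x_i for h_i, x_i in zip(h, x)) >= 0 else 0
            for h in hyperplanes]

def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between two probability distributions over classes."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))

def joint_loss(label, trainer_probs, projection_probs,
               w_trainer=1.0, w_mimic=1.0, w_projection=1.0):
    """One objective for both networks (hypothetical weighting)."""
    trainer_term = cross_entropy(label, trainer_probs)        # trainer vs. true label
    mimic_term = cross_entropy(trainer_probs, projection_probs)  # projection vs. trainer
    projection_term = cross_entropy(label, projection_probs)  # projection vs. true label
    return (w_trainer * trainer_term
            + w_mimic * mimic_term
            + w_projection * projection_term)

# A 16-dimensional input is compressed to 8 bits.
random.seed(0)
dim, num_bits = 16, 8
hyperplanes = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_bits)]
x = [random.gauss(0, 1) for _ in range(dim)]
bits = project_to_bits(x, hyperplanes)
print("projected bits:", bits)

# The joint loss is lower when the projection output agrees with the trainer.
label = [0.0, 1.0, 0.0]
trainer_probs = [0.1, 0.8, 0.1]
close = joint_loss(label, trainer_probs, [0.15, 0.75, 0.10])
far = joint_loss(label, trainer_probs, [0.70, 0.20, 0.10])
print("agreeing projection has lower loss:", close < far)
```

Because the middle term compares the projection network's output with the trainer network's output, minimising a single objective of this shape pushes the small network toward the large network's predictions, which is the sense in which it "learns from" the trainer.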

Sujith Ravi's paper is available on arXiv. It also discusses how many bits are needed to score well on several well-known data sets.
