Google Uses Machine Learning to Identify Intrusive Android Apps

by Sergio De Simone on Jul 15, 2017. Estimated reading time: 1 minute

Google uses a clustering algorithm to automatically analyze Android apps and detect which ones can be considered intrusive, write Google security engineers Martin Pelikan, Giles Hogben, and Ulfar Erlingsson.

Intrusive apps are those that require the user to grant a larger set of capabilities than their proper functioning strictly requires. For example, as Google engineers explain, a coloring book app will not usually need access to geolocation data. Other capabilities that not all apps need to do their job include access to personal data, the camera, the address book, etc. Granting more privileges than strictly necessary is potentially harmful, since you cannot really know what the data is used for. Among the most frequent kinds of harmful app behaviour are backdoors, spyware, data collection, and denial of service.

The approach that Google follows to detect intrusive apps is based on the concept of functional peer groups, i.e. groups of apps that share similar features and should therefore require a similar set of authorizations. Once those groups are formed, it becomes possible to detect anomalous apps in each group, meaning apps that require more privileges than similar apps do. This approach requires monitoring the Android Play Store, collecting detailed statistics, and discovering user expectations, so that app groups can be determined automatically. Indeed, according to Google engineers, fixed categorization and manual curation would be a tedious and error-prone task.
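To make the peer-group idea concrete, here is a minimal sketch in Python. It is not Google's implementation: the function name `rare_permission_requests`, the `max_share` threshold, and the sample apps and permission names are all hypothetical, and the rule used (flag a permission that only a small share of the group requests) is just one simple way to express "requires more privileges than similar apps do".

```python
from collections import Counter


def rare_permission_requests(peer_group, max_share=0.25):
    """Map each app to the permissions it requests that few of its peers request.

    peer_group: dict of app name -> set of permission names (hypothetical data).
    max_share:  a permission counts as rare if at most this fraction of the
                apps in the group request it (assumed threshold).
    """
    total = len(peer_group)
    # How many apps in the group request each permission.
    counts = Counter(p for perms in peer_group.values() for p in set(perms))

    flagged = {}
    for app, perms in peer_group.items():
        rare = sorted(p for p in set(perms) if counts[p] / total <= max_share)
        if rare:
            flagged[app] = rare
    return flagged


# Hypothetical peer group of coloring book apps.
peer_group = {
    "coloring_a": {"STORAGE"},
    "coloring_b": {"STORAGE"},
    "coloring_c": {"STORAGE", "CAMERA"},
    "coloring_d": {"STORAGE", "CAMERA"},
    "coloring_e": {"STORAGE", "LOCATION"},
}

flagged = rare_permission_requests(peer_group)
print(flagged)
```

In this toy group only one app in five asks for location access, so that request stands out, while camera access, requested by two of the five, stays under the radar. A flag of this kind is only a signal for closer inspection, not a verdict that the app is intrusive.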

To make this approach more effective, Google uses deep learning to discover groups of apps that share similar characteristics, based on those apps’ metadata, which includes textual descriptions and install metrics. Once peer groups are defined, anomaly detection is used within each group to identify anomalous apps, i.e. apps that show a mismatch between the privileges they require and their functionality. Anomalous apps are then inspected thoroughly to decide which ones are actually intrusive. That information is also used to determine which apps should be promoted, as well as to get in touch with the developers of potentially intrusive apps and help them improve the privacy and security of their apps.
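The grouping step can be illustrated with a deliberately simplified sketch. The article says Google uses deep learning on app metadata; the code below swaps that for plain bag-of-words vectors and a greedy cosine-similarity grouping, purely to show how peer groups might emerge from textual descriptions. The names `embed`, `cosine`, `peer_groups`, the `threshold` value, and the sample descriptions are all assumptions made for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Bag-of-words vector; a stand-in for a learned embedding of the description."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def peer_groups(descriptions, threshold=0.5):
    """Greedily assign each app to the first group whose seed app is similar enough."""
    groups = []  # list of (seed vector, member names)
    for app, text in descriptions.items():
        vec = embed(text)
        for seed, members in groups:
            if cosine(seed, vec) >= threshold:
                members.append(app)
                break
        else:
            # No existing group is close enough, so this app seeds a new one.
            groups.append((vec, [app]))
    return [members for _, members in groups]


# Hypothetical app descriptions.
descriptions = {
    "color_fun": "coloring book for kids with crayons",
    "paint_kids": "coloring book for kids with paint",
    "nav_pro": "turn by turn navigation with offline maps",
    "road_maps": "offline maps and turn by turn navigation",
}

groups = peer_groups(descriptions)
print(groups)
```

With these four descriptions the two coloring apps end up in one group and the two navigation apps in another. Permission anomalies would then be looked for inside each group separately, since location access is unremarkable for a navigation app but unusual for a coloring book.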
