DeepMind Introduces Gato, a New Generalist AI Agent

Gato is DeepMind's generalist AI agent: rather than specializing in a single task, it can perform more than 600 different tasks, such as playing video games, captioning images and moving real-world robotic arms. DeepMind describes Gato as a multi-modal, multi-task, multi-embodiment generalist policy.

DeepMind is one of the best-known companies dedicated to the advancement of artificial intelligence. Across several research programs, it aims to contribute new ideas and improvements in machine learning, engineering, simulation, and computing infrastructure. Gato, its all-in-one machine learning agent, has recently drawn attention across the tech industry.

DeepMind says that Gato is trained on a large number of datasets comprising agent experience in both simulated and real-world environments, in addition to a variety of natural language and image datasets.

Gato, like most machine learning systems, learns by example, ingesting billions of words, images from real-world and simulated environments, button presses, joint torques and more in the form of tokens. These tokens represent the data in a form Gato can process, enabling the system to perform different tasks.
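To make the idea of turning joint torques into tokens more concrete, here is a minimal sketch of how a continuous value could be mapped to an integer token id. The constants (a 32,000-entry text vocabulary, 1,024 bins, and the mu-law parameters) and the function names are assumptions chosen for illustration, loosely based on the scheme reported for Gato; this is not DeepMind's code.

```python
import math

# All constants below are illustrative assumptions, not values from this article.
TEXT_VOCAB_SIZE = 32000  # text token ids are assumed to occupy [0, 32000)
NUM_BINS = 1024          # continuous values are discretized into this many bins
MU = 100.0               # mu-law companding parameters (assumed)
M = 256.0


def mu_law(x: float) -> float:
    """Compress a real value with mu-law companding, keeping its sign."""
    magnitude = math.log(abs(x) * MU + 1.0) / math.log(M * MU + 1.0)
    return math.copysign(magnitude, x)


def tokenize_continuous(x: float) -> int:
    """Map a continuous value (e.g. a joint torque) to an integer token id."""
    y = max(-1.0, min(1.0, mu_law(x)))  # clip to [-1, 1]
    bin_index = min(NUM_BINS - 1, int((y + 1.0) / 2.0 * NUM_BINS))
    return TEXT_VOCAB_SIZE + bin_index  # shift past the text vocabulary


def tokenize_sequence(values: list[float]) -> list[int]:
    """Flatten a vector of continuous values into a token sequence."""
    return [tokenize_continuous(v) for v in values]


if __name__ == "__main__":
    torques = [-2.0, -0.5, 0.0, 0.5, 2.0]  # hypothetical joint torques
    print(tokenize_sequence(torques))
```

Because every modality ends up as integers drawn from one shared id space, a single sequence model can consume text, images, and control signals in the same stream.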

Gato's architecture isn't that different from many of the AI systems in use today. It is a Transformer, which makes it similar to OpenAI's GPT-3. The Transformer has become the architecture of choice for complex reasoning tasks, showing strong results in summarizing texts, producing music, categorizing objects in photos, and analyzing protein sequences.

Even more remarkable, Gato's parameter count is roughly two orders of magnitude lower than that of single-task systems, including GPT-3. Parameters are the values a system learns from training data, and they largely determine how well it can solve a problem such as text generation. GPT-3 has 175 billion parameters, while Gato has only 1.2 billion.
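As a quick check on the size gap, the arithmetic can be sketched as follows. It takes GPT-3's published figure of 175 billion parameters (consistent with the "more than 170 billion" cited here) and Gato's 1.2 billion; the variable names are illustrative.

```python
import math

GPT3_PARAMS = 175e9  # GPT-3's published parameter count
GATO_PARAMS = 1.2e9  # Gato's parameter count as reported in this article

ratio = GPT3_PARAMS / GATO_PARAMS  # how many times larger GPT-3 is
orders = math.log10(ratio)         # the same gap in orders of magnitude

print(f"GPT-3 is about {ratio:.0f}x larger ({orders:.1f} orders of magnitude)")
```

The gap works out to a factor of about 146, or a little over two orders of magnitude.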

Both GPT-3 and Gato require strong filters to remove shortcomings such as bias, racism, and harsh language from their output. Artificial general intelligence (AGI), meanwhile, refers to machines that can understand, learn, and carry out intellectual activities in the same way humans do.

With cognitive computing capabilities, an AGI would be expected to model human reasoning and solve complex problems. Both DeepMind and OpenAI are dealing with major AGI challenges, including the difficulty of learning human-centric capabilities such as sensory perception, motor skills, problem-solving, and human-level creativity, as well as a lack of working protocols, limited universality, unclear business alignment, and uncertainty about the direction of AGI.

About the Author

Daniel Dominguez
