Robotics Content on InfoQ
-
Dreamer 4: Learning to Achieve Goals from Offline Data through Imagination Training
Researchers from DeepMind have described a new approach for teaching intelligent agents to solve complex, long-horizon tasks by training them exclusively on video footage rather than through direct interaction with the environment. Their new agent, called Dreamer 4, demonstrated the ability to mine diamonds in Minecraft after being trained solely on videos, without ever actually playing the game.
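At a conceptual level, imagination training works by first fitting a world model to the offline videos and then optimizing the policy purely on rollouts the model dreams up. The Python sketch below illustrates that loop under assumed module names and dimensions; it is a minimal stand-in, not the actual Dreamer 4 architecture.

```python
# Conceptual sketch of imagination training: a world model learned from offline
# video predicts future latent states, and the policy is optimized entirely on
# imagined rollouts instead of environment interaction. All names and shapes
# are illustrative assumptions, not the Dreamer 4 implementation.
import torch
import torch.nn as nn

latent_dim, action_dim, horizon = 128, 8, 15

world_model = nn.Linear(latent_dim + action_dim, latent_dim)  # stand-in latent dynamics
reward_head = nn.Linear(latent_dim, 1)
policy = nn.Linear(latent_dim, action_dim)

state = torch.randn(32, latent_dim)                # latents encoded from offline video
total_reward = torch.zeros(())
for _ in range(horizon):                           # roll out entirely in imagination
    action = torch.tanh(policy(state))
    state = world_model(torch.cat([state, action], dim=-1))
    total_reward = total_reward + reward_head(state).mean()

(-total_reward).backward()                         # maximize the imagined return
```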
-
DeepMind Releases Gemini Robotics-ER 1.5 for Embodied Reasoning
Google DeepMind introduced Gemini Robotics-ER 1.5, a new embodied reasoning model for robotic applications. The model is available in preview through Google AI Studio and the Gemini API.
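For developers who want to try the preview, a call through the google-genai Python SDK might look like the following minimal sketch. The model identifier, prompt, and image handling here are assumptions for illustration; consult Google AI Studio for the exact preview name.

```python
# Minimal sketch of querying an embodied-reasoning model via the Gemini API
# using the google-genai SDK. The model name below is an assumption; check
# Google AI Studio for the exact preview identifier.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

with open("workspace.jpg", "rb") as f:                  # hypothetical robot camera frame
    image = types.Part.from_bytes(data=f.read(), mime_type="image/jpeg")

response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",             # assumed preview identifier
    contents=[image, "Return the 2D pixel coordinates of the red mug."],
)
print(response.text)
```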
-
Google DeepMind Announces Robotics Foundation Model Gemini Robotics On-Device
Google DeepMind introduced Gemini Robotics On-Device, a vision-language-action (VLA) foundation model designed to run locally on robot hardware. The model features low-latency inference and can be fine-tuned for specific tasks with as few as 50 demonstrations.
-
Hugging Face Launches Reachy Mini Robots for Human-Robot Interaction
Hugging Face has launched its Reachy Mini robots, now available for order. Designed for AI developers, researchers, and enthusiasts, the robots serve as a platform for experimenting with human-robot interaction and AI applications.
-
Meta Introduces V-JEPA 2, a Video-Based World Model for Physical Reasoning
Meta has introduced V-JEPA 2, a new video-based world model designed to improve machine understanding, prediction, and planning in physical environments. The model extends the Joint Embedding Predictive Architecture (JEPA) framework and is trained to predict outcomes in embedding space using video data.
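The core idea of predicting in embedding space, rather than in pixels, can be illustrated in a few lines of PyTorch: a predictor regresses the target encoder's embedding of future frames from the embedding of visible context. The sketch below is a conceptual illustration with made-up shapes and stand-in encoders, not the V-JEPA 2 implementation.

```python
# Conceptual PyTorch sketch of JEPA-style training: a predictor is trained to
# regress the embeddings of future video content from the embeddings of the
# visible context. Shapes and encoders are illustrative stand-ins only.
import torch
import torch.nn as nn

feat_dim, embed_dim = 3 * 8 * 32 * 32, 256         # flattened clip size (illustrative)

context_encoder = nn.Sequential(nn.Flatten(), nn.Linear(feat_dim, embed_dim))
target_encoder = nn.Sequential(nn.Flatten(), nn.Linear(feat_dim, embed_dim))  # EMA copy in practice
predictor = nn.Sequential(nn.Linear(embed_dim, embed_dim), nn.GELU(),
                          nn.Linear(embed_dim, embed_dim))

opt = torch.optim.AdamW(
    list(context_encoder.parameters()) + list(predictor.parameters()), lr=1e-4
)

context_clip = torch.randn(8, 3, 8, 32, 32)        # visible frames (B, C, T, H, W)
future_clip = torch.randn(8, 3, 8, 32, 32)         # frames whose embeddings are predicted

with torch.no_grad():                              # targets come from a frozen/EMA encoder
    target = target_encoder(future_clip)

pred = predictor(context_encoder(context_clip))
loss = nn.functional.mse_loss(pred, target)        # loss in embedding space, not pixel space
loss.backward()
opt.step()
```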
-
CMU Researchers Introduce LegoGPT: Building Stable LEGO Structures from Text Prompts
Researchers at Carnegie Mellon University have introduced LegoGPT, a system that generates physically stable and buildable LEGO® structures from natural language descriptions. The project combines large language models with engineering constraints to produce designs that can be assembled manually or by robotic systems.
-
Hugging Face to Democratize Robotics with Open-Source Reachy 2 Robot
Hugging Face has acquired Pollen Robotics, a French startup that developed the humanoid robot Reachy 2. The acquisition aims to make robotics more accessible by open-sourcing the robot’s design and allowing developers to modify and improve its code.
-
Nvidia Unveils AI, GPU, and Quantum Computing Innovations at GTC 2025
Nvidia presented several new technologies at its GTC 2025 event, covering advancements in GPUs, AI, robotics, and quantum computing.
-
Google DeepMind Unveils Gemini Robotics
Google DeepMind has introduced Gemini Robotics, an advanced AI model designed to enhance robotics by integrating vision, language, and action. Built on the Gemini 2.0 framework, the model aims to make robots more capable at handling tasks in real-world settings.
-
Physical Intelligence Unveils Robotics Foundation Model Pi-Zero
Physical Intelligence recently announced π0 (pi-zero), a general-purpose AI foundation model for robots. π0 is based on a pre-trained vision-language model (VLM) and outperforms baseline models in evaluations on five robot tasks.
-
Hugging Face Unveils LeRobot, an Open-Source Machine Learning Model for Robotics
Hugging Face has unveiled LeRobot, a new machine learning library for real-world robotics applications. Beyond its pretrained models, LeRobot functions as a platform, offering versatile tools for data sharing, visualization, and training of advanced models.
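As a library, LeRobot exposes shared datasets as ordinary PyTorch datasets. The snippet below is a rough usage sketch based on the repository's examples; the import path and dataset attributes may differ across lerobot versions.

```python
# Rough sketch of loading a community dataset with the LeRobot library.
# Import path follows the repository's examples; it may vary by version.
from torch.utils.data import DataLoader
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

dataset = LeRobotDataset("lerobot/pusht")          # dataset shared on the Hugging Face Hub
print(f"{dataset.num_episodes} episodes, {len(dataset)} frames")

loader = DataLoader(dataset, batch_size=32, shuffle=True)
batch = next(iter(loader))                         # dict of tensors: images, states, actions
print(batch["action"].shape)
```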
-
Nvidia Announces Robotics-Oriented AI Foundational Model
At its recent GTC 2024 event, Nvidia announced a new foundational model to build intelligent humanoid robots. Dubbed GR00T, short for Generalist Robot 00 Technology, the model will understand natural language and be able to observe human actions and emulate human movements.
-
Researchers at Stanford Use Brain Signals to Control Intelligent Robots
In a paper presented at the 7th Annual Conference on Robot Learning last November, a team of Stanford University researchers described an intelligent human brain-robot interface that enables controlling a robot through brain signals. Dubbed NOIR, short for Neural Signal Operated Intelligent Robots, the system uses electroencephalography (EEG) to communicate human intentions to the robots.
-
Nvidia Introduces Eureka, an AI Agent Powered by GPT-4 That Can Train Robots
Nvidia Research has unveiled Eureka, a new AI agent powered by OpenAI's GPT-4 that can autonomously teach robots complex skills.
-
Google DeepMind Announces LLM-Based Robot Controller RT-2
Google DeepMind recently announced Robotics Transformer 2 (RT-2), a vision-language-action (VLA) AI model for controlling robots. RT-2 uses a fine-tuned LLM to output motion control commands. It can perform tasks not explicitly included in its training data and improves on baseline models by up to 3x on emergent skill evaluations.
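The output interface can be sketched concretely: RT-2 discretizes each dimension of the robot action into 256 bins and has the language model emit those bins as tokens, which are then decoded back into continuous motion commands. The Python sketch below illustrates that decoding step; the action ranges and token values are assumptions for illustration.

```python
# Illustrative sketch of the action representation used by RT-2-style VLA
# models: continuous robot commands are discretized into integer bins emitted
# by the LLM as tokens, then decoded back into motion commands. The bin count
# follows the RT-2 paper; action ranges and token values are assumptions.
import numpy as np

NUM_BINS = 256                                     # bins per action dimension

def decode_action(token_bins, low, high):
    """Map integer token bins back to continuous per-dimension commands."""
    bins = np.asarray(token_bins, dtype=np.float64)
    return low + (bins / (NUM_BINS - 1)) * (high - low)

# Hypothetical 7-DoF command: xyz delta, rpy delta, gripper
low = np.array([-0.05, -0.05, -0.05, -0.25, -0.25, -0.25, 0.0])
high = np.array([0.05, 0.05, 0.05, 0.25, 0.25, 0.25, 1.0])

tokens = [128, 200, 90, 127, 127, 127, 255]        # tokens sampled from the fine-tuned LLM
print(decode_action(tokens, low, high))
```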