Google DeepMind Content on InfoQ
-
Google Introduces Nano Banana Pro with Grounded, Multimodal Image Synthesis
Google has released Nano Banana Pro. The system moves beyond conventional diffusion workflows by tightly coupling image generation with Gemini’s multimodal reasoning stack. The result: visuals that are not only aesthetically pleasing, but structurally, contextually, and informationally accurate.
-
Private AI Compute Enables Google Inference with Hardware Isolation and Ephemeral Data Design
Google announced Private AI Compute, a system designed to process AI requests using Gemini cloud models while aiming to keep user data private. The announcement positions Private AI Compute as Google's approach to addressing privacy concerns while providing cloud-based AI capabilities, building on what the company calls privacy-enhancing technologies it has developed for AI use cases.
-
Google DeepMind Introduces CodeMender, an AI Agent for Automated Code Repair
Google DeepMind has introduced CodeMender, a new AI-driven agent designed to detect and patch software vulnerabilities automatically. The project builds on recent advances in reasoning models and program analysis, aiming to reduce the time developers spend identifying and patching security issues.
-
Google DeepMind Launches Gemini 2.5 Computer Use Model to Power UI-Controlling AI Agents
Google DeepMind has recently released the Gemini 2.5 Computer Use model, a specialized variant of its Gemini 2.5 Pro system designed to enable AI agents to interact directly with graphical user interfaces. The new model allows developers to build agents that can click, type, scroll, and manipulate interactive elements on web pages.
-
Dreamer 4: Learning to Achieve Goals from Offline Data through Imagination Training
Researchers from DeepMind have described a new approach for teaching intelligent agents to solve complex, long-term tasks by training them exclusively on video footage rather than through direct interaction with the environment. Their new agent, called Dreamer 4, demonstrated the ability to mine diamonds in Minecraft after being trained on videos, without ever actually playing the game.
-
DeepMind Releases Gemini Robotics-ER 1.5 for Embodied Reasoning
Google DeepMind introduced Gemini Robotics-ER 1.5, a new embodied reasoning model for robotic applications. The model is available in preview through Google AI Studio and the Gemini API.
-
Kaggle Introduces Game Arena to Benchmark AI Models in Strategic Games
Kaggle, in collaboration with Google DeepMind, has introduced Kaggle Game Arena, a platform designed to evaluate artificial intelligence models by testing their performance in strategy-based games.
-
Google DeepMind Launches EmbeddingGemma, an Open Model for On-Device Embeddings
Google DeepMind has introduced EmbeddingGemma, a 308M parameter open embedding model designed to run efficiently on-device. The model aims to make applications like retrieval-augmented generation (RAG), semantic search, and text classification accessible without the need for a server or internet connection.
-
Google DeepMind Unveils AlphaEarth Foundations Model for Global Mapping
Google DeepMind has introduced AlphaEarth Foundations, an artificial intelligence model designed to integrate massive volumes of Earth observation data into a unified digital representation of the planet. The system, described as functioning like a “virtual satellite”, can process petabytes of multimodal inputs.
-
DeepMind Launches Genie 3, a Text-to-3D Interactive World Model
DeepMind has introduced Genie 3, the latest version of its “world model” framework for generating interactive 3D environments directly from text prompts.
-
Google DeepMind Open Sources Aeneas, an AI Model for Analyzing Ancient Texts
Google DeepMind open sourced Aeneas, a generative AI model for understanding ancient inscriptions. Aeneas can process both text and image input and outperforms other state-of-the-art models at restoring missing characters in damaged inscriptions.
-
Google DeepMind Announces Robotics Foundation Model Gemini Robotics On-Device
Google DeepMind introduced Gemini Robotics On-Device, a vision-language-action (VLA) foundation model designed to run locally on robot hardware. The model features low-latency inference and can be fine-tuned for specific tasks with as few as 50 demonstrations.
-
Google DeepMind Unveils AlphaGenome: a Unified AI Model for High-Resolution Genome Interpretation
Google DeepMind has announced the release of AlphaGenome, a new AI model designed to predict how genetic variants affect gene regulation across the entire genome. It represents a significant advancement in computational genomics by integrating long-range sequence context with base-pair resolution in a single, general-purpose architecture.
-
Midjourney Debuts V1 AI Video Model
Midjourney has launched V1, its first video generation model, a web-based tool that allows users to animate still images into 5-second video clips.
-
Google DeepMind Unveils AI Coding Agent AlphaEvolve
Google DeepMind published a paper describing its AlphaEvolve coding agent. AlphaEvolve uses LLMs to discover and optimize algorithms across a range of domains, including hardware design, data center operations, and AI training.