Google Content on InfoQ
-
Google LiteRT-LM Speeds up Local Inference up to 2.2x with Gemma 4 Multi-Token Prediction
LiteRT-LM brings native support for Gemma 4 Multi-Token Prediction (MTP) drafters, enabling up to 2.2x faster inference. The framework is also expanding beyond Kotlin and C++, adding new Swift and JavaScript APIs.
-
Google Workspace CLI: Unified Command-Line Tool Built for Humans and AI Agents
Google has released a new CLI for Google Workspace, offering a unified interface for various services like Drive, Gmail, and Calendar. Built in Rust, the tool dynamically adjusts to API changes and features over 100 bundled skills. It requires Node.js and a Google Cloud project for setup. Initial community feedback is mixed, highlighting both its dynamic capabilities and setup challenges.
-
Google Cloud Suspends Railway's Production Account, Causing Eight-Hour Platform-Wide Outage
Google Cloud's automated systems suspended Railway's production account without notice, triggering an eight-hour platform-wide outage affecting 3 million users. The cascade took down workloads across all providers, including AWS and bare metal, because Railway's control plane was hosted on GCP. Railway is demoting GCP to backup-only status.
-
Google Expands SynthID Adoption for AI Watermarking, Previews Content Detection API
Google's SynthID, designed to embed imperceptible signals into AI-generated content, is adding a new Content Detection API on Google Cloud's Gemini Enterprise Agent Platform, after gaining adoption by several industry players including Nvidia and OpenAI.
-
Gemma 4 Multi-Token Prediction Delivers up to ~3x Faster Token Generation
Gemma 4 can be paired with multi-token prediction (MTP) drafters that use speculative decoding to generate multiple tokens in parallel, allowing the model to verify them in a single pass and achieve up to ~3× faster inference without quality loss.
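
The draft-then-verify loop described above can be illustrated with a toy sketch. This is not Gemma's or LiteRT-LM's API: the function names (`speculative_step`, `generate`, `toy_target`, `toy_draft`) are hypothetical, the "models" are plain Python functions over integer tokens, and the sketch uses the simple greedy-verification variant of speculative decoding. A real runtime would score all drafted positions in one batched forward pass of the large model; the loop here only makes the acceptance logic visible.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]  # greedy "model": context -> next token


def speculative_step(target: NextToken, draft: NextToken,
                     context: List[Token], k: int) -> List[Token]:
    """One draft-then-verify step (greedy variant). Returns the accepted tokens."""
    # 1. The cheap drafter proposes k tokens autoregressively.
    proposed: List[Token] = []
    for _ in range(k):
        proposed.append(draft(context + proposed))

    # 2. The target checks each drafted position in order.
    #    (Assumption: a real implementation does this in a single batched pass.)
    accepted: List[Token] = []
    for tok in proposed:
        expected = target(context + accepted)
        if expected != tok:
            # First mismatch: keep the target's own token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(tok)

    # 3. Every draft was accepted, so the target contributes one extra token.
    accepted.append(target(context + accepted))
    return accepted


def generate(target: NextToken, draft: NextToken, prompt: List[Token],
             max_new_tokens: int, k: int = 4) -> List[Token]:
    """Repeat speculative steps until max_new_tokens tokens have been produced."""
    out = list(prompt)
    while len(out) - len(prompt) < max_new_tokens:
        out.extend(speculative_step(target, draft, out, k))
    return out[:len(prompt) + max_new_tokens]


# Hypothetical toy models: the target counts upward modulo 10;
# the drafter agrees except right after a 4, where it guesses wrong.
def toy_target(ctx: List[Token]) -> Token:
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10
```

Because every accepted token is one the target would have chosen itself, the output matches target-only greedy decoding; any speedup comes from accepting several drafted tokens per target pass.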
-
Google Introduces Middleware Architecture for Genkit Applications
Google has introduced Middleware for Genkit, its open-source framework for building AI-powered and agentic applications. The update adds a programmable interception layer around model calls, tool execution, and generation loops, giving developers more control over reliability, safety, and orchestration inside production AI systems.
-
With Android CLI, Google is Making the Android Toolchain Agent-Friendly
Google introduced new Android development tools that enable building apps up to 3x faster by using AI agents, including a redesigned Android command-line interface (CLI), structured skills, and an integrated knowledge base. These tools are designed to support agent-driven workflows and are compatible with third-party agents such as Claude Code and Codex, in addition to Google Gemini.
-
Google's New TPU Generation is Specifically Designed for Agents and SOTA Model Training
Google has unveiled a new generation of Tensor Processing Units (TPUs), featuring two specialized chips designed to accelerate model training and agent workflows, which require continuous, multi-step reasoning and action loops distributed across multiple models. The new TPUs deliver better performance, memory, and energy efficiency, the company says.
-
AWS Interconnect Reaches General Availability with Managed Multicloud and Last-Mile Connectivity
AWS Interconnect reached general availability, offering managed private Layer 3 connections to Google Cloud and a last-mile capability via Lumen. Azure and OCI support is planned for later in 2026. AWS published the underlying specification on GitHub under Apache 2.0, which Forrester analysts read as a play to set a de facto standard for multicloud connectivity.
-
Google Introduces Room 3.0: a Kotlin-First, Async, Multiplatform Persistence Library
Room 3.0 is a major update to Android's persistence library that introduces breaking changes in key areas. The new release focuses on modernizing the Android persistence layer around Kotlin Multiplatform and expands platform support to include JavaScript and WebAssembly.
-
Subagents in Gemini CLI Enable Task Delegation and Parallel Agent Workflows
Google has introduced subagents in Gemini CLI, a new capability designed to help developers delegate complex or repetitive tasks to specialized AI agents operating alongside a primary session.
-
Google ADK for Java 1.0 Introduces New App and Plugin Architecture, External Tools Support, and More
Google's Agent Development Kit for Java reached 1.0, introducing integrations with new external tools, a new app and plugin architecture, advanced context engineering, human-in-the-loop workflows, and more.
-
Google’s Aletheia Advances the State of the Art of Fully Autonomous Agentic Math Research
Google announced Aletheia, an AI using Gemini 3 Deep Think that solved 6/10 novel math problems in the FirstProof challenge. Aletheia also scored ~91.9% on IMO-ProofBench, signaling a significant shift in automated research-level proof discovery without human intervention.
-
Google Brings MCP Support to Colab, Enabling Cloud Execution for AI Agents
Google has released the open-source Colab MCP Server, enabling AI agents to directly interact with Google Colab through the Model Context Protocol (MCP). The project is designed to bridge local agent workflows with cloud-based execution, allowing developers to offload compute-intensive or potentially unsafe tasks from their own machines.
-
Google Open Sources Experimental Multi-Agent Orchestration Testbed Scion
Designed to manage concurrent agents running in containers across local and remote compute, Scion is an experimental orchestration testbed that enables developers to run groups of specialized agents with isolated identities, credentials, and shared workspaces.