Large language models Content on InfoQ
-
xAI Releases Grok Skills and Updates Tool Calling Responses API
xAI has released Grok Skills together with enhancements to the Responses API for Grok 4.3, enabling persistent custom expertise that the model retains across all conversations.
-
Six Sessions at QCon AI Boston 2026 That Take Productionizing AI Seriously
QCon AI Boston 2026 is close to selling out. In these six sessions, speakers engage directly with the gap between AI working in a demo and AI working in production.
-
With Android CLI, Google is Making the Android Toolchain Agent-Friendly
Google introduced new Android development tools that enable building apps up to 3x faster by using AI agents, including a redesigned Android command-line interface (CLI), structured "skills", and an integrated knowledge base. These tools are designed to support agent-driven workflows and are compatible with third-party agents such as Claude Code and Codex, in addition to Google Gemini.
-
OpenAI Open-Sources Symphony, a SPEC.md for Autonomous Coding Agent Orchestration
OpenAI Symphony is an agent orchestrator that uses project-management tools, like issue trackers, as a control plane to coordinate multiple coding agents. Instead of developers managing interactive coding sessions, Symphony manages "tasks" by assigning each one to a dedicated agent that works autonomously to completion. Once a task is finished, a human is responsible for reviewing the resulting output.
-
Ubuntu Embraces Local AI instead of Cloud-First OS Integration
Ubuntu has outlined its AI strategy, describing it as a deliberate departure from industry trends towards cloud-centric, AI-first operating systems. Instead, the company says, Ubuntu will focus future releases on local intelligence, modular design, and strict user control.
-
Anthropic Introduces Routines for Claude Code Automation
Anthropic has introduced a new feature called Routines for Claude Code, allowing developers to configure automated coding workflows that run on schedules, through API calls, or in response to external events.
-
Anthropic Traces Six Weeks of Claude Code Quality Complaints to Three Overlapping Product Changes
Anthropic published a postmortem tracing six weeks of Claude Code quality complaints to three overlapping product-layer changes: a reasoning effort downgrade, a caching bug that progressively erased the model's own thinking, and a system prompt verbosity limit that caused a 3% quality drop. The API and model weights were unaffected. All issues were resolved April 20.
-
Anthropic Launches Claude Platform on AWS
Anthropic has announced the general availability of Claude Platform on AWS, a new deployment option that gives AWS customers direct access to Anthropic’s native Claude platform using AWS authentication, billing, and monitoring services.
-
Coder Agents Enable Running AI Coding Workflows on Self-Hosted Infrastructure
Coder Agents is a model-agnostic platform designed to let organizations run AI coding agents on their own infrastructure, rather than relying on cloud-based services. This allows teams to maintain full control over code, data, and execution environments.
-
OpenAI Introduces Websocket-Based Execution Mode to Reduce Latency in Agentic Workflows
OpenAI has introduced a WebSocket-based execution mode for its Responses API to improve agentic workflow performance in coding agents and real-time AI systems. The update reduces latency by up to 40 percent by replacing HTTP request-response cycles with persistent connections, improving streaming, tool execution, and multi-step orchestration in production-scale AI systems.
-
Google New TPU Generation is Specifically Designed for Agents and SOTA Model Training
Google has unveiled a new generation of Tensor Processing Units (TPUs), featuring two specialized chips designed to accelerate model training and agent workflows, which require continuous, multi-step reasoning and action loops distributed across multiple models. The new TPUs deliver better performance, memory, and energy efficiency, the company says.
-
Mistral Adds Remote Agents and Work Mode to Le Chat
Mistral has released Mistral Medium 3.5, a 128-billion parameter model designed to handle instruction following, reasoning, and coding within a single system, and introduced new cloud-based agent capabilities in its Vibe and Le Chat products.
-
Cloudflare Builds High-Performance Infrastructure for Running LLMs
Cloudflare has recently announced new infrastructure designed to run large AI language models across its global network. As these models rely on costly hardware and must handle large volumes of incoming and outgoing text, Cloudflare separates the model's input processing and output generation onto different optimized systems.
-
NVIDIA Launches Ising Open Models for Quantum Computing
NVIDIA has announced a new family of open models called NVIDIA Ising, designed to address quantum processor calibration and quantum error correction. These are two of the main engineering challenges limiting the scalability of current quantum systems, where noise and instability in qubits reduce the reliability of computations.
-
Cloudflare Announces Agent Memory, a Managed Persistent Memory Service for AI Agents
Cloudflare announced Agent Memory in private beta, a managed service that extracts structured memories from AI agent conversations and retrieves them on demand using five-channel parallel retrieval with Reciprocal Rank Fusion. Shared memory profiles let teams of agents access common knowledge. Competitors include Mem0, Zep, LangMem, and Letta.
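For readers unfamiliar with Reciprocal Rank Fusion, the general technique can be sketched as follows. This is a minimal illustration of textbook RRF, not Cloudflare's implementation: the function name, the example channels, and the smoothing constant `k=60` (a commonly used default) are assumptions, and the summary above does not say which five channels or which constants Agent Memory uses.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists into one ordering.

    rankings: list of lists, each ordered best-first (hypothetical
              retrieval channels, e.g. keyword, vector, recency).
    k:        smoothing constant; 60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top item in a list contributes 1 / (k + 1).
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by item name for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Three hypothetical channels returning memory IDs, best-first.
channels = [
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
]
fused = reciprocal_rank_fusion(channels)
print(fused)
```

Because each list contributes only a rank-based score, the channels do not need comparable relevance scores, which is one reason RRF is a popular choice for merging heterogeneous retrievers.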