Large language models Content on InfoQ
-
OpenAI Launches o3-pro Model Focused on Reliability, Amid Mixed User Feedback
OpenAI launched o3-pro, a new version of its most advanced model aimed at delivering more reliable, thoughtful responses across complex tasks. Now available to Pro and Team users in ChatGPT and via the API, o3-pro replaces the earlier o1-pro.
-
Mistral AI Releases Magistral, Its First Reasoning-Focused Language Model
Mistral AI has released Magistral, a new model family built for transparent, multi-step reasoning. Available in open and enterprise versions, it supports structured logic, multilingual output, and traceable decision-making.
-
Meta Introduces V-JEPA 2, a Video-Based World Model for Physical Reasoning
Meta has introduced V-JEPA 2, a new video-based world model designed to improve machine understanding, prediction, and planning in physical environments. The model extends the Joint Embedding Predictive Architecture (JEPA) framework and is trained to predict outcomes in embedding space using video data.
-
Anthropic Releases Claude Code SDK to Power AI-Paired Programming
Anthropic has launched Claude Code SDK, a new toolkit that extends the reach of its code assistant, Claude, far beyond the chat interface. Designed for integration into modern developer workflows, the SDK offers a suite of tools for TypeScript, Python, and the command line, enabling advanced automation of code review, refactoring, and transformation tasks.
-
Mistral Releases Its Own Coding Assistant Mistral Code
Mistral has introduced Mistral Code, a new AI-powered development tool aimed at improving the efficiency and accuracy of coding workflows. Mistral Code utilizes advanced AI models to offer developers intelligent code completion, real-time suggestions, and the capability to interact with the codebase using natural language.
-
QCon AI New York 2025: Program Committee Announced
Meet the QCon AI New York Program Committee, senior software leaders shaping a practical AI conference for engineers building at scale.
-
Google Cloud Run Now Offers Serverless GPUs for AI and Batch Processing
Google Cloud has launched NVIDIA GPU support for Cloud Run, adding scalable GPU resources to its serverless platform. The feature targets AI inference and batch processing, with pay-per-second billing and automatic scaling to zero.
-
Anthropic Open-Sources Tool to Trace the "Thoughts" of Large Language Models
Anthropic researchers have open-sourced the tool they used to trace what goes on inside a large language model during inference. It includes a circuit-tracing Python library that can be used with any open-weights model and a frontend hosted on Neuronpedia to explore the library output through a graph.
-
Introducing ANS: DNS-Inspired Secure Discovery for AI Agents
The Open Worldwide Application Security Project (OWASP) has recently introduced a new standard for securely discovering AI agents. Inspired by DNS, the Agent Name Service (ANS) provides a protocol-agnostic registry mechanism that uses Public Key Infrastructure (PKI) to establish agent identity and trust.
-
AWS Introduces Open Source Model Context Protocol Servers for ECS, EKS, and Serverless
AWS has launched open-source Model Context Protocol (MCP) servers on GitHub to support AI development within Amazon ECS, EKS, and Serverless environments. These specialized servers give developers real-time, context-specific insights for application deployment, troubleshooting, and operations.
-
Introducing Embabel: Advanced AI Agent Development for Java Applications
The Embabel Agent Framework, developed by Spring founder Rod Johnson, is a platform for building AI applications on the JVM. By integrating structured agent development with Goal-Oriented Action Planning, Embabel combines strong typing with dynamic planning, aiming for reliable, adaptable, and type-safe solutions for enterprise Java applications.
-
Perplexity Introduces Labs for Project-Based AI Workflows
Perplexity has released Labs, a new feature for Pro subscribers designed to support more complex tasks beyond question answering. The update marks a shift from search-based interactions toward structured, multi-step workflows powered by generative AI.
-
Google Brings Gemini Nano to ML Kit with New On-Device GenAI APIs
The new GenAI APIs recently added to ML Kit enable developers to use Gemini Nano for on-device inference in Android apps, supporting features like summarization, proofreading, rewriting, and image description.
-
Anthropic Introduces Claude 4 Family and Claude Code
Anthropic released Claude Opus 4 and Claude Sonnet 4, the newest versions of its Claude series of LLMs. Both models support extended thinking, tool use, and memory improvements, and Claude Opus 4 outperforms other LLMs on coding benchmarks.
-
Amazon Open Sources Strands Agents SDK for Building AI Agents
Amazon has released Strands Agents, an open source SDK that simplifies AI agent development through a model-driven approach. The framework enables developers to build agents by defining prompts and tool lists with minimal code.