AI, ML & Data Engineering Content on InfoQ
-
Microsoft Adds Agent Mode and Office Agent to Office Applications
Microsoft has expanded its Microsoft 365 Copilot platform with Agent Mode and Office Agent. The update moves Copilot beyond a conversational assistant into a system capable of running continuous, multi-step workflows across Microsoft 365 applications.
-
Researchers Introduce ACE, a Framework for Self-Improving LLM Contexts
Researchers from Stanford University, SambaNova Systems, and UC Berkeley have proposed Agentic Context Engineering (ACE), a new framework designed to improve large language models (LLMs) through evolving, structured contexts rather than weight updates. The method, described in a paper, seeks to make language models self-improving without retraining.
-
Google’s Open Source Gemini CLI Extensions Let Developers Build Custom AI-Powered Workflows
Google has launched Gemini CLI Extensions, an open-source framework that lets developers create and share integrations. With a modular architecture and playbooks that guide tool interaction, Gemini CLI becomes a central hub for AI-assisted workflows. The platform fosters collaboration with prominent partners, enabling an ecosystem of personalized developer tools.
-
AWS Launches Amazon Quick Suite, an Agentic AI Workspace
AWS has launched Amazon Quick Suite, a new AI-powered workspace designed to connect company data, automate workflows, and perform actions across business applications.
-
Hugging Face Introduces RTEB, a New Benchmark for Evaluating Retrieval Models
Hugging Face unveils the Retrieval Embedding Benchmark (RTEB), a pioneering framework to assess embedding models' real-world retrieval accuracy. By merging public and private datasets, RTEB narrows the "generalization gap," ensuring models perform reliably across critical sectors. Now live and inviting collaboration, RTEB aims to set a community standard in AI retrieval evaluation.
-
10 AI-Related Standout Sessions at QCon San Francisco 2025
Join us at QCon San Francisco 2025 (Nov 17–21) for a three-day deep dive into the future of software development, exploring AI’s transformative impact. As a program committee member, I’m excited to showcase tracks that tackle real-world challenges, featuring industry leaders and sessions on AI, LLMs, and engineering mindsets. Don’t miss out!
-
Paper2Agent Converts Scientific Papers into Interactive AI Agents
Stanford's Paper2Agent framework revolutionizes research by transforming static papers into interactive AI agents that execute analyses and respond to queries. Leveraging the Model Context Protocol, it simplifies reproducibility and enhances accessibility, empowering users with dynamic, autonomous tools for deeper scientific exploration and understanding.
-
Genkit Extension for Gemini CLI Brings Framework-Aware AI Assistance to the Terminal
Introducing Google's Genkit Extension for Gemini CLI: a groundbreaking tool that delivers framework-aware AI assistance directly to the terminal. Streamline your Genkit application development with context-aware code generation, debugging, and best practices—all without leaving the command line. Unleash productivity and innovation in building generative AI applications.
-
GitHub MCP Registry Offers a Central Hub for Discovering and Deploying MCP Servers
GitHub has recently launched its Model Context Protocol (MCP) Registry, designed to help developers discover and use AI tools directly from within their working environment. The registry currently lists over 40 MCP servers from Microsoft, GitHub, Dynatrace, Terraform, and many others.
-
OpenAI Adds Full MCP Support to ChatGPT Developer Mode
OpenAI has rolled out full Model Context Protocol (MCP) support in ChatGPT, bringing developers a long-requested feature: the ability to use custom connectors for both read and write actions directly inside chats. The feature, now in beta under Developer Mode, effectively turns ChatGPT into a programmable automation hub capable of interacting with external systems or internal APIs.
-
OpenAI Study Investigates the Causes of LLM Hallucinations and Potential Solutions
In a recent research paper, OpenAI suggested that the tendency of LLMs to hallucinate stems from the way standard training and evaluation methods reward guessing over acknowledging uncertainty. According to the study, this insight could pave the way for new techniques to reduce hallucinations and build more trustworthy AI systems, but not all agree on what hallucinations are in the first place.
-
Claude Sonnet 4.5 Tops SWE-Bench Verified, Extends Coding Focus beyond 30 Hours
Anthropic's Claude Sonnet 4.5, its most advanced coding model, excels in task performance and safety, achieving a 98.7% safety score and improving real-world coding capabilities. Enhanced reasoning skills allow for sustained multi-step tasks, with notable user gains reported. This drop-in replacement demonstrates a powerful balance of capability and security for users.
-
PlanetScale Extends Database Platform to PostgreSQL
PlanetScale has announced the general availability of its managed sharded Postgres service, built for performance and reliability on AWS or Google Cloud. The launch extends PlanetScale's offerings to PostgreSQL users, adding to the company's existing popular MySQL-based platform built on top of Vitess.
-
Google DeepMind Introduces CodeMender, an AI Agent for Automated Code Repair
Google DeepMind has introduced CodeMender, a new AI-driven agent designed to detect, fix, and secure software vulnerabilities automatically. The project builds on recent advances in reasoning models and program analysis, aiming to reduce the time developers spend identifying and patching security issues.
-
OpenAI DevDay 2025 Introduces GPT-5 Pro API, Agent Kit, and More
At DevDay 2025, OpenAI unveiled AgentKit and the GPT-5 Pro and Sora 2 models, enabling interactive software experiences directly within ChatGPT. This shift towards "apps inside ChatGPT" fosters collaboration and commercialization in conversations. Enhanced self-hosting options and robust SDKs empower developers and streamline workflows, positioning OpenAI at the forefront of AI innovation.