Retrieval-Augmented Generation Content on InfoQ
-
QCon London 2026: Reliable Retrieval for Production AI Systems
At QCon London 2026, Lan Chu, AI tech lead at Rabobank, shared lessons from deploying a production AI search system used internally by more than 300 users across 10,000 documents. Her experience shows that most failures in RAG systems stem from indexing and retrieval rather than from the language model itself.
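One practical consequence of that observation is to evaluate the retriever in isolation, before any language model is involved. The sketch below is illustrative (not from the talk): it measures recall@k against a small gold set of query-to-document mappings, so indexing and retrieval bugs surface on their own.

```python
# Illustrative sketch: evaluating a retriever in isolation with recall@k,
# independent of the language model. All names and data are toy examples.

def recall_at_k(retrieve, gold, k=5):
    """Fraction of queries whose gold document appears in the top-k results.

    retrieve: callable(query) -> ranked list of doc ids
    gold:     dict mapping query -> the doc id that should be retrieved
    """
    hits = sum(1 for q, doc in gold.items() if doc in retrieve(q)[:k])
    return hits / len(gold)

# Toy retriever: naive keyword overlap over a tiny corpus.
corpus = {
    "doc-loans": "mortgage loan interest rates for customers",
    "doc-cards": "credit card fees and limits",
}

def keyword_retrieve(query):
    def overlap(text):
        return len(set(query.lower().split()) & set(text.split()))
    return sorted(corpus, key=lambda d: overlap(corpus[d]), reverse=True)

gold = {"mortgage rates": "doc-loans", "card fees": "doc-cards"}
print(recall_at_k(keyword_retrieve, gold, k=1))  # 1.0 on this toy set
```

Tracking a metric like this per index makes it possible to tell a retrieval regression apart from a generation regression.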
-
Scaling Human Judgment: How Dropbox Uses LLMs to Improve Labeling for RAG Systems
To improve the relevance of responses produced by Dropbox Dash, Dropbox engineers began using LLMs to augment human labeling, which plays a crucial role in identifying the documents that should be used to generate responses. Their approach offers useful insights for any system built on retrieval-augmented generation (RAG).
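The general pattern can be sketched as a triage loop: an LLM judge pre-labels (query, document) pairs, and only low-confidence pairs are escalated to human annotators. The `llm_judge` below is a deterministic stub standing in for a real model call; all names are hypothetical, not Dropbox's API.

```python
# Hypothetical sketch of LLM-augmented relevance labeling: accept confident
# LLM labels automatically, route uncertain pairs to human annotators.

def triage(pairs, llm_judge, threshold=0.8):
    auto, needs_human = [], []
    for query, doc in pairs:
        label, confidence = llm_judge(query, doc)
        if confidence >= threshold:
            auto.append((query, doc, label))   # accept the LLM label
        else:
            needs_human.append((query, doc))   # escalate to annotators
    return auto, needs_human

# Stub judge: "relevant" iff the query string appears in the document.
def llm_judge(query, doc):
    relevant = query.lower() in doc.lower()
    return ("relevant" if relevant else "irrelevant",
            0.95 if relevant else 0.6)

pairs = [("tax form", "Your 2023 tax form is attached"),
         ("tax form", "Lunch menu for Friday")]
auto, human = triage(pairs, llm_judge)
```

The threshold trades label volume against label quality: raising it sends more pairs to humans but keeps the auto-accepted set cleaner.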
-
How Dropbox Built a Scalable Context Engine for Enterprise Knowledge Search
Dropbox engineers have detailed how the company built the context engine behind Dropbox Dash, revealing a shift toward index-based retrieval, knowledge graph-derived context, and continuous evaluation to support enterprise AI at scale.
-
VillageSQL Launches as an Extension-Focused MySQL Fork
A new open-source project, VillageSQL, has been introduced as a tracking fork of MySQL aimed at expanding extensibility and addressing feature gaps increasingly relevant to AI and agent-based workloads.
-
MongoDB Introduces Embedding and Reranking API on Atlas
MongoDB has recently announced the public preview of its Embedding and Reranking API on MongoDB Atlas. The new API gives developers direct access to Voyage AI’s search models within the managed cloud database, enabling them to create features such as semantic search and AI-powered assistants within a single integrated environment, with consolidated monitoring and billing.
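The retrieve-then-rerank pattern such an API supports can be sketched generically: a cheap embedding similarity pass narrows the candidate set, then a reranker re-orders the short list. The embedding and reranking functions below are local stubs, not MongoDB's or Voyage AI's actual endpoints.

```python
# Generic two-stage semantic search sketch (all functions are stubs,
# not the Atlas Embedding and Reranking API).
import math

def embed(text):
    # Stub embedding: bag-of-characters counts; a real model returns
    # dense learned vectors.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord('a')] += 1
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query, docs, rerank, k=10):
    # Stage 1: cheap vector similarity narrows the candidate set.
    q = embed(query)
    candidates = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]
    # Stage 2: the (stubbed) reranker re-orders the short list more precisely.
    return sorted(candidates, key=lambda d: rerank(query, d), reverse=True)

docs = ["apple pie recipe", "car engine manual"]

def overlap_rerank(query, doc):
    return len(set(query.split()) & set(doc.split()))

results = search("apple pie", docs, overlap_rerank)
```

Hosting both stages behind one managed endpoint is what removes the need to stitch together separate embedding and reranking providers.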
-
Amazon S3 Vectors Reaches GA, Introducing "Storage-First" Architecture for RAG
AWS has announced the general availability of Amazon S3 Vectors, increasing per-index capacity forty-fold to 2 billion vectors. By natively integrating vector search into the S3 storage engine, the service introduces a "Storage-First" architecture that decouples compute from storage, reducing total cost of ownership by up to 90% for large-scale RAG workloads.
-
Microsoft Foundry Agent Service Simplifies State Management with Long-Term Memory Preview
Microsoft has launched a public preview of a managed long-term memory store for its Foundry Agent Service. The service automates the extraction, consolidation, and retrieval of user context, providing a native "state layer" that prevents intelligence decay in long-running interactions with AI agents.
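The extract-consolidate-retrieve cycle such a state layer automates can be illustrated with a minimal in-memory sketch (not the Foundry API): facts are pulled from each turn, consolidated by key so newer facts replace stale ones, and retrieved for the next prompt.

```python
# Illustrative long-term-memory sketch; the extractor is a stub where a
# real service would use an LLM to pull facts from free-form conversation.

class MemoryStore:
    def __init__(self):
        self.facts = {}  # key -> value, consolidated across turns

    def extract(self, turn):
        # Stub extractor: treat "key: value" lines as facts.
        for line in turn.splitlines():
            if ":" in line:
                key, value = line.split(":", 1)
                yield key.strip().lower(), value.strip()

    def consolidate(self, turn):
        for key, value in self.extract(turn):
            self.facts[key] = value  # newest value wins

    def retrieve(self, query):
        words = set(query.lower().split())
        return {k: v for k, v in self.facts.items() if k in words}

mem = MemoryStore()
mem.consolidate("name: Ada\nfavorite language: Python")
mem.consolidate("name: Ada Lovelace")   # a later turn refines the fact
```

The consolidation step is what prevents the "intelligence decay" the preview targets: without it, stale and current facts accumulate side by side and contradict each other.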
-
QCon AI New York 2025 Schedule Published, Highlights Practical Enterprise AI
The schedule for QCon AI New York 2025, taking place December 16-17, is now live. Focused on moving AI from proof of concept to production, the program offers a practical roadmap for senior engineers and tech leaders, addressing the real-world challenges of building, scaling, and deploying reliable, enterprise-grade AI systems.
-
Claude Code Gains Support for Remote MCP Servers over Streamable HTTP
Anthropic has recently introduced support for connecting to remote MCP servers in Claude Code, allowing developers to integrate external tools and resources without manual local server setup.
-
Cloudflare AutoRAG Streamlines Retrieval-Augmented Generation
Cloudflare has launched a managed service for using retrieval-augmented generation in LLM-based systems. Now in beta, Cloudflare AutoRAG aims to make it easier for developers to build pipelines that integrate rich context data into LLMs.
-
UC Berkeley's Sky Computing Lab Introduces Model to Reduce AI Language Model Inference Costs
UC Berkeley's Sky Computing Lab has released Sky-T1-32B-Flash, an updated reasoning language model that addresses the common issue of AI overthinking. The model, developed through the NovaSky (Next-generation Open Vision and AI) initiative, "slashes inference costs on challenging questions by up to 57%" while maintaining accuracy across mathematics, coding, science, and general knowledge domains.
-
Microsoft Introduces CoRAG: Enhancing AI Retrieval with Iterative Reasoning
Microsoft AI has introduced Chain-of-Retrieval Augmented Generation (CoRAG), a new AI framework designed to enhance Retrieval-Augmented Generation (RAG) models. Unlike traditional RAG systems, which rely on a single retrieval step, CoRAG enables iterative search and reasoning, allowing AI models to refine their retrievals dynamically before generating answers.
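The control flow of chain-of-retrieval can be sketched conceptually (this is not Microsoft's implementation): the model alternates between retrieving and reasoning, issuing refined sub-queries until it has enough evidence to answer. The retriever and reasoner below are deterministic stubs over a toy two-hop question.

```python
# Conceptual chain-of-retrieval loop: retrieve, reason, refine, repeat.

def corag(question, retrieve, reason, max_steps=4):
    evidence = []
    query = question
    for _ in range(max_steps):
        evidence.extend(retrieve(query))
        next_query, answer = reason(question, evidence)
        if answer is not None:     # enough evidence gathered
            return answer
        query = next_query         # refine the query and retrieve again
    return None

# Stub knowledge base supporting a two-hop question.
kb = {"capital of france": ["France's capital is Paris"],
      "population of paris": ["Paris has about 2.1 million residents"]}

def retrieve(q):
    return kb.get(q.lower(), [])

def reason(question, evidence):
    text = " ".join(evidence)
    if "Paris" not in text:
        return "capital of france", None       # first hop
    if "2.1 million" not in text:
        return "population of paris", None     # second hop
    return None, "About 2.1 million people live in Paris."

answer = corag("How many people live in the capital of France?", retrieve, reason)
```

A single-shot RAG system fails here because no one document answers the original question; the iterative loop is what lets the second retrieval build on the first.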
-
Micronaut Framework 4.7.0 Provides Integration with LangChain4j and Graal Languages
The Micronaut Foundation released Micronaut Framework 4.7.0 in December 2024, four months after the release of version 4.6.0. This version provides LangChain4j support for integrating LLMs into Java applications, while Micronaut Graal Languages provides integration with Graal-based dynamic languages, including the Micronaut GraalPy feature for interacting with Python.
-
AMD and Johns Hopkins Researchers Develop AI Agent Framework to Automate Scientific Research Process
Researchers from AMD and Johns Hopkins University have developed Agent Laboratory, an artificial intelligence framework that automates core aspects of the scientific research process. The system uses large language models to handle literature reviews, experimentation, and report writing, producing both code repositories and research documentation.
-
Amazon Bedrock Introduces Multi-Agent Systems (MAS) with Open Source Framework Integration
Amazon Web Services has released a multi-agent collaboration capability for Amazon Bedrock, introducing a framework for deploying and managing multiple AI agents that collaborate on complex tasks. The system enables specialized agents to work together under a supervisor agent's coordination, addressing challenges developers face with agent orchestration in distributed AI systems.
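The supervisor/worker pattern described above can be sketched minimally; the names and keyword routing below are illustrative stand-ins, not the Bedrock API, where an LLM-based supervisor would decide the delegation.

```python
# Minimal supervisor/worker collaboration sketch: the supervisor inspects
# each task and delegates it to the best-suited specialist agent.

def billing_agent(task):
    return f"billing: resolved '{task}'"

def tech_support_agent(task):
    return f"tech-support: resolved '{task}'"

class Supervisor:
    def __init__(self, specialists):
        self.specialists = specialists  # keyword -> agent callable

    def route(self, task):
        # Keyword matching stands in for the LLM-based delegation a real
        # supervisor agent performs.
        for keyword, agent in self.specialists.items():
            if keyword in task.lower():
                return agent(task)
        return f"supervisor: no specialist for '{task}'"

sup = Supervisor({"invoice": billing_agent, "error": tech_support_agent})
```

Centralizing routing in one supervisor keeps specialists independent of each other, which is the orchestration problem the Bedrock capability targets.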