AI, ML & Data Engineering Content on InfoQ
-
Reducing False Positives in Retrieval-Augmented Generation (RAG) Semantic Caching: a Banking Case Study
In this article, author Elakkiya Daivam discusses why Retrieval-Augmented Generation (RAG) and semantic caching techniques are powerful levers for reducing false positives in AI-powered applications. She shares insights from a production-grade evaluation of 1,000 query variations tested across seven bi-encoder models.
-
Training Data Preprocessing for Text-to-Video Models
In this article, author Aleksandr Rezanov discusses data preparation for generative text-to-image models to accelerate work on video generation services for TV series and films. He explains how the data is prepared and how the process can serve as a starting point for creating custom datasets to develop proprietary models.
-
A Plan-Do-Check-Act Framework for AI Code Generation
AI code generation tools promise faster development but often create quality issues, integration problems, and delivery delays. A structured Plan-Do-Check-Act cycle can maintain code quality while leveraging AI capabilities. Through working agreements, structured prompts, and continuous retrospection, it asserts accountability over code while guiding AI to produce tested, maintainable software.
-
Disaggregation in Large Language Models: the Next Evolution in AI Infrastructure
Large Language Model (LLM) inference faces a fundamental challenge: the same hardware that excels at processing input prompts struggles with generating responses, and vice versa. Disaggregated serving architectures solve this by separating these distinct computational phases, delivering throughput improvements and better resource utilization while reducing costs.
-
InfoQ AI, ML and Data Engineering Trends Report - 2025
This InfoQ Trends Report offers readers a comprehensive overview of emerging trends and technologies in the areas of AI, ML, and Data Engineering. This report summarizes the views of the InfoQ editorial team and external guests on the current trends in AI and ML technologies and what to look out for in the next 12 months.
-
Effective Practices for Architecting a RAG Pipeline
Hybrid search, smart chunking, and domain-aware indexing are key to building effective RAG pipelines. Context window limits and prompt quality critically affect LLM response accuracy. This article provides lessons learned from setting up a RAG pipeline.
-
How Causal Reasoning Addresses the Limitations of LLMs in Observability
Large language models excel at converting observability telemetry into clear summaries but struggle with accurate root cause analysis in distributed systems. LLMs often hallucinate explanations and confuse symptoms with causes. This article suggests how causal reasoning models with Bayesian inference offer more reliable incident diagnosis.
-
MCP: the Universal Connector for Building Smarter, Modular AI Agents
In this article, the authors discuss Model Context Protocol (MCP), an open standard designed to connect AI agents with the tools and data they need. They also talk about how MCP empowers agent development, and its adoption in leading open-source frameworks.
-
The Missing Layer in AI Infrastructure: Aggregating Agentic Traffic
In this article, author Eyal Solomon discusses AI Gateways, the outbound proxy servers that intercept and manage AI-agent-initiated traffic in real time to enforce policies and provide central management.
-
Infusing AI into Your Java applications
Equip yourself with the basic AI knowledge and skills you need to start building intelligent and responsive Enterprise Java applications. With the help of our simple chatbot application for booking interplanetary space trips, see how Java frameworks like LangChain4j with Quarkus make it easy and efficient to interact with LLMs and create satisfying applications for end users.
-
Building Reproducible ML Systems with Apache Iceberg and SparkSQL: Open Source Foundations
Traditional data lakes are great for storing massive amounts of stuff, but they're terrible at the transactional guarantees and versioning that ML workloads desperately need. Apache Iceberg and SparkSQL bring database-like reliability to your data lake. Time travel, schema evolution, and ACID transactions help support reproducible machine learning experiments.
-
A First-Timer’s Guide to Curating a Technical Conference Track
One first-time track host shares the process, constraints, and takeaways from building a track from scratch at QCon London 2025.