Large language models Content on InfoQ
-
Building a RAG Application with Spring Boot, Spring AI, MongoDB Atlas Vector Search, and OpenAI
The RAG paradigm redefines AI: it combines generative models and business data for accurate, contextualised responses. The article shows how to integrate Spring Boot, Spring AI, MongoDB Atlas and OpenAI into a powerful and flexible pipeline capable of transforming the way businesses access and create value from data, with applications ranging from finance and healthcare to customer service.
-
Disaggregation in Large Language Models: the Next Evolution in AI Infrastructure
Large Language Model (LLM) inference faces a fundamental challenge: the same hardware that excels at processing input prompts struggles with generating responses, and vice versa. Disaggregated serving architectures solve this by separating these distinct computational phases, delivering throughput improvements and better resource utilization while reducing costs.
-
InfoQ AI, ML and Data Engineering Trends Report - 2025
This InfoQ Trends Report offers readers a comprehensive overview of emerging trends and technologies in the areas of AI, ML, and Data Engineering. It summarizes the views of the InfoQ editorial team and external guests on current trends in AI and ML technologies and what to look out for in the next 12 months.
-
Effective Practices for Architecting a RAG Pipeline
Hybrid search, smart chunking, and domain-aware indexing are key to building effective RAG pipelines. Context window limits and prompt quality critically affect LLM response accuracy. This article provides lessons learned from setting up a RAG pipeline.
-
How Causal Reasoning Addresses the Limitations of LLMs in Observability
Large language models excel at converting observability telemetry into clear summaries but struggle with accurate root cause analysis in distributed systems. LLMs often hallucinate explanations and confuse symptoms with causes. This article suggests how causal reasoning models with Bayesian inference offer more reliable incident diagnosis.
-
MCP: the Universal Connector for Building Smarter, Modular AI Agents
In this article, the authors discuss Model Context Protocol (MCP), an open standard designed to connect AI agents with the tools and data they need. They also cover how MCP empowers agent development and its adoption in leading open-source frameworks.
-
The Virtual Think Tank: Using LLMs to Gain a Multitude of Perspectives
The virtual think tank leverages LLMs to simulate diverse stakeholder and expert perspectives, enabling architects to explore trade-offs, challenge biases, and refine decisions. By prompting personas of real industry experts, the method fosters rich, contextual debates, offering a scalable, low-cost alternative to a traditional think tank.
-
Effective Practices for Coding with a Chat-Based AI
In this article, we explore how AI agents are reshaping software development and the impact they have on a developer's workflow. We introduce a practical approach to staying in control while working with these tools by adopting key best practices from the discipline of software architecture, such as defining an implementation plan and splitting tasks.
-
The State Space Solution to Hallucinations: How State Space Models are Slicing the Competition
AI-powered search tools often hallucinate: they make up facts, misquote sources, and recycle outdated information. The root cause lies in the architecture behind most AI models, the Transformer. In this article, author Albert Lie explains why transformers struggle with hallucinations, how State Space Models (SSMs) offer a solution, and what this shift could mean for the future of AI search.
-
Large Concept Models: a Paradigm Shift in AI Reasoning
Unlike LLMs, Large Concept Models (LCMs) use structured knowledge to grasp relationships between concepts, enhancing the decision-making process and providing a transparent reasoning audit trail. Combining LCMs with LLMs can help build AI systems that analyze complex scenarios and communicate insights effectively, a step towards more reliable and explainable AI.
-
Domain-Driven RAG: Building Accurate Enterprise Knowledge Systems through Distributed Ownership
Retrieval augmented generation (RAG) can help reduce LLM hallucination. Learn how applying high-quality metadata and distributing ownership of documents and prompts to domain experts can further increase accuracy in RAG applications. An additional layer of intelligence can use metadata to focus RAG searches on a specific domain for even better results.
-
InfoQ Software Architecture and Design Trends Report - 2025
The InfoQ Trends Reports offer InfoQ readers a comprehensive overview of key topics worthy of attention. The reports also guide the InfoQ editorial team towards cutting-edge technologies in our reporting. In conjunction with the report and trends graph, our accompanying podcast features insightful discussions among the editors digging deeper into some of the trends.