Large Language Models Content on InfoQ
-
Why Observability Matters (More!) with AI Applications
Sally O'Malley shares how to build an AI observability stack with open-source tools (Prometheus, Grafana, OpenTelemetry, Tempo, vLLM/Llama Stack). Learn to track performance, quality, and cost signals.
-
Deploy MultiModal RAG Systems with vLLM
Stephen Batifol explains the core concepts of multimodal RAG systems, vector search indexes (HNSW, IVF), and embedding model selection. He details using vLLM and Pixtral for optimized inference.
-
Chatting with Your Knowledge Graph
Jonathan Lowe discusses how to enable an LLM to chat with a structured graph database. He explains the process of using semantic search and knowledge graphs to answer natural language questions.
-
The Data Backbone of LLM Systems
Paul Iusztin discusses the evolution of AI engineering, highlighting the shift from model training to foundation models. He shares insights on scalable LLM systems and optimizing RAG.
-
Enhance LLMs’ Explainability and Trustworthiness with Knowledge Graphs
Leann Chen discusses how knowledge graphs provide structured data to enhance LLM accuracy, tackling common challenges like hallucinations and the "lost-in-the-middle" phenomenon in RAG systems.
-
AI Agents & LLMs: Scaling the Next Wave of Automation
The panelists discuss AI agents and LLMs, exploring their definitions, architectures, use cases, reliability, and impact on the SDLC and future of automation.
-
A Framework for Building Micro Metrics for LLM System Evaluation
Denys Linkov discusses critical lessons for senior developers and leaders on building robust LLM systems and actionable metrics that prevent production issues and drive business value.
-
Scaling Large Language Model Serving Infrastructure at Meta
Ye (Charlotte) Qi explains key considerations for optimizing LLM inference, including hardware, latency, and production scaling strategies.
-
How Green is Green: LLMs to Understand Climate Disclosure at Scale
Leo Browning explains the journey of developing a Retrieval Augmented Generation (RAG) system at a climate-focused startup.
-
LLM and Generative AI for Sensitive Data - Navigating Security, Responsibility, and Pitfalls in Highly Regulated Industries
Stefania Chaplin and Azhir Mahmood discuss responsible, secure, and explainable AI in regulated industries. Learn MLOps, legislation, and future trends.
-
Unleashing Llama's Potential: CPU-Based Fine-Tuning
Anil Rajput and Rema Hariharan detail CPU-based LLM (Llama) optimization strategies for performance and TCO reduction.
-
Navigating LLM Deployment: Tips, Tricks, and Techniques
Meryem Arik shares best practices for self-hosting LLMs in corporate environments, highlighting the importance of cost efficiency and performance optimization.