InfoQ Dev Summit Boston 2025 Content on InfoQ
-
Fix SLO Breaches before They Repeat: an SRE AI Agent for Application Workloads
Bruno Borges explains how to automate SLO breach diagnostics using SRE agents and MCP tools. He shares methodologies for identifying bottlenecks and balancing speed, cost, and reliability.
-
DevOps Is for Product Engineers, Too
Lesley Cordero explains how DevOps and platform engineering drive sociotechnical excellence. She shares strategies for joint optimization, distributed leadership, and organizational sustainability.
-
Powering Enterprise AI Applications with Data and Open Source Software
Francisco Javier Arceo explores Feast, the open-source feature store designed to address common data challenges in the AI/ML lifecycle, such as feature redundancy and low-latency serving at scale.
-
Architecting Planet Scale, Modern Apps in the Cloud
George Mao discusses the five stages of maturity for building a planet-scale, global, and highly available architecture using fully serverless, cloud-native services (GCP/AWS).
-
Empathy Driven Platforms: You Build It, Let’s Run It Together
Erin Doyle shares the "ultimate attack" for conquering engineering roadblocks: building Empathy-Driven Platforms. Learn how to bridge the Platform/Product Dev divide and boost productivity.
-
AI-Driven Software Delivery: Leveraging Lean, ChOP & LLMs to Create More Effective Learning Experiences at QCon
Wes Reisz details building a RAG-powered QCon certification in 4 weeks. He dives into the serverless pipeline, RAG architecture, lessons on using supervised coding agents, and Lean thinking.
-
Growing and Cultivating Strong Machine Learning Engineers
Vivek Gupta explains how to nourish and cultivate Machine Learning engineers, detailing the unique production-ML skills required for scaling, governance, and LLMOps.
-
You Are Asking the Wrong Questions (about Reliability and SRE)
David Blank-Edelman shares seven SRE-focused questions for reliability, challenging conventional wisdom on root cause analysis, toil elimination, and the true meaning of resilience.
-
Vector Sync Patterns: Keeping AI Features Fresh When Your Data Changes
Ricardo Ferreira discusses five advanced Vector Sync Patterns to tackle multi-dimensional vector staleness and integration challenges in modern AI/microservices architectures.
-
Why Observability Matters (More!) with AI Applications
Sally O'Malley shares how to build an AI observability stack with open-source tools (Prometheus, Grafana, OpenTelemetry, Tempo, vLLM/Llama Stack). Learn to track performance, quality, and cost signals.
-
Systems Thinking for Scaling Responsible Multi-Agent Architectures
Nimisha Asthagiri explains how systems thinking and Causal Flow Diagrams are tools to navigate the unintended consequences of increasingly complex multi-agent AI systems and ensure responsible AI.
-
From Grassroots to Enterprise: Vanguard's Journey in SRE Transformation
Christina Yakomin shares Vanguard's SRE journey: from monolithic releases to DevOps, coaching, request-rate autoscaling, and resiliency testing for AI-backed systems.