AI, ML & Data Engineering Content on InfoQ
-
QCon London 2026: Reliable Retrieval for Production AI Systems
At QCon London 2026, Lan Chu, AI tech lead at Rabobank, shared lessons from deploying a production AI search system used internally by more than 300 users across 10,000 documents. Her experience shows that most failures in RAG systems stem from indexing and retrieval, rather than the language model itself.
-
AI Is Amplifying Software Engineering Performance, Says the 2025 DORA Report
Artificial intelligence is rapidly reshaping the way software is built, but its impact is more nuanced than many organizations expected. The 2025 DevOps Research and Assessment (DORA) report, titled State of AI-Assisted Software Development, finds that AI does not automatically improve software delivery performance.
-
QCon London 2026: Behind Booking.com's AI Evolution: the Unpolished Story
Jabez Eliezer Manuel, senior principal engineer at Booking.com, presented “Behind Booking.com's AI Evolution: the Unpolished Story” at QCon London 2026. Manuel discussed how Booking.com has evolved over the past 20 years and the challenges it faced on its journey to incorporate AI.
-
DoorDash Builds DashCLIP to Align Images, Text, and Queries for Semantic Search Using 32M Labels
DoorDash has launched a multimodal machine learning system that aligns product images, text, and user queries in a shared embedding space. Trained on 32 million labeled query-product pairs using contrastive learning, the system improves semantic search, product ranking, and advertising relevance. The embeddings also support other machine learning tasks across the marketplace.
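DoorDash's actual training code is not shown in the announcement; the following is a minimal NumPy sketch of the CLIP-style symmetric contrastive objective commonly used to align paired embeddings in a shared space. All names and the temperature value are illustrative assumptions, not details from DashCLIP.

```python
import numpy as np

def contrastive_loss(query_emb, product_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of (query, product) pairs.

    Row i of each matrix is the embedding for the i-th labeled pair,
    so the diagonal of the similarity matrix holds the positives.
    """
    # L2-normalize so dot products become cosine similarities
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    p = product_emb / np.linalg.norm(product_emb, axis=1, keepdims=True)

    logits = q @ p.T / temperature  # (batch, batch) similarity matrix
    labels = np.arange(len(q))      # matched pairs sit on the diagonal

    def cross_entropy(lg, lb):
        lg = lg - lg.max(axis=1, keepdims=True)  # numerical stability
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(lb)), lb].mean()

    # Pull each query toward its product and vice versa
    return 0.5 * (cross_entropy(logits, labels) + cross_entropy(logits.T, labels))
```

Minimizing this loss pulls matched query/product embeddings together and pushes mismatched ones apart, which is what makes a single embedding space usable for search, ranking, and relevance tasks at once.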
-
Google Researchers Propose Bayesian Teaching Method for Large Language Models
Google Research has proposed a training method that teaches large language models to approximate Bayesian reasoning by learning from the predictions of an optimal Bayesian system. The approach focuses on improving how models update beliefs as they receive new information during multi-step interactions.
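The summary does not spell out the "optimal Bayesian system" the models learn to imitate; as a hedged illustration of the kind of belief updating being approximated, here is an exact Bayes update over discrete hypotheses. The coin-flip scenario and all names are hypothetical examples, not Google's setup.

```python
def bayes_update(prior, likelihoods, observation):
    """One step of exact Bayesian belief updating over discrete hypotheses.

    prior:       dict hypothesis -> prior probability
    likelihoods: dict hypothesis -> dict observation -> P(obs | hypothesis)
    """
    unnorm = {h: prior[h] * likelihoods[h][observation] for h in prior}
    z = sum(unnorm.values())  # normalizing constant P(observation)
    return {h: p / z for h, p in unnorm.items()}

# Multi-step interaction: beliefs about a coin updated flip by flip
likelihoods = {
    "fair":   {"H": 0.5, "T": 0.5},
    "biased": {"H": 0.9, "T": 0.1},
}
belief = {"fair": 0.5, "biased": 0.5}
for flip in ["H", "H", "T"]:
    belief = bayes_update(belief, likelihoods, flip)
```

Teaching a model to match these posteriors across a sequence of observations is, loosely, what it means for an LLM to "update beliefs as it receives new information during multi-step interactions."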
-
DoorDash Builds LLM Conversation Simulator to Test Customer Support Chatbots at Scale
DoorDash engineers built a simulation and evaluation flywheel to test large language model customer support chatbots at scale. The system generates multi-turn synthetic conversations using historical transcripts and backend mocks, evaluates outcomes with an LLM-as-judge framework, and enables rapid iteration on prompts, context, and system design before production deployment.
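DoorDash's flywheel uses LLMs for both the synthetic user and the judge; the sketch below replaces those calls with deterministic stubs to show the shape of the simulate-then-judge loop. All function names and the pass criterion are illustrative assumptions.

```python
def simulate_conversation(bot, user_turns):
    """Replay synthetic user turns against a chatbot, collecting the transcript.

    In the real system, user turns would be generated by an LLM seeded with
    historical transcripts, and `bot` would call backend mocks.
    """
    transcript = []
    for turn in user_turns:
        transcript.append({"user": turn, "bot": bot(turn)})
    return transcript

def judge(transcript, required_phrase):
    """Toy stand-in for an LLM-as-judge: pass if any bot reply
    contains the phrase indicating the issue was resolved."""
    resolved = any(required_phrase in t["bot"] for t in transcript)
    return {"resolved": resolved, "turns": len(transcript)}

# A trivial rule-based bot standing in for the production chatbot
def bot(msg):
    return "Your refund has been issued." if "refund" in msg else "Could you clarify?"

verdict = judge(
    simulate_conversation(bot, ["My order is wrong", "I want a refund"]),
    "refund has been issued",
)
```

Because the judge produces a structured verdict per simulated conversation, engineers can change a prompt or context strategy, re-run thousands of simulations, and compare aggregate pass rates before anything reaches production.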
-
AWS Launches Strands Labs for Experimental AI Agent Projects
Amazon Web Services has introduced Strands Labs, a new GitHub organization created to host experimental projects related to agent-based AI development.
-
Claude Opus 4.6 Introduces Adaptive Reasoning and Context Compaction for Long-Running Agents
Anthropic’s Claude Opus 4.6 introduces "Adaptive Thinking" and a "Compaction API" to address context rot in long-running agents. The model supports a 1M token context window with 76% multi-needle retrieval accuracy. While it leads benchmarks in agentic coding, independent tests show a 49% detection rate for binary backdoors, highlighting the gap between state-of-the-art claims and production security.
-
AI-Powered Bot Compromises GitHub Actions Workflows across Microsoft, DataDog, and CNCF Projects
The AI-powered bot hackerbot-claw exploited GitHub Actions workflows across Microsoft, DataDog, and CNCF projects over seven days using five attack techniques. The bot achieved remote code execution in five of seven targets, stole a GitHub token from awesome-go (140k stars), and fully compromised Aqua Security's Trivy. The campaign included the first documented AI-on-AI attack, in which the bot attempted prompt injection against Claude Code.
-
GitLab Suggests AI Can Detect Vulnerabilities But it's AI Governance That Determines Risk
Artificial intelligence is rapidly transforming how software vulnerabilities are detected, but questions about who governs the risks AI exposes, and how those risks are acted on, are becoming increasingly urgent, according to a new blog post by GitLab.
-
Cloudflare Releases Experimental Next.js Alternative Built with AI Assistance
Cloudflare released vinext, an experimental Next.js reimplementation built on Vite by a single engineer, with AI guidance, over one week, for $1,100. Early benchmarks show 4.4x faster builds, but Cloudflare cautions that it is untested at scale and still missing static pre-rendering. The Hacker News reaction was skeptical, with commenters noting that Vite does much of the heavy lifting, yet vinext is already running on CIO.gov despite its experimental status.
-
Google BigQuery Previews Cross-Region SQL Queries for Distributed Data
Google Cloud has recently announced the preview of a global queries feature for BigQuery. The new option lets developers run SQL queries across data stored in different geographic regions without first moving or copying the data to aggregate the results.
-
Scaling Human Judgment: How Dropbox Uses LLMs to Improve Labeling for RAG Systems
To improve the relevance of responses produced by Dropbox Dash, Dropbox engineers began using LLMs to augment human labeling, which plays a crucial role in identifying the documents that should be used to generate the responses. Their approach offers useful insights for any system built on retrieval-augmented generation (RAG).
-
New Research Reassesses the Value of AGENTS.md Files for AI Coding
Despite widespread industry recommendations, a new ETH Zurich paper concludes that AGENTS.md files may often hinder AI coding agents. The researchers recommend omitting LLM-generated context files entirely and limiting human-written instructions to non-inferable details, such as highly specific tooling or custom build commands.
-
OpenAI Secures AWS Distribution for Frontier Platform in $110B Multi-Cloud Deal
OpenAI's $110B funding includes AWS as the exclusive third-party distributor for the Frontier agent platform, introducing an architectural split: Azure retains stateless API exclusivity, while AWS gains stateful runtime environments via Bedrock. The deal expands the existing $38B AWS agreement by $100B and commits 2GW of Trainium capacity.