Machine Learning Content on InfoQ
-
Uber Improves Restaurant Recommendations Using Real-Time Signals and Listwise Ranking
Uber updates its Uber Eats Home Feed recommendation system using near real-time user sequence features and a Generative Recommender (GenRec) model. The system evolves from hand-crafted features to transformer-based sequence modeling, cuts feature staleness from 24 hours to seconds, and shifts from pointwise scoring to listwise GenRec ranking for improved contextual ranking and real-time personalization.
-
Agoda Builds Multimodal Content System to Bridge Images and Reviews in Travel Discovery
Agoda unifies hotel images and guest reviews using a shared topic taxonomy, enabling multimodal retrieval across 700M+ images and multilingual reviews with offline enrichment and low-latency serving.
-
Swiggy Improves Search Autocomplete Using Real Time Machine Learning Ranking
Swiggy detailed a real-time machine-learning ranking system for autocomplete built on OpenSearch. The architecture separates candidate generation from ranking, uses feature stores for real-time signals, and applies learning-to-rank models for improved relevance. It replaces heuristic ranking while maintaining strict latency constraints and enabling continuous model updates from user behavior signals.
-
Netflix Introduces ‘Model Lifecycle Graph’ to Scale Enterprise Machine Learning
Netflix has developed a graph-based architecture for managing machine learning systems, called the Model Lifecycle Graph. This system maps interconnections between datasets, models, features, and workflows, addressing challenges in scaling ML operations. It enhances discoverability, governance, and component reuse while supporting a self-service approach for engineers and data scientists.
-
Confluent Moves Schema IDs to Kafka Headers to Simplify Schema Governance
Confluent introduces a new approach in Apache Kafka that moves schema IDs from message payloads to record headers, aiming to simplify schema governance and evolution. The update integrates with Schema Registry, improves compatibility across serialization formats, and reduces coupling between data and metadata in event-driven architectures.
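To make the difference concrete, here is a minimal sketch contrasting the two framings. The payload framing follows the long-standing Confluent wire format (a zero magic byte followed by a 4-byte big-endian schema ID). The header variant is illustrative only: the header key `schema-id` and its 4-byte encoding are assumptions, not Confluent's actual header name or format.

```python
import struct

MAGIC_BYTE = 0  # marker byte used by the classic Confluent wire format
SCHEMA_ID_HEADER = "schema-id"  # hypothetical header key, chosen for illustration


def encode_in_payload(schema_id: int, body: bytes) -> bytes:
    """Classic framing: magic byte + 4-byte big-endian schema ID + serialized body."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + body


def decode_from_payload(value: bytes):
    """Strip the 5-byte prefix and return (schema_id, body)."""
    magic, schema_id = struct.unpack(">bI", value[:5])
    if magic != MAGIC_BYTE:
        raise ValueError("unknown magic byte")
    return schema_id, value[5:]


def encode_in_header(schema_id: int, body: bytes):
    """Header framing (assumed layout): the body is left untouched and the ID rides in a header."""
    headers = [(SCHEMA_ID_HEADER, struct.pack(">I", schema_id))]
    return headers, body


def decode_from_header(headers, value: bytes):
    """Look up the schema ID in the record headers and return (schema_id, body)."""
    for key, raw in headers:
        if key == SCHEMA_ID_HEADER:
            return struct.unpack(">I", raw)[0], value
    raise KeyError(SCHEMA_ID_HEADER)


if __name__ == "__main__":
    print(encode_in_payload(42, b"data"))
    print(encode_in_header(42, b"data"))
```

With the header framing, a consumer that does not know about Schema Registry can still read the body as-is, which is the decoupling the summary describes.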
-
Legare Kerrison and Cedric Clyburn on LLM Performance and Evaluations
Effectively measuring the performance of applications that use large language models (LLMs) is critical to the adoption of AI technologies in organizations. Legare Kerrison and Cedric Clyburn of Red Hat recently spoke at the Arc of AI 2026 conference about practical methods to evaluate and optimize LLM inference.
-
Cloudflare and ETH Zurich Outline Approaches for AI-Driven Cache Optimization
Cloudflare and ETH Zurich highlight how AI-driven crawler traffic challenges traditional caching in CDNs and databases. They propose AI-aware strategies including separate cache tiers, adaptive algorithms, and pay-per-crawl models to balance performance for human users and AI services while maintaining cache efficiency and system stability.
-
Inside Spotify’s 2025 Wrapped Archive: AI Narratives at Scale and the Privacy Trade‑Off
Spotify's engineering team developed the 2025 "Wrapped Archive," generating 1.4 billion personalized reports for 350 million users. This system identifies key listening days and crafts narratives using a language model. As companies increasingly provide narrative recaps, concerns about user privacy and data tracking persist, necessitating a balance between insights and privacy safeguards.
-
QCon London 2026: Behind Booking.com's AI Evolution: the Unpolished Story
Jabez Eliezer Manuel, senior principal engineer at Booking.com, presented “Behind Booking.com's AI Evolution: the Unpolished Story” at QCon London 2026. Manuel discussed how Booking.com has evolved over the past 20 years and the challenges it faced on its journey to incorporate AI.
-
DoorDash Builds DashCLIP to Align Images, Text, and Queries for Semantic Search Using 32M Labels
DoorDash has launched a multimodal machine learning system that aligns product images, text, and user queries in a shared embedding space. Trained on 32 million labeled query-product pairs using contrastive learning, the system improves semantic search, product ranking, and advertising relevance. Embeddings also support other machine learning tasks across the marketplace.
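As a rough illustration of contrastive training on query-product pairs, the sketch below computes an in-batch InfoNCE-style loss in plain Python. The summary does not say which loss DoorDash uses, so the loss form, the `temperature` value, and the function names are assumptions for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def info_nce_loss(query_embs, product_embs, temperature=0.1):
    """Mean cross-entropy where product i is the positive for query i.

    Every other product in the batch acts as a negative for that query.
    """
    queries = [normalize(q) for q in query_embs]
    products = [normalize(p) for p in product_embs]
    total = 0.0
    for i, q in enumerate(queries):
        # Cosine similarities scaled by temperature.
        logits = [dot(q, p) / temperature for p in products]
        # Numerically stable log-sum-exp.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_sum - logits[i]
    return total / len(queries)


if __name__ == "__main__":
    queries = [[1.0, 0.0], [0.0, 1.0]]
    aligned = [[1.0, 0.0], [0.0, 1.0]]
    shuffled = [[0.0, 1.0], [1.0, 0.0]]
    print(info_nce_loss(queries, aligned))   # low: each query matches its product
    print(info_nce_loss(queries, shuffled))  # high: positives are mismatched
```

Minimizing a loss of this kind pulls each query toward its labeled product and pushes it away from the others, which is what places queries and products in one shared embedding space.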
-
Google Researchers Propose Bayesian Teaching Method for Large Language Models
Google Research has proposed a training method that teaches large language models to approximate Bayesian reasoning by learning from the predictions of an optimal Bayesian system. The approach focuses on improving how models update beliefs as they receive new information during multi-step interactions.
-
Scaling Human Judgment: How Dropbox Uses LLMs to Improve Labeling for RAG Systems
To improve the relevance of responses produced by Dropbox Dash, Dropbox engineers began using LLMs to augment human labelling, which plays a crucial role in identifying the documents that should be used to generate the responses. Their approach offers useful insights for any system built on retrieval-augmented generation (RAG).
-
Enhancing A/B Testing at DoorDash with Multi-Armed Bandits
While experimentation is essential, traditional A/B testing can be excessively slow and expensive, according to DoorDash engineers Caixia Huang and Alex Weinstein. To address these limitations, they adopted a "multi-armed bandits" (MAB) approach to optimize their experiments.
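For readers new to bandits, here is a minimal sketch of one common strategy, Thompson sampling with Beta priors over conversion rates. The summary does not state which bandit algorithm DoorDash uses, so this choice, the class and function names, and the simulated conversion rates are all assumptions for illustration.

```python
import random


class ThompsonSampler:
    """Beta-Bernoulli Thompson sampling over a fixed set of variants (arms)."""

    def __init__(self, n_arms, rng=None):
        # Beta(1, 1) prior for every arm: one pseudo-success, one pseudo-failure.
        self.successes = [1] * n_arms
        self.failures = [1] * n_arms
        self.rng = rng or random.Random()

    def choose(self):
        # Draw a plausible conversion rate per arm and play the best draw.
        samples = [
            self.rng.betavariate(s, f)
            for s, f in zip(self.successes, self.failures)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, converted):
        if converted:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1


def simulate(true_rates, rounds, seed=0):
    """Run the sampler against simulated arms and return pulls per arm."""
    rng = random.Random(seed)
    sampler = ThompsonSampler(len(true_rates), rng)
    pulls = [0] * len(true_rates)
    for _ in range(rounds):
        arm = sampler.choose()
        pulls[arm] += 1
        sampler.update(arm, rng.random() < true_rates[arm])
    return pulls


if __name__ == "__main__":
    print(simulate([0.1, 0.2, 0.6], rounds=2000))
```

Unlike a fixed 50/50 split, traffic shifts toward the better variant as evidence accumulates, so fewer users see a losing variant while the experiment is still running.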
-
DoorDash Applies AI to Safety across Chat and Calls, Cutting Incidents by 50%
DoorDash deploys SafeChat, an AI-driven safety system for moderating chat, images, and voice calls between Dashers and customers. Using a layered text moderation architecture, machine learning models, and human review, SafeChat detects unsafe content in real time, enabling immediate actions and reducing low- and medium-severity safety incidents by roughly 50 percent.
-
AWS Hikes EC2 Capacity Block Rates by 15% in Uniform ML Pricing Adjustment
AWS has raised EC2 Capacity Block prices for ML by 15% across all regions, impacting GPU-based workloads. The uniform price hikes affect top-tier instances powered by NVIDIA GPUs, underscoring supply chain pressures and inflation. With limited alternatives, organizations face higher costs, emphasizing the need for effective workload optimization and cost management strategies.