Machine Learning Content on InfoQ
-
Karrot Improves Conversion Rates by 70% with New Scalable Feature Platform on AWS
Karrot replaced its legacy recommendation system with a scalable architecture that leverages various AWS services. The company sought to address challenges related to tight coupling, limited scalability, and poor reliability in its previous solution, opting instead for a distributed, event-driven architecture built on top of scalable cloud services.
-
How Discord Scaled its ML Platform from Single-GPU Workflows to a Shared Ray Cluster
Discord has detailed how it rebuilt its machine learning platform after hitting the limits of single-GPU training. The changes enabled daily retrains for large models and contributed to a 200% uplift in a key ads ranking metric.
-
KubeCon NA 2025 - Robert Nishihara on Open Source AI Compute with Kubernetes, Ray, PyTorch, and vLLM
AI workloads are growing more complex in terms of compute and data, and technologies like Kubernetes and PyTorch can help build production-ready AI systems to support them. Robert Nishihara from Anyscale recently spoke at KubeCon + CloudNativeCon North America 2025 about how an AI compute stack comprising Kubernetes, PyTorch, vLLM, and Ray can support these new AI workloads.
-
LinkedIn’s Migration Journey to Serve Billions of Users by Nishant Lakshmikanth at QCon SF
Engineering Manager Nishant Lakshmikanth showcased LinkedIn's transformation at QCon SF 2025, detailing a shift from legacy batch-based systems to a real-time architecture. By decoupling recommendations and leveraging dynamic scoring techniques, LinkedIn achieved a 90% reduction in offline costs, enhanced session-level freshness, and improved member engagement while future-proofing its platform.
-
Google Announces Gemini 3
Google unveiled Gemini 3 on November 18, 2025, positioning it as a new standard for multimodal AI and integrating it across products such as Search and Vertex AI. With capabilities spanning text, code, and rich media, the model targets both consumer and enterprise applications, and Gemini 3 Pro's Deep Think mode aims to improve reasoning and multi-step task execution.
-
New IBM Granite 4 Models to Reduce AI Costs with Inference-Efficient Hybrid Mamba-2 Architecture
IBM recently announced the Granite 4.0 family of small language models. The family aims to deliver faster inference and significantly lower operational costs while maintaining acceptable accuracy compared with larger models. Granite 4.0 features a new hybrid Mamba-2/transformer architecture that substantially reduces memory requirements, enabling the models to run on cheaper GPUs at lower cost.
-
KubeCon NA 2025 - Erica Hughberg and Alexa Griffith on Tools for the Age of GenAI
Generative AI brings new workloads, traffic patterns, and infrastructure demands, and with them the need for a new set of platform tools. Erica Hughberg from Tetrate and Alexa Griffith from Bloomberg spoke at KubeCon + CloudNativeCon North America 2025 about what it takes to build GenAI platforms capable of serving model inference at scale.
-
Anthropic Adds Sandboxing and Web Access to Claude Code for Safer AI-Powered Coding
Anthropic released sandboxing capabilities for Claude Code and launched a web-based version of the tool that runs in isolated cloud environments. The company introduced these features to address security risks that arise when Claude Code writes, tests, and debugs code with broad access to developer codebases and files.
-
New Claude Haiku 4.5 Model Promises Faster Performance at One-Third the Cost
Anthropic released Claude Haiku 4.5, making the model available to all users as its latest entry in the small, fast model category. The company positions the new model as delivering performance levels comparable to Claude Sonnet 4, which launched five months ago as a state-of-the-art model, but at "one-third the cost and more than twice the speed."
-
How Meta Is Using AI to Standardize and Cut Carbon Emissions
Meta has developed an AI-based approach to improve the quality of Scope 3 emissions estimates across its IT hardware supply chain. The method combines machine learning and generative models to classify hardware components and infer missing product carbon footprint (PCF) data.
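A very rough way to picture those two steps, classifying components and filling in missing footprint data, is sketched below. The categories, emission factors, and use of a simple scikit-learn text classifier are assumptions for illustration and are not Meta's pipeline.

```python
# Toy sketch: classify hardware components from free-text descriptions,
# then fill missing product carbon footprint (PCF) values with per-class
# estimates. Hypothetical illustration only; not Meta's pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Labeled component descriptions (assumed categories).
train_texts = ["32GB DDR5 RDIMM", "1.92TB NVMe SSD", "800W PSU", "64GB DDR4 DIMM"]
train_labels = ["memory", "storage", "power", "memory"]

classifier = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
classifier.fit(train_texts, train_labels)

# Assumed average kgCO2e per category, used only when a supplier PCF is missing.
category_pcf_kgco2e = {"memory": 35.0, "storage": 80.0, "power": 20.0}

components = [
    {"desc": "16GB DDR5 SODIMM", "pcf": None},
    {"desc": "3.84TB NVMe SSD", "pcf": 95.0},  # supplier-reported value kept as-is
]
for comp in components:
    if comp["pcf"] is None:
        category = classifier.predict([comp["desc"]])[0]
        comp["pcf"] = category_pcf_kgco2e[category]

print(components)
```

In practice the classification and imputation would draw on far richer data, but the sketch shows how a category label lets a per-class estimate stand in for a missing supplier-reported footprint.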
-
Google Research Open-Sources the Coral NPU Platform to Help Build AI into Wearables and Edge Devices
Coral NPU is an open-source, full-stack platform designed to help hardware engineers and AI developers overcome the key barriers to integrating AI into wearables and edge devices: performance, fragmentation, and user trust.
-
Instagram Improves Engagement by Reducing Notification Fatigue with New Ranking Framework
Meta has introduced a diversity-aware ranking framework for Instagram notifications. The system applies multiplicative penalties to reduce repetitive alerts from the same creators or product surfaces, improving engagement while maintaining relevance and introducing content variety.
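The mechanism is straightforward to sketch. The example below is a hypothetical illustration, not Meta's code, of applying multiplicative penalties to candidate notification scores based on how often the same creator or product surface has recently been shown; the decay factors and data shapes are assumptions.

```python
# Minimal sketch of diversity-aware ranking with multiplicative penalties.
# Hypothetical illustration only; not Meta's implementation.
from collections import Counter

def penalized_score(base_score: float,
                    prior_creator_hits: int,
                    prior_surface_hits: int,
                    creator_decay: float = 0.7,
                    surface_decay: float = 0.85) -> float:
    """Down-weight a candidate the more often its creator or product
    surface has already appeared in the user's recent notifications."""
    return (base_score
            * creator_decay ** prior_creator_hits
            * surface_decay ** prior_surface_hits)

# Candidate notifications with relevance scores from an upstream ranker.
candidates = [
    {"id": "n1", "creator": "a", "surface": "reels", "score": 0.90},
    {"id": "n2", "creator": "a", "surface": "reels", "score": 0.88},
    {"id": "n3", "creator": "b", "surface": "shop",  "score": 0.80},
]

# Recently sent notifications: creator "a" on "reels" already appeared twice.
recent = [("a", "reels"), ("a", "reels")]
creator_hits = Counter(c for c, _ in recent)
surface_hits = Counter(s for _, s in recent)

ranked = sorted(
    candidates,
    key=lambda c: penalized_score(c["score"],
                                  creator_hits[c["creator"]],
                                  surface_hits[c["surface"]]),
    reverse=True,
)
print([c["id"] for c in ranked])  # n3 now outranks the repetitive n1 and n2
```

Because the penalties multiply, repeated sources are suppressed progressively rather than cut off outright, which matches the stated goal of keeping relevance while adding variety.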
-
An AI-Driven Approach to Creating Effective Learning Experiences at QCon
At InfoQ Dev Summit Boston, Wes Reisz described an AI-driven experiment he led around a certification program at QCon London. The program included special events during the conference, a pre-conference breakfast where participants could learn about upcoming activities, and an AI-driven workshop immediately following the conference.
-
How Netflix is Reimagining Data Engineering for Video, Audio, and Text
Netflix has introduced a new engineering specialization, Media ML Data Engineering, alongside a Media Data Lake designed to handle video, audio, text, and image assets at scale. Early results include richer ML models trained on standardized media, faster evaluation cycles, and deeper insights into creative workflows.
-
Roblox Open-Sources AI System to Detect Conversations Potentially Harmful to Kids
Roblox Sentinel is an AI system designed to detect early signs of potential child endangerment for further analysis and investigation. Implemented as a Python library, Sentinel uses contrastive learning to handle highly imbalanced datasets that often challenge traditional classifiers and can be applied to a wide range of use cases.
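As a rough illustration of why contrastive objectives help with heavy class imbalance, the sketch below trains a tiny embedding model with a supervised contrastive-style loss in PyTorch, pulling same-label examples together and pushing different-label examples apart. It is a toy example under assumed shapes and labels, not Sentinel's actual code.

```python
# Toy sketch of a supervised contrastive objective on an imbalanced batch.
# Hypothetical illustration only; not the Roblox Sentinel implementation.
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(embeddings: torch.Tensor,
                                labels: torch.Tensor,
                                temperature: float = 0.1) -> torch.Tensor:
    """Pull same-label embeddings together and push different-label ones apart.
    The loss is defined over pairs rather than per-class counts, which is why
    it degrades less than a plain classifier when one class is very rare."""
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.T / temperature                        # pairwise similarities
    mask = labels.unsqueeze(0) == labels.unsqueeze(1)  # positive pairs
    mask.fill_diagonal_(False)

    # Row-wise log-softmax, excluding each example's similarity with itself.
    logits = sim - torch.eye(len(z), device=z.device) * 1e9
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)

    # Average log-probability of positives for each anchor that has any.
    pos_counts = mask.sum(dim=1).clamp(min=1)
    loss = -(log_prob * mask).sum(dim=1) / pos_counts
    return loss[mask.any(dim=1)].mean()

# Example: embeddings from a stand-in encoder, 2 positives among 6 negatives.
encoder = torch.nn.Sequential(torch.nn.Linear(16, 32), torch.nn.ReLU(),
                              torch.nn.Linear(32, 8))
x = torch.randn(8, 16)
y = torch.tensor([1, 0, 0, 0, 0, 0, 0, 1])  # highly imbalanced batch
loss = supervised_contrastive_loss(encoder(x), y)
loss.backward()
```

Because the objective is computed over pairs within a batch, rare positive examples still produce a meaningful training signal even when they make up a tiny fraction of the data.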