AI, ML & Data Engineering Content on InfoQ
-
How Discord Scaled its ML Platform from Single-GPU Workflows to a Shared Ray Cluster
Discord has detailed how it rebuilt its machine learning platform after hitting the limits of single-GPU training. The changes enabled daily retrains for large models and contributed to a 200% uplift in a key ads ranking metric.
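The article does not include code, but a minimal Python sketch of the pattern it describes — submitting GPU training tasks to an existing, shared Ray cluster instead of per-job single-GPU machines — might look like the following. The cluster address, resource sizes, and train_shard function are illustrative assumptions, not Discord's actual pipeline.

```python
import ray

# Connect to an existing, shared Ray cluster rather than spinning up
# per-job infrastructure (address and resource sizes are illustrative).
ray.init(address="ray://ml-platform-head:10001")

@ray.remote(num_gpus=1)
def train_shard(shard_id: int) -> float:
    # Placeholder for a per-shard training step; a real job would load
    # the shard's data, run the model update, and return a metric.
    return 0.0

# Fan the daily retrain out across the GPUs the cluster can offer,
# then gather the per-shard results on the driver.
losses = ray.get([train_shard.remote(i) for i in range(8)])
print(sum(losses) / len(losses))
```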
-
Google Introduces Nano Banana Pro with Grounded, Multimodal Image Synthesis
Google has released Nano Banana Pro. The system moves beyond conventional diffusion workflows by tightly coupling image generation with Gemini’s multimodal reasoning stack. The result: visuals that are not only aesthetically pleasing, but structurally, contextually, and informationally accurate.
-
Google's New LiteRT Accelerator Supercharges AI Workloads on Snapdragon-powered Android Devices
Google has introduced a new accelerator for LiteRT, called Qualcomm AI Engine Direct (QNN), to enhance on-device AI performance on Android devices equipped with Qualcomm Snapdragon 8 SoCs. The accelerator delivers significant gains, offering up to a 100x speedup over CPU execution and a 10x speedup over GPU execution.
-
Private AI Compute Enables Google Inference with Hardware Isolation and Ephemeral Data Design
Google announced Private AI Compute, a system designed to process AI requests using Gemini cloud models while aiming to keep user data private. The announcement positions Private AI Compute as Google's approach to addressing privacy concerns while providing cloud-based AI capabilities, building on what the company calls privacy-enhancing technologies it has developed for AI use cases.
-
KubeCon NA 2025 - Robert Nishihara on Open Source AI Compute with Kubernetes, Ray, PyTorch, and vLLM
AI workloads are growing more complex in terms of compute and data, and technologies like Kubernetes and PyTorch can help build production-ready AI systems to support them. Robert Nishihara from Anyscale recently spoke at KubeCon + CloudNativeCon North America 2025 about how an AI compute stack comprising Kubernetes, PyTorch, vLLM, and Ray can support these new AI workloads.
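As an illustration of one layer of that stack, the sketch below uses vLLM's offline Python API; the model name, prompt, and sampling values are placeholders, and running this on Kubernetes would typically add KubeRay and cluster configuration not shown here.

```python
# Minimal vLLM offline-inference sketch (model name and sampling values
# are placeholders; on Kubernetes this would usually run inside a pod
# scheduled by KubeRay rather than on a workstation).
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # small model chosen for illustration
params = SamplingParams(temperature=0.8, max_tokens=64)

outputs = llm.generate(["What is a service mesh?"], params)
for out in outputs:
    print(out.outputs[0].text)
```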
-
Amazon Adds A2A Protocol to Bedrock AgentCore for Interoperable Multi-Agent Workflows
Amazon announced support for the Agent-to-Agent (A2A) protocol in Amazon Bedrock AgentCore Runtime, enabling communication between agents built on different frameworks. The protocol allows agents developed with Strands Agents, OpenAI Agents SDK, LangGraph, Google ADK, or Claude Agents SDK to "share context, capabilities, and reasoning in a common, verifiable format."
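The announcement does not show wire-level details, but A2A is built on JSON-RPC over HTTP. The sketch below constructs a hypothetical A2A-style request the way a client agent might; the endpoint URL, method name, and payload shape are assumptions for illustration and are not the exact Bedrock AgentCore API.

```python
import json
import urllib.request

# Hypothetical A2A-style JSON-RPC request; endpoint URL, method name,
# and payload fields are assumptions, not the AgentCore Runtime API.
request_body = {
    "jsonrpc": "2.0",
    "id": "1",
    "method": "message/send",
    "params": {
        "message": {
            "role": "user",
            "parts": [{"kind": "text", "text": "Summarize today's tickets"}],
        }
    },
}

req = urllib.request.Request(
    "https://example.com/a2a",  # placeholder agent endpoint
    data=json.dumps(request_body).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))
```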
-
LinkedIn’s Migration Journey to Serve Billions of Users by Nishant Lakshmikanth at QCon SF
Engineering Manager Nishant Lakshmikanth showcased LinkedIn's transformation at QCon SF 2025, detailing a shift from legacy batch-based systems to a real-time architecture. By decoupling recommendations and leveraging dynamic scoring techniques, LinkedIn achieved a 90% reduction in offline costs, enhanced session-level freshness, and improved member engagement while future-proofing its platform.
-
SAM 3 Introduces a More Capable Segmentation Architecture for Modern Vision Workflows
Meta has released SAM 3, the latest version of its Segment Anything Model and the most substantial update to the project since its initial launch. Built to provide more stable and context-aware segmentation, the model offers improvements in accuracy, boundary quality, and robustness to real-world scenes, aiming to make segmentation more reliable across research and production systems.
-
Google Launches Agent Development Kit for Go
Google has added support for the Go language to its Agent Development Kit (ADK), enabling Go developers to build and manage agents in an idiomatic way that leverages the language's strong concurrency and typing features.
-
Microsoft Copilot Fall Release Includes Collaboration and Personalization Features
Microsoft's recent Copilot Fall Release includes several new features for productivity, collaboration, and personalization. The release also includes updates to Copilot features in Edge and Windows, as well as integration with Microsoft's in-house AI models.
-
Google Brings Colab Integration to Visual Studio Code
Google has announced the availability of a new Visual Studio Code extension that connects local notebooks to a Colab runtime. This allows developers to unify their previously separate local development setup and web-based Colab environment.
-
AnyLanguageModel: Unified API for Local and Cloud LLMs on Apple Platforms
Developers on Apple platforms often face a fragmented ecosystem when using language models. Local models via Core ML or MLX offer privacy and offline capabilities, while cloud services like OpenAI, Anthropic, or Google Gemini provide advanced features. AnyLanguageModel, a new Swift package, simplifies integration by offering a unified API for both local and remote models.
-
Google Cloud Introduces Bigtable Tiered Storage
Google Cloud recently introduced the preview of Bigtable tiered storage. The new feature allows developers to manage both hot and cold data within a single Bigtable instance, optimizing costs while maintaining access to all data.
-
New Token-Oriented Object Notation (TOON) Hopes to Cut LLM Costs by Reducing Token Consumption
The recently released Token-Oriented Object Notation (TOON) aims to be a schema-aware alternative to JSON that significantly reduces token consumption while preserving a similar level of accuracy. Although whether and how much is saved depends on the shape of the data, some benchmarks show TOON using up to 40% fewer tokens than JSON, which can translate into lower LLM inference costs.
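To give a sense of where the savings come from, the sketch below encodes the same small table as compact JSON and as TOON-style text and compares rough sizes. The TOON syntax shown follows the project's published examples and should be treated as illustrative rather than normative; character counts are only a proxy for tokens.

```python
import json

# The same uniform array of objects, as compact JSON and as TOON-style
# tabular text (TOON syntax here follows the project's published
# examples; treat it as illustrative).
records = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
    {"id": 3, "name": "Carol", "role": "user"},
]

as_json = json.dumps(records, separators=(",", ":"))

as_toon = """users[3]{id,name,role}:
  1,Alice,admin
  2,Bob,user
  3,Carol,user"""

# Field names appear once in the TOON header instead of once per record,
# which is the main source of the token reduction.
print(len(as_json), len(as_toon))
```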
-
Olmo 3 Release Provides Full Transparency into Model Development and Training
The Allen Institute for AI has unveiled Olmo 3, an open-source language model family that empowers developers with full access to the model lifecycle, from training datasets to checkpoints. Featuring reasoning-focused variants and robust tools for post-training modifications, Olmo 3 promotes transparency, experimentation, and community collaboration, driving innovations in AI.
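Because the weights and checkpoints are open, a typical way to try the models would be through Hugging Face transformers, as in the hedged sketch below; the model identifier is a placeholder, and the actual Olmo 3 repository names should be checked on the Allen Institute's Hugging Face page.

```python
# Hedged sketch of loading an Olmo 3 checkpoint with Hugging Face
# transformers; the repository name below is a placeholder and should be
# replaced with an actual Olmo 3 model id published by Ai2.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/Olmo-3-7B"  # placeholder id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Open training data matters because", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```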