AI, ML & Data Engineering Content on InfoQ
-
How Discord Scaled its ML Platform from Single-GPU Workflows to a Shared Ray Cluster
Discord has detailed how it rebuilt its machine learning platform after hitting the limits of single-GPU training. The changes enabled daily retrains for large models and contributed to a 200% uplift in a key ads ranking metric.
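The article does not include code, but a minimal Python sketch of the pattern it describes — submitting GPU training tasks to an existing, shared Ray cluster instead of per-job single-GPU machines — might look like the following. The cluster address, resource sizes, and train_shard function are illustrative assumptions, not Discord's actual pipeline.

```python
import ray

# Connect to an existing, shared Ray cluster rather than spinning up
# per-job infrastructure (address and resource sizes are illustrative).
ray.init(address="ray://ml-platform-head:10001")

@ray.remote(num_gpus=1)
def train_shard(shard_id: int) -> float:
    # Placeholder for a per-shard training step; a real job would load
    # the shard's data, run the model update, and return a metric.
    return 0.0

# Fan the daily retrain out across the GPUs the cluster can offer,
# then gather the per-shard results on the driver.
losses = ray.get([train_shard.remote(i) for i in range(8)])
print(sum(losses) / len(losses))
```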
-
Google Introduces Nano Banana Pro with Grounded, Multimodal Image Synthesis
Google has released Nano Banana Pro. The system moves beyond conventional diffusion workflows by tightly coupling image generation with Gemini’s multimodal reasoning stack. The result: visuals that are not only aesthetically pleasing, but structurally, contextually, and informationally accurate.
-
Google's New LiteRT Accelerator Supercharges AI Workloads on Snapdragon-powered Android Devices
Google has introduced a new accelerator for LiteRT, called Qualcomm AI Engine Direct (QNN), to enhance on-device AI performance on Android devices equipped with Qualcomm Snapdragon 8 SoCs. The accelerator delivers significant gains, offering up to a 100x speedup over CPU execution and a 10x speedup over GPU execution.
-
Private AI Compute Enables Google Inference with Hardware Isolation and Ephemeral Data Design
Google announced Private AI Compute, a system designed to process AI requests using Gemini cloud models while aiming to keep user data private. The announcement positions Private AI Compute as Google's approach to addressing privacy concerns while providing cloud-based AI capabilities, building on what the company calls privacy-enhancing technologies it has developed for AI use cases.
-
KubeCon NA 2025 - Robert Nishihara on Open Source AI Compute with Kubernetes, Ray, PyTorch, and vLLM
AI workloads are growing more complex in terms of compute and data, and technologies like Kubernetes and PyTorch can help build production-ready AI systems to support them. Robert Nishihara from Anyscale recently spoke at KubeCon + CloudNativeCon North America 2025 about how an AI compute stack comprising Kubernetes, PyTorch, vLLM, and Ray can support these new AI workloads.
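As an illustration of one layer of that stack, the sketch below uses vLLM's offline Python API; the model name, prompt, and sampling values are placeholders, and running this on Kubernetes would typically add KubeRay and cluster configuration not shown here.

```python
# Minimal vLLM offline-inference sketch (model name and sampling values
# are placeholders; on Kubernetes this would usually run inside a pod
# scheduled by KubeRay rather than on a workstation).
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # small model chosen for illustration
params = SamplingParams(temperature=0.8, max_tokens=64)

outputs = llm.generate(["What is a service mesh?"], params)
for out in outputs:
    print(out.outputs[0].text)
```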
-
Amazon Adds A2A Protocol to Bedrock AgentCore for Interoperable Multi-Agent Workflows
Amazon announced support for the Agent-to-Agent (A2A) protocol in Amazon Bedrock AgentCore Runtime, enabling communication between agents built on different frameworks. The protocol allows agents developed with Strands Agents, OpenAI Agents SDK, LangGraph, Google ADK, or Claude Agents SDK to "share context, capabilities, and reasoning in a common, verifiable format."
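The announcement does not show wire-level details, but A2A is built on JSON-RPC over HTTP. The sketch below constructs a hypothetical A2A-style request the way a client agent might; the endpoint URL, method name, and payload shape are assumptions for illustration and are not the exact Bedrock AgentCore API.

```python
import json
import urllib.request

# Hypothetical A2A-style JSON-RPC request; endpoint URL, method name,
# and payload fields are assumptions, not the AgentCore Runtime API.
request_body = {
    "jsonrpc": "2.0",
    "id": "1",
    "method": "message/send",
    "params": {
        "message": {
            "role": "user",
            "parts": [{"kind": "text", "text": "Summarize today's tickets"}],
        }
    },
}

req = urllib.request.Request(
    "https://example.com/a2a",  # placeholder agent endpoint
    data=json.dumps(request_body).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))
```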
-
LinkedIn’s Migration Journey to Serve Billions of Users by Nishant Lakshmikanth at QCon SF
Engineering Manager Nishant Lakshmikanth showcased LinkedIn's transformation at QCon SF 2025, detailing a shift from legacy batch-based systems to a real-time architecture. By decoupling recommendations and leveraging dynamic scoring techniques, LinkedIn achieved a 90% reduction in offline costs, enhanced session-level freshness, and improved member engagement while future-proofing its platform.
-
SAM 3 Introduces a More Capable Segmentation Architecture for Modern Vision Workflows
Meta has released SAM 3, the latest version of its Segment Anything Model and the most substantial update to the project since its initial launch. Built to provide more stable and context-aware segmentation, the model offers improvements in accuracy, boundary quality, and robustness to real-world scenes, aiming to make segmentation more reliable across research and production systems.
-
Google Launches Agent Development Kit for Go
Google has added support for the Go language to its Agent Development Kit (ADK), enabling Go developers to build and manage agents in an idiomatic way that leverages the language's strong concurrency and typing features.
-
Microsoft Copilot Fall Release Includes Collaboration and Personalization Features
Microsoft's recent Copilot Fall Release includes several new features for productivity, collaboration, and personalization. The release also includes updates to Copilot features in Edge and Windows, as well as integration with Microsoft's in-house AI models.
-
Google Brings Colab Integration to Visual Studio Code
Google has announced the availability of a new Visual Studio Code extension that connects local notebooks to a Colab runtime. This allows developers to unify their previously separate local development setup and web-based Colab environment.
-
AnyLanguageModel: Unified API for Local and Cloud LLMs on Apple Platforms
Developers on Apple platforms often face a fragmented ecosystem when using language models. Local models via Core ML or MLX offer privacy and offline capabilities, while cloud services like OpenAI, Anthropic, or Google Gemini provide advanced features. AnyLanguageModel, a new Swift package, simplifies integration by offering a unified API for both local and remote models.
-
Google Cloud Introduces Bigtable Tiered Storage
Google Cloud recently introduced the preview of Bigtable tiered storage. The new feature allows developers to manage both hot and cold data within a single Bigtable instance, optimizing costs while maintaining access to all data.
-
New Token-Oriented Object Notation (TOON) Hopes to Cut LLM Costs by Reducing Token Consumption
The recently released Token-Oriented Object Notation (TOON) aims to be a schema-aware alternative to JSON that significantly reduces token consumption while preserving a similar level of accuracy. Although whether and how much is saved depends on the shape of the data, some benchmarks show TOON using up to 40% fewer tokens than JSON, which can translate into lower LLM inference costs.
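To give a sense of where the savings come from, the sketch below encodes the same small table as compact JSON and as TOON-style text and compares rough sizes. The TOON syntax shown follows the project's published examples and should be treated as illustrative rather than normative; character counts are only a proxy for tokens.

```python
import json

# The same uniform array of objects, as compact JSON and as TOON-style
# tabular text (TOON syntax here follows the project's published
# examples; treat it as illustrative).
records = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
    {"id": 3, "name": "Carol", "role": "user"},
]

as_json = json.dumps(records, separators=(",", ":"))

as_toon = """users[3]{id,name,role}:
  1,Alice,admin
  2,Bob,user
  3,Carol,user"""

# Field names appear once in the TOON header instead of once per record,
# which is the main source of the token reduction.
print(len(as_json), len(as_toon))
```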
-
Olmo 3 Release Provides Full Transparency into Model Development and Training
The Allen Institute for AI has unveiled Olmo 3, an open-source language model family that empowers developers with full access to the model lifecycle, from training datasets to checkpoints. Featuring reasoning-focused variants and robust tools for post-training modifications, Olmo 3 promotes transparency, experimentation, and community collaboration, driving innovations in AI.
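Because the weights and checkpoints are open, a typical way to try the models would be through Hugging Face transformers, as in the hedged sketch below; the model identifier is a placeholder, and the actual Olmo 3 repository names should be checked on the Allen Institute's Hugging Face page.

```python
# Hedged sketch of loading an Olmo 3 checkpoint with Hugging Face
# transformers; the repository name below is a placeholder and should be
# replaced with an actual Olmo 3 model id published by Ai2.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/Olmo-3-7B"  # placeholder id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Open training data matters because", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```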