AI, ML & Data Engineering Content on InfoQ
-
Google's New LiteRT Accelerator Supercharges AI Workloads on Snapdragon-powered Android Devices
Google has introduced a new accelerator for LiteRT, called Qualcomm AI Engine Direct (QNN), to enhance on-device AI performance on Qualcomm-powered Android devices equipped with Snapdragon 8 SoCs. The accelerator delivers significant gains, offering up to a 100x speedup over CPU execution and 10x over GPU.
-
Private AI Compute Enables Google Inference with Hardware Isolation and Ephemeral Data Design
Google announced Private AI Compute, a system designed to process AI requests using Gemini cloud models while aiming to keep user data private. The announcement positions Private AI Compute as Google's approach to addressing privacy concerns while providing cloud-based AI capabilities, building on what the company calls privacy-enhancing technologies it has developed for AI use cases.
-
KubeCon NA 2025 - Robert Nishihara on Open Source AI Compute with Kubernetes, Ray, PyTorch, and vLLM
AI workloads are growing more complex in terms of compute and data, and technologies like Kubernetes and PyTorch can help build production-ready AI systems to support them. Robert Nishihara from Anyscale recently spoke at the KubeCon + CloudNativeCon North America 2025 conference about how an AI compute stack comprising Kubernetes, PyTorch, vLLM, and Ray can support these new AI workloads.
-
Amazon Adds A2A Protocol to Bedrock AgentCore for Interoperable Multi-Agent Workflows
Amazon announced support for the Agent-to-Agent (A2A) protocol in Amazon Bedrock AgentCore Runtime, enabling communication between agents built on different frameworks. The protocol allows agents developed with Strands Agents, OpenAI Agents SDK, LangGraph, Google ADK, or Claude Agents SDK to "share context, capabilities, and reasoning in a common, verifiable format."
-
LinkedIn’s Migration Journey to Serve Billions of Users by Nishant Lakshmikanth at QCon SF
Engineering Manager Nishant Lakshmikanth showcased LinkedIn's transformation at QCon SF 2025, detailing a shift from legacy batch-based systems to a real-time architecture. By decoupling recommendations and leveraging dynamic scoring techniques, LinkedIn achieved a 90% reduction in offline costs, enhanced session-level freshness, and improved member engagement while future-proofing its platform.
-
SAM 3 Introduces a More Capable Segmentation Architecture for Modern Vision Workflows
Meta has released SAM 3, the latest version of its Segment Anything Model and the most substantial update to the project since its initial launch. Built to provide more stable and context-aware segmentation, the model offers improvements in accuracy, boundary quality, and robustness to real-world scenes, aiming to make segmentation more reliable across research and production systems.
-
Google Launches Agent Development Kit for Go
Google has added support for the Go language to its Agent Development Kit (ADK), enabling Go developers to build and manage agents in an idiomatic way that leverages the language's strong concurrency and typing features.
-
Microsoft Copilot Fall Release Includes Collaboration and Personalization Features
Microsoft's recent Copilot Fall Release includes several new features for productivity, collaboration, and personalization. The release also includes updates to Copilot features in Edge and Windows, as well as integration with Microsoft's in-house AI models.
-
Google Brings Colab Integration to Visual Studio Code
Google has announced the availability of a new Visual Studio Code extension that connects local notebooks to a Colab runtime. This allows developers to unify their previously separate local development setup and web-based Colab environment.
-
AnyLanguageModel: Unified API for Local and Cloud LLMs on Apple Platforms
Developers on Apple platforms often face a fragmented ecosystem when using language models. Local models via Core ML or MLX offer privacy and offline capabilities, while cloud services like OpenAI, Anthropic, or Google Gemini provide advanced features. AnyLanguageModel, a new Swift package, simplifies integration by offering a unified API for both local and remote models.
-
Google Cloud Introduces Bigtable Tiered Storage
Google Cloud recently introduced the preview of Bigtable tiered storage. The new feature allows developers to manage both hot and cold data within a single Bigtable instance, optimizing costs while maintaining access to all data.
-
New Token-Oriented Object Notation (TOON) Hopes to Cut LLM Costs by Reducing Token Consumption
The recently released Token-Oriented Object Notation (TOON) aims to be a schema-aware alternative to JSON that significantly reduces token consumption at a similar level of accuracy. While the size of the token savings depends on the shape of the data, some benchmarks show that TOON may use 40% fewer tokens than JSON in some cases, possibly lowering LLM inference costs.
-
Olmo 3 Release Provides Full Transparency into Model Development and Training
The Allen Institute for AI has unveiled Olmo 3, an open-source language model family that empowers developers with full access to the model lifecycle, from training datasets to checkpoints. Featuring reasoning-focused variants and robust tools for post-training modifications, Olmo 3 promotes transparency, experimentation, and community collaboration, driving innovations in AI.
-
Valkey 9.0 Introduces Multi-Database Clustering, Atomic Slot Migration, and Major Performance Gains
The Linux Foundation has announced the general availability of Valkey 9.0, the open-source in-memory storage solution developed as a successor to Redis. The latest major version introduces atomic slot migrations, hash field expiration, and full support for numbered databases in cluster mode, enabling scaling to 2,000 nodes and achieving over 1 billion requests per second.
-
QConSF 2025: Humans in the Loop: Engineering Leadership in a Chaotic Industry
At QCon SF 2025, Michelle Brush of Google explored the evolving landscape of software engineering in her keynote “Humans in the Loop: Engineering Leadership in a Chaotic Industry.” She highlighted the complexities engineers face amid automation and AI, stressing the importance of conscious competence, higher-level problem-solving, and effective leadership in navigating today's challenges.