InfoQ Homepage Machine Learning Content on InfoQ

News

RSS Feed

Newer Older

AI, ML & Data Engineering

New IBM Granite 4 Models to Reduce AI Costs with Inference-Efficient Hybrid Mamba-2 Architecture

IBM recently announced the Granite 4.0 family of small language models. The model family aims to deliver faster speeds and significantly lower operational costs at acceptable accuracy vs. larger models. Granite 4.0 features a new hybrid Mamba/transformer architecture that largely reduces memory requirements, enabling Granite to run on significantly cheaper GPUs and at significantly reduced costs.

Bruno Couriol
on Nov 18, 2025
AI, ML & Data Engineering

KubeCon NA 2025 - Erica Hughberg and Alexa Griffith on Tools for the Age of GenAI

Generative AI technologies need to support new workloads, traffic patterns, and infrastructure demands and require a new set of tools for the age of GenAI. Erica Hughberg from Tetrate and Alexa Griffith from Bloomberg spoke last week at KubeCon + CloudNativeCon North America 2025 Conference about what it takes to build GenAI platforms capable of serving model inference at scale.

Srini Penchikala
on Nov 17, 2025
AI, ML & Data Engineering

Anthropic Adds Sandboxing and Web Access to Claude Code for Safer AI-Powered Coding

Anthropic released sandboxing capabilities for Claude Code and launched a web-based version of the tool that runs in isolated cloud environments. The company introduced these features to address security risks that arise when Claude Code writes, tests, and debugs code with broad access to developer codebases and files.

Vinod Goje
on Nov 14, 2025
AI, ML & Data Engineering

New Claude Haiku 4.5 Model Promises Faster Performance at One-Third the Cost

Anthropic released Claude Haiku 4.5, making the model available to all users as its latest entry in the small, fast model category. The company positions the new model as delivering performance levels comparable to Claude Sonnet 4, which launched five months ago as a state-of-the-art model, but at "one-third the cost and more than twice the speed."

Vinod Goje
on Nov 12, 2025
Development

How Meta Is Using AI to Standardize and Cut Carbon Emissions

Meta has developed an AI-based approach to improve the quality of Scope 3 emissions estimates across its IT hardware supply chain. The method combines machine learning and generative models to classify hardware components and infer missing product carbon footprint (PCF) data.

Matt Foster
on Oct 31, 2025
Mobile

Google Research Open-Sources the Coral NPU Platform to Help Build AI into Wearables and Edge Devices

Coral NPU is an open-source full-stack platform designed to help hardware engineers and AI developers overcome the limitations that prevent integrating AI in wearables and edge devices, including performance, fragmentation, and user trust.

Sergio De Simone
on Oct 22, 2025
Architecture & Design

Instagram Improves Engagement by Reducing Notification Fatigue with New Ranking Framework

Meta has introduced a diversity-aware ranking framework for Instagram notifications. The system applies multiplicative penalties to reduce repetitive alerts from the same creators or product surfaces, improving engagement while maintaining relevance and introducing content variety.

Leela Kumili
on Sep 29, 2025
Culture & Methods

An AI-Driven Approach to Creating Effective Learning Experiences at QCon

An experiment was created around a certification program influenced by AI at QCon London, which included special events during the conference, a pre-conference breakfast where participants could learn about upcoming activities, and an AI-driven workshop immediately following the conference. Wes Reisz spoke at InfoQ Dev Summit Boston about a program he led using AI.

Ben Linders
on Aug 28, 2025
AI, ML & Data Engineering

How Netflix is Reimagining Data Engineering for Video, Audio, and Text

Netflix has introduced a new engineering specialization—Media ML Data Engineering, alongside a Media Data Lake designed to handle video, audio, text, and image assets at scale. Early results include richer ML models trained on standardized media, faster evaluation cycles, and deeper insights into creative workflows.

Matt Foster
on Aug 25, 2025
AI, ML & Data Engineering

Roblox Open-Sources AI System to Detect Conversations Potentially Harmful to Kids

Roblox Sentinel is an AI system designed to detect early signs of potential child endangerment for further analysis and investigation. Implemented as a Python library, Sentinel uses contrastive learning to handle highly imbalanced datasets that often challenge traditional classifiers and can be applied to a wide range of use cases.

Sergio De Simone
on Aug 16, 2025
AI, ML & Data Engineering

Google Releases Major Firebase Studio Updates for Agentic AI Development

At Google Cloud Summit London in early July, Google revealed new capabilities in Firebase Studio that promise to enhance agentic cloud-based development: an autonomous Agent mode, native support for Model Context Protocol (MCP), and Gemini CLI integration. These updates aim to streamline agentic AI development by making AI agents more independent and seamlessly embedded in developer workflows.

Hien Luu
on Jul 31, 2025
AI, ML & Data Engineering

Databricks Agent Bricks Automates Enterprise AI Development with TAO and ALHF Methods

Databricks introduced Agent Bricks, a new product that changes how enterprises develop domain-specific agents. The automated workflow includes generating task-specific evaluations and LLM judges for quality assessment, creating synthetic data that resembles customer data to supplement agent learning, and searching across optimization techniques to refine agent performance.

Vinod Goje
on Jul 28, 2025
Cloud

Microsoft Adds Deep Research Capability in Azure AI Foundry Agent Service

Unlock the future of research with Microsoft’s Azure AI Foundry Agent Service, featuring Deep Research—an innovative tool that empowers knowledge workers in complex fields. This advanced AI capability autonomously analyzes and synthesizes web data, automating rigorous research tasks while ensuring traceability and transparency. Sign up for the public preview today!

Steef-Jan Wiggers
on Jul 14, 2025
Mobile

Arm Scalable Matrix Extension 2 Coming to Android to Accelerate On-Device AI

Available in the Armv9-A architecture, Arm Scalable Matrix Extension 2 (SME2) is a set of advanced CPU instructions designed to accelerate matrix heavy computation. The new Arm technology aims to help mobile developers to run advanced AI models directly on CPU with improved performance and efficiency, without requiring any changes to their apps.

Sergio De Simone
on Jul 13, 2025
Culture & Methods

The Rise of Energy and Water Consumption Using AI Models, and How It Can Be Reduced

Artificial intelligence's (AI) energy and water consumption has become a growing concern in the tech industry, particularly for large-scale machine learning models and data centers. Sustainable AI focuses on making AI technology more environmentally friendly and socially responsible.

Ben Linders
on Jun 26, 2025

Newer News

Older News

InfoQ Software Architects' Newsletter

News