InfoQ Homepage AI, ML & Data Engineering Content on InfoQ

AI, ML & Data Engineering

Docker’s Cagent Brings Deterministic Testing to AI Agents

Docker is positioning its Cagent runtime as a way to bring deterministic testing back to AI agents, addressing a growing problem for teams building production agentic systems.

Matt Foster
on Jan 19, 2026
AI, ML & Data Engineering

Hugging Face Releases FineTranslations, a Trillion-Token Multilingual Parallel Text Dataset

Hugging Face has released FineTranslations, a large-scale multilingual dataset containing more than 1 trillion tokens of parallel text across English and 500+ languages. The dataset was created by translating non-English content from the FineWeb2 corpus into English using Gemma3 27B, with the full data generation pipeline designed to be reproducible and publicly documented.

Robert Krzaczyński
on Jan 18, 2026
Mobile

Android Studio Otter Boosts Agent Workflows and Adds LLM Flexibility

The latest Android Studio Otter feature drop makes it easier for developers to integrate AI-powered tools into their workflows. Additions include the ability to choose which LLM to use, an enhanced agent mode with device interaction, support for natural language testing, and more.

Sergio De Simone
on Jan 17, 2026
AI, ML & Data Engineering

Cloudflare Introduces Aggregations in R2 SQL for Data Analytics

Cloudflare recently announced support for aggregations in R2 SQL, the service that lets developers run SQL queries on data stored in R2. The enhancement expands R2 SQL beyond basic filtering and makes it more useful for analytical workloads without requiring separate data warehouse tools.

Renato Losio
on Jan 17, 2026
Cloud

AWS Hikes EC2 Capacity Block Rates by 15% in Uniform ML Pricing Adjustment

AWS has raised EC2 Capacity Block prices for ML by 15% across all regions, impacting GPU-based workloads. The uniform price hikes affect top-tier instances powered by NVIDIA GPUs, underscoring supply chain pressures and inflation. With limited alternatives, organizations face higher costs, emphasizing the need for effective workload optimization and cost management strategies.

Steef-Jan Wiggers
on Jan 15, 2026
AI, ML & Data Engineering

Mistral Releases OCR 3 with Improved Accuracy on Handwritten and Structured Documents

Mistral has released Mistral OCR 3, the latest version of its optical character recognition model, focused on higher accuracy across a wide range of document types, including handwritten notes, forms, low-quality scans, and complex tables.

Robert Krzaczyński
on Jan 15, 2026
Architecture & Design

How Agoda Unified Multiple Data Pipelines into a Single Source of Truth

Agoda recently described how it consolidated multiple independent data pipelines into a centralized Apache Spark-based platform to eliminate inconsistencies in financial data. The company implemented a multi-layered quality framework that combines automated validations, machine-learning-based anomaly detection, and data contracts, while processing millions of daily booking transactions.

Eran Stiller
on Jan 14, 2026
AI, ML & Data Engineering

AI-Powered Code Editor Cursor Introduces Dynamic Context Discovery to Improve Token-Efficiency

Cursor introduced a new approach to minimize the context size of requests sent to large language models. Called dynamic context discovery, this method moves away from including large amounts of static context upfront, allowing the agent to dynamically retrieve only the information it needs. This reduces token usage and limits the inclusion of potentially confusing or irrelevant details.

Sergio De Simone
on Jan 14, 2026
AI, ML & Data Engineering

Vercel Open-Sources Bash Tool for Context Retrieval Using Local Filesystems

Vercel has open-sourced bash-tool, which provides a Bash execution engine for AI agents and lets them run filesystem-based commands to retrieve context for model prompts.

Daniel Dominguez
on Jan 14, 2026
AI, ML & Data Engineering

QCon London 2026: Practitioner-Led Tracks on Connectivity & Production AI Engineering

QCon London 2026 returns March 16–19 with 15 tracks for senior leads. Key sessions cover system integration via MCP, AI engineering, and debugging distributed systems. Explore modern security, Staff+ insights, and performance optimization through peer-led, practical discussions.

Artenisa Chatziou
on Jan 13, 2026
Architecture & Design

Solving Fragmented Mobile Analytics: Uber’s Platform-Led Approach

Uber Engineering outlines its platform-led mobile analytics redesign, standardizing event instrumentation across iOS and Android to improve cross-platform consistency, reduce engineering effort, and provide reliable insights for product and data teams.

Leela Kumili
on Jan 13, 2026
AI, ML & Data Engineering

Google Introduces Conductor, a Context-Driven Development Extension for Gemini CLI

Google has released Conductor, a new preview extension for Gemini CLI that introduces a structured, context-driven approach to AI-assisted software development. The extension is designed to address a common limitation of chat-based coding tools: the loss of project context across sessions.

Robert Krzaczyński
on Jan 13, 2026
AI, ML & Data Engineering

Google Releases Gemma Scope 2 to Deepen Understanding of LLM Behavior

Gemma Scope 2 is a suite of tools designed to interpret the behavior of Gemma 3 models, enabling researchers to analyze emergent model behaviors, audit and debug AI agents, and devise mitigation strategies against safety issues like jailbreaks, hallucinations, and sycophancy.

Sergio De Simone
on Jan 12, 2026
AI, ML & Data Engineering

FACTS Benchmark Suite Introduced to Evaluate Factual Accuracy of Large Language Models

The FACTS Benchmark Suite, a new industry benchmark for systematically evaluating the factual accuracy of LLMs, has been released. Developed by the FACTS team in collaboration with Kaggle, the suite expands earlier work on factual grounding and introduces a broader, multi-dimensional framework for measuring how reliably language models produce factually correct responses.

Robert Krzaczyński
on Jan 12, 2026
Development

Inside the Development Workflow of Claude Code's Creator

Claude Code's creator Boris Cherny described how he uses it at Anthropic, highlighting practices such as running parallel instances, sharing learnings, automating prompting, and rigorously verifying results to compound productivity over time.

Sergio De Simone
on Jan 10, 2026
