Large language models Content on InfoQ

News

AI, ML & Data Engineering

MIT's Recursive Language Models Improve Performance on Long-Context Tasks

Researchers at MIT's CSAIL published a design for Recursive Language Models (RLM), a technique for improving LLM performance on long-context tasks. RLMs use a programming environment to recursively decompose and process inputs, and can handle prompts up to 100x longer than base LLMs.

Anthony Alford
on Jan 20, 2026
AI, ML & Data Engineering

Hugging Face Releases FineTranslations, a Trillion-Token Multilingual Parallel Text Dataset

Hugging Face has released FineTranslations, a large-scale multilingual dataset containing more than 1 trillion tokens of parallel text across English and 500+ languages. The dataset was created by translating non-English content from the FineWeb2 corpus into English using Gemma 3 27B, with the full data generation pipeline designed to be reproducible and publicly documented.

Robert Krzaczyński
on Jan 18, 2026
AI, ML & Data Engineering

Mistral Releases OCR 3 with Improved Accuracy on Handwritten and Structured Documents

Mistral has released Mistral OCR 3, the latest version of its optical character recognition model, focused on higher accuracy across a wide range of document types, including handwritten notes, forms, low-quality scans, and complex tables.

Robert Krzaczyński
on Jan 15, 2026
AI, ML & Data Engineering

AI-Powered Code Editor Cursor Introduces Dynamic Context Discovery to Improve Token-Efficiency

Cursor introduced a new approach to minimize the context size of requests sent to large language models. Called dynamic context discovery, this method moves away from including large amounts of static context upfront, allowing the agent to dynamically retrieve only the information it needs. This reduces token usage and limits the inclusion of potentially confusing or irrelevant details.

Sergio De Simone
on Jan 14, 2026
AI, ML & Data Engineering

Vercel Open-Sources Bash Tool for Context Retrieval Using Local Filesystems

Vercel has open-sourced bash-tool, which provides a Bash execution engine for AI agents, enabling them to run filesystem-based commands to retrieve context for model prompts.

Daniel Dominguez
on Jan 14, 2026
AI, ML & Data Engineering

Google Releases Gemma Scope 2 to Deepen Understanding of LLM Behavior

Gemma Scope 2 is a suite of tools designed to interpret the behavior of Gemma 3 models, enabling researchers to analyze emergent model behaviors, audit and debug AI agents, and devise mitigation strategies against issues like jailbreaks, hallucinations, and sycophancy.

Sergio De Simone
on Jan 12, 2026
AI, ML & Data Engineering

NVIDIA Releases Open Models, Datasets, and Tools across AI, Robotics, and Autonomous Driving

NVIDIA has released a set of open models, datasets, and development tools covering language, agentic systems, robotics, autonomous driving, and biomedical research. The update expands several existing NVIDIA model families and makes accompanying training data and reference implementations available through GitHub, Hugging Face, and NVIDIA’s developer platforms.

Robert Krzaczyński
on Jan 10, 2026
Architecture & Design

Meta Applies Mutation Testing with LLM to Improve Compliance Coverage

Meta applies large language models to mutation testing through its Automated Compliance Hardening system, generating targeted mutants and tests to improve compliance coverage, reduce overhead, and detect privacy and safety risks. The approach supports scalable, LLM-driven test generation and continuous compliance across Meta’s platforms.

Leela Kumili
on Jan 06, 2026
AI, ML & Data Engineering

Intel DeepMath Introduces a Smart Architecture to Make LLMs Better at Math

Intel has announced DeepMath, a lightweight agent built on Qwen3-Thinking that specializes in solving mathematical problems. To address common limitations of LLMs in math reasoning, DeepMath generates small Python scripts that support and enhance its problem-solving process.

Sergio De Simone
on Jan 05, 2026
AI, ML & Data Engineering

Google’s Eight Essential Multi-Agent Design Patterns

Google recently published a guide outlining eight essential design patterns for multi-agent systems, ranging from sequential pipelines to human-in-the-loop architecture. The guide provides concrete explanations of each pattern along with sample code for Google's Agent Development Kit.

Sergio De Simone
on Jan 05, 2026
AI, ML & Data Engineering

Microsoft Research Develops Novel Approaches to Enforce Privacy in AI Models

A team of AI researchers at Microsoft has introduced two novel approaches for enforcing contextual integrity in large language models: PrivacyChecker, an open-source lightweight module that acts as a privacy shield during inference, and CI-CoT + CI-RL, an advanced training method designed to teach models to reason about privacy.

Sergio De Simone
on Jan 02, 2026
Architecture & Design

Swiggy Rolls out Hermes V3: from Text-to-SQL to Conversational AI

Swiggy has released Hermes V3, a GenAI-powered text-to-SQL assistant that enables employees to query data in plain English. The Slack-native system combines vector retrieval, conversational memory, agentic orchestration, and explainability to improve SQL accuracy and support multi-turn analytical queries.

Leela Kumili
on Jan 02, 2026
DevOps

Open-Source Agent Sandbox Enables Secure Deployment of AI Agents on Kubernetes

The Agent Sandbox is an open-source Kubernetes controller that provides a declarative API for managing a single, stateful pod with stable identity and persistent storage. It is particularly well suited for creating isolated environments to execute untrusted, LLM-generated code, as well as for running other stateful workloads.

Sergio De Simone
on Dec 30, 2025
Mobile

Cactus v1: Cross-Platform LLM Inference on Mobile with Zero Latency and Full Privacy

Cactus, a Y Combinator-backed startup, brings local AI inference to mobile phones, wearables, and other low-power devices through cross-platform, energy-efficient kernels and a native runtime. It delivers sub-50ms time-to-first-token for on-device inference, eliminates network latency, and defaults to complete privacy.

Sergio De Simone
on Dec 24, 2025
Architecture & Design

Target Improves Add to Cart Interactions by 11 Percent with Generative AI Recommendations

Target has deployed GRAM, a GenAI-powered accessory recommendation system for the Home category, using large language models to prioritize product attributes and capture aesthetic cohesion. The system helps shoppers find compatible accessories, integrates human-in-the-loop curation, and achieved measurable improvements in engagement and conversion.

Leela Kumili
on Dec 22, 2025
