Large language models Content on InfoQ
-
Hugging Face Releases FineTranslations, a Trillion-Token Multilingual Parallel Text Dataset
Hugging Face has released FineTranslations, a large-scale multilingual dataset containing more than 1 trillion tokens of parallel text pairing English with more than 500 languages. The dataset was created by translating non-English content from the FineWeb2 corpus into English using Gemma 3 27B, with the full data generation pipeline designed to be reproducible and publicly documented.
-
Mistral Releases OCR 3 with Improved Accuracy on Handwritten and Structured Documents
Mistral has released Mistral OCR 3, the latest version of its optical character recognition model, focused on higher accuracy across a wide range of document types, including handwritten notes, forms, low-quality scans, and complex tables.
-
AI-Powered Code Editor Cursor Introduces Dynamic Context Discovery to Improve Token-Efficiency
Cursor introduced a new approach to minimize the context size of requests sent to large language models. Called dynamic context discovery, this method moves away from including large amounts of static context upfront, allowing the agent to dynamically retrieve only the information it needs. This reduces token usage and limits the inclusion of potentially confusing or irrelevant details.
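As a rough illustration of the idea, and not Cursor's actual implementation, the sketch below contrasts sending every file upfront with looking up only the files whose names match the task at hand. The `FILES` dictionary and the functions `static_context`, `discover_context`, and `rough_size` are hypothetical names invented for this example, and a word count stands in for a real token count.

```python
# Hypothetical in-memory "workspace"; names and contents are made up for illustration.
FILES = {
    "auth.py": "def login(user, password): ...",
    "billing.py": "def charge(card, amount): ...",
    "README.md": "Project overview and setup notes.",
}


def static_context(files):
    """Static approach: concatenate every file into the prompt upfront."""
    return "\n".join(f"# {name}\n{body}" for name, body in files.items())


def discover_context(files, query):
    """Dynamic approach: include only files whose name contains a query keyword."""
    keywords = query.lower().split()
    hits = {
        name: body
        for name, body in files.items()
        if any(keyword in name.lower() for keyword in keywords)
    }
    return "\n".join(f"# {name}\n{body}" for name, body in hits.items())


def rough_size(text):
    """Crude stand-in for a token count: whitespace-separated words."""
    return len(text.split())


full = static_context(FILES)
narrow = discover_context(FILES, "fix billing bug")

# The narrowed context contains only billing.py and is smaller than the full dump.
print(rough_size(full), rough_size(narrow))
```

A real agent would presumably discover context through tool calls such as file search or reads rather than filename matching; the keyword filter here is only a placeholder for that step.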
-
Vercel Open-Sources Bash Tool for Context Retrieval Using Local Filesystems
Vercel has open-sourced bash-tool, which provides a Bash execution engine for AI agents, enabling them to run filesystem-based commands to retrieve context for model prompts.
-
Google Releases Gemma Scope 2 to Deepen Understanding of LLM Behavior
Gemma Scope 2 is a suite of tools designed to interpret the behavior of Gemma 3 models, enabling researchers to analyze emergent model behaviors, audit and debug AI agents, and devise mitigation strategies against safety issues like jailbreaks, hallucinations, and sycophancy.
-
NVIDIA Releases Open Models, Datasets, and Tools across AI, Robotics, and Autonomous Driving
NVIDIA has released a set of open models, datasets, and development tools covering language, agentic systems, robotics, autonomous driving, and biomedical research. The update expands several existing NVIDIA model families and makes accompanying training data and reference implementations available through GitHub, Hugging Face, and NVIDIA’s developer platforms.
-
Meta Applies Mutation Testing with LLM to Improve Compliance Coverage
Meta applies large language models to mutation testing through its Automated Compliance Hardening system, generating targeted mutants and tests to improve compliance coverage, reduce overhead, and detect privacy and safety risks. The approach supports scalable, LLM-driven test generation and continuous compliance across Meta’s platforms.
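For readers unfamiliar with mutation testing in general, the minimal sketch below shows the core idea: a small deliberate change (a mutant) is introduced into the code, and a test suite is judged by whether it catches that change. This is a generic, hand-written illustration and not Meta's system; in the approach described above, an LLM would generate the mutants and tests. All function names here are hypothetical.

```python
def is_adult(age):
    """Original code under test."""
    return age >= 18


def is_adult_mutant(age):
    """Mutant: the boundary operator >= is changed to >."""
    return age > 18


def weak_test(fn):
    """Passes for both versions, so it cannot detect the mutant."""
    return fn(30) is True and fn(5) is False


def boundary_test(fn):
    """Checks the boundary value 18, where the two versions differ."""
    return fn(18) is True


def kills(test, original, mutant):
    """A test kills a mutant if it passes on the original and fails on the mutant."""
    return test(original) and not test(mutant)


print(kills(weak_test, is_adult, is_adult_mutant))      # False: mutant survives
print(kills(boundary_test, is_adult, is_adult_mutant))  # True: mutant is killed
```

A surviving mutant points to a gap in the tests, which is the signal used to decide where new tests are needed.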
-
Intel DeepMath Introduces a Smart Architecture to Make LLMs Better at Math
Intel has announced DeepMath, a lightweight agent built on Qwen3-Thinking that specializes in solving mathematical problems. To address common limitations of LLMs in math reasoning, DeepMath generates small Python scripts that support and enhance its problem-solving process.
-
Google’s Eight Essential Multi-Agent Design Patterns
Google recently published a guide outlining eight essential design patterns for multi-agent systems, ranging from sequential pipelines to human-in-the-loop architecture. The guide provides concrete explanations of each pattern along with sample code for Google's Agent Development Kit.
-
Microsoft Research Develops Novel Approaches to Enforce Privacy in AI Models
A team of AI researchers at Microsoft introduces two novel approaches for enforcing contextual integrity in large language models: PrivacyChecker, an open-source lightweight module that acts as a privacy shield during inference, and CI-CoT + CI-RL, an advanced training method designed to teach models to reason about privacy.
-
Swiggy Rolls out Hermes V3: from Text-to-SQL to Conversational AI
Swiggy has released Hermes V3, a GenAI-powered text-to-SQL assistant that enables employees to query data in plain English. The Slack-native system combines vector retrieval, conversational memory, agentic orchestration, and explainability to improve SQL accuracy and support multi-turn analytical queries.
-
Open-Source Agent Sandbox Enables Secure Deployment of AI Agents on Kubernetes
The Agent Sandbox is an open-source Kubernetes controller that provides a declarative API for managing a single, stateful pod with stable identity and persistent storage. It is particularly well suited for creating isolated environments to execute untrusted, LLM-generated code, as well as for running other stateful workloads.
-
Cactus v1: Cross-Platform LLM Inference on Mobile with Zero Latency and Full Privacy
Cactus, a Y Combinator-backed startup, brings local AI inference to mobile phones, wearables, and other low-power devices through cross-platform, energy-efficient kernels and a native runtime. It delivers sub-50ms time-to-first-token for on-device inference, eliminates network latency, and defaults to complete privacy.
-
Target Improves Add to Cart Interactions by 11 Percent with Generative AI Recommendations
Target has deployed GRAM, a GenAI-powered accessory recommendation system for the Home category, using large language models to prioritize product attributes and capture aesthetic cohesion. The system helps shoppers find compatible accessories, integrates human-in-the-loop curation, and achieved measurable improvements in engagement and conversion.
-
Toad: a Unified CLI Tool for All Your LLMs That Promises Improved UX from Existing Ones
During his sabbatical, Will McGugan, maker of Rich and Textual, frameworks for building text-based user interfaces (TUIs), put his UI skills to work to build Toad. The newly released tool aims to provide a unified, "beautiful" terminal interface for multiple coding agents, all accessible from one tool through the Agent Client Protocol (ACP).