Large Language Models Content on InfoQ
-
Hugging Face to Democratize Robotics with Open-Source Reachy 2 Robot
Hugging Face has acquired Pollen Robotics, a French startup that developed the humanoid robot Reachy 2. The acquisition aims to make robotics more accessible by open-sourcing the robot’s design and allowing developers to modify and improve its code.
-
Meta Launches AutoPatchBench to Evaluate LLM Agents on Security Fixes
AutoPatchBench is a standardized benchmark designed to help researchers and developers evaluate and compare how effectively LLM agents can automatically patch security vulnerabilities in C/C++ native code.
-
OpenAI Launches BrowseComp to Benchmark AI Agents' Web Search and Deep Research Skills
OpenAI has released BrowseComp, a new benchmark designed to test AI agents' ability to locate difficult-to-find information on the web. The benchmark contains 1,266 challenging problems that require agents to persistently navigate through multiple websites to retrieve entangled information.
-
Cloudflare AutoRAG Streamlines Retrieval-Augmented Generation
Cloudflare has launched a managed service for using retrieval-augmented generation in LLM-based systems. Now in beta, Cloudflare AutoRAG aims to make it easier for developers to build pipelines that integrate rich context data into LLMs.
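The retrieve-then-prompt pattern that a service like AutoRAG manages can be sketched in plain Python (all names here are illustrative, not Cloudflare's API): rank stored documents by similarity to the query embedding, then splice the top matches into the prompt as grounding context.

```python
from dataclasses import dataclass

@dataclass
class Document:
    text: str
    embedding: list  # vector produced by an embedding model

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def retrieve(query_embedding, corpus, k=2):
    # Rank the corpus by similarity to the query and keep the top k.
    ranked = sorted(corpus, key=lambda d: dot(query_embedding, d.embedding),
                    reverse=True)
    return ranked[:k]

def build_prompt(question, docs):
    # Splice the retrieved passages into the prompt as grounding context.
    context = "\n".join(d.text for d in docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

A managed service additionally handles embedding, indexing, and keeping the corpus fresh; this sketch covers only the retrieval and prompt-assembly step.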
-
Scaling Financial Operations: Uber’s GenAI-Powered Approach to Invoice Automation
Uber recently described a GenAI-powered invoice processing system that halved manual effort, cut handling time by 70%, and delivered 25–30% cost savings. By leveraging GPT-4 and a modular platform called TextSense, Uber improved data accuracy by 90%, enabling globally scalable, efficient, and highly automated financial operations.
-
Docker Bridges Agents and Containers with New MCP Catalog and Toolkit
Docker has announced two new AI-focused tools—the Docker MCP Catalog and the Docker MCP Toolkit—to bring container-grade security and developer-friendly workflows to agentic applications, helping build a developer-centric ecosystem for Model Context Protocol (MCP) tools.
-
Google's Gemma 3 QAT Language Models Can Run Locally on Consumer-Grade GPUs
Google released the Gemma 3 QAT family, quantized versions of their open-weight Gemma 3 language models. The models use Quantization-Aware Training (QAT) to maintain high accuracy when the weights are quantized from 16 to 4 bits.
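The core trick behind QAT can be illustrated in a few lines of NumPy (a minimal sketch, not Google's training code): the forward pass runs through a "fake-quantized" copy of the weights, so the training loss already accounts for 4-bit rounding error, while gradients continue to update the full-precision weights (typically via a straight-through estimator).

```python
import numpy as np

def fake_quantize(w, bits=4):
    # Symmetric uniform quantization: snap weights onto 2**bits levels,
    # then map back to floats so training "sees" the rounding error.
    qmax = 2 ** (bits - 1) - 1            # 7 for 4-bit
    scale = np.abs(w).max() / qmax
    if scale == 0.0:
        return w
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale
```

Because the network learns with this error present, accuracy degrades far less when the weights are finally stored at 4 bits than it would under post-training quantization.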
-
Google DeepMind Shares Approach to AGI Safety and Security
Google DeepMind has released a new paper outlining its approach to safety and security in the development of artificial general intelligence (AGI). AGI refers to AI systems that are as capable as humans at most cognitive tasks.
-
PayPal's New Agent Toolkit Connects AI Frameworks with Payment APIs through MCP
PayPal has released its Agent Toolkit, designed to help developers integrate PayPal's API suite with AI frameworks through the Model Context Protocol (MCP). The toolkit provides access to APIs for payments, invoices, disputes, shipment tracking, catalog management, subscriptions, and analytics capabilities.
-
AWS Promotes Responsible AI in the Well-Architected Generative AI Lens
AWS announced the availability of the new Well-Architected Generative AI Lens, focused on providing best practices for designing and operating generative AI workloads. The lens is aimed at organizations delivering robust and cost-effective generative AI solutions on AWS. The document offers cloud-agnostic best practices, implementation guidance and links to additional resources.
-
DeepMind Researchers Propose Defense against LLM Prompt Injection
To prevent prompt injection attacks when working with untrusted sources, Google DeepMind researchers have proposed CaMeL, a defense layer around LLMs that blocks malicious inputs by extracting the control and data flows from the query. According to their results, CaMeL can neutralize 67% of attacks in the AgentDojo security benchmark.
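The underlying principle — untrusted data may flow through the system as a value, but must never steer control decisions such as which tool is called or with which privileged arguments — can be sketched in a few lines of Python. This is illustrative only; CaMeL's actual design extracts a program from the trusted query and enforces capability policies on every tool call.

```python
from dataclasses import dataclass

@dataclass
class Tainted:
    # A value originating from an untrusted source (e.g. a fetched web page),
    # tagged so policies can track its provenance.
    value: str
    source: str

def send_email(to, body):
    # Policy: untrusted data may appear in the message body (data flow),
    # but it can never choose the recipient (a control decision).
    if isinstance(to, Tainted):
        raise PermissionError(
            f"untrusted value from {to.source!r} cannot set the recipient")
    text = body.value if isinstance(body, Tainted) else body
    return f"sent to {to}"
```

An injected instruction inside fetched content can thus be quoted in an email, but cannot redirect the email to an attacker's address.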
-
Microsoft Native 1-Bit LLM Could Bring Efficient GenAI to Everyday CPUs
In a recent paper, Microsoft researchers described BitNet b1.58 2B4T, the first LLM to be natively trained using "1-bit" (technically, 1-trit) weights, rather than being quantized from a model trained with floating point weights. According to Microsoft, the model delivers performance comparable to full-precision LLMs of similar size at a fraction of the computation cost and hardware requirements.
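The "1-trit" idea can be sketched with NumPy (a simplified illustration of absmean ternary quantization, not Microsoft's implementation): each weight is scaled by the tensor's mean absolute value and rounded to -1, 0, or +1, so matrix multiplies reduce to additions and subtractions plus a single rescaling.

```python
import numpy as np

def ternarize(w, eps=1e-8):
    # Absmean scaling, then round each weight to the nearest of {-1, 0, +1}.
    scale = np.abs(w).mean() + eps
    t = np.clip(np.round(w / scale), -1, 1)
    return t, scale  # effective weight matrix is t * scale
```

With only three possible weight values, inference needs no floating-point multiplications in the weight matrices, which is what makes CPU-only deployment plausible.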
-
Google DeepMind Introduces QuestBench to Evaluate LLMs in Solving Logic and Math Problems
Google DeepMind’s QuestBench benchmark evaluates whether LLMs can pinpoint the single, crucial question needed to solve logic, planning, or math problems. The DeepMind team recently published an article on QuestBench, a set of underspecified reasoning tasks solvable by asking at most one question.
-
Docker Model Runner Aims to Make it Easier to Run LLMs Locally
Currently in preview with Docker Desktop 4.40 for macOS on Apple Silicon, Docker Model Runner allows developers to run models locally and iterate on application code using local models, without disrupting their container-based workflows.
-
AI Continent: European Commission Outlines Strategy for Scaling AI Development
The European Commission has presented the AI Continent Action Plan, a new strategy designed to strengthen the European Union’s capacity for AI development and deployment. The plan outlines coordinated investment in infrastructure, access to high-quality data, AI adoption in strategic sectors, and support for regulatory implementation.