Large language models Content on InfoQ
-
Llama 4 Scout and Maverick Now Available on Amazon Bedrock and SageMaker JumpStart
AWS recently announced the availability of Meta's latest foundation models, Llama 4 Scout and Llama 4 Maverick, in Amazon Bedrock and Amazon SageMaker JumpStart. Both models provide multimodal capabilities and use a mixture-of-experts architecture.
-
Mistral Unveils Medium 3: Enterprise-Ready Language Model
Mistral AI has unveiled Mistral Medium 3, a mid-sized language model aimed at enterprises seeking a balance between cost-efficiency, strong performance, and flexible deployment options. The model is now available through Mistral’s platform and Amazon SageMaker, with further releases planned for IBM watsonx, Azure AI Foundry, Google Cloud Vertex AI, and NVIDIA NIM.
-
CMU Researchers Introduce LegoGPT: Building Stable LEGO Structures from Text Prompts
Researchers at Carnegie Mellon University have introduced LegoGPT, a system that generates physically stable and buildable LEGO® structures from natural language descriptions. The project combines large language models with engineering constraints to produce designs that can be assembled manually or by robotic systems.
-
Anthropic Introduces Web Search Functionality for Claude Models
Anthropic has announced the addition of web search capabilities to its Claude models, available via the Anthropic API. This update enables Claude to access current information from the web, allowing developers to create applications and AI agents that provide up-to-date insights.
-
Meta Open Sources LlamaFirewall for AI Agent Combined Protection
LlamaFirewall is a security framework aimed at safeguarding AI agents against prompt injection, goal misalignment, and insecure code generation. It achieved over 90% efficacy in reducing attack success rates when evaluated on the AgentDojo benchmark. Additionally, developers can update its behavior by adding new security guardrails.
-
Meta Announces API and Protection Tools at First LlamaCon Event
At its first-ever LlamaCon event, Meta announced several new tools for building with its Llama AI models: a limited preview of the Llama API that allows developers to experiment with different models, and new Llama Protection Tools for securing AI applications.
-
Google Introduces DolphinGemma to Support Dolphin Communication Research
Google has released a new AI model called DolphinGemma, which has been developed to assist researchers in analyzing and interpreting dolphin vocalizations. The project is part of an ongoing collaboration with the Wild Dolphin Project (WDP) and researchers at Georgia Tech, and it focuses on identifying patterns in the natural communication of Atlantic spotted dolphins.
-
Hugging Face to Democratize Robotics with Open-Source Reachy 2 Robot
Hugging Face has acquired Pollen Robotics, a French startup that developed the humanoid robot Reachy 2. The acquisition aims to make robotics more accessible by open-sourcing the robot’s design and allowing developers to modify and improve its code.
-
Meta Launches AutoPatchBench to Evaluate LLM Agents on Security Fixes
AutoPatchBench is a standardized benchmark designed to help researchers and developers evaluate and compare how effectively LLM agents can automatically patch security vulnerabilities in C/C++ native code.
-
OpenAI Launches BrowseComp to Benchmark AI Agents' Web Search and Deep Research Skills
OpenAI has released BrowseComp, a new benchmark designed to test AI agents' ability to locate difficult-to-find information on the web. The benchmark contains 1,266 challenging problems that require agents to persistently navigate through multiple websites to retrieve entangled information.
-
Cloudflare AutoRAG Streamlines Retrieval-Augmented Generation
Cloudflare has launched a managed service for using retrieval-augmented generation in LLM-based systems. Now in beta, Cloudflare AutoRAG aims to make it easier for developers to build pipelines that integrate rich context data into LLMs.
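For readers unfamiliar with the pattern, the sketch below illustrates retrieval-augmented generation in general terms: rank stored documents against a query, then place the best matches in the prompt as context. It is a toy illustration using word overlap and does not use or represent Cloudflare's API. The function names `tokenize`, `retrieve`, and `build_prompt` and the sample documents are hypothetical.

```python
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return up to k documents sharing the most word tokens with the query."""
    query_tokens = tokenize(query)
    scored = [
        (sum((query_tokens & tokenize(doc)).values()), doc) for doc in documents
    ]
    # Highest overlap first; the sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Assemble a prompt that puts the retrieved context before the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical sample data, for illustration only.
DOCUMENTS = [
    "AutoRAG is a managed service that is now in beta.",
    "Reachy 2 is a humanoid robot.",
    "BrowseComp contains challenging web search problems.",
]

if __name__ == "__main__":
    print(build_prompt("What is AutoRAG?", DOCUMENTS, k=1))
```

A production pipeline would typically replace the word-overlap scoring with embedding similarity over an indexed document store and send the assembled prompt to a model.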
-
Scaling Financial Operations: Uber’s GenAI-Powered Approach to Invoice Automation
Uber recently described a GenAI-powered invoice processing system that halved manual effort, cut handling time by 70%, and delivered 25–30% cost savings. By leveraging GPT-4 and a modular platform called TextSense, Uber improved data accuracy by 90%, enabling globally scalable, efficient, and highly automated financial operations.
-
Docker Bridges Agents and Containers with New MCP Catalog and Toolkit
Docker has announced two new AI-focused tools—the Docker MCP Catalog and the Docker MCP Toolkit—to bring container-grade security and developer-friendly workflows to agentic applications, helping build a developer-centric ecosystem for Model Context Protocol (MCP) tools.
-
Google's Gemma 3 QAT Language Models Can Run Locally on Consumer-Grade GPUs
Google released the Gemma 3 QAT family, quantized versions of its open-weight Gemma 3 language models. The models use Quantization-Aware Training (QAT) to maintain high accuracy when the weights are quantized from 16 to 4 bits.
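To see why going from 16-bit to 4-bit weights matters for consumer-grade GPUs, here is a rough back-of-the-envelope sketch of the memory needed for the weights alone. The 27-billion parameter count is an illustrative assumption, and the estimate ignores activations, the KV cache, and any quantization metadata, so real memory use will be higher.

```python
def weight_memory_gib(num_params: int, bits_per_weight: int) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Assumed example size (illustrative only): 27 billion parameters.
NUM_PARAMS = 27_000_000_000

if __name__ == "__main__":
    for bits in (16, 4):
        gib = weight_memory_gib(NUM_PARAMS, bits)
        print(f"{bits:>2}-bit weights: about {gib:.1f} GiB")
```

Whatever the parameter count, the weight footprint shrinks by a factor of four, since 16 / 4 = 4.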
-
Google DeepMind Shares Approach to AGI Safety and Security
Google DeepMind has released a new paper outlining its approach to safety and security in the development of artificial general intelligence (AGI). AGI refers to AI systems that are as capable as humans at most cognitive tasks.