PayPal's New Agent Toolkit Connects AI Frameworks with Payment APIs through MCP
PayPal has released its Agent Toolkit, designed to help developers integrate PayPal's API suite with AI frameworks through the Model Context Protocol (MCP). The toolkit provides access to APIs for payments, invoices, disputes, shipment tracking, catalog management, subscriptions, and analytics capabilities.
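To make the MCP angle concrete, here is a minimal sketch of how a payment-style capability can be exposed as an MCP tool using the official MCP Python SDK's FastMCP helper. It is illustrative only: the PayPal Agent Toolkit ships its own server, and the create_invoice tool, its parameters, and the placeholder response below are hypothetical.

```python
# Illustrative sketch only: the PayPal Agent Toolkit ships its own MCP server,
# and the tool name, parameters, and response below are hypothetical.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("paypal-demo")

@mcp.tool()
def create_invoice(recipient_email: str, amount: float, currency: str = "USD") -> str:
    """Create a draft invoice for the given recipient and amount."""
    # A real integration would call PayPal's Invoicing API through an
    # authenticated client; here we simply echo the request.
    return f"Draft invoice for {amount} {currency} to {recipient_email}"

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio so an MCP-aware agent can call it
```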
-
AWS Promotes Responsible AI in the Well-Architected Generative AI Lens
AWS announced the availability of the new Well-Architected Generative AI Lens, focused on providing best practices for designing and operating generative AI workloads. The lens is aimed at organizations delivering robust and cost-effective generative AI solutions on AWS. The document offers cloud-agnostic best practices, implementation guidance and links to additional resources.
-
DeepMind Researchers Propose Defense against LLM Prompt Injection
To prevent prompt injection attacks when working with untrusted sources, Google DeepMind researchers have proposed CaMeL, a defense layer around LLMs that blocks malicious inputs by extracting the control and data flows from the query. According to their results, CaMeL can neutralize 67% of attacks in the AgentDojo security benchmark.
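The core idea is to let control flow come only from the trusted user query, while values extracted from untrusted content are tracked and checked before they reach sensitive tools. The following Python sketch illustrates that separation in a few lines; the Value wrapper, the quarantined_parse helper, and the policy check are illustrative assumptions, not DeepMind's implementation.

```python
# Minimal sketch of the control/data-flow separation behind CaMeL, not DeepMind's code.
from dataclasses import dataclass

@dataclass
class Value:
    data: str
    trusted: bool  # provenance: did this come from the user or an untrusted source?

def quarantined_parse(untrusted_text: str) -> Value:
    # A "quarantined" model would extract structured data here; it can read
    # untrusted content but cannot decide which tools get called.
    return Value(data=untrusted_text.strip(), trusted=False)

def send_email(to: Value, body: Value) -> None:
    # Policy check before a sensitive tool runs: recipients derived from
    # untrusted data (e.g. an attacker-controlled document) are rejected.
    if not to.trusted:
        raise PermissionError("blocked: email recipient came from untrusted data")
    print(f"sending to {to.data}: {body.data}")

# Control flow comes only from the trusted user query ("email Bob the summary"),
# so the recipient is a trusted literal, while the body may be untrusted.
recipient = Value("bob@example.com", trusted=True)
summary = quarantined_parse("...text retrieved from an external document...")
send_email(recipient, summary)
```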
-
Microsoft's Native 1-Bit LLM Could Bring Efficient genAI to Everyday CPUs
In a recent paper, Microsoft researchers described BitNet b1.58 2B4T, the first LLM to be natively trained using "1-bit" (technically, 1-trit) weights, rather than being quantized from a model trained with floating point weights. According to Microsoft, the model delivers performance comparable to full-precision LLMs of similar size at a fraction of the computation cost and hardware requirements.
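The "1-trit" remark refers to weights that take only the values -1, 0, or +1. Below is a minimal NumPy sketch of absmean-style ternary quantization as used by BitNet-style models; it is illustrative only and is not Microsoft's training code.

```python
# Sketch of ternary ("1.58-bit") weight quantization: each weight is mapped to
# {-1, 0, +1} by scaling with the mean absolute value and rounding.
import numpy as np

def absmean_ternary(w: np.ndarray, eps: float = 1e-6):
    scale = np.mean(np.abs(w)) + eps          # per-tensor scaling factor
    q = np.clip(np.round(w / scale), -1, 1)   # ternary weights in {-1, 0, +1}
    return q, scale                           # w is approximated by q * scale

w = np.random.randn(4, 4) * 0.1
q, scale = absmean_ternary(w)
print(q)           # only -1, 0, or 1: matrix multiplies reduce to additions/subtractions
print(q * scale)   # dequantized approximation of the original weights
```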
-
Google DeepMind Introduces QuestBench to Evaluate LLMs in Solving Logic and Math Problems
Google DeepMind's QuestBench benchmark evaluates whether LLMs can pinpoint the single, crucial question needed to solve logic, planning, or math problems. The DeepMind team recently published an article on QuestBench, a set of underspecified reasoning tasks solvable by asking at most one question.
-
Docker Model Runner Aims to Make it Easier to Run LLMs Locally
Currently in preview with Docker Desktop 4.40 for macOS on Apple Silicon, Docker Model Runner allows developers to run models locally and iterate on application code using local models, without disrupting their container-based workflows.
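Model Runner serves pulled models through an OpenAI-compatible API, so existing client code can be pointed at the local endpoint. The sketch below assumes host-side TCP access is enabled; the base URL, port, and model name are assumptions that depend on the local setup.

```python
# Sketch of calling a locally served model through Model Runner's
# OpenAI-compatible endpoint. The URL/port and model name are assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:12434/engines/v1",  # host-side TCP endpoint, if enabled
    api_key="not-needed-locally",
)

resp = client.chat.completions.create(
    model="ai/smollm2",  # a model previously pulled with `docker model pull`
    messages=[{"role": "user", "content": "Summarize what Docker Model Runner does."}],
)
print(resp.choices[0].message.content)
```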
-
AI Continent: European Commission Outlines Strategy for Scaling AI Development
The European Commission has presented the AI Continent Action Plan, a new strategy designed to strengthen the European Union’s capacity for AI development and deployment. The plan outlines coordinated investment in infrastructure, access to high-quality data, AI adoption in strategic sectors, and support for regulatory implementation.
-
FastAPI-MCP: Simplifying the Integration of FastAPI with AI Agents
A new open-source library, FastAPI-MCP, is making it easier for developers to connect traditional FastAPI applications with modern AI agents through the Model Context Protocol (MCP). Designed for zero-configuration setup, FastAPI-MCP allows developers to automatically expose their API endpoints as MCP-compatible tools.
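In practice, the zero-configuration claim means wrapping an existing FastAPI app and mounting an MCP endpoint alongside its regular routes. The sketch below follows the pattern described in the project's documentation; exact class and method names may differ between library versions.

```python
# Minimal sketch of exposing existing FastAPI endpoints as MCP tools.
from fastapi import FastAPI
from fastapi_mcp import FastApiMCP

app = FastAPI()

@app.get("/items/{item_id}", operation_id="get_item")
def get_item(item_id: int):
    """A normal REST endpoint; the operation_id becomes the MCP tool name."""
    return {"item_id": item_id, "name": f"Item {item_id}"}

mcp = FastApiMCP(app)
mcp.mount()  # serves an MCP endpoint alongside the existing routes
```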
-
Google Releases Open-Source Agent Development Kit for Multi-Agent AI Applications
At Google Cloud Next 2025, Google announced the Agent Development Kit (ADK), an open-source framework aimed at simplifying the development of intelligent, multi-agent applications. The toolkit is designed to support developers across the entire lifecycle of agentic systems — from logic design and orchestration to debugging, evaluation, and deployment.
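As a rough illustration of what an ADK agent definition looks like, the sketch below defines a single agent with one plain-Python tool. It is based on the project's public quickstart; the module path, model identifier, and tool are assumptions and may differ across versions.

```python
# Sketch of a single ADK agent with one tool; names and model are assumptions.
from google.adk.agents import Agent

def get_exchange_rate(currency: str) -> dict:
    """Toy tool: return a hard-coded exchange rate for the given currency."""
    rates = {"EUR": 1.08, "GBP": 1.27}
    return {"currency": currency, "usd_rate": rates.get(currency.upper())}

root_agent = Agent(
    name="fx_assistant",
    model="gemini-2.0-flash",  # assumed model identifier
    instruction="Answer currency questions using the get_exchange_rate tool.",
    tools=[get_exchange_rate],
)
# The ADK CLI (`adk run` or `adk web`) can then load this agent for local testing.
```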
-
Datadog Employs LLMs to Assist with Writing Incident Postmortems
Datadog combined structured metadata from its incident management app with Slack messages to create LLM-driven functionality that assists engineers in composing incident postmortems. While working on this solution, the company dealt with the challenges of using LLMs outside of interactive dialog systems and of ensuring that high-quality content was produced.
-
Anthropic's "AI Microscope" Explores the Inner Workings of Large Language Models
Two recent papers from Anthropic attempt to shed light on the processes that take place within a large language model. They explore how to locate interpretable concepts and link them to the computational "circuits" that translate them into language, and how to characterize crucial behaviors of Claude 3.5 Haiku, including hallucinations, planning, and other key traits.
-
Claude for Education: Anthropic’s AI Assistant Goes to University
Anthropic has announced the launch of Claude for Education, a specialized version of its AI assistant, Claude, developed specifically for colleges and universities. The initiative aims to support students, faculty, and administrators with secure and responsible AI integration across academics and campus operations.
-
Microsoft Collaborates with Anthropic to Launch C# SDK for MCP Integration
Microsoft has partnered with Anthropic to develop an official C# SDK for the Model Context Protocol (MCP), an open protocol designed to connect large language models (LLMs) with external tools and data sources. The SDK is open-source and available under the modelcontextprotocol GitHub organization.
-
AMD’s Gaia Framework Brings Local LLM Inference to Consumer Hardware
AMD has released Gaia, an open-source project allowing developers to run large language models (LLMs) locally on Windows machines with AMD hardware acceleration. The framework supports retrieval-augmented generation (RAG) and includes tools for indexing local data sources. Gaia is designed to offer an alternative to LLMs hosted by cloud service providers (CSPs).
-
Meta AI Releases Llama 4: Early Impressions and Community Feedback
Meta has officially released the first models in its new Llama 4 family—Scout and Maverick—marking a step forward in its open-weight large language model ecosystem. Designed with a native multimodal architecture and a mixture-of-experts (MoE) framework, these models aim to support a broader range of applications, from image understanding to long-context reasoning.