Large language models Content on InfoQ
-
AWS Introduces Open Source Model Context Protocol Servers for ECS, EKS, and Serverless
AWS has released open-source Model Context Protocol (MCP) servers on GitHub to support AI-assisted development in Amazon ECS, EKS, and Serverless environments. These servers give developers real-time, context-specific insights for application deployment, troubleshooting, and operations.
-
Introducing Embabel: Advanced AI Agent Development for Java Applications
The Embabel Agent Framework, developed by Spring founder Rod Johnson, is a platform for building AI applications on the JVM. It integrates structured agent development with Goal-Oriented Action Planning, combining strong typing with dynamic planning to provide reliable, adaptable, and type-safe solutions for enterprise Java applications.
-
Perplexity Introduces Labs for Project-Based AI Workflows
Perplexity has released Labs, a new feature for Pro subscribers designed to support more complex tasks beyond question answering. The update marks a shift from search-based interactions toward structured, multi-step workflows powered by generative AI.
-
Google Brings Gemini Nano to ML Kit with New On-Device GenAI APIs
The new GenAI APIs recently added to ML Kit enable developers to use Gemini Nano for on-device inference in Android apps, supporting features like summarization, proofreading, rewriting, and image description.
-
Anthropic Introduces Claude 4 Family and Claude Code
Anthropic released Claude Opus 4 and Claude Sonnet 4, the newest versions of its Claude series of LLMs. Both models support extended thinking, tool use, and memory improvements, and Claude Opus 4 outperforms other LLMs on coding benchmarks.
-
Amazon Open Sources Strands Agents SDK for Building AI Agents
Amazon has released Strands Agents, an open source SDK that simplifies AI agent development through a model-driven approach. The framework enables developers to build agents by defining prompts and tool lists with minimal code.
-
Google Releases LMEval, an Open-Source Cross-Provider LLM Evaluation Tool
LMEval aims to help AI researchers and developers compare the performance of different large language models. Designed to be accurate, multimodal, and easy to use, LMEval has already been used to evaluate major models in terms of safety and security.
-
Azure AI Search Unveils Agentic Retrieval for Smarter Conversational AI
Microsoft’s Azure AI Search has introduced agentic retrieval, a query engine that improves answer relevance in conversational AI by up to 40%. The system uses conversation history and parallel subquery execution to handle complex knowledge retrieval. Currently in public preview, it offers adaptive search strategies for evolving enterprise needs.
-
Google Releases MedGemma: Open AI Models for Medical Text and Image Analysis
Google has released MedGemma, a pair of open-source generative AI models designed to support medical text and image understanding in healthcare applications. Based on the Gemma 3 architecture, the models are available in two configurations: MedGemma 4B, a multimodal model capable of processing both images and text, and MedGemma 27B, a larger model focused solely on medical text.
-
Mistral Releases Devstral, an Open-Source LLM for Software Engineering Agents
Mistral AI announced the release of Devstral, a new open-source large language model designed to improve the automation of software engineering workflows, particularly in complex coding environments that require reasoning across multiple files and components.
-
Cisco Reveals JARVIS: an AI Assistant for Platform-Engineering Teams
Cisco has introduced JARVIS, an AI-powered assistant for platform-engineering workflows. JARVIS integrates with more than 40 tools and automates complex tasks, reducing project timelines from weeks to hours. It is built on a hybrid AI architecture intended to ensure accuracy and reliability.
-
HashiCorp Releases Terraform MCP Server for AI Integration
HashiCorp has released the Terraform MCP Server, an open-source implementation of the Model Context Protocol designed to improve how large language models interact with infrastructure as code.
-
Prime Intellect Releases INTELLECT-2: a 32B Parameter Model Trained via Decentralized Reinforcement
Prime Intellect has released INTELLECT-2, a 32 billion parameter language model trained using fully asynchronous reinforcement learning across a decentralized network of compute contributors. Unlike traditional centralized model training, INTELLECT-2 is developed on a permissionless infrastructure where rollout generation, policy updates, and training are distributed and loosely coupled.
-
Gemma 3 Supports Vision-Language Understanding, Long Context Handling, and Improved Multilinguality
Google’s generative artificial intelligence (AI) model Gemma 3 supports vision-language understanding, long context handling, and improved multilinguality. In a recent blog post, the Google DeepMind and AI Studio teams discussed the new features in Gemma 3. The model also features KV-cache memory reduction, a new tokenizer, better performance, and higher-resolution vision encoders.
-
OpenAI Launches Codex Software Engineering Agent Preview
OpenAI has launched Codex, a research preview of a cloud-based software engineering agent designed to automate common development tasks such as writing code, debugging, testing, and generating pull requests. Integrated into ChatGPT, Codex runs each assignment in a secure sandbox environment preloaded with the user's codebase and configured to reflect their development setup.