AI, ML & Data Engineering Content on InfoQ
-
Atlassian's 4 Million PostgreSQL Database Migration: When Standard Cloud Strategies Fail
Atlassian recently migrated 4 million Jira databases to Amazon Aurora, aiming to reduce costs and improve the reliability of its Jira Cloud platform. Because of the sheer number of databases involved and the constraints of managed services, traditional cloud migration strategies were not viable, so the team developed a custom tool to orchestrate the process.
-
LM Studio 0.3.17 Adds Model Context Protocol (MCP) Support for Tool-Integrated LLMs
LM Studio has released version 0.3.17, introducing support for the Model Context Protocol (MCP), a step forward in enabling language models to access external tools and data sources. Originally developed by Anthropic, MCP defines a standardized interface for connecting LLMs to services such as GitHub, Notion, or Stripe, enabling more powerful, contextual reasoning.
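LM Studio picks up MCP servers from a JSON configuration file. As an illustrative sketch only (the exact file location, schema details, and the GitHub server package are assumptions, not taken from the release notes), an entry wiring up a GitHub tool server might look like:

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "<token>" }
    }
  }
}
```

Once a server is registered this way, the host application launches it as a subprocess and exposes its tools to the model, which can then issue tool calls during a chat session.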
-
Gemma 3n Introduces Novel Techniques for Enhanced Mobile AI Inference
Launched in early preview last May, Gemma 3n is now officially available. It targets mobile-first, on-device AI applications, using new techniques designed to increase efficiency and improve performance, such as per-layer embeddings and transformer nesting.
-
Navigating Complexity, from AI Strategy to Resilient Architecture: InfoQ Dev Summit Munich 2025
Tired of conferences that don't address your real challenges? The InfoQ Dev Summit Munich 2025 schedule is different. It's packed with sessions on the topics that keep us up at night: responsible AI adoption, leadership friction, and EU data sovereignty.
-
Google Launches Gemini CLI: Open-Source Terminal AI Agent for Developers
Google has released Gemini CLI, a new open-source AI command-line interface that brings the full capabilities of its Gemini 2.5 Pro model directly into developers’ terminals. Designed for flexibility, transparency, and developer-first workflows, Gemini CLI provides high-performance, natural language AI assistance through a lightweight, locally accessible interface.
-
DevSummit Boston: Key Lessons from Shipping AI Products beyond the Hype
Phil Calçado, CEO of Outropy, shared key insights at the InfoQ Dev Summit on scaling generative AI products. He highlighted the need for effective workflows and agents in AI development, advocating for iterative approaches that leverage proven software engineering principles. His insights promise to guide teams in building resilient AI systems without reinventing the wheel.
-
Google's Agent2Agent Protocol Enters the Linux Foundation
Recently open-sourced by Google, the Agent2Agent protocol is now part of the Linux Foundation, along with its accompanying SDKs and developer tools.
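Under A2A, each agent advertises its capabilities in a JSON "Agent Card" served at a well-known URL, which other agents fetch to discover skills and endpoints. A minimal sketch follows; the field names reflect the protocol's published card format, but this particular agent, URL, and skill are hypothetical:

```json
{
  "name": "invoice-agent",
  "description": "Extracts structured data from invoices",
  "url": "https://agents.example.com/a2a",
  "version": "1.0.0",
  "capabilities": { "streaming": true },
  "skills": [
    { "id": "extract-totals", "name": "Extract invoice totals" }
  ]
}
```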
-
Apple's Illusion of Thinking Paper Explores Limits of Large Reasoning Models
Apple Machine Learning Research published a paper titled "The Illusion of Thinking," which investigates the abilities of Large Reasoning Models (LRMs) on a set of puzzles. The researchers found that, as puzzle complexity increases, LRMs hit a "collapse" threshold beyond which the models reduce their reasoning effort, suggesting a limit to how well they scale.
-
Google DeepMind Unveils AlphaGenome: a Unified AI Model for High-Resolution Genome Interpretation
Google DeepMind has announced the release of AlphaGenome, a new AI model designed to predict how genetic variants affect gene regulation across the entire genome. It represents a significant advancement in computational genomics by integrating long-range sequence context with base-pair resolution in a single, general-purpose architecture.
-
Anthropic Adds App-Building Capabilities to Claude Artifacts
Anthropic has upgraded Claude with new app-building capabilities, allowing users to create, host, and share AI applications directly from text prompts. This functionality, known as Artifacts, enables users to build functional tools like data analyzers, flashcard generators, or study aids by simply describing their ideas.
-
Nvidia's GB200 NVL72 Supercomputer Achieves 2.7× Faster Inference on DeepSeek V3
In collaboration with NVIDIA, researchers from SGLang have published early benchmarks of the GB200 (Grace Blackwell) NVL72 system, showing up to a 2.7× increase in LLM inference throughput compared to the H100 on the DeepSeek-V3 671B model.
-
OWASP Launches AI Testing Guide to Address Security, Bias, and Risk in AI Systems
The OWASP Foundation has officially introduced the AI Testing Guide (AITG), a new open-source initiative aimed at helping organizations systematically test and secure artificial intelligence systems. The guide serves as a foundational resource for developers, testers, risk officers, and cybersecurity professionals, promoting best practices in AI system security.
-
Microsoft Introduces Mu: a Lightweight On-Device Language Model for Windows Settings
Microsoft has introduced Mu, a new small-scale language model designed to run locally on Neural Processing Units (NPUs), starting with its deployment in the Windows Settings application for Copilot+ PCs. The model allows users to control system settings using natural language, aiming to reduce reliance on cloud-based processing.
-
MiniMax Releases M1: a 456B Hybrid-Attention Model for Long-Context Reasoning and Software Tasks
MiniMax has introduced MiniMax-M1, a new open-weight reasoning model built to handle extended contexts and complex problem-solving with high efficiency. Based on the earlier MiniMax-Text-01, M1 combines a hybrid Mixture-of-Experts (MoE) architecture with a novel “lightning attention” mechanism.
-
GPULlama3.java Brings GPU-Accelerated LLM Inference to Pure Java
The University of Manchester's Beehive Lab has released GPULlama3.java, the first Java-native implementation of Llama 3 with automatic GPU acceleration. The project leverages TornadoVM to enable GPU-accelerated large language model inference without requiring developers to write CUDA or native code, potentially transforming how Java developers approach AI applications in enterprise environments.