AI, ML & Data Engineering Content on InfoQ
-
Navigating Complexity, from AI Strategy to Resilient Architecture: InfoQ Dev Summit Munich 2025
Tired of conferences that don't address your real challenges? The InfoQ Dev Summit Munich 2025 schedule is different. It's packed with sessions on the topics that keep us up at night: responsible AI adoption, leadership friction, and EU data sovereignty.
-
Google Launches Gemini CLI: Open-Source Terminal AI Agent for Developers
Google has released Gemini CLI, a new open-source AI command-line interface that brings the full capabilities of its Gemini 2.5 Pro model directly into developers’ terminals. Designed for flexibility, transparency, and developer-first workflows, Gemini CLI provides high-performance, natural language AI assistance through a lightweight, locally accessible interface.
-
DevSummit Boston: Key Lessons from Shipping AI Products beyond the Hype
Phil Calçado, CEO of Outropy, shared lessons at the InfoQ Dev Summit on scaling generative AI products. He highlighted the need for effective workflows and agents in AI development and advocated iterative approaches that build on proven software engineering principles, so that teams can build resilient AI systems without reinventing the wheel.
-
Google's Agent2Agent Protocol Enters the Linux Foundation
Recently open-sourced by Google, the Agent2Agent protocol is now part of the Linux Foundation, along with its accompanying SDKs and developer tools.
-
Apple's Illusion of Thinking Paper Explores Limits of Large Reasoning Models
Apple Machine Learning Research published a paper titled "The Illusion of Thinking," which investigates the abilities of Large Reasoning Models (LRMs) on a set of puzzles. The researchers found that, as puzzle complexity increases, LRMs reach a "collapse" threshold at which the models reduce their reasoning effort, indicating a limit to the models' scalability.
-
Google DeepMind Unveils AlphaGenome: a Unified AI Model for High-Resolution Genome Interpretation
Google DeepMind has announced the release of AlphaGenome, a new AI model designed to predict how genetic variants affect gene regulation across the entire genome. It represents a significant advancement in computational genomics by integrating long-range sequence context with base-pair resolution in a single, general-purpose architecture.
-
Anthropic Upgrades App-Building Capabilities to Claude Artifacts
Anthropic has upgraded Claude with new app-building capabilities, allowing users to create, host, and share AI applications directly from text prompts. This functionality, known as Artifacts, enables users to build functional tools like data analyzers, flashcard generators, or study aids by simply describing their ideas.
-
NVIDIA's GB200 NVL72 Supercomputer Achieves 2.7× Faster Inference on DeepSeek V3
In collaboration with NVIDIA, researchers from SGLang have published early benchmarks of the GB200 (Grace Blackwell) NVL72 system, showing up to a 2.7× increase in LLM inference throughput compared to the H100 on the DeepSeek-V3 671B model.
-
OWASP Launches AI Testing Guide to Address Security, Bias, and Risk in AI Systems
The OWASP Foundation has officially introduced the AI Testing Guide (AITG), a new open-source initiative aimed at helping organizations systematically test and secure artificial intelligence systems. The guide is intended as a foundational resource for developers, testers, risk officers, and cybersecurity professionals, promoting best practices in AI system security.
-
Microsoft Introduces Mu: a Lightweight On-Device Language Model for Windows Settings
Microsoft has introduced Mu, a new small-scale language model designed to run locally on Neural Processing Units (NPUs), starting with its deployment in the Windows Settings application for Copilot+ PCs. The model allows users to control system settings using natural language, aiming to reduce reliance on cloud-based processing.
-
MiniMax Releases M1: a 456B Hybrid-Attention Model for Long-Context Reasoning and Software Tasks
MiniMax has introduced MiniMax-M1, a new open-weight reasoning model built to handle extended contexts and complex problem-solving with high efficiency. Built on top of the earlier MiniMax-Text-01, M1 features a hybrid Mixture-of-Experts (MoE) architecture and a novel “lightning attention” mechanism.
-
GPULlama3.java Brings GPU-Accelerated LLM Inference to Pure Java
The University of Manchester's Beehive Lab has released GPULlama3.java, marking the first Java-native implementation of Llama3 with automatic GPU acceleration. This project leverages TornadoVM to enable GPU-accelerated large language model inference without requiring developers to write CUDA or native code, potentially transforming how Java developers approach AI apps in enterprise environments.
-
Midjourney Debuts V1 AI Video Model
Midjourney has launched V1, its first video generation model, a web-based tool that allows users to animate still images into 5-second video clips.
-
Claude Code Gains Support for Remote MCP Servers over Streamable HTTP
Anthropic has recently introduced support for connecting to remote MCP servers in Claude Code, allowing developers to integrate external tools and resources without manual local server setup.
-
Phoenix.new Launches Remote Agent-Powered Dev Environments for Elixir
Chris McCord has released Phoenix.new, a browser-native agent platform that gives large language models full-stack control over Elixir development environments. Designed to work entirely in the cloud, Phoenix.new spins up real Phoenix apps inside ephemeral VMs, allowing LLM agents to build, test, and iterate in real time.