AI, ML & Data Engineering Content on InfoQ
-
NVIDIA's GB200 NVL72 Supercomputer Achieves 2.7× Faster Inference on DeepSeek V3
In collaboration with NVIDIA, researchers from SGLang have published early benchmarks of the GB200 (Grace Blackwell) NVL72 system, showing up to a 2.7× increase in LLM inference throughput compared to the H100 on the DeepSeek-V3 671B model.
-
OWASP Launches AI Testing Guide to Address Security, Bias, and Risk in AI Systems
The OWASP Foundation has introduced the AI Testing Guide (AITG), a new open-source initiative to help organizations systematically test and secure artificial intelligence systems. The guide is intended as a foundational resource for developers, testers, risk officers, and cybersecurity professionals, and promotes best practices in AI system security.
-
Microsoft Introduces Mu: a Lightweight On-Device Language Model for Windows Settings
Microsoft has introduced Mu, a new small-scale language model designed to run locally on Neural Processing Units (NPUs), starting with its deployment in the Windows Settings application for Copilot+ PCs. The model allows users to control system settings using natural language, aiming to reduce reliance on cloud-based processing.
-
MiniMax Releases M1: a 456B Hybrid-Attention Model for Long-Context Reasoning and Software Tasks
MiniMax has introduced MiniMax-M1, a new open-weight reasoning model designed to handle extended contexts and complex problem-solving efficiently. Built on top of the earlier MiniMax-Text-01, M1 features a hybrid Mixture-of-Experts (MoE) architecture and a novel “lightning attention” mechanism.
-
GPULlama3.java Brings GPU-Accelerated LLM Inference to Pure Java
The University of Manchester's Beehive Lab has released GPULlama3.java, the first Java-native implementation of Llama3 with automatic GPU acceleration. The project uses TornadoVM to enable GPU-accelerated large language model inference without requiring developers to write CUDA or native code, which could change how Java developers approach AI applications in enterprise environments.
-
Midjourney Debuts V1 AI Video Model
Midjourney has launched V1, its first video generation model, a web-based tool that allows users to animate still images into 5-second video clips.
-
Claude Code Gains Support for Remote MCP Servers over Streamable HTTP
Anthropic has recently introduced support for connecting to remote MCP servers in Claude Code, allowing developers to integrate external tools and resources without manual local server setup.
-
Phoenix.new Launches Remote Agent-Powered Dev Environments for Elixir
Chris McCord has released Phoenix.new, a browser-native agent platform that gives large language models full-stack control over Elixir development environments. Designed to work entirely in the cloud, Phoenix.new spins up real Phoenix apps inside ephemeral VMs, allowing LLM agents to build, test, and iterate in real time.
-
AlphaWrite: Improving AI Narratives through Evolution
AlphaWrite is a new framework designed to bring structure and measurable improvement to creative writing. Developed by Toby Simonds, it employs an evolutionary process to iteratively boost storytelling quality during inference.
-
Yearly MariaDB LTS Release Integrates Vector Search
MariaDB has recently released MariaDB Community Server 11.8 as generally available, its yearly long-term support (LTS) release for 2025. The new release introduces integrated vector search capabilities for AI-driven and similarity search applications, enhanced JSON functionality, and temporal tables for data history and auditing.
-
OpenAI Launches o3-pro Model Focused on Reliability, Amid Mixed User Feedback
OpenAI launched o3-pro, a new version of its most advanced model aimed at delivering more reliable, thoughtful responses across complex tasks. Now available to Pro and Team users in ChatGPT and via API, o3-pro replaces the earlier o1-pro.
-
Agentica Project's Open Source DeepCoder Model Outperforms OpenAI's o1 on Coding Benchmarks
The Agentica Project and Together AI have released DeepCoder-14B-Preview, an open source AI coding model based on DeepSeek-R1-Distill-Qwen-14B. The model achieves a 60.6% pass rate on LiveCodeBench, outperforming OpenAI's o1 model and matching the performance of o3-mini.
-
Mistral AI Releases Magistral, Its First Reasoning-Focused Language Model
Mistral AI has released Magistral, a new model family built for transparent, multi-step reasoning. Available in open and enterprise versions, it supports structured logic, multilingual output, and traceable decision-making.
-
HTAP: the Rise and Fall of Unified Database Systems?
A recent article by Zhou Sun has sparked a debate in the data community about the future of HTAP systems. Hybrid transactional/analytical processing (HTAP) was meant to help integrate historical and online data at scale, supporting more flexible query methods and reducing business complexity.
-
Meta Introduces V-JEPA 2, a Video-Based World Model for Physical Reasoning
Meta has introduced V-JEPA 2, a new video-based world model designed to improve machine understanding, prediction, and planning in physical environments. The model extends the Joint Embedding Predictive Architecture (JEPA) framework and is trained to predict outcomes in embedding space using video data.