AI, ML & Data Engineering Content on InfoQ
-
OpenAI Introduces New Speech Models for Transcription and Voice Generation
OpenAI has introduced new speech-to-text and text-to-speech models in its API, focusing on improving transcription accuracy and offering more control over AI-generated voices. These updates aim to enhance automated speech applications, making them more adaptable to different environments and use cases.
-
Mistral AI Launches API for LLM-Based OCR of Multimodal Documents
Now available on Mistral AI's la Plateforme SaaS, Mistral OCR aims to provide an OCR solution for digitizing complex documents that interleave text and images, tables, mathematical expressions, and advanced layouts. This makes it particularly suitable for digitizing scientific research, historical documents and artifacts, user manuals, and more, the company says.
-
Google DeepMind Launches TxGemma: Advancing AI-Driven Drug Discovery and Development
TxGemma is designed to improve the efficiency of drug discovery and clinical trial predictions. Built on the Gemma model family, it aims to streamline the drug development process and accelerate the discovery of new treatments.
-
Google Introduces Gemini 2.5 Pro with Improved Reasoning and Coding Capabilities
Google has released Gemini 2.5 Pro, an updated AI model focused on enhanced reasoning, code generation, and multimodal processing. The model is ranked first on LMArena, a benchmark for human preference in AI responses, and achieves strong results in math, science, and logic-based tasks. It also features a 1 million token context window, with plans to expand to 2 million.
-
How Airbnb Used LLMs to Accelerate Test Migration
Thanks to the right mix of workflow automation and large language models, Airbnb significantly accelerated the migration of its codebase to React Testing Library (RTL), converting nearly 3.5K React test files originally written with Enzyme.
-
Nvidia Unveils AI, GPU, and Quantum Computing Innovations at GTC 2025
Nvidia presented several new technologies at its GTC 2025 event, covering advancements in GPUs, AI, robotics, and quantum computing.
-
Fauna Shutting Down: Is the Future Open Source?
The team behind the distributed serverless database Fauna has recently announced plans to shut down the service by the end of May. While the managed database will be terminated soon and all customers will have to migrate to other platforms, Fauna is committing to releasing an open source version of the core database technology alongside the existing drivers and CLI tooling.
-
InfoQ Dev Summit Boston 2025: Real-World AI, Platform Engineering & DevEx Strategies
InfoQ Dev Summit Boston 2025 (June 9-10, 2025) is where senior software developers, architects, and engineering leaders come together to tackle today’s most pressing challenges in AI adoption, platform engineering, and developer experience (DevEx). This event delivers insights from practitioners building and scaling modern software systems.
-
Roblox Releases Cube 3D, an AI Open-Source Model for 3D Model Generation
Roblox has introduced Cube 3D, a generative AI system designed for creating 3D and 4D objects and environments.
-
Google Cloud Introduces HDD Tier for Spanner Database, Cutting Cold Storage Costs by 80%
Google has recently introduced tiered storage for Spanner, its distributed SQL database on Google Cloud. This tiered storage is based on a new HDD storage option that is 80% cheaper than the existing SSD option, allowing for cost optimization of older data while minimizing the overhead associated with traditional data migration.
-
Dapr Agents: Scalable AI Workflows with LLMs, Kubernetes & Multi-Agent Coordination
Dapr Agents is a framework for creating scalable AI agents using large language models (LLMs). With robust workflows, multi-agent coordination, and a cloud-neutral architecture, it is meant to let enterprises deploy thousands of resilient agents. Built on Dapr's existing infrastructure, Dapr Agents aims to provide reliability and observability in AI-driven applications.
-
Gemini Code Assist Now Grants Generous Free-Usage Limits to Everyone
Born as an enterprise-focused AI-based code generation tool, Gemini Code Assist now provides a free tier to individual developers with a limit of 6,000 code completions and 240 chat requests daily.
-
Google DeepMind Unveils Gemini Robotics
Google DeepMind has introduced Gemini Robotics, an advanced AI model designed to enhance robotics by integrating vision, language, and action. This innovation, based on the Gemini 2.0 framework, aims to make robots smarter and more capable, particularly in real-world settings.
-
Google Launches Gemma 3 1B for Mobile and Web Apps
Requiring a "mere" 529MB, Gemma 3 1B is a small language model (SLM) specifically meant for distribution across mobile and Web apps, where models must download quickly and be responsive to keep user engagement high.
-
OpenAI Launches New API, SDK, and Tools to Develop Custom Agents
OpenAI has announced the new Responses API, the Agents SDK, and observability tools to address the challenges of creating production-ready agents, such as building custom orchestration and handling prompt iteration across complex, multi-step tasks.