AI, ML & Data Engineering Content on InfoQ
-
From Architecture to Deployment: How AI-Powered Toolkits Are Unifying Developer Workflows
Developer tooling is undergoing a shift as AI moves beyond code completion to unify multiple stages of the software development workflow.
-
OpenAI Launches BrowseComp to Benchmark AI Agents' Web Search and Deep Research Skills
OpenAI has released BrowseComp, a new benchmark designed to test AI agents' ability to locate difficult-to-find information on the web. The benchmark contains 1,266 challenging problems that require agents to persistently navigate through multiple websites to retrieve entangled information.
-
Cloudflare Upgrades D1 Database with Global Read Replication
During the recent Developer Week 2025, Cloudflare announced the beta of global read replication for its serverless SQL database D1, providing a globally distributed option without sacrificing consistency. With automatically provisioned replicas in every region, applications can now serve read queries faster while maintaining strong sequential consistency across requests.
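The guarantee is scoped to a session: later queries in the same session never observe an older database state than an earlier one did. Below is a minimal Worker sketch, assuming the Sessions-style API from the announcement; the DB binding name and the "first-unconstrained" constraint string are illustrative.

```typescript
// Minimal Cloudflare Worker sketch (assumes a D1 binding named DB and the
// Sessions-style API described in the announcement; names are illustrative).
export interface Env {
  DB: D1Database; // types from @cloudflare/workers-types
}

export default {
  async fetch(_req: Request, env: Env): Promise<Response> {
    // Start a session; the first query may be served by the nearest replica,
    // and later queries observe at least that point in time.
    const session = env.DB.withSession("first-unconstrained");

    // Reads can be answered by a nearby replica...
    const orders = await session
      .prepare("SELECT id, status FROM orders ORDER BY created_at DESC LIMIT 10")
      .all();

    // ...while subsequent queries in the same session never go "back in time",
    // which is the sequential-consistency guarantee in practice.
    const count = await session.prepare("SELECT COUNT(*) AS n FROM orders").first();

    return Response.json({ orders: orders.results, count });
  },
};
```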
-
Google Unveils Ironwood TPU for AI Inference
Google's Ironwood TPU, its most advanced custom AI accelerator to date, is built for what the company calls the "age of inference." With up to 9,216 liquid-cooled chips, it delivers 42.5 exaflops of compute. Engineered for high-efficiency, low-latency inference workloads, Ironwood also draws on AlphaChip, Google's AI-assisted approach to chip design.
-
Cloudflare AutoRAG Streamlines Retrieval-Augmented Generation
Cloudflare has launched a managed service for using retrieval-augmented generation in LLM-based systems. Now in beta, Cloudflare AutoRAG aims to make it easier for developers to build pipelines that integrate rich context data into LLMs.
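For a sense of the developer experience, here is a sketch of querying an AutoRAG instance from a Worker, assuming the aiSearch() binding route described in the beta documentation; the AI binding and the "docs-rag" instance name are illustrative.

```typescript
// Sketch of querying an AutoRAG instance from a Worker. Assumes the aiSearch()
// binding route from the beta docs; binding and instance names are illustrative.
export interface Env {
  AI: any; // Workers AI binding; typed loosely since the beta types may change
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { query } = (await request.json()) as { query: string };

    // AutoRAG retrieves relevant chunks from the indexed data source and
    // passes them to the configured LLM before returning an answer.
    const answer = await env.AI.autorag("docs-rag").aiSearch({ query });

    return Response.json(answer);
  },
};
```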
-
Scaling Financial Operations: Uber’s GenAI-Powered Approach to Invoice Automation
Uber recently described a GenAI-powered invoice processing system that halved manual effort, cut handling time by 70%, and delivered 25–30% cost savings. By leveraging GPT-4 and a modular platform called TextSense, Uber improved data accuracy by 90%, enabling globally scalable, efficient, and highly automated financial operations.
-
Docker Bridges Agents and Containers with New MCP Catalog and Toolkit
Docker has announced two new AI-focused tools—the Docker MCP Catalog and the Docker MCP Toolkit—to bring container-grade security and developer-friendly workflows to agentic applications, helping build a developer-centric ecosystem for Model Context Protocol (MCP) tools.
-
Google's Gemma 3 QAT Language Models Can Run Locally on Consumer-Grade GPUs
Google released the Gemma 3 QAT family, quantized versions of their open-weight Gemma 3 language models. The models use Quantization-Aware Training (QAT) to maintain high accuracy when the weights are quantized from 16 to 4 bits.
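In effect, QAT lets the model see its own quantization error while it is still training, so the 4-bit weights cost less accuracy at inference time. The toy sketch below shows 4-bit "fake quantization"; the fakeQuantizeInt4 helper is illustrative, not Google's training code.

```typescript
// Conceptual sketch of QAT "fake quantization": weights stay in float for the
// backward pass, but the forward pass sees the values they will have after
// int4 rounding, so the model learns to tolerate the quantization error.
function fakeQuantizeInt4(weights: number[]): number[] {
  const absMax = Math.max(...weights.map(Math.abs)) || 1;
  const scale = absMax / 7; // symmetric signed int4 range, roughly [-7, 7]
  return weights.map((w) => {
    const q = Math.max(-8, Math.min(7, Math.round(w / scale))); // quantize
    return q * scale; // dequantize back to float for the forward pass
  });
}

// Example: the round-trip introduces the same error inference will see.
console.log(fakeQuantizeInt4([0.12, -0.53, 0.97, -0.04]));
```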
-
Google DeepMind Shares Approach to AGI Safety and Security
Google DeepMind has released a new paper outlining its approach to safety and security in the development of artificial general intelligence (AGI). AGI refers to AI systems that are as capable as humans at most cognitive tasks.
-
Docker Desktop 4.40 Introduces Model Runner to Run LLMs Locally, Expanding its AI Capabilities
Docker Desktop 4.40, released on March 31, 2025, introduces a suite of features aimed at enhancing AI development workflows and strengthening enterprise compliance capabilities. The headline addition is Model Runner, which lets developers pull and run LLMs locally on their own machines.
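Model Runner serves the local model behind an OpenAI-compatible API, so existing client code can point at it. The rough sketch below assumes a host TCP endpoint is enabled; the base URL and model tag are assumptions to check against Docker's documentation.

```typescript
// Illustrative client only: a plain chat-completions request against Model
// Runner's OpenAI-compatible endpoint. URL and model tag are assumptions.
const BASE_URL = "http://localhost:12434/engines/v1"; // assumed host TCP endpoint

async function askLocalModel(prompt: string): Promise<string> {
  const res = await fetch(`${BASE_URL}/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "ai/llama3.2", // example model tag pulled via Model Runner
      messages: [{ role: "user", content: prompt }],
    }),
  });
  const data: any = await res.json();
  return data.choices[0].message.content;
}

askLocalModel("Summarize what a Dockerfile does.").then(console.log);
```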
-
PayPal's New Agent Toolkit Connects AI Frameworks with Payment APIs through MCP
PayPal has released its Agent Toolkit, designed to help developers integrate PayPal's API suite with AI frameworks through the Model Context Protocol (MCP). The toolkit provides access to APIs for payments, invoices, disputes, shipment tracking, catalog management, subscriptions, and analytics capabilities.
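Because the toolkit speaks MCP, any MCP client can discover and invoke its tools. The sketch below uses the reference TypeScript MCP SDK; the server command, package name, and environment variable are placeholders rather than values confirmed from PayPal's docs.

```typescript
// Generic MCP client sketch: spawn a PayPal MCP server over stdio and list the
// payment tools it exposes. Command, args, and env are placeholders.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

async function main() {
  const transport = new StdioClientTransport({
    command: "npx",
    args: ["-y", "@paypal/mcp", "--tools=all"], // placeholder package and flags
    env: { PAYPAL_ACCESS_TOKEN: process.env.PAYPAL_ACCESS_TOKEN ?? "" }, // placeholder auth
  });

  const client = new Client({ name: "demo-client", version: "1.0.0" });
  await client.connect(transport);

  // Tools map to PayPal capabilities such as invoices, payments, and disputes.
  const { tools } = await client.listTools();
  console.log(tools.map((t) => t.name));

  await client.close();
}

main().catch(console.error);
```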
-
AWS Promotes Responsible AI in the Well-Architected Generative AI Lens
AWS announced the availability of the new Well-Architected Generative AI Lens, focused on providing best practices for designing and operating generative AI workloads. The lens is aimed at organizations delivering robust and cost-effective generative AI solutions on AWS. The document offers cloud-agnostic best practices, implementation guidance and links to additional resources.
-
DeepMind Researchers Propose Defense against LLM Prompt Injection
To prevent prompt injection attacks when working with untrusted sources, Google DeepMind researchers have proposed CaMeL, a defense layer around LLMs that blocks malicious inputs by extracting the control and data flows from the query. According to their results, CaMeL can neutralize 67% of attacks in the AgentDojo security benchmark.
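The core idea is that untrusted content can never alter the program the agent executes, only flow through it under policy checks. The heavily simplified illustration below is not DeepMind's implementation; the tool names, taint labels, and policy are invented for the example.

```typescript
// Simplified illustration of the control-flow / data-flow separation: the plan
// comes only from the trusted user query, and untrusted data is tracked so it
// cannot reach sensitive actions or add new steps.
type Capability = "read_email" | "send_email";

interface TaintedValue {
  value: string;
  source: "user" | "untrusted"; // provenance of the data
}

interface Step {
  tool: Capability;
  arg: TaintedValue;
}

// Policy check before every tool call: data from an untrusted source may not
// flow into a sensitive action such as send_email.
function allowed(step: Step): boolean {
  return !(step.tool === "send_email" && step.arg.source === "untrusted");
}

function execute(plan: Step[]): void {
  for (const step of plan) {
    if (!allowed(step)) {
      console.log(`Blocked: ${step.tool} with ${step.arg.source} data`);
      continue;
    }
    console.log(`Running ${step.tool}(${step.arg.value})`);
  }
}

// The plan is derived from the user's query; the injected instruction arrives
// as untrusted data at runtime and cannot change the plan, only get blocked.
execute([
  { tool: "read_email", arg: { value: "inbox/latest", source: "user" } },
  { tool: "send_email", arg: { value: "Ignore previous instructions...", source: "untrusted" } },
]);
```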
-
Google Cloud Announces Firestore with MongoDB Compatibility
During the recent Google Cloud Next 2025, the cloud provider announced the preview of Firestore with MongoDB compatibility. This new option provides the MongoDB API and query language to store and query semi-structured JSON data in Google Cloud’s real-time document database.
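In practice this means existing MongoDB drivers and query code can be pointed at Firestore. A small sketch with the standard Node.js driver follows; the connection string is a placeholder that would come from the Google Cloud console in a real setup.

```typescript
// Sketch using the standard MongoDB Node.js driver against a Firestore
// database created with MongoDB compatibility. The URI is a placeholder.
import { MongoClient } from "mongodb";

const uri = "mongodb://USER:PASSWORD@HOST:PORT/?tls=true"; // placeholder connection string

async function main() {
  const client = new MongoClient(uri);
  await client.connect();

  const db = client.db("inventory");
  const products = db.collection("products");

  // Standard MongoDB query language works against the Firestore backend.
  await products.insertOne({ sku: "A-100", qty: 3 });
  const lowStock = await products.find({ qty: { $lt: 5 } }).toArray();
  console.log(lowStock);

  await client.close();
}

main().catch(console.error);
```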
-
Microsoft Native 1-Bit LLM Could Bring Efficient GenAI to Everyday CPUs
In a recent paper, Microsoft researchers described BitNet b1.58 2B4T, the first LLM to be natively trained using "1-bit" (technically, 1-trit) weights, rather than being quantized from a model trained with floating-point weights. According to Microsoft, the model delivers performance comparable to full-precision LLMs of similar size at a fraction of the computational cost and with far lower hardware requirements.
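The "1-trit" framing comes from each weight taking one of three values, which carries log2(3) ≈ 1.58 bits of information. Below is a small sketch of the absmean ternary quantization described in the BitNet papers; the ternarize helper is illustrative, not Microsoft's code.

```typescript
// Sketch of "absmean" ternary quantization: each weight is mapped to
// {-1, 0, +1} using the mean absolute value of the weights as the scale.
function ternarize(weights: number[]): { quantized: number[]; scale: number } {
  const scale =
    weights.reduce((sum, w) => sum + Math.abs(w), 0) / weights.length || 1;
  const quantized = weights.map((w) =>
    Math.max(-1, Math.min(1, Math.round(w / scale)))
  );
  return { quantized, scale }; // approximate original weight as quantized[i] * scale
}

// Example: small values collapse to 0, larger ones to +/-1.
console.log(ternarize([0.8, -0.05, -1.2, 0.3]));
```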