AI, ML & Data Engineering Content on InfoQ

News

AI, ML & Data Engineering

From Architecture to Deployment: How AI-Powered Toolkits Are Unifying Developer Workflows

Developer tooling is undergoing a shift as AI moves beyond code completion to unify multiple stages of the software development workflow.

Matt Foster
on May 04, 2025
AI, ML & Data Engineering

OpenAI Launches BrowseComp to Benchmark AI Agents' Web Search and Deep Research Skills

OpenAI has released BrowseComp, a new benchmark designed to test AI agents' ability to locate difficult-to-find information on the web. The benchmark contains 1,266 challenging problems that require agents to persistently navigate through multiple websites to retrieve entangled information.

Vinod Goje
on May 04, 2025
Cloud

Cloudflare Upgrades D1 Database with Global Read Replication

During the recent Developer Week 2025, Cloudflare announced the beta of global read replication for its serverless SQL database D1, providing a globally distributed option without sacrificing consistency. With automatically provisioned replicas in every region, applications can now serve read queries faster while maintaining strong sequential consistency across requests.

Renato Losio
on May 03, 2025
Cloud

Google Unveils Ironwood TPU for AI Inference

Google's Ironwood TPU, its most advanced custom AI accelerator, is built for what the company calls the "age of inference." Scaling up to 9,216 liquid-cooled chips, it delivers 42.5 exaflops and is engineered for high-efficiency, low-latency AI workloads. Google also leveraged AlphaChip in the chip's design.

Steef-Jan Wiggers
on May 02, 2025
Cloud

Cloudflare AutoRAG Streamlines Retrieval-Augmented Generation

Cloudflare has launched a managed service for using retrieval-augmented generation in LLM-based systems. Now in beta, Cloudflare AutoRAG aims to make it easier for developers to build pipelines that integrate rich context data into LLMs.

Sergio De Simone
on Apr 30, 2025
Architecture & Design

Scaling Financial Operations: Uber’s GenAI-Powered Approach to Invoice Automation

Uber recently described a GenAI-powered invoice processing system that halved manual effort, cut handling time by 70%, and delivered 25–30% cost savings. By leveraging GPT-4 and a modular platform called TextSense, Uber improved data accuracy by 90%, enabling globally scalable, efficient, and highly automated financial operations.

Eran Stiller
on Apr 30, 2025
AI, ML & Data Engineering

Docker Bridges Agents and Containers with New MCP Catalog and Toolkit

Docker has announced two new AI-focused tools—the Docker MCP Catalog and the Docker MCP Toolkit—to bring container-grade security and developer-friendly workflows to agentic applications, helping build a developer-centric ecosystem for Model Context Protocol (MCP) tools.

Sergio De Simone
on Apr 29, 2025
AI, ML & Data Engineering

Google's Gemma 3 QAT Language Models Can Run Locally on Consumer-Grade GPUs

Google released the Gemma 3 QAT family, quantized versions of its open-weight Gemma 3 language models. The models use Quantization-Aware Training (QAT) to maintain high accuracy when the weights are quantized from 16 to 4 bits.

Anthony Alford
on Apr 29, 2025
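
To make the 16-to-4-bit figure concrete, the following is a minimal sketch of what the reduction means for weight storage, plus a naive symmetric 4-bit round trip. It is not Google's QAT procedure, which adjusts the model during training; the 1-billion-parameter count, the function names, and the per-tensor scaling scheme are all assumptions chosen for illustration.

```python
def weight_bytes(num_params: int, bits_per_weight: int) -> int:
    """Bytes needed to store the weights alone (no activations, no overhead)."""
    return num_params * bits_per_weight // 8


def quantize_int4(weights):
    """Naive symmetric quantization to signed 4-bit integers in [-8, 7].

    Hypothetical helper: one scale for the whole list, no training involved.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    quantized = [max(-8, min(7, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int4(quantized, scale):
    """Map 4-bit integers back to approximate floating-point weights."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    params = 1_000_000_000  # hypothetical model size, not taken from the article
    print(weight_bytes(params, 16))  # storage at 16 bits per weight
    print(weight_bytes(params, 4))   # storage at 4 bits per weight

    q, scale = quantize_int4([0.5, -1.0, 0.25, 0.0])
    print(q, dequantize_int4(q, scale))
```

The storage drops by a factor of four, while each weight is limited to 16 representable levels. That rounding error is what training with quantization in the loop is meant to absorb.
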
AI, ML & Data Engineering

Google DeepMind Shares Approach to AGI Safety and Security

Google DeepMind has released a new paper outlining its approach to safety and security in the development of artificial general intelligence (AGI). AGI refers to AI systems that are as capable as humans at most cognitive tasks.

Daniel Dominguez
on Apr 29, 2025
DevOps

Docker Desktop 4.40 Introduces Model Runner to Run LLMs Locally, Expanding its AI Capabilities

Docker Desktop 4.40, released on March 31, 2025, introduces a suite of features aimed at enhancing AI development workflows and strengthening enterprise compliance capabilities.

Craig Risi
on Apr 28, 2025
AI, ML & Data Engineering

PayPal's New Agent Toolkit Connects AI Frameworks with Payment APIs through MCP

PayPal has released its Agent Toolkit, designed to help developers integrate PayPal's API suite with AI frameworks through the Model Context Protocol (MCP). The toolkit provides access to APIs for payments, invoices, disputes, shipment tracking, catalog management, subscriptions, and analytics capabilities.

Vinod Goje
on Apr 28, 2025
Architecture & Design

AWS Promotes Responsible AI in the Well-Architected Generative AI Lens

AWS announced the availability of the new Well-Architected Generative AI Lens, focused on providing best practices for designing and operating generative AI workloads. The lens is aimed at organizations delivering robust and cost-effective generative AI solutions on AWS. The document offers cloud-agnostic best practices, implementation guidance and links to additional resources.

Rafal Gancarz
on Apr 27, 2025
AI, ML & Data Engineering

DeepMind Researchers Propose Defense against LLM Prompt Injection

To prevent prompt injection attacks when working with untrusted sources, Google DeepMind researchers have proposed CaMeL, a defense layer around LLMs that blocks malicious inputs by extracting the control and data flows from the query. According to their results, CaMeL can neutralize 67% of attacks in the AgentDojo security benchmark.

Sergio De Simone
on Apr 26, 2025
Cloud

Google Cloud Announces Firestore with MongoDB Compatibility

During the recent Google Cloud Next 2025, the cloud provider announced the preview of Firestore with MongoDB compatibility. This new option provides the MongoDB API and query language to store and query semi-structured JSON data in Google Cloud’s real-time document database.

Renato Losio
on Apr 26, 2025
AI, ML & Data Engineering

Microsoft Native 1-Bit LLM Could Bring Efficient genAI to Everyday CPUs

In a recent paper, Microsoft researchers described BitNet b1.58 2B4T, the first LLM to be natively trained using "1-bit" (technically, 1-trit) weights, rather than being quantized from a model trained with floating point weights. According to Microsoft, the model delivers performance comparable to full-precision LLMs of similar size at a fraction of the computation cost and hardware requirements.

Sergio De Simone
on Apr 23, 2025
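
The "1-trit" remark can be checked with a short calculation: a weight with three possible values carries log2(3), or about 1.58, bits of information, which matches the "b1.58" in the model's name. The sketch below also shows one way to pack five ternary weights into a single byte. The packing scheme and function names are assumptions for illustration and are not taken from Microsoft's paper or its inference code.

```python
import math


def bits_per_weight(num_states: int) -> float:
    """Information content of a weight that can take num_states values."""
    return math.log2(num_states)


def pack_trits(trits):
    """Pack exactly five values from {-1, 0, 1} into one byte.

    Works because 3**5 = 243 fits in the 256 values of a byte.
    Hypothetical scheme, shown only to illustrate the storage cost.
    """
    if len(trits) != 5 or any(t not in (-1, 0, 1) for t in trits):
        raise ValueError("expected five values from {-1, 0, 1}")
    return sum((t + 1) * 3**i for i, t in enumerate(trits))


def unpack_trits(byte_value: int):
    """Inverse of pack_trits: recover the five ternary weights."""
    trits = []
    for _ in range(5):
        trits.append(byte_value % 3 - 1)
        byte_value //= 3
    return trits


if __name__ == "__main__":
    print(round(bits_per_weight(3), 2))
    packed = pack_trits([1, -1, 0, 0, 1])
    print(packed, unpack_trits(packed))
```

Packing five weights per byte costs 1.6 bits per weight, close to the 1.58-bit theoretical minimum.
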
