InfoQ Homepage AI, ML & Data Engineering Content on InfoQ
-
Swiggy Rolls out Hermes V3: from Text-to-SQL to Conversational AI
Swiggy has released Hermes V3, a GenAI-powered text-to-SQL assistant that enables employees to query data in plain English. The Slack-native system combines vector retrieval, conversational memory, agentic orchestration, and explainability to improve SQL accuracy and support multi-turn analytical queries.
-
Amazon S3 Vectors Reaches GA, Introducing "Storage-First" Architecture for RAG
AWS has announced the general availability of Amazon S3 Vectors, increasing per-index capacity forty-fold to 2 billion vectors. By natively integrating vector search into the S3 storage engine, the service introduces a "Storage-First" architecture that decouples compute from storage, reducing total cost of ownership by up to 90% for large-scale RAG workloads.
-
Kubernetes 1.35 Released with In-Place Pod Resize and AI-Optimized Scheduling
The Cloud Native Computing Foundation (CNCF) announced the release of Kubernetes 1.35, named "Timbernetes", emphasizing its focus on mutability and the optimization of high-performance AI/ML workloads.
-
Cloudflare Year in Review: AI Bots Crawl Aggressively, Post-Quantum Encryption Hits 50%, Go Doubles
Cloudflare has recently published the sixth edition of its Radar Year in Review. The results reveal 19% yearly growth in global internet traffic, Googlebot dominance, increasing crawl-to-refer ratios, and broad adoption of post-quantum encryption. Over 20% of automated API requests were made by Go-based clients, almost doubling adoption over the previous year.
-
QCon AI New York 2025: AI Works, PRs Don't: How AI is Breaking the SDLC and What to Do about it
Michael Webster, Principal Engineer at CircleCI, presented “AI Works, Pull Requests Don’t: How AI Is Breaking the SDLC and What to Do about It” at QCon AI New York 2025. Webster discussed the impact of AI on the Software Development Lifecycle (SDLC) and Continuous Integration/Continuous Delivery (CI/CD) processes at CircleCI.
-
Open-Source Agent Sandbox Enables Secure Deployment of AI Agents on Kubernetes
The Agent Sandbox is an open-source Kubernetes controller that provides a declarative API for managing a single, stateful pod with stable identity and persistent storage. It is particularly well suited for creating isolated environments to execute untrusted, LLM-generated code, as well as for running other stateful workloads.
-
Microsoft Foundry Agent Service Simplifies State Management with Long-Term Memory Preview
Microsoft has launched a public preview of a managed long-term memory store for its Foundry Agent Service. The service automates the extraction, consolidation, and retrieval of user context, providing a native "state layer" that prevents intelligence decay in long-running interactions with AI agents.
-
CNCF Launches Certified Kubernetes AI Conformance Program to Standardise Workloads
The CNCF has launched the Certified Kubernetes AI Conformance program to standardise artificial intelligence workloads. By establishing a technical baseline for GPU management, networking, and gang scheduling, the initiative ensures portability across cloud providers. It aims to reduce technical debt and prevent vendor lock-in as enterprises move generative AI models into production.
-
SIMA 2 Uses Gemini and Self-Improvement to Generalize across Unseen 3D and Photorealistic Worlds
Google DeepMind researchers introduced SIMA 2 (Scalable Instructable Multiworld Agent), a generalist agent built on the Gemini foundation model that can understand and act across multiple 3D virtual game environments. The SIMA 2 architecture uses a Gemini Flash-Lite model trained on a mixture of gameplay and Gemini pretraining data.
-
QCon AI NY 2025 - Becoming AI-Native without Losing our Minds to Architectural Amnesia
Tracy Bannon's QCon AI NY 2025 talk revealed how the rise of AI agents risks amplifying common architectural failures. She emphasized the distinctions between bots, assistants, and agents, highlighting the need for governance, clear identity controls, and disciplined decision-making to address “agentic debt.” Bannon called for architects to apply foundational principles amid rapid AI adoption.
-
Cactus v1: Cross-Platform LLM Inference on Mobile with Zero Latency and Full Privacy
Cactus, a Y Combinator-backed startup, enables local AI inference to mobile phones, wearables, and other low-power devices through cross-platform, energy-efficient kernels and a native runtime. It delivers sub-50ms time-to-first-token for on-device inference, eliminates network latency, and defaults to complete privacy.
-
OpenAI and Anthropic Donate AGENTS.md and Model Context Protocol to New Agentic AI Foundation
OpenAI and Anthropic have donated their AGENTS.md and Model Context Protocol projects to the Agentic AI Foundation (AAIF), a new directed fund under the Linux Foundation. Block contributed their agent framework, goose, as another founding project, and several other tech companies have joined as Platinum members.
-
Target Improves Add to Cart Interactions by 11 Percent with Generative AI Recommendations
Target has deployed GRAM, a GenAI-powered accessory recommendation system for the Home category, using large language models to prioritize product attributes and capture aesthetic cohesion. The system helps shoppers find compatible accessories, integrates human-in-the-loop curation, and achieved measurable improvements in engagement and conversion.
-
Toad: a Unified CLI Tool for All Your LLMs That Promises Improved UX from Existing Ones
During his sabbatical, Will McGugan, maker of Rich and Textual, frameworks for making Textual User Interfaces (TUI), put his UI skills to work to build Toad. The newly publicly released tool aims to provide a unified, "beautiful" GUI for multiple coding agents in your terminal, accessible via the same tool via the Agent Communication Protocol (ACP).
-
Neptune Combines AI‑Assisted Infrastructure as Code and Cloud Deployments
Now available in beta, Neptune is a conversational AI agent designed to act like an AI platform engineer, handling the provisioning, wiring, and configuration of the cloud services needed to run a containerized app. Neptune is both language and cloud-agnostic, with support for AWS, GCP, and Azure.