-
NVIDIA Dynamo Addresses Multi-Node LLM Inference Challenges
Serving Large Language Models (LLMs) at scale is complex. Modern LLMs now exceed the memory and compute capacity of a single GPU, or even a single multi-GPU node. As a result, inference workloads for models with 70B+ or 120B+ parameters, or for pipelines with large context windows, require multi-node, distributed GPU deployments.
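The capacity problem above is straightforward arithmetic: model weights alone can exceed a single GPU's memory before any KV cache or activations are counted. A minimal sketch, using illustrative (not vendor-specific) figures:

```python
# Rough memory arithmetic showing why 70B+ models need multi-GPU serving.
# Figures are illustrative, not specific to any vendor or to NVIDIA Dynamo.

def weights_gb(params_b: float, bytes_per_param: int = 2) -> float:
    """Memory for model weights alone (fp16/bf16 = 2 bytes per parameter)."""
    return params_b * 1e9 * bytes_per_param / 1e9

GPU_MEM_GB = 80  # e.g. one high-end data-center GPU

for size in (70, 120):
    w = weights_gb(size)
    gpus = int(-(-w // GPU_MEM_GB))  # ceiling division
    print(f"{size}B params: ~{w:.0f} GB weights -> at least {gpus} GPUs "
          "(before KV cache and activations)")
```

A 70B model in fp16 needs ~140 GB for weights, already two 80 GB GPUs; real deployments need more headroom still, which is what pushes serving across nodes.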
-
Azure API Management Premium v2 GA: Simplified Private Networking and VNet Injection
Microsoft has launched API Management Premium v2, a re-architected tier of its cloud API gateway. The new architecture improves private networking by removing management traffic from customer VNets. With features such as Inbound Private Link, availability zone support, and custom CA certificates, users gain greater networking flexibility, resilience, and potential cost savings.
-
Airbnb Adds Adaptive Traffic Control to Manage Key Value Store Spikes
Airbnb upgraded Mussel, its multi-tenant key-value store, replacing static per-client rate limits with an adaptive, resource-aware traffic control system. The redesign ensures resilience during traffic spikes, protects critical workflows, and maintains fair usage across thousands of tenants while scaling efficiently.
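The shift described above, from static per-client limits to resource-aware throttling, can be sketched as an AIMD (additive-increase, multiplicative-decrease) controller that reacts to backend utilization. The class, thresholds, and step sizes below are hypothetical illustrations, not Airbnb's actual design:

```python
# Minimal sketch of an adaptive, resource-aware rate limiter in the spirit of
# Airbnb's redesign; names and thresholds are hypothetical, not Airbnb's.

class AdaptiveLimiter:
    """AIMD: grow the allowed rate while the backend is healthy,
    cut it multiplicatively when utilization crosses a threshold."""

    def __init__(self, initial_rate: float, floor: float = 10.0):
        self.rate = initial_rate  # requests/sec currently allowed
        self.floor = floor        # never throttle a client below this

    def update(self, utilization: float) -> float:
        if utilization > 0.8:               # backend under pressure: back off
            self.rate = max(self.floor, self.rate * 0.5)
        else:                               # healthy: probe upward additively
            self.rate += 5.0
        return self.rate

limiter = AdaptiveLimiter(initial_rate=100.0)
print(limiter.update(0.5))   # healthy -> 105.0
print(limiter.update(0.95))  # overloaded -> 52.5
```

The floor keeps critical tenants from being starved entirely, which mirrors the stated goal of protecting critical workflows while preserving fairness.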
-
KubeCon NA 2025 - Salesforce’s Approach to Self-Healing Using AIOps and Agentic AI
AIOps and Agentic AI technologies can help in developing solutions to intelligently analyze Kubernetes cluster health, automatically diagnose problems, and orchestrate issue resolutions with minimal human intervention. Vikram Venkataraman and Srikanth Rajan spoke at KubeCon + CloudNativeCon NA 2025 Conference about Salesforce’s approach to self-healing systems using AIOps and AI Agents.
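At its simplest, the "diagnose and orchestrate resolution" loop described above starts from rules that map observed cluster states to remediation actions, with escalation to a human when no rule matches. The states and actions below are generic Kubernetes conditions used for illustration, not Salesforce's actual rules:

```python
# Illustrative rule-based remediation of the kind an AIOps pipeline might
# automate; generic Kubernetes pod states, not Salesforce's actual rules.

REMEDIATIONS = {
    "CrashLoopBackOff": "restart pod and capture logs for diagnosis",
    "ImagePullBackOff": "verify image tag and registry credentials",
    "OOMKilled": "raise memory limit or flag workload for right-sizing",
}

def diagnose(pod_name: str, state: str) -> str:
    """Map a pod's observed state to a remediation, escalating if unknown."""
    action = REMEDIATIONS.get(state, "escalate to on-call human")
    return f"{pod_name}: {state} -> {action}"

print(diagnose("checkout-7f9c", "CrashLoopBackOff"))
print(diagnose("search-2ab1", "NodeNotReady"))
```

Agentic approaches generalize this lookup table: an agent reasons over logs and metrics to choose an action, but the escalate-when-uncertain fallback remains the safety net for minimal human intervention.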
-
Airbnb’s Mussel V2: Next-Gen Key Value Storage to Unify Streaming and Bulk Ingestion
Airbnb’s engineering team re-architected its internal key-value storage system, Mussel, to unify streaming and bulk ingestion while simplifying operations. The redesign achieves over 100,000 writes per second and sub-25ms read latencies on 100-terabyte tables, leveraging Kubernetes, Kafka, and a NewSQL backend to improve scalability, reliability, and operational efficiency across internal services.
-
Anthropic Reveals Three Infrastructure Bugs behind Claude Performance Issues
Anthropic recently published a postmortem revealing that three distinct infrastructure bugs intermittently degraded the output quality of its Claude models in recent weeks. While the company states it has now resolved those issues and is modifying its internal processes to prevent similar disruptions, the community highlights the challenges of running the service across three hardware platforms.
-
Microsoft Tests Microfluidic Cooling for Next-Generation AI Chips
Microsoft has announced progress on a new chip cooling approach that could help address one of the biggest bottlenecks in scaling AI infrastructure: heat. The company’s researchers have successfully demonstrated in-chip microfluidic cooling, a system that channels liquid coolant directly into etched grooves on the back of silicon chips.
-
Imagine Learning Highlights Linkerd’s Role in Cloud-Native Scale and Cost Savings
Education technology provider Imagine Learning relies on Linkerd as the backbone of its cloud-native infrastructure, supporting rapid growth while ensuring reliability, scalability, and security. The company reports an over 80% reduction in compute needs and a 40% cut in networking costs since adopting the service mesh.
-
System Initiative Launches “AI Native” Platform to Simplify Infrastructure Automation
System Initiative recently released its AI Native Infrastructure Automation platform, aiming to offer DevOps teams a new way to manage infrastructure through natural language.
-
AWS Launches Memory-Optimized EC2 R8i and R8i-flex Instances with Custom Intel Xeon 6 Processors
AWS has launched its eighth-generation Amazon EC2 R8i and R8i-flex instances, powered by custom Intel Xeon 6 processors. Designed for memory-intensive workloads, these instances offer up to 15% better price performance and enhanced memory throughput, making them ideal for real-time data processing and AI applications.
-
AWS CCAPI MCP Server: Natural Language Infra
AWS has introduced the Cloud Control API (CCAPI) MCP Server, which enables natural language commands for managing cloud resources. The tool aims to boost developer productivity with automated security checks, IaC template generation, and cost estimation, bridging the gap between developer intent and cloud deployment.
-
Amazon EVS Offers Enterprises a New Path for VMware Workload Migration
AWS has launched Amazon Elastic VMware Service (EVS), enabling rapid deployment of VMware Cloud Foundation within an Amazon VPC. Users can leverage existing VMware expertise without re-architecting workloads. With competitive pricing and full root access, EVS gives businesses options amid VMware licensing changes, supporting migration and modernization.
-
The White House Releases National AI Strategy Focused on Innovation, Infrastructure, and Global Leadership
The White House has published America’s AI Action Plan, outlining a national strategy to enhance U.S. leadership in artificial intelligence. The plan follows President Trump’s January Executive Order 14179, which directed federal agencies to accelerate AI development and remove regulatory barriers to innovation.
-
Zendesk Streamlines Infrastructure Provisioning with Foundation Interface Platform
Zendesk has unveiled its new Foundation Interface, a unified platform designed to transform infrastructure provisioning into a fully self-service experience. This platform enables engineers to request infrastructure components, such as databases, object storage, compute resources, and secrets, by simply defining requirements in a declarative YAML file.
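The declarative request described above might look roughly like the following. The field names and structure here are hypothetical illustrations of the idea, not Zendesk's actual Foundation Interface schema:

```yaml
# Hypothetical shape of a Foundation Interface request;
# field names are illustrative, not Zendesk's actual schema.
kind: infrastructure-request
team: payments
components:
  - type: database
    engine: postgres
    size: small
  - type: object-storage
    retention_days: 90
  - type: secret
    name: payments-api-key
```

The appeal of this pattern is that engineers declare *what* they need while the platform decides *how* to provision it, encoding security and cost policy once rather than in every team's tooling.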
-
Google Cloud Introduces Non-Disruptive Cloud Storage Bucket Relocation
Google Cloud's new Cloud Storage bucket relocation feature enables non-disruptive data migration across regions while preserving metadata and minimizing application downtime. Because buckets keep their names and access paths, applications continue working unchanged, and governance and lifecycle-management settings carry over through the move.