Cloud Architecture Content on InfoQ
-
Cloudflare Completes Its Agent Infrastructure Stack with Browser Run Rebuild and Six-Layer Platform
Cloudflare rebuilt Browser Run on its own Containers platform, delivering 4x higher concurrency and 50% faster response times. The upgrade completes a six-layer agent infrastructure stack: compute (Dynamic Workers + Sandboxes), orchestration (Dynamic Workflows), memory (Agent Memory), browsing (Browser Run), and commerce (Stripe Projects).
-
OpenAI Outlines WebRTC Architecture for Low-Latency Voice AI at Scale
OpenAI recently outlined how it adapted WebRTC for low-latency voice AI at global scale. The new architecture replaced a conventional media termination model with a relay-transceiver design better suited to Kubernetes and cloud load balancers. It keeps WebRTC session state in a dedicated transceiver layer while using relays to reduce public UDP exposure and keep media routing close to users.
-
AWS WorkSpaces Now Lets AI Agents Operate Legacy Desktop Applications without APIs
AWS announced that Amazon WorkSpaces can now serve as managed virtual desktops for AI agents in public preview. Agents authenticate through IAM and operate legacy applications via computer vision and input simulation without APIs. Reflex benchmarks show vision agents consume 45x more tokens than API agents.
-
Cloudflare Ships Dynamic Workflows, Bringing Durable Execution to Per-Tenant and Per-Agent Code
Cloudflare released Dynamic Workflows, an MIT-licensed library that extends its durable execution engine so workflow code can differ per tenant, agent, or request at runtime. Built on Dynamic Workers, the library enables platforms to serve millions of unique durable workflows at near-zero idle cost. CI/CD and agent plan execution are the headline use cases.
-
AWS Interconnect Reaches General Availability with Managed Multicloud and Last-Mile Connectivity
AWS Interconnect reached general availability, offering managed private Layer 3 connections to Google Cloud and a last-mile capability via Lumen. Azure and OCI support is planned for later in 2026. AWS published the underlying specification on GitHub under Apache 2.0, which Forrester analysts read as a play to set a de facto standard for multicloud connectivity.
-
"Pick and Mix" Custom Regions: Cloudflare Introduces Fine-Grained Data Residency Control
Cloudflare recently introduced Custom Regions, an expansion of its Regional Services that lets customers precisely define where their data is processed. By selecting specific groups of data centers by country or region, customers can ensure that TLS termination and application-layer processing remain within chosen geographic boundaries for compliance and control.
-
From Minutes to Seconds: Uber Boosts MySQL Cluster Uptime with Consensus Architecture
Uber redesigned its MySQL fleet using a consensus-driven architecture based on MySQL Group Replication, reducing cluster failover time from minutes to seconds. By moving leader election and failure detection into the database layer, Uber improved availability, simplified external orchestration, and strengthened consistency across thousands of production clusters.
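As a rough illustration of the majority rule that consensus-based replication relies on, the sketch below checks whether a group can still elect a leader. The function name `has_quorum` and the member counts are hypothetical and are not taken from Uber's implementation.

```python
def has_quorum(reachable_members: int, total_members: int) -> bool:
    """Return True when a strict majority of the group is reachable.

    Hypothetical helper: consensus protocols generally need more than
    half of the members to agree before a new leader can be chosen.
    """
    if total_members <= 0:
        raise ValueError("total_members must be positive")
    return reachable_members > total_members // 2


# A three-node group tolerates one failure, but not two.
print(has_quorum(2, 3))
print(has_quorum(1, 3))
```

Under this rule, a group of three can lose one member and keep electing a leader, which is one reason consensus groups are usually sized with an odd number of nodes.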
-
Reducing Onboarding from 48 Hours to 4: inside Amazon Key’s Event-Driven Platform
Amazon Key modernized its event platform by adopting a centralized, event-driven architecture built on Amazon EventBridge. The redesign processes millions of daily events with millisecond latency, improves schema governance, automates cross-account routing, and reduces service onboarding time from 48 hours to four, while maintaining 99.99 percent reliability.
-
Uber Moves In-House Search Indexing to Pull-Based Ingestion in OpenSearch
Uber is transitioning its in-house search indexing to OpenSearch with a pull-based ingestion framework. The approach improves reliability, backpressure handling, and multi-region consistency for large-scale streaming data, while simplifying recovery and supporting global, real-time search.
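The backpressure benefit of pull-based ingestion can be sketched in a few lines: the consumer asks for records only when it has spare capacity, so a slow indexer is never flooded. The class `PullConsumer`, its `max_in_flight` parameter, and the in-memory `deque` standing in for a stream are all assumptions made for illustration, not part of Uber's or OpenSearch's API.

```python
from collections import deque


class PullConsumer:
    """Hypothetical consumer that pulls records at its own pace."""

    def __init__(self, source: deque, max_in_flight: int = 3):
        self.source = source              # stand-in for a stream partition
        self.max_in_flight = max_in_flight
        self.in_flight = []               # pulled but not yet indexed
        self.indexed = []                 # stand-in for the search index

    def poll(self) -> int:
        """Pull only as many records as there is free capacity for."""
        capacity = self.max_in_flight - len(self.in_flight)
        pulled = 0
        while pulled < capacity and self.source:
            self.in_flight.append(self.source.popleft())
            pulled += 1
        return pulled

    def flush(self) -> None:
        """Simulate indexing: move in-flight records into the index."""
        self.indexed.extend(self.in_flight)
        self.in_flight.clear()


stream = deque(range(5))
consumer = PullConsumer(stream, max_in_flight=3)
consumer.poll()   # pulls three records and then stops: capacity is exhausted
consumer.flush()  # indexing frees capacity for the next poll
```

Because unpulled records stay in the source, recovery in this sketch amounts to resuming `poll()` from wherever the consumer left off.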
-
Parting the Clouds: the Rise of Disaggregated Systems by Murat Demirbas at QCon SF 2025
Cloud computing is evolving through disaggregation, addressing inefficiencies of traditional architectures by decoupling compute and storage. This shift enhances scalability, fault isolation, and operational simplicity, driven by advancements in networking. As seen in cloud databases such as Amazon Aurora, embracing these principles enables true economic optimization and innovative design.
-
Azure Front Door Outage: How a Single Control-Plane Defect Exposed Architectural Fragility
A recent 9-hour Azure Front Door (AFD) outage was triggered by a faulty control-plane configuration change that bypassed safety checks because of a software defect. The blast radius was large: identity coupling carried the failure into M365 and Entra ID, exposing a critical architectural anti-pattern in centralized edge fabrics.
-
Google Cloud KMS Launches Post-Quantum KEM Support to Combat "Harvest Now, Decrypt Later" Threat
Google Cloud's Key Management Service now supports post-quantum Key Encapsulation Mechanisms (KEMs), addressing future threats from quantum computing. This update empowers organizations to prepare against "Harvest Now, Decrypt Later" attacks while ensuring long-term data confidentiality.
-
Crossplane Tackles Applications alongside Cloud Infrastructure with v2.0 Release
The Crossplane open-source project has announced the release of version 2.0, an upgrade that moves the project from managing only cloud infrastructure to more comprehensive application and infrastructure orchestration. Some architectural changes have also been made to simplify platform engineering workflows and expand the project's original scope.
-
Cloudflare Rearchitects Workers KV Following GCP Outage, Achieves 40x Performance Gain
Cloudflare has recently redesigned Workers KV with a hybrid storage architecture that automatically routes objects between distributed databases and object storage based on size characteristics, while operating dual storage backends. This change improved the p99 read latencies from 200ms to under 5ms for their global key-value store while handling hundreds of billions of key-value pairs.
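Size-based routing between two storage backends can be sketched as a single threshold check. The cutoff value and the backend names below are hypothetical; the teaser does not state the threshold Cloudflare uses.

```python
SIZE_THRESHOLD_BYTES = 1024  # hypothetical cutoff, not Cloudflare's actual value


def choose_backend(value: bytes, threshold: int = SIZE_THRESHOLD_BYTES) -> str:
    """Route small values to a database and large ones to object storage."""
    return "database" if len(value) <= threshold else "object_storage"


print(choose_backend(b"x" * 10))
print(choose_backend(b"x" * 2000))
```

The design idea is that small values benefit from low-latency database reads, while large values are cheaper to keep in object storage.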
-
When Unchecked Autoscaling Generates a $120K Cloud Spend
After unchecked autoscaling during a DDoS attack produced a $120K bill, industry experts stress the need for robust FinOps practices. Key recommendations include setting hard resource limits and using real-time alerts to catch runaway spend early. Cloud teams must balance cost control against system availability.
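A minimal sketch of the two recommendations, a hard cap plus an alert, might look like the following. The function `desired_replicas` and its parameters are hypothetical and do not correspond to any specific autoscaler's API.

```python
import math


def desired_replicas(current_load: float,
                     capacity_per_replica: float,
                     max_replicas: int) -> tuple[int, bool]:
    """Return (replica count, alert flag) for a capped scaling decision.

    The count never exceeds max_replicas; the flag is True when demand
    would have pushed past the cap, which is the moment to notify a human.
    """
    wanted = max(1, math.ceil(current_load / capacity_per_replica))
    capped = min(wanted, max_replicas)
    return capped, wanted > max_replicas


print(desired_replicas(250, 100, 10))
print(desired_replicas(50_000, 100, 10))
```

The cap trades availability for cost: during an attack the service may shed load, but spend stays bounded while the alert prompts someone to decide whether raising the limit is justified.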