InfoQ Homepage Cloud Computing Content on InfoQ
-
Event-Driven Patterns for Cloud-Native Banking: Lessons from What Works and What Hurts
Event-driven architecture helps banks decouple systems, scale services, and create clear activity trails. But it also introduces complexity, new failure modes, and operational challenges. Chris Tacey-Green explains where it adds value in banking systems and the practical patterns, such as inbox/outbox and stable event contracts, needed to make it reliable.
-
Configuration as a Control Plane: Designing for Safety and Reliability at Scale
Configuration has evolved from static deployment files into a live control plane that directly shapes system behavior. The evolution of configuration management highlights why misconfigurations can trigger large outages and how hyperscalers deploy changes safely using staged rollouts, validation, blast radius limits, and automated rollback at scale.
-
Building a Least-Privilege AI Agent Gateway for Infrastructure Automation with MCP, OPA, and Ephemeral Runners
This article presents a least-privilege AI Agent Gateway that places clear controls between AI agents and infrastructure. Agents do not access infrastructure APIs directly. Instead, every request is validated, authorized using policy as code with Open Policy Agent (OPA), and executed in short-lived, isolated environments, with built-in observability using OpenTelemetry.
-
Proactive Autoscaling for Edge Applications in Kubernetes
Kubernetes often reacts too late when traffic suddenly increases at the edge. A proactive scaling approach that considers response time, spare CPU capacity, and container startup delays can add or remove instances more smoothly, prevent sudden spikes, and keep performance stable on systems with limited resources.
-
Preventing Data Exfiltration: a Practical Implementation of VPC Service Controls at Enterprise Scale in Google Cloud Platform
Implementing VPC Service Controls is more about people and process than technology. Organizations must conduct extensive upfront discovery, use phased rollouts to avoid breaking production systems, and design VPC Service Controls that enable rather than block work. Success requires automation, clear exception processes, tracking both security and business metrics, and continuous improvement.
-
Stop Guessing, Start Improving: Using DORA Metrics and Process Behavior Charts
Delivery performance rarely changes in a straight line. Small degradations caused by tooling, environment instability, or team changes can accumulate quietly, while real improvements take time to emerge. This article shows how combining DORA metrics with Process Behavior Charts helps teams zoom out, detect meaningful shifts early, and validate improvement hypotheses.
-
Building Distributed Event-Driven Architectures across Multi-Cloud Boundaries
Multi-cloud event-driven architectures are now essential, not optional. With most organizations already multi-cloud, success depends on optimizing latency, ensuring resilience, and managing event consistency across providers. Key practices include code-level tuning, robust recovery policies, duplicate prevention, observability, and strong team readiness.
-
InfoQ Cloud and DevOps Trends Report - 2025
This InfoQ Trends Report offers readers a comprehensive overview of emerging trends and technologies in the areas of Cloud and DevOps. This report summarizes the InfoQ editorial team’s and external guests' view on the current trends in Cloud and DevOps technologies and what to look out for in the next 12 months.
-
Beyond the Padlock: Why Certificate Transparency is Reshaping Internet Trust
Certificate Transparency (CT) creates public, append-only logs of every TLS certificate issued, enabling detection of rogue or mistaken certificates. This article explores how CT has transformed internet PKI by moving from reliance on certificate authority trustworthiness to providing verifiable transparency that major browsers now require.
-
Ransomware-Resilient Storage: the New Frontline Defense in a High-Stakes Cyber Battle
Cybersecurity has evolved, with ransomware now primarily targeting data storage and backups. To combat this, modern defense strategies focus on making storage systems more resilient. Key tactics include using immutable storage that prevents data from being altered or deleted, employing AI-powered detection, and implementing air-gapping to create isolated, tamper-proof recovery points.
-
Sandbox as a Service: Building an Automated AWS Sandbox Framework
This article outlines an automated AWS Sandbox Framework to provide secure, cost-controlled environments for innovation. It leverages AWS services like Control Tower and open-source tools to automate provisioning, enforce security policies, manage resource lifecycles, and optimize costs through automated cleanup and governance.
-
Why Is My Docker Image So Big? A Deep Dive with ‘dive’ to Find the Bloat
AI images typically bloat from massive library installations and base OS components, with large Docker images slowing AI development and increasing costs. Chirag Agrawal demonstrates how to diagnose bloat using Docker's history and the interactive 'dive' tool to examine each layer in detail. The article shows how effective diagnosis leads to targeted optimizations.