Cloud Computing Content on InfoQ
-
Cloudflare Global Outage Traced to Internal Database Change
Cloudflare’s recent global outage, linked to a database update, caused widespread disruption and highlighted the risks of single-vendor reliance. While service was restored, the incident sparked discussions on the importance of multi-vendor strategies in tech. Cloudflare's CEO vowed to enhance system resilience, emphasizing that outages can impact even the largest providers.
-
AWS Disruption Exposes Fragility in Critical Cloud Infrastructure
On October 20, 2025, Amazon Web Services (AWS) experienced a major outage that disrupted global internet services, affecting millions of users and thousands of companies across more than 60 countries. The incident originated in the US-EAST-1 region and was traced to a DNS resolution failure affecting the DynamoDB endpoint, which cascaded into outages across multiple dependent services.
-
Parting the Clouds: the Rise of Disaggregated Systems by Murat Demirbas at QCon SF 2025
Cloud computing is evolving through disaggregation, addressing inefficiencies of traditional architectures by decoupling compute and storage. This shift enhances scalability, fault isolation, and operational simplicity, driven by advancements in networking. As seen in cloud databases such as Amazon Aurora, embracing these principles enables true economic optimization and innovative design.
-
Cloudflare Introduces Data Platform with Zero Egress Fees
Cloudflare has recently announced the open beta of Cloudflare Data Platform, a managed solution for ingesting, storing, and querying analytical data tables using open standards such as Apache Iceberg.
-
Inside Duolingo’s FinOps Journey: Turning Cloud Spend into Engineering Insight
Duolingo's FinOps journey integrates financial awareness into engineering, empowering developers to link costs with performance. By leveraging real-time data, teams prioritize innovations for maximum impact. This collaborative culture shift transformed cost efficiency into a hallmark of engineering quality, paving the way for smarter, more sustainable cloud spending.
-
Vercel Ship AI 2025 Key Announcements and Technical Updates
Vercel Ship AI 2025 unveiled AI SDK 6 beta, new Marketplace agents and services, workflow support for TypeScript, Vercel Agent for code reviews, Python SDK for FastAPI/Flask, and open-source templates for lead enrichment and Slack-SQL queries.
-
Talos Linux: Bringing Immutability and Security to Kubernetes Operations
Sidero Labs has been developing Talos Linux, an immutable operating system purpose-built for running Kubernetes, alongside Omni, a cluster lifecycle management platform. InfoQ met the Sidero team in Amsterdam during TalosCon 2025 and discussed their approach to simplifying Kubernetes operations through minimalism and security-first design.
-
AWS Launches Amazon Quick Suite, an Agentic AI Workspace
AWS has launched Amazon Quick Suite, a new AI-powered workspace designed to connect company data, automate workflows, and perform actions across business applications.
-
Vercel Introduces AI Gateway for Multi-Model Integration
Vercel has rolled out the AI Gateway for production workloads. The service provides a single API endpoint for accessing a wide range of large language and generative models, aiming to simplify integration and management for developers.
-
Azure Service Groups Enter Public Preview Offering New Abstraction Layer for Resource Management
Microsoft has launched Azure Service Groups in public preview, a new feature designed to simplify resource management and administration. Acting as a flexible, tenant-level container, Service Groups allow users to organize Azure resources from anywhere within their tenant without affecting RBAC or policy inheritance.
-
Google Cloud Unveils New Data Security Posture Management Offering in Preview
Google Cloud has unveiled its new Data Security Posture Management (DSPM) offering, aimed at data governance, privacy, and compliance. The offering provides visibility into sensitive data, helping organizations identify risks and enforce controls. Its features are integrated into the Security Command Center to address the evolving challenges of cloud data security.
-
Pinterest Automates Hadoop Cluster Scaling and Migration with Internal Orchestration System
Recently, Pinterest disclosed its internal orchestration framework, called Hadoop Control Center (HCC), to automate the scaling and migration of its large-scale Hadoop clusters. This move addresses the operational complexity and limitations Pinterest previously faced when managing thousands of nodes across dozens of YARN clusters on AWS.
-
Amazon Launches Bedrock AgentCore for Enterprise AI Agent Infrastructure
Amazon announced the preview of Amazon Bedrock AgentCore, a collection of enterprise-grade services that help developers deploy and operate AI agents at scale across frameworks and foundation models. The platform addresses infrastructure challenges developers face when building production AI agents.
-
AWS Lambda Gains Native Avro and Protobuf Support for Kafka Events with Schema Registry Integration
Lambda now natively supports Apache Avro and Protobuf events, streamlining Kafka event processing. The enhancement eliminates the need for custom deserialization, automates schema validation and filtering, and optimizes costs through efficient event handling. Integration with AWS Glue and Confluent registries simplifies development, allowing cleaner data consumption and enhanced scalability.
-
InfoQ Dev Summit Boston 2025: AI, Platforms, and Developer Experience
Software development is shifting fast. Senior engineers need real-world insights on AI, platforms, and developer autonomy. InfoQ Dev Summit Boston (June 9-10) offers 2 days with over 27 sessions of curated, technical talks delivered by engineers actively working at scale. We are focused on helping teams navigate the software evolution, with the clarity and context needed to make better decisions.