InfoQ Homepage S3 Content on InfoQ
-
Inside Agoda’s Storefront: a Latency-Aware Reverse Proxy for Improving DNS Based Load Distribution
Agoda engineers developed Storefront, a Rust-based S3-compatible reverse proxy that improves load balancing, request routing, and observability across large-scale object storage systems. The proxy addresses DNS-based distribution limitations, implements latency-aware routing, cross-data-center optimizations, IO safeguards, credential-less authentication, and exposes telemetry via OpenTelemetry.
-
AWS S3 Introduces Account-Regional Namespaces, Ending 18 Years of Global Bucket Name Collisions
AWS introduced account-regional namespaces for S3, fixing global bucket name collisions that broke IaC automation for 18 years. The new format is {prefix}-{account-id}-{region}-an. CloudFormation gets the BucketNamePrefix property, and IAM gets the s3:x-amz-bucket-namespace condition key. This prevents confused-deputy attacks by making names unpredictable when there is no account ID.
-
Pinterest’s CDC-Powered Ingestion Slashes Database Latency from 24 Hours to 15 Minutes
Pinterest launched a next-generation CDC-based database ingestion framework using Kafka, Flink, Spark, and Iceberg. The system reduces data availability latency from 24+ hours to 15 minutes, processes only changed records, supports incremental updates and deletions, and scales to petabyte-level data across thousands of pipelines, optimizing cost and efficiency.
-
Cloudflare Introduces Local Uploads for R2 to Cut Cross-Region Write Latency by 75%
Cloudflare has recently introduced Local Uploads for R2 in open beta. The new feature optimizes write performance for globally distributed users without changing bucket location, reducing cross-region write latency.
-
AWS Adds Intelligent-Tiering and Replication for S3 Tables
AWS has introduced Intelligent-Tiering and cross-region replication for S3 Tables to automate cost optimization and data availability for Apache Iceberg workloads. These features allow data to transition to lower-cost storage tiers based on access patterns while maintaining consistent, read-only table replicas across regions and accounts without manual synchronization.
-
DuckDB's WebAssembly Client Allows Querying Iceberg Datasets in the Browser
DuckDB has recently introduced end-to-end interaction with Iceberg REST Catalogs directly within a browser tab, requiring no infrastructure setup. The new feature leverages DuckDB-Wasm, a WebAssembly port of DuckDB that runs in the browser, allowing users to query, read, and write Iceberg tables in a serverless manner.
-
Amazon S3 Vectors Reaches GA, Introducing "Storage-First" Architecture for RAG
AWS has announced the general availability of Amazon S3 Vectors, increasing per-index capacity forty-fold to 2 billion vectors. By natively integrating vector search into the S3 storage engine, the service introduces a "Storage-First" architecture that decouples compute from storage, reducing total cost of ownership by up to 90% for large-scale RAG workloads.
-
MinIO GitHub Repository in Maintenance Mode: What's Next for the Open Source Object Storage?
After a contentious license change and the removal of administrator functionalities from the console, the company behind the popular open-source object storage server Minio recently announced that the project will now enter maintenance mode. The change has raised discussion in the community about the need for a fork, the challenges of open source projects, and the current alternatives.
-
Yelp Publishes Blueprint for Managing S3 Server-Access Logs at Massive Scale
In a detailed engineering post, Yelp shared how it built a scalable and cost-efficient pipeline for processing Amazon S3 server-access logs (SAL) across its infrastructure, overcoming traditional limitations of raw log storage and querying at high volume.
-
Airbnb’s Mussel V2: Next-Gen Key Value Storage to Unify Streaming and Bulk Ingestion
Airbnb’s engineering team re-architected its internal key-value storage system, Mussel, to unify streaming and bulk ingestion while simplifying operations, achieving over 100,000 writes per second and sub-25ms read latencies on 100-terabyte tables, while leveraging Kubernetes, Kafka, and a NewSQL backend to improve scalability, reliability, and operational efficiency across its internal services.
-
AWS Introduces Vector Capabilities on Amazon S3
At the recent AWS Summit in New York City, AWS announced the preview of Amazon S3 Vectors, claiming to be the first cloud object store with native support for storing large vector datasets. The new option offers subsecond query performance, reducing the cost of storing AI-ready data compared to traditional vector databases.
-
Amazon S3 Adds Sort and Z-Order Compaction to Improve Apache Iceberg Query Performance
AWS has recently announced that Amazon S3 now supports sort and z-order compaction for Apache Iceberg tables. The new features reduce scan times and engine costs, and are available for both S3 Tables and traditional S3 buckets using AWS Glue Data Catalog optimization.
-
Google Cloud Announces Rapid Storage for Millisecond-Latency Workloads
At the recent Google Cloud Next 2025, the cloud provider announced Rapid Storage, a new Cloud Storage zonal bucket designed to deliver consistent single-digit millisecond data access for frequently accessed data and latency-sensitive applications. The new storage class provides under 1ms random read and write latency, 20x faster data access, and 6 TB/s of throughput.
-
How a Manual Remediation for a Phishing URL Took down Cloudflare R2
Due to human error in handling a phishing report and insufficient validation safeguards in admin tools, Cloudflare experienced an incident affecting its R2 Gateway service on February 5th. As part of a routine remediation for a phishing URL, the R2 service was inadvertently taken down, leading to the outage or disruption of numerous other Cloudflare services for over an hour.
-
How to Defend Amazon S3 Buckets from Ransomware Exploiting SSE-C Encryption
A new ransomware campaign, dubbed Codefinger, has been targeting Amazon S3 users by exploiting compromised AWS credentials to encrypt data using Server-Side Encryption with Customer-Provided Keys (SSE-C). Attackers then demand ransom payments for the symmetric AES-256 keys required to decrypt the data. AWS has released recommendations to help users mitigate the risk of ransomware attacks on S3.