S3 Content on InfoQ
-
AWS Adds Intelligent-Tiering and Replication for S3 Tables
AWS has introduced Intelligent-Tiering and cross-region replication for S3 Tables to automate cost optimization and data availability for Apache Iceberg workloads. These features allow data to transition to lower-cost storage tiers based on access patterns while maintaining consistent, read-only table replicas across regions and accounts without manual synchronization.
-
DuckDB's WebAssembly Client Allows Querying Iceberg Datasets in the Browser
DuckDB has recently introduced end-to-end interaction with Iceberg REST Catalogs directly within a browser tab, requiring no infrastructure setup. The new feature leverages DuckDB-Wasm, a WebAssembly port of DuckDB that runs in the browser, allowing users to query, read, and write Iceberg tables in a serverless manner.
-
Amazon S3 Vectors Reaches GA, Introducing "Storage-First" Architecture for RAG
AWS has announced the general availability of Amazon S3 Vectors, increasing per-index capacity forty-fold to 2 billion vectors. By natively integrating vector search into the S3 storage engine, the service introduces a "Storage-First" architecture that decouples compute from storage, reducing total cost of ownership by up to 90% for large-scale RAG workloads.
-
MinIO GitHub Repository in Maintenance Mode: What's Next for the Open Source Object Storage?
After a contentious license change and the removal of administrative features from the console, the company behind the popular open-source object storage server MinIO recently announced that the project will enter maintenance mode. The change has prompted discussion in the community about the need for a fork, the challenges of open source projects, and the current alternatives.
-
Yelp Publishes Blueprint for Managing S3 Server-Access Logs at Massive Scale
In a detailed engineering post, Yelp shared how it built a scalable and cost-efficient pipeline for processing Amazon S3 server-access logs (SAL) across its infrastructure, overcoming traditional limitations of raw log storage and querying at high volume.
-
Airbnb’s Mussel V2: Next-Gen Key Value Storage to Unify Streaming and Bulk Ingestion
Airbnb’s engineering team re-architected its internal key-value storage system, Mussel, to unify streaming and bulk ingestion while simplifying operations. The new version achieves over 100,000 writes per second and sub-25ms read latencies on 100-terabyte tables, and uses Kubernetes, Kafka, and a NewSQL backend to improve scalability, reliability, and operational efficiency across internal services.
-
AWS Introduces Vector Capabilities on Amazon S3
At the recent AWS Summit in New York City, AWS announced the preview of Amazon S3 Vectors, which it claims is the first cloud object store with native support for storing large vector datasets. The new option offers subsecond query performance and reduces the cost of storing AI-ready data compared to traditional vector databases.
-
Amazon S3 Adds Sort and Z-Order Compaction to Improve Apache Iceberg Query Performance
AWS has recently announced that Amazon S3 now supports sort and z-order compaction for Apache Iceberg tables. The new features reduce scan times and engine costs, and are available for both S3 Tables and traditional S3 buckets using AWS Glue Data Catalog optimization.
-
Google Cloud Announces Rapid Storage for Millisecond-Latency Workloads
At the recent Google Cloud Next 2025, the cloud provider announced Rapid Storage, a new Cloud Storage zonal bucket designed to deliver consistent single-digit millisecond data access for frequently accessed data and latency-sensitive applications. The new storage class provides under 1ms random read and write latency, 20x faster data access, and 6 TB/s of throughput.
-
How a Manual Remediation for a Phishing URL Took down Cloudflare R2
Due to human error in handling a phishing report and insufficient validation safeguards in admin tools, Cloudflare experienced an incident affecting its R2 Gateway service on February 5th. As part of a routine remediation for a phishing URL, the R2 service was inadvertently taken down, leading to the outage or disruption of numerous other Cloudflare services for over an hour.
-
How to Defend Amazon S3 Buckets from Ransomware Exploiting SSE-C Encryption
A new ransomware campaign, dubbed Codefinger, has been targeting Amazon S3 users by exploiting compromised AWS credentials to encrypt data using Server-Side Encryption with Customer-Provided Keys (SSE-C). Attackers then demand ransom payments for the symmetric AES-256 keys required to decrypt the data. AWS has released recommendations to help users mitigate the risk of ransomware attacks on S3.
-
AWS Announces Physical Data Transfer Terminal for High-Speed Uploads
AWS has recently introduced AWS Data Transfer Terminal, a new option for high-speed data uploads. Currently available only in the US, Data Transfer Terminals provide a physical location where customers can bring their storage devices for fast data transfer to and from the AWS cloud.
-
AWS Introduces S3 Tables Bucket: Is S3 Becoming a Data Lakehouse?
AWS has recently announced S3 Tables Bucket, a bucket type that provides managed Apache Iceberg tables optimized for analytics workloads. According to the cloud provider, the new option delivers up to 3x faster query performance and up to 10x higher transaction rates for Apache Iceberg tables compared to standard S3 storage.
-
Amazon S3 Introduces Metadata Feature for Improved Data Management and Querying in Preview
Amazon Web Services (AWS) has launched S3 Metadata, enhancing data management for S3 users. This new capability enables near real-time querying and analysis of S3 data via organized metadata updates. By adopting Apache Iceberg, it ensures interoperability and scalability, allowing businesses to efficiently leverage their data for analytics and AI applications.
-
From Aurora DSQL to Amazon Nova: Highlights of re:Invent 2024
The 2024 edition of re:Invent has just ended in Las Vegas. As anticipated, AI was a key focus of the conference, with Amazon Nova and a new version of SageMaker among the most significant highlights. However, the announcement that generated the most excitement in the community was the preview of Amazon Aurora DSQL, a serverless, distributed SQL database with active-active high availability.