AWS Introduces S3 Files, Bringing File System Access to S3 Buckets

AWS recently introduced S3 Files, which lets users mount an Amazon S3 bucket and access its data through a standard file system interface. Applications can read and write files using standard file operations, while the system automatically translates them into S3 requests, allowing compute services to work directly with data stored in S3.

Figure: Amazon S3 Files (source: AWS blog)
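
As a minimal sketch of what "standard file operations" means in practice, the Python snippet below writes and reads a file with ordinary file I/O and no S3 SDK calls. The `S3FILES_MOUNT` environment variable and the helper names are assumptions made for illustration, not part of any AWS API; when the variable is unset, a temporary directory stands in for the mounted bucket so the example runs anywhere.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical: S3FILES_MOUNT would name the directory where the bucket is
# mounted. When it is unset, a temporary directory stands in for the mount.
root = Path(os.environ.get("S3FILES_MOUNT") or tempfile.mkdtemp())


def write_report(root: Path, name: str, text: str) -> Path:
    """Write a text file under <root>/reports using plain file I/O."""
    target = root / "reports" / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target


def list_reports(root: Path) -> list[str]:
    """Return the sorted names of the files under <root>/reports."""
    return sorted(p.name for p in (root / "reports").iterdir() if p.is_file())


report = write_report(root, "daily.txt", "rows processed: 42\n")
print(report.read_text(encoding="utf-8"), end="")
print(list_reports(root))
```

Nothing in the code is specific to S3; the only thing that would change on a real deployment is the directory the code points at.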

Sébastien Stormacq, principal developer advocate at AWS, explains how the bucket's data is exposed once it has been mounted:

As you work with specific files and directories through the file system, associated file metadata and contents are placed onto the file system's high-performance storage. By default, files that benefit from low-latency access are stored and served from the high performance storage. For files not stored on high performance storage such as those needing large sequential reads, S3 Files automatically serves those files directly from Amazon S3 to maximize throughput.

Claiming to be the only provider offering fully featured, high-performance file system access to an object store, AWS suggests S3 Files for workloads such as analytics, machine learning, media processing, and other applications that require shared file system access to large datasets. Stormacq adds:

Under the hood, S3 Files uses Amazon EFS and delivers ~1ms latencies for active data. The file system supports concurrent access from multiple compute resources with NFS close-to-open consistency, making it ideal for interactive, shared workloads that mutate data, from agentic AI agents collaborating through file-based tools to ML training pipelines processing datasets.
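
Close-to-open consistency, as the term is generally used for NFS, means that a client opening a file after another client has closed it is guaranteed to see the data written before that close. A minimal sketch of the handoff pattern this implies follows; the function names and the file name are hypothetical, and a local temporary directory stands in for the shared mount, so the snippet shows only the ordering (write, close, then open) rather than real multi-client behavior.

```python
import json
import tempfile
from pathlib import Path


def publish(path: Path, payload: dict) -> None:
    """Producer side: write the payload and close the file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)
    # Leaving the with-block closes the file. Under close-to-open semantics,
    # a client that opens the file after this point sees the full contents.


def consume(path: Path) -> dict:
    """Consumer side: open the file fresh and read it."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


# Stand-in for a path on the shared file system.
shared = Path(tempfile.mkdtemp()) / "state.json"
publish(shared, {"step": 1, "status": "done"})
result = consume(shared)
print(result["status"])
```

The practical consequence is that a consumer holding a file open while a producer is still writing has no such guarantee; coordination happens at close and open boundaries.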

S3 Files supports intelligent prefetching to anticipate data access needs. Customers can control what is stored on the file system, including the option to load full file data or only metadata, enabling optimization for specific access patterns. Andrew Warfield, VP and Distinguished Engineer at Amazon, explains the motivation and design choices behind S3 Files and provides some implementation details. Warfield writes:

When you create or modify files, changes are aggregated and committed back to S3 roughly every 60 seconds as a single PUT. Sync runs in both directions, so when other applications modify objects in the bucket, S3 Files automatically spots those modifications and reflects them in the filesystem view automatically. If there is ever a conflict where files are modified from both places at the same time, S3 is the source of truth and the filesystem version moves to a lost+found directory with a CloudWatch metric identifying the event. File data that hasn't been accessed in 30 days is evicted from the filesystem view but not deleted from S3, so storage costs stay proportional to your active working set.
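
The behavior described in the quote can be illustrated with a toy in-memory model. This is not AWS's implementation: the class `ToySync`, its fields, and its conflict check are assumptions chosen to mirror the quoted description, in which repeated writes to a file are aggregated into one PUT per sync cycle and a concurrent change in the bucket wins, with the local version moved to lost+found.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToySync:
    """Toy model of the described sync; call sync() roughly every 60 seconds."""

    bucket: dict = field(default_factory=dict)      # stand-in for the S3 bucket
    pending: dict = field(default_factory=dict)     # local writes not yet synced
    base: dict = field(default_factory=dict)        # bucket content seen at first write
    lost_found: dict = field(default_factory=dict)  # local versions that lost a conflict
    put_count: int = 0                              # number of PUTs issued

    def write(self, key: str, data: bytes) -> None:
        # Remember what the bucket held when the first local change was made,
        # so sync() can tell whether the object changed in the meantime.
        if key not in self.pending:
            self.base[key] = self.bucket.get(key)
        self.pending[key] = data  # later writes replace earlier ones

    def sync(self) -> None:
        for key, data in self.pending.items():
            if self.bucket.get(key) != self.base[key]:
                # The object changed in the bucket: S3 is the source of truth.
                self.lost_found[key] = data
            else:
                self.bucket[key] = data
                self.put_count += 1  # one PUT per changed file per cycle
        self.pending.clear()
        self.base.clear()


fs = ToySync()
for version in (b"v1", b"v2", b"v3"):
    fs.write("a.txt", version)
fs.sync()  # three writes, a single PUT carrying the latest content

fs.write("a.txt", b"local edit")
fs.bucket["a.txt"] = b"remote edit"  # another application updates the object
fs.sync()  # conflict: the bucket version stays, the local one is set aside

print(fs.put_count, fs.bucket["a.txt"], fs.lost_found["a.txt"])
```

The model compares contents to detect a conflict purely for simplicity; how the real service detects concurrent modification is not described in the quote.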

Some developers reacted humorously to AWS adding a filesystem interface to a service long described as "not a filesystem." At the same time, the broader community response was mixed, with some appreciating the simpler developer experience and others raising concerns about potential costs.

In the article "S3 Is Not a Filesystem (But Now There's One In Front of It)", Corey Quinn praises the implementation ("They didn't just bolt a POSIX layer on top of S3 and call it a day"), highlights the differences from Mountpoint for Amazon S3, considers the pricing model reasonable, and analyzes how it compares with EFS pricing.

Charges apply to the amount of data stored in the S3 file system, to small-file reads and all write operations, and to the S3 requests used to synchronize data between the file system and the S3 bucket. In a popular Hacker News thread, user MontyCarloHall comments:

This is essentially S3FS using EFS (AWS's managed NFS service) as a cache layer for active data and small random accesses. Unfortunately, this also means that it comes with some of EFS's eye-watering pricing.

S3 Files runs on EFS infrastructure with identical pricing by design, but since charges apply only to the small, frequently accessed portion of data that lands on the file system, the overall cost might remain lower than keeping the entire dataset on EFS, despite the matched rates. Testing the new option, Dzhuneyt Ahmed, CTO at Provenant, highlights the current limitations: S3 versioning is mandatory, there is no Infrastructure as Code support at launch, and the IAM setup is not obvious, with the trust policy using EFS service principals and S3 Files-specific conditions.
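
The pricing argument above, that file system rates apply only to the active portion of the data, can be illustrated with simple arithmetic. The sketch below is storage-only and ignores the read, write, and synchronization request charges mentioned earlier. The rates are placeholders, not quoted AWS prices, and the function name and the assumption that active data is billed at both the S3 rate and the file system rate are illustrative.

```python
def monthly_storage_cost(total_gb: float, active_fraction: float,
                         fs_rate: float, s3_rate: float) -> float:
    """Storage-only estimate in USD per month.

    Assumes the whole dataset is billed at the S3 rate and the active share
    is additionally billed at the file system rate.
    """
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    active_gb = total_gb * active_fraction
    return total_gb * s3_rate + active_gb * fs_rate


# Placeholder rates in USD per GB-month; NOT quoted AWS prices.
FS_RATE = 0.30
S3_RATE = 0.02

TOTAL_GB = 10_000
tiered = monthly_storage_cost(TOTAL_GB, 0.05, FS_RATE, S3_RATE)
all_on_fs = TOTAL_GB * FS_RATE  # whole dataset kept on the file system

print(f"5% active on the file system: {tiered:.2f} USD")
print(f"everything on the file system: {all_on_fs:.2f} USD")
```

With these placeholder numbers the gap is driven almost entirely by the active fraction: as it approaches 1, the tiered estimate exceeds the file-system-only figure because the data is then paid for in both places.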

In a separate announcement, Amazon S3 introduced a new default security setting that disables server-side encryption with customer-provided keys (SSE-C) for new and existing buckets.

S3 Files is generally available in all AWS regions.

About the Author

Renato Losio
