Amazon Brings Virtualized Storage to the Cloud with Elastic Block Store

In April of this year, Amazon CTO Werner Vogels announced the development of persistent storage for Amazon EC2. The lack of persistent storage has long been an Achilles' heel of the EC2 platform: server instances start up with the contents of their parent image, and upon server failure or restart the disk reverts to the original image definition. Today Amazon moved to address this issue with the release of Elastic Block Store (EBS). Vogels outlines how the offering completes Amazon's suite of common storage patterns:

We had to make sure that the infrastructure storage solutions we were going to develop would be highly effective for developers by addressing the most common patterns first. That analysis led us to three top patterns:

Key-Value storage. The majority of the Amazon storage patterns were based on primary key access leading to single value or object. This pattern led to the development of Amazon S3.

Simple Structured Data storage. A second large category of storage patterns were satisfied by access to simple query interface into structured datasets. Fast indexing allows high-speed lookups over large dataset. This pattern led to the development of Amazon SimpleDB. A common pattern we see is that secondary keys to objects stored in Amazon S3 are stored in SimpleDB, where lookups result in sets of S3 (primary) keys.

Block storage. The remaining bucket holds a variety of storage patterns ranging from special file systems such as ZFS to applications managing their own block storage (e.g. cache servers) to relational databases. This category is served by Amazon EBS which provides the fundamental building block for implementing a variety of storage patterns.

Amazon has also provided details on pricing, durability, and performance. Highlights include:

Volumes can be between 1GB and 1TB in size.
Volumes behave like raw unformatted block devices.
Access is limited to instances within the same availability zone, similar to a SAN in a data center.
A volume can only be attached to one EC2 instance at a time.
One EC2 instance can have several attached volumes.
Volumes can have snapshots backed up to S3. Snapshots are incremental, storing only the data that has changed since the previous snapshot.
Due to data replication, the annual failure rate of a volume is expected to be 0.1% to 0.5%, depending on volume size, compared to about 4% for commodity hard disks.
Pricing is $0.10 per allocated GB per month and $0.10 per million I/O requests.

Given this pricing, it is estimated that a medium-sized database with 100GB of storage would cost $10 per month in storage and $26 per month in usage costs. A tutorial is available for running MySQL with EBS. RightScale has written an overview providing further analysis of the specifications, including a number of best practices and formulas for cost estimation. Regarding I/O rates, they share the following practical experience:

...As a point of reference, our main database server is pretty busy and chugs along at an average of 17 transactions per second, which should total to around $4.40 per month. But our monitoring servers, prior to some recent optimizations, hammered the disks as fast as they would go at over 1000 random writes per second sustained 24×7. That would end up costing over $250 per month! As far as I can tell, for most situations the EBS transaction costs will be in the noise, but you can make it expensive if you’re not careful...
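The figures quoted above can be reproduced with some simple arithmetic. The following is a minimal sketch in Python, not an official calculator: the function names are hypothetical, a 30-day month is assumed, each database transaction is assumed to map to one billed I/O request, and the 100 requests per second used for the "medium-sized database" case is an assumption chosen because it yields roughly the $26 usage estimate.

```python
# Rough EBS monthly cost estimate based on the launch pricing quoted in the
# article: $0.10 per allocated GB per month, $0.10 per million I/O requests.
# Assumptions (not from Amazon): a 30-day month, and a steady average I/O rate.

SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000 seconds in a 30-day month

STORAGE_PRICE_PER_GB_MONTH = 0.10  # USD
IO_PRICE_PER_MILLION = 0.10        # USD


def monthly_storage_cost(allocated_gb):
    """Cost of the allocated (not used) capacity for one month."""
    return allocated_gb * STORAGE_PRICE_PER_GB_MONTH


def monthly_io_cost(avg_io_per_second):
    """Cost of I/O requests for one month at a sustained average rate."""
    requests = avg_io_per_second * SECONDS_PER_MONTH
    return requests / 1_000_000 * IO_PRICE_PER_MILLION


def monthly_cost(allocated_gb, avg_io_per_second):
    """Total monthly cost: storage plus I/O."""
    return monthly_storage_cost(allocated_gb) + monthly_io_cost(avg_io_per_second)


if __name__ == "__main__":
    # Hypothetical workloads: (label, allocated GB, average I/O per second)
    workloads = [
        ("medium database", 100, 100),
        ("busy database at 17 tx/s", 100, 17),
        ("monitoring at 1000 writes/s", 100, 1000),
    ]
    for label, gb, iops in workloads:
        print(f"{label}: storage ${monthly_storage_cost(gb):.2f}, "
              f"I/O ${monthly_io_cost(iops):.2f}")
```

Under these assumptions, 100 requests per second comes to about $25.92 per month, 17 per second to about $4.41, and 1000 per second to about $259.20, which lines up with the $26, $4.40, and "over $250" figures above.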

Finally, GigaOM provides a business analysis of the new offering, noting that traditional data centers should be worried.
