Amazon upgrades EC2 with Persistent Storage
Amazon’s Elastic Compute Cloud (EC2), which delivers Hardware as a Service (HaaS), is adding persistent storage to its list of features. Werner Vogels, Amazon’s CTO, describes how storage volumes will attach to EC2 instances and how they handle failure and data consistency; he also explains how you can store a snapshot of your storage in S3, Amazon’s storage service, as a backup.
Based on Cambridge University's Xen virtualization, Amazon's Elastic Compute Cloud ("EC2") is a computing service that allows users to create, launch and terminate Linux-based server instances on demand. Each virtual machine instance is a virtual private server assigned an IP address via DHCP on start-up. Virtual machine images, which Amazon calls Amazon Machine Images (AMIs), can be archived and transported much like VMware's virtual appliances, so a developer can set up an initial instance of the required software and rapidly deploy it to a number of virtual servers.
One of the main obstacles to EC2 adoption, the lack of persistent storage, has been acknowledged by Amazon. As CTO Werner Vogels describes in his blog, persistent storage has now been added as an early-access feature:
I would like to introduce to you the newest feature of Amazon EC2: Persistent local storage. This has been very high on the request list of EC2 customers and I believe that combined with the Availability Zones and Elastic IP Address features released earlier this month this makes EC2 the ideal environment for building highly scalable and reliable applications.
Persistent storage for Amazon EC2 will be offered in the form of storage volumes which you can mount into your EC2 instance as a raw block storage device. It basically looks like an unformatted hard disk. Once you have the volume mounted for the first time you can format it with any file system you want or if you have advanced applications such as high-end database engines, you could use it directly.
Werner mentions that users will have the ability to take snapshots of their storage volumes and store them in S3:
…we introduced snapshot functionality: you ask the EC2 to make a snapshot of your volume and store it into Amazon S3. You can use this for long term backup purposes, for use in rollback strategies, but also for (world-wide) volume re-creation purposes.
Thorsten from RightScale, a company that provides a platform for deploying servers in the cloud on Amazon Web Services (AWS), had early access to the feature and blogs about his experience:
With the addition of the storage volumes with all the cool snapshot features it’s now a fait accomplit: the cloud adopters will have much more computing horsepower and flexibility at their fingertips than those who are still racking their own machines. It’s going to be like agile software development: if you want to survive as an internet/web service you will have to compute in the cloud or your competitors will leave you in the dust by being able to deploy faster, better, and cheaper.
Jeff Barr from Amazon describes the persistence functionality in a little more technical detail and emphasizes the on-demand nature of storage acquisition:
I spent some time experimenting with this new feature on Saturday. In a matter of minutes I was able to create a pair of 512 GB volumes, attach them to an EC2 instance, create file systems on them with mkfs, and then mount them. When I was done I simply unmounted, detached, and then finally deleted them.
Perhaps I am biased, but the ability to requisition this much storage on an as-needed basis seems pretty cool.
You will be able to create volumes ranging in size from 1 GB to 1 TB, and will be able to attach multiple volumes to a single instance. Volumes are designed for high throughput, low latency access from Amazon EC2, and can be attached to any running EC2 instance where they will show up as a device inside of the instance. This feature will make it even easier to run everything from relational databases to distributed file systems to Hadoop processing clusters using Amazon EC2. When persistent storage is launched, Amazon EC2 will be adding several new APIs to support the persistent storage feature. Included will be calls to manage your volume (CreateVolume, DeleteVolume), mount your volume to your instance (AttachVolume, DetachVolume) and save snapshots to Amazon S3 (CreateSnapshot, DeleteSnapshot).
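To make the order of these calls concrete, here is a minimal in-memory sketch in Python. It is not Amazon's API or any real client library: the `ToyEC2` class, its method names, the ID format and the instance ID `i-example` are all hypothetical, chosen only to mirror the six operations named above. The 1 GB to 1 TB size check comes from the announcement. The rules that a volume attaches to one instance at a time and must be detached before deletion are assumptions of this toy model, loosely following the unmount, detach, delete sequence Jeff Barr describes.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

MIN_SIZE_GB = 1
MAX_SIZE_GB = 1024  # 1 TB, the upper bound quoted in the announcement


@dataclass
class Volume:
    volume_id: str
    size_gb: int
    attached_to: Optional[str] = None  # instance ID, or None when detached


@dataclass
class ToyEC2:
    """Hypothetical in-memory stand-in for the six announced calls."""
    volumes: Dict[str, Volume] = field(default_factory=dict)
    snapshots: Dict[str, str] = field(default_factory=dict)  # snapshot ID -> source volume ID
    _counter: int = 0

    def _next_id(self, prefix: str) -> str:
        self._counter += 1
        return f"{prefix}-{self._counter:04d}"

    def create_volume(self, size_gb: int) -> str:
        # Size limits taken from the article: 1 GB to 1 TB.
        if not MIN_SIZE_GB <= size_gb <= MAX_SIZE_GB:
            raise ValueError("volume size must be between 1 GB and 1 TB")
        volume_id = self._next_id("vol")
        self.volumes[volume_id] = Volume(volume_id, size_gb)
        return volume_id

    def attach_volume(self, volume_id: str, instance_id: str) -> None:
        volume = self.volumes[volume_id]
        # Assumption of this sketch: one instance per volume at a time.
        if volume.attached_to is not None:
            raise RuntimeError("volume is already attached")
        volume.attached_to = instance_id

    def detach_volume(self, volume_id: str) -> None:
        self.volumes[volume_id].attached_to = None

    def create_snapshot(self, volume_id: str) -> str:
        if volume_id not in self.volumes:
            raise KeyError(volume_id)
        snapshot_id = self._next_id("snap")
        self.snapshots[snapshot_id] = volume_id
        return snapshot_id

    def delete_snapshot(self, snapshot_id: str) -> None:
        del self.snapshots[snapshot_id]

    def delete_volume(self, volume_id: str) -> None:
        # Assumption of this sketch: detach before delete.
        if self.volumes[volume_id].attached_to is not None:
            raise RuntimeError("detach the volume before deleting it")
        del self.volumes[volume_id]


def lifecycle_demo() -> ToyEC2:
    """Create two 512 GB volumes, attach, snapshot, detach, delete."""
    ec2 = ToyEC2()
    ids = [ec2.create_volume(512) for _ in range(2)]
    for volume_id in ids:
        ec2.attach_volume(volume_id, "i-example")
    # Formatting and mounting would happen inside the instance at this point.
    ec2.create_snapshot(ids[0])
    for volume_id in ids:
        ec2.detach_volume(volume_id)
        ec2.delete_volume(volume_id)
    return ec2
```

In this sketch the snapshot record outlives the deleted volume, which models the idea that a snapshot stored in S3 can later be used for backup or volume re-creation. How the real service tracks that relationship is not described in the announcement.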
Amazon says that this new feature is already being used privately by a handful of customers, and will be publicly available later this year. You can find more information at infoq.com/amazon.
The last piece of the puzzle
It is an attractive technology
This is kewl.
1. Long running QA tests at scale
2. Pounders and other throughput testing
3. even production scaled out operation.
I know some guys who were building Terracotta in EC2 with S3 as the back end storage via changes in Terracotta's source code. Now, maybe they can skip that step.