InfoQ

Amazon upgrades EC2 with Persistent Storage

Posted by Dionysios Synodinos on Apr 22, 2008 11:30 AM

Community: Architecture
Topics: Cloud Computing, Virtualization, Data Access
Tags: Amazon, EC2

Amazon’s Elastic Compute Cloud (EC2), which delivers Hardware as a Service (HaaS), is adding persistent storage to its feature list. Werner Vogels, Amazon’s CTO, describes how storage volumes will attach to EC2 instances and how they handle failure and data consistency. He also explains how a snapshot of a volume can be stored in S3, Amazon’s storage service, as a backup.

Several earlier InfoQ articles described the technology and benefits behind the EC2 service:

Based on Cambridge University's Xen Virtualization, Amazon's Elastic Compute Cloud ("EC2") is a computing service which allows users to create, launch and terminate Linux-based server instances on demand. Each virtual machine instance is a virtual private server assigned an IP address via DHCP on start-up. Virtual Machine images, which Amazon calls Amazon Machine Images (AMIs), can be archived and transported much like VMware's virtual appliances, so a developer can set up an initial instance of the required software and rapidly deploy it to a number of virtual servers.

Amazon has acknowledged one of the main obstacles to EC2 adoption, the lack of persistent storage, and, as CTO Werner Vogels describes in his blog, is now adding it as a feature (currently in early access):

I would like to introduce to you the newest feature of Amazon EC2: Persistent local storage. This has been very high on the request list of EC2 customers and I believe that combined with the Availability Zones and Elastic IP Address features released earlier this month this makes EC2 the ideal environment for building highly scalable and reliable applications.

Persistent storage for Amazon EC2 will be offered in the form of storage volumes which you can mount into your EC2 instance as a raw block storage device. It basically looks like an unformatted hard disk. Once you have the volume mounted for the first time you can format it with any file system you want or if you have advanced applications such as high-end database engines, you could use it directly.

Vogels adds that users will be able to take snapshots of their storage volumes and store them in S3:

…we introduced snapshot functionality: you ask the EC2 to make a snapshot of your volume and store it into Amazon S3. You can use this for long term backup purposes, for use in rollback strategies, but also for (world-wide) volume re-creation purposes.

Thorsten from RightScale, a company that provides a platform for cloud computing server deployments on Amazon Web Services (AWS), had early access to the feature and blogs about his experience:

With the addition of the storage volumes with all the cool snapshot features it’s now a fait accomplit: the cloud adopters will have much more computing horsepower and flexibility at their fingertips than those who are still racking their own machines. It’s going to be like agile software development: if you want to survive as an internet/web service you will have to compute in the cloud or your competitors will leave you in the dust by being able to deploy faster, better, and cheaper.

Jeff Barr from Amazon describes the persistence functionality in a little more technical detail and emphasizes the on-demand nature of storage acquisition:

I spent some time experimenting with this new feature on Saturday. In a matter of minutes I was able to create a pair of 512 GB volumes, attach them to an EC2 instance, create file systems on them with mkfs, and then mount them. When I was done I simply unmounted, detached, and then finally deleted them.

Perhaps I am biased, but the ability to requisition this much storage on an as-needed basis seems pretty cool.
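The in-instance part of the workflow Barr describes can be sketched as a short command plan. The Python below is a minimal illustration and not Amazon's tooling: the function names, the device name /dev/sdh, the mount point /mnt/data and the ext3 file system are all assumptions chosen for the example. It only builds the command lines and does not run them. Attaching, detaching and deleting the volume happen through the EC2 service itself and are not shown.

```python
from typing import List


def prepare_commands(device: str, mount_point: str,
                     fs_type: str = "ext3",
                     first_use: bool = True) -> List[List[str]]:
    """Build the commands that make an attached volume usable.

    A new volume looks like an unformatted disk, so mkfs is included
    only on first use; running it again would wipe the existing data.
    """
    commands: List[List[str]] = []
    if first_use:
        commands.append(["mkfs", "-t", fs_type, device])
    commands.append(["mkdir", "-p", mount_point])
    commands.append(["mount", device, mount_point])
    return commands


def release_commands(mount_point: str) -> List[List[str]]:
    """Build the commands to run before the volume is detached."""
    return [["umount", mount_point]]


if __name__ == "__main__":
    # Hypothetical device and mount point, printed rather than executed.
    plan = prepare_commands("/dev/sdh", "/mnt/data") + release_commands("/mnt/data")
    for argv in plan:
        print(" ".join(argv))
```

Keeping the format step behind a first_use flag reflects the point made earlier in the article: the file system is created once, and later sessions only mount the volume again.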

Matt from Amazon’s EC2 team also talks about the new feature:

You will be able to create volumes ranging in size from 1 GB to 1 TB, and will be able to attach multiple volumes to a single instance. Volumes are designed for high throughput, low latency access from Amazon EC2, and can be attached to any running EC2 instance where they will show up as a device inside of the instance. This feature will make it even easier to run everything from relational databases to distributed file systems to Hadoop processing clusters using Amazon EC2. When persistent storage is launched, Amazon EC2 will be adding several new APIs to support the persistent storage feature. Included will be calls to manage your volume (CreateVolume, DeleteVolume), mount your volume to your instance (AttachVolume, DetachVolume) and save snapshots to Amazon S3 (CreateSnapshot, DeleteSnapshot).
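To make the announced lifecycle concrete, here is a small in-memory model in Python. It is a sketch under stated assumptions and does not talk to Amazon at all: the class ToyBlockStore, its method signatures, the identifier format and the instance id used below are hypothetical, and only the six operation names and the 1 GB to 1 TB size range come from the quote. The rules that a volume attaches to one instance at a time and must be detached before deletion are also assumptions made for the model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Size limits from the quote; treating 1 TB as 1024 GB is an assumption.
MIN_GB, MAX_GB = 1, 1024


@dataclass
class Volume:
    volume_id: str
    size_gb: int
    attached_to: Optional[str] = None  # instance id, or None when detached
    device: Optional[str] = None       # device name inside the instance


class ToyBlockStore:
    """In-memory stand-in whose methods mirror the announced API names."""

    def __init__(self) -> None:
        self.volumes: Dict[str, Volume] = {}
        self.snapshots: Dict[str, str] = {}  # snapshot id -> source volume id
        self._counter = 0

    def _next_id(self, prefix: str) -> str:
        self._counter += 1
        return f"{prefix}-{self._counter:04d}"

    def create_volume(self, size_gb: int) -> Volume:
        if not MIN_GB <= size_gb <= MAX_GB:
            raise ValueError(f"size must be {MIN_GB}..{MAX_GB} GB")
        vol = Volume(self._next_id("vol"), size_gb)
        self.volumes[vol.volume_id] = vol
        return vol

    def attach_volume(self, volume_id: str, instance_id: str, device: str) -> None:
        vol = self.volumes[volume_id]
        if vol.attached_to is not None:  # assumed rule: one instance at a time
            raise RuntimeError("volume is already attached")
        vol.attached_to, vol.device = instance_id, device

    def detach_volume(self, volume_id: str) -> None:
        vol = self.volumes[volume_id]
        vol.attached_to = None
        vol.device = None

    def delete_volume(self, volume_id: str) -> None:
        if self.volumes[volume_id].attached_to is not None:  # assumed rule
            raise RuntimeError("detach the volume before deleting it")
        del self.volumes[volume_id]

    def create_snapshot(self, volume_id: str) -> str:
        if volume_id not in self.volumes:
            raise KeyError(volume_id)
        snapshot_id = self._next_id("snap")
        self.snapshots[snapshot_id] = volume_id
        return snapshot_id

    def delete_snapshot(self, snapshot_id: str) -> None:
        del self.snapshots[snapshot_id]

    def volumes_on(self, instance_id: str) -> List[Volume]:
        return [v for v in self.volumes.values() if v.attached_to == instance_id]


if __name__ == "__main__":
    store = ToyBlockStore()
    first = store.create_volume(512)
    second = store.create_volume(512)
    store.attach_volume(first.volume_id, "i-demo", "/dev/sdh")
    store.attach_volume(second.volume_id, "i-demo", "/dev/sdi")
    print(len(store.volumes_on("i-demo")), "volumes attached to i-demo")
```

In this model snapshots are tracked separately from volumes, so a snapshot outlives the volume it was taken from. That matches the stated purpose of using snapshots for backup and volume re-creation, although how the real service stores them is not described here.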

Amazon says that the new feature is already being used privately by a handful of customers and will be publicly available later this year. You can find more information at infoq.com/amazon.

3 comments


    The last piece of the puzzle

    Apr 22, 2008 3:50 PM by Mike Burba

    Persistent storage was the last missing piece of the puzzle in EC2. Great job, AWS!



    --

    Mike Burba

    www.devhive.com


    It is an attractive technology

    Apr 22, 2008 10:11 PM by z q

    This is very exciting news. In the future, all data will be stored in the cloud, and our portable computers will not need a hard disk. They will just access a cloud node over the network.


    This is kewl.

    Apr 23, 2008 12:22 AM by ARI ZILKA

    I was hoping that AWS would deliver this feature for quite some time. Imagine now a Terracotta-based cluster that dynamically grows and shrinks, can be paused, etc. out in the cloud. Great for:

    1. Long running QA tests at scale
    2. Pounders and other throughput testing
    3. even production scaled out operation.

    I know some guys who were building Terracotta in EC2 with S3 as the back end storage via changes in Terracotta's source code. Now, maybe they can skip that step.

    AWESOME.

    --Ari
