Hosting a Web Site on Amazon's EC2

by Charles Humble on Feb 26, 2008

Based on Cambridge University's Xen virtualization, Amazon's Elastic Compute Cloud ("EC2") is a computing service that allows users to create, launch, and terminate Linux-based server instances on demand. Each virtual machine instance is a virtual private server assigned an IP address via DHCP on start-up. Virtual machine images, which Amazon calls Amazon Machine Images (AMIs), can be archived and transported much like VMware's virtual appliances, so a developer can set up an initial instance of the required software and rapidly deploy it to a number of virtual servers.

A previous InfoQ article looked at the appeal of the service for development teams, such as Oracle's Coherence team, that need large amounts of computing power for short periods of time. The flexibility of the service also makes it attractive to web-based start-ups: there are no up-front costs since you don't need to buy expensive hardware, the running costs are generally quite low, and you can install any software you want on your Linux instances. The service can also be readily adapted to changing traffic patterns by starting and stopping additional instances as required. Finally, the service is backed by a well-known name in Amazon, which has a track record of delivering highly scalable, robust web infrastructure. That said, the lack of any SLA (Service Level Agreement) constitutes a significant barrier to adoption, with some businesses reluctant to entrust data or critical services to it.

There are also practical problems to overcome. For example, because the virtual servers obtain their addresses via DHCP, the IP address changes each time a server is started. As a consequence, following an outage a web site would need to update its DNS entries, a process which can take up to ninety-six hours to complete. To work around this, Amazon recommends using a dynamic DNS solution such as DynDNS, and in a recent blog article Codesta's Oliver Chan provides details on how to set up DynDNS for EC2.
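The dynamic DNS workaround can be sketched in a few lines of Python. This is a minimal illustration and not the procedure from the Codesta post: it assumes the instance can read its public address from the EC2 metadata endpoint without extra authentication, and that the DNS provider accepts the DynDNS-style update request at `members.dyndns.org/nic/update`. The function names, the User-Agent string, and the example hostname are hypothetical.

```python
import base64
import urllib.parse
import urllib.request

# Assumption: link-local metadata endpoint an EC2 instance can query for its own public IP.
METADATA_URL = "http://169.254.169.254/latest/meta-data/public-ipv4"
# Assumption: update endpoint of a DynDNS-style (dyndns2) provider.
DYNDNS_UPDATE_URL = "https://members.dyndns.org/nic/update"


def build_update_url(hostname: str, ip: str) -> str:
    """Return the update URL that asks the provider to point hostname at ip."""
    query = urllib.parse.urlencode({"hostname": hostname, "myip": ip})
    return f"{DYNDNS_UPDATE_URL}?{query}"


def current_public_ip(timeout: float = 2.0) -> str:
    """Ask the instance metadata service for this instance's public IPv4 address."""
    with urllib.request.urlopen(METADATA_URL, timeout=timeout) as resp:
        return resp.read().decode("ascii").strip()


def update_dns(hostname: str, username: str, password: str) -> str:
    """Send one update request using HTTP basic auth and return the provider's reply text."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        build_update_url(hostname, current_public_ip()),
        headers={
            "Authorization": f"Basic {token}",
            "User-Agent": "ec2-dyndns-sketch/0.1",  # hypothetical client name
        },
    )
    with urllib.request.urlopen(request, timeout=10) as resp:
        return resp.read().decode("ascii").strip()
```

Calling something like `update_dns("www.example.com", user, password)` from a boot script would re-point the hostname each time the instance starts. A maintained client such as ddclient is likely a better choice in practice.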

The same blog provides some other useful hints for developers considering the EC2 service:

  1. "Before spending too much time configuring and customizing an AMI, find one that suits your needs from the start so you won't have to redo any work later on down the road. Check out the list of public AMIs in Amazon’s resource center for something that is more suitable for your needs"
  2. "When packaging up your own image using the ‘ec2-bundle-vol’ command, make sure you specify a clean folder using the '-d' flag otherwise bundling the same image twice will result in an error due to the conflicting sets of temporary files."
  3. "When working with your image, note that the main drive/partition (where the system files are) has a very limited capacity (10 GB in our case). So when dealing with large files/directories use ‘/mnt’ as it has over 100 GB."
  4. "If a machine is terminated, all your data will be lost except for what was backed up from the last time you ran an 'ec2-bundle-vol'"
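Hints 2 and 3 can be combined into a small helper that always hands `ec2-bundle-vol` an empty directory under `/mnt`. The following is a sketch under stated assumptions rather than a recommended workflow: the helper names, the directory name `bundle`, and the key, certificate, and account values are hypothetical, and only the `-d`, `-k`, `-c`, and `-u` options of `ec2-bundle-vol` are used.

```python
import shutil
from pathlib import Path
from typing import List


def fresh_bundle_dir(parent: str, name: str = "bundle") -> Path:
    """Create an empty directory under parent, wiping output left by an earlier bundling run."""
    target = Path(parent) / name
    if target.exists():
        shutil.rmtree(target)  # stale temporary files are what cause the second run to fail
    target.mkdir(parents=True)
    return target


def bundle_command(dest: Path, private_key: str, cert: str, account_id: str) -> List[str]:
    """Build the argument list for ec2-bundle-vol; the caller decides when to run it."""
    return [
        "ec2-bundle-vol",
        "-d", str(dest),     # clean destination directory (hint 2)
        "-k", private_key,   # path to the private key file
        "-c", cert,          # path to the X.509 certificate file
        "-u", account_id,    # AWS account number
    ]
```

On an instance this might be used as `subprocess.run(bundle_command(fresh_bundle_dir("/mnt"), "pk.pem", "cert.pem", "123456789012"), check=True)`, which keeps the large image files on `/mnt` instead of the small root partition.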

As EC2 continues to gain momentum, open source tools and libraries are emerging to make life even easier for developers using the service. One such project, building on Chris Richardson's EC2Deploy, is Cloud Tools, which comprises:

  • The EC2Deploy framework - a Groovy-based framework for deploying Java EE applications to Amazon EC2.
  • AMIs that are configured to run Tomcat and work with EC2Deploy.
  • A Maven plugin that uses EC2Deploy to deploy a web application to EC2.

Cloud Tools is still very much under development but it provides a means for developers to get up and running on EC2 in a matter of minutes.

Relational Database by Corby Page

Do I remember correctly that the EC2 service is not useful if you need a persistent database? When your virtual server goes down, the database is gone?

Re: Relational Database by Charles Humble

Yes. There are a couple of common workarounds that people seem to be using. One is to use the Amazon SimpleDB service to act as the DB. The other is to use Linux volume manager snapshots to back up the DB to Amazon S3.

Re: Relational Database by Corby Page

Ah, I guess my question was answered in point 4 above, thanks.

Cloud Tools look really cool; will investigate.

Another way to handle DNS updating by Jon Chase

I use EC2 quite a bit on my site (sendalong.com). I do use DynDNS, but to even better handle the issue of servers going up and down quickly, I run one webapp on a "normal" host. This webapp keeps a registry of all EC2 instances that are available and routes users as needed - this way, even if DNS is not updated, it can fall back to routing using a very ugly and long EC2 url.

Also, I've found a general good practice if possible (like in the setup above) is to use unique cnames like www1.sendalong.com, www2.sendalong.com, etc. for EC2 instances. This means that if www1 goes down you can immediately spawn an instance and map it to www3, or www4, etc. - and since that cname hasn't been used before, the DNS update will happen very quickly (if you used www1 again with a new IP, the DNS update would take a while to propagate).

Jon Chase
www.sendalong.com - Send large files to anyone
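The registry and fresh-cname idea described in the comment above might look roughly like the sketch below. It is an illustration only, not the code behind sendalong.com: the class name, method names, and example hostnames are hypothetical, and the registry lives in memory, whereas a real one would need persistent storage on the separately hosted webapp.

```python
from typing import Dict, List


class InstanceRegistry:
    """Tracks live EC2 instances and never reuses a cname, so cached DNS records cannot go stale."""

    def __init__(self, domain: str, prefix: str = "www") -> None:
        self.domain = domain
        self.prefix = prefix
        self._next = 1                    # numeric suffix for the next unused cname
        self._live: Dict[str, str] = {}   # cname -> EC2 public hostname

    def register(self, ec2_hostname: str) -> str:
        """Assign the next never-used cname to a newly started instance."""
        cname = f"{self.prefix}{self._next}.{self.domain}"
        self._next += 1
        self._live[cname] = ec2_hostname
        return cname

    def retire(self, cname: str) -> None:
        """Forget an instance that has gone down; its cname is not handed out again."""
        self._live.pop(cname, None)

    def targets(self, use_cname: bool = True) -> List[str]:
        """URLs for live instances; with use_cname=False, fall back to the raw EC2 hostnames."""
        return [f"http://{c if use_cname else h}/" for c, h in self._live.items()]
```

For example, after `www1` is retired, the next `register(...)` call returns `www3.example.com` if `www2` is already taken, and `targets(use_cname=False)` gives the long EC2 URLs to route to while DNS catches up.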

Re: Relational Database by Brian Edwards

They actually have another service called Amazon SimpleDB which acts as a database via a web service. Not exactly what some might want but they will sell it to you! Amazon has a really good developer forums section for all its products, which is a great resource if you want to develop using their APIs.

Thanks! by Dave Rooney

Thanks for the heads-up on this! I have an application right now that will very likely require the variable scaling that EC2 can provide, and it looks like it will do it at a fraction of the cost of other more traditional solutions.

Dave Rooney
Mayford Technologies

Multisource by Dmitriy Samovskiy

Lack of persistent disk out of the box and IP addresses that are not preserved between reboots are indeed the 2 primary issues that make hosting public web sites on Amazon EC2 "tricky" (even though not impossible).

Instead of addressing the problem head on, have you considered multisourcing your infrastructure? Let Amazon EC2 host your highly scalable computationally heavy logic, but host database layer and possibly thin static IP layer somewhere else. I described our multisourcing technique called VcubeV in my article in Feb issue of Linux Journal - www.linuxjournal.com/article/9915 (available to non-subscribers in March 08), additional info can be found at elasticserver.blogspot.com/2008/01/introducing-...

With database writes way less frequent than reads and availability of excellent caching (memcached, for example), this is quite doable.

- Dmitriy
www.cohesiveft.com/elastic/

Re: Relational Database by Tharwat Abdul-Malik

Take a look at the offering from Elastra (www.elastra.com)

They allow standard RDBMSs (MySQL, etc.) to be deployed on EC2 instances and use S3 to store the data (not as a backup solution, but in real time).
