Cloud Architectures Are Not Fully Thought Out Yet

by Abel Avram on Feb 18, 2009

While there are many mature software patterns for applications, the same cannot be said of cloud architectures. Each vendor employs its own solution, which will most likely change and improve. The technology is not yet mature enough for a clear set of patterns to emerge, but the first working examples are out there.

Amazon suggests using their cloud for the following tasks:

Processing Pipelines

  • Document processing pipelines – convert hundreds of thousands of documents from Microsoft Word to PDF, OCR millions of pages/images into raw searchable text
  • Image processing pipelines – create thumbnails or low resolution variants of an image, resize millions of images
  • Video transcoding pipelines – transcode AVI to MPEG movies
  • Indexing – create an index of web crawl data
  • Data mining – perform search over millions of records

Batch Processing Systems

  • Back-office applications (in financial, insurance or retail sectors)
  • Log analysis – analyze and generate daily/weekly reports
  • Nightly builds – run automated builds of the source code repository every night, in parallel
  • Automated Unit Testing and Deployment Testing – deploy and run automated tests (unit, functional, load, quality) on different deployment configurations every night

Websites

  • Websites that “sleep” at night and auto-scale during the day
  • Instant Websites – websites for conferences or events (Super Bowl, sports tournaments)
  • Promotion websites
  • Seasonal websites – websites that only run during the tax season or the holiday season (“Black Friday” or Christmas)

An example of a cloud architecture is Amazon’s GrepTheWeb:

[Figure: GrepTheWeb architecture, overview]

After zooming in, the architecture looks like this:

[Figure: GrepTheWeb architecture, zoomed-in view]

Jinesh Varia, a Web Services Evangelist at Amazon, explained GrepTheWeb in detail through a presentation published by InfoQ.

Todd Hoff compiled a list of basic components employed by SmugMug in their cloud architecture, which is also built on Amazon EC2:

  • Work Initiators - Work comes in from your website and/or other software subsystems and is queued up for processing in the Queuing Service. Work doesn't have to be large requests either. Work can be small independent parts of an overall pipeline. Don't keep state in the Workers. Bundle what you need done into a work request and shoot it back into the Queuing Service for processing.
  • Provisioning Service - This is Amazon's infrastructure that allows instances to be automatically scaled up and down in relation to the work load. This will be the major difference between your VPS or typical datacenter setup. There's an API for starting and stopping AMIs and mechanisms for automatically configuring and running VMs.
  • Workers - These are the guys that continually pull work off queues and do something interesting with it. For SmugMug the results are stored on S3 but the results could be put in your own database, SimpleDB or whatever.
  • Queuing Service - This is where work is queued for consumption by the workers. SmugMug built their own queuing service, but you could just as easily use Amazon's own SQS. Creating a scalable, distributed, performant, highly available queue service is not easy, so you may want to take a look at a number of different queue product suggestions in "Flickr - Do the Essential Work Up-front and Queue the Rest".
  • Controller - This component monitors many variables related to the work flow and decides how many instances of EC2 are necessary based on optimizing a small set of goals. Instances are added and removed as needed.
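
The interplay of these components can be sketched in a few lines of Python. This is only an illustrative, single-process model, not SmugMug's or Amazon's code: an in-memory `queue.Queue` stands in for the Queuing Service (SQS or a custom one), threads stand in for EC2 worker instances, a dictionary stands in for S3, and the scaling rule in `desired_workers` (one worker per ten queued requests, bounded between 1 and 8) is an assumption made up for the example. All function names are hypothetical.

```python
import math
import queue
import threading


def make_thumbnail_request(image_id):
    """Work initiator: bundle everything a worker needs into one request."""
    return {"task": "thumbnail", "image_id": image_id, "size": 128}


def worker(work_queue, results, lock):
    """Stateless worker: pull a request, process it, store the result."""
    while True:
        request = work_queue.get()
        if request is None:  # sentinel: the controller is retiring this worker
            work_queue.task_done()
            return
        # Stand-in for real work such as resizing an image.
        output = f"{request['image_id']}_{request['size']}px"
        with lock:
            results[request["image_id"]] = output  # stand-in for S3 or a database
        work_queue.task_done()


def desired_workers(backlog, per_worker=10, min_workers=1, max_workers=8):
    """Controller rule (assumed): size the pool to the backlog, within bounds."""
    needed = math.ceil(backlog / per_worker)
    return max(min_workers, min(max_workers, needed))


def run(image_ids):
    """Queue the work, start as many workers as the controller asks for, wait."""
    work_queue = queue.Queue()
    results, lock = {}, threading.Lock()
    for image_id in image_ids:
        work_queue.put(make_thumbnail_request(image_id))

    count = desired_workers(work_queue.qsize())
    threads = [
        threading.Thread(target=worker, args=(work_queue, results, lock))
        for _ in range(count)
    ]
    for thread in threads:
        thread.start()

    work_queue.join()  # block until every queued request has been processed
    for _ in threads:
        work_queue.put(None)  # scale down: one sentinel per worker
    for thread in threads:
        thread.join()
    return count, results
```

Calling `run([f"img{i}" for i in range(25)])` starts three workers under the assumed rule and returns one result per image. Because the workers hold no state between requests, any of them can be stopped or replaced without losing work, which is what lets the controller add and remove instances freely.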

Each vendor has its own solution, and different ones are expected to emerge in the future. Cloud architectures have not been fully explored, but their patterns are slowly and steadily being worked out.

Not Entirely In Agreement by Dan Creswell

I'd agree that there are plenty of patterns still to come but a number of relevant patterns have been developed previously atop conventional message queuing systems and in Jini/JavaSpaces systems for example.
