Posted by Gavin Terrill on Jul 24, 2007
S3 is used to store the files for conversion:
"Amazon S3 is the perfect place to store the video files to be converted as well as any output files generated by our conversion service. In addition to being fast and reliable, we will never have to worry about our service running out of disk space."
To make the service scalable and highly available, the design of the service is message-driven, utilizing SQS's reliable message delivery. This ensures that execution of client requests happens in the order they are received.
The ConvertVideo service is written in Python and utilizes the boto library, which provides a set of classes for integrating with Amazon Web Services. To provision the service to EC2, an AMI (Amazon Machine Image) file needs to be created and registered so that instances may be created on demand.
On the client side, the boto library provides a command-line interface that can upload a directory of files to an S3 bucket, posting a message to an SQS queue for each file. Once the files have been uploaded, a service instance can be started to process the messages in the queue.
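The upload-then-process flow described above can be sketched without touching AWS at all. The example below is a minimal illustration, not the tutorial's code: a plain dict stands in for the S3 bucket, a `deque` stands in for the SQS queue, and `submit`, `convert`, and `run_worker` are hypothetical names chosen for this sketch.

```python
from collections import deque

# In-memory stand-ins (assumptions): a dict plays the S3 bucket,
# a deque plays the SQS queue. No AWS calls are made.
bucket = {}
queue = deque()


def submit(files):
    """Client side: store each file, then post one message naming its key."""
    for name, data in files.items():
        bucket[name] = data
        queue.append({"input_key": name})


def convert(data):
    """Hypothetical placeholder for the real video conversion step."""
    return data.upper()


def run_worker():
    """Service side: drain the queue, writing each converted file back to the bucket."""
    processed = []
    while queue:
        message = queue.popleft()
        key = message["input_key"]
        output_key = key + ".converted"
        bucket[output_key] = convert(bucket[key])
        processed.append(output_key)
    return processed


submit({"a.avi": "frames-a", "b.avi": "frames-b"})
outputs = run_worker()
print(outputs)
```

Because the client and the worker share only the bucket and the queue, more workers can be pointed at the same queue without changing the client, which is the property the design relies on for scaling.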
To test scalability, an initial conversion run is performed on 50 videos by 1 instance:
The next conversion run is based on 500 videos, and 10 instances:
The additional service instances have increased throughput in a linear and predictable manner:
Sure enough, the average processing time and elapsed time are almost exactly the same but our overall throughput is roughly 10 times higher than in our previous example which is exactly the sort of behavior we would expect and hope for.
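The arithmetic behind that tenfold claim is simple enough to check. In the sketch below, the elapsed time is an assumed placeholder rather than a figure from the tutorial; since both runs take about the same time, the ratio does not depend on the value chosen.

```python
def throughput(videos, elapsed_minutes):
    """Videos converted per minute of wall-clock time."""
    return videos / elapsed_minutes


# Assumed, illustrative elapsed time; both runs are described as taking
# almost exactly the same time, so the same value is used for each.
elapsed_minutes = 20.0

single_instance = throughput(50, elapsed_minutes)   # 1 instance, 50 videos
ten_instances = throughput(500, elapsed_minutes)    # 10 instances, 500 videos

ratio = ten_instances / single_instance
print(ratio)
```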
The tutorial breaks down the cost of converting the 500 videos:
| Item | Quantity | Cost |
| --- | --- | --- |
| Storage | 2.5 GBytes | $0.38/Month |
| Transfer | 2.5 GBytes | $0.50 |
| Messages | 1000 | $0.10 |
| Compute Resources | 8 Instances for ~ 20 minutes | $0.80 |
| Total: | | $1.78 |
A total of about $1.78 for converting 500 videos means a per-video cost of less than $0.004.
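The total and the per-video figure can be reproduced from the line items. This is a small check using the amounts quoted above; it treats the monthly storage charge as a one-off cost, as the quoted total does, and the dictionary keys are labels chosen for this sketch.

```python
from decimal import Decimal

# Line items as quoted in the cost breakdown (US dollars).
# Storage is billed per month; it is counted once here, matching the quoted total.
costs = {
    "storage": Decimal("0.38"),
    "transfer": Decimal("0.50"),
    "messages": Decimal("0.10"),
    "compute": Decimal("0.80"),
}

videos = 500
total = sum(costs.values(), Decimal("0"))
per_video = total / videos

print(total, per_video)
```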
Compute services such as file conversion seem a good fit for the AWS infrastructure. However, questions have been raised about the utility of the platform without a database. Dare Obasanjo, in his blog post "Amazon EC2 + S3 doesn't cut it", laments the lack of a database while experimenting with a Facebook application:
"it seems supporting this fairly straightforward application is beyond the current capabilities of EC2 + S3. S3 is primarily geared towards file storage so although it makes a good choice for cheaply hosting images and CSS stylesheets, it's a not a good choice for storing relational or structured data."
Of course, Amazon has deep experience in scaling out services. In his summary of the Google Seattle Scalability Conference, Robin Harris remarks on Amazon CTO Werner Vogels's memorable line: "Databases are Dinosaurs". Perhaps Dynamo, Amazon's scalable data store, which is due to be presented at SOSP 2007, is the missing piece of the AWS puzzle.