Posted by Obie Fernandez on Nov 29, 2006 01:18 PM
Yesterday, Marcel Molina Jr. of 37signals (and a member of the Rails core team) announced the initial release of AWS::S3, a Ruby library for the REST API of Amazon's Simple Storage Service (S3). Marcel was kind enough to share with InfoQ the motivations and history behind his promising new library. We think his answers shed light on how Amazon's web services are transforming the industry.
Even though AWS::S3 is currently just an initial release, you informed the community that "we are using it in production at 37signals." Why did 37signals decide to go the S3 route?
The main motivation is that the file servers we currently have in our cluster are maxed out on the number of disks they can hold, and we are running out of space on those disks. It was time to either add more file servers (which starts to get really expensive) or come up with something else. The technical barriers to switching parts of our infrastructure over to S3 were low enough that it made sense.
How are you using the S3 service now?
As for how we are using it, currently all uploads for Campfire are served up from S3. We are working on doing something similar with the other products. The approach is basically async migration, which isn't that disruptive to introduce into the system. Files are uploaded as they always were, but there is a service that comes in every once in a while, migrates them over to S3, and then marks them as migrated. When you make a request for a file, we determine whether it should be served off S3 yet. If so, you get it off S3; if not, it comes from the local filesystem. The same service that migrates the data to S3 also purges files from our file server once they have been migrated. How much file storage we need then becomes the amount of data that might be waiting to be migrated between runs of the service (it runs very frequently). So the cluster can't go diskless, but our data storage needs are almost nil.
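The migration flow described above can be sketched roughly as follows. This is only an illustration, not 37signals' actual code: `RemoteStore`, `Upload`, `Migrator`, and `read_upload` are hypothetical names, and the in-memory `RemoteStore` stands in for S3 so the sketch runs without credentials.

```ruby
require 'fileutils'

# Hypothetical stand-in for S3; a real service would call an S3 client here.
class RemoteStore
  def initialize
    @objects = {}
  end

  def put(key, data)
    @objects[key] = data
  end

  def get(key)
    @objects.fetch(key)
  end
end

# Hypothetical upload record; `migrated` flips once the remote copy exists.
Upload = Struct.new(:key, :local_path, :migrated)

# One pass of the periodic service: copy to the remote store,
# mark the record as migrated, then purge the local file.
class Migrator
  def initialize(remote)
    @remote = remote
  end

  def run(uploads)
    uploads.reject(&:migrated).each do |upload|
      @remote.put(upload.key, File.binread(upload.local_path))
      upload.migrated = true
      FileUtils.rm_f(upload.local_path)
    end
  end
end

# Request path: read from the remote store only if the record says
# the file has been migrated; otherwise read the local copy.
def read_upload(upload, remote)
  upload.migrated ? remote.get(upload.key) : File.binread(upload.local_path)
end
```

Because the local file is deleted only after the record is marked as migrated, a request arriving at any point still finds the file in one of the two places.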
Did Jeff Bezos' involvement have anything to do with 37signals' adoption of S3, or was it just a smart decision?
More than anything it was just a smart decision. Sure, Jeff Bezos thinks S3 is a good idea, of course, and the fact that he's building a whole business around services like S3 is a testament to where he thinks the industry is going, but he doesn't mandate our technology or business decisions. We decided that, from a technical as well as a business perspective, it was the solution to our file storage issues. Frankly, I don't think Jeff even knows we are using S3 at this point.
What are some of the challenges you have faced? Is performance an issue?
As for performance, the bottleneck (from profiling the code) is XML parsing. By default the library uses XmlSimple, which wraps REXML. This is nice from a portability and ease of installation point of view, but not ideal from a performance point of view. But if you have libxml installed (the Ruby bindings to the GNOME XML library), aws/s3 will automatically use libxml instead of REXML for the parsing, which makes things an order of magnitude faster and more efficient. Having said that, most people won't be using S3 in a way that would lead to them needing to parse huge amounts of XML. For everyday use even the REXML version is fast enough. You only start running into performance problems when you, for example, ask for the list of all the objects in a bucket that contains hundreds of objects. You likely wouldn't be doing that very often, especially since you can limit the number of objects returned from a bucket by various filtering criteria. Aside from the XML parsing issue, I find performance to be fine for my needs. Now that the library has been released, I'll likely find out from people using it if there are parts that really must be made more performant.
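To make the filtering criteria concrete: at the REST level, a bucket listing accepts the query parameters `prefix`, `marker`, `max-keys`, and `delimiter`, and a smaller listing means less XML to parse. The sketch below only builds such a request path. `bucket_listing_path` is a hypothetical helper, not part of AWS::S3, and it assumes path-style bucket addressing.

```ruby
require 'uri'

# Ruby-style option names mapped to the S3 REST listing parameters.
LIST_PARAMS = {
  prefix: 'prefix',       # only keys that start with this string
  marker: 'marker',       # start listing after this key
  max_keys: 'max-keys',   # cap on the number of keys returned
  delimiter: 'delimiter'  # roll up keys that share a prefix up to this character
}.freeze

# Hypothetical helper: builds the path and query string for a listing request.
def bucket_listing_path(bucket, options = {})
  unknown = options.keys - LIST_PARAMS.keys
  raise ArgumentError, "unknown options: #{unknown.inspect}" unless unknown.empty?

  query = options.map { |name, value| [LIST_PARAMS.fetch(name), value.to_s] }
  path = "/#{bucket}"
  query.empty? ? path : "#{path}?#{URI.encode_www_form(query)}"
end
```

For example, `bucket_listing_path('uploads', prefix: 'campfire/', max_keys: 100)` asks for at most 100 keys under one prefix rather than the whole bucket.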
If you were asking how S3 (on the server side) performs, then my answer would be "just fine", though I'm not pushing it to its limits.
What's special about your S3 library compared to other options available in Ruby? Is your library easier to work with?
As for "easy to work with", that's more of a priority for me than most anything else. I've paid very close attention to the API, incrementally molding it to be as close as possible to how I'd like the interface to S3 to be. There is of course still work to be done on this (always), but I'm pleased with where things are now as far as ease of use. I want the library to be a joy to use. That's why I use Ruby.