Posted by Obie Fernandez on Nov 29, 2006 01:18 PM
Yesterday, Marcel Molina Jr. of 37signals (and member of the Rails core team) announced the initial release of AWS::S3, a Ruby library for the REST API of Amazon's Simple Storage Service (S3). Marcel was kind enough to share with InfoQ the motivations and history behind his promising new library. We think his answers shed light on how Amazon's web services are transforming the industry.
Even though AWS::S3 is currently just an initial release, you informed the community that "we are using it in production at 37signals." Why did 37signals decide to go the S3 route?
The main motivation is that the file servers we currently have in our cluster are maxed out on the number of disks they can hold, and we are running out of space on those disks. So it was time to either add more file servers (which starts to get really expensive) or come up with something else. The technical barrier to switching parts of our infrastructure over to S3 was low enough that it made sense.
How are you using the S3 service now?
As for how we are using it, currently all uploads for Campfire are served up from S3. We are working on doing something similar with the other products. The approach is basically async migration, which isn't that disruptive to introduce into the system. Files are uploaded as they always were, but there is a service that comes in every once in a while, migrates them over to S3, and then marks them as migrated. When you make a request for a file, we determine if it should be served off S3 yet. If so, you get it off S3; if not, it comes from the local filesystem. The same service that migrates the data to S3 also purges files from our file server that have been migrated. How much file storage we need then becomes the amount of data that might be waiting to be migrated between runs of the service (it runs very frequently). So the cluster can't go diskless, but our data storage needs are almost nil.
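The migration flow described above can be sketched roughly as follows. This is a minimal illustration, not 37signals' actual code: `FakeRemoteStore` is an in-memory stand-in for S3, and `Upload`, `UploadStore`, `migrate_pending`, and `local_bytes` are hypothetical names invented for the example.

```ruby
# In-memory stand-in for S3 (assumption: a real system would call an
# S3 client here instead of writing to a Hash).
class FakeRemoteStore
  def initialize
    @objects = {}
  end

  def put(key, data)
    @objects[key] = data
  end

  def get(key)
    @objects.fetch(key)
  end
end

# Hypothetical upload record: the local copy plus a "migrated" flag.
Upload = Struct.new(:key, :local_data, :migrated)

class UploadStore
  def initialize(remote)
    @remote = remote
    @uploads = {}
  end

  # Uploads land on local storage first, exactly as before the migration.
  def upload(key, data)
    @uploads[key] = Upload.new(key, data, false)
  end

  # Periodic job: copy pending files to the remote store, mark them as
  # migrated, then purge the local copy. Returns the number migrated.
  def migrate_pending
    pending = @uploads.values.reject(&:migrated)
    pending.each do |upload|
      @remote.put(upload.key, upload.local_data)
      upload.migrated = true
      upload.local_data = nil
    end
    pending.size
  end

  # Serve from the remote store once migrated, otherwise from local storage.
  def read(key)
    upload = @uploads.fetch(key)
    upload.migrated ? @remote.get(key) : upload.local_data
  end

  # Local storage in use: only files still waiting for the next run.
  def local_bytes
    @uploads.values.sum { |u| u.local_data ? u.local_data.bytesize : 0 }
  end
end
```

In this sketch, `local_bytes` drops back to zero after each call to `migrate_pending`, which mirrors the point that local storage only has to hold whatever accumulates between runs of the service.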
Did Jeff Bezos' involvement have anything to do with 37signals' adoption of S3, or was it just a smart decision?
More than anything it was just a smart decision. Sure, Jeff Bezos thinks S3 is a good idea, of course, and the fact that he's building a whole business around services like S3 is a testament to where he thinks the industry is going, but he doesn't mandate our technology or business decisions. We decided that, from a technical as well as a business perspective, it was the solution to our file storage issues. Frankly, I don't think Jeff even knows we are using S3 at this point.
What are some of the challenges you have faced? Is performance an issue?
As for performance, the bottleneck (from profiling the code) is XML parsing. By default it uses XmlSimple, which wraps REXML. This is nice from a portability and ease of installation point of view, but not ideal from a performance point of view. But if you have libxml installed (the Ruby bindings to the GNOME XML library), aws/s3 will automatically use libxml instead of REXML for the parsing, which makes things an order of magnitude faster and more efficient. Having said that, most people won't be using S3 in a way that would lead to them needing to parse huge amounts of XML. For everyday use even the REXML version is fast enough. You only start running into performance problems when you, for example, ask for the list of all the objects in a bucket that contains hundreds of objects. You likely wouldn't be doing that very often, especially since you can limit the number of objects returned in a bucket by various filtering criteria. Aside from the XML parsing issue, I find performance to be fine for my needs. Now that the library has been released, I'll likely find out from people using it if there are parts that really must be made more performant.
If you were asking how S3 (on the server side) performs, then my answer would be "just fine", though I'm not pushing it to its limits.
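The automatic parser selection mentioned above follows a common Ruby pattern: try to load the faster library and fall back to the portable one if it is missing. The sketch below shows that general pattern only; it is not taken from the aws/s3 source. `first_available` is a hypothetical helper, and the require names `libxml` and `rexml/document` are assumptions about how the two libraries are loaded.

```ruby
# Try each candidate library in order and return the name of the first
# one that loads, or nil if none of them are installed.
def first_available(candidates)
  candidates.find do |lib|
    begin
      require lib
      true
    rescue LoadError
      false
    end
  end
end

# Prefer the faster native parser and fall back to the pure-Ruby one.
# Assumed require names: "libxml" (libxml-ruby) and "rexml/document".
XML_BACKEND = first_available(%w[libxml rexml/document])
```

Because a missing library only raises `LoadError`, the fallback costs nothing for users who have not installed the native bindings, which keeps installation simple while letting heavier users opt into the faster parser.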
What's special about your S3 library compared to other options available in Ruby? Is your library easier to work with?
As for "easy to work with", that's more of a priority for me than almost anything else. I've paid very close attention to the API, incrementally molding it to be as close as possible to how I'd like the interface to S3 to be. There is of course still work to be done on this (always), but I'm pleased with where things are now as far as ease of use. I want the library to be a joy to use. That's why I use Ruby.