Adam Wiggins on Heroku


1. My name is Ryan Slobojan, I am here with Adam Wiggins, co-founder of Heroku. Adam, can you tell us a little bit about Heroku?

Heroku is a cloud application platform. If you have a web app that is written in Ruby or Ruby on Rails, you can deploy it to us using our automated system, and we run, maintain and, most importantly, scale it for you.

   

2. So which version of Ruby are you using as the engine for Heroku?

On our servers, for users' applications, we have 1.8.6 because we still think that's basically the gold standard and the conservative choice. Obviously we have our eye on all the up-and-coming ones, and I personally use Ruby 1.9 on my workstation and am experimenting with the other ones as well. But our primary focus is to provide a really stable and solid platform rather than the latest technology. So we are looking forward to introducing some of the new VMs as an option in the future.

   

3. And what about from the Rails perspective?

I think we have got all the latest Rails gems installed; whenever a new one comes out we just install it. One of the things that we really emphasized in our stack right from the beginning is making sure that it uses really standard components in terms of what the user code interacts with. So Rack, the gems that you are relying on, the Ruby VM: we are not modifying any of that stuff, and we are not using anything that is different from what you would install out of the box. We are very standards oriented, and that way you have no lock-in. Loading your application is very easy, because if you had it working in a standard environment it'll work here, and likewise if you want to export your application later there are no barriers to doing that. And of course, outside of what the user code interacts with we have a lot more opportunity to build magic components, and in there we use more interesting technologies, but for what faces the user we try to keep it a really solid, conservative and standard stack.

   

4. When Rails 3 is released, do you intend to do a full world upgrade from Rails 2 to Rails 3, or do you plan to run the two in parallel?

Definitely in parallel; that's a choice that each user's application should be making. Luckily the versioning in RubyGems makes that really easy, so we will install the gem on our servers the day it comes out. You can actually use it today by vendoring it into your application and pushing it up. It's better to use the system gems, just because your application is more lightweight and we can distribute it across our cluster of dyno servers more quickly, and we will offer that gem as soon as it comes out. But you will always have complete flexibility to run any version of Rails that you want, or in fact any framework, so you can use Sinatra or Merb or anything else, because we are fully Rack compatible.
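To illustrate what "Rack compatible" means in practice, here is a minimal sketch of a Rack-style application. It relies only on the Rack calling convention: an object that responds to `call(env)` and returns a status, a headers hash and a body that responds to `each`. The class name `HelloApp` is made up for this example and is not taken from the interview.

```ruby
# A minimal Rack-style application. Any framework that ultimately
# exposes an object like this can sit behind a Rack-compatible server.
class HelloApp
  def call(env)
    path = env["PATH_INFO"] || "/"
    body = "Hello from #{path}\n"
    headers = {
      "Content-Type"   => "text/plain",
      "Content-Length" => body.bytesize.to_s
    }
    [200, headers, [body]]
  end
end

# In a config.ru file this app would be mounted with: run HelloApp.new
status, _headers, body = HelloApp.new.call("PATH_INFO" => "/dynos")
puts status
puts body.join
```

Rails, Sinatra and Merb each present this same interface to the server, which is why the platform does not need to know which framework an application uses.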

   

5. What are the important things that you are waiting to see happen with Ruby 1.9 before you migrate from your existing 1.8 to that?

In my own use Ruby 1.9 is already totally solid. There are minor incompatibilities here and there, but I found them very easy to clean up. I think it's really a matter of customer demand; people are loading applications that for the most part work on either version. And our dyno system makes scaling up very easy, so speed isn't a huge concern. Speed, to me, is the biggest win of Ruby 1.9. I feel it as I am using it: even the speed at which an irb session launches is dramatically better.

And that's certainly a win, and when people start demanding it we will certainly make it a priority, but I don't think we are waiting for any feature in particular. It's definitely better, and it's got some really great stuff. I really like Fibers, although those have been backported to 1.8.6 as well. I don't think we are waiting for anything, but I think there will come some tipping point in the community where people say "I am building applications for this, I am using 1.9-specific features or I just want the speed, and I need the place where I am deploying my application to support that", and of course we'll happily do that.

   

6. You mentioned other Ruby implementations. What would be the impetus for adopting one of those like JRuby or Rubinius?

Similar to 1.9, the big win there is going to be a matter of performance, or of being able to deploy the infrastructure differently, so it's a pretty low-level decision in a lot of ways. When people deploy their applications to Heroku they are sort of outsourcing their system administration to us. So a lot of times that kind of choice is something that users don't even necessarily care that much about; they just care that their application works and that it's fast. So obviously we have all been keeping a close eye on JRuby and Rubinius, and on MacRuby, which is a really interesting one; LLVM is going to be a cornerstone of the future of VMs for many languages.

We still need to experiment with that, but again we'd have to see a pretty big win in terms of performance or scalability, or maybe security sandboxing, in order to overcome the binary gem problem. There are a lot of binary gems that people depend upon, so they would complain if those didn't work, unless you could give them alternatives and that sort of thing, which would be great. But again we are trying to do what is going to be best for our users, and right now just keeping compatibility with all the historic binary gems seems to be the best thing to do, but we are definitely keeping a close eye on those VMs.

   

7. Can you tell us a little more about what a dyno is?

Yes, absolutely. A dyno is a measure of concurrency that we use on our platform. You can think of it as being roughly equivalent to a Mongrel or a Thin, or a single thread in Unicorn or Passenger, but we actually count in with that everything from the operating system level up to and including the web stack, including the HTTP cache and all that stuff. Having this unit of concurrency is very useful when you have the capability to scale by just moving a slider, which is the case on our platform. Otherwise you are thinking in terms of "Well, I am going to get three web servers and I will run three mongrels on each of those, and that is nine, and then if I want to scale up do I want to add an extra mongrel on each server or do I want to add a new server and put three mongrels on that?"

You don't have that decision point, because you are not thinking about servers when you are using Heroku. You just say "Ok, I am at six dynos and I would like to go to eight", you move that slider up, and we distribute that intelligently across our cluster of application servers in a way that frees you from worrying about what the server topology is.
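As a rough illustration of the idea of spreading a dyno count over a cluster, here is a small sketch that distributes a requested number of dynos as evenly as possible across a list of servers. The function name `distribute_dynos` and the server names are hypothetical, and even spreading is an assumption made for this example; the interview does not describe how Heroku's actual placement works.

```ruby
# Hypothetical sketch: spread `dyno_count` dynos as evenly as possible
# across `servers`. Not Heroku's real scheduler, just the arithmetic
# behind "I am at six dynos and I would like to go to eight".
def distribute_dynos(dyno_count, servers)
  raise ArgumentError, "need at least one server" if servers.empty?

  base, extra = dyno_count.divmod(servers.size)
  pairs = servers.each_with_index.map do |server, index|
    # The first `extra` servers each take one additional dyno.
    [server, base + (index < extra ? 1 : 0)]
  end
  pairs.to_h
end

servers = %w[app1 app2 app3 app4]
p distribute_dynos(6, servers)
p distribute_dynos(8, servers)
```

The caller only supplies a number; which server ends up running which dyno is decided inside the function, which mirrors the point that the user never reasons about individual servers.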

   

8. With this distribution across the application servers, how do you ensure that the gems that have been selected for a given application are present on all these dynos? And how do you keep them separate?

Absolutely. There are a couple of key things. From a partitioning standpoint, the main thing we do is rely on Unix user partitioning, which is really rock solid and very battle tested over the years, and that works great. We don't try to do anything like, for example, sandboxing inside the Ruby VM, or preventing you from doing certain things. You have full access; you are just prevented from reaching outside your sandbox by Unix permissions. What was the question? How do you put the gems in each application, ok.

We have a base set of gems that are installed across the application servers just because they are so common, like Rails or a couple of big ORMs or Rack and stuff like that. But of course you often have special gems that you want to use. Those are handled during the slug compile process, which is another unique piece of our infrastructure. When you do a git push to deploy, you see the normal git output, but right after that you see some Heroku-specific output. That's actually a pre-receive hook running on our git server, and it kicks off this compile process, which takes the code in your repository and turns it into a compressed, optimized package that we call a slug, which can be very rapidly fired up across our application servers.

This is a much better way to distribute application code than using revision control or a big tarball or something like that. And one of the things we do during that slug compile process is install the gems you specify in your gems manifest. This is cool because it's actually better than vendoring, since it allows you to get around the binary problem, which is that you don't want to vendor a binary gem that you built on your OS X laptop and push it up to a server that's running an AMD64 Linux distribution. Instead this will build the gem at the time the slug is compiled and bundle it into the slug, so it's all right there. So you have this kind of self-contained application bundle which includes the dependencies, and we can fire that up on our application servers, and you have the full diversity of gems at your disposal without us needing to install every gem that's ever been made on our app servers.
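The split between preinstalled gems and gems built at slug compile time can be sketched roughly as follows. Everything here is an assumption made for illustration: the manifest format (one gem per line, optionally followed by `--version`), the contents of `BASE_GEMS`, and the helper names `parse_manifest` and `gems_to_bundle`. It shows only the selection logic, not an actual build.

```ruby
# Hypothetical base set assumed to be preinstalled on every app server.
BASE_GEMS = %w[rails rack sinatra].freeze

# Parse an assumed manifest format: one gem per line, written as
# "name" or "name --version X". Blank lines and # comments are skipped.
def parse_manifest(text)
  lines = text.each_line.map(&:strip)
  lines = lines.reject { |line| line.empty? || line.start_with?("#") }
  lines.map do |line|
    name, *rest = line.split
    flag_index = rest.index("--version")
    { name: name, version: flag_index ? rest[flag_index + 1] : nil }
  end
end

# Gems not in the base set are the ones that would have to be built
# on the target architecture and bundled into the slug.
def gems_to_bundle(entries)
  entries.reject { |entry| BASE_GEMS.include?(entry[:name]) }
end

manifest = <<~MANIFEST
  rails
  hpricot --version 0.8.1
  # XML parsing with native extensions
  nokogiri
MANIFEST

gems_to_bundle(parse_manifest(manifest)).each do |entry|
  puts [entry[:name], entry[:version]].compact.join(" ")
end
```

Because the native extensions are compiled where the slug is built rather than on the developer's machine, the resulting bundle matches the servers it will run on.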

   

9. Add-on support is something that was recently added to Heroku, can you tell us a little more about that?

Yes, absolutely, that was something that we were really excited about. The Heroku add-ons system is a way that we can extend the platform and give it potentially infinite flexibility. We built this core product, which is really good for just deploying an application: you have got your SQL database, you have got your Varnish cache, you've got a couple of other pieces. But then the next question is always "Well, my application needs something specific, it needs full text search, it needs to do something with sending and receiving email, it needs to do something with exceptions or performance monitoring", and on and on; there is a very long list.

So rather than try to throw all those features into the core platform, and have it become not so streamlined anymore, we basically followed the model of Firefox. Firefox has add-ons. If you look at the history, probably a lot of us used Mozilla when that was kind of up and coming, and they kept adding features to it, and it started to get kind of big and crufty, and everyone wanted their little features, so what do you do?

And Firefox kind of cut the Gordian knot on that by keeping a core piece of software that was really solid and does one thing well that everyone needs, which is browsing the web. But then you have this add-ons system where you can add in whatever special thing you need. If you are a web developer, for example, like I am, there is Firebug and other things that are indispensable but which would be essentially cruft to people who are not web developers or JavaScript developers. So we were very inspired by that and we wanted to do something like that for our platform, where things we add on, like a memcache, can be a little bit external to the core product. That also allows third-party vendors to provide their services to users of our cloud, and that would be a very integrated and streamlined experience, but it pulls all that flexibility together in a way that doesn't bloat the core product.

   

10. Did you run into Ruby specific problems, such as low speed or lots of garbage collection or anything like that?

I think the answer to the "Ruby isn't a very fast language" question is that if you need something to be very fast, write it in something else. For everything in our system that needs to be high speed, highly concurrent and highly reliable we use Erlang, and we are very big fans of it. But it turns out that, in terms of our overall total code, that is actually a really small portion.

There are these crucial bottleneck points that need to be really fast and really concurrent, and everything else not so much, and having a highly agile language to develop in, which is what Ruby offers of course, is much more important than just speed of execution. So it's really a "right tool for the right job" kind of thing. And I think we are certainly moving in the direction a lot of developers and companies are moving, which is the polyglot approach to languages and technology: use what is the right tool for your job. Ruby is a great tool for honestly most programming jobs, but there are certain ones, high speed being one great example, where it is not the right tool, so use a different one. So I would say no, we haven't run into that, because we are using it in the places where it is appropriate.

   

11. And further determining the places where it is appropriate, did you identify those based on testing and bottlenecks or did you look at that and say "I know this is going to be a problem?" How did you approach those?

I think there are some things where it is obvious right from the start. The component that our major Erlang component replaced was originally written in C. It was part of the web front end, and it handles every single web request across our system, which number in the billions, so it was clear right from the start that it wasn't something we wanted to write in Ruby. I think in other places, yes, you potentially prototype it out in a scripting language and you discover "Ok, this works well in concept, but as soon as we put some load on it, it falls apart, or it starts to take up a lot of memory", so at that point you can port it to something else.

   

12. What did you use to implement the Delayed Job and task functionality that you have?

Background jobs are an increasingly important part of the platform. When you actually look at our dyno grid of application servers, the number of processes running a background job, that is, a DJ worker, is actually not too far off from the number of dynos, the web processes, that we are running. We always want to make sure that we are building something that is totally standards compatible, so that you can easily import an existing application and it just works, and you can export an application and run it somewhere else. DJ kind of became somewhat of a standard in the Rails community recently, I think probably largely by us kind of pushing on it, and you can just run it locally or on a traditional host with rake jobs:work, and you can spin up as many as you need to.

Within our cloud, though, we use a system that is fairly similar to, and actually shares a lot of code with, what we use to manage dynos. This is in beta right now but it's coming out soon: there is a slider for your workers that is the same as the slider for your dynos. We have the same process management stuff, where again you're faced with the same choice that I mentioned earlier, which is "Well, I have got two background job servers and they are each running three DJs and I need a little more concurrency. Do I add one more DJ worker on each one, or do I get a new server, or how do I move them around?" We have got a system that just handles that, and so if you decide you need a little more concurrency out of your workers, you just pull that slider.

As with everything we do, we try to make it so that all the management stuff that goes on, all the automation of your server system tasks, is handled by the very unique IP we build for managing that, but the part that your application actually interacts with is totally standards based, so you can take that anywhere. You are not getting locked into some kind of special Heroku background job system. And hopefully once you've used it you'll enjoy it so much that you won't be too inclined to go somewhere else.
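The worker slider described above amounts to changing a single concurrency number while the job code stays the same. Here is a minimal sketch of that idea using only Ruby's standard `Queue` and `Thread`. It is not the Delayed Job API and not how Heroku manages worker processes; the helper name `run_jobs` is made up for this example.

```ruby
require "thread"

# Run every callable in `jobs` using `worker_count` concurrent workers.
# Changing `worker_count` is the in-process analogue of moving a slider.
def run_jobs(jobs, worker_count)
  queue = Queue.new
  jobs.each { |job| queue << job }
  results = Queue.new

  workers = Array.new(worker_count) do
    Thread.new do
      loop do
        job = begin
          queue.pop(true) # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break           # no work left, so this worker exits
        end
        results << job.call
      end
    end
  end

  workers.each(&:join)
  Array.new(results.size) { results.pop }
end

jobs = (1..6).map { |n| -> { n * n } }
p run_jobs(jobs, 2).sort
p run_jobs(jobs, 4).sort
```

Both calls produce the same results; only the number of workers draining the queue differs, which is the property that lets concurrency be adjusted without touching application code.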

Feb 04, 2010
