Mitchell Hashimoto on Consul, Terraform, Atlas, Go as a Language for Tools


1. We’re here at CraftConf 2015 in Budapest, in an excellent location as you can see, I am sitting here with Mitchell Hashimoto. So, Mitchell, who are you?

You got my name, my name is Mitchell Hashimoto, I am best known as the creator of Vagrant, but also the creator of four other open source projects, started a company around them called HashiCorp, that’s really what I have been doing nowadays. So I started very developer-heavy in Vagrant, more recently I spend all my time on data centers and production stuff with our other tools which are Packer, Serf, Consul and Terraform, and perhaps, by the time this video is going out, another as well, but that’s the good gist of what I do.

Werner: That sounds like enough.

It’s a lot, yes.


2. Let’s kick things off with Vagrant, what does Vagrant do?

Vagrant is a tool for building development environments for you. So, the problem I was trying to solve was I saw a lot of different projects, worked on a lot of different websites, that was six or seven years ago, and I was frustrated where they all had slightly different dependencies, one might use MySQL, one might use Postgres, and I was working on some websites that were very old, these very legacy things, some of them were very new and I just wanted a way to type one command and get everything set up and that’s the goal of Vagrant. So with any project, you type one command, vagrant up, you wait a few minutes at the most and you have a fully ready to go development environment.


3. And that’s set up in a virtual machine, I guess, right?

It can be set up in a virtual machine or in a container, but virtual machines are still the most popular way to do development environments, I think that the primary reason is that they are identical on every operating system, so as much as we may or may not want to admit it there is still a lot of Windows, of course there is still a lot of Mac developers, but there are also Linux developers and you want the development environment to be the same across all of them so virtual machines are still the easiest way. But Vagrant is also planning for the future, we already support containers completely as a development environment target, we do it in a pretty neat way, if you vagrant up and you want to containerize a development environment if you are on Linux it will just do it, but if you are on Mac or Windows it actually transparently spins up a virtual machine without you really knowing and that starts the container on that virtual machine for you. So it’s kind of nice, Vagrant still does the hard work giving you that consistent experience.
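The host-dependent behaviour described above can be sketched in a few lines of Go. This is only an illustration of the decision, not Vagrant's code (Vagrant is written in Ruby); the helper name `needsLinuxVM` is hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
)

// needsLinuxVM reports whether running a Linux container on the given host
// operating system requires an intermediate Linux virtual machine.
// Hypothetical helper for illustration only; this is not Vagrant's logic.
func needsLinuxVM(hostOS string) bool {
	return hostOS != "linux"
}

func main() {
	for _, host := range []string{"linux", "darwin", "windows"} {
		fmt.Printf("%-8s needs a proxy VM: %v\n", host, needsLinuxVM(host))
	}
	// runtime.GOOS is the operating system this program was built for.
	fmt.Println("this host:", runtime.GOOS, needsLinuxVM(runtime.GOOS))
}
```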

Werner: So on non-Linux environments it spins up Linux just to get the Docker containers because they don’t have container support.

That’s right.


4. Doesn’t Windows have some sort of Docker support or is that in a different area?

They are coming out with some container support, but that will just invert the problem: if you are using that, it will run Windows containers, but if you need Windows containers on Linux or Mac you will need a Windows VM to start up. So it’s the same thing, just inverted.


5. Ok, I see. So, your other projects, you started with Vagrant I suppose, then you found other problems to solve and you solved them, what was your next project?

We started Vagrant in 2006, oh, no, that’s way off, 2009, and very quickly after that we built Packer. It feels quick, but it was three years later, around 2012, that we started working on Packer, and more recently they have been coming out a lot more quickly. Around 2013 I formed a company around Vagrant; it reached a point where it was popular enough, where there was enough usage, that I had the resources to really pursue what I wanted to, which was to solve a lot more data center problems. Vagrant is very much on the development side and at the time there was nothing yet on the production side, but now it’s a lot more filled out. So, Packer was three years later because I had another full time job at the time, but once I started the company, Serf, Consul and Terraform all came out within 12 months of each other, so things started accelerating very quickly, but the next one was Packer, yes.


6. And Packer creates Vagrant setups or what does it do?

What Packer does is give you a unified configuration format to build what we like to call deployment artifacts. A deployment artifact might be an image like an Amazon image, it might be a virtual machine image, it might be a container image, it might be a tarball. The problem I was trying to solve was that if you are trying to create any one of those things, you had to be very familiar with each one of those platforms and you had to work with their APIs and their tooling to make it happen. I really wanted to make one tool where you still have to know what you are building for, it doesn’t hide the fact that it’s building the AMI, but at least it’s one configuration language, it’s a command line tool, it’s one set of APIs. Then you can build the AMI, you can build the Vagrant image, so yes, it can make Vagrant boxes, but it’s a simplification of that and it lets you build both development and production images. So you can see how it’s starting to head towards that side, but we’re still closer to development.


7. You mentioned Serf, what does that do?

Serf is kind of a weird project, it’s used at a very large scale, but we don’t talk about it much, we don’t market it much because it’s largely subsumed by Consul, but what Serf does, to answer your question, is it creates a gossip network across all your servers and it makes a very efficient way to do cluster management, to do sending small bits of data, to do health checking to see if the machines are up or down and what it really was, it was our first stab at service discovery, we wanted a way to find out where is the database, to know it in this distributed environment without any central server. So, it was the first stab at it and what we realized was it was a very powerful tool, but it doesn’t have quite the right user experience, the right feature set to be a full service discovery solution and that’s why we headed into Consul.


8. So let's go straight to Consul.

Straight to Consul. So what Consul does is service discovery, and the problem it’s trying to solve is that as we are getting more microservices, more applications and things like that, they all have these dependencies on each other: that application needs a database, the load balancer needs to know where the applications are and so on, and answering that question is surprisingly not straightforward. We used to be able, even if we had ten servers, to hardcode them or manually do DNS; it was something that was easier. But as you get this world where you might have ten things with 1,000 containers that can be anywhere, it’s getting a lot harder to figure this out, and it’s not just a container problem, scale is a problem. So what Consul does is expose all of your services as DNS entries and let you query them in that way, but it also lets you configure them and do health checks; it’s really a full solution to service management.
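As a rough sketch of what querying services as DNS entries can look like from Go: the snippet below assumes a Consul agent is serving DNS on `127.0.0.1:8600` (Consul's default DNS port), the default `.consul` domain, and a registered service named `web`. The service name and address are assumptions for illustration, and without a running agent the lookup simply reports an error.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// serviceName builds the DNS name for a service, assuming the default
// ".consul" domain and the "<service>.service" naming scheme.
func serviceName(service string) string {
	return service + ".service.consul"
}

// lookup asks the DNS server at dnsAddr for the SRV records of a service.
// dnsAddr is an assumption: a local agent's DNS endpoint such as 127.0.0.1:8600.
func lookup(ctx context.Context, dnsAddr, service string) ([]*net.SRV, error) {
	r := &net.Resolver{
		PreferGo: true,
		// Ignore the system resolver address and always dial dnsAddr.
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: 2 * time.Second}
			return d.DialContext(ctx, network, dnsAddr)
		},
	}
	// Empty service and proto make LookupSRV query the name directly.
	_, addrs, err := r.LookupSRV(ctx, "", "", serviceName(service))
	return addrs, err
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()

	addrs, err := lookup(ctx, "127.0.0.1:8600", "web")
	if err != nil {
		fmt.Println("lookup failed (is an agent running?):", err)
		return
	}
	for _, a := range addrs {
		fmt.Printf("%s:%d\n", a.Target, a.Port)
	}
}
```

The point of the DNS interface is that the application side needs nothing Consul-specific: any resolver that can reach the agent will do.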


9. With Serf, you mentioned it uses a gossip protocol; is it the same in Consul?

Yes, Consul uses Serf underneath, so we took what Serf is really good at and we put that into Consul and it uses that there, but then we built features on top that Serf doesn’t have and we don’t use Serf for, well, not directly. So we always use Serf to find out where machines are, but then Consul sometimes does a direct connection to them to communicate, whereas Serf is gossip, so it’s not direct connection.


10. Maybe you can explain gossip protocols? Is it just broadcasting or is it different?

A gossip protocol is not broadcasting, but it’s similar. The way I like to explain gossip protocols is by making an analogy to the zombie apocalypse. So, we are in Budapest right now and if there were a zombie apocalypse we wouldn’t know, being in this castle, right? But we would probably find out because someone would run in through the door and tell us there is a zombie apocalypse, and our first response would probably be to start running and tell other people on the way, and that’s a lot like how a gossip protocol works. So, you tell one machine something and it might not know all the machines, but it’s going to tell the ones it knows about, and they are going to tell the ones they know about, and you can see the huge fan-out; the message would get through.

So, it’s a way to eventually send the message through a lot of machines and it’s very, very scalable, and that is really what Serf does for you; that’s the gossiping part of it. But the tradeoff is, and this isn’t strictly a gossip tradeoff, but for our protocol it’s UDP based so it’s probabilistic, it’s not instant, there is a propagation time, but for all intents and purposes even in a 4,000 node cluster, to get a message delivered, it’s less than a second. So, it’s quite fast, just not as fast as if you had a direct connection.


11. So you have guarantees that the message will get delivered, you are saying it’s probabilistic...

It’s probabilistic, there is no guarantee the message will be delivered, but we can give you an exact number for the probability it will be delivered, and usually the probability is high enough, so that’s the tradeoff. If something has to be delivered no matter what, a gossip protocol is not a good solution, but we have a calculator online where you get to plug in numbers and see; in any case it’s somewhere above 98 percent, usually in the 99s with a lot of decimals, but you can play with the numbers and figure it out. So the way Consul uses that is for noncritical information, like heartbeating: if we miss a heartbeat it doesn’t matter, if we miss many heartbeats in a row it’s a problem, but we probably won’t, so that’s how Consul comes into play.
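The "miss one heartbeat versus many in a row" reasoning is simple arithmetic if each heartbeat is assumed to be delivered independently with the same probability p: the chance of losing k in a row is (1 - p)^k. Independence is an assumption made here for illustration, and the 0.98 figure is just the pessimistic end of the range quoted above.

```go
package main

import (
	"fmt"
	"math"
)

// missAll returns the probability that k consecutive heartbeats are all
// lost, assuming each one is delivered independently with probability p.
func missAll(p float64, k int) float64 {
	return math.Pow(1-p, float64(k))
}

func main() {
	for k := 1; k <= 4; k++ {
		fmt.Printf("p=0.98, %d missed in a row: %.2e\n", k, missAll(0.98, k))
	}
}
```

Each extra consecutive miss multiplies the probability by another factor of (1 - p), which is why a single lost heartbeat can be ignored while a run of them is a meaningful signal.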

And it was kind of neat as we deployed Serf we had all these numbers and predictions on what the probability is and things and it was neat as it got adopted in these clusters that are 4,000+ servers, there are some clusters that are 10,000 servers in one data center, but it was neat as we got to these high cases that math works, we started looking at the measurements and the probabilities were exact to 6 decimal places, it was really crazy how the math just works.


12. So math is useful?

Math is pretty good.


13. Listen up, kids. Is this a static algorithm or are there some parameters that you can arrange to raise the probabilities?

There are a lot of parameters. I would say 99 out of 100 people never even discover those parameters or don’t care, and it doesn’t matter, so we ship with two parameter sets by default, the LAN set and the WAN set. If you are on a local network just use the LAN ones, if you are trying to do multi-data centers use the WAN ones, and they are probably what you want. For advanced cases there are maybe 30 or 40 settings and they’re timing related things, like how long you wait without receiving an ACK until you think the message hasn’t been sent, because it does retries and things like that. So it’s stuff like that.

Werner: So that's Consul.

Consul and Serf, yes.

Werner: So Consul is safely in the production area of your products.

Yes. So, Consul, Serf kind of, but Consul was the first thing to be really production oriented, it was a full solution that was really only for production people, it was also our first thing where we reached a point where people who use Consul, they don’t even know what Vagrant is, they are starting to be introduced to us without Vagrant in the picture at all. For a long time it was the Vagrant user base, the community that was finding the new things and now we are starting to enter new communities without that. And that’s really, I think, a great success for us to have both communities engaged even though they might not overlap.


14. So what next problems are you tackling? Are you still moving further, are you discovering other problems to solve?

Yes. We have three projects coming out this year, and we have one coming out soon, within the next week. There are a lot of problems, so what we try to do is look at how you go from development to production, and the analogy I like to make is, if that’s a video, what are the key frames that need to be solved, and let’s solve those key frames. So, if you have something like Terraform, which we haven’t talked about yet, which launches infrastructure, and something like Consul, which monitors infrastructure, there might be steps in between: you might have config management, you might have something in the middle there, and that’s where we integrate with other tools. But we are trying to get all the key frames for you so we can paint a picture and you can see the end goal and start adopting things one at a time. We have a few more coming out this year that we think are pretty important.


15. You mentioned Terraform, is that out yet, can we talk about that?

Yes. Terraform has been out for almost a year and it solves the problem, so Consul came out first before Terraform, it assumed you already had an infrastructure that you needed, service discovery, things like that, Terraform solves the problem what if you don’t have the infrastructure you need and you need to create it, and that is not super straightforward. Not even ten years ago, five years ago it was pretty straightforward, you’d get a bunch of EC2 instances or a bunch of VPSs or get a bunch of compute and you were good to go, you run config management, something happens; it’s a lot more complicated today because EC2 is not necessarily just EC2, you might use auto-scale groups, you might use services like ELB or RDS, and outside of AWS there is OpenStack now, there is Docker, there are all these different things where your infrastructure isn’t just a bunch of computers from one source, it’s now a bunch of things from all over the place and your application cannot run without all those things or a lot of those things.

So, if you have a script to spin up EC2 instances and they kick off configuration management and they install everything that is great, but it’s not actually useful if that thing won’t run without the database service that you don’t install yourself. I sort of argue that you might as well not even do it because if you get up to that point and then you have to get a person anyway to set up that database service, that’s your bottleneck. So Terraform is trying to solve that problem all in one, which is you describe your infrastructure in a declarative way in a text format including all these external services and they might be different, it might be AWS here, it might be Heroku over here and Terraform will bring it all up, connect it all together, kick off config management for you and really get to the point where things are running, Consul is running, so in terms of key frames you can see how these play together.


16. How do I use Terraform? Is it a language, is it configuration?

Terraform has its own configuration format, but it’s also JSON compatible for interoperability. It is its own format, and then it’s a command line application that an operator or developer runs. We like to say Terraform is a static tool: it takes a state of the world, like static state A, and you want state B, and it takes you there, but it doesn’t stay running, it doesn’t make sure it stays in state B in real time. You run it and then it exits and you are done, and then if, say, someone manually killed a server, you run Terraform again, it will see that and it will fix it, but you have to run it. And we did that on purpose, it’s an architectural design decision, it simplifies things and it helps solidify the relationship with Consul, too. Consul is very much the always-on real time thing, but that comes with the burden of operational concerns, you have to keep it running. Terraform is very simple, I run it on the command line, it’s very familiar that way, it’s a different model.
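The "state A to state B, then exit" model can be illustrated with a tiny diff over resource names. This is a conceptual sketch only and not how Terraform is implemented; the `plan` function and the resource names are made up for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// plan compares the current and desired sets of resource names and returns
// what must be created and what must be destroyed to reach the desired state.
func plan(current, desired map[string]bool) (create, destroy []string) {
	for name := range desired {
		if !current[name] {
			create = append(create, name)
		}
	}
	for name := range current {
		if !desired[name] {
			destroy = append(destroy, name)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(create)
	sort.Strings(destroy)
	return create, destroy
}

func main() {
	desired := map[string]bool{"web-1": true, "web-2": true, "db": true}
	// Someone manually killed web-2, so the real world has drifted.
	current := map[string]bool{"web-1": true, "db": true}

	create, destroy := plan(current, desired)
	fmt.Println("create:", create, "destroy:", destroy)
	// The program exits here. Nothing keeps watching for drift until the
	// operator runs it again.
}
```

The important part is the last comment: a one-shot tool reconciles only when invoked, whereas an always-on system like Consul has to be kept running to notice changes as they happen.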

Werner: So that’s Terraform.

Terraform, yes, I think we covered all now, all the open source projects.


17. You used the word open source. Does that mean there is something else there?

Yes, I am just being exact. A lot of people ask how we make money or how we intend to, or however you want to phrase that question. We are a company that has five fairly successful projects, from Vagrant, which is very successful, to some that are less so, but they are all used in hundreds of companies, so they are all successful in some way. We have all these open source projects and that raises questions: how do we commercialize it, and if there is a company behind it, should we as users be worried you might close source it or something? So it’s important to understand how we make money.

So I mentioned these key frames, and what we did is build a commercial product on top called Atlas, which just connects all the dots for you. So you could take each of these tools and fill in the gaps on your own, use your own engineering effort to do it, but what we found is that most companies, even with just 10+ employees, if they can afford it, would rather work on whatever their company does and just buy something that solves the problem. To use a concrete example of what our solution does, we’ll integrate Terraform and Consul for you: we’ll make it so that when a server goes down Consul automatically notices and runs Terraform for you to fix it, sort of like auto-healing; that is one of many features that we do.

But for example, you don’t want to build that on your own, you want it to work so you can work on applications, but we are different from other commercial products that might do the same thing because all we are, as a commercial product, is glue and UI and stuff on top of open source, but the open source is the work horse, it’s the thing doing all the major work and that’s nice from a lock-in perspective. If, five years from now, you decide you don’t like the way we are taking things or you don’t want to use our commercial products anymore and you leave you are not completely in trouble, you still have the configurations which are still valid with the open source, you still have Consul which still works, it’s still available as an open source project, you just lost some of the glue, but all the major pieces are there. That’s sort of the modern difference on our commercial strategy.

Werner: So that’s Atlas.

It’s called Atlas, yes.

Werner: Because it moves the earth.

There are a lot of metaphors around here, yes. It moves the earth, it holds it, it’s a map as well. Another definition is that it’s a map, and Atlas helps you see everything. It’s a purposeful pun.


18. I’m trying to think what your next product will be called, Cosmos or something like that. So, let’s move on to some geekier stuff, if that wasn’t geeky enough. So you have a bunch of command line tools, services, what do you use to write them, what languages?

We have five command line tools, four are Go and one is Ruby, Vagrant is Ruby all the rest are Go.

Werner: I see a trend.

Yes, if you know the timeline there is a trend. So, Vagrant was the first one, was written in Ruby and all the rest are written in Go, so I get asked a lot why and do you regret it and things like that, and I don’t regret it. The thing is when I built Vagrant, Ruby was the language I was best at, but also Go was very immature, it wasn’t actually a choice, I will admit that if Vagrant was written today I would probably do it in Go, but that doesn’t mean it’s a mistake that I chose Ruby to begin with, I don’t think it was because it was my first project, I have the benefit now, I am very lucky that when we make a new project, we tweet about it, I tweet about it, we instantly have 10,000 users, but when Vagrant was my first project it was really hard to find people to try to use it, I put it on GitHub and what next?

And the thing about the Ruby community, then and now still, is that they are tinkerers, they like to try things, but not just try things, they like to contribute back, they like to talk about things. So, it was the perfect community to get started with because they were more willing to try it and they were even more willing to try to help me fix it, with pull requests, issues, that sort of thing. Go didn’t exist, but if I had done it in C or something it just wouldn’t have garnered the same community, it would have been a lot harder. So, it’s a really interesting question, but I think it was the right choice and I am happy with it.


19. So are you still writing code?

I still write code, a lot still.


20. So what do you think of Go, the language? Do you like it, is it very suitable for these command line tools, what is your opinion?

I love Go. I think the phrase I would use to describe Go, but also a phrase I would use to describe myself, is “frustratingly pragmatic”. People say look at these other languages, they are doing interesting things, and Go is not doing interesting things, and it’s true, I don’t deny that, but look at how much real software is shipping that is written in Go versus a lot of other languages. In every language there are always going to be one or two handfuls of projects that, yes, are real, but every day there are dozens of new real things being written in Go, and I think that shows how good it is at shipping software. It might not be the most theoretically safe language, it might not be the most performant, it might not be the most DRY (don’t repeat yourself), you don’t get the most abstractions, but you get just enough of everything to be very productive, and that’s why I like it. And it’s very teachable, it’s a very simple language. I like to say it’s a lot like C in that way: you can learn the syntax of C within a few hours or a day at most; you won’t learn the nuance or the crazy things in C, unsigned integer math, floating-point, you won’t learn that stuff, but you’ll learn the basics. Go is the same way: you could hire someone and get them to commit on a Go project within a day, and a lot of other languages just aren’t like that.

Werner: And I assume Go has certain advantages in safety I suppose, better strings, sane strings, that’s a big advantage.

Yes, Go has a lot of advantages. On the scale of things, if you had a scale of a bunch of features you want in a language, whether it’s really good at it or really bad at it, Go would safely be at 80% on everything. So in terms of safety it’s garbage collected, you can't do buffer overflows, things like that, it protects you there.


21. So the Go language has some advantages for security and safety, you don’t have to worry about null-terminated strings and stuff like that; is that an advantage for you?

Yes, definitely. We come from a background, I did a lot of stuff in Ruby, but my cofounder in particular did a lot of stuff in C, so he is really productive at C, he can write C applications really fast, but there are still a lot of concerns you just don’t have to think about anymore, so null-terminated strings, sure, garbage collection, memory management, buffer overflows, that sort of stuff just disappears with Go.


22. The obvious question: is garbage collection ever an issue for you?

That’s a good question and I think we can answer that question pretty well because we have a pretty large scale thing that stores data that sees a decent throughput in terms of requests per second and that’s Consul and it’s never an issue. We record all GC timings, we were really worried about it at the beginning and it just never was a real issue and I think the major difference between something like Go and something like Java is that Go does a lot of stack allocation, there is a lot of actual stack allocation, so the heap remains relatively small, if you do it in C of course you can get it better, but because of that you don’t get these huge garbage collection phases, it’s much smaller and I think that just helps a lot, it really hasn’t been a problem.


23. So I guess it’s value types and certain tricks in the language that really help there. So listen up, Java. I think the Java guys are looking into that, there are smart things to copy from Go.

Yes, especially dynamic languages. In dynamic languages you can’t really get anything on the stack; there are some VMs that are pretty smart and will try to do stuff like that, but in general, if you look across Python, Ruby, JavaScript, it’s pretty hard to do and you also can’t reason about it. You need your VM to be smart for you, and you don’t know if it will be smart. With Go you can just look at the code: you can’t force something to be on the stack, but you can reason about it, you can say this will very likely be on the stack and things like that.
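A small example of the kind of reasoning meant here, using made-up function names. A value that never leaves its function is a good candidate for the stack, while returning a pointer to a local usually forces it onto the heap. The compiler makes the final call (inlining can change the outcome), and building with `go build -gcflags=-m` prints its escape analysis decisions so the guess can be checked.

```go
package main

import "fmt"

type point struct{ x, y int }

// sumValue keeps p as a local value. Nothing outlives the call, so the
// compiler can very likely keep p on the stack.
func sumValue(x, y int) int {
	p := point{x, y}
	return p.x + p.y
}

// newPoint returns a pointer to a local. The value has to outlive the
// call, so it will typically be moved to the heap, unless inlining lets
// the compiler prove that the pointer does not escape the caller.
func newPoint(x, y int) *point {
	p := point{x, y}
	return &p
}

func main() {
	fmt.Println(sumValue(1, 2), newPoint(3, 4).x)
}
```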


24. With the VMs, you have to rely on sufficiently smart compilers, if you are lucky and if it’s the right code then it will work but you never know, it might fail in the next version, too; it’s a VM.

To be fair, I think Go’s compiler is great, but I’ve heard from people that are fans of compilers that Go’s compiler is relatively primitive, and the fact that it’s still pretty good while primitive is a good sign to me. If people are working on the most advanced VM ever for something like JavaScript and it makes JavaScript really fast, but it’s also the most advanced VM in the world, it’s a little bit scary: how much further could we go? I think we can go further, but not much further, and how fast can we get to that future? Whereas Go is still in its infancy and it’s already good. So, it’s just going to get a lot better and I don’t have any concerns about that.


25. Ok, very good. So, if people want to check out your products and your projects where can they go?

The easiest way is to go to our company site, on the homepage we have a slider that has all the open source projects there, and it will link to their website. But if you google my name or look at my GitHub, somehow the links will lead through to all our stuff or you can just google any of the words I said, like the names I said, Consul, Packer, Terraform etc.

Werner: Alright. Well, thank you, Mitchell.

Thank you.

Jun 03, 2015