
Languages of Cloud Native



Justin Cormack looks back at the early history, talking to Solomon Hykes about the development of Docker in Go, and looks at more recent trends in Cloud Native projects.


Justin Cormack is the CTO at Docker, working on unikernels.

About the conference

QCon Plus is a virtual conference for senior software engineers and architects that covers the trends, best practices, and solutions leveraged by the world's most innovative software organizations.


Cormack: I'm Justin Cormack. I'm the CTO at Docker. Also, I'm a member of the technical oversight committee of the Cloud Native Computing Foundation. I also am really interested in programming languages and how they affect the way we work.

One of the things I'm talking about is, in the Cloud Native Computing Foundation, of the 42 graduated and incubating projects we have, 26 of them are written predominantly in Go. I want to explore how this happened and which new language is emerging in the cloud native space and how we got to this point where Go is so dominant. One of the things that was really important in this historically was Docker. When I started at Docker in 2015, Go was already an established language in the company.

Why Docker Adopted Go

I want to talk to Solomon Hykes, who founded Docker, about how they started off with Go, and how really early in the Go language evolution they adopted it, moving away from Python.

Hykes: We didn't want to target the Java platform, or the Python platform, we wanted to target the Linux platform. That was one aspect. Another aspect, honestly, it was more of a personal gut feeling thing. We were Python and C developers trying to write distributed systems. A lot of what we ended up doing was writing them in Python, and then getting bitten by the typing issues of Python, so discovering problems a little bit too late at runtime when they could have been discovered earlier. Also, trying to recreate a lightweight threading system. It's been a while, but at the time we were heavily using libraries and frameworks like gevent and Greenlets and things like that, Go had goroutines built in. That was the same thing, but better. It had the typing benefits of C. From our specific point of view of C and Python developers of distributed systems, it was just the perfect tool.
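To illustrate the point about goroutines being built in, here is a minimal sketch. It is not Docker code: the function name `fetchAll` and the placeholder "work" are assumptions made for illustration. It shows the language-level concurrency that, in Python, would have needed a library such as gevent, with the compiler checking the types of everything passed between goroutines.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll starts one goroutine per name and collects the results.
// The "work" is a stand-in for something like a network call.
func fetchAll(names []string) map[string]int {
	var (
		mu      sync.Mutex     // guards results
		wg      sync.WaitGroup // waits for all goroutines to finish
		results = make(map[string]int, len(names))
	)
	for _, name := range names {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			size := len(n) // placeholder for real work
			mu.Lock()
			results[n] = size
			mu.Unlock()
		}(name)
	}
	wg.Wait()
	return results
}

func main() {
	fmt.Println(fetchAll([]string{"alpha", "beta", "gamma"}))
}
```

Passing anything other than a `[]string` to `fetchAll` is rejected at compile time, which is the kind of error Hykes describes finding too late at runtime in Python.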

Cormack: Presumably you didn't want to choose C for other reasons.

Hykes: No, exactly. Yes, C was not a consideration. Python was the default, because it's what we used. Go was just better by every metric that we cared about. One factor being the fact that it compiles to a standalone binary. The other being that it was just the right programming model for us. The third, is that, because we specifically wanted to grow a large community of open source contributors, we wanted Docker to be not just a successful tool, but a successful open source project, the choice of language mattered for social reasons. For example, we wanted something that was familiar enough to enough people, that the language itself would not be a huge barrier to reading the source code and contributing to it. The nice thing about Go is it's not radical in its syntax. If you've written C, you'll be familiar with Go. If you've written Python, you'll be familiar. It's not Haskell. It's not Lisp. It doesn't break every possible convention compared to mainstream programming languages. That was explicitly considered a benefit, because that means it's easier to contribute.

How Project Vitess Got Started In Go

Cormack: During this interview, Solomon called out that when he was looking around at the existing Go ecosystem at the time, it was Vitess, now another CNCF project, that gave him confidence. Vitess was a project inside YouTube at the time, as YouTube was growing really fast. I talked to Sugu, who was one of the founders of Vitess, about how he had got started in Go.

Sougoumarane: I can go through some of the thought process that we went through, about how we ended up choosing Go. It was not very scientific with Go. In 2010, when we were thinking of starting this project, the primary options were Python, Java, and C++. Those were the three languages that popped up for us. Python was because YouTube was written in Python. Then Python was already losing, because it's not a systems programming language. We knew that we wanted to build an efficient proxy. Python has not the efficiency, it's not a very efficient language. We had Java and C++. I wasn't familiar with Java, and I think I was slightly bitter about it those days. I don't know why, but probably based on some people I ran into. I wasn't very excited about Java. Mike wasn't excited about C++ because he didn't feel like he could write something good with it.

There were a couple of reasons why we chose Go. The funny one is, it was just a passing comment, but it is still a funny comment, which is, if we use Go and if our project fails, we can blame it on that. I don't think that's the reason why we chose it. That was definitely one statement that was made in the conversation. Really, the reason why we chose Go is because of Rob, Russ, Ian, and Robert Griesemer. Because it was such a brand new language, we had to check out the authors, and we actually basically studied those people. We realized that their values, their thinking, their philosophy were very mature, and similar to the way we approached problems, which means that they were not too theoretical or too hacky. They had a very good pragmatic balance about how to solve problems. It was around the time when, within Google, engineers were going through this phase where they had a fascination with complexity. Anything complex is awesome, that type of thing. This was one group of people that were contrarian to that. They were saying, you can be simple. I said, "I like that. I like the way you think."

What happened at that time was, I gave Dmitry Vyukov a reproduction of why we were stuck. The challenge I gave him was, we have eight CPUs. That's all we have. The Go runtime today is only able to use six. If you optimize the runtime to use the other two CPUs, we will be through. That's the challenge I gave him. He went away for, I think, two months, and came up with this work-stealing design and a prototype implementation. We tried it, and it did indeed use all eight CPUs. That pulled us out of trouble. He saved our project. If that had not happened, we might have moved away from Go. It was not because of Go's design. It was just that we were getting pressured because YouTube was about to fall apart. We needed to find a solution. That solution basically restored our faith in Go. After that, we never had any struggle.
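As a rough illustration of what "using all the CPUs" looks like from the program's side today, the sketch below spreads CPU-bound work over one goroutine per available processor and leaves placement to the runtime scheduler. It is not Vitess code and does not reproduce the problem described above; `sumSquares` and its workload are hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// sumSquares adds up i*i for i in [0, n), splitting the work across
// the given number of goroutines. Each goroutine writes only to its
// own slot in partial, so no locking is needed.
func sumSquares(n, workers int) uint64 {
	if workers < 1 {
		workers = 1
	}
	partial := make([]uint64, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			for i := w; i < n; i += workers {
				partial[w] += uint64(i) * uint64(i)
			}
		}(w)
	}
	wg.Wait()
	var total uint64
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	fmt.Println("logical CPUs:", runtime.NumCPU())
	// GOMAXPROCS(0) reports the current setting without changing it.
	procs := runtime.GOMAXPROCS(0)
	fmt.Println("GOMAXPROCS:", procs)
	fmt.Println(sumSquares(1000, procs))
}
```

The program only creates goroutines. Deciding which OS thread and CPU runs each one is the scheduler's job, which is the part of the runtime the work-stealing design changed.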

Cormack: Both Solomon and Sugu were looking for the right language for their new project, a systems language for cloud native. Both of them really also felt that community was important. We can say that for Sugu, it's the community of the creators of Go, and the people working on making the language better. For Solomon, it was the community that he wanted to create around Docker to make the language accessible to this community.

Why Go Became a Dominant Systems and Cloud Native Language

Around this time, late 2012, Derek Collison who created the NATS project, tweeted that within two years, Go would become the dominant systems language and the language for cloud native. At the time, people were very skeptical, of course, but it actually worked out that way. In that period, Docker and Kubernetes were both released, and there was a huge explosion of usage. I talked to him about how he came to that conclusion back then.

Collison: The original NATS was written in Ruby, like Cloud Foundry was. From a development perspective, and just in terms of liking to work in a language once the system is set up, Ruby is still awesome to me. But deploying production systems with the Ruby VM and all the dependencies (we had dependencies on EventMachine to do async stuff more efficiently) wasn't going to work. In 2012, when we had started Apcera, we were internally huddling around, yes, NATS will be the control plane addressing discovery and telemetry system for the Apcera platform as well, called Continuum. I didn't want to run in Ruby anymore, and we were looking at either Go, which was the newcomer, I think it was at 0.52 at the time, or Node.js, which was also a newcomer, but not as new at least from a lexicon perspective as Go. There were definitely some initial factors in what we chose.

Then, after being in the Go ecosystem for so long, there were some interesting observations now about why it was the right choice that weren't necessarily the original decision makers. The original decision makers were trying to alleviate the pain that we had deploying production systems with the Ruby ecosystem. Node, even though it had npm, or the beginnings of it, at the time, was still a virtual machine and had a package management system that had to be spun up and all wrapped around it. Go had the ability to produce quasi-static executables. To do full-blown static executables, you had to do a little extra work. That was a huge thing, meaning our deployment could be an scp, essentially. Goroutines and the concurrency model were interesting to us, for sure.

The other big defining factor for me, because I spent a long time at TIBCO designing a system to do this was, in TIBCO, we wrote everything in low level C. Which is still probably one of my favorite languages, even though it has a lot of challenges there, of being that close to the metal was fun. I've learned Rust. I'm going to learn Zig this holiday as my pet project. I probably would never program in C again, but I still liked it. At the time, it was very interesting to me within what we were trying to do, to flow from 80% to 90% use cases that would live on the stack, to transparently move themselves to the heap. That's very hard to do in C. I spent a lot of time and effort to get that to work in C, and Go had that for free. Almost nobody cared about that. They're like, what are you even talking about? I said, I spent so long trying to do that in C and Go has it. At 0.52, Go's garbage collector was really primitive, very primitive mark and sweep. To me, I was like, it doesn't matter, because I can architect to have most of the things on the stack. If they blow past the stack, they auto-promote in Go, I don't have to do unnatural acts like we had to in the C code base at TIBCO. It was static executables, and stacks were real, were the decision points.
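One reading of the stack-versus-heap point is Go's escape analysis: the compiler decides per value whether it can stay on the stack or must move to the heap, with no change to the source. The sketch below is an illustration under that assumption; the `point` type and both functions are hypothetical, and the actual placement depends on the compiler version and on inlining.

```go
package main

import "fmt"

type point struct{ x, y int }

// sumOnStack uses p only inside the call, so the compiler is free
// to keep it on the stack.
func sumOnStack(x, y int) int {
	p := point{x, y}
	return p.x + p.y
}

// newPoint returns the address of a local variable. In C this would
// leave a dangling pointer; in Go the value is kept alive, typically
// by placing it on the heap when it outlives the call.
func newPoint(x, y int) *point {
	p := point{x, y}
	return &p
}

func main() {
	fmt.Println(sumOnStack(1, 2))
	p := newPoint(3, 4)
	fmt.Println(p.x, p.y)
}
```

Building with `go build -gcflags=-m` prints the compiler's escape-analysis decisions, which is a quick way to see what ended up where.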

The concurrency was a nice to have. Again, looking back now at the ecosystem, gofmt had a bigger impact than people thought, huge. Everyone does the same thing now. The tooling, go vet, pprof, the way the testing was all in there. The number one thing for me is that if I go away from the code base, maybe it's because I'm old, if I come back, I immediately know what I was doing. Or even if it's, let's say, code that you wrote, I could figure out pretty quickly what your intent was with Go as a simple language versus Haskell, or Caml, or even sometimes if people went into meta land with Ruby, and essentially were programming DSLs. You'd go back to the code after a couple of months, and I'd be looking at it and it would take me an hour or so to figure out what I was even really trying to do. That also lends itself to bringing new people in to get up to speed very quickly with a language. I still think that's huge.

The Adoption of Rust

Cormack: We talked a lot about how Go got started in the cloud native ecosystem. Recently, we've been seeing a bunch of projects in Rust as well and we've seen other languages. I talked to Matt Butcher about how he adopted Rust. He had started off as a Go programmer, he built Helm among other things. Recently, he started using Rust for new projects.

Butcher: Ryan Levick, who is one of Rust's core maintainers, also worked at Microsoft when we were starting to look into this, and he just dropped into our Slack and was like, "I heard you're writing a Rust program," Clippy style. Basically, for anybody who wanted to learn Rust, Ryan was more than happy to walk them through the basics, then point them at some resources, and then answer those first few questions about how to do the borrow checking correctly. Very rapidly, I think six or eight of us got going in the Rust ecosystem. The default started to shift. We wanted to write Krustlet in Rust, because of the way we wanted to build a Kubernetes controller. We hadn't intended to start writing other things in Rust, it just happened out of that, that new projects started to default to being written in Rust instead of Go.

Why Krustlet Was Written in Rust

Cormack: What was it about Krustlet that made you want to write it in Rust then?

Butcher: The main one was we wanted a WebAssembly runtime, and the best WebAssembly runtimes are either written in C or C++ for the JavaScript ecosystem, or are written in Rust. The one we wanted to use was Wasmtime, which is the reference implementation of the WASI specification. That was written in Rust. We looked at, we could compile this to a library and then link it with Go. Then, once everybody else started working on Rust, and going, "I like the generics. There's a Kubernetes library, the crate is pretty good." Before long, everybody wanted to write it in Rust. Ron had to write all of Krustlet in Rust. Where it started, really, because of the necessity of wanting the WebAssembly runtime, it ended with us choosing it because it felt like the right language for what we were building. Then the surprising conclusion from that was we started writing other projects in Rust because it felt like the right fit for the things we were starting to do moving forward from there.

WebAssembly and Zig

Cormack: Derek had quite similar thoughts about lighter weight languages for lighter weight processing, particularly on the edge. We talked about WebAssembly as well, and also Zig.

Collison: Most of the new ecosystems have taken a similar approach. The standard library can't just be skeletal. Even Zig, which is one of the newer lower level languages, has spent quite a bit of time on its standard library, fleshing it all out.

Cormack: Even C++ has decided it needs HTTP and TLS, but it's going to take another decade to get there.

Collison: I don't know how long my career will keep going for, but I can say with confidence, I will never program in C or C++ again. I'm ok with that. I think there's better alternatives now, for sure. I also think with the other prediction around edge computing, at least my opinion that it's going to dwarf cloud computing. Cloud computing will become the mainframe very quickly. We know they exist, but who cares? Nobody ever really interacts with them, they just live in the background type stuff. Efficiency, so not necessarily performance, but efficiency. How much energy and resources are you using to do the same amount of work, is going to come back into play. I think enterprise with .NET and Java will still remain and still be driven especially within the data center or the cloud world. I think you're going to see C, Rust, Zig, and then of course, very high speed Wasm or JavaScript engines as the looser, maybe some MicroPython, CircuitPython type stuff. TinyGo is becoming really interesting, in my opinion.

CUE Programming Language

Cormack: Solomon is still a big believer and a user of Go, but it was another language that we talked about where he would like to see changes.

Hykes: I still write Go. I'm not the typical programming language early adopter. I tend to use the same tools for a long time. We were probably a strong influence in the adoption of Go, and also in the adoption of YAML in the cloud landscape, and there's one I feel better about than the other. YAML I think is just a source of problems. It's not that it's bad. It's just that it's used for things that it wasn't meant to be used for. It's just being overused. That's the sign that there's something missing. This new project that we're working on, Dagger, it's written in Go, but it's configurable and customizable to the extreme. YAML or JSON just didn't support the features that we wanted to express. We found this language called CUE. Initially, we used HCL in our first prototype. Terraform and other HashiCorp tools use HCL. I think it's an in-house project. It spun out as a library, so you can use it in your own tool. It has limitations, pretty severe limitations. You can tell it started its life tied to a specific tool, and not as a standalone language meant from the beginning to be used by multiple tools. CUE on the other hand started out as a language. Its author [inaudible 00:21:33] is a language expert. Exactly like Go solved a specific problem and felt like it was written perfectly for us, CUE felt the same way as a replacement for YAML. I'm a huge believer in CUE's future. I think it will, or at least it should, replace YAML in many cloud native configuration scenarios.

Lessons Learned from the Adoption of Languages in Cloud Native

Cormack: What have we learned about the adoption of languages in cloud native in particular? The first thing that's clearly very important is community. First, this is the community around you as you start to think about using a new language, and the things they've built and the way they're building them. Second is the community that you want to bring to your projects, and how you want them to be able to adopt the language and tools you're building. The second lesson is fit for the problem domain. For cloud native, there were some requirements that a lot of people mentioned, around things like static binaries, that were important for distributing code easily or letting people run it easily in production. You always need this fit with the problem you're working on. Moving into a new domain is actually a great opportunity to examine the fit for the tools that you're using, the languages you're using now and decide whether that's a good point to make a change.

Performance was also important for the cloud native use case. It was interesting that it came up a little bit. The language performance actually grew in line with the requirements. The conversation with Sugu about YouTube, it was really interesting that Go managed to keep growing and meeting those requirements as the requirements became more difficult, and they never got to the point where they had to give up. It's important to remember that languages can change and evolve with your users, and they grow, and the ecosystem around them grows as you start using them. Those things are really important.

Then, finally, everyone's journey into learning new languages was different. People often thought about things, experimented maybe years before they actually adopted a language. Also, there's a whole journey towards internalizing how to work in a new language and how to use the opportunities it presents best. That process of learning new languages is incredibly important to people. It's really important that we all continue to learn new programming languages, experiment and see new ways we could do things, so that when we get an opportunity, like when we're moving into a new area or experimenting with a new idea, we can think about what programming language would work best for this, and what kind of community do I want to build?

Questions and Answers

Schuster: It seems that ahead-of-time compilation, or having static binaries is one of the big selling points for languages like Go or Rust. Even Java nowadays has ahead-of-time compilation. Is that going to be essential for all future languages that come along?

Cormack: Yes, it's interesting why it matters, and then what for? I think the comment was around serverless. Serverless, really, startup time is incredibly important, and it becomes one of the constraints because you're there and you've got to do things, and you get people who work around it by trying to snapshot things after startup. Interestingly enough, Emacs even used to do that, as an editor. Emacs used to snapshot itself after startup because the startup was too slow. It does depend on what that period is, and how to work around it. Emacs no longer does, because computers were fast enough, it wasn't an issue. It does depend exactly what those constraints are. Ahead of time has those big advantages. The user experience is worse. In theory, with the JavaScript model, you can start running the code slowly with an interpreter, and maybe it doesn't need to be fast, and you only compile it if it's really going to be used. Static compilation is just not worth it for those kinds of applications where most applications are so small. Even like an interpreter is fine. I think there are compromises, but I think we're seeing a lot of spaces where ahead of time is working better.

We've gone back to that, because it's how languages originally were from the '70s. It was Java that moved away from that, but JavaScript followed there. There was a huge investment in these JIT technologies. Then we are seeing a little bit of a swing back to ahead of time. There is always the theory that JIT and profiles based on actual execution can be faster. In general, that's mostly been true for dynamically typed languages where you can work out what the types are. I think the ahead of time thing has gone with a revival in static typing and the shift back to let's fix these bugs at compile time, because it's annoying to fix them at runtime as well. I think that combination of static typing and ahead of time, definitely we've swung back that way again, for some of those reasons.

Schuster: It's also important for serverless, especially because there you don't want to pay essentially for the compiler to do some work if you can do it ahead of time.

Cormack: As serverless has had billing with smaller intervals, that becomes more important, and as we want to do really lightweight things in serverless. Small code size also becomes important for those things. It's very much the case with WebAssembly, where, again, Rust has become a popular language to compile to WebAssembly, because it compiles to a small static binary without a runtime. I remember talking to Cloudflare about the hoops they were having to go through with Go in WebAssembly, because compiling the language runtime to WebAssembly was a few megabytes of overhead. Again, they were really space constrained by how quickly they could load code into a machine. A megabyte of code is much quicker than 100 megabytes of code just to load up and how much concurrency you can get, and those types of issues as well. Those kind of constraints are related as well. I think that's what a lot of the discussion about edge use cases, and Derek's conversation about TinyGo, and MicroPython, and things like that, where they're really designed for really small runtimes. That gives you advantages if you want to run them for very short periods of time, or a lot of things at once, and those kinds of things. Memory consumption is one of the big constraints for how many customers can you run at once, is take your memory consumption divide it by the size of the application. That basically gives you the amount of the things you can multiplex onto a CPU at one time. As serverless and those things started to get into those constraints, those types of constraints start to matter a lot too.
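The density arithmetic mentioned above can be made concrete with a small sketch. The host size and the per-instance footprints below are hypothetical numbers chosen for illustration, not figures from the talk, and the function name is an assumption.

```go
package main

import "fmt"

const MiB = 1 << 20

// instancesPerHost returns how many instances fit in hostBytes of
// memory if each instance needs perInstanceBytes. Integer division
// rounds down, since a partial instance cannot run.
func instancesPerHost(hostBytes, perInstanceBytes uint64) uint64 {
	if perInstanceBytes == 0 {
		return 0
	}
	return hostBytes / perInstanceBytes
}

func main() {
	host := uint64(16 * 1024 * MiB) // hypothetical 16 GiB host

	fmt.Println(instancesPerHost(host, 1*MiB))   // 16384
	fmt.Println(instancesPerHost(host, 100*MiB)) // 163
}
```

Under these assumptions, shrinking the per-instance footprint from 100 MiB to 1 MiB raises the number of instances one host can hold by roughly a factor of 100, which is why small runtimes matter for multiplexing many customers.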

Schuster: We just heard a talk about how Shopify is using WebAssembly to allow people to extend their platform, and Rust fits that quite nicely. You can pack a lot of code into a small space, because it's naturally isolated. It doesn't need containers or virtual machines to isolate the code from others. It's also an interesting trend in what WebAssembly allows here.

Cormack: I think isolation is a technology that has always been important. It just has different shapes over time and different kinds of sizes. We started with virtual machines, and containers were smaller and more convenient. We're now looking at things like, can I isolate parts of a single application? Because I don't trust them, or I don't want to audit the code in them. Google has a rule that every untrusted bit of code has to have two isolation layers between that and their code, for example. Those two isolation layers could be different things in different cases, but one isolation layer could be broken, while breaking two is much more difficult. Yes, if you're on Shopify and you're embedding customer code in your code, then that's something you have to isolate, and you want isolation layers with that. Those might be the Wasm runtime and some Linux kernel process isolation. For example, you run the runtime in a separate process, or they might have a couple of other ways of doing it. The more we make our applications out of sets of code with different trust levels, the more we need isolation at lots of different scales. Everything from VMs for cloud tenants, down to containers for, "I'm running six applications at once, and I don't want them to interfere with each other," to, "I'm running a library that the customer provided and I haven't audited it," or, "I'm running a library that I got from npm and I don't trust it." Can I run that in isolation as well?

Schuster: There was some interesting work with capability based isolation inside of JavaScript Engines.

Cormack: Kate Sills did a talk a while back at QCon that was really good.

Schuster: What happened to that? It had one of those ungoogleable names that's hard to keep up with it. It was supposed to come in one of the ECMAScripts. Something to check up on, I think.

I have not heard of this, but it occurs to me that a cloud provider could provide a JVM Platform as a Service in a serverless manner.

Cormack: I think there were some back in the day. The JVM was like the first language runtime that was designed for secure isolation. The security isolation wasn't actually very good, in the end. It was broken a lot of times after it was initially released mainly through security issues in the standard library and so on. It was an experiment in isolation. You can see Wasm as being a more secure version of the JVM. There have been a lot of lessons learned in the last 20 years, or is it even longer, since JVM. A lot of lessons learned on how to build secure isolation for language runtimes. Wasm really is the state of the art that came out of the browser as the most attacked piece of software we ever built. There was a lot of work, particularly from the Google team around Chrome and those kinds of layers of isolation there that taught us how to do that better. I think back in the day, people did have that idea. The JVM runtime wasn't quite designed for that, and wasn't quite secure enough. It's very much in the same line of forms of isolation that we've worked on over the years.

Schuster: The advantage that Wasm has is that it just doesn't stuff as many features into the standard runtime, because with Java, you can import data file formats and stuff like that.

Cormack: The type system is reflected inside, which turned out to be quite complicated. Whereas Wasm has very simple linear arrays, and again, the language has to compile down the type system it wants on top of that, so it's even simpler. WebAssembly is almost recognizably an assembly language, apart from having better looping constructs, but it feels more like a machine level thing than a language level thing. That, again, makes it easier because it's simpler.

Schuster: I found it was quite fun trying to write code in it with the text format, which is a Lispy type of thing. It's much easier than assembly.

Cormack: Yes. It is fun. It's not perhaps designed for that. When I was younger, I wrote PostScript, which was like that too. It was Forth based and it felt amazing to be able to program a printer.

Schuster: Did you hack the printer? Any stack overflows in there, any recursion overflows?

Cormack: Yes. You got them all the time.




Recorded at:

Aug 07, 2022