Amir Chaudhry on Unikernels, MirageOS, HalVM, Rump Kernels


1. We are here at Code Mesh 2015 in London. I am sitting here with Amir Chaudhry. So Amir, who are you?

My name is Amir, I am one of the contributors to MirageOS which is a project working towards Unikernels, based in Cambridge in England.

   

2. So Unikernels, I think that’s a term our audience might not be familiar with. Can you give an explanation?

Unikernels are single address space machine images that are compiled from a set of modular libraries, and they are compiled along with application code down to a single machine image which is then deployable onto the cloud or onto embedded devices. One way of thinking about it is that you build your own bespoke operating system, for the specific needs of your application, and then that becomes the thing that you deploy. So it’s much smaller than a traditional operating system, the end result is much more secure because the attack surface is smaller, and it only has the components that you actually need in it.

   

3. You say you can deploy to the cloud or to embedded devices, can it also deploy to bare metal?

Yes, there are ways to deploy to bare metal, the thing that we talk about there is this coming wave of IoT and devices all around us, so we can also take the Unikernel approach and deploy things onto bare metal and in our particular cases to little devices called CubieBoards which are ARM-based devices and we can do that as bare metal.

In the case of Mirage, MirageOS is the project that I am most involved with, I am also interested in others, we can talk about that a bit later, but with MirageOS we target primarily Xen, and Xen is a hypervisor, so essentially it takes a giant machine that you find in the cloud and makes it available, provides isolation guarantees. And we target Xen because it’s a stable platform, we don’t have to worry about drivers, because that lives somewhere else in the tool stack. So we target a stable platform which is Xen and wherever Xen works, MirageOS Unikernels will work.

   

4. Xen is the technology that basically powers Amazon EC2 and others, can I just go to EC2 and deploy MirageOS? A MirageOS compiled binary?

You can, it’s a little bit convoluted at the moment, so essentially you have to build your own AMI image, so people have done this, it is possible to take a Mirage Unikernel and then deploy it onto one of the existing cloud providers. If you have your own machines which is in a colo somewhere else, you can deploy to that, if you can run Xen on that machine, so it is entirely possible, we need to make that story a little bit easier for people to do. One of the things that we are working on now is the deployment step for Mirage Unikernels.

Werner: That’s something that’s definitely coming.

It’s coming and it’s important. Because it’s really fun when you build one of these things on your local machine and then you want to deploy it somewhere else so it’s live, the deployment process is important.

Werner: You mentioned that MirageOS is – if you write an application with MirageOS, you write it in OCaml.

Yes, I should have mentioned that. So, the Unikernel projects that are out there, they tend to take different approaches, they have different things they want to optimize for. In the case of MirageOS we are using a language called OCaml and that meant that all of the protocols had to be rewritten as libraries using OCaml to get type safety and essentially to rewrite everything. And all of those are written in OCaml so your application code is also OCaml and you link against those libraries, compile everything down, it’s all statically linked and then compiled down to produce the static machine image. So a lot of work has gone into rewriting the necessary libraries. I believe that we now have just over a hundred of our own libraries that we have been involved in over the years, there are even more libraries available through the OPAM package manager, so the OCaml ecosystem is also growing. I would like to think partially because more people become interested in MirageOS and so those libraries are also available to people and they weren’t written with MirageOS in mind, it’s just that they are available, they’re written in pure OCaml, they are usable as Unikernels.

Werner: Everything that is pure OCaml is available for Mirage.

Exactly and OPAM now has about a thousand unique packages.

   

5. You have already mentioned other Unikernels I suppose, other Unikernel approaches, how do they differ from MirageOS?

The different projects tend to take different views on the things they want to optimize for. So in MirageOS it was about clean slate thinking, how do we do things from scratch and what protocols do we need and essentially doing the work of rewriting those. Another one that I am getting more interested in and I want to play with it more is Rump Kernels. Rump Kernels provide something called rumprun, rumprun Unikernels are something you can build there, and essentially they have taken NetBSD and broken it up into separate parts, so that you can take your application code, use the drivers and the protocol libraries that are in there, and then link them to produce your Unikernel. The approach there is essentially answering the question of how do we use what we already have and achieve the same or similar benefits to Unikernels done as a clean slate?

Werner: But you don’t get the type safety aspect there.

You don’t get the type safety aspect but you do get the really important property of being able to use all of the existing code that is out there in the world, which is a really really important thing to look at. For example I know that you can run MySQL as a Rumprun Unikernel, so that means that your MySQL database can run now as a Unikernel on bare metal or up in the cloud somewhere. And that’s quite profound and the interesting thing is that there shouldn’t need to be any modifications made to that code, you don’t need to change anything in MySQL to make it work this way.

Werner: As long as it runs on NetBSD, I guess.

You don’t get the benefits of type safety or the additional security things that things like MirageOS might provide but you do get to use the legacy software and still benefit from the other aspects of Unikernels.

   

6. With Rump Kernels you use – what do you leave out with Rump kernels, what parts of NetBSD? You use the drivers I believe and the interface, what do you drop from NetBSD?

The bits that are not necessary for you to link against, for example you would need the networking stack, most likely, you may not need the other drivers that live in NetBSD. So you don’t pull them in. You only pull in the bits that are necessary for your application. So HalVM is written using Haskell, and that’s from Galois in the US.

   

7. Does HalVM use the same approach as MirageOS? Or are they after different aspects or do they optimize for different things?

You still get the type safety guarantees with HalVM, with Haskell specifically, and I think they have done additional work in terms of whole system optimization, this is something that I would go and look in the project and see exactly how much work they have done. But I’ve heard through various people that it’s possible to do a greater degree of optimization with things you’re producing as Unikernels so that you can remove more of the code that is effectively not touched. And that’s a very interesting property, it’s a very useful thing and eventually when we start to think about optimization of the MirageOS libraries it’s something that we would try to do there as well.

Werner: So that you can shake out more stuff that you don’t need?

Yes.
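
The idea of shaking out code that is never touched can be sketched as a reachability pass over a dependency graph. This is only a minimal illustration in Python, not how HalVM, MirageOS or Rump Kernels actually do it; the library names and the `reachable` helper are hypothetical.

```python
from collections import deque

# Hypothetical dependency graph: each library maps to the libraries it needs.
DEPENDENCIES = {
    "app": ["http", "logging"],
    "http": ["tcpip"],
    "tcpip": ["netdriver"],
    "logging": [],
    "netdriver": [],
    "filesystem": ["blockdriver"],  # never referenced by the app
    "blockdriver": [],              # only needed by the filesystem
}


def reachable(root, graph):
    """Return the set of libraries reachable from root (breadth-first)."""
    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for dep in graph.get(current, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


# Only what the application transitively needs ends up in the image.
linked = reachable("app", DEPENDENCIES)
dropped = set(DEPENDENCIES) - linked

print("linked: ", sorted(linked))
print("dropped:", sorted(dropped))
```

In this toy graph the filesystem and block driver are left out because nothing the application uses depends on them. A real toolchain would likely work at a finer grain (functions rather than whole libraries), which is the greater degree of optimization mentioned above.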

   

8. What’s the concurrency story with OCaml and MirageOS, are they single threaded, can they support multiple cores? What’s the current story?

The current story is all single threaded, you essentially have that one thread and you compile your Unikernel to be that way. So if you have things that need to run multiply, one way of doing that is to create multiple Unikernels.

Werner: So you basically have a coarse grained concurrency story. Your machines are very lightweight so that’s not much of a problem I suppose.

Yes, and they are usually running a single thing in a Unikernel anyway.

Werner: So you’d base it on a swarm of Unikernels.

That would be the idea, yes.

   

9. Do you have a name for that yet, like a herd of Unikernels, gaggle of Unikernels?

No, we should come up with one. One of the questions around that is how do we make all those Unikernels talk to each other. That’s another interesting thing that we are looking at so if they are all living on one host, are there other ways that we can make them all pass messages and information between each other? But we could use the networking stack. Another library that we’ve worked on is Vchan, which can help Unikernels on the same host communicate with each other.

   

10. We’ve already talked about a few of the implementations, Rump Kernels, HalVM, are there others that you’d like to mention? Or is there a place to find them?

Off the top of my head I think there are about seven implementation approaches to building Unikernels right now. There’s a Wikipedia page now, which is great, which means we are legitimate, and off the top of my head I know of MirageOS which is the one that I am involved in, Rump Kernels which is the next one I am going to start playing with, HalVM, ClickOS, Ling which is built with Erlang, and OSv is another one. There is definitely a growing list of these projects, IncludeOS is one I’ve heard of recently which I think is C++, Runtime JS which is JavaScript, that’s another one I’ve forgotten. So the ideas behind this approach I think are becoming more prevalent and I’m starting to get a sense of inevitability from people in the community, and not just the Unikernel community but the community in general, because of the idea of being able to take the application code you have, pull in only the extra bits that are necessary for just making it run, and deploy it somewhere else and forget about it, essentially, not having to reach in and then do an update on the live machine. Those ideas just seem to be the direction everything is going in, so when you think about what’s happening with containers, or what’s happening with things like AWS Lambda for example: I just have this code I want to run, just make it run, all the stuff underneath I don’t necessarily have to care about as a developer, just take my code, run it and then turn it off again when it is not being used.

Werner: It’s the way it should be.

That stuff exists but you don’t need to necessarily expose any of that, to the person who is creating the code in the first place. Someone needs to solve those problems definitely and Unikernels is one approach to solving those, which is why I think that this is inevitable, that this is how we’ll write code in the future.

Werner: As you mentioned the trend to Docker shows that there is definitely interest in the industry for exactly this, for specializing, componentizing and so on.

Yes and the term I've heard used is immutable infrastructure, which is essentially the idea that you make this thing, it’s an artifact, you put the artifact out there and it does its job, and when it’s time to change something you just swap it out for the next version. Functional infrastructure is another term I’ve used to describe a similar thing. I think that’s just the way the world is now going. And I think that’s going to be really interesting when we get to all the devices that are going to be around us, the embedded devices, the IoT. You want to be able to have a development process that allows you to deploy something up on to a cloud instance, say EC2, but then also deploy that to a device that is living in someone’s living room, half way around the world, and there may be millions of those. So having a deployment process that works across both of those, and essentially abstracts that complexity away from the developer, that’s going to be important and obviously I believe Unikernels are going to be one of the ways we are getting there.
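
The swap-it-out model of immutable infrastructure described here can be sketched in a few lines. This is a hedged illustration in Python only; the `Image` and `Host` names are hypothetical and do not correspond to any MirageOS, Xen or EC2 API.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Image:
    """A built artifact. Frozen, so it cannot be changed after it is made."""
    name: str
    version: int


class Host:
    """Hypothetical host that runs exactly one image at a time."""

    def __init__(self):
        self.running = None

    def deploy(self, image):
        # An update never modifies the running image; it replaces it.
        previous = self.running
        self.running = image
        return previous


host = Host()
host.deploy(Image("web", 1))
retired = host.deploy(Image("web", 2))  # swap version 1 out for version 2

try:
    host.running.version = 3  # reaching in to patch the live artifact
except FrozenInstanceError:
    print("images are immutable; build and deploy a new one instead")
```

The design choice being illustrated is that the only way to change what is running is to build a new artifact and deploy it, so the same process can in principle target a cloud instance or a device in someone's living room.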

   

11. Definitely. So talking about this, how can people find out more about MirageOS and would you like contributors?

We’d love contributors. You can find out more about MirageOS via Mirage.io, we have a lot of blog content out there, a lot of tutorials, we also have a bunch of videos which we are trying to push as well. I put together a set of pioneer projects, which is what we call them, so if you search online for Mirage OS pioneer projects that should take you to a list of curated projects with mentors, so if you are trying to get your feet wet, you can essentially pick one of those, directly contact a mentor and then start working on something that will ultimately be used by the community. But I think it is also important to say that contributing to any of the Unikernel projects is definitely a good thing to be doing. If you are interested in things like supporting the legacy stuff, Rump Kernels is probably a good place to go and look. If we can get more people interested in Unikernels in general and playing around with them, we are going to get more mindshare, so helping raise the tide for this entire ecosystem is going to be really really important.

Werner: That's a good point to end on, we’ll all check out MirageOS and the Wikipedia page with other Unikernels, and thank you Amir.

Thank you.

Jan 29, 2016
