Anne Currie Discusses Microscaling, Following in the Footsteps of Google, and the Future of Containers
Bio Anne Currie has been an engineer for over 20 years, devising and developing products as well as running the tech team for one of Europe's earliest pure-play eRetailers. She is a co-founder of Force12.io, an open source project for real time container scaling.
Software is Changing the World. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.
Daniel's full question: Welcome everyone. I am here at QCon London with Anne Currie, cofounder of Force 12 (now re-branded to Microscaling Systems). Anne, you should be talking later on about “Containers change everything”. Quite a big title there! A lot of promise, hopefully. You and I obviously chat about all things container. One thing you mentioned before is that containers are quite a game changer. Could you talk a little bit more about that? Do you think that as the years have gone by, the last say three years, that the actual notion of the ‘game changing’ thing itself has changed?
That is an interesting question. So how do I think containers are game changers? I think that they really change the way that we will be thinking about architecting and building our infrastructures in the future, and I will be talking a little bit about that later. But if you look at companies like Google or Netflix, who are vastly ahead of the curve with their infrastructures, they use containers to get vastly better server utilization, a better server density than the rest of us get: not quite orders of magnitude, but maybe 5, 6, 7 times as good. And they also have self-healing systems. They have systems that are cheaper to run and much easier to maintain, with fewer people constantly having to watch and tweak the machines: more self-healing, more self-managing systems. And you can do that with containers, which is quite interesting because containers themselves are really simple concepts. They are nowhere near as complicated as VMs, they are nowhere near as powerful as VMs in many ways, and VMs have completely changed the world. We would not have the Cloud if it wasn’t for VMs. But containers change things in a different way, almost by making it simpler again. They are very lightweight compared to VMs, they instantiate in seconds as opposed to minutes, and the introduction of Docker into the world has meant that there is a very common format for using containers and managing containers in your infrastructure, which means that loads of extraordinarily clever orchestrators have grown up around containers and around that common Docker format. That has really meant that we have loads of options for how we architect and run applications and how we build our infrastructure, particularly microservices – that is a terrible buzzword – microservices-based architectures. You can get enormous reductions in costs, as well as fault tolerance, bin-packing and self-management.
It is a really interesting new area. A bit like infrastructure became so much more exciting when we had the Cloud, or when we had VMs and the Cloud. Again, infrastructure is so much more exciting now that we have containers, I think.
2. It is a very interesting comment, Anne. As you were talking, I was thinking some comment I have heard with containers is that they are quite operationally complex. Or not necessarily the containers, as you have touched on, but more like the scheduling, the orchestration. What are your thoughts around that?
Yes. It is interesting because containers themselves are incredibly simple. You are basically just saying: “Here is a bunch of processes and they can do this, but they cannot do this.” That is it. That is all a container is really. It is a bit of kernel management of processes. So they are phenomenally simple compared to VMs. But it is their very simple nature that means that you can start to do incredibly complicated things with them. It is kind of the Lego block: you can do incredibly complicated things with it. If you look at companies like Google, they are doing incredibly complicated things and actually, we all need to catch up and pick up some of their complexity, but not all of it. How can we get some of the benefits that they are getting without taking on the crazy complexity that they have internally? Now, they are already doing lots of that de-complexification: they are simplifying what they have got internally and making it a little easier to use in Kubernetes. But there is a lot that they are not pulling out of their systems yet because it is too complicated, and I would really like to see some of that simplified and pulled out. Which is actually what we are working on.
Daniel: I figure that is the perfect lead Anne, to explain what you are working on. That was my next question.
One of the things that Google does is that they basically use the speed of instantiation and the speed of shutdown of containers, and an incredibly ‘microserviced’ architecture. What I mean by that is that they have lots of different components that are somewhat interchangeable, somewhat decoupled. So if one of them just goes away, the rest of the system works, and then something can come in and the rest of the system still works. They are not that horribly inter-dependent. They are not a giant monolith. What you can start to do when you have that is you can say “Right now, that type of service is really important to me and this type of service is not important to me.” So I can make decisions about how I am using my fixed infrastructure on a moment-by-moment basis, on a real-time basis. That is possible with containers, which are so fast, whereas with VMs you are talking about minutes and you cannot do something as clever as that. So it is clever and it is complicated, but it is good. We all need to be doing it in the future, I think.
Absolutely. Yes. I mean if all of this was about how to manage a million machines, then I am not that interested. I mean I have lots of problems in life, and managing a million machines is not one of them. So there has to be a way of mapping what they are doing back to a simpler world. Take the world of e-commerce – my background is in e-commerce. It is very difficult to predict traffic, so think of anything where you can say: “I have a web server, and it is very important to me that its traffic is satisfied, that people can come to my site and buy goods. But I have other things running that might be processing orders in the warehouse. They are important, but they are not urgent for me.” The customers in the demand peak that I am experiencing on my website are not going to be disappointed by orders taking longer to process in the warehouse, but they will be massively disappointed if the website does not service their request. So in any situation where you have something that is demand-dependent and something that is less latency-sensitive, you can start to say: “What if I just shelve that a little bit, or do that later, a little bit more slowly, so I have more capacity for the thing that is actually demand-dependent?”
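The trade-off described in this answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Force12's or Google's implementation: the function name `allocate_slots`, the idea of a fixed pool of container "slots", and the `min_batch` parameter are all hypothetical. The demand-driven service (the web server) is served first, and the deferrable work (warehouse order processing) takes whatever capacity is left over.

```python
def allocate_slots(total_slots, web_demand, min_batch=0):
    """Split a fixed pool of container slots between two kinds of work.

    Hypothetical illustration: 'web' is the demand-driven, latency-sensitive
    service; 'batch' is important but not urgent and can be slowed down.
    """
    if total_slots < 0 or web_demand < 0 or min_batch < 0:
        raise ValueError("slot counts must be non-negative")

    # Optionally keep a small floor for batch work so it never starves
    # completely (an assumption; the interview does not specify this).
    reserved_for_batch = min(min_batch, total_slots)

    # Serve the urgent, demand-driven service first, up to what is available.
    web = min(web_demand, total_slots - reserved_for_batch)

    # Everything left over goes to the deferrable work.
    batch = total_slots - web
    return {"web": web, "batch": batch}


if __name__ == "__main__":
    # Quiet period: the batch work gets most of the machines.
    print(allocate_slots(total_slots=10, web_demand=3))
    # Traffic peak: batch work is shelved until demand drops again.
    print(allocate_slots(total_slots=10, web_demand=15))
```

Because containers start and stop in seconds, a loop could in principle re-run a calculation like this every few seconds against live traffic figures, which is the "moment-by-moment" decision the answer refers to.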
Daniel's full question: Just hearing what you were saying there, do you think there is any relation to data science or data processing? Because there is obviously the batch job you are talking about - we are seeing a lot of things like Spark or Storm - and even the Lambda architecture, where you have this notion of speed versus batch. Do you think there is any relation to what you described there or is it separate?
I think there are lots of similarities. This whole idea of having something that I really need to run now. I need it to run. When I need it to run, it must run. And there is something that needs to run for me at some point, but maybe not within the next 24 hours, or within the next week, or within the next couple of days. Actually, having your systems be able to differentiate between those and prioritize between what is going on, that is something that is possible with containers and is not possible with VMs, as lovely as VMs are.
Daniel's full question: I am thinking very much that scheduling is the hard problem. I am moving nicely on to the next question. When we were chatting offline, you mentioned scheduling currently being sometimes too high in abstraction or too low. There are a lot of interesting things like HashiCorp Nomad, Kubernetes, and Mesos and Marathon. I don’t think we have hit the sweet spot quite yet. What are your thoughts on that?
Well, I actually think that the way the scheduler ecosystem is developing is very good because they layer effectively. Things are being layered on top of schedulers. At the moment, most of the schedulers are very much focused on infrastructure decisions, decisions that you can make at quite a low level, like fault tolerance: “OK. That machine has gone down. I know that there is something on that machine that I have to have another copy of – I am going to start it.” So it is an infrastructure decision. It was driven by the machine falling over. Or the bin packing – again, it is an infrastructure decision, which is: “I have this shaped machine, I have these things running on it, I can fit something in here and I can get really good server density, or I could stop that and move that over there” – again, it is all infrastructure level. I think that there will be a type of scheduler, an advanced scheduler, a business scheduler, which is what we are looking at. It will sit on top of that and actually it will not really be part of your infrastructure. It will let these clever infrastructure schedulers do their job, but it will start to feed in additional information like “There is a lot of traffic on the load balancers at the moment and I really need to service that traffic.” That is a kind of business decision which is independent of the infrastructure, and I am not sure it should move down. I think it should stay at that kind of higher level, almost the application level. I do not know if you have done any looking into Network Function Virtualization (NFV), because they are doing quite a lot of thinking along these lines as well with orchestrators: what goes into an orchestrator and what goes into a VNFM – a virtual network function manager, which is kind of a high-level scheduler – and at the moment it is almost split.
There are quite a lot of standards that have gone into that, and the scheduler intelligence is set at a different level from the level that we are doing it at in containers. But I actually think that the container level is right. I think it is the more architecturally correct way of doing things: the infrastructure decisions should be made with the infrastructure, and business decisions should be made with the applications.
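To make the "bin packing" infrastructure decision mentioned in this answer concrete, here is a minimal Python sketch of the classic first-fit-decreasing heuristic. It is an assumption chosen for illustration: the interview does not say which algorithm any particular scheduler uses, and real schedulers weigh several resources (CPU, memory, affinity) rather than the single hypothetical `size` used here.

```python
def first_fit_decreasing(container_sizes, machine_capacity):
    """Pack containers onto as few equal-sized machines as the heuristic finds.

    Hypothetical illustration: each container needs one number of resource
    units, and every machine offers `machine_capacity` units.
    Returns a list of machines, each a list of the container sizes placed on it.
    """
    machines = []  # containers placed on each machine
    free = []      # remaining capacity of each machine

    # Placing the largest containers first tends to give denser packing.
    for size in sorted(container_sizes, reverse=True):
        if size > machine_capacity:
            raise ValueError(f"container of size {size} fits on no machine")

        # Put the container on the first machine with enough room.
        for i, remaining in enumerate(free):
            if size <= remaining:
                machines[i].append(size)
                free[i] -= size
                break
        else:
            # No existing machine had room, so start using a new one.
            machines.append([size])
            free.append(machine_capacity - size)

    return machines


if __name__ == "__main__":
    print(first_fit_decreasing([4, 8, 1, 4, 2, 1], machine_capacity=10))
```

A business-level scheduler of the kind described above would not replace logic like this. It would only change which containers are requested (for example, more web servers during a traffic peak) and leave the placement to the infrastructure scheduler.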
Daniel: That is very interesting. I have not heard of that, and this is something I am definitely going to take away. Yesterday, Adrian Colyer presented the opening keynote at QCon London, and he talked about the benefits of reading papers, and looking outside of your domain. What you have said there, I think, is that this is something you are doing - you are looking perhaps at stuff that is not revolutionary, or that has already been done elsewhere, but you are lifting it into a new domain because it is very interesting. We get blinkered, I think, as an industry, and don’t always look at what has been done already.
Yes. Well, actually, NFV is very new as well. They are both almost co-developing. They are very similar, but there are some slight differences. It will be very interesting to see what works better. It is a slightly different environment, but the network function virtualization is merging into software. So it will be interesting to see.
6. You mentioned about things being new, and have we reached the peak point of the [Gartner] hype curve? You know we have talked about this offline. What do you think? Do you think there is higher up to go? Is there more inflated expectations, or are we now on the gentle slope down into the trough of disillusionment? What do you think?
I think there are an awful lot of things that containers can do that most people do not even know they can do yet. So I do not think it should have actually reached peak hyperbole – if you can say that, if it is not terribly mixing metaphors. I do not think we have reached that yet. I think there is an awful lot that could be done, more than we are realizing. I would be sad if we started to move away from it now, because I think that if you look at Netflix, if you look at Google, if you look at these players who invested a lot in containerizing their infrastructure, in containerizing their architecture, they are getting huge benefits from that. If you read anything about economics, like if you read the excellent “Capital in the 21st Century” last year, a lot of economic growth is where one individual, usually a nation, gets ahead and then everybody else looks at it and says “Oh, I could do that!” and catches up, a bit like Europe versus the US after World War II. It feels like people like Google are those people who have got ahead, and the rest of us can look at it and go “Is there something I can learn from that? Is there something I can cherry-pick to get the advantages that they are getting and catch up?” I think there will be a lot of growth in efficiency that comes with catching up with what those guys have done and learned from, and are now talking about, which is good.
Daniel's full question: On a similar note, a lot of vendors are moving into the space now, as with a lot of things in IT where there is a sort of popularity or some success, people want to productize things. What do you think of that? I mean Docker obviously is a very strong brand in the container space, CoreOS with their Rocket coming out. What are your thoughts? Now and in the future - do you think that Docker is in for the long term? Or do you think that others will challenge it?
I have no doubt that Docker are in for the long term. Whether they will dominate forever to the extent that they currently do? Probably not. To me, Docker feels a little bit like – this is a strange analogy, that I could easily be beaten up for if I said it in public – Docker is effectively the killer platform for container technology. It wraps containers, it makes them easy to use, and it is also the killer application for containerization, which is a packaging and deployment tool and tool chain that fits in very well with continuous delivery. So people can go “Oh, yes. I can see what containers are good for and yes, I will get benefit from doing that tomorrow and I will do it.” So they are kind of the killer platform and the killer application. It slightly reminds me, and it is only because I am a very old person, of Windows and the personal PC in the 90s, where Windows was the killer platform for the personal PC and, say, Office was the killer application. And it actually made everybody go “Oh yes. I need a personal PC!” So you go from a piece of technology which is actually quite nerdy, where people go “What would I use that for?”, and then somebody comes and says “Use it for this!” and “Here, I will make it really easy for you.” The platform makes it possible to build an entire ecosystem, and the killer app makes people want to have the underlying personal PC – the technology. I think it is the same with Docker. By effectively building a really easy-to-use API around containerization, they have made it possible for a whole ecosystem to build up – I mean the ecosystem has gone bananas. I mean this is an ecosystem that is building vastly in advance of people actually using the things in production. I have never seen anything like it – but to a very high standard as well. What I am seeing is very good.
And we also have this killer application which is the deployment and packaging, which is really zeitgeisty because it fits very well with continuous delivery. Then suddenly everybody goes “I can see why we would use containerization” and then it is in and people are thinking about it and you can start to.
Actually, that is a question that I know less well. I have never done it myself. I understand it is quite straightforward, but every new technology is hard to get your head around. But that is a good reason for doing it, because I do think that containers are going to become more significant in production, not just for continuous delivery, not just as a packaging tool, but also as a way of getting all this clever orchestration. If you learn it in dev and test, as part of continuous delivery, then you will have the skills to really exploit it in its operational form.
Daniel: That makes a lot of sense. People learn when it is not important to some degree and then when rubber meets the road, in production it should be minimal learning, I guess, in terms of infrastructure – that totally makes sense.
Well, that is what we did for VMs. That is how we learned VMs. We used VMs in test primarily and the ops teams really learned how to control VMs there, before they took them into production.
10. That is a nice way of thinking, actually. It totally makes sense. So it is a bit of a curve ball now, as we move on from that one. What do you think about the current unikernel vs containers theme?
Well, they have just been bought by Docker, so I am really looking forward to Justin’s talk on unikernels this afternoon, which is after mine. I do think that there is a synergy. There is a great opportunity to use a unikernel approach to make containers even faster and more lightweight, which is what I really like about them and where I think the power will come from, but also to make them more secure. Of course, that is in terms of the infrastructure, not in terms of the packaging, and I think Docker has done absolutely the right thing by focusing on the packaging and deployment. In terms of speed of instantiation, unikernels running on really lightweight VMs would definitely be an alternative. But if Docker wins the day and everybody is thinking about containers, then why bring in the new technology? Why not just go with that? It is what Google do. It must be reasonably decent.
I will not necessarily rule that out.
Daniel: Thank you for your time, Anne. I really enjoyed the conversation there. Thank you so much.
I enjoyed talking to you too. Thank you very much.