The 10 Kubernetes Commandments

Summary

Bryan Liles and Carlos Amedee explore topics from booting Kubernetes clusters to running complex workloads as a list of 10 items. They share ideas that teams can employ to make working with Kubernetes less of a chore and more of a way of life. The session covers tips and hints ranging from bootstrapping clusters to managing custom workloads, and more.

Bio

Bryan Liles is a Staff Engineer at Heptio. Carlos Amedee is a senior software engineer at DigitalOcean. He is passionate about distributed systems, cloud architecture, container orchestration and metrics. Over the last 20 years, he has been in various roles including systems administration, systems engineer and web programming.

About the conference

Software is changing the world. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.

Transcript

Liles: The next set of speakers are Carlos [Amedee] and myself. And what we did here is a different take on a tech talk. I'm a huge hip-hop fan. Carlos knows, I've dragged him to a concert with me before. My favorite artist of all time, the Notorious B.I.G., died on March 9th, 1997. And he had a song that came out literally after he died called the Ten Crack Commandments, and I figured it would be against the code of conduct to rhyme about how to deal drugs at QCon. So we changed this to fit the whole containers, containers, containers type thing. And we have another iteration called the 10 Kube Commandments. So we're going to get this started.

I'm not going to read the slides because I feel like I know this: "I've been in this game for years. It's made me an animal. There's rules to this, beep. I wrote me a manual, a step-by-step booklet for you to get your game on track, not your wig pushed back." To 21-year-old Bryan, hearing those lines along with the music was quite profound. He had Chuck D with the "one, two, three, four, five, six, seven, eight, nine," the Ten Crack Commandments. And that's really stuck with me. So over the years, I've revisited this a couple of times, but today we're going to do something a little bit different. We're going to talk about this in a Kubernetes context.

So rewinding: we, Carlos and I, have been in this game for years. That in itself is admirable. There's rules to this biz. So we wrote y'all a manual, a step-by-step QCon talk for you to get your clusters on track, and not your releases pushed back. And that's how we're rolling with this talk. So I'm Bryan Liles. I am a staff engineer at a company called Heptio. According to my bio here, I have years and years of experience. I tweet @bryanl. At Heptio, and at whatever my future company is, I am in charge of Kubernetes developer experience. What does that mean? Well, it basically means that we spend a lot of time figuring out how to build our clusters, but we have not spent enough time figuring out how to get our developers to run software on these clusters. And that, combined with some of the talks we've had here today, shows that there are a lot of things we can think about. So moving on.

Amedee: Hi, I'm Carlos. I'm a senior software engineer at DigitalOcean. According to my slides, I've worked a lot in observability, cloud compute services and systems engineering. That's all nice, wonderful buzzwords. I worked on the initial product that DigitalOcean has for Kubernetes as a service. I'm no longer with that group; I'm back with my other team, which works on the control plane for DigitalOcean's cloud services, their compute stack. And I tweet @cagedmantis. It's a weird name, yes.

Rule Number One: to Go Fast, You Must Start Slow

Liles: Getting on to our talk. So in the original song, after the preamble, it starts with rule number uno: never let no one know how much dough you hold. Once again, not appropriate for a tech conference. So we've changed this. Rule number uno: to go fast, you must start slow. Different cadence. I'm a programmer and not a lyricist, so this is how we roll. So I'm changing this a little bit: to go fast, you must start deliberately. And let me talk about how we start with Kubernetes.

So right now, there are really three main ways people are running Kubernetes. They're running on public cloud, or they're running in a data center, or, like a lot of us, they're just running on their desktop. And when we go down into this, you have GKE and AKS and lots of vendors. They're not paying me to say who they are, so I am not saying them. And then in our data center, there's kubeadm, which I'm going to talk about a little more. And then once again, lots of vendors. And then on your desktop, there are three options. A popular one is Minikube. If you want to run Kubernetes right now, you can go download Minikube. Well, not so much on Windows, it's not great on Windows, but on Linux or Mac you can actually run this right now and you can run Kubernetes on your desktop. And if you're running something that can run snap, so for all you Ubuntu or maybe Debian users who use snap, there's this thing called MicroK8s. It allows you to snap install, and it just runs Kubernetes locally on your machine without a VM. It uses just Docker straight up and it does the right thing. And then you can also use Docker for Mac or for Windows.

But here's the problem: I gave you the easy part. You're probably like, "Bryan, we already know this," but here's the problem we have with Kubernetes and starting Kubernetes up. The whole thing about Kubernetes is we need to make it declarative. We don't write the YAML because we like YAML; we write the YAML because YAML allows us to make declarative statements to our API. So what I've been thinking about is that…actually what we, as in me and Heptio and me and the community, have been thinking about is that we really need to stop thinking about our vendors and our data center. We need to stop thinking about this. And the idea that we've come up with is something called the Cluster API, and this eye chart right here is demonstrating the actual future of Kubernetes.

And really, what it comes down to is that you no longer have to make specific Google calls, or specific EKS calls if you're using AWS, or AKS calls if you're using Azure. What you'll be able to do is talk to your Kubernetes API the way that you talk to anything else: send it a manifest that says, "I want a new cluster member," and that's declarative. And you can say, "I want this many cluster members." And what the Cluster API will do is actually extend your cluster. And the neat thing is that this can actually run anywhere. So now we don't have to think about where we're running our servers. We just know that we can talk Kubernetes, and if the Kubernetes cluster knows where it's running and the Cluster API knows where it's running, it can extend on-prem if you're using something like VMware, or, if you're in AWS and you spin up your own, it can actually boot up new instances to make your cluster larger.

Rule Number Two: Always Let Them Know Your Next Move

The next one is number two: always let them know your next move. This is quite the opposite of the lyric, but what is your next move? Well, in cluster land, it's all images and containers. So that's how we're going to think about this. And what I wanted to put up here is something that people don't think about: a lot of people just think that building images is docker build. And really, if we stop at docker build as the state of the art for building our images, we've missed so many things. I don't have a lot of time. We actually have eight more topics to talk about, so I can't go through all of these. But what we really need to think about is moving past docker build as the only way that we build.

So if you're on Google, look at their GCP Container Builder. It's declarative, and the neat thing about it is that it can build outside of your cluster; you don't need a docker build. And these other two items, Buildah and img, ask: what if we could build our images without needing root? If you think about this, and I'll talk about this a little bit later, we should not be running as root inside of our clusters. We should be able to build images, maybe on a cordoned-off piece of our cluster, but we shouldn't have root, because when we have root we have those problems.

Rule Number Three: Never Trust Nobody; Hook Up That Pod Security Policy

Next one, rule number three: never trust nobody; hook up that pod security policy. And this is a really simple rule right here. Kubernetes out of the box gives you something called the pod security policy. And what this rule does, what this actual manifest, this little piece of YAML does, is basically say how not to get robbed. And this is something that a lot of people don't think about. They create things. We say, "Oh, now I can create a deployment. I can create my service, I can create my ingress or my whatever, my whatever, my whatever. I can create these workloads." But here's the problem: these workloads are running inside of your cluster and they have the permissions of whatever. So what we can do with the pod security policy is say that there is no root inside of the containers running in my pod. Or we can do things like configure SELinux, or we can say that you can only touch this specific volume, so not everything in your cluster can see everything else. So next up, Carlos.
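A rough sketch of the kind of manifest described here, built as a Python dictionary so it can be inspected and dumped as JSON. The policy name `restricted-example` and the list of allowed volume types are illustrative assumptions; the field names follow the `policy/v1beta1` PodSecurityPolicy API that was current around the time of this talk and has since been removed from Kubernetes.

```python
import json


def restricted_pod_security_policy(name):
    """Build a PodSecurityPolicy manifest that forbids root and limits volumes."""
    return {
        "apiVersion": "policy/v1beta1",
        "kind": "PodSecurityPolicy",
        "metadata": {"name": name},
        "spec": {
            "privileged": False,
            "allowPrivilegeEscalation": False,
            # Containers may not run as UID 0.
            "runAsUser": {"rule": "MustRunAsNonRoot"},
            # These three rules are required fields; RunAsAny leaves them open.
            "seLinux": {"rule": "RunAsAny"},
            "supplementalGroups": {"rule": "RunAsAny"},
            "fsGroup": {"rule": "RunAsAny"},
            # Only these volume types may be mounted (illustrative choice).
            "volumes": ["configMap", "secret", "persistentVolumeClaim"],
        },
    }


policy = restricted_pod_security_policy("restricted-example")
print(json.dumps(policy, indent=2))
```

The printed JSON can be applied like any other manifest, since Kubernetes accepts JSON as well as YAML.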

Rule Number Four: Never Get High off What Kube Supplies

Amedee: I know you've heard this before. Never get high off what Kube supplies. The original lyrics mean something a little different, but we don't want to rely on the native resources, or we don't have to rely on the native resources that Kubernetes supplies. We have this wonderful, amazing thing called Custom Resource Definitions. So you can create your own resource, and by itself that doesn't really buy us much. It just gives us an endpoint and a place to store this custom resource. But when combined with a custom controller, that's when the magic starts happening. That's when you can really extend some of the capabilities of Kubernetes.
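To make "an endpoint and a place to store this custom resource" concrete, here is a minimal sketch that builds a CustomResourceDefinition manifest as a Python dictionary. The `DNSRecord` kind, the `example.com` group, and the `hostname` and `serviceName` fields are hypothetical names chosen for illustration; the structure follows the `apiextensions.k8s.io/v1` API, which is newer than the version available when this talk was given.

```python
import json


def custom_resource_definition(group, kind, plural, singular):
    """Build a namespaced CRD with a tiny OpenAPI schema for its spec."""
    return {
        "apiVersion": "apiextensions.k8s.io/v1",
        "kind": "CustomResourceDefinition",
        # The CRD name must be "<plural>.<group>".
        "metadata": {"name": f"{plural}.{group}"},
        "spec": {
            "group": group,
            "scope": "Namespaced",
            "names": {"kind": kind, "plural": plural, "singular": singular},
            "versions": [{
                "name": "v1alpha1",
                "served": True,
                "storage": True,
                "schema": {"openAPIV3Schema": {
                    "type": "object",
                    "properties": {"spec": {
                        "type": "object",
                        "properties": {
                            "hostname": {"type": "string"},     # hypothetical field
                            "serviceName": {"type": "string"},  # hypothetical field
                        },
                    }},
                }},
            }],
        },
    }


crd = custom_resource_definition("example.com", "DNSRecord", "dnsrecords", "dnsrecord")
print(json.dumps(crd, indent=2))
```

On its own this only teaches the API server to store `DNSRecord` objects; nothing acts on them until a controller watches for them.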

So as I was going through this, I started thinking, I was like, "Hey, what are some possible custom resource patterns that I've experienced, that I've seen, that I partake in every day." And the first one I was thinking of was this toy. I'll give an example of a toy little project that I'm working on. And I created a custom resource definition. I created a custom controller. And what that custom controller does, is it manages two separate resources, or it watches two separate resources and sort of modifies them. One of them is a native Kubernetes resource and the other one is for, again, DigitalOcean, like a DNS controller, right?

So I said I want to create an Ingress or a load balancer service, and I want this DNS address attached to whatever IP address the load balancer is tied to. The custom controller is waiting for events, for changes in the load balancer service. And whenever that changes, it reaches out to the DigitalOcean API and changes where that address is pointing, that A record or that CNAME. This was just a toy, but this is sort of a common pattern that I've seen, at least internally and in some open source projects. When I think of this, I also think of, although it doesn't necessarily follow this pattern, the Controller Manager for particular cloud implementations. This sort of ties in. It does similar things, but it doesn't have a one-to-one mapping.
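The pattern described here, watching a load balancer service and pointing a DNS record at its address, boils down to a reconcile step. The following is a minimal sketch under stated assumptions: `FakeDNS` is an in-memory stand-in, not the DigitalOcean API, and `reconcile` is a hypothetical function name. Only the `status.loadBalancer.ingress[].ip` path is taken from the real Service object.

```python
class FakeDNS:
    """In-memory stand-in for a DNS provider API (hypothetical)."""

    def __init__(self):
        self.records = {}

    def get_a_record(self, hostname):
        return self.records.get(hostname)

    def set_a_record(self, hostname, ip):
        self.records[hostname] = ip


def reconcile(hostname, service, dns):
    """Make the A record for hostname match the Service's load balancer IP.

    Returns True when a change was made, False when nothing needed doing.
    """
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress", [])
    if not ingress:
        return False  # no address assigned yet; wait for the next event
    desired_ip = ingress[0]["ip"]
    if dns.get_a_record(hostname) == desired_ip:
        return False  # already in sync
    dns.set_a_record(hostname, desired_ip)
    return True


dns = FakeDNS()
service = {"status": {"loadBalancer": {"ingress": [{"ip": "203.0.113.10"}]}}}
first = reconcile("app.example.com", service, dns)
second = reconcile("app.example.com", service, dns)
print(first, second, dns.records)
```

A real controller would call a function like this every time a watch event arrives for the Service. Because it compares desired and actual state before acting, running it twice is harmless.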

A new pattern that I've experienced is the operator, where you have a custom resource definition and you have a custom controller, and this was actually mentioned in the talk right before this one in the track. You create a custom controller that you consider an operator for objects that are incredibly complex and need special logic to handle events like upgrades or downgrades or migrations. That's something people are creating operators for. And that's definitely an interesting pattern that I've seen recently.

So this one is near and dear to my heart. Basically, you have a custom resource definition that you create, and you have a custom controller. And the custom controller is actually modifying and keeping state for a lot of different native resource types. At DigitalOcean, we have a platform that all engineers deploy to. We don't have straight access to Kubernetes. We have access to a layer on top of Kubernetes. So when I deploy software, I don't have to know about every single component that's associated with that piece of software or with that container. I get a replica, I get an ingress, I get all these things, but I have a relatively small YAML or JSON file that I deploy in order to make that happen, right? I don't have to worry about the security; that's taken care of by this custom controller. There are a lot of defaults, because the typical user of your system, or the typical person deploying to a system in your company, doesn't necessarily need every fundamental building block to deploy their software. And that's the pattern that's near and dear to my heart.

I've also heard of certain companies that already have preexisting deployment platforms that they've built over the years. And one of the ways that they've migrated over to using Kubernetes instead of their custom work is by following this sort of pattern, where they create some sort of custom resource that is similar to, exactly the same as, or can be translated from their existing format by some sort of service, and then run it through a custom controller, which makes it behave in a manner similar to their original deployment platform.

Rule Number Five: Communicating with Pods; Never Mix Internal and External Traffic

Rule number five: communicating with pods. Never mix internal and external traffic. Again, very different from the original lyrics, but one of the most powerful things about Kubernetes is how it manages the network. For some time, I worked on how we manage the network for customer Kubernetes instances at DigitalOcean. And one day they sat me down and said, "Hey, go and deploy this." Or, "Go and figure out how this works." And essentially, I pulled my hair out and I was confused and none of this made sense. I was like, "What is a node IP? What is a Cluster IP? What is a load balancer? What is an Ingress?" It was very confusing to me, not just because there were a lot of new terms that I was learning that didn't necessarily have a one-to-one mapping with things that already exist, but because there is a distinct difference in what these things are when you are on-prem, running on your own hardware, versus when you're running on a service like GKE or some other Kubernetes provider.

So the first one that I want to discuss or sort of mention is the ClusterIP. ClusterIP is a service type. It does not expose your service externally. What it does is provide an IP for you to communicate with your service internally, from inside your cluster. You can use kubectl, or whatever pronunciation you prefer for kubectl, to proxy in, get some sort of view into, and communicate with your running containers. This doesn't really buy us anything in terms of ingress. There's NodePort, where you create another service with a port that is exposed on your nodes (that's what the gray boxes are). Anytime you access that port on a node, it proxies to the appropriate container, the one selected by the service's labels.

This is not ideal, because do you really want to point a DNS address at every node in your fleet? If you lose a node, then you have a problem. You have to update your DNS address. Actually, that's happened with bare metal and with deployed instances. There's also the LoadBalancer, and this is where I got confused as heck, right? You can create a load balancer service and it's great. It operates at L4: you have a port, you have a protocol type, and you have a service somewhere or a node that is exposing this. That is if it's completely native, but what all the cloud providers have done is create some sort of load balancer service of their own. If you lose one box, you're not losing your connection to that particular load balancer. If you create this load balancer service, you have a public IP address that talks to your private network, which then connects to your nodes. It can do health checks and automatically know when a node is down, a service has been moved, or you've moved your pods to another node.
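The three service types discussed so far differ by a single field in the manifest. As a minimal sketch, assuming a hypothetical app labeled `app: web` that listens on port 8080, the helper below builds all three; the service names and the node port 30080 are illustrative, while the field names come from the core `v1` Service API.

```python
import json


def service(name, app, service_type, port, target_port, node_port=None):
    """Build a Service manifest that selects pods labeled app=<app>."""
    port_spec = {"protocol": "TCP", "port": port, "targetPort": target_port}
    if node_port is not None:
        # Only meaningful for NodePort and LoadBalancer services.
        port_spec["nodePort"] = node_port
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": service_type,
            "selector": {"app": app},
            "ports": [port_spec],
        },
    }


# Reachable only from inside the cluster.
internal = service("web-internal", "web", "ClusterIP", 80, 8080)
# Additionally exposed on port 30080 of every node.
on_nodes = service("web-nodeport", "web", "NodePort", 80, 8080, node_port=30080)
# Asks the cloud provider for an external L4 load balancer.
external = service("web-public", "web", "LoadBalancer", 80, 8080)

for manifest in (internal, on_nodes, external):
    print(json.dumps(manifest))
```

All three select the same pods; what changes is only how far outside the cluster the service is reachable.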

Ingress is the really interesting one. Ingress operates at L7, which is way more powerful for your typical use cases. So you can create rules attached to different services, and you can set a port for them. You can do matching on the path name of the URL. Amazingly, these days your ingress controller can even automatically generate certs for you. That's something that's sort of been provided. There are a lot of controllers, like the Google Cloud controller, Nginx, and the HAProxy controller, and there's Contour by my wonderful friends at Heptio, which controls Envoy. And it's kind of amazing because with Envoy, if you make changes to the configuration, you don't necessarily have to restart the entire service, and that's one of the big buys.
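As an illustration of the host and path rules mentioned here, this sketch builds an Ingress manifest that routes two URL prefixes on one host to two different services. The host `shop.example.com` and the service names `api` and `frontend` are hypothetical; the structure follows the `networking.k8s.io/v1` Ingress API, which is newer than the beta version in use at the time of the talk.

```python
import json


def ingress(name, host, routes):
    """Build an Ingress; routes maps a path prefix to a (service, port) pair."""
    paths = [
        {
            "path": path,
            "pathType": "Prefix",
            "backend": {"service": {"name": svc, "port": {"number": port}}},
        }
        for path, (svc, port) in routes.items()
    ]
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {"rules": [{"host": host, "http": {"paths": paths}}]},
    }


manifest = ingress(
    "shop",
    "shop.example.com",
    {"/api": ("api", 8080), "/": ("frontend", 80)},
)
print(json.dumps(manifest, indent=2))
```

The manifest only declares the routing; an ingress controller such as the ones named above has to be running in the cluster to act on it.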

Liles: So let me interject some controversy into this talk. And you'll notice that I'm going to do that because I'm touching both sides of the podium right now. Carlos created these slides and I deleted them because I was triggered, and I'll let him keep the Ingress one. But let me talk to you about Ingress. Ingress is Kubernetes saying, "Let me get traffic into my cluster." The problem is that we didn't know how we could get traffic into the cluster. So what I'm actually saying, and this is basically the how-not-to-get-robbed part of this talk, is that using Ingress by itself is a fallacy. Think about it. I can define a host, I can define paths, and I can move traffic from those to my service. The problem is that this works great if your application is ABC, my first app. Whenever you scale past the small app, the problem is that Ingress is not granular enough.

What we've done, and what Google's done a couple of times, and what Microsoft has done, is think about how we can expand these things out. So this is where tools like Contour from Heptio and our IngressRoute come from. This is where the beginnings of the Service Mesh come from. This is why we have gateways and virtual services, because really this Ingress is not a great idea. So we put this in here. But really, as I see it on the big screen, 20 feet tall, I would actually advocate for people to try for better options. Use it to understand, use it because it's there. But as we grow and as the community grows, we really need to move past Ingress. And I'm talking to any of you all who are thinking about extending the Nginx controller, the Traefik controller, whatever controllers: we really as a community need to move past this. So my rant is over. I'm going back to the back.

Amedee: He doesn't have a mic, so he can't drop it. Yes, what he said. So there's a fantasy in our industry where, when we think about security, when we think about routing traffic, we only consider things that are incoming, and we always ignore egress traffic. You can set egress policies in Kubernetes, and in Kubernetes 1.12, I think, you can give it a CIDR block and set egress policies with that. So this will make sure that your pods can communicate only with the portion of your network that they are intended to connect to. So if you have a database server, if you have some sort of service that's not in your Kubernetes cluster, you can direct traffic out to that.
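A minimal sketch of an egress policy with a CIDR block, assuming hypothetical pods labeled `app: web` that should only reach a database subnet at `10.20.0.0/24` on port 5432. The field names come from the `networking.k8s.io/v1` NetworkPolicy API; the standard-library `ipaddress` module is used only to reject a malformed CIDR before it reaches the cluster.

```python
import ipaddress
import json


def egress_policy(name, pod_labels, cidr, port):
    """Allow the selected pods to send TCP traffic only to cidr:port."""
    ipaddress.ip_network(cidr)  # raises ValueError for a malformed CIDR block
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name},
        "spec": {
            "podSelector": {"matchLabels": pod_labels},
            "policyTypes": ["Egress"],
            "egress": [{
                "to": [{"ipBlock": {"cidr": cidr}}],
                "ports": [{"protocol": "TCP", "port": port}],
            }],
        },
    }


policy = egress_policy("web-to-database", {"app": "web"}, "10.20.0.0/24", 5432)
print(json.dumps(policy, indent=2))
```

Once a policy like this selects a pod, any egress it does not list is dropped, which typically includes DNS lookups unless a rule for them is added. The policy is also only enforced when the cluster's network plugin supports NetworkPolicy.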

There's also the ability to make sure that, and this came up earlier today in one of the other talks, if you have some sort of vulnerability and somebody gains access to one of your pods, the right policy will stop them from having access to the entire network. So it's an important thing. I will admit that I'm also very bad with egress, but I'm promising to pay more attention to it. And I just wanted to throw a Service Mesh slide in here just because it's amazing and it's growing. Personally, I think it's one of those things where a new buzzword pops up, or a new set of ideas pops up, and it all gets smashed into one concept called Service Mesh. I would love to see it broken down into some pieces. When I think of Service Mesh, I think of taking a lot of the reconnection logic, a lot of the monitoring, a lot of the circuit breakers, and a lot of the routing out of your application, so your engineers no longer need to manage that individually in each app, and putting it into a service, which is kind of nifty and should theoretically improve developer productivity. But those are my thoughts. I'm waiting for the hot take.

Liles: No hot take from me.

Rule Number Six: If You Think You Know What’s Happening in Your Cluster ... Forget it

Amedee: Rule number six: if you think you know what's happening in your cluster, forget it. So I like observability.

Liles: Hold on there. In the real song, this is actually, I don't care, if you think a crackhead is paying you back, forget it. That's funny. So y'all just missed that one. That was gold, so keep on going.

Amedee: I'm not brave enough to say that on stage, but I'll take it. So observability: very, very big topic, and I'm going to brush over a lot of things and give you my opinions. Kubernetes is complicated. Deploying microservices is complicated in its own right, and deploying microservices on Kubernetes is complicated. Your engineers need to understand what's happening, or you need to understand what's happening with every component inside of your cluster, inside of your service, to be able to really service it. There's a flip, right? Back in the old days, I would SSH into a box that had my service running and I would look at the logs locally and the world would be great. I'd go into systemd and make whatever changes and restart the service, and I would have this great view of what's happening on this one box. And then we moved into a world where things were sort of distributed, and my service is running on 7 boxes or 12 boxes or hundreds of boxes, and they keep moving.

I had to shift over to a different paradigm, which I'll discuss in a second. But typically, from what I've seen or from what we've experienced, your users don't have SSH access to the boxes and they don't even know what boxes things are running on. So you need some way to collect logs. You need an easy way to get metrics out. You need something for tracing, and we'll talk shortly about each one of those. I talked about what's happening in your Kubernetes cluster. If you take something like Prometheus and deploy it into your cluster, it has a service discovery plugin, so you can automatically pull in metrics about what's happening in Kubernetes itself, but you don't have an automatic way to understand what's happening in the workloads running on Kubernetes.

You could use kubectl top; that only gets you so far. You can use kubectl logs; that's only going to get you so far. It's going to give you a limited view of what's happening in your cluster at a very specific point in time. You need to extend observability. The more full-featured it is, and the more dedication you put towards extending the observability of your cluster, the more likely it is that the engineers won't revolt when you switch over to Kubernetes.

With metrics and alerting, I generally prefer, and I'm a big advocate and lover of, Prometheus. It's a wonderful service. It has a lot of native integrations with Kubernetes. It is a pull-based service. So this was a flip; for most of my career I would have an agent running on whatever service, and on every single agent I would control how often it pushes data to some central service. And then I would go and query that central service and attach some sort of dashboard to it. This is a pull-based system. It's amazing. Don't be scared by the pull-based system. It's really just a change in the way that you're thinking. It's great. You can control how often things get pulled from one place, as opposed to going into configuration management and changing it and deploying it and waiting for that change to propagate. For applications that you deploy that have native Kubernetes integration, it's great. Not Kubernetes integration, native Prometheus integration. It's great because you can expose the metrics and Prometheus will just come and scrape them. And sometimes it's as easy as just adding an annotation to your service for those services to get scraped.
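The "just adding an annotation" step might look like the sketch below. The `prometheus.io/scrape`, `prometheus.io/port`, and `prometheus.io/path` keys are a widely used convention rather than something built into Kubernetes or Prometheus: they only take effect when the Prometheus scrape configuration contains relabeling rules that look for them, which is an assumption here. The service itself is a hypothetical example.

```python
import copy


def annotate_for_scraping(service, port, path="/metrics"):
    """Return a copy of a Service manifest carrying the scrape annotations."""
    annotated = copy.deepcopy(service)
    annotations = annotated.setdefault("metadata", {}).setdefault("annotations", {})
    # Annotation values must be strings, so the port is converted.
    annotations["prometheus.io/scrape"] = "true"
    annotations["prometheus.io/port"] = str(port)
    annotations["prometheus.io/path"] = path
    return annotated


web = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "web"},
    "spec": {"selector": {"app": "web"}, "ports": [{"port": 80, "targetPort": 8080}]},
}
scraped = annotate_for_scraping(web, 8080)
print(scraped["metadata"]["annotations"])
```

The frequency of scraping stays in the Prometheus configuration, which is the "control it from one place" property of the pull model.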

For things that do not have native Prometheus integration, there are things called exporters. Exporters exist for many, many, many different applications. You run this small binary as a sidecar to your service, configured to talk to your service. It could be a Postgres, it could be an Nginx, it could be a lot of different things. There's a JMX exporter. That sidecar, that exporter, can then get scraped, and you'll get a lot of metrics in a very quick and easy way. Logging.

Liles: No, I'm just moving up.

Amedee: When I was listening to Biggie, I'd be afraid when people ran up behind me anyway. Logging. It's simple in that it is complicated. With native logging, it's a combination of configuring your container runtime's logging settings and how much retention you have on that particular node, that particular host. But what you get with native logging is very limited. So I would recommend that you deploy Elasticsearch with some sort of aggregator, trust boring old technologies like rsyslog, and provide Kibana so you can query through the logs.

Again, for distributed tracing, I would follow the same sort of idea. There was an OpenTracing talk a session or two before this one by one of the core maintainers, I think, or creators of OpenTracing. It's a great product. And you need to deploy some sort of observability dashboard, like Grafana, so that you can read from Elasticsearch, read from Prometheus, create dashboards, and move forward. I also forgot to mention Alertmanager during monitoring and metrics, and that's just the way to alert off of your metrics.

Once you have a better metrics story with your cluster, you can have a better story with the Horizontal Pod Autoscaler. Say you have workloads like websites or Nginx instances, it's the holidays, and for some reason everybody is hitting your site while you have all these boxes that aren't doing anything. You can configure the Pod Autoscaler to notice that your request rate has gone above or below some threshold, and it will automatically scale the number of instances of your container up or down.
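The Kubernetes documentation describes the core of the autoscaler's decision as `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`. The sketch below implements just that formula plus the min/max clamp; the function name is hypothetical, and the real autoscaler's tolerance band and stabilization behavior are deliberately left out.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas, max_replicas):
    """Scale the replica count in proportion to how far the metric is off target."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    raw = math.ceil(current_replicas * (current_value / target_value))
    # Never go below the floor or above the ceiling configured for the workload.
    return max(min_replicas, min(max_replicas, raw))


# Holiday traffic: 3 pods each seeing 200 requests/s against a target of 100.
print(desired_replicas(3, 200, 100, min_replicas=2, max_replicas=10))
# Quiet period: 4 pods each seeing 50 requests/s against the same target.
print(desired_replicas(4, 50, 100, min_replicas=2, max_replicas=10))
```

With a request-rate target like this one, the metric has to come from somewhere, which is why a working metrics pipeline comes first.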

Rule Number Seven: Keep Your Storage and the Business Rules to Manage it Completely Separated

Rule number seven: keep your storage and the business rules to manage it completely separated. This is me being a fanboy for the Container Storage Interface. It is a standard that was created not too long ago, which allows you to use one plugin for any container runtime to attach and detach storage from pods, mount it, delete it, and basically manage the life cycle of any storage that you attach to a container. And it's not just tied to Kubernetes, it's tied to any of the container runtimes.

It's a very simple interface. A coworker of mine at DigitalOcean worked on our version of it for our block storage. His name is Fatih, he is the creator of vim-go, an amazing person and a very smart engineer. And he taught me that it's fairly, fairly simple; there are three interfaces that are provided. There's one called the node interface. It is an interface that you need to run on every node that you intend to attach storage to. And it ensures that you attach and detach and handle the whole life cycle of volumes as they move over to other pods on other nodes.

Then there's a controller interface. There's one controller running, as opposed to the DaemonSet for the node interface, and that holds some of the broader logic. And there's an identity interface that needs to run on your components, which registers your plugin with the Kubernetes plugin registry. In version 1.12, there's this new feature, it's great: you can take snapshots, and it's all part of this same interface. I think it's alpha or beta. I think it's alpha. But it's amazing, so if you're running a database and you store a backup on one of these volumes, taking a snapshot of it will be great, and it will be saved somewhere, theoretically in your cloud provider or your homegrown Ceph or GlusterFS storage network.
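To show how the three interfaces divide the work, here is a toy, in-memory sketch. The comments give the name of the CSI RPC each method loosely corresponds to, but the Python signatures are simplified assumptions, the real interface is gRPC, and a real node plugin does not hold a reference to the controller the way this toy does.

```python
class IdentityService:
    """Tells the orchestrator which plugin this is."""

    def __init__(self, name, version):
        self.name, self.version = name, version

    def get_plugin_info(self):  # GetPluginInfo
        return {"name": self.name, "vendor_version": self.version}


class ControllerService:
    """Runs once per cluster: creates volumes and attaches them to nodes."""

    def __init__(self):
        self.volumes = {}      # volume id -> size in GiB
        self.attachments = {}  # volume id -> node id

    def create_volume(self, volume_id, size_gib):  # CreateVolume
        self.volumes[volume_id] = size_gib

    def controller_publish_volume(self, volume_id, node_id):  # ControllerPublishVolume
        if volume_id not in self.volumes:
            raise KeyError(volume_id)
        self.attachments[volume_id] = node_id

    def controller_unpublish_volume(self, volume_id):  # ControllerUnpublishVolume
        self.attachments.pop(volume_id, None)


class NodeService:
    """Runs on every node: mounts attached volumes for pods on that node."""

    def __init__(self, node_id, controller):
        self.node_id, self.controller = node_id, controller
        self.mounts = {}  # volume id -> target path

    def node_publish_volume(self, volume_id, target_path):  # NodePublishVolume
        if self.controller.attachments.get(volume_id) != self.node_id:
            raise RuntimeError("volume is not attached to this node")
        self.mounts[volume_id] = target_path

    def node_unpublish_volume(self, volume_id):  # NodeUnpublishVolume
        self.mounts.pop(volume_id, None)


controller = ControllerService()
node_a = NodeService("node-a", controller)
node_b = NodeService("node-b", controller)

controller.create_volume("vol-1", 10)
controller.controller_publish_volume("vol-1", "node-a")
node_a.node_publish_volume("vol-1", "/var/lib/pods/db/data")

# The pod is rescheduled: unmount and detach from node-a, then attach to node-b.
node_a.node_unpublish_volume("vol-1")
controller.controller_unpublish_volume("vol-1")
controller.controller_publish_volume("vol-1", "node-b")
node_b.node_publish_volume("vol-1", "/var/lib/pods/db/data")
print(controller.attachments, node_b.mounts)
```

The split mirrors the deployment shape described in the talk: one controller for cluster-wide operations, and a node service on every machine that can receive volumes.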

Rule Number Eight: Using Tools

Liles: The last three. So we have 8, 9, and 10. When you leave here, I would like you to go to Spotify or Apple music or whatever else you might use, and go listen to this song because then it all makes sense. And then you're going to remember this. Whenever you listen to this song from ever and ever and ever, you're going to see this face. And I tweeted earlier that I look like Idris Elba, and you're going to think of Idris Elba, and then you're going to think of James Bond because he was shafted by that. But then you're going to go back and you're going to think of Sean Connery because he was the best bond. And then you're going to go back and be like, "Well, he was kind of Scottish." And then you're going to think about Edinburgh, and then you're going to realize that Mary Queen of Scots was 6-foot tall. And I don't know how we got there, but that's where you're going to get.

So, using tools. Really what it comes down to is that when you're thinking about Kubernetes, Kubernetes is a platform of platforms. People think about Kubernetes as that thing: "I ship Kubernetes, I ship Kubernetes." No, you don't ship Kubernetes. The community ships Kubernetes. So whether you are my company, Heptio, or you're vendor X, vendor Y, vendor Z, really all we're doing is shipping Kubernetes with a brand, a special sauce. But basically, at Heptio we're not, we're actually shipping upstream. So let's think about some kinds of tools that we would use. And the two most important types of tools that I think about in this space, not the only types of tools, but the two most important types, are package management and configuration management. And these are the problems that we find. We build these clusters, we create software and then we say, "Developers, go do what you need to do," and our developers are like, "This is too hard."

So what do we do? Well, we tell them to use Helm. And actually, I've heard this in a couple of talks today where we said we're just going to go helm install, we're going to get these charts and we're going to do that. So you know about the compliment sandwich? The last time I spoke about Helm in a talk I was really nice. So this time I figured I can say something that's less nice, but still true. And the next time I talk about Helm I've got to make sure it's really nice again. So we're in the middle now of the Helm 2 sandwich, and the problem with Helm is this: package management is super hard. If you look at what yum went through, and now there's DNF, and you have dpkg, and you look at things like Chocolatey and NuGet on Windows, we realize it's really hard.

But one thing we learned a long time ago is that if you have package management and you have templates in there, using something like Jinja2 to do the search-and-replace inside of your text, that's hard. And that means that whenever someone runs your thing and you have a small typo, you can't even tell whether your file is correct anymore. But that being said, here comes the compliment part: Helm 2 is the greatest thing we have right now. It actually enables the community to install software like CockroachDB, or anything else, in a cluster. You can run helm install and you get these things and they run right now. So I just wanted to put that out there.
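
The templating problem described above can be sketched in a few lines. This is only an illustration under stated assumptions, not how Helm itself works: Python's `string.Template` stands in for the template engine, JSON stands in for YAML because the standard library can parse it, and the manifest fields and image name are made up for the example.

```python
import json
from string import Template

# Hypothetical manifest template. JSON stands in for YAML here so that the
# standard library can check whether the rendered text still parses.
MANIFEST_TEMPLATE = Template(
    '{"kind": "Deployment", "replicas": $replicas, "image": "$image"}'
)


def render(values):
    """Plain text substitution: the engine knows nothing about structure."""
    return MANIFEST_TEMPLATE.substitute(values)


def is_valid(text):
    """Return True if the text parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# The unrendered template is not a valid document, so it cannot be linted.
template_parses = is_valid(MANIFEST_TEMPLATE.template)

# A sensible value renders to a valid document.
good = render({"replicas": 3, "image": "example/cockroach:1.0"})

# A small mistake in a value silently produces a broken document.
bad = render({"replicas": "three", "image": "example/cockroach:1.0"})

print(template_parses, is_valid(good), is_valid(bad))
```

The point of the sketch is that the error only shows up after rendering, and only if someone parses the result.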

But there are other types of tools. So we're talking about package management. I have this piece of YAML junk and I want to make that work in my cluster. But that only works if you have one cluster. What if you have two clusters, three clusters? What if you have clusters in dev, staging and production? So now let's think about that. So this is another set of tools, and I put my tool on top because in my alphabet K comes before capital P and capital B, but this is what we're trying to do. We're basically saying that we're going to describe our clusters as something that exists at a time. So I have a production cluster that exists, if I'm using Amazon, in us-west-1 in zone A. But if I have my test cluster, which is basically a smaller version of that, I still want to be able to install the same software on there, and I want my configuration to scale my deployment up or down to match.

So this is what these tools are doing. They're configuration management. Basically, think about Ansible or Terraform, but think about it in the light of Kubernetes. And just to show that I'm nice, I actually put these things on here, and this is where we're headed as a community. So instead of having a whole bunch of YAML, what people are doing in the case of Ballerina is creating a whole new DSL to allow us to declaratively describe what goes in our cluster as software. And the reason that Ballerina works is that they've targeted Java developers. And actually, Ballerina is right down there in the corner right now. The developers they're targeting are used to working this way. But our friends at Pulumi, who are out in Seattle, what they've done is said, "Well, what we're going to do is take the whole Kubernetes API and we're going to convert it to something that can run with TypeScript." And the great thing about TypeScript is that it's strictly typed and you can actually check your code to make sure it's correct. So now we can write things that are basically TypeScript, you know, fancy JavaScript for us non-JavaScript people. But you can write these things to configure your software with languages you are familiar with, rather than the languages of Swagger and just straight YAML.
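
As a rough illustration of describing cluster contents in a general-purpose language instead of raw YAML, here is a minimal sketch. It is written in Python rather than TypeScript or Ballerina and does not use Pulumi's API; the `WebDeployment` class, the image name, and the per-environment replica counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WebDeployment:
    """Hypothetical typed description of one workload."""
    name: str
    image: str
    replicas: int

    def __post_init__(self):
        # Dataclasses do not enforce annotations at runtime, so check here.
        if isinstance(self.replicas, bool) or not isinstance(self.replicas, int):
            raise TypeError("replicas must be an integer")
        if self.replicas < 1:
            raise ValueError("replicas must be at least 1")

    def to_manifest(self):
        """Build an apps/v1 Deployment manifest as a plain dictionary."""
        labels = {"app": self.name}
        return {
            "apiVersion": "apps/v1",
            "kind": "Deployment",
            "metadata": {"name": self.name},
            "spec": {
                "replicas": self.replicas,
                "selector": {"matchLabels": labels},
                "template": {
                    "metadata": {"labels": labels},
                    "spec": {
                        "containers": [{"name": self.name, "image": self.image}]
                    },
                },
            },
        }


# Hypothetical sizing: the test cluster is a smaller version of production.
REPLICAS_BY_ENV = {"test": 1, "production": 5}


def deployment_for(env):
    """Same software in every environment; only the scale differs."""
    return WebDeployment(
        name="web", image="example/web:1.0", replicas=REPLICAS_BY_ENV[env]
    )


print(deployment_for("production").to_manifest()["spec"]["replicas"])
```

A mistake such as `replicas="three"` fails when the object is constructed, before anything reaches a cluster, which is the kind of early checking the typed approach is after.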

So those are neat, and we should look at these. But there are other types of tools as well. And the two that I want to bring out, and unfortunately I didn't have any diversity in here, they're both Google projects. Skaffold is, "How about I have this application that I'm working on, but I want a fast run cycle and I want to be able to deploy it locally to my Minikube or I want to be able to deploy remotely to my dev cluster. And as soon as I hit save in my browser, or my editor, I want this to happen." Skaffold actually allows you to build that functionality out. And kustomize is another tool. And the neat thing about kustomize is: I have this piece of YAML, and 75% of it is the same across all my deployments, and really what I want to do is overlay more data on top for that 25% that changes between my environments. So that's another tool if you want to look at it.
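
The overlay idea can be sketched as a recursive dictionary merge. This is a simplification for illustration and not kustomize's actual patching algorithm; the `overlay` function and the manifest contents are assumptions made up for the example.

```python
import copy


def overlay(base, patch):
    """Return a copy of base with patch laid on top, merging nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = overlay(merged[key], value)
        else:
            # Scalars and lists in the patch replace the base value.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical base: the part that is the same in every environment.
BASE = {
    "kind": "Deployment",
    "metadata": {"name": "web", "labels": {"app": "web"}},
    "spec": {"replicas": 1, "image": "example/web:1.0"},
}

# Hypothetical production overlay: only the part that changes.
PRODUCTION = {
    "metadata": {"labels": {"env": "production"}},
    "spec": {"replicas": 5},
}

production_manifest = overlay(BASE, PRODUCTION)
print(production_manifest["spec"])
```

The base stays untouched, so the same file can feed every environment and each overlay carries only its differences.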

Rule Number Nine: Extending Kubernetes

So number nine is extending Kubernetes. And actually number 9 and number 10 are related. So with Kubernetes, this is actually the real feature of Kubernetes. It's a platform of platforms. It allows you to build software APIs without having to worry about coding the APIs. But what happens whenever we get something for free? Well, basically what happens is we become lazy. And then, with the CRDs we talked about earlier, we stuff more and more CRDs in there and we have more and more controllers and more and more processing. What happens is our API starts to get slow. So because Kubernetes is a platform of platforms, what you can do is actually extend the API. You can say that this whole branch of the API is another piece of software, another endpoint that actually runs in the cluster, and the API server will send those requests down to it.
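
Handing a whole branch of the API to another piece of software can be pictured as prefix routing. In a real cluster this is set up by registering an APIService with the aggregation layer; the sketch below only models the routing decision, and the group name, backend address, and `route` function are hypothetical.

```python
# Hypothetical registrations: (API group, version) -> in-cluster backend.
DELEGATED = {
    ("metrics.example.com", "v1"): "metrics-backend.example-ns.svc",
}

# Label for requests the main API server handles itself.
LOCAL = "kube-apiserver"


def route(path, delegated):
    """Decide which server should handle a request path."""
    parts = path.strip("/").split("/")
    # Group-based paths look like /apis/<group>/<version>/...
    if len(parts) >= 3 and parts[0] == "apis":
        backend = delegated.get((parts[1], parts[2]))
        if backend is not None:
            return backend
    # Everything else stays with the main API server.
    return LOCAL


print(route("/apis/metrics.example.com/v1/widgets", DELEGATED))
print(route("/apis/apps/v1/deployments", DELEGATED))
```

Clients keep talking to one API endpoint, while the work for the delegated group happens in a separate process.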

Rule Number Ten: A Live Word Called Refinement- Building on Kubernetes

Why do I talk about that? Well, really, this brings us to number 10, and this is actually where I wanted to be, but this is the long way of getting there: basically, we need to think about building on Kubernetes. So a lot of people build like this: they build on top of Kubernetes. So you have app one, app two, app three, there's a little buffer and your cluster is below it. But really what we want to do is get to here, where we have app one, app two, app three, and they are actually knowledgeable about our cluster, instead of just running on top of it.

So here's a good example of that. So everyone here is familiar with your CI/CD tools? So I'm going to say one, two, three, and you shout out your favorite CI/CD tool. One, two, three. Really, Jenkins? I said your favorite, not the one you're forced to use at work. That's a joke for cloudy people, I jest. But no, the point that I'm making is that Jenkins is a cool tool for CI and CD. It's used all across the industry. But the problem is that you're using Jenkins to deploy to your server and Jenkins knows something about your server, and you're saying, "Oh, yes, smart guy. I've heard of Jenkins X." Once again, Jenkins X is a piece of software that is basically making Jenkins work in a Kubernetes environment, but it's not giving Kubernetes any… it's not actually getting any information from Kubernetes.

So really what we need to do is build our CI/CD tools so that they're basically based on CRDs, where we can actually say, "Listen to GitHub with this CRD and then push it here." And then we write another set of controllers that says, "Hey, when I get this message, I'm going to build this software using this pipeline." It's basically making Spinnaker. And then what we can do is have this thing now: I have these gates, this gate, this gate, and I can just release to production, and it listens, and it uses the same machinery to listen to the Kubernetes API that everything else does. This is the ecosystem we want to build. That's what I'm trying to do, and that was the long way of trying to get there.
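
A controller of this kind boils down to a reconcile step that looks at a resource and moves it one step closer to the desired state. The sketch below is a toy, in-memory version under stated assumptions: `PipelineRun` is a hypothetical custom resource, the gate names are made up, and nothing here talks to a real Kubernetes API, Spinnaker, or Jenkins X.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineRun:
    """Hypothetical custom resource: one commit moving through gates."""
    repo: str
    commit: str
    gates: list
    passed: set = field(default_factory=set)
    phase: str = "Pending"


def reconcile(run, gate_results):
    """One controller pass: advance the run as far as the results allow.

    gate_results maps a gate name to True (passed) or False (failed).
    A gate with no entry has not reported yet.
    """
    for gate in run.gates:
        if gate in run.passed:
            continue  # Already handled on an earlier pass.
        result = gate_results.get(gate)
        if result is None:
            run.phase = f"WaitingFor:{gate}"
            return run
        if not result:
            run.phase = "Failed"
            return run
        run.passed.add(gate)
    run.phase = "Released"
    return run


run = PipelineRun(
    repo="example/web", commit="abc123", gates=["build", "test", "approve"]
)

# First event: only the build has reported.
reconcile(run, {"build": True})
phase_after_build = run.phase

# Later event: the remaining gates report, and the run is released.
reconcile(run, {"test": True, "approve": True})
phase_after_all = run.phase

print(phase_after_build, phase_after_all)
```

Calling `reconcile` again with the same inputs changes nothing, which is the property that lets a controller be re-run on every event from the API.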

To end this, if you follow these rules, you'll have mad bread to break up. If not, you have 24 hours on call with constant wake-ups. And that's actually what I want to leave you with, is to say that Kubernetes is a fun thing. There are lots of good ideas in Kubernetes, but we're only four years into our journey. And what we've realized is that, like a four-year-old child, it is nowhere near being an adult. I made that mistake, so don't do that.

But I want to leave you with one last thing, which is that Kubernetes is broad. Notice we talked about seven really different things. To say that we are all Kubernetes experts, to say that we are experts in the container space, to say that we're experts in distributed applications is a fallacy. So for anyone who's trying to get started and is daunted by looking at these experts or these people on stage, guess what? We're just people like you all are. We're learning every day and some of us actually come back and share. So I wanted to actually bring Carlos up here because... Carlos, is this your first conference talk?

Amedee: In a room of this size.

Liles: This is also the each one, teach one: use your privilege to get other people who don't normally do this to get to do this. And I do want to leave you with this, and this is good, and that's it.


Recorded at:

Feb 09, 2019
