Transcript
Losio: We are going to be chatting about the Kubernetes environment, and more specifically about container security and observability.
Before introducing our panelists, just an introduction of what we mean by container security and observability in the Kubernetes environment. Containers have become the default component for all modern applications. Kubernetes is the main player on the market. It's the main container orchestration system. With a bigger audience comes a bigger responsibility. Developers and architects have to take care to secure and monitor their workloads on distributed systems, almost all on public cloud environments. As this market keeps growing, we are going to address what the major challenges are, discussing observability and security.
Background, and Experience with Kubernetes
My name is Renato Losio. I'm a Principal Cloud Architect at Funambol. I'm an InfoQ editor. I'm joined by four experts. We'll discuss with them how they manage security and observability on containers. Let's start with a round of introduction of each one of our speakers.
Zarouali: My name is Rachid Zarouali. I'm a freelance cloud architect. I help companies that are striving to move to cloud providers, wherever they are. In my Kubernetes journey, I work with global companies, so not only with developers, but also with DevOps and with the wider organization, to put security at the center of their projects, so that each and every participant in these projects knows exactly what it means, what caveats exist, and how we can avoid them before going any further into the project.
Washburn: Caleb Washburn, director from VMware. I play the role of our global technology and strategy lead for our professional services group that leads up a group of consultants working with customers, helping them adopt and leverage best practices around Kubernetes, from taking their applications from source code to production in a secure fashion.
Fricke: I'm Thomas. Now it is 30 years in Unix, 15 years in the cloud, 6 years in Kubernetes, and now focusing on Kubernetes security, mostly on customer demand. I really know all the products from a customer perspective. This is a challenging part to see what a real customer does with this nice product, and this might differ a little bit from what we would expect.
Andersson: My name is Jessica. I work on a platform team. What we mean when we say a platform team is a team that provides services and products internally to other developers to make their lives easier. We look at things that every developer team needs in order to deliver the product. In the days of DevOps, that is quite a lot of things that a developer needs to care about, among other things a frontend environment and a cloud infrastructure, in other words, Kubernetes. In addition to that, I'm also a CNCF ambassador. I'm also very engaged in the local cloud native community, in Cloud Native Nordics, which is a grouping in Norway, Sweden, Finland, and Denmark, and also in Gothenburg, where I live.
The First Step in Security and Monitoring with Kubernetes
Losio: I'm not a strong Kubernetes user. I come from a different background, but as any workload and application, in the end, Kubernetes workloads are no exception to security problems or security challenges. Let's say that I'm a developer, and I'm someone that finally moved to Kubernetes. I've played with containers for some time, I haven't used Kubernetes. I finally move to public cloud, I have my first Kubernetes cluster. Until now, I haven't thought about security. What's my first step? Should I be worried? I haven't thought about security. I haven't thought about monitoring. Should I be worried? What should be my first step?
Andersson: I think that if you haven't thought about security at all, yes, you should probably be worried. That's a good start. Don't worry too much, because there is actually a lot of tools out in the ecosystem that will help you, especially when it comes to observability and analysis. There's plenty of really good tools that are super easy to get started with, which will help you gain a lot of insights in your containers when they are running on top of Kubernetes. Most of these are so simple to get started with. There's a lot of resources to help you get started. Yes, be worried, but don't worry too much because there are resources out there.
Fricke: I totally agree, you should definitely be worried, especially if you run Kubernetes including the infrastructure. Help is on the way, a lot of tools, sometimes I think too many tools, but you also need security people. I'm one of the people who definitely promote DevSecOps, or better, SecDevOps, on the customer side. This means, please don't add additional security tasks to the developer. The developers need to handle 10 times or 100 times more code than 10 years ago. Please add extra security people and then teach them about Kubernetes. Don't put another ton of load on the developers by just saying, "You are a developer. You can handle Kubernetes security." Yes, they can. In the beginning, if they don't have the experience, they will fail with a high probability. This is my demand. Always do it in a team which has all the capabilities inside. We know this from agile processes, where you have designers, and programmers, and all the people who know the customer stuff. Please think in the same way about DevSecOps, and add security people at the very beginning. They don't need to be present all the time. You should have access to security developers. I will give training to auditors, so please take care of this.
Difference between DevSecOps and SecDevOps
Losio: One little question for the ones like myself that are not really strong on the subject. You say DevSecOps, and then better, SecDevOps, what's the difference?
Fricke: Security is first, especially in my environment. This means, think about all your applications and your architecture with security in mind. It means don't be so naive as to just copy a cluster-admin role binding from the internet without thinking about it. Then your entire cluster is flawed, and I can show you how. It's so easy, because the new thing is that complexity in Kubernetes is so high that you need a lot of experience to deal with it.
Observability and Security Tooling
Losio: We mentioned the importance of tools that can help. What are the tools in terms of observability and security that we should use? Why? When? When is too much?
Andersson: I think that, for me, I would start out easy. You always have to begin somewhere. I would recommend beginning with Prometheus, getting it up and running. Then just getting the metrics from your containers. You know how developers usually set up metrics inside their applications, and they measure the things that matter to them. Then you also want to have the runtime metrics, like how many requests it's getting, how much CPU it's using, and all these kinds of questions. If every developer had to set this up themselves, that would be a lot of work. The tools that you get for Kubernetes have this automagically: you run a small thing inside your cluster and it collects all these metrics for you.
Washburn: As Jessica mentioned, there's getting observability, getting the information out of your ephemeral environment, those containers, they come and go. Making sure you're getting those metrics, using tooling like Prometheus and Grafana to visualize that. Making sure you're getting log information out of your containers using solutions like Fluent Bit, or other solutions that are out in the CNCF. There's a lot of tools. Then, as you start to build your initial set of containers, start looking at higher levels of abstraction, looking at things like buildpacks, where the security is already built in, the middleware is being patched. The underlying images are essentially curated versus starting with a random image on Docker Hub as your base image. Stand on the shoulders of other work that's out there, so that you can think about, as a developer, the things that are your immediate concerns, which is your code. Then leveraging the power of the community that is already creating patched operating systems and middleware component trees, like your Java runtime, your Python runtime, whatever your language of choice is.
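To make the runtime-metrics point concrete, here is a minimal sketch of what a scraped metric looks like in the Prometheus text exposition format. The metric name `http_requests_total`, the `path` label, and the helper `render_counter` are illustrative assumptions, not something named by the panel; a real service would normally use a Prometheus client library rather than formatting lines by hand, and this sketch does not escape special characters in label values.

```python
from typing import Dict


def render_counter(name: str, help_text: str, samples: Dict[str, float]) -> str:
    """Render one counter in the Prometheus text exposition format.

    `samples` maps a value of the (hypothetical) `path` label to the
    current counter value for that path.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    # Sort by label value so the output is stable between scrapes.
    for path, value in sorted(samples.items()):
        lines.append(f'{name}{{path="{path}"}} {value}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_counter("http_requests_total",
                         "Total HTTP requests.",
                         {"/": 12, "/login": 3}))
```

A collector running in the cluster would periodically fetch text like this from each container and store it as time series, which is what tools such as Grafana then visualize.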
Losio: What are the tools that you recommend, or the challenge you see, in that sense?
Zarouali: First, we have to keep our developers focused on what they are best at, building applications. Having the right tools to detect whatever could happen with their application, vulnerable libraries, bad behaviors, and so on. This should be, I think, implemented way before deploying in production, in their CI/CD platform, in their development environment platform. There's a lot of tools actually that can help that. We can speak about vulnerability scanners like Clair, Trivy, or Kubei for example. There are also tools like SonarQube that can help in detecting some bad behaviors in the application. When I speak about security with my clients, my first step is make sure that your application is secure. Once you make sure that your application is secure, get to know how to secure your development environment and your production environment using the right tools. Obviously, Prometheus and Grafana help in getting knowledge about what's happening in the cluster, or how your cluster behaves, and what your application is doing. We should not put aside the fact that an application can have its own behavior changing during its own lifecycle. This should be also kept in mind by using tools like Falco, for example, for behavior analysis.
There's a lot of tools. Before speaking about tools, I think we should speak about what should be secured. Where should we get the metrics? How to get those metrics. What our goal is in keeping our application lifecycle secure in our environment. Once we wrote that somewhere in a basic workflow, then we will speak about what are the tools that are available in our community. Which one can help us in this and this? It's more about getting a high overview of the security aspects of our application in the production environment, and then speak about what are the tools that we can use.
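The idea of catching vulnerable libraries in the CI/CD platform, before production, can be sketched as a small gate step. This is only an illustration: the report shape (a list of findings with `id` and `severity` keys), the severity names, and the functions `blocking_findings` and `gate` are assumptions made for the example, and do not reproduce the output format of Clair, Trivy, or any other scanner.

```python
from typing import Dict, List

# Assumed severity scale for the example; real scanners define their own.
SEVERITY_RANK = {"LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}


def blocking_findings(findings: List[Dict[str, str]],
                      threshold: str = "HIGH") -> List[Dict[str, str]]:
    """Return the findings whose severity is at or above `threshold`.

    Findings with a missing or unknown severity rank as 0 and are ignored.
    """
    limit = SEVERITY_RANK[threshold]
    return [
        f for f in findings
        if SEVERITY_RANK.get(f.get("severity", "").upper(), 0) >= limit
    ]


def gate(findings: List[Dict[str, str]], threshold: str = "HIGH") -> int:
    """Return a process exit code for a CI step: 1 blocks the build, 0 passes."""
    return 1 if blocking_findings(findings, threshold) else 0


if __name__ == "__main__":
    report = [
        {"id": "EXAMPLE-1", "severity": "LOW"},
        {"id": "EXAMPLE-2", "severity": "CRITICAL"},
    ]
    raise SystemExit(gate(report))
```

Writing the threshold down as a parameter is one way of recording, in the workflow itself, what the team has decided must be secured before choosing a specific tool.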
Fricke: What is also important is that you need somebody in the team, not everybody needs to understand it, who really understands the role-based access control system, RBAC. It shows strange behavior: if you delete a project and the role binding stays there, then you have zombie role bindings reappearing if somebody recreates a project with the same name. You have this infamous cluster-admin role binding, which effectively renders your cluster completely insecure. You can check for this. There are tools for this where you can have a graphical output, I think they are from outside. Please understand it. Make it absolutely clear that every application has only the role bindings it needs. Probably, it's also a good idea to think about security architecture. This means nearly no operator needs access from the internet, and nearly no service which is exposed to the internet needs operator capabilities. This is a no. Then you are much more secure. Because recently, if we could publish something with ImageTragick, with some common mistake steps, I can show you how to completely take over any cluster, especially an OpenShift cluster, from outside by uploading flawed pictures. This is something really special, but this is a chain which needs to be addressed. The chain is relatively short, so only three mistakes, and then you're done.
You also have big capabilities you need to understand, so network policy is a big plus, and all the security built in. You have to know them. Not everybody has to know them. You have to have somebody in the team who knows it. It does not mean to go away from Kubernetes, because all the other systems are also flawed somehow. Here you can have a centralized security, if you understand it. You need to understand it, really.
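The check for cluster-admin role bindings mentioned above can be sketched in a few lines. The sketch assumes its input is the `items` list from `kubectl get clusterrolebindings -o json`; the function name `cluster_admin_subjects` and the sample data are hypothetical. It covers only ClusterRoleBindings and is not a replacement for a full RBAC audit tool.

```python
from typing import Any, Dict, List, Tuple


def cluster_admin_subjects(bindings: List[Dict[str, Any]]) -> List[Tuple[str, str, str]]:
    """List every subject bound to the cluster-admin ClusterRole.

    Returns (binding name, subject kind, subject name) tuples.
    """
    hits = []
    for binding in bindings:
        if binding.get("roleRef", {}).get("name") != "cluster-admin":
            continue
        # `subjects` may be absent or null on a binding with no subjects.
        for subject in binding.get("subjects") or []:
            hits.append((
                binding["metadata"]["name"],
                subject.get("kind", "?"),
                subject.get("name", "?"),
            ))
    return hits


if __name__ == "__main__":
    sample = [
        {
            "metadata": {"name": "copied-from-the-internet"},
            "roleRef": {"kind": "ClusterRole", "name": "cluster-admin"},
            "subjects": [{"kind": "ServiceAccount", "name": "default",
                          "namespace": "web"}],
        },
        {
            "metadata": {"name": "read-only"},
            "roleRef": {"kind": "ClusterRole", "name": "view"},
            "subjects": [{"kind": "User", "name": "alice"}],
        },
    ]
    for name, kind, subject in cluster_admin_subjects(sample):
        print(f"{name}: {kind} {subject} has cluster-admin")
```

Every line this prints is a subject that can do anything in the cluster, so each one should have a stated reason to exist.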
Kubernetes Security in the Future
Losio: At the moment it is ok. In the last few years, Kubernetes became the reference point. When we talk about security and observability, we talk about Kubernetes cluster, and public cloud. As a developer, we have seen in this conversation, we see even more up to which point I have to worry and which one are my responsibilities, which ones are not. Where do you see us in 5 years or in 10 years, will we still be worried about Kubernetes, or any orchestrator, or where do you see the direction going? A developer that knows even less, or a developer that knows even more about security?
Andersson: I find it really interesting, both Thomas and Rachid, that you are going, you have to understand exactly how Kubernetes works, how it's set up, and understand all the security tracks and all these things. While I do agree with you, I think that the role of a developer, from their perspective, they need to understand less about Kubernetes. I think that's the direction we're taking in the next years to come. If you look at a 10-year scale, I would be surprised if a random developer is aware of the concept of Kubernetes. It might be still what is running underneath, and powering everything else. I think there needs to be a good interface that abstracts that level of knowledge that you need to have about how your container is running. You will probably just be more about running code and not running containers on top of Kubernetes. That's where I think the future is going. Some of us will have to understand all these things. If we're going to have effective developers that can focus on doing what they do best, as Caleb you mentioned that it's supposed to be writing the code for the product, then we need to take away Kubernetes because that's way too complex for everyone to get to know.
Washburn: Kubernetes is a platform for creating platforms. That abstraction is leaky to a developer. There are things that make it consistent where you can deploy to multiple clouds, but there's a lot of things that leak out of that. A lot of details that are not really important for the purpose of what a developer needs to be able to write that business application, add that feature for their end users. Over the next n number of years, I see that encapsulation being better, where we have platforms that are Kubernetes native, that are leveraging things, so that developers don't necessarily need to understand the core Kubernetes constructs that are there. They're understanding a higher level API that is really just focusing on running their application, and serving the needs of what they have today. Someone in your organization is going to need to understand Kubernetes though. It's not like it's magically just going to get better, but thinking about how you create a platform team, that is understanding, and has that deep knowledge that understands and provides services out to their developers to simplify and delight the needs of your developers. That's the way to go about doing it.
Losio: You still see someone inside the organization, not someone from the cloud provider or from any other layer of abstraction. You still need to have that knowledge. It's probably not the developer, but someone has to be there to validate it.
Washburn: I fully believe you need someone in your organization, many folks, security folks, folks that know the nitty-gritty details of how a lot of these components work. Try to focus on that value line for your developers, and ensuring that value line is much higher than what it is today, if you just give them a Kubernetes cluster, and say good luck.
Zarouali: I think there's a role that doesn't really exist in many organizations, and this role is about knowing how the application behaves, and how Kubernetes works. It's more about creating a missing link in the organization, so that we can help developers to get focused on what they're best at. Also, helping them understand how their application will work in production. Not how Kubernetes works, but getting knowledge about, we have network. We have storage. We have secrets that we have to pull from here and here. Those people will have the responsibility of running Kubernetes in production, of making sure that the application behaves correctly in production. Also, have the responsibility to help developers getting knowledge about how the application will behave. I think that's the point here.
We should not, I think, avoid the fact, as we all agree, that developers could not learn everything about Kubernetes, that's not what they're best at. That's not what we are asking them to be best at. At the same time, we have to help them get sufficient knowledge so that they can see, caveats about storage, for example, with common PHP frameworks, and so on. This is something that doesn't really exist in organizations I've been working with. This is something that I really work on creating these missing links so that they can all of them help each other in making sure that what runs in production behaves exactly the way they want it to be, but also as secure as possible to avoid any issues.
When Kubernetes Comes in Handy
Losio: We start talking about container, observability, security, but not everyone necessarily needs Kubernetes. Maybe you have an application that has a few containers or have many use cases where you don't need Kubernetes. When do you need Kubernetes? When is it better to move to Kubernetes, for observability and security, and when is it perfectly fine to run containers in other ways?
Fricke: You need Kubernetes if you have a huge distributed environment in the future. What I hope to see is that Kubernetes distributions have a higher level of default security, so that we get away from this NGINX port 80 example, which is notoriously wrong. Everything in this example is wrong, and this is the full example. There is no exception: all the Kubernetes distributions at the moment, and what the cloud providers offer, have a lot of room for improvement. No exception. What I've seen so far is not so good. You need Kubernetes if you want to scale. In the future, I see a lot of things in edge computing, so you have lots of small distributed Kubernetes clusters. Unfortunately, these are the use cases for critical infrastructure, so it must be secure. Security here is not an option if you want to have power, and healthcare, and things like this. Failing here is not an option. I would definitely kick the providers of distributions to deliver a higher level of security by default. Then it can be very secure.
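The transcript does not spell out what is wrong with the NGINX port 80 example, so the checks in the following sketch are assumptions about the kind of missing defaults being criticized. It inspects one container entry of a pod spec using the real Kubernetes field names `securityContext`, `runAsNonRoot`, `allowPrivilegeEscalation`, `readOnlyRootFilesystem`, and `containerPort`; the function name `container_findings` is hypothetical, and settings made at pod level are deliberately not considered.

```python
from typing import Any, Dict, List


def container_findings(container: Dict[str, Any]) -> List[str]:
    """Report hardening settings that a container spec does not set explicitly."""
    findings = []
    sc = container.get("securityContext") or {}
    if sc.get("runAsNonRoot") is not True:
        findings.append("runAsNonRoot is not set to true")
    if sc.get("allowPrivilegeEscalation") is not False:
        findings.append("allowPrivilegeEscalation is not set to false")
    if sc.get("readOnlyRootFilesystem") is not True:
        findings.append("readOnlyRootFilesystem is not set to true")
    for port in container.get("ports") or []:
        number = port.get("containerPort")
        # Ports below 1024 are often a hint that the process expects root.
        if isinstance(number, int) and number < 1024:
            findings.append(f"containerPort {number} is below 1024")
    return findings


if __name__ == "__main__":
    quickstart = {"name": "nginx", "image": "nginx",
                  "ports": [{"containerPort": 80}]}
    for finding in container_findings(quickstart):
        print(finding)
```

A distribution that shipped secure defaults would, in effect, make a check like this pass without the user having to write any of those fields.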
Why Kubernetes Security is Still Lagging
Losio: When you say, up to now they haven't kept that level, do you think it's just because they are sloppy, lazy, whatever, or it's just really intentionally to encourage the adoption and make it easier as a first step in, in terms of observability, in terms of managing. That it's a tradeoff that then doesn't pay when you start to really get large. I don't know if any of you has that experience of a customer that wants to jump into Kubernetes and find the defaults of maybe a cloud vendor, or the default installation easy?
Zarouali: It's more or less a tradeoff, in the sense that there are many ways to improve security in Kubernetes. As we improve security in Kubernetes, we will also add different layers of complexity on top of it. What cloud providers try to offer is something that makes it very easy to step in and start working with Kubernetes, and at the same time without that many restrictions. Quite often we see Kubernetes clusters that are not that well configured. We see, for example, the ability to get access to the Kubernetes cluster from wherever you are on the internet. How about making sure that the default Kubernetes distribution that we use has the highest level of security? I think it's a fairy tale, in the sense that each company has its own behaviors. Even if I'd like to, I don't think we can really easily push the highest level of security in Kubernetes with each and every company that's striving to use it. Is Kubernetes the right solution for whoever wants to run containers? I'm not sure. I'd really love to see good alternatives to Kubernetes, solutions that are easier to step into than Kubernetes. This is a day to day job. This is not an option. Kubernetes is a go-to job. We have to keep in mind that if there's not enough security now in a production platform, then we are opening up our doors, and this is not what we want to see.
Washburn: We're seeing that while it can be secure by default, many times those clusters are not secure by default. Essentially, if you were thinking about this in the construct of a Linux operating system, everyone that has access has sudo rights to do everything as root. You would never do that in a Linux operating system, but yet that's what we're doing where developers have an admin context and are able to do whatever they want to that Kubernetes cluster. Thinking about least privilege access, thinking about the users of that cluster, what do they need, from a permissioning perspective? It's easier to just say you have root access to this. That causes other concerns that you need to address. Thinking about, again, as a group of folks in your organization that are providing Kubernetes as a service to your developers. Thinking about what actual levels of access they need, and what things can be added to the cluster by default to ensure that that level of security is there, and secure by default.
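The "everyone has sudo" situation described above usually shows up in RBAC as wildcard rules. As a hedged illustration of least-privilege thinking, the sketch below flags rules in a Role or ClusterRole that use `*` in `apiGroups`, `resources`, or `verbs`, which are the real field names of an RBAC rule. The function name `wildcard_rules` and the sample roles are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def wildcard_rules(role: Dict[str, Any]) -> List[Tuple[int, List[str]]]:
    """Return (rule index, fields containing "*") for each over-broad rule."""
    flagged = []
    for index, rule in enumerate(role.get("rules") or []):
        fields = [
            field for field in ("apiGroups", "resources", "verbs")
            if "*" in (rule.get(field) or [])
        ]
        if fields:
            flagged.append((index, fields))
    return flagged


if __name__ == "__main__":
    admin_like = {"rules": [
        {"apiGroups": ["*"], "resources": ["*"], "verbs": ["*"]},
    ]}
    developer = {"rules": [
        {"apiGroups": ["apps"], "resources": ["deployments"],
         "verbs": ["get", "list", "watch"]},
    ]}
    print(wildcard_rules(admin_like))  # every field of rule 0 is a wildcard
    print(wildcard_rules(developer))   # nothing flagged
```

The second role is the shape to aim for: it names the API group, the resources, and the verbs a user actually needs, and nothing else.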
The Ideal Barrier for a Developer
Losio: What is the minimum that a developer needs? Ideally, as in the old sysadmin process, the less the developer has access to the production, the less damage he can do, the better it is for himself or for everyone else. In terms of knowledge, the developer should know about security and observability, worry but not worry too much, because it's not his main job, from a developer point of view, at least who will understand the point. What is the ideal barrier? The example I gave before of using Grafana, or any tool we use, if the developer is just focusing on using that, probably he's not developing, he's not doing his job. He's not providing the feature he wanted to do. On the other side, I had the feeling that if the developer doesn't even know that Grafana exists, and he has to think about how the system is going to be monitored, we have the opposite problem. We have the old wall between a developer and the rest of the team managing. Where should the barrier be? It's still a DevOps scenario?
Andersson: I am a true believer in DevOps. Not, as I see it, the part where you change the name of sysadmins to DevOps, not that part. The one where product teams actually take full responsibility from the idea to running and maintaining it in production, that is what I believe in. I think that whatever we can do to make that process easier and more accessible, and getting as much as possible for free in that process, so you don't have to care about getting the metrics for your CPU, whatever it is, the more we can do towards that, the better it becomes. I see this discussion is going back and forth, and we are approaching from different angles. In my mind, one part of it is maintaining our Kubernetes cluster, making that secure and available to the users. Then the other part is, as a user, I want to run something on a load. I want that to be, of course, a secure and good application that I can keep running and maintaining. These are the two approaches. I think the end goal has to be, can the user deploy and maintain a secure and reliable application in production? That is the end goal for me. Whatever I have to do in order to make that happen, I will do. I think the more we can abstract away, the easier that will become. Because if we expose too much of the underlying things, then that will not be good either.
When should you run your own Kubernetes cluster? I don't know. You probably shouldn't. If you have a lot of scale, if you have a lot of distributions, then probably that could be a good idea. If you're running just a few containers, I know hosted solutions does not come fully secure by default, but that is probably still a better solution than running your own cluster, because it takes a lot of time, takes a lot of knowledge. It's an ongoing thing. You can't just spin it up and let it go. You have to continue to maintain it. Then, comes all the other things, because then you have to add the monitoring. You have to add the logging. You have to add the other things, and it just keeps adding on top.
Fricke: I will add GitOps to that solution, because then this means you have automated everything. You have a single source of truth, which is your Git, and this is easier to audit than anything else. Fully automated processes based on GitOps are definitely part of the solution, in my opinion.
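The "single source of truth" idea can be illustrated with a small drift check: compare what Git declares with what is actually running, and report every difference. This is a simplified sketch over flat dictionaries; the function name `drift`, the keys, and the image name are hypothetical, and real GitOps tooling compares full manifests and also reconciles them.

```python
from typing import Any, Dict, Tuple


def drift(desired: Dict[str, Any], live: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (desired value, live value)} for every key that differs.

    A key missing on one side is reported with None for that side, so a key
    that is missing and a key explicitly set to None are treated alike.
    """
    keys = set(desired) | set(live)
    return {
        key: (desired.get(key), live.get(key))
        for key in sorted(keys)
        if desired.get(key) != live.get(key)
    }


if __name__ == "__main__":
    from_git = {"image": "registry.example/app:1.4.2", "replicas": 3}
    in_cluster = {"image": "registry.example/app:1.4.2", "replicas": 5,
                  "debug": True}
    for key, (want, have) in drift(from_git, in_cluster).items():
        print(f"{key}: git={want!r} cluster={have!r}")
```

An empty result means the cluster matches Git, which is what makes the Git history usable as an audit trail.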
The Point at Which the Kubernetes Approach Changes
Losio: One point that Jessica made in the last comment was that if you have a few containers, the complexity is probably different. One thing that I was thinking is, I am a new guy moving to a new project. I start with a few containers, a few users, everything is fine. I start to worry about observability. I start to worry about security. When does it really change, or should it change, or should it be the same whether I'm running 10,000 containers or 2 containers? When should I change approach, or should I actually change approach? Does anyone see any difference in the approach?
Zarouali: The real problem is not necessarily running thousands of containers, it's more about workloads. It's more about how your application will behave in your cluster. What I mean by that is, on my machine, I can run hundreds of containers that are simply running, and that's it. They're not moving. Nothing happens in the cluster. That's it. Where Kubernetes has great features is when your application moves in your cluster, when your application has to scale automatically in your cluster. That's where, at some point, you will have to rethink your application architecture, speaking about clustering, databases, and so on, or add other mechanisms that use observability to autoscale your nodes, if you're working with a cloud provider, for example. Do we have to be concerned about running 100 or 1000 containers? Yes, we have to. Where does it change? It changes when you have to keep an eye on your cluster. Once you start keeping an eye on your cluster and really looking at your cluster and how it works, I think this is the point where you should automate as much as you can, and give Kubernetes as much power as it needs to manage your application lifecycle and keep it running whatever happens.
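As one illustration of observability data driving automatic scaling, the sketch below applies the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). The panel does not name this autoscaler, so its use here is an assumption; the function name, the bounds, and the sample numbers are hypothetical, and the real controller additionally applies a tolerance band and stabilization behavior that this sketch omits.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int = 1,
                     max_replicas: int = 100) -> int:
    """Compute a replica count from an observed metric and its target value.

    Implements ceil(current_replicas * current_metric / target_metric),
    then clamps the result to [min_replicas, max_replicas].
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # 4 replicas averaging 200 requests/s each against a target of 100
    # requests/s per replica: the load is double the target, so scale out.
    print(desired_replicas(4, 200, 100))
```

The metric fed into a rule like this is exactly the kind of runtime data the cluster's monitoring stack collects, which is why observability and automation tend to be introduced together.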
Fricke: No, I totally agree. Automation is the key. We need to automate things because we actually don't have enough operations people anymore. We have to test it. We need to get experiences with it because everything that is not automated in a few years will not scale. Then we have a real problem.
Learning Kubernetes vs. Developer Self-Development
Losio: I need to actually pose another interesting question that is basically a follow-up on what we said. It's basically extending the topic. It's more on the developer self-development prospects. Would you advise a developer to spend time learning Kubernetes? When should they do it? Is it better actually to use the limited time a developer has to focus on his own self-development of whatever language, whatever skill, whatever pattern, whatever architecture? Focus more on the developer side, and not on the DevOps side? I think this is a much wider question. It's not really related only to observability or security. It's a question from a developer side.
Washburn: I think it really depends on, ultimately, what your organization looks like, what outcomes you're trying to drive towards. What are the facilities that are available to you? In a larger organization where you've got dedicated teams that are running Kubernetes clusters and understanding and working with their developers to provide those environments. Provide the abstraction. Eventing clusters that have the necessary non-functional application requirements, but things that are helpful to run those applications and monitor those applications. Then you don't need to know as much about Kubernetes. If you're a small shop, and you're wanting to do everything, and you've got fewer people, then you're going to have to know a lot more about it. The bigger question is, at that point in time, do you really need Kubernetes, or are you doing it for the sake of doing it?
Losio: You're suggesting, the moment is still a choice versus the moment you realize that you need to know it.
Washburn: Definitely there's a size of company. There's other characteristics, how often is this application changing? What's the rate of change? Do you need APIs to consistently roll out and deploy your applications? What are the non-functional things you're trying to accomplish? Kubernetes is super powerful, but it's also with great power comes great responsibility. There are some challenging things that you have to understand. Until you're starting to see again, on that roadmap, or that vision that, that Kubernetes environment is the underpinnings of things, not necessarily something that you only want to expose to every developer. It's just not where it's at right now. It's evolving there. It's not the equivalent of a MacBook where you can open it up and it just works. It's like building your own PC. You got to know all the pieces and parts and how to put it together. There's a lot of trial and error with that. Do you want to be an expert in that or do you want to be an expert in your primary craft, which is writing software applications, understanding the needs of your users. This is a hosting concern more than it is actually something I want to deeply invest my time in.
Andersson: I agree to a lot of what you're saying. I think that when it comes to, should a developer transition from Dev to DevOps? My answer is, yes, you should always do that because I think you should not write code if you don't care about who uses the code in the end. The question is not, should you transition to DevOps? It's like, why are you not there already? If I may be so frank. That's the real question here. If you should run your cluster or not, I think he covered that correctly. I just wanted to add on that DevOps part.
Washburn: You can do DevOps without Kubernetes. You do not need Kubernetes to do DevOps.
Action Items for a Kubernetes Developer
Losio: I'm the developer that has a Kubernetes cluster live. I'm happy, until now I haven't had a major problem. I listen to this InfoQ event and people talking about observability, security. I now go home. What is a small, simple action item I can do tomorrow as a step forward? It can be reading something. It can be checking one configuration file.
Fricke: I just would look for a team, or if I'm in a team, just ask these questions in a team, who is responsible? Who knows about what? We only have a chance with distributed knowledge, so you cannot expect from a single developer to know everything. You should get support.
Losio: Share the knowledge. Try to get the knowledge in the team.
Fricke: Support from the management for the DevOps change is also very important. In big companies, sometimes they have classic procedures, but the management has to support it also.
Zarouali: Start spinning up your own Kubernetes cluster at home, play with it. Don't hesitate to run tools like kube-bench, like K8s Guards, that will help you understand how your own cluster behaves. I think the bigger part is train yourself or get trained. Get to people that can help you get the knowledge that you need, or get the knowledge by yourself by playing with your Kubernetes cluster, by following best practices that we can find anywhere in the documentation, and so on. This is a long journey, if we want to have developers knowing everything about security and observability in a Kubernetes cluster. They don't have to follow that road. They just have to keep in mind that their application runs in Kubernetes, and what it means for their application running in Kubernetes to be secure and to have a perfect lifecycle.
Andersson: If you're already running Kubernetes, figure out where its logs are, and take a look at them.
Washburn: If you don't have an organization that has someone that's providing this capability for you, and you're wanting to leverage Kubernetes as your deployment model, dive in and understand all of it. Because the minute you decide to deploy out there, you need to really understand all of the implications for your application. Otherwise, you're opening yourself up to security exposure. Getting something working is not that hard. Getting something working well in a secure fashion is hard without the knowledge, so invest in training. Invest in taking certifications. Invest in the various different things, not just Stack Overflow to get something working.