Transcript
Newcomer: Given the mix in the audience, I'll try and do a little bit of both, some conversation about containers and Kube and not just assume you have full knowledge, but let's see how that goes.
Containers have absolutely exploded across the industry, and I work with many customers who have adopted containers and are using Red Hat's OpenShift container platform, which is built on Kubernetes. We've been partnering with the Kubernetes community for about four-plus years now or so - very excited to be part of that community. One of the key values of containers over deploying your apps simply on VMs is that you get to package all of the system dependencies that are required for that application to work in your container. Everything travels together, and that makes it much easier to ensure that, as your application moves from development to test to production, nothing changes, except maybe some of the secrets you're using to communicate with databases. It also makes it easier to put more applications on the same server, because you don't have to worry about conflicting system dependencies when you're using containers, and that's one of the reasons why they've become so popular.
Of course, enterprise apps rarely are going to be delivered in a single container. This is especially true if you're building microservice-based applications, where you have multiple services, each in their own container. They need to talk to each other, they need to discover each other. They need to be spun up when they're needed, and ideally, spun down when they're not needed. All of that is what you need a container orchestration engine for, and Kubernetes has really taken the orchestration world by storm. About three years ago, the market was much more fragmented around container orchestration. Today, Kubernetes is the dominant container orchestration engine and has this really strong open source community behind it, which is one of the reasons it's doing so well. It gets contributions from many different players.
Similarly, there are other things that enterprises need when they're deploying containerized apps and orchestrating them. They need to think about resiliency, HA, failover. All of that is built into Kubernetes itself. As an example, for those who aren't familiar, if you want to have three instances of a web front-end application running, you can tell Kubernetes you want three instances running. It'll take a look at all of the servers that are in your Kubernetes cluster. It'll see which has the most capacity. It'll deploy your instances to whichever server has the appropriate capacity. If one of those containers - or in Kubernetes terms, pods - goes down, Kube will notice and will automatically spin up another one and put it again on whatever server has the most capacity. If one of those servers goes down, those containerized apps get redeployed from the container image, so it's always a known start point, always consistent, and deployed again wherever there's capacity. Kubernetes gives you your failover, your high availability.
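To make that concrete, here's a minimal sketch of how you express that desired state - the names and image below are placeholders, not from the talk. You declare three replicas, and Kubernetes continuously works to keep three healthy pods running, rescheduling them onto nodes with capacity if a pod or a node fails.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-frontend          # hypothetical app name
spec:
  replicas: 3                 # desired state: keep three pods running at all times
  selector:
    matchLabels:
      app: web-frontend
  template:
    metadata:
      labels:
        app: web-frontend
    spec:
      containers:
      - name: web-frontend
        image: registry.example.com/web-frontend:1.0   # placeholder image
        ports:
        - containerPort: 8080
```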
It also gives you scalability. It's the Christmas rush or some other holiday, you're in retail, suddenly, your web front-end is just getting hammered, and you need a lot more capacity. You can tell Kube to increase from three to six, or whatever the appropriate number is. As long as there's capacity in your cluster on this set of servers, those additional instances will be deployed without you having to do anything else except change that number of deployments. Similarly, you can do things with Kubernetes, like, you can do blue/green deployments. You can say you've got a new version of your application, and you want to deploy that new version without shutting down the version that's running, so that things are seamless for your end-users. You can get three instances of the new version up and running, and then when traffic's moved to the new version and everything's successful, you can shut down those other versions. You can do A/B testing, you can do canary deployments. All of these are capabilities that are built into Kubernetes that you get the advantage of, so you want to think Kubernetes as your distributed platform for applications.
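As a rough sketch of the blue/green idea - again, the names are illustrative - you run the new version alongside the old one and switch traffic by changing a Service's label selector once the new pods are healthy. Scaling works the same way declaratively: change the replica count and Kubernetes does the rest.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-frontend
spec:
  selector:
    app: web-frontend
    version: blue            # flip this to "green" to cut traffic over to the new version
  ports:
  - port: 80
    targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-frontend-green   # new version deployed alongside the existing "blue" Deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-frontend
      version: green
  template:
    metadata:
      labels:
        app: web-frontend
        version: green
    spec:
      containers:
      - name: web-frontend
        image: registry.example.com/web-frontend:2.0   # placeholder image
```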
Agility and Utilization Improvement
Some of the customer examples that we see show additional benefits. Macquarie Bank is an OpenShift customer - OpenShift is a CNCF certified Kubernetes distribution - and one of their key values is agility. Macquarie wanted to be in a position to be more competitive in the banking industry. There's a lot of churn and change in the banking industry due to FinTechs and things like that. They wanted to be in a position where they could easily go from idea to execution to the customer every day, and they've been able to do that using the OpenShift Kubernetes platform. They're able to release daily within minutes, with zero downtime. They can deliver faster without losing control or visibility. If they have a new environment that they need to bring up, they can do that within an hour. It's also helping them to hire and retain talent, because it's such a hot space.
Another example, from a different industry, is Cathay Pacific airlines. In addition to the agility they also got from using Kubernetes, they have been able to significantly reduce their infrastructure costs through greater density of deployment. Deploying containerized apps on a server means that, again, because those system dependencies are packaged with the containerized app, it's easier to deploy apps with different system requirements on a single server, and that gives you greater density. Their infrastructure costs are going down as well.
At Enterprise Scale, the Cluster Is Not the Security Boundary
Tons of value out there for containers and Kubernetes, for the enterprise - and not just the enterprise, but that's my space, so that's what I'm focused on. However, all of these big companies have multiple teams, multiple business units, and sometimes multiple teams within those business units. If they want to get the most use out of a single cluster, they need to make it possible for those different teams, with potentially different security requirements or regulatory requirements for their applications, to deploy to a single cluster.
Think of this as a soft multitenancy. We're all part of the same organization, but it's still a multitenant scenario. Also, many of the customers I work with are concerned about insider threat as well. When they look at their security posture and what they need to control for, it's not just accidental breaches or a vulnerability getting introduced into an app that they didn't know about. It's also, "Is there somebody inside who either unintentionally leaked some information or maybe even potentially did so maliciously?" These are things to think about.
As we think about multitenancy for a Kubernetes cluster, we have to think about all the different ways that we can tackle that. I also want to mention that there's a whole slew of regulations - HIPAA, for example - that mean we have to think hard about the applications that fall into those categories. Our companies have to deal with auditors, and those auditors are still learning about container technology and container orchestration technology. Frankly, a lot of the security teams I talk to are still learning about that as well. As we go through and talk about configuring a Kubernetes cluster for multitenancy, one of the things to take into consideration, from time to time, is whether your auditor is going to accept some of the things that I suggest. Are they going to be so disconcerted by the changes that you may need to take some other approaches? For example, an organization might decide that they need a separate cluster for PCI DSS apps or for apps that have to have HIPAA compliance.
Security best practices still apply, and security in today's day and age needs to be adaptive, reactive. Just like we're doing agile development, we need to do agile security. It's really important to think about all of these things, but an additional thing about containers and Kubernetes is that it's an opportunity also to shift security left in the lifecycle. We'll talk about that maybe a little bit later in the presentation, but again, we want to think about, "How do I take these best practices and apply them to the containerized world and to Kubernetes?" We're going to look at these four aspects. We're going to talk about the host OS, the container platform, networking, and containerized apps, which is our opportunity, again, to shift security left as we think about the application CI/CD pipeline.
Host OS Container Multi-Tenancy
Windows containers orchestrated by Kubernetes are a work in progress - not GA yet. Also, I'm with Red Hat, so I'm going to talk about the Linux host and security for the Linux host and Linux containers, but Red Hat is actively working with Microsoft on Windows containers, and we're really looking forward to when that actually GAs and being able to support both Linux and Windows containers. There are some key capabilities built in to Linux that help to isolate containers running on a single server from each other and to isolate those container processes from the kernel as well. In many ways, a running container is simply a process running on Linux.
Some of the key features that help to deliver that isolation: Linux kernel namespaces are the fundamental one. Namespaces provide abstraction and make it appear to each container process - each container gets its own set of kernel namespaces - that it has its own instance of global resources. You also should take advantage of capabilities like SELinux. AppArmor might be another option if you're not using Red Hat Enterprise Linux, though we hope, of course, that you are. SELinux is a mandatory access control system. It allows administrators to enforce access controls on every user, application, process, and file on the server. In fact, we know of multiple cases where SELinux has mitigated vulnerabilities that were found in the Docker daemon. This control is a critical value-add when you're running Linux containers. Cgroups help to limit, account for, and isolate resource usage across a common server. Cgroups ensure that your container won't be stomped on by another container on the same host. Then, another thing you can take advantage of is secure computing (seccomp) profiles, which can be associated with a container to restrict the system calls that can be made by that container.
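As a rough illustration of how those kernel features surface in a pod spec - the image and names are placeholders - you can ask for the runtime's default seccomp profile, require the container to run as non-root, and drop all Linux capabilities it doesn't need:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app           # hypothetical pod
spec:
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault     # restrict system calls with the runtime's default seccomp profile
  containers:
  - name: app
    image: registry.example.com/app:1.0   # placeholder image
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]          # drop all Linux capabilities the app doesn't need
```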
You also might want to look at a host OS that is container optimized. Previously, we had Atomic Host. Now, we have just rolled out RHEL CoreOS, which is our new container-optimized operating system. There are others out there, so I don't mean to say you can't use anything else. These are some of the things, some of the value-adds you get. It's a smaller operating system, so it minimizes the attack surface. If you use a container-optimized OS, you want to look for one that is deployed from container images so that you get the advantage of immutability that container images provide you. Just like we've talked about a containerized app moving from dev to test to production with all of its dependencies built in, you're always deploying from the same container image - that's your binary. You can do the same thing for your OS. If you're deploying your OS from container images, you've got an immutable, read-only OS. You can always recreate it from that image. If you need to scrap it, it's easy to kill the host and build a new one and know that you've got exactly the same thing. Those are some key things to look for.
The Container Platform
That's at the host level. Then if we start looking at the container platform itself, Kubernetes, we need to think about the multitenancy features that need to be available in Kubernetes. Kubernetes is very configurable, it's designed with a strong plugin model, and so you're going to want to be taking advantage of that and thinking about what are the plugins you need to use and enable to be sure you've got multitenancy. We're going to walk through each of these bullets on this slide, we're going to talk through some of the capabilities of each of those.
I also want to mention that there are some good hardening guides available out there. I tend to run into folks who use CIS primarily in the financial sector, but they are becoming more popular in other sectors as well. It stands for Center for Internet Security. They provide a lot of benchmarks for hardening different types of solutions. The benchmarks themselves are free, available to anybody to download. They have a series of versioned benchmarks for Kubernetes that are definitely worth looking at and provide much more low-level, granular, point-by-point information on how to harden a cluster once it's deployed. Then, there are some open-source tools that give you the ability to do some automated evaluation of a Kubernetes cluster to see whether it conforms with those benchmarks. The slides will be available, but Docker Bench and kube-bench are both available on GitHub if you want to explore.
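For example, kube-bench can be run as a Kubernetes Job against a node. This is a minimal sketch loosely based on the job manifest published in the kube-bench repository - the image tag and host paths may differ for your version and distribution:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: kube-bench
spec:
  template:
    spec:
      hostPID: true                      # kube-bench needs to inspect host processes
      restartPolicy: Never
      containers:
      - name: kube-bench
        image: aquasec/kube-bench:latest # check the project for the current tag
        command: ["kube-bench"]
        volumeMounts:
        - name: etc-kubernetes
          mountPath: /etc/kubernetes     # read node configuration to check against the benchmark
          readOnly: true
        - name: var-lib-kubelet
          mountPath: /var/lib/kubelet
          readOnly: true
      volumes:
      - name: etc-kubernetes
        hostPath:
          path: /etc/kubernetes
      - name: var-lib-kubelet
        hostPath:
          path: /var/lib/kubelet
```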
I talked about the need for host OS. How does the host OS support multitenancy for containers? That was a fair amount of low-level information about Linux features and capabilities. How does the Kubernetes cluster take advantage of those capabilities in a way that means that you don't have to go talk to your Linux admin to be sure that everything's configured appropriately? It does that with something called pod security policies. Pod security policies are not yet GA in Kubernetes. Red Hat helped to contribute to this, because it's something that our customers needed before it was a high priority for the rest of the community. In OpenShift, we call them security context constraints, and over time, once pod security policies go GA, we will shift in OpenShift to using those. What do they do for you? What does a pod security policy do for you?
It's a cluster-level resource that controls security sensitive aspects of the pod or container specification. For those who aren't familiar, a container gets wrapped with some additional metadata in Kubernetes. That means it gets called a pod. We'll use the term somewhat interchangeably, but in general, if I'm talking about containers on Kubernetes, I'm talking about pods. The pod security policy defines a set of conditions that a pod or container must run with in order to be accepted into the system. For example, you can create a pod security policy that says no container, no pod can run with root privileges - and in fact, by default, you should do that. You should make sure that none of your containerized apps run as root. Sometimes that's a bit of a transition for folks that historically haven't had to worry about that, and now, there's more scrutiny as you're moving or refactoring your app, modernizing your app. You get a little bit more scrutiny from your security team, but it's a real benefit to the organization as a whole, to ensure that none of your applications are running as root.
You can also do things like limit the Linux capabilities or the system access a container has. You can limit whether the running container has access to network ports or to the file system. You can make file systems read-only. There are a lot of great capabilities available to you, and you should be taking advantage of these, especially in a multitenant cluster, to protect against accidental container breakouts. Again, OpenShift runs on RHEL with SELinux on by default. We always have SELinux as our brick wall protecting against breakouts. There are other types of solutions available in other operating systems. That's a big part of the runtime security.
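Here's a minimal sketch of what a restrictive pod security policy looks like in the beta policy/v1beta1 API (OpenShift's security context constraints express similar rules); which fields you tighten will depend on your workloads:

```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false                   # no privileged containers
  allowPrivilegeEscalation: false
  requiredDropCapabilities: ["ALL"]   # drop all Linux capabilities by default
  runAsUser:
    rule: MustRunAsNonRoot            # refuse any pod that tries to run as root
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  readOnlyRootFilesystem: true        # containers get a read-only root file system
  hostNetwork: false                  # no access to the host's network, PID, or IPC namespaces
  hostPID: false
  hostIPC: false
  volumes: ["configMap", "secret", "emptyDir", "persistentVolumeClaim"]
```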
The other thing I wanted to mention: ops teams typically have certain types of agents or scanners that run on a host. When you're working with a container-optimized host that deploys everything from container images, that means that these scanners need to be able to run as containers as well. You'll also find that the changes in technology that containers and Kubernetes create have meant that there's been this explosion of security startups out there. There are new companies doing new security solutions that understand containers, understand Kubernetes, and some of those require privileges. Again, if you're using pod security policies, you use a restricted one for most applications that doesn't allow privileges, and you have a different one for other containers or pods that may need privileges.
Identity and access management is a fundamental element of security. In Kubernetes, there are two types of users, or two types of identity - users that can actually log in to the system and service accounts. Given how automated this environment is, you want to be sure that you take advantage of that automation as much as possible. Say I'm configuring my application for deployment and I've got a bunch of different containers. One way I can be sure that they all know what to do and how to work when I'm not logged into the system is to associate them with a service account. That service account has the appropriate privileges for the service that I am deploying on the cluster. We want to be thinking about identity both for users who are logging in and for those service accounts. For individual users, Kubernetes doesn't have a built-in concept of identity, but it has the ability for you to plug in identity providers for your userbase.
These are just some examples of identity providers that are popular and that wind up being connected. If we're talking enterprise, your company might be using Active Directory as their main way to manage users and groups, and you want to be able to connect Active Directory to your cluster so that you can take advantage of the sophistication that something like Active Directory provides. That's where you have the ability to manage your users and disable folks who have left, etc. For your service accounts, you want to be sure that those running services can authenticate and identify themselves to each other. You can use X.509 certificates for that. Tokens and an authenticating proxy are other options, as well as HTTP Basic auth. Most of the enterprises that I talk to don't use HTTP Basic auth. They do something that's more involved than that.
There are some links here if you want to learn a little bit more about configuring, but again, you want to be sure that you connect to an external identity provider and that you configure your certificates for the services that are talking to each other appropriately.
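As a small sketch of the service-account side - the names here are illustrative, not from the talk - you create a service account in the application's namespace and reference it from the pod spec, so the pods authenticate to the API server as that identity rather than as a human user:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: orders-service         # hypothetical service identity
  namespace: orders            # hypothetical namespace
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-service
  namespace: orders
spec:
  replicas: 2
  selector:
    matchLabels:
      app: orders-service
  template:
    metadata:
      labels:
        app: orders-service
    spec:
      serviceAccountName: orders-service   # pods run with this identity and its token
      containers:
      - name: orders-service
        image: registry.example.com/orders-service:1.0   # placeholder image
```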
We talked about namespaces at the kernel layer, kernel namespaces. Kubernetes namespaces provide isolation for the container platform as well. You've got a set of identified users, and they have login access. A user can only access projects that they're assigned to - projects being the OpenShift word for Kubernetes namespaces. You have to be given access to a specific namespace to see the content that's in that namespace. Services run in an assigned namespace, and services or applications can only see other running applications that are in their namespace by default. There are a couple things we can do that are more sophisticated than that, but again: host OS isolation with kernel namespaces, Kubernetes isolation with Kube namespaces, identity, role-based access control.
Another key element - if you're building apps, you probably think about this for your applications as well. RBAC is available in Kubernetes, and you want to be sure that you take advantage of it. You can do project- or namespace-level roles, and you can also have cluster-level roles, and all of this RBAC ensures that you're only allowed to do what you are authorized to do once you've logged in. It's basic stuff but, honestly, this was not something that was initially part of the view early on in Kubernetes. As adoption has grown and we've seen more and more enterprise use, features like this have been added to Kubernetes, and it's good to take advantage of them.
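A minimal sketch of namespace-level RBAC - the names are placeholders - might grant a team's group read-only access to pods in its own namespace and nothing else:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: orders                 # role is scoped to this namespace only
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]   # read-only access
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: orders-team-pod-reader
  namespace: orders
subjects:
- kind: Group
  name: orders-team                 # group as defined by your external identity provider
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```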
Secrets management. Most of you are doing applications. Those of you who are doing containerized apps, how are you managing your secrets? Are you putting them in Docker files?
Participant 1: Using the secret.
Newcomer: You're using the secret capability in Kubernetes, this is good. Ideally, you never want your secrets in the container image. You don't want them in a Docker file anywhere. You want to make sure that you're taking advantage of the mechanisms Kubernetes provides to access secrets either as environment variables, volume mounts, or with external vaults, and there are a couple of very popular external vaults that are available that work with Kubernetes. One, in particular, is open source: HashiCorp has an open-source vault that works with Kubernetes if you want to store your secrets external to the cluster. CyberArk Conjur is another popular vault in this space.
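To illustrate - the secret name and keys here are hypothetical - a pod can consume a Kubernetes secret either as an environment variable or as a mounted volume, so the credential never ends up baked into the image or a Docker file:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: orders-app             # hypothetical pod
spec:
  containers:
  - name: app
    image: registry.example.com/orders-app:1.0   # placeholder image
    env:
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: db-credentials # hypothetical secret created separately
          key: password
    volumeMounts:
    - name: db-creds
      mountPath: /etc/db-creds # secret keys appear as files under this path
      readOnly: true
  volumes:
  - name: db-creds
    secret:
      secretName: db-credentials
```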
In addition, you can use the key-value store that is part of any Kubernetes cluster, and that's etcd. Etcd is where the known state for the cluster is stored. Remember when I talked about how Kubernetes knows that I need three instances of that web front-end running, and that one of them died and it needs to spin up another one? It stores the expected state for that application in etcd, the key-value store. We recently GA-ed the ability to encrypt the data store for etcd. You can also store your secrets in etcd, and if you do, encrypt the data store.
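Encryption at rest for etcd is configured on the API server with an encryption configuration file. This is a minimal sketch, assuming an AES-CBC key that you generate and manage yourself - the key material below is a placeholder:

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
  - secrets                    # encrypt Secret objects before they're written to etcd
  providers:
  - aescbc:
      keys:
      - name: key1
        secret: <base64-encoded-32-byte-key>   # placeholder; generate and rotate your own keys
  - identity: {}               # fall back to reading any data that's still unencrypted
```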
Cluster logging and audit are also critical - again, basic best practices - but by default, if you were just to deploy a vanilla Kubernetes cluster, it would not have a logging stack; you need to add your own. You want to think about doing that, and you want to make sure you do that. You want to look at a set of capabilities that allows you to aggregate those logs. I've got hundreds, maybe thousands, of containers running across hundreds of servers. How do I aggregate all the logged data? How do I aggregate logs from the control plane - the master nodes, those nodes that know everything about the cluster: the API server, etcd, etc.?
You want to make sure that you aggregate those, and you want to use log shipping. You need a log stack, and you want to push those logs off your cluster as soon as possible to maintain integrity, and ideally to a SIEM for analysis. You want to manage access control to the logs. Sure, your app teams need access to the logs for their applications, but they shouldn't have access to the cluster-level logs. Then you also want to be sure that you are auditing all user events, all API events, and any configuration changes that are happening on the cluster.
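API auditing is driven by an audit policy file passed to the API server. Here's a minimal sketch, with the usual caveat that your own policy should reflect your compliance requirements: it keeps secrets at metadata level so their contents never land in the audit log, captures full request and response bodies for changes to workloads, and records metadata for everything else.

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata              # never log secret contents, just who touched them and when
  resources:
  - group: ""
    resources: ["secrets", "configmaps"]
- level: RequestResponse       # full request/response for changes to workloads
  verbs: ["create", "update", "patch", "delete"]
  resources:
  - group: "apps"
    resources: ["deployments"]
- level: Metadata              # metadata for everything else: user, verb, resource, timestamp
```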
Network Multi-Tenancy
The third level in our stack of multitenancy is network multitenancy, and this is the place where a fair number of network security teams are not yet comfortable with software-defined networking. Kubernetes requires a software-defined network for all that intra-cluster communication: all those services talking to each other, connecting to the master so that the master knows what's going on in the cluster. That's done with an SDN. This, again, is a plugin space. There are a lot of different SDNs that are available for plugging in to your Kubernetes cluster. I'd strongly recommend that you look at SDNs that support network policy. Network policy became popular within the last year and a half or so, and it really gives you fine-grained control over communication between containers or pods in your cluster. Network policies allow you to define which pods can talk to each other on which port and in which direction.
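For instance - the names and port here are illustrative - a network policy can say that only pods labeled as the frontend may reach the backend pods, and only on port 8080; all other ingress to those pods is denied:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend
  namespace: orders            # hypothetical namespace
spec:
  podSelector:
    matchLabels:
      app: backend             # policy applies to the backend pods
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend        # only frontend pods in this namespace may connect
    ports:
    - protocol: TCP
      port: 8080
```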
How many of you are comfortable, have to do some thinking about network communication today? A little bit. You folks are doing a bunch of microservice-based apps, so that's partly why this is where you wind up having to spend a certain amount of your thinking. One of the other areas that's interesting about network policy is, this is actually something that's managed at the Kubernetes namespace or project level. This also means that it's not the cluster admin necessarily who's defining these policies, because this is about the applications that need to talk to each other. Network policies give you great micro-segmentation - we'll talk about some other alternatives in a little bit - but they also mean that you're going to need to negotiate potentially with your network team. If they're used to you building services that have this thing built in, that conversation will be a little easier, I imagine. We'll talk about service mesh in a little bit. Are you folks thinking about service mesh also? A little bit, maybe not so much.
If I've got a multitenant cluster, I need to think not just about how I manage communication within the SDN - the intra-cluster communication. I need to think about ingress and egress. How do I control communication coming into the cluster, and how do I control communication going out of the cluster? This model is something that a lot of network security teams are comfortable with conceptually: I have multiple zones in my environment, just as I might have multiple zones today in a traditional infrastructure. That can be done with a single Kubernetes cluster. You can have more than one ingress and egress controller running on that cluster so that you can determine what type of traffic is allowed on each of those ingress and egress points. For those who've got HIPAA or PCI DSS, you've got to be careful about regulatory compliance. You can use a combination of these controllers and network policy to ensure that those regulatory requirements around the communication - we're not yet talking data - are met. If you have an auditor or someone else in the company who's really uncomfortable with the idea that a PCI DSS app could be deployed anywhere in the cluster, you can use Kubernetes node selectors to ensure that regulated apps are only deployed to certain physical nodes, and that limits the scope that the auditor then is concerned with. The control plane will always be something the auditor cares about for regulated environments.
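The node-selector part of that might look like the following sketch - the label key and names are just examples - where nodes dedicated to regulated workloads carry a label and the regulated deployment asks for it:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payments-api           # hypothetical PCI DSS workload
  namespace: payments
spec:
  replicas: 3
  selector:
    matchLabels:
      app: payments-api
  template:
    metadata:
      labels:
        app: payments-api
    spec:
      nodeSelector:
        compliance: pci-dss    # only schedule onto nodes carrying the label compliance=pci-dss
      containers:
      - name: payments-api
        image: registry.example.com/payments-api:1.0   # placeholder image
```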
One of the really cool new things - I'm a little bit of a geek about this - is Multus. Kubernetes now gives you the ability, through the Multus meta CNI plugin, to have more than one network interface, more than one SDN interface, on a single pod. One of the cool things for me is that, now, my customers who've asked for the ability to isolate the control plane - all the masters, the API servers, etc. - from the data plane where the worker nodes run, where the applications are running, I can do that with this. You can have communication to the API server on one of these interfaces and any other external communication on another. This might be more than you need. We have plenty of customers running OpenShift with configurations like the one I showed before, meeting regulatory compliance, but I have a handful of folks for whom this additional capability is just a big win. I'm really excited that this is coming - or is here, actually, I should say.
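With Multus, you define additional networks as NetworkAttachmentDefinition objects and reference them from a pod annotation. This is a minimal sketch assuming a macvlan-based secondary network; the interface and addressing details are placeholders you'd replace with your own:

```yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: storage-net            # hypothetical secondary network
spec:
  config: '{
    "cniVersion": "0.3.1",
    "type": "macvlan",
    "master": "eth1",
    "ipam": { "type": "dhcp" }
  }'
---
apiVersion: v1
kind: Pod
metadata:
  name: multi-homed-app
  annotations:
    k8s.v1.cni.cncf.io/networks: storage-net   # pod gets eth0 from the cluster SDN plus a second interface from storage-net
spec:
  containers:
  - name: app
    image: registry.example.com/app:1.0        # placeholder image
```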
Securing Containerized Applications
From here, I was going to step into talking about securing your applications and your application pipeline. Again, best practices apply to securing your application pipeline. It's not that different from the best practices you would use for securing any application pipeline. However, one of the things that makes it more important and also gives us an opportunity again to shift security left is the fact that you never patch a containerized app that's running. You always rebuild and deploy from the updated container image. I see a few people laughing out there. Does that mean you're actually trying to patch running containers?
Participant 2: I was just wondering why you would try to do that.
Newcomer: Honestly, some people have not yet gotten enough automation into their pipeline to make it easy for them to quickly patch and deploy, because you really do need the right amount of automation to be confident in your updated container image in order to deploy it. I have some customers who have been doing apps for a long time, and yes, they can containerize them, but they don't yet have all the automated tests built that they need, and so it's a challenge for them. It's a real mindset shift, also. It does change who's responsible for what in many ways, too. The system dependencies are built into the container image, which is what you deploy for your containerized app. It might be that in a big organization, somebody else used to be responsible for updating or patching some of those system dependencies.
One of the ways you get those dependencies into your containerized app is with something like a base image. You have all of the Java runtime that you need for your Java apps or, if it's Ruby, you have whatever you need for Ruby. Whatever type of application you're using, you want a base image that includes the dependencies that you need for that programming language, that application to run. Maybe historically, somebody else patched the JRE. Now, what it means is you need to make that patch be part of the app dev pipeline, and that's a bit of a mindset change for people, too.
Just like with traditional apps, you get that content many times from external sources. If you're using Tomcat, you might download Tomcat from Apache. If you're using a base image, you might download it from Red Hat, you might download it from Apache in some other cases. You've got some external content that's going into your application, you've got the code you write, you need to put them together. You need to store the output, the binary that comes out of that somewhere, and you need to manage all of those stages of the process. Because some of that content comes externally, one of the opportunities that you have and one of the things to think about is how do I track changes to those external container images.
I'm not just going to download a JAR file if something in my base image changes. I need to download that updated base image. How am I going to track that? You want to take advantage of metadata to track changes to wherever you're pulling that base content from. Ideally then, as you're monitoring updates upstream of your application build, you can also figure out how you're going to pull those down, what scrutiny you're going to apply when you pull them down, and how automated your rebuilds with that new content are going to be.
I mentioned earlier automated tests; that's a key part of the pipeline, but in addition - and I imagine many of you have these kinds of capabilities built into your pipeline today - vulnerability scanning is also key. This is a place where a number of the companies that do vulnerability scanning saw the container wave coming and updated their technology so that they are able to scan container images, but not all of them did, which also opened things up for some new players in that space. These are just some of those players, along with an open-source scanner. Aqua Sec and Twistlock do things that go beyond vulnerability scanning. They also have some capabilities in runtime security that are interesting, as well as some compliance scanning capabilities. If you remember, you're never going to patch a running container. You've got to rebuild and redeploy. You need automation in your pipeline and security gates.
Actually, this is a place where some security teams get really excited - though some of them don't - because ideally, what this means is that you, as developers, get the information early in the process, and you can evaluate, manage, and assess the impact of those vulnerabilities. If instead I've got a whole bunch of applications that are waiting for whoever is responsible for running those scans to put them through the scan, and I get a huge output, and then it has to be filtered and triaged and sent back to the dev team, that's a huge delay to my deployment. Part of the value of agility that you get from using Kubernetes needs to be built into your pipeline as well.
How many of you are familiar with Kubernetes operators? Ok, not too many. I'm just going to touch lightly on this. We could spend a whole hour on operators, but I talked earlier about a lot of applications that are fairly complex. You need to specify deployment information about those applications and interoperability. Operators give you a way to build operational logic around the containerized apps that you're using and extend the Kubernetes API. Remember, Kube is all API driven, everything is done through APIs. You can get your business-specific logic into a form that Kube understands and that you can leverage through Kube. You can do that with Helm Charts, Ansible playbooks, and in Go.
This is an example of why you might need an operator. If you've got just a containerized Postgres database, that's not everything you need to really make good use of Postgres. You also need some of these additional things. You need cloud storage if you're doing it in the public cloud. You need replication, you need backup. That's a further step, but ideally, you also want to manage updates, and updates are not necessarily simple. Maybe I've got a schema change, I need to be sure that that's managed appropriately across my replicas. I want observability, I want some customization. Operators are the way you build all of that logic into your Kubernetes app. The reason I mentioned operators is it gives you more control and allows you to build some things in.
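To give a feel for what that looks like - the API group and fields here are entirely hypothetical, standing in for whatever a real Postgres operator defines - you describe the database you want as a custom resource, and the operator's controller does the deployment, replication, backup, and upgrade work:

```yaml
apiVersion: databases.example.com/v1alpha1   # hypothetical API group registered by the operator's CRD
kind: PostgresCluster
metadata:
  name: orders-db
spec:
  version: "12"                # operator handles upgrades, including schema-sensitive steps
  replicas: 3                  # primary plus replicas, with replication managed for you
  storage:
    size: 50Gi                 # backed by a storage class in your cluster or cloud
  backup:
    schedule: "0 2 * * *"      # nightly backups
```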
Istio, or service mesh. Service mesh is highly popular these days. For those of you who are doing microservice-based apps, if you haven't heard much about it, you will be hearing more. We talked earlier about network policies and how they help to control traffic within the cluster. However, there are some things that aren't in place with network policies, such as encryption of east-west traffic within the cluster. Service mesh adds that. Service mesh also adds additional policies and circuit-breaking capabilities, and it does it at the cluster level. For those of you who are having to think about this and build it into your applications, one of the great things about a service mesh is that it takes care of that for you. You no longer have to do language-specific programming around it. You can instead take advantage of the capabilities in the service mesh - the security policies, the circuit breaking, the load balancing - and you get things that your security team cares about, like TLS encryption of your east-west traffic.
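With Istio, for example, turning on mutual TLS for east-west traffic mesh-wide can be as small as the following sketch, assuming Istio is installed and the root namespace matches your installation:

```yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system     # applying it in Istio's root namespace makes it mesh-wide
spec:
  mtls:
    mode: STRICT              # sidecars only accept mutually authenticated, encrypted traffic
```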
Attached storage - again, pluggable. Storage is pluggable in Kubernetes. Many types are supported: NFS, AWS EBS, GCE Persistent Disks, GlusterFS, iSCSI. There are multiple ways that you can protect your attached storage, and because we're running short on time, I'm not going to walk through all of them, but be sure that you take a look at how you're going to do that, especially in a multitenant cluster.
Finally, also, let's make sure to take advantage of the ecosystem of security tools that are out there. There's a whole bunch of new tools from new companies and new tools from existing companies that understand how to help you secure and monitor a multitenant Kubernetes cluster.