
Best Practices for Running Stateful Applications on Kubernetes

Key Takeaways

  • If you have a Kubernetes cluster and need to run stateful applications, you can run them outside the cluster, alongside it as a cloud service, or within your cluster.
  • Kubernetes provides robust mechanisms for deploying stateful applications - mainly the StatefulSet and DaemonSet controllers.
  • Stateful applications must have access to persistent storage. In Kubernetes you can allocate persistent storage manually or automatically.
  • PersistentVolumes (PV) and PersistentVolumeClaims (PVC) are two concepts that help you manage storage dynamically in Kubernetes.
  • Best practices for stateful application management in Kubernetes include making effective use of namespaces, service routing, ConfigMaps, and securing secrets.

 

In their early days, containers were envisioned as a mechanism for running stateless applications.

Over the past few years, the community realized the value of running stateful workloads in containers, and orchestrators like Kubernetes introduced the necessary features. 

Kubernetes provides the PersistentVolume (PV) architecture and controllers such as StatefulSet and DaemonSet. These let you create pods that run stateful workloads and keep them in operation while Kubernetes scales and provisions resources in the cluster, without breaking existing client connections.

It is far from straightforward, but it works, and anyone adopting Kubernetes as their runtime infrastructure must be familiar with it.


In this article I’ll explain the importance of running stateful applications in Kubernetes, present three options for running stateful workloads, and describe these mechanisms in more detail.

What Are Stateful Applications?

A stateful application allows a user to return to it over and over again and resume previous operations - for example, an email or online banking app. Stateful applications record the context of previous transactions, which may affect current or future transactions. Therefore, a stateful app must either ensure each user always contacts the same application instance or have a scheme for synchronizing data between instances.

The advantage of stateful processes is that the application can store the history and context of each transaction, tracking elements such as recent activity, configuration preferences, and window location, and allowing users to resume transactions. Stateful transactions behave like a continuous conversation with the same server.

Today, most applications are stateful. Technological advances such as containers and microservices have facilitated cloud-based application development, yet due to their dynamism, they make management of stateful processes more challenging.

Containerized Stateful Application Use Cases

There is an ever greater demand to run stateful applications on containers. Containerized applications can be used to simplify deployment and operations in complex environments such as edge-to-cloud or hybrid. Statefulness is also important for continuous integration and continuous delivery (CI/CD), because CI/CD pipelines must maintain state to ensure a coherent process from development to production deployment. 

Common use cases for containerized stateful applications include:

  • Machine learning operations (MLOps) - containers require statefulness in MLOps environments for several purposes, including sharing inference and training results and checkpointing for training jobs.
  • AI and data analytics processing - data processing and machine learning frameworks such as Apache Spark, Hadoop, Kubeflow, TensorFlow, and PyTorch increasingly support containerization. These platforms must repeatedly process massive amounts of data, requiring stateful mechanisms.
  • Messaging and databases - you may prefer to use local flash storage to achieve low latency, but this makes it hard to move containers between worker nodes, because data persists to the node. High-performance shared storage is important for a variety of applications including single-instance DBs such as MySQL, in-memory databases such as Redis, NoSQL databases such as MongoDB, business-critical applications such as SAP or Oracle, and messaging applications like Kafka.

3 Stateful Deployment Options in Kubernetes

There are three main options for running stateful workloads with a Kubernetes cluster: running them outside the cluster, as a cloud service alongside your cluster, or within the Kubernetes cluster.

1. Running A Stateful Application Outside of Kubernetes 

A common approach is to run your stateful application in a VM or bare metal machine, and have resources in your Kubernetes cluster communicate with it. The stateful application becomes an external integration from the perspective of pods in your cluster. 
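
As a rough sketch of how pods might reach such an external application, a Service of type ExternalName can give it a stable DNS name inside the cluster. The service name legacy-db and the host db.example.internal are hypothetical placeholders, not values taken from this article.

```yaml
# Sketch only: maps an in-cluster DNS name to an application running outside Kubernetes.
apiVersion: v1
kind: Service
metadata:
  name: legacy-db                     # hypothetical name; pods connect to "legacy-db"
  namespace: default
spec:
  type: ExternalName
  externalName: db.example.internal   # hypothetical DNS name of the VM or bare metal host
```

Pods in the same namespace could then use legacy-db as the hostname. An ExternalName Service works purely at the DNS level, so it does not add load balancing or health checking for the external application.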

The upside of this approach is that it allows you to run existing stateful applications as is, with no refactoring or re-architecture. If the application is able to scale up to meet the workloads required by the Kubernetes cluster, you do not need Kubernetes’ fancy auto scaling and provisioning mechanisms. 

The downside is that by maintaining a non-Kubernetes resource outside your cluster, you need a way to monitor its processes, manage its configuration, and perform load balancing and service discovery for that application. You are essentially duplicating work by setting up a parallel software workflow outside Kubernetes.

2. Running A Stateful Workload as a Cloud Service

A second, equally common approach is to run stateful applications as a managed cloud service. For example, if you need to run a SQL database with a containerized application, and you are running in AWS, you can use Amazon’s Relational Database Service (RDS). Managed databases tend to be elastically scalable, so as Kubernetes resources scale up, the stateful service can accommodate the increased demand.

The upside of this approach is that setup is easy, ongoing maintenance of the stateful workload should be straightforward, and you are working with a cloud-native resource that is compatible with your Kubernetes cluster.

The downside is that the managed cloud service comes at a cost, it will usually have limited customization, and may not offer the performance or latency properties you need. Also, by taking this approach, you are locking yourself into your cloud provider.

3. Running your Stateful Workload Inside Kubernetes

This approach is the most difficult to implement, but will give you the greatest flexibility and operating efficiency in the long term. You can use two native controllers provided by Kubernetes to run your stateful application: StatefulSet and DaemonSet.

The StatefulSet Controller

StatefulSet is a Kubernetes controller that manages multiple pods that have unique identities, and are not interchangeable (unlike a regular Kubernetes Deployment, in which pods are stateless and can be destroyed and recreated as often as needed). 

In a StatefulSet, each pod has a persistent, unique ID. Each pod can have its own persistent storage volumes. Because these identities persist when Kubernetes scales the set up or down or reschedules a pod, external users and other applications in the cluster can keep addressing the same pod.

The DaemonSet Controller

A DaemonSet is a controller that makes sure a copy of a pod runs on all nodes in your cluster, or on a specific subset of nodes defined by selectors. Whenever an eligible node is added to the cluster, the pod is started on it.

DaemonSets are very useful for stateful applications that need to run as background processes, such as monitoring or log aggregation. Generally speaking, DaemonSets are less flexible, but are easier to manage and have more predictable resource usage than StatefulSets.

Persistent Storage in Kubernetes

A volume is the Kubernetes entity that provides storage to the containers in a pod. All containers within a pod can share volumes, so you can use a volume to let several services that run in the same pod use the same mounted file system.

Non-Persistent Storage Volumes

In Kubernetes, to grant containers access to storage, you specify the required volume and the location at which to mount it in the container's file system.

A regular storage volume in Kubernetes has a defined lifetime - each volume is bound to the pod's lifecycle. The volume exists while the pod is active, and its contents are lost when the pod is deleted or recreated. This model is not suitable for stateful workloads, which is why Kubernetes introduced the concept of Persistent Volumes.
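
To make the pod-bound lifecycle concrete, here is a minimal sketch of a pod whose two containers share an emptyDir volume. The pod, container, and volume names are hypothetical, and busybox is used only as a convenient small image.

```yaml
# Sketch only: a non-persistent volume shared by two containers in one pod.
apiVersion: v1
kind: Pod
metadata:
  name: shared-scratch          # hypothetical name
spec:
  containers:
  - name: writer
    image: busybox:1.36
    # appends a timestamp to a file on the shared volume every 5 seconds
    command: ["sh", "-c", "while true; do date >> /data/out.log; sleep 5; done"]
    volumeMounts:
    - name: scratch
      mountPath: /data          # where the volume appears in this container
  - name: reader
    image: busybox:1.36
    # prints the latest line written by the other container
    command: ["sh", "-c", "while true; do tail -n 1 /input/out.log; sleep 5; done"]
    volumeMounts:
    - name: scratch
      mountPath: /input         # same volume, different mount point
      readOnly: true
  volumes:
  - name: scratch
    emptyDir: {}                # created with the pod, removed when the pod is deleted
```

Deleting this pod discards out.log along with the volume, which is the limitation Persistent Volumes are meant to address.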

PersistentVolumes (PV)

Kubernetes PersistentVolumes (PV) are storage objects that exist at the cluster level. Binding PVs to the cluster extends their lifetime beyond the lifecycle of a single pod. Since a PV is located at the cluster level, pods can share data. You can expand the size of a persistent volume, but you cannot reduce it.

There are two ways to provision a PV:

  • Statically - lets you pre-allocate storage resources. This assumes the physical storage resources available to the cluster are static.
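  • Dynamically - lets Kubernetes create PVs on demand, based on a StorageClass, whenever a claim requests storage, so available storage can grow to meet demand. To apply a default StorageClass to claims that do not name one, enable the DefaultStorageClass admission controller on the Kubernetes API server.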
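
The two provisioning styles might look roughly like the sketch below. All names are hypothetical. The hostPath location is an assumption suitable only for single-node test clusters, and the provisioner shown assumes a cluster on AWS with the EBS CSI driver installed. Substitute whatever your environment provides.

```yaml
# Static provisioning: an administrator creates the PV ahead of time.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-static-example            # hypothetical name
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual           # hypothetical class name used to match claims
  hostPath:
    path: /mnt/data                  # assumption: test cluster with this directory on the node
---
# Dynamic provisioning: PVs are created on demand from this StorageClass.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd                     # hypothetical name
  annotations:
    # marks this class as the default for claims that do not name a class
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com         # assumption: AWS EBS CSI driver is installed
parameters:
  type: gp3
reclaimPolicy: Delete
allowVolumeExpansion: true           # permits growing a volume later
volumeBindingMode: WaitForFirstConsumer
```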

PersistentVolumeClaim (PVC) 

A PVC lets Kubernetes users request storage. It works similarly to a pod, but while pods consume node resources, PVCs consume PV resources. Additionally, just as a pod can request specific levels of resources, PVCs can request specific access modes and sizes. 
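
As a minimal sketch of that request-and-consume relationship, the manifest below defines a claim and a pod that mounts it. The names data-claim and claim-consumer are hypothetical, and the example assumes the cluster has a default StorageClass or a matching PV available.

```yaml
# Sketch only: a claim for 1Gi of storage and a pod that consumes it.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim               # hypothetical name
spec:
  accessModes:
  - ReadWriteOnce                # requested access mode
  resources:
    requests:
      storage: 1Gi               # requested size
  # storageClassName is omitted, so the default class is used if one is configured
---
apiVersion: v1
kind: Pod
metadata:
  name: claim-consumer           # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.25
    volumeMounts:
    - name: data
      mountPath: /usr/share/nginx/html
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-claim      # the pod refers to the claim, not to a PV directly
```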

Here are key differences between PVs and PVCs:

 

|                  | PVs | PVCs |
|------------------|-----|------|
| Who creates them | Only cluster administrators and Kubernetes (via dynamic provisioning) can create PVs. | Developers and users can all create PVCs. |
| Type of resource | A PV is a cluster resource. | A PVC is a request for storage resources. |
| Consumption      | A PVC consumes PV resources. | Pods consume PVCs. |

StatefulSets and DaemonSets

StatefulSets 

StatefulSet is a workload API object designed to assist in managing stateful applications. It can manage the scaling and deployment of a collection of pods and provide guarantees about the uniqueness and ordering of these pods.

StatefulSet can help you handle storage volumes that provide persistence. Note that even though individual pods in a StatefulSet are susceptible to failure, your stateful workloads can be resilient to failure. Persistent pod identifiers let you match existing volumes to new pods provisioned by Kubernetes to replace failed ones.

StatefulSets are ideal for any application that requires the following:

  • Stable, unique network identifiers.
  • Ordered, graceful deployment and scaling.
  • Stable, persistent storage.
  • Ordered, automated rolling updates.

Below is an example, taken from the Kubernetes documentation, which demonstrates StatefulSet components. 

The example uses a headless Service named nginx to control the network domain. The StatefulSet is called web, and its spec indicates that three replicas of the nginx container will be launched in unique pods. The volumeClaimTemplates field provides stable storage using PVs provisioned by a PV provisioner.

apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx # has to match .spec.template.metadata.labels
  serviceName: "nginx"
  replicas: 3 # by default is 1
  template:
    metadata:
      labels:
        app: nginx # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "my-storage-class"
      resources:
        requests:
          storage: 1Gi

DaemonSets

DaemonSets are responsible for ensuring that all or certain nodes run copies of pods. Once nodes are added to a cluster, pods specified in the DaemonSet are added to the nodes. Once nodes are removed from a cluster, DaemonSet pods are garbage collected. Deletion of a DaemonSet cleans up the pods it created.

Here are common uses of DaemonSets:

  • Run a cluster storage daemon on each node
  • Run a logs collection daemon on each node
  • Run a node monitoring daemon on each node

You can use one DaemonSet that covers all nodes for each daemon type. You can also use several DaemonSets for one daemon type, using different flags, memory, and CPU requests for different types of hardware.
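
A hardware-specific DaemonSet could be sketched as follows. The node label hardware-type, the image example.com/node-monitor, and its --buffer-size flag are all hypothetical and stand in for whatever your daemon and node labeling scheme actually use.

```yaml
# Sketch only: one of several DaemonSets for the same daemon, tuned for one hardware type.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-monitor-highmem           # hypothetical name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: node-monitor-highmem
  template:
    metadata:
      labels:
        name: node-monitor-highmem
    spec:
      nodeSelector:
        hardware-type: highmem         # hypothetical node label; only matching nodes run the pod
      containers:
      - name: node-monitor
        image: example.com/node-monitor:1.0   # hypothetical image
        args: ["--buffer-size=large"]         # hypothetical flag for this hardware type
        resources:
          requests:
            cpu: 200m
            memory: 512Mi
          limits:
            memory: 512Mi
```

A second DaemonSet with a different nodeSelector value and smaller requests would cover the remaining nodes.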

Creating a DaemonSet

Run this command in order to create a DaemonSet in a Kubernetes cluster:

kubectl apply -f [Path to Daemonset spec].yaml

Defining DaemonSet parameters

Kubernetes lets you describe a DaemonSet using a YAML file. The daemonset.yaml example below defines a DaemonSet running a fluentd-elasticsearch Docker image. This example is also taken from the official documentation.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      tolerations:
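      # these tolerations let the daemonset run on control plane nodes
      # remove them if your control plane nodes should not run pods
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
      # older clusters taint these nodes with the "master" key instead
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule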
      containers:
      - name: fluentd-elasticsearch
        image: quay.io/fluentd_elasticsearch/fluentd:v2.5.2
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

Best Practices for Stateful Applications on Kubernetes

I covered several methods you can use to run stateful workloads on Kubernetes. Here are a few recommendations for running stateful applications more effectively:

  • Make effective use of namespaces - preferably, separate each stateful application into its own namespace to ensure clear isolation and easier resource management.
  • Use ConfigMaps - all scripts and custom configuration should be placed in a ConfigMap, to ensure all application configuration is handled declaratively.
  • Service routing - consider manageability of service routing as your application grows. Prefer to use a headless service instead of a load balancer.
  • Secrets management - plaintext secrets can create critical security risks for production applications. Ensure all secrets are managed in a robust secret management system.
  • Carefully plan storage - determine the persistent storage requirements of the application, ensure the physical storage equipment is available for use of the cluster, and define Storage Classes and PVCs in a way that will guarantee required storage resources for each application component.
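
Several of these recommendations can be combined in one manifest, sketched below under stated assumptions. The namespace orders-db, the ConfigMap contents, and the Secret name orders-db-credentials are hypothetical. The Secret itself is assumed to be created separately by your secret management system, so it is deliberately not defined here. A bare Pod is used only for brevity. In practice this template would sit inside a StatefulSet.

```yaml
# Sketch only: a dedicated namespace, declarative configuration, and a secret reference.
apiVersion: v1
kind: Namespace
metadata:
  name: orders-db                        # hypothetical: one namespace per stateful application
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: orders-db-config
  namespace: orders-db
data:
  init.sh: |
    #!/bin/sh
    # hypothetical initialization script kept in the ConfigMap
    echo "initializing orders-db"
---
apiVersion: v1
kind: Pod
metadata:
  name: orders-db-demo
  namespace: orders-db
spec:
  containers:
  - name: db
    image: postgres:16
    env:
    - name: POSTGRES_PASSWORD
      valueFrom:
        secretKeyRef:
          name: orders-db-credentials    # assumption: created outside this manifest
          key: password
    volumeMounts:
    - name: scripts
      mountPath: /docker-entrypoint-initdb.d   # the postgres image runs scripts found here on first start
  volumes:
  - name: scripts
    configMap:
      name: orders-db-config
      items:
      - key: init.sh
        path: init.sh
```

The pod will not start until the referenced Secret exists in the namespace, which keeps the password out of the manifest and out of version control.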

Conclusion

In this article I explained the basics of stateful containerized applications and showed how to manage stateful workloads in Kubernetes. This includes the following key building blocks:

  • PersistentVolumes (PV) - a construct that allows you to define a persistent storage unit and mount it to pods within Kubernetes clusters.
  • PersistentVolumeClaims (PVC) - a mechanism that allows pods to dynamically request storage that meets their requirements.
  • StatefulSet - a controller that lets you create pods with persistent IDs that remain in place even as Kubernetes dynamically scales applications in the cluster.
  • DaemonSet - a controller that lets you run a stateful workload across all nodes in a cluster, or a specific subset.

When you familiarize yourself with these building blocks, you’ll be able to run stateful workloads directly in Kubernetes clusters, safely and repeatedly. Like everything in Kubernetes, stateful mechanisms are far from intuitive and take time to master, but are robust and dependable when you get the hang of them. Get practicing and you’re on your way to becoming a stateful Kubernetes pro.
