Autopilot Became the Default Operation Mode for Google Kubernetes Engine

Google announced that Autopilot is now the default and recommended operation mode for Google Kubernetes Engine (GKE) clusters. With Autopilot, Google handles all Kubernetes cluster management tasks automatically. Introduced in 2021, Autopilot creates clusters that follow the best practices learned by Google's SRE and engineering teams.

The GKE architecture consists of a control plane and worker nodes. In Autopilot mode, Google manages the entire infrastructure shown in the following diagram, including the nodes. In GKE Standard mode, Google manages the control plane and the system components, and users manage the nodes.

Figure: GKE architecture

Autopilot abstracts GKE cluster management away from developers; the cluster infrastructure is provisioned based solely on the workload. As its consumption model, Autopilot uses compute classes, which are a subset of the GCP Compute Engine machine series. By default, GKE Autopilot Pods run on a computing platform optimized for general-purpose workloads such as web serving and medium-intensity batch jobs.
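As a minimal sketch of how a workload might ask for a non-default compute class, the Pod below adds a node selector on the `cloud.google.com/compute-class` label. The Pod name, the image, the resource values, and the choice of the `Scale-Out` class are illustrative assumptions rather than details from the announcement; the classes actually available should be checked against the GKE documentation.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-frontend            # hypothetical name
spec:
  nodeSelector:
    # Request a specific compute class; omit this selector to run on the
    # default general-purpose platform.
    cloud.google.com/compute-class: "Scale-Out"
  containers:
  - name: web
    image: nginx:1.25           # illustrative image
    resources:
      requests:
        cpu: "1"                # illustrative values
        memory: 2Gi
```

The manifest names only a class and resource requests; no node pool or machine type appears in it.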

With compute classes, developers can specify particular resources and CPU platforms directly in the workload definition. Autopilot then spins up the matching infrastructure automatically and sets the appropriate taints and tolerations for the workload. The following Deployment, for example, requests NVIDIA Tesla T4 GPUs:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: tensorflow
  name: tensorflow-t4
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tensorflow
  template:
    metadata:
      labels:
        app: tensorflow
    spec:
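      # Ask Autopilot for nodes with NVIDIA Tesla T4 GPUs attached.
      nodeSelector:
        cloud.google.com/gke-accelerator: nvidia-tesla-t4
      containers:
      - image: tensorflow/tensorflow:latest-gpu-jupyter
        name: tensorflow-t4
        resources:
          limits:
            # Number of GPUs requested for this container.
            nvidia.com/gpu: "4"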

With Autopilot, Google constantly monitors the cluster's control plane to ensure that Pods are always scheduled and scaled as the workload requires. Google also manages security: Autopilot uses a security-focused version of Kubernetes, applies security best practices, and uses shielded nodes by default.

Figure: Autopilot Pod base pricing

With Autopilot, users pay only for the resources that their Pods request in their PodSpecs. There are no other infrastructure costs, unlike traditional Kubernetes clusters, where some resources are typically overprovisioned. In a traditionally managed cluster, Kubernetes reserves part of each node's resources for system workloads, and the customer is billed for them. Autopilot eliminates this overhead because the customer pays only for the resources requested in the PodSpecs.
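To make the billing model concrete, here is a hedged sketch of a PodSpec with explicit resource requests. The Pod name, image, and quantities are illustrative assumptions, and including ephemeral storage among the billed resources is likewise an assumption to verify against the current Autopilot pricing page.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: billing-example         # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.25           # illustrative image
    resources:
      requests:
        cpu: 500m               # 0.5 vCPU requested
        memory: 2Gi             # 2 GiB of memory requested
        ephemeral-storage: 1Gi  # assumed to be billed alongside CPU and memory
```

Under the model described above, the charge for this Pod would follow the requested 0.5 vCPU, 2 GiB of memory, and 1 GiB of ephemeral storage, regardless of the size of the node that Autopilot provisions to run it.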
