Expedia Open-Sources Container-Startup-Autoscaler (CSA) for Scaling Kubernetes Workloads

Expedia's Performance and Reliability team recently open-sourced its container-startup-autoscaler (CSA). CSA is a Kubernetes controller that leverages the In-Place Update of Pod Resources feature to dynamically adjust CPU and/or memory resources of containers during startup based on user-defined startup/post-startup configurations.

The In-Place Update of Pod Resources feature has been available since Kubernetes 1.27.0, where it was introduced in alpha state. It allows a pod's container resources (requests and limits) to be modified without restarting the pod. Previously, any change to container resources required a pod restart to take effect.
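As a rough illustration of what an in-place resize request looks like, the sketch below builds a patch body that sets new requests and limits for one container. The field paths follow the standard pod spec; the container name and resource values are made up for the example, and how the patch is submitted (which client, which endpoint) depends on the cluster version and is not shown here.

```python
import json


def build_resize_patch(container_name, cpu, memory):
    """Build a pod patch body that sets requests == limits for one container.

    This only constructs the payload; sending it to the API server is left
    to whichever Kubernetes client is in use.
    """
    resources = {"cpu": cpu, "memory": memory}
    return {
        "spec": {
            "containers": [
                {
                    # The container is identified by name within the pod spec.
                    "name": container_name,
                    "resources": {
                        "requests": dict(resources),
                        "limits": dict(resources),
                    },
                }
            ]
        }
    }


# Hypothetical container name and values, for illustration only.
patch = build_resize_patch("app", "500m", "512Mi")
print(json.dumps(patch, indent=2))
```

With the feature gate enabled, a patch of this shape changes the container's resources while the pod keeps running, rather than triggering a new pod.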

A longstanding concern in Kubernetes workload management has been optimizing container resources for workloads exhibiting contrasting resource utilization patterns during the startup and post-startup phases. Before the introduction of in-place resource updates, there existed a trade-off for startup-intensive workloads between achieving consistent startup times and minimizing resource wastage post-startup:

Burstable Quality of Service (QoS):
- Setting limits higher than requests, banking on surplus resources beyond requests during startup.
- Startup times are unpredictable due to dependency on cluster node-loading conditions.
- Post-startup performance may also be erratic due to the volatile nature of additional scavenged resources, especially with cluster consolidation mechanisms.
Guaranteed QoS (1):
- Establishing limits equal to requests, prioritizing startup time.
- Predictable startup time and post-startup performance, but potential wastage, particularly with excessive pod replica counts.
Guaranteed QoS (2):
- Setting limits equal to requests, emphasizing normal workload-servicing performance.
- Predictable and acceptable post-startup performance, albeit at the expense of slower startup times, impacting operational efficiency by elongating deployment durations and horizontal scaling reaction times.
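The distinction between the Burstable and Guaranteed options above comes down to how requests relate to limits. The following is a simplified sketch of that classification for a single-container pod. It compares quantity strings literally (so "1" and "1000m" would not be treated as equal), which a real implementation would need to handle by parsing quantities.

```python
def qos_class(requests, limits):
    """Classify a single container's CPU/memory settings into a QoS class.

    Simplified: assumes a one-container pod and compares quantity strings
    as-is instead of parsing them.
    """
    if not requests and not limits:
        return "BestEffort"
    guaranteed = all(
        # A limit must be set, and the request (which defaults to the
        # limit when omitted) must equal it.
        name in limits and requests.get(name, limits[name]) == limits[name]
        for name in ("cpu", "memory")
    )
    return "Guaranteed" if guaranteed else "Burstable"


# Illustrative values only.
print(qos_class({"cpu": "500m", "memory": "1Gi"}, {"cpu": "2", "memory": "1Gi"}))
print(qos_class({"cpu": "2", "memory": "1Gi"}, {"cpu": "2", "memory": "1Gi"}))
```

The first call describes the Burstable option (limits above requests), the second a Guaranteed configuration sized for startup.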

The container-startup-autoscaler (CSA) operates at the pod level. It works with workload management APIs such as Deployments, StatefulSets, and DaemonSets, so it is compatible with different ways of managing pods. It handles both initial container startups and Kubernetes-initiated restarts.

CSA logic schema

CSA targets a single container within a pod, which must be neither an init container nor an ephemeral container. The target container's name and the desired startup and post-startup resource settings are specified through pod annotations.

CSA watches pods designated for scaling (identified via a label) and reacts to changes in them. When it detects a change in an eligible pod, CSA evaluates the current state of the target container and takes one of the following actions:

- Commanding startup resource settings (when the target container is not started and has post-startup settings applied).
- Commanding post-startup resource settings (when the target container is started and has startup settings applied).
- Assessing the status of a previously issued scaling command and reporting on it. A scale is considered successful once it has been enacted.
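The three actions above can be read as a small decision function. The sketch below is one interpretation of that description and not CSA's actual code: the names are hypothetical, and how CSA determines whether the container has started is not covered here, so it is passed in as a plain boolean.

```python
from enum import Enum


class Action(Enum):
    APPLY_STARTUP = "command startup resources"
    APPLY_POST_STARTUP = "command post-startup resources"
    CHECK_PREVIOUS = "assess previously commanded scale"


def decide(started, current, startup, post_startup):
    """Pick an action from the container state and its current resources.

    `current`, `startup` and `post_startup` are any comparable resource
    descriptions, for example (cpu, memory) tuples.
    """
    if not started and current == post_startup:
        return Action.APPLY_STARTUP
    if started and current == startup:
        return Action.APPLY_POST_STARTUP
    # In every other combination a scale was already commanded (or none is
    # needed), so only its status remains to be assessed and reported.
    return Action.CHECK_PREVIOUS


# Illustrative resource settings.
startup = ("2", "2Gi")
post_startup = ("500m", "1Gi")

print(decide(False, post_startup, startup, post_startup))
print(decide(True, startup, startup, post_startup))
print(decide(True, post_startup, startup, post_startup))
```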

CSA acts when the target container is first created by its pod and whenever Kubernetes restarts it. It does not issue scaling commands when none are needed. For instance, if the target container repeatedly fails to start before becoming ready (prompting Kubernetes to restart it in a CrashLoopBackOff manner), CSA applies startup resources only once. CSA also generates metrics and Kubernetes pod events, along with a detailed status that is written to an annotation on the scaled pod.

CSA has some limitations:

- The initially declared resources of the target container must be guaranteed (requests == limits) to match the guaranteed nature of startup resources. The Kubernetes API currently rejects changes to a pod's Quality of Service (QoS) class. This limitation may be lifted as the In-Place Update of Pod Resources feature evolves.
- Post-startup resources must also be guaranteed (requests == limits), for the same reason.
- Failed attempts to scale the target container are not retried.
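The first two limitations amount to a precondition that can be checked before a pod is handed to CSA. The sketch below shows such a check under stated assumptions: the function names and the dictionary layout are hypothetical, and startup resources are assumed to be guaranteed by construction, so only the initial and post-startup settings are verified.

```python
def is_guaranteed(resources):
    """True when requests are set and identical to limits."""
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})
    return bool(requests) and requests == limits


def validate_csa_resources(initial, post_startup):
    """Return a list of problems; an empty list means the settings qualify."""
    problems = []
    for label, resources in (("initial", initial), ("post-startup", post_startup)):
        if not is_guaranteed(resources):
            problems.append(f"{label} resources must have requests == limits")
    return problems


# Illustrative values: the initial settings are guaranteed, while the
# post-startup settings are burstable and therefore flagged.
initial = {
    "requests": {"cpu": "500m", "memory": "512Mi"},
    "limits": {"cpu": "500m", "memory": "512Mi"},
}
post_startup = {
    "requests": {"cpu": "250m", "memory": "512Mi"},
    "limits": {"cpu": "500m", "memory": "512Mi"},
}
print(validate_csa_resources(initial, post_startup))
```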

CSA's primary objective is to let Kubernetes workload administrators set container resources for startup separately from post-startup resources, easing the trade-offs described above. This approach helps with:

- Mitigating resource wastage by separating resource settings for the startup and post-startup phases.
- Enhancing startup performance and predictability, enabling faster horizontal scaling operations.

As of Kubernetes 1.29, the In-Place Update of Pod Resources feature, on which CSA relies, is still in alpha. CSA therefore requires the InPlacePodVerticalScaling feature gate to be enabled. Because both the feature and CSA's implementation are still under development, caution is advised. The Expedia team recommends using CSA for preview purposes only, in local or non-production Kubernetes environments, until it reaches stable status.

About the Author

Claudio Masolo
