Detecting Malicious Behaviour in GKE Using OSS Memory Analysis Tools

The Spotify R&D team recently shared how they analyze the memory of a Google Kubernetes Engine (GKE) cluster node when suspicious behaviour is detected. The primary goal is to determine whether something malicious is occurring within their workloads. The methodology they developed is based on dumping the kernel memory of the cluster nodes using open source tools: AVML, dwarf2json, and Volatility 3.

To understand what is happening on a Kubernetes node and which processes are running on it, the operating system (OS) kernel is the right place to look. The kernel is the main layer between user-space processes and the node's hardware resources, and it is responsible for important tasks such as process and memory management, network control, and file system and device control. The following figure gives an overview of the kernel layout:


Kernel layout


Many commercial tools are available for kernel analysis, but comparable results can be reached by combining the open source tools AVML, dwarf2json, and Volatility 3.

The Spotify R&D team shared its methodology to achieve this. It consisted of three steps:

  • Step 1: Create a kernel memory dump
  • Step 2: Build a symbol file of the kernel
  • Step 3: Analyze the kernel memory dump

The methodology was tested on the following architecture, created with Terraform and Python:


Test architecture


The kernel memory dump is a snapshot of all kernel activity at a specific point in time. GKE nodes run the hardened Container-Optimized OS (COS), which restricts access to kernel space. Acquire Volatile Memory for Linux (AVML) is a tool written in Rust that can acquire memory without knowing the OS or kernel a priori. By running AVML in a privileged container on GKE, it is possible to access kernel space through the /proc/kcore file and take a kernel memory dump.
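As a rough illustration of step 1, the following Python sketch assembles an AVML invocation that reads from /proc/kcore. It assumes the avml binary is on the PATH inside the privileged container. The helper names and the output path are hypothetical and are not taken from the Spotify repository.

```python
import subprocess


def build_avml_command(output_path: str, source: str = "/proc/kcore",
                       compress: bool = False) -> list[str]:
    """Return the argument list for an AVML memory acquisition."""
    cmd = ["avml", "--source", source]
    if compress:
        # Ask AVML to compress the resulting image.
        cmd.append("--compress")
    cmd.append(output_path)
    return cmd


def acquire_memory(output_path: str) -> None:
    """Run AVML; assumes a privileged container with the binary installed."""
    subprocess.run(build_avml_command(output_path), check=True)


if __name__ == "__main__":
    # Hypothetical output path, printed only to show the resulting command.
    print(" ".join(build_avml_command("/dump/node-memory.lime")))
```

Keeping the command construction separate from its execution makes the sketch easy to inspect without touching a real node.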

In the second step, an Intermediate Symbol File (ISF) is built for the kernel version running on the GKE node, so that the dump file can be interpreted. This is done by obtaining the vmlinux file, which is the uncompressed kernel image, and using the open source tool dwarf2json to build the symbol file. dwarf2json is a utility written in Go that processes files containing symbol and type information to generate Volatility 3 ISF JSON output suitable for Linux and macOS analysis. Thanks to an undocumented API, and given the build_id of the COS version running on the GKE node, it is possible to download the node's vmlinux file and build the ISF with dwarf2json.
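Step 2 could be sketched in Python as follows. The article does not give the address of the undocumented API, so the URL template is left as a caller-supplied parameter and the example value is a placeholder. The function names are hypothetical, and the dwarf2json binary is assumed to be on the PATH.

```python
import subprocess


def vmlinux_url(build_id: str, url_template: str) -> str:
    """Fill a caller-supplied URL template with the COS build_id.

    The real endpoint is not stated in the article; the template is an assumption.
    """
    return url_template.format(build_id=build_id)


def build_dwarf2json_command(vmlinux_path: str) -> list[str]:
    """Return the argument list that makes dwarf2json emit a Linux ISF."""
    return ["dwarf2json", "linux", "--elf", vmlinux_path]


def write_isf(vmlinux_path: str, isf_path: str) -> None:
    """Run dwarf2json and store its JSON output (written to stdout) in isf_path."""
    with open(isf_path, "wb") as isf_file:
        subprocess.run(build_dwarf2json_command(vmlinux_path),
                       stdout=isf_file, check=True)


if __name__ == "__main__":
    # Placeholder host and build_id, for illustration only.
    print(vmlinux_url("12345.0.0", "https://example.invalid/{build_id}/vmlinux"))
    print(" ".join(build_dwarf2json_command("vmlinux")))
```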

Step 3 consists of analyzing the kernel memory dump with Volatility 3, a framework for extracting digital artifacts from volatile memory. This tool makes it possible to see all the processes running on the same GKE node, both those of the privileged container and those of other pods.
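A minimal Python sketch of step 3 is shown below. It assumes Volatility 3 is installed so that the vol command is available, that the ISF from step 2 sits in the given symbol directory, and that the linux.pslist.PsList plugin is the one of interest. The helper names are hypothetical, and skipping any text before the first "[" is a defensive assumption about the JSON renderer's output rather than documented behaviour.

```python
import json
import subprocess


def build_volatility_command(dump_path: str, symbol_dir: str,
                             plugin: str = "linux.pslist.PsList") -> list[str]:
    """Return the argument list for a Volatility 3 run with JSON output."""
    return ["vol", "-q", "-r", "json", "-f", dump_path, "-s", symbol_dir, plugin]


def parse_rows(stdout_text: str) -> list:
    """Parse the JSON array printed by Volatility.

    Anything before the first '[' (for example a banner line) is skipped.
    """
    start = stdout_text.index("[")
    return json.loads(stdout_text[start:])


def list_processes(dump_path: str, symbol_dir: str) -> list:
    """Run the process-listing plugin and return its rows."""
    result = subprocess.run(build_volatility_command(dump_path, symbol_dir),
                            capture_output=True, text=True, check=True)
    return parse_rows(result.stdout)


if __name__ == "__main__":
    # Hypothetical paths, printed only to show the resulting command.
    print(" ".join(build_volatility_command("node-memory.lime", "./symbols")))
```

The returned rows can then be filtered for unexpected process names or parent relationships, which is the kind of starting point for analysis the article describes.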

This approach, based on a combination of open source tools, provides an alternative to existing commercial solutions for monitoring containerized workloads. The Spotify team stated:

Although this approach provides a snapshot of the process activity, it can be used either as a starting point for memory analysis in GKE or as a complement to existing commercial solutions.

All the code used in this research project is available in the bsidesnyc2023 repo on GitHub.
