Memory Issues with Linux Control Groups Might Affect Containerized Applications

LinkedIn’s engineering team recently published an article titled “Don’t Let Linux Control Groups Run Uncontrolled”. Control groups (cgroups) are a Linux kernel feature used by projects like Docker and CoreOS to restrict the resources that processes can use. The article outlines several memory management problems in cgroups that can degrade performance, along with possible workarounds.

Cgroups ensure that an application does not use more than its quota, but they do not ensure isolation. Multiple cgroups can run inside the same operating system instance, each with different quotas for RAM, CPU and so on. However, the operating system’s behavior when there is demand for memory (what the authors term ‘memory pressure’) can lead to unexpected and undesirable consequences for the applications running inside the cgroups.

Cgroups are arranged in a hierarchy: the operating system runs in a ‘root’ cgroup, and all other cgroups are its children. For example, a Docker container would run in a cgroup that is a child of the root cgroup.
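
To see where a process sits in this hierarchy, Linux lists each process's cgroup membership in /proc/<pid>/cgroup, one line per hierarchy in the form hierarchy-ID:controller-list:cgroup-path. The Python sketch below parses that format. The function name, the CgroupEntry type and the sample container path are illustrative assumptions, not something taken from the LinkedIn article.

```python
from typing import List, NamedTuple


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    controllers: List[str]  # empty for a cgroup v2 entry ("0::/path")
    path: str               # "/" means the root cgroup


def parse_proc_cgroup(text: str) -> List[CgroupEntry]:
    """Parse the contents of /proc/<pid>/cgroup into structured entries."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split at most twice so that any ':' inside the path is preserved.
        hierarchy_id, controllers, path = line.split(":", 2)
        entries.append(
            CgroupEntry(
                int(hierarchy_id),
                controllers.split(",") if controllers else [],
                path,
            )
        )
    return entries


# Hypothetical sample resembling what a containerized process might see.
SAMPLE = "4:memory:/docker/abc123\n3:cpu,cpuacct:/docker/abc123\n0::/\n"
entries = parse_proc_cgroup(SAMPLE)
```

On a Linux host, the same function can be fed the text read from /proc/self/cgroup. A path other than "/" indicates that the process runs in a child of the root cgroup.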

The issues outlined in the article concern two kinds of memory: ‘anonymous memory’, which is memory that a program requests for its own use, and ‘page cache’, which holds cached copies of a program’s data that normally resides on permanent storage like a hard drive and is used during program execution. The cache speeds up access to that data. Allocations of both memory types can always be superseded by the root cgroup or the operating system.

The operating system loads page cache into RAM while main memory is available, but reclaims that memory when applications demand it. Reclamation evicts page cache, and it is done across cgroups since the OS does not respect cgroup ownership in this case. A cgroup can therefore lose its page cache to reclamation, and the performance of its application suffers as a result.

Another problem arises when memory demand for a cgroup is satisfied by evicting the page cache. The memory used for storing the page cache is part of a cgroup’s memory limit. So if a cgroup (in the case of Docker, a container) is allocated 8 GB of RAM, it will have to use the 8 GB for both page cache and anonymous memory. This is something that can be easily overlooked and thus might lead to incorrect expectations about performance.
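
The split between the two memory types can be inspected from the cgroup's own accounting. The Python sketch below assumes the cgroup v1 memory controller, whose memory.stat file reports page cache as "cache" and anonymous memory as "rss", both in bytes (cgroup v2 uses "file" and "anon" instead). The function names and sample figures are hypothetical, and the computed headroom is only approximate because the limit can also cover other kinds of memory.

```python
GIB = 1024 ** 3


def parse_memory_stat(text):
    """Turn 'key value' lines from a memory.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def memory_breakdown(stat_text, limit_bytes):
    """Show how page cache and anonymous memory share one cgroup limit."""
    stats = parse_memory_stat(stat_text)
    page_cache = stats.get("cache", 0)  # assumed cgroup v1 field name
    anonymous = stats.get("rss", 0)     # assumed cgroup v1 field name
    return {
        "page_cache": page_cache,
        "anonymous": anonymous,
        "headroom": limit_bytes - page_cache - anonymous,
    }


# Hypothetical figures: an 8 GiB limit with 3 GiB cached and 4 GiB anonymous.
sample_stat = "cache 3221225472\nrss 4294967296\n"
breakdown = memory_breakdown(sample_stat, 8 * GIB)
```

With these sample figures only 1 GiB of the 8 GiB limit is left, even though the application itself has requested just 4 GiB.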

The OS also performs swapping, which means writing program data held in main memory out to secondary storage, like a hard drive, when demand for main memory exceeds what the system has. The OS can swap out user memory from any of the child cgroups, leading to performance degradation for the applications running in those groups.

The article’s authors suggest several workarounds for these problems. One is pre-touching the memory, which means ensuring that the memory is actually allocated when the process starts rather than on demand. The exact methods of doing this vary across platforms. Another option is to assess the memory footprint of an application more carefully so that allocation can be done more accurately. Page cache usage is not easy to estimate, but anonymous memory can be estimated easily from system metrics like the Resident Set Size (RSS).
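
Both ideas can be sketched in Python. This is an illustration under stated assumptions, not the method used in the LinkedIn article: pretouch writes one byte to every page of an anonymous mapping so that the kernel has to back it with real memory up front, and parse_vm_rss_kb extracts the VmRSS value (in kB) from text in the format of /proc/<pid>/status on Linux. Both function names are hypothetical.

```python
import mmap


def pretouch(size_bytes):
    """Allocate an anonymous mapping and touch every page of it.

    Writing one byte per page forces each page to be allocated now
    instead of lazily on first use. size_bytes must be greater than zero.
    """
    buf = mmap.mmap(-1, size_bytes)  # -1 requests anonymous memory
    for offset in range(0, size_bytes, mmap.PAGESIZE):
        buf[offset] = 1
    return buf


def parse_vm_rss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


# Touch four pages up front; a real service would size this to its heap.
buf = pretouch(4 * mmap.PAGESIZE)
```

On Linux, reading /proc/self/status before and after the pretouch call and passing the text to parse_vm_rss_kb should show the resident set growing by roughly the touched size. Note that VmRSS also counts resident file-backed pages, so it is an estimate rather than an exact figure for anonymous memory.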

A new version of cgroups has been released with improvements, but is yet to be tested for these cases.
