Uber's Engineering Manages to Cut 70k CPUs by Tuning Go GC

In an effort to help the company become profitable, Uber’s Maps Production Engineering team has focused on making its use of infrastructure more efficient. As an outcome of this effort, the team developed a semi-automated Go garbage collection (GC) tuning mechanism that saved 70K CPU cores across 30 mission-critical services. The tuning library was built mostly in Go and runs on top of their cloud-native, scheduler-based infrastructure.

Drawing on prior experience of improving the efficiency of Java services by tuning garbage collection, the team ran exploratory profiling sessions. These showed that almost 25 percent of the CPU time of their Go services was being spent on garbage collection (visible in profiles as the runtime.scanobject function).
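
For readers who want to take a similar measurement on their own service, the following is a minimal sketch that uses the standard runtime/pprof package. It is not Uber's tooling: the churn workload and the captureCPUProfile helper are hypothetical names introduced only for this illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// sink keeps each buffer reachable from a global so it is heap-allocated.
var sink []byte

// churn allocates short-lived buffers to give the garbage collector work.
// It is a stand-in workload, not a realistic service.
func churn(rounds int) {
	for i := 0; i < rounds; i++ {
		sink = make([]byte, 64<<10)
	}
}

// captureCPUProfile runs work under the CPU profiler and returns the
// compressed profile bytes, which `go tool pprof` can read.
func captureCPUProfile(work func()) ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, err
	}
	work()
	pprof.StopCPUProfile() // flushes the profile into buf
	return buf.Bytes(), nil
}

func main() {
	profile, err := captureCPUProfile(func() { churn(50000) })
	if err != nil {
		fmt.Println("profiling failed:", err)
		return
	}
	fmt.Println("profile size in bytes:", len(profile))
}
```

Writing the returned bytes to a file and opening it with `go tool pprof -top` lists the hottest functions; GC work typically appears under names such as runtime.scanobject.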

Microservices in Uber’s application portfolio differ widely in memory utilization. A sharded system, for instance, can have very different live sets across its instances. In one case, the p99 utilization was 1GB while the p1 was 100MB, so the instances around p1 were suffering a huge GC impact. Because a service is not aware of the memory limit of the container it runs in, it became obvious to the team that tuning with a fixed GOGC value would not be appropriate in their case.
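
The effect of a fixed GOGC value can be illustrated with some arithmetic. The sketch below assumes the classic pacing rule, in which the next collection starts when the heap reaches roughly the live set times (1 + GOGC/100); recent Go versions also count stacks and globals, so real numbers differ somewhat. The heapTargetMB helper is a hypothetical name, and 1GB is rounded to 1000MB.

```go
package main

import "fmt"

// heapTargetMB returns the approximate heap size (in MB) at which the next
// GC cycle starts, using the simplified rule: live * (1 + GOGC/100).
func heapTargetMB(liveMB, gogc int) int {
	return liveMB + liveMB*gogc/100
}

func main() {
	const gogc = 100 // Go's default GOGC value

	// The p1 and p99 live sets mentioned in the article.
	for _, liveMB := range []int{100, 1000} {
		target := heapTargetMB(liveMB, gogc)
		fmt.Printf("live=%dMB target=%dMB headroom=%dMB\n",
			liveMB, target, target-liveMB)
	}
}
```

With the same GOGC, the 100MB instance collects after every 100MB of new allocations, however much memory its container still has free, while the 1GB instance gets ten times that headroom.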

This led to the conception of GOGCTuner: a library that simplifies garbage collection tuning for service owners and adds a layer of reliability on top. The tuner dynamically computes the correct GOGC value from the container’s memory limit (or an upper limit set by the service owner) and applies it through Go’s runtime API.
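
The two building blocks named here, the container limit and the runtime API, are both reachable from plain Go. The sketch below is an illustration rather than the tuner's code: it assumes a cgroup v2 host where the limit is exposed at /sys/fs/cgroup/memory.max (cgroup v1 uses a different file), parseCgroupLimit is a hypothetical helper, and the value 200 is a placeholder, not a recommendation.

```go
package main

import (
	"fmt"
	"os"
	"runtime/debug"
	"strconv"
	"strings"
)

// parseCgroupLimit parses the contents of a cgroup memory limit file.
// cgroup v2 writes "max" when no limit is set; that case returns ok=false.
func parseCgroupLimit(raw string) (limit uint64, ok bool) {
	s := strings.TrimSpace(raw)
	if s == "" || s == "max" {
		return 0, false
	}
	n, err := strconv.ParseUint(s, 10, 64)
	if err != nil {
		return 0, false
	}
	return n, true
}

func main() {
	// Assumed cgroup v2 location of the container memory limit.
	raw, err := os.ReadFile("/sys/fs/cgroup/memory.max")
	if err != nil {
		fmt.Println("could not read cgroup limit:", err)
		return
	}
	limit, ok := parseCgroupLimit(string(raw))
	if !ok {
		fmt.Println("container has no memory limit")
		return
	}
	fmt.Println("container memory limit in bytes:", limit)

	// debug.SetGCPercent changes GOGC at run time and returns the old value.
	// 200 is a placeholder; a tuner would compute this from the limit.
	old := debug.SetGCPercent(200)
	fmt.Println("GOGC changed from", old, "to 200")
}
```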

The library was built with the following features:

  • Simplified configuration for easier reasoning and deterministic calculations.
  • Protection against Out Of Memory (OOM) kills: the library reads the memory limit from the cgroup and applies a default hard limit of 70% of it, a value the team considers safe based on experience. This protection has its own limit, though: the tuner can only adjust the buffer above the live set, so if a service’s live objects exceed the hard limit, the tuner falls back to a default lower bound of 1.25X the live object utilization.
  • Allowing higher GOGC values for corner cases: for example, if the live dataset doubles at peak, the tuner enforces the same memory limit at the cost of more CPU, whereas a manually fixed GOGC value could cause an OOM.
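
The rules in this list can be sketched as a small calculation. This is one reading of the description, not Uber's implementation: computeGOGC and its constants are hypothetical names, the 70% and 1.25X figures come from the article, and the pacing rule (heap target = live set times (1 + GOGC/100)) is the simplified one.

```go
package main

import "fmt"

const (
	hardLimitPercent = 70 // share of the container limit the heap may reach
	minGOGC          = 25 // corresponds to a heap target of 1.25X the live set
	mib              = 1 << 20
)

// computeGOGC picks a GOGC value so that the heap target stays at the hard
// limit, falling back to minGOGC when the live set leaves no room.
func computeGOGC(containerLimit, liveBytes uint64) int {
	hardLimit := containerLimit * hardLimitPercent / 100
	if liveBytes == 0 || hardLimit <= liveBytes {
		return minGOGC // no sample yet, or live set already above the hard limit
	}
	gogc := int((hardLimit - liveBytes) * 100 / liveBytes)
	if gogc < minGOGC {
		return minGOGC
	}
	return gogc
}

func main() {
	const limit = 4000 * mib // hypothetical container limit

	for _, liveMiB := range []uint64{100, 1000, 2000, 3000} {
		fmt.Printf("live=%dMiB gogc=%d\n", liveMiB, computeGOGC(limit, liveMiB*mib))
	}
}
```

In this sketch, a live set that doubles from 1000MiB to 2000MiB lowers the computed GOGC from 180 to 40, so the heap target stays at 2800MiB and the collector simply runs more often. A GOGC fixed at 180 would instead aim for 5600MiB, above the 4000MiB container limit.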

For better observability during their effort, the team tracked several metrics:

  • Intervals between garbage collections: Go forces a GC at least once every two minutes, so if this graph regularly shows intervals of that length, GOGC tuning has nothing more to offer and the team needs to work on optimising allocations instead.
  • GC CPU impact: this showed how much CPU the garbage collector consumed and how strongly each service was affected.
  • Live dataset size: when the amount of used memory increased, this metric let the team confirm that the live usage itself stayed steady.
  • GOGC value: this showed how the tuner was reacting.

The resulting semi-automated tuning mechanism saved 70K CPU cores across 30 mission-critical services. Given that many of the tools used in Uber’s infrastructure are built in Go as well (among others: Kubernetes, Prometheus, Jaeger), large-scale deployments outside Uber could also benefit from this kind of memory tuning.

Although the tool remains internal to Uber, other Go developers have been inspired by these efforts and have developed related open source tools.

Community comments

  • Link to the Jaeger

    by Edmondas Girkantas,

    The link to Jaeger is incorrect, it should be www.jaegertracing.io/

  • Re: Link to the Jaeger

    by Olimpiu Pop,

    Thank you, Edmondas! Corrected.

  • Are Garbage Collectors the right solution for the wrong problem?

    by Enrique Benito,

    There are languages like Rust or Vlang that work like a charm with no need at all for complex garbage collectors.
    Even "old" languages like C can get rid of manual memory cleanup with the appropriate library. A long time ago I created a very primitive one using just the appropriate macros; it took me no more than a week, even though I am a very poor C developer, and it was good enough for my daily tasks. For anyone curious, it is available at github.com/earizon/libctrans.

    It looks to me that GCs are complex solutions trying to solve a complex problem, but this complex problem could be reduced to a non-complex one in the first place.

  • Re: Are Garbage Collectors the right solution for the wrong problem?

    by Olimpiu Pop,

    Thank you for sharing, Enrique. I don't know the memory model Rust has, but I am curious now. Thank you!
