Pinterest’s Journey to a Kubernetes Platform

Pinterest software engineers have revealed the custom tools and resources they introduced during the company's adoption of Kubernetes. The key takeaways for other teams looking to build their own platform as a service (PaaS) and associated developer workflow are that container orchestration systems can unify workload management, that the Kubernetes workload model can be extended with custom resource definitions, and that a robust end-to-end test pipeline is key to avoiding regressions.

Pinterest, a social media web and mobile app that allows users to save or "pin" information, has a large user base that has collectively saved more than 200 billion pins across 4 billion boards. As a result of this volume and the associated growth of its infrastructure stack, the Pinterest team faced several challenges. They stated that their engineers didn’t have a unified experience when launching their workloads and that managing very large numbers of virtual machines was creating a heavy maintenance load for the infrastructure team. Furthermore, it was hard to build infrastructure governance tools across the separate systems and to determine which resources could be recycled. This echoes Airbnb’s experience in simplifying their Kubernetes workflow. The team attempted to address these problems across three key themes: service reliability, infrastructure efficiency and developer productivity.

According to lead author Lida Li and team, the Cloud Management Platform team started their journey with Kubernetes in 2017 by dockerizing their production workloads and evaluating different container orchestration systems. The Kubernetes native workload model covered Deployments, Jobs and DaemonSets, but the team needed more than these to model their workloads. They stated that usability issues were ‘huge blockers’ on the way to adopting Kubernetes and that it would have been difficult to maintain different versions of runtime support on the same Kubernetes cluster. Their solution was to design custom resource definitions (CRDs). The resulting deploy workflow was a pre-release available to early adopters of the new Kubernetes-based Compute Platform, and the team was integrating it into their CI/CD platform to create a cleaner service for their engineers.

Figure: Pinterest Kubernetes pipeline. An overview of how to deploy Pinterest CRDs (image taken from the Pinterest Engineering Blog)

Pinterest designed its CRDs to achieve several ends that may also be informative for engineers considering Kubernetes adoption. Firstly, they wanted to bundle various native Kubernetes resources to work as a single workload, which saved their engineers from assembling them piece by piece. Secondly, they wanted to inject the necessary runtime support for their applications by adding the required sidecars, init containers, environment variables and volumes into the specification. Lastly, these definitions were used to perform life cycle management for native resources, such as reconciling the specifications and updating the event record. The Pinterest team concluded that this evolution significantly reduced the workload on engineers and therefore the risk of error. This echoes the experience that the Shopify team shared at QCon New York last year.
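To make the first two goals concrete, the following is a minimal sketch of how a controller might expand one custom workload object into bundled native resources with runtime support injected. It is not Pinterest's implementation: the field layout of the custom object, the `expand_workload` function, the choice of a Deployment plus a Service, and every image, sidecar and variable name are assumptions made for illustration. It uses plain Python dictionaries rather than a Kubernetes client library.

```python
from copy import deepcopy

# Hypothetical platform-owned runtime support (all names are illustrative).
RUNTIME_SIDECARS = [{"name": "log-agent", "image": "example.com/log-agent:1.0"}]
RUNTIME_INIT_CONTAINERS = [{"name": "fetch-config", "image": "example.com/fetch-config:1.0"}]
RUNTIME_ENV = [{"name": "PLATFORM_ENV", "value": "prod"}]
RUNTIME_VOLUMES = [{"name": "shared-logs", "emptyDir": {}}]


def expand_workload(custom):
    """Expand a hypothetical custom workload into native Kubernetes manifests."""
    name = custom["metadata"]["name"]
    labels = {"app": name}

    # Copy the user's containers and append the platform environment variables.
    containers = deepcopy(custom["spec"]["containers"])
    for container in containers:
        container["env"] = container.get("env", []) + deepcopy(RUNTIME_ENV)

    # Inject sidecars, init containers and volumes into the pod specification.
    pod_spec = {
        "initContainers": deepcopy(RUNTIME_INIT_CONTAINERS),
        "containers": containers + deepcopy(RUNTIME_SIDECARS),
        "volumes": deepcopy(RUNTIME_VOLUMES),
    }

    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": custom["spec"].get("replicas", 1),
            "selector": {"matchLabels": labels},
            "template": {"metadata": {"labels": labels}, "spec": pod_spec},
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {"selector": labels, "ports": [{"port": custom["spec"]["port"]}]},
    }
    # Both native resources are returned together as one bundled workload.
    return [deployment, service]


# Usage: the engineer writes only this short object (hypothetical schema).
EXAMPLE = {
    "metadata": {"name": "pin-api"},
    "spec": {
        "replicas": 3,
        "port": 8080,
        "containers": [{"name": "app", "image": "example.com/pin-api:2.4"}],
    },
}
manifests = expand_workload(EXAMPLE)
```

In this sketch the engineer declares only the application container, while the platform-owned defaults live in one place, which is one way to keep runtime support consistent across applications. A real controller would additionally watch for changes and reconcile the generated resources against the cluster state.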

One consideration for engineers taking on similar problems is that, to avoid inconsistencies between applications and a bloated maintenance and support burden, Pinterest found their infrastructure team needed to deploy all workflow types, such as pod-level sidecars, node-level daemonsets or VM-level daemons. Tinder, whose platform has run exclusively on Kubernetes since March 2019, took the opposite approach: its infrastructure responsibility is shared between all engineers in the organisation.

Another consideration is that the Pinterest team built an end-to-end test pipeline on top of the native Kubernetes test infrastructure, with tests deployed to all clusters. This mitigated the risks associated with going beyond the Kubernetes native workload model, and the engineers stated that it caught many regressions before they reached production. The Pinterest team was also integrating their deployment workflow into their new CI/CD platform.
