Uber Migrates 4000+ Microservices to a New Multi-Cloud Platform Running Kubernetes and Mesos

Uber moved most of its containerized microservices from µDeploy to a new multi-cloud platform named Up in preparation for migrating a considerable portion of its compute footprint to the cloud. The company spent two years making its many microservices portable so that they can be migrated across different compute infrastructure and container management platforms.

Uber started as a monolithic application but, as it grew, began transitioning to a microservice architecture in 2014. The company created µDeploy to help standardize service deployments at scale. µDeploy abstracted away host management and host-level placement, but service management and zone placement remained highly manual: service engineers still had to decide in which zone (physical data center) within a region each service should run.

Mathias Schwarz, a senior staff engineer, and Andrew Neverov, an engineering manager at Uber, explain the reasons for Uber’s decision to detach engineering teams from the infrastructure completely:

Operating our on-prem data centers, we experienced long lead times due to chip shortages and supply chain issues. On February 13, 2023, Uber partnered with Oracle and Google, aiming to diversify and decrease the company’s exposure to supply chain issues. Executing on this strategy would be impossible without having a system in place to abstract away the underlying infrastructure from thousands of Uber engineers working on hundreds of various services powering the business.

In 2018, Uber’s platform team started working on a new multi-cloud, multi-tenant federation control plane responsible for automating service placements and infrastructure-level migrations. The new platform, named Up, was meant to become the primary tool for service engineers to interact with the infrastructure systems. It would also manage and enforce best practices to drive towards safe code rollouts.

Up: High-Level Architecture (Source: Uber Engineering Blog)

The Up platform has a layered architecture. The experience layer is responsible for user interactions and system housekeeping, including workload management and scaling. The platform layer provides the common abstractions and conceptual model that the experience layer components use, and it is where service placement constraints are expressed in terms of host machine capabilities and compute capacity. The federation layer implements integration with compute clusters and carries out service placements based on available capacity and the defined placement constraints. The change management component delivers gradual rollout capabilities supported by health monitoring. The bottom-most layer contains the actual cluster instances, running Peloton (Uber’s own open-source container orchestration platform, built on top of Apache Mesos) and Kubernetes.
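The idea of placing a service according to constraints and available capacity can be illustrated with a minimal Python sketch. This is not Up's actual API or algorithm: the `Cluster` and `Service` types, the `place` function, the capability labels, and the "most free capacity wins" rule are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical view of a compute cluster (Peloton or Kubernetes)."""
    name: str
    zone: str
    capabilities: set[str]
    free_cores: int


@dataclass
class Service:
    """Hypothetical service with placement constraints."""
    name: str
    required_capabilities: set[str] = field(default_factory=set)
    cores: int = 1


def place(service: Service, clusters: list[Cluster]) -> Cluster | None:
    """Pick the eligible cluster with the most free cores, or None if none fits."""
    eligible = [
        c for c in clusters
        if service.required_capabilities <= c.capabilities  # constraints met
        and c.free_cores >= service.cores                   # capacity available
    ]
    if not eligible:
        return None
    chosen = max(eligible, key=lambda c: c.free_cores)
    chosen.free_cores -= service.cores  # reserve the capacity we just used
    return chosen


clusters = [
    Cluster("peloton-a", "zone-1", {"ssd"}, free_cores=100),
    Cluster("k8s-b", "zone-2", {"ssd", "gpu"}, free_cores=40),
]
target = place(Service("ml-svc", {"gpu"}, cores=8), clusters)
```

In this sketch the service owner states only what the service needs, and the choice of cluster and zone is left to the placement function, which is the separation of concerns the layered design aims for.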

In preparation for the move to the cloud, the company spent two years making all stateless microservices portable so that their placement in zones and regions could be managed centrally, without any involvement from service engineers. The team used existing tooling to move services between zones to verify that they were portable. Initially, services could be moved back to their original zone while portability issues were resolved; once the issues were fixed, services were moved periodically to validate portability and prevent regressions.
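The move-and-roll-back check described above can be sketched roughly as follows. The function name and the `move` and `is_healthy` callbacks are hypothetical placeholders for whatever deployment tooling and health checks are actually in use; the article does not describe Uber's implementation at this level.

```python
from typing import Callable


def validate_portability(
    service: str,
    home_zone: str,
    target_zone: str,
    move: Callable[[str, str], None],
    is_healthy: Callable[[str], bool],
) -> bool:
    """Move a service to another zone; roll back if it does not come up healthy."""
    move(service, target_zone)
    if is_healthy(service):
        return True  # the service is portable and can stay in the new zone
    move(service, home_zone)  # roll back so the owners can fix the issue
    return False
```

Running a check like this on a schedule, rather than once, is what would catch regressions introduced after a service was first declared portable.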

Once portable, microservices were migrated to Up gradually and mostly automatically. The migration produced substantial cost savings through autoscaling and efficiency efforts and considerably reduced the maintenance burden on service teams. With most of Uber’s microservices now managed by Up, the company is free to begin its cloud migration without much impact on the service teams. The platform team also wants to focus on automated continuous delivery and deployment safety.
