Engineers from TikTok have announced a new tool -- Sparo -- to help deal with the problems associated with using monorepos, solving many of the performance issues that come with larger repos.
A monorepo is a single Git repository that houses multiple projects, from applications to microservices. Its benefits include better visibility, collaboration, and tooling standardization across teams. Moving to a monorepo is a widely used but often debated choice for engineering teams, especially as their codebase grows in scale and complexity. However, as monorepos balloon in size, developers can run into significant performance issues when running common Git commands like status, diff, and checkout. TikTok's front-end team recently faced this challenge as their TypeScript monorepo grew to over 1,000 projects and 200,000 source files.
In a post on TikTok's developer blog, Adrian Zhang, an engineer on TikTok's front-end infrastructure team, explained the issues that engineers were experiencing with monorepo performance:
"People with slow internet frequently reported git clone taking more than 40 minutes. It is a scalability problem: Git stores everything forever, which means a high-traffic repository will steadily increase in every metric - file writes, disk storage, download size. Git will eventually become slow for everyone - it's just a question of when!"
The TikTok team tried various techniques to mitigate the slowness, including partial clone, shallow clone, and Git Large File Storage (LFS). However, they ultimately created a new open-source tool named Sparo to address the performance issues.
Sparo leverages two key Git features - "sparse checkout" and "partial clone" - to dramatically improve the speed of common Git commands. "Sparse checkout" allows developers to check out only the subset of files they need rather than the entire repository. "Partial clone" optimizes this further by downloading file contents on demand rather than fetching every version of every file up front.
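As a rough illustration of how these two Git features combine, the following Python sketch builds the plain Git commands for a blobless partial clone followed by a cone-mode sparse checkout. It is not Sparo's implementation: the helper name, repository URL, and folder names are hypothetical, and the commands are only constructed, not executed.

```python
from typing import List


def sparse_partial_clone_commands(
    repo_url: str, target_dir: str, folders: List[str]
) -> List[List[str]]:
    """Build (but do not run) git commands for a partial, sparse clone."""
    if not folders:
        raise ValueError("at least one folder is required")
    return [
        # Partial clone: --filter=blob:none defers file contents until needed.
        # --sparse starts with only the top-level files checked out.
        ["git", "clone", "--filter=blob:none", "--sparse", repo_url, target_dir],
        # Sparse checkout: restrict the working tree to the listed folders.
        ["git", "-C", target_dir, "sparse-checkout", "set", "--cone", *folders],
    ]


if __name__ == "__main__":
    # Hypothetical URL and folder names, for illustration only.
    for cmd in sparse_partial_clone_commands(
        "https://example.com/monorepo.git", "monorepo", ["apps/web", "libs/ui"]
    ):
        print(" ".join(cmd))
```

Each returned list could be passed to `subprocess.run` to execute it; keeping construction separate from execution makes the commands easy to inspect first.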
"Sparo follows the spirit of [Microsoft's] Scalar and [Twitter's] Focus, adding a couple other details," says Zhang in the blog post. "Checkout profiles allow teams to define the set of projects and dependencies their developers typically work on. And we designed the Sparo CLI to be a drop-in replacement for the git CLI, intercepting every command to ensure Git is invoked optimally."
The team's rationale for developing Sparo was clear - Git's built-in features, while powerful, were proving cumbersome for their large-scale monorepo. "When we advised people to configure Git directly, they found it awkward to use," explained Zhang. "Sparse checkout requires you to determine which folders you need, expressed using cone mode globs that are error-prone. It's feasible to educate a small team about Git best practices, but when you reach 6-digit merge request IDs, things really need to be as simple as possible."
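To make the idea of a checkout profile concrete, here is a minimal Python sketch that expands a list of selected projects into the folders a sparse checkout would need, including transitive dependencies. The project graph, its layout, and the function name are invented for illustration; Sparo's actual profile format and dependency resolution are not described in this article.

```python
from typing import Dict, List, Set

# Hypothetical project graph: project name -> folder and direct dependencies.
PROJECTS: Dict[str, dict] = {
    "web-app": {"folder": "apps/web", "deps": ["ui-kit", "utils"]},
    "ui-kit": {"folder": "libs/ui-kit", "deps": ["utils"]},
    "utils": {"folder": "libs/utils", "deps": []},
    "admin": {"folder": "apps/admin", "deps": ["ui-kit"]},
}


def folders_for_profile(selected: List[str], projects: Dict[str, dict]) -> List[str]:
    """Return sorted folders for the selected projects and all their dependencies."""
    seen: Set[str] = set()
    stack = list(selected)
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # already visited via another dependency path
        if name not in projects:
            raise KeyError(f"unknown project: {name}")
        seen.add(name)
        stack.extend(projects[name]["deps"])
    return sorted(projects[n]["folder"] for n in seen)


if __name__ == "__main__":
    print(folders_for_profile(["web-app"], PROJECTS))
```

The point of the sketch is that a developer names the projects they work on, and the folder list is derived rather than written by hand.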
With Sparo, the TikTok team achieved significant performance improvements. For example, a git clone operation that previously took 23 minutes was reduced to just over 2 minutes using Sparo. Similarly, a git checkout operation went from 1 minute and 26 seconds to 30 seconds.
The GitHub engineering team has also been working to improve monorepo performance, introducing a built-in file system monitor (FSMonitor) feature in Git version 2.37.0. FSMonitor reduces the time required for commands like git status by only searching for changes in recently modified files, rather than scanning the entire working tree.
Jeff Hostetler, a software engineer at GitHub, explained that FSMonitor works by registering with the operating system to receive change notification events, so it knows exactly which files have been modified without having to do a full search.
"When FSMonitor is enabled, git status takes less than a second on worktrees with millions of files" - Jeff Hostetler
FSMonitor can be further optimized by enabling the core.untrackedcache feature, which remembers the results of previous untracked file searches. Combined with FSMonitor, this can result in a 10x speedup for the untracked file portion of git status.
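For readers who want to try both settings, the Python sketch below builds the two `git config` commands that turn on the built-in FSMonitor and the untracked cache for a single worktree. The helper name and the example path are hypothetical, the commands are only constructed rather than run, and the built-in monitor requires a Git version and platform that support it. Git config variable names are case-insensitive, so `core.untrackedCache` and `core.untrackedcache` refer to the same setting.

```python
from typing import List


def fsmonitor_config_commands(worktree: str) -> List[List[str]]:
    """Build (but do not run) git commands enabling FSMonitor and the untracked cache."""
    return [
        # Enable the built-in file system monitor for this repository.
        ["git", "-C", worktree, "config", "core.fsmonitor", "true"],
        # Remember results of earlier untracked-file searches.
        ["git", "-C", worktree, "config", "core.untrackedCache", "true"],
    ]


if __name__ == "__main__":
    # Hypothetical worktree path, for illustration only.
    for cmd in fsmonitor_config_commands("path/to/monorepo"):
        print(" ".join(cmd))
```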
In a blog post, code review software vendor Graphite also emphasizes the importance of best practices for maintaining scalable Git monorepos. These include:
- Keeping commit history clean and linear using rebase
- Managing tags and references to prevent performance degradation
- Organizing the directory structure for easy navigation
- Maintaining a clean branch management strategy, such as trunk-based development
"As a Git monorepo grows, commands like git log or git blame can slow down due to the large number of commits. To mitigate this, you can use tools that bypass the performance issues, and carefully manage your refs to ensure operations involving them are not hindered by the sheer volume." - Greg Foster, Graphite engineer
For the TikTok team, open-sourcing Sparo was an essential step in the development process. "Although it seems natural to open source this project, by January, the growing concerns about Git slowness pressured our team to deliver a fix as quickly as possible," said Zhang. "We decided to start closed but pursue open source concurrently as long as those efforts didn't impact our timeline."
According to Zhang, working on GitHub brought some important benefits, with the documentation turning out more professional and written for a broader audience. "Engineers seem to write better code when they know it's public! We shared our designs and demos with the Rush Stack community, receiving valuable input from senior engineers at other companies," added Zhang.
The TikTok team also had to consider security implications when developing Sparo. In the blog post, Zhang explained that TikTok's approval workflow includes a review by a security expert, which added an interesting new perspective to this project. Looking ahead, the TikTok team plans to focus on two key features for Sparo's future development: a telemetry plugin system to power monitoring dashboards, and support for other front-end workspaces beyond their current RushJS implementation.