Inside Netflix’s Graph Abstraction: Handling 650TB of Graph Data in Milliseconds Globally

Netflix engineers have built Graph Abstraction, a high throughput system designed to manage large scale graph data in real time. The platform powers several internal services, including social graphs for Netflix Gaming and service topology graphs used for operational monitoring and incident analysis. By separating edge connections from edge properties and replicating data globally, the system enables millisecond level queries across roughly 650 TB of graph data, allowing engineers to analyze complex relationships quickly and reliably.

Graph processing systems often face a trade off between expressive queries and predictable performance. Traditional graph databases tend to prioritize flexible traversal and complex query capabilities, but many operational workloads require extremely fast responses at high throughput. To meet these requirements, Netflix’s system restricts traversal depth and often requires a defined starting node, trading some query flexibility for consistent low latency at scale.

Graph Abstraction supports several internal use cases. These include a real time distributed graph that captures interactions across services in the Netflix ecosystem, a social graph used by Netflix Gaming to model user relationships, and a service topology graph that helps engineers analyze dependencies during incidents and root cause investigations. The platform also maintains historical graph state through a TimeSeries abstraction, enabling analytics, auditing, and temporal queries over graph evolution.

Rather than building a standalone graph database, Graph Abstraction is implemented as a layer on top of Netflix’s existing data infrastructure. The latest graph state is stored in a Key Value abstraction, while historical changes are recorded through the TimeSeries abstraction. To reduce latency, the system integrates with EVCache, Netflix’s distributed caching layer. Graph schemas are loaded into memory and strictly enforced, enabling validation, optimized traversal planning, and elimination of invalid query paths.

Graph Abstraction architecture built on Netflix’s data infrastructure (Source: Netflix Tech Blog)

The platform also uses caching strategies: write-aside caching prevents duplicate edge writes, while read-aside caching accelerates access to node and edge properties. These techniques reduce read and write amplification and maintain performance under heavy workloads. Graph Abstraction exposes a gRPC traversal API inspired by Gremlin, enabling services to chain traversal steps, apply filters, and limit results.

Netflix engineers describe that global availability was achieved through asynchronous replication across regions, ensuring eventual consistency while supporting high throughput. Both caching layers and durable storage replicate graph data across regions, balancing latency, availability, and consistency trade-offs.

Global replication across caching (Source : Netflix Tech Blog)

Netflix engineers emphasize that in production the system delivers single digit millisecond latency for single hop traversals and under 50 milliseconds for two hop queries at the 90th percentile, providing predictable performance at scale. The design carefully balances traversal planning and execution to enable efficient exploration of large graph datasets.

As Netflix expands into new verticals such as live content, gaming, and advertising, Graph Abstraction is expected to play an increasingly important role in modeling relationships between users, services, and content while maintaining high throughput, global availability, and low latency access across the platform.

About the Author

Leela Kumili

Show moreShow less

InfoQ Software Architects' Newsletter

Write for InfoQ

About the Author

Leela Kumili

Rate this Article

This content is in the Distributed Systems topic

Related Topics:

Related Editorial

Related Sponsors

Popular across InfoQ

The InfoQ Newsletter