Uber Open Sources Its Large Scale Metrics Platform M3

Uber's engineering team has released its metrics platform M3, which it has used internally for some years, as open source. The platform was built to replace Uber's Graphite-based system, and provides cluster management, aggregation, collection, storage management, a distributed time series database (TSDB) and a query engine with its own query language, M3QL.

Uber's previous metrics collection and monitoring system was based on Graphite, backed by a sharded Carbon cluster, with Nagios for alerting and Grafana for dashboarding. Issues with this included poor resiliency and clustering, high operational cost to expand the Carbon cluster, and a lack of replication which made each node a single point of failure. M3 - the new metrics system - was born out of these shortcomings. In addition to scalability and responsive global querying across datacenters, the goals for the new system were the ability to tag metrics and to maintain backwards compatibility with services that emitted metrics in the StatsD and Graphite formats. Rob Skillington, staff software engineer at Uber, describes the architecture of M3 in a recent article. M3 currently stores 6.6 billion time series, aggregates 500 million metrics per second, and stores 20 million metrics per second.

The initial version of M3 had open source components like statsite for aggregation, Cassandra for storage, and Elasticsearch for indexing. Each component was gradually replaced by an in-house implementation because of increasing operational overhead and a demand for new features. Due to the widespread use of Prometheus in multiple teams at Uber, M3 was built to integrate with Prometheus as a remote storage backend.

The Prometheus integration is via a sidecar component that writes to local regional M3DB instances and fans out queries "to inter-regional coordinators which coordinate reads from their local regional M3DB (the storage engine) instances". This model is similar to the way Thanos works; Thanos is an extension to Prometheus that provides cross-cluster federation, unlimited storage and global querying across clusters. However, the Uber team did not choose Thanos for various reasons, the primary one being high latencies for metrics that are not stored locally. Thanos pulls and caches metrics data from AWS S3, and the associated latencies, as well as the additional disk usage for the cache, were infeasible given Uber's latency requirements and the large amount of data.

M3's query engine provides a single global view of all metrics without cross region replication. Metrics are written to local regional M3DB instances and replication is local to a region. Queries go to both the regional local instances as well as to coordinators in remote regions where metrics are stored. The results are aggregated locally, and future work is planned wherein any query aggregation would happen at the remote coordinators.
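The fan-out read path described above can be illustrated with a minimal Go sketch. It is not M3's code or API: the `Region` interface, the `memRegion` type, and the `FanOut` function are hypothetical names invented for this example, and the in-memory regions stand in for regional coordinators. It only shows the general shape of querying every region concurrently and merging the results on the caller's side.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Point is one sample of a time series.
type Point struct {
	Timestamp int64
	Value     float64
}

// Region is a hypothetical stand-in for a regional coordinator.
// It is an assumption made for this sketch, not an M3 interface.
type Region interface {
	Query(series string) []Point
}

// memRegion is an in-memory Region used purely for illustration.
type memRegion map[string][]Point

func (m memRegion) Query(series string) []Point { return m[series] }

// FanOut queries every region concurrently, then merges the
// results locally and orders them by timestamp.
func FanOut(regions []Region, series string) []Point {
	results := make([][]Point, len(regions))
	var wg sync.WaitGroup
	for i, r := range regions {
		wg.Add(1)
		go func(i int, r Region) {
			defer wg.Done()
			// Each goroutine writes only to its own slot, so no lock is needed.
			results[i] = r.Query(series)
		}(i, r)
	}
	wg.Wait()

	var merged []Point
	for _, pts := range results {
		merged = append(merged, pts...)
	}
	sort.SliceStable(merged, func(a, b int) bool {
		return merged[a].Timestamp < merged[b].Timestamp
	})
	return merged
}

func main() {
	local := memRegion{"cpu": {{1, 0.5}, {3, 0.7}}}
	remote := memRegion{"cpu": {{2, 0.6}}}
	fmt.Println(FanOut([]Region{local, remote}, "cpu"))
}
```

In this sketch all merging happens at the caller, which mirrors the current behaviour the article describes; pushing aggregation to the remote side, as planned, would reduce the amount of data returned by each region.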

M3 lets users specify the retention period and granularity per metric for storage, like Carbon does. M3's storage engine replicates each metric to three replicas in a region. To reduce disk usage, data is compressed using a custom compression algorithm. Most time series databases have a compaction feature where existing smaller data blocks are rewritten into larger ones, and restructured to improve query performance. M3DB avoids compactions where possible, to maximize the utilization of host resources for more concurrent writes and provide steady write/read latency.

Skillington says in the article that "M3DB itself only compacts time-based data together when absolutely necessary, such as backfilling data or when it makes sense to combine time window index files together." Metrics are downsampled using a streaming model where the downsampling happens as the metrics come in.
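As a rough illustration of streaming downsampling, the Go sketch below aggregates samples into fixed windows as they arrive and emits a window's value as soon as a sample from a later window shows up. The `Downsampler` type, the use of a mean as the aggregate, and the assumption that samples arrive in timestamp order with non-negative timestamps are all choices made for this example; the article does not describe M3's actual aggregation logic at this level.

```go
package main

import "fmt"

// Sample is one raw or downsampled data point.
type Sample struct {
	Timestamp int64 // seconds
	Value     float64
}

// Downsampler accumulates samples for one window at a time.
// Hypothetical type for illustration; assumes in-order input.
type Downsampler struct {
	window int64 // window size in seconds
	start  int64 // start of the window being accumulated
	sum    float64
	count  int
}

func NewDownsampler(windowSeconds int64) *Downsampler {
	return &Downsampler{window: windowSeconds}
}

// Add consumes one sample. If the sample falls in a later window than
// the one being accumulated, the finished window's mean is returned
// together with true; otherwise the boolean is false.
func (d *Downsampler) Add(s Sample) (Sample, bool) {
	start := s.Timestamp - s.Timestamp%d.window
	var out Sample
	emitted := false
	if d.count > 0 && start != d.start {
		out = Sample{Timestamp: d.start, Value: d.sum / float64(d.count)}
		emitted = true
		d.sum, d.count = 0, 0
	}
	d.start = start
	d.sum += s.Value
	d.count++
	return out, emitted
}

func main() {
	d := NewDownsampler(10)
	input := []Sample{{0, 1}, {5, 3}, {10, 10}, {12, 20}, {20, 0}}
	for _, s := range input {
		if out, ok := d.Add(s); ok {
			fmt.Printf("window starting at %d: mean %.1f\n", out.Timestamp, out.Value)
		}
	}
}
```

Because each window is reduced to a running sum and count, nothing needs to be read back from storage and rewritten later, which is the general appeal of downsampling at ingest time.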

M3's own query language - M3QL - is used internally at Uber due to features that are not available in PromQL. There are limits to the cardinality of metrics that can be handled, which are more in terms of querying than of storage. M3's storage also optimizes access times by utilizing Bloom filters and indexes in memory-mapped files. A Bloom filter is used to determine if something might exist in a set, and in M3 it's used to determine if a series that is queried for needs to be retrieved from disk. The team is working on adding support for running M3 on Kubernetes.
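To make the Bloom filter idea mentioned above concrete, here is a small generic sketch in Go. It is a textbook-style filter, not M3DB's implementation: the `Bloom` type, the bit-array size, the number of hash functions, and the use of FNV-1a with double hashing are all assumptions chosen for brevity. The key property it demonstrates is that a negative answer is definite, so a disk read can be skipped, while a positive answer only means the series might be present.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Bloom is a minimal Bloom filter over string keys.
// Illustrative only; sizes and hashing are assumptions of this sketch.
type Bloom struct {
	bits []bool
	k    int // number of hash positions per key
}

func NewBloom(m, k int) *Bloom {
	return &Bloom{bits: make([]bool, m), k: k}
}

// positions derives k bit positions from one 64-bit FNV-1a hash
// using double hashing: h1 + i*h2 (mod m).
func (b *Bloom) positions(key string) []int {
	h := fnv.New64a()
	h.Write([]byte(key))
	sum := h.Sum64()
	h1, h2 := uint64(uint32(sum)), uint64(uint32(sum>>32))
	pos := make([]int, b.k)
	for i := range pos {
		pos[i] = int((h1 + uint64(i)*h2) % uint64(len(b.bits)))
	}
	return pos
}

// Add records the key in the filter.
func (b *Bloom) Add(key string) {
	for _, p := range b.positions(key) {
		b.bits[p] = true
	}
}

// MightContain returns false only if the key was definitely never added.
// A true result may be a false positive.
func (b *Bloom) MightContain(key string) bool {
	for _, p := range b.positions(key) {
		if !b.bits[p] {
			return false
		}
	}
	return true
}

func main() {
	seriesOnDisk := NewBloom(1024, 3)
	seriesOnDisk.Add("cpu.user{host=a}")

	for _, id := range []string{"cpu.user{host=a}", "cpu.user{host=b}"} {
		if seriesOnDisk.MightContain(id) {
			fmt.Println(id, "-> might be on disk, read required")
		} else {
			fmt.Println(id, "-> definitely absent, skip disk read")
		}
	}
}
```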

M3 is written in Go and available on GitHub.
