Uber Freight Near-Real-Time Analytics Architecture

Uber Freight is the Uber platform dedicated to connecting shippers with carriers. Because providing reliable service to shippers is crucial, Uber Freight developed the Carrier Scorecard, which tracks several metrics including on-time pickup/delivery, tracking automation, and late cancellations. This information must be shown in near real-time in the Carrier app, so the architecture behind it has to be fast enough to meet that requirement.

The requirements for this architecture are data freshness, latency, reliability, and accuracy. Performance scores must be updated with low latency once a load is completed or bounced, and every carrier must be able to view their score in the app with minimal delay. The data must be processed and served with a high level of reliability, with the entire system able to recover gracefully from failures, and the performance metrics must be calculated accurately.

Some potential solutions were considered before designing and implementing the final architecture, particularly for the metrics aggregation stage: on-the-fly aggregation with MySQL and pre-aggregation of data with MySQL. Both of these solutions have drawbacks, the main one being the batch upserts of records needed to keep the historical data correctly up to date. Another solution considered was to use two OLAP database tables, one storing the raw data while events trigger asynchronous functions that update the metrics in the other table; however, this approach does not scale, especially when write traffic is high.
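
To make the trade-off concrete, here is a minimal sketch (not Uber's code) of the rejected on-the-fly aggregation option: every scorecard request recomputes a metric over the carrier's full load history in MySQL. The loads table, its columns, and the JDBC URL are assumptions for illustration.

```java
// Sketch of the rejected "on-the-fly aggregation" option: every scorecard
// request scans the carrier's full load history in MySQL. Table and column
// names (loads, carrier_id, picked_up_on_time) are hypothetical.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class OnTheFlyScorecard {

    public static double onTimePickupRate(String jdbcUrl, String carrierId) throws Exception {
        String sql =
            "SELECT SUM(picked_up_on_time) / COUNT(*) AS on_time_rate " +
            "FROM loads WHERE carrier_id = ? AND status = 'COMPLETED'";
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setString(1, carrierId);
            try (ResultSet rs = stmt.executeQuery()) {
                // Full-history scan on every request is what makes this slow.
                return rs.next() ? rs.getDouble("on_time_rate") : 0.0;
            }
        }
    }
}
```

The pre-aggregated variant avoids this full scan, but it then needs batch upserts whenever historical loads change, which is the drawback noted above.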

Final architecture schema

The final architecture uses Kafka, Flink, and Pinot. Kafka events generated by the backend services are aggregated by Flink. The aggregated data is ingested into Pinot, which uses real-time ingestion from Kafka to cover the last three days of data; historical data is ingested from HDFS.
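
The article itself contains no code, but a minimal sketch of that Flink stage, under assumed topic names, a one-minute window, and a placeholder extractCarrierId helper, could look like this:

```java
// Minimal sketch of the Flink aggregation stage: consume load events from
// Kafka, key by carrier, and emit windowed counts that a downstream topic and
// Pinot table could ingest. Topic names and helpers are hypothetical.
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class CarrierScorecardJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Backend services publish load lifecycle events (completed, bounced, ...)
        // to a Kafka topic; "freight-load-events" is a placeholder name.
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("kafka:9092")
                .setTopics("freight-load-events")
                .setGroupId("carrier-scorecard-aggregator")
                .setStartingOffsets(OffsetsInitializer.latest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "load-events")
                // (carrierId, 1) per event; a real job would parse the payload
                // and branch per metric (on-time pickup, late cancellation, ...).
                .map(value -> Tuple2.of(extractCarrierId(value), 1))
                .returns(Types.TUPLE(Types.STRING, Types.INT))
                .keyBy(t -> t.f0)
                .window(TumblingProcessingTimeWindows.of(Time.minutes(1)))
                .sum(1)
                // In the described architecture this would be written to a Kafka
                // topic that Pinot ingests in real time; print() keeps it short.
                .print();

        env.execute("carrier-scorecard-aggregation");
    }

    // Hypothetical helper: pull the carrier id out of the raw event payload.
    private static String extractCarrierId(String rawEvent) {
        return rawEvent; // placeholder
    }
}
```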

Apache Pinot provides index optimization techniques such as JSON indexes, sorted columns, and star-tree indexes to accelerate query performance; fast queries allow a better interactive experience for carriers. To achieve the 250 ms query latency on a table, two kinds of indexes are created on the Pinot tables: an inverted index and a sorted index. The inverted index speeds up queries with a WHERE clause by a factor of 10, while sorting by the carrier's unique ID reduces the table size by half, which reduces query latency further.
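
For context, the access pattern these indexes target is a per-carrier filter. The hypothetical query below, issued through the Pinot Java client with assumed table and column names, shows that shape; in Uber's setup, queries actually reach Pinot through the Neutrino gateway described next.

```java
// Hypothetical example of the query shape the indexes are built for: filter an
// aggregated metrics table by one carrier's id. Names and hosts are assumed.
import org.apache.pinot.client.Connection;
import org.apache.pinot.client.ConnectionFactory;
import org.apache.pinot.client.ResultSet;
import org.apache.pinot.client.ResultSetGroup;

public class CarrierMetricsQuery {

    public static void main(String[] args) {
        // Connect to a Pinot broker; host and port are placeholders.
        Connection pinot = ConnectionFactory.fromHostList("pinot-broker:8099");

        // The WHERE clause on carrier_id is served by the inverted/sorted index,
        // so only that carrier's rows are scanned.
        ResultSetGroup response = pinot.execute(
                "SELECT SUM(on_time_pickups), SUM(completed_loads) " +
                "FROM carrier_metrics WHERE carrier_id = 'carrier-123'");

        ResultSet rows = response.getResultSet(0);
        System.out.println(rows.getString(0, 0) + " / " + rows.getString(0, 1));
    }
}
```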

Neutrino is the query gateway used to access the Pinot datasets; it is a different deployment of Presto in which the coordinator and worker run on each host and can execute queries independently. Neutrino accepts PrestoSQL queries and translates them into Pinot's query language. A Redis cache sits in front of Neutrino to store aggregated metrics for a maximum of 12 hours, which yields a cache hit rate of more than 90%.
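
The caching behavior described can be captured with a simple cache-aside pattern and a 12-hour TTL. The sketch below uses Jedis; the key layout and the fetchFromNeutrino placeholder are assumptions, not Uber's implementation.

```java
// Sketch of the cache-aside layer in front of Neutrino: look up the aggregated
// metrics in Redis first, fall back to Neutrino on a miss, and store the
// result with a 12-hour TTL. Key format and the Neutrino call are hypothetical.
import redis.clients.jedis.Jedis;

public class ScorecardCache {

    private static final int TTL_SECONDS = 12 * 60 * 60; // 12-hour maximum age

    private final Jedis redis;

    public ScorecardCache(Jedis redis) {
        this.redis = redis;
    }

    public String getCarrierScorecard(String carrierId) {
        String key = "scorecard:" + carrierId;

        // Served straight from Redis on a hit (>90% of requests per the article).
        String cached = redis.get(key);
        if (cached != null) {
            return cached;
        }

        // Miss: run the PrestoSQL query through Neutrino, then cache the result.
        String fresh = fetchFromNeutrino(carrierId);
        redis.setex(key, TTL_SECONDS, fresh);
        return fresh;
    }

    // Placeholder for the PrestoSQL query issued against the Neutrino gateway.
    private String fetchFromNeutrino(String carrierId) {
        return "{}";
    }
}
```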

Uber observed a statistically significant boost across all key metrics since it started providing this performance information to Freight drivers: -0.4% in late cancellations, +0.6% in on-time pickups, +1% in on-time drop-offs, and +1% in auto-tracking performance. These improvements resulted in an estimated cost saving of $1.5 million in 2021.
