How Uber Sped up SQL-based Data Analytics with Presto and Express Queries

Uber uses Presto, an open-source distributed SQL query engine, to provide analytics across several data sources, including Apache Hive, Apache Pinot, MySQL, and Apache Kafka. To improve performance, Uber engineers explored handling quick queries, a.k.a. express queries, separately from the rest and found they could improve both Presto cluster utilization and response times.

Express queries, i.e., those that execute in under two minutes, account for about half of Uber's analytics queries. Uber engineers found that handling them the same way as all other queries caused underutilization and high latency, because express queries were throttled along with the rest to avoid overloading the system.

To prevent express queries from being throttled, the first step was to predict whether an incoming query is an express query. Uber engineers based this prediction on historical data in which each query was assigned a fingerprint: the exact fingerprint is a unique hash calculated after removing comments and whitespace, while the abstract fingerprint additionally replaces any literal values.
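Such fingerprints can be sketched with simple normalization and hashing. The regexes and function names below are illustrative assumptions, not Uber's actual implementation, and a real system would use a SQL parser rather than regexes:

```python
import hashlib
import re


def exact_fingerprint(sql: str) -> str:
    """Hash a query after stripping comments and normalizing whitespace."""
    no_comments = re.sub(r"--[^\n]*|/\*.*?\*/", " ", sql, flags=re.DOTALL)
    normalized = " ".join(no_comments.split()).lower()
    return hashlib.sha256(normalized.encode()).hexdigest()


def abstract_fingerprint(sql: str) -> str:
    """Like the exact fingerprint, but also replaces literal values,
    so the same query shape with different constants hashes identically."""
    no_comments = re.sub(r"--[^\n]*|/\*.*?\*/", " ", sql, flags=re.DOTALL)
    no_strings = re.sub(r"'[^']*'", "?", no_comments)
    no_numbers = re.sub(r"\b\d+(\.\d+)?\b", "?", no_strings)
    normalized = " ".join(no_numbers.split()).lower()
    return hashlib.sha256(normalized.encode()).hexdigest()
```

Two queries differing only in a literal value share an abstract fingerprint but have different exact fingerprints, which is what makes the abstract variant useful for matching recurring query shapes against history.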

We tested the P90 and P95 query execution times using the exact fingerprint and abstract fingerprint of the query with lookback windows of 2 days, 5 days, and 7 days. [...] We used this candidate definition to predict if a query was express: if the X runtime of the query in the last Y days based on Z fingerprint was less than 2 minutes.

Using this definition, Uber engineers found that the P90 value of the abstract fingerprint with a 5-day lookback window provided the best accuracy and coverage for predicting which queries would complete in less than two minutes. They also kept enough data in the lookup table used to classify incoming queries to be able to switch to a different percentile and/or a larger or smaller lookback window.
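The prediction itself reduces to a percentile lookup over historical runtimes per fingerprint. This is a minimal sketch based on the description above; the class structure, names, and defaults are assumptions, though keeping the raw runs (rather than a precomputed percentile) mirrors the stated goal of being able to change the percentile or lookback window later:

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

EXPRESS_THRESHOLD_SECS = 120  # "express" means finishing in under two minutes


@dataclass
class RuntimeHistory:
    # fingerprint -> list of (finished_at, runtime_secs)
    runs: dict = field(default_factory=dict)

    def record(self, fingerprint: str, finished_at: datetime, runtime_secs: float):
        self.runs.setdefault(fingerprint, []).append((finished_at, runtime_secs))

    def percentile_runtime(self, fingerprint: str, pct: float,
                           lookback_days: int, now: datetime):
        """Return the pct-th percentile runtime within the lookback window."""
        cutoff = now - timedelta(days=lookback_days)
        recent = sorted(r for (t, r) in self.runs.get(fingerprint, []) if t >= cutoff)
        if not recent:
            return None  # no history: conservatively treat as non-express
        idx = min(len(recent) - 1, int(pct / 100 * len(recent)))
        return recent[idx]

    def is_express(self, fingerprint: str, now: datetime,
                   pct: float = 90, lookback_days: int = 5) -> bool:
        # P90 over a 5-day window was the best-performing combination at Uber.
        p = self.percentile_runtime(fingerprint, pct, lookback_days, now)
        return p is not None and p < EXPRESS_THRESHOLD_SECS
```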

While this idea seemed straightforward, its design and implementation went through a couple of iterations, moving from the idea of introducing express-query handling as late as possible in the system pipeline to handling them as early as possible.

In their first design, Uber engineers decided to queue express and non-express queries in the same queues based on their user priority. That means all queries, either express or non-express, went to the queue corresponding to their user priority, e.g., batch or interactive. Each queue had its designated Presto cluster, with a subset dedicated to handling express queries after all user/source and cluster limits checks were carried through.
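In this first design the express decision only mattered at dispatch time, after queueing. A simplified sketch, with queue names and structure assumed for illustration:

```python
from collections import deque


class FirstDesignScheduler:
    """First design: express and non-express queries share per-priority
    queues; the express flag is only consulted when a cluster pulls work."""

    def __init__(self):
        self.queues = {"interactive": deque(), "batch": deque()}

    def submit(self, query: str, priority: str, is_express: bool):
        # Express status does NOT affect queue choice here: all queries
        # land in the queue matching their user priority.
        self.queues[priority].append((query, is_express))

    def dispatch(self, priority: str):
        # FIFO: a slow query at the head delays any express query behind
        # it, even if express clusters are sitting idle.
        if self.queues[priority]:
            return self.queues[priority].popleft()
        return None
```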

This design led to underutilization of the express Presto clusters, mainly because slow and express queries were mixed in the same queues, so slow queries still dictated the pace at which express queries reached the express clusters.

As a second attempt at designing the system, Uber engineers used a dedicated queue for express queries, to which queries were routed as soon as they entered the system, right after the validation step. Express queries thus had a direct, faster path to the designated express clusters, bringing an order-of-magnitude improvement in end-to-end SLA for over 75% of scheduled queries.
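The difference between the two designs is essentially where in the pipeline the express check happens. A simplified router for the second design, with queue names assumed and the express prediction passed in as a predicate, might look like:

```python
def route(sql: str, priority: str, is_express) -> str:
    """Second design: immediately after validation, predicted-express
    queries take a direct path to a dedicated express queue; everything
    else falls through to its per-priority queue (e.g. batch, interactive)."""
    if is_express(sql):          # prediction based on historical runtimes
        return "express-queue"   # direct, faster path to express clusters
    return f"{priority}-queue"   # slow queries keep their priority-based path
```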

While this approach improved cluster utilization and running time for express queries, Uber engineers are considering how to improve it even more. In fact, running this design in production showed that express query handling is so fast that it is not necessary to route express queries across different queues based on their priority.

This led Uber engineers to start working on a further evolution of their design, where express queries have a completely dedicated subsystem. This promises to increase cluster utilization and simplify routing logic.
