LinkedIn Develops MySQL Query Analyzer Tool

| by Andrew Morgan Follow 3 Followers on Sep 21, 2017. Estimated reading time: 2 minutes |

In order to optimise the performance of their MySQL instances, LinkedIn has created a Query Analyzer tool in order to analyze and tune expensive queries. It runs on the network layer, stores everything within a centralized server, and provides a UI for users to analyze metrics.

LinkedIn runs a multitenant MySQL architecture. This is mainly due to having over 500 applications which are dependant on it, meaning that single tenancy would cause extremely high resource costs. The tradeoff to this approach means that queries from one application can negatively detriment the performance for another, which is why it is important for them to be able to understand and optimise them as much as possible.

The query analyzer is split into three components:

  1. Agent: Runs on the MySQL nodes and collects metrics about running queries and their performance.
  2. Centralized Server: Where all the data about queries is stored for reporting.
  3. UI: Sits on top of the server and provides a user-friendly means to analyze the data.

Karthik Apigatla, senior database engineer at LinkedIn, highlights its benefits as:

… allowing our database engineers to identify problematic queries at a single glance, to compare a week-over-week overlay of query activity, and to troubleshoot database slowdowns quickly and efficiently.

Originally, out of the box solutions such as MySQL Performance Schema and slow query log were considered, but they came with too much of a performance penalty when introduced. Unlike those tools, the Query Analyzer runs on the network layer, minimising any kind of performance hit to the instances. 

The agent captures raw query packets and reconstructs the query using the MySQL protocols, and then calculates response times based on a time diff between when packets first enter the port and when they are first produced by the database response.

The UI provides a tabular view of distinct queries for hosts, allowing the user to filter by periods of time. Individual queries can also be selected, providing graphs and other useful metrics such as load and average time.

The tool also provides a query load metric which is calculated as ‘execution time * number of executions’. By taking this and converting it into a ratio, it can be compared against other queries to see which ones took the largest percentage of load. For example, even if a query takes a few milliseconds, the metric would still correctly report if it is running too many times, thus taking up the majority of the load.

LinkedIn has not laid out a specific timeline, but intends to open source the tool in the near future.

Rate this Article

Adoption Stage

Hello stranger!

You need to Register an InfoQ account or or login to post comments. But there's so much more behind being registered.

Get the most out of the InfoQ experience.

Tell us what you think

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread
Community comments

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread


Login to InfoQ to interact with what matters most to you.

Recover your password...


Follow your favorite topics and editors

Quick overview of most important highlights in the industry and on the site.


More signal, less noise

Build your own feed by choosing topics you want to read about and editors you want to hear from.


Stay up-to-date

Set up your notifications and don't miss out on content that matters to you