LinkedIn Open Sources Greykite, a Python-based Forecasting Library

LinkedIn has open sourced Greykite, a Python library that promises accurate, interpretable forecasts, with visualizations of the trend, seasonality, and other effects. The library is built to be flexible, intuitive and fast. In a benchmark run by the LinkedIn team, it ran 4 times faster than Facebook's Prophet and produced more accurate results for 1-day and 7-day forecasts.

Written in Python, it provides mechanisms for both short-term and long-term forecasting. Because it is fast, accurate and intuitive, Silverkite, the library's main algorithm, is suitable for interactive and automated forecasting at scale. Time series forecasts provide future expectations for metrics and other quantities that are measurable over time.

Such forecasts allow businesses to plan ahead and optimize their decisions.

For instance, LinkedIn uses it for resource planning, performance management, optimization and ecosystem insight generation. More concretely, here are some scenarios in which it is used at LinkedIn:

  1. To provision sufficient infrastructure to handle peak traffic.
  2. To set business metric targets and track progress for operational success.
  3. To optimize budget decisions by forecasting growth of various markets.
  4. To understand which countries are recovering faster or slower after a shock like the COVID-19 pandemic.

With the help of forecasts, LinkedIn's site reliability engineering (SRE) team ensures site availability in a cost-effective manner: they forecast peak minute-level QPS (queries per second) and service QPS for the next year in order to provision sufficient capacity without excessive buffers and costs. More accurate insights into future traffic, corroborated by careful site capacity measurements, enable confident decision-making. Because every small saving reduces total cost, precise forecasts have a big business impact.
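The link between forecast precision and provisioning cost can be illustrated with a little arithmetic. The sketch below is not LinkedIn's capacity model: the function name, the per-host throughput, the headroom percentage and the QPS figures are all hypothetical, chosen only to show how a tighter upper bound on forecast peak traffic reduces the number of hosts needed.

```python
def hosts_needed(peak_qps_upper, per_host_qps, headroom_pct=10):
    """Hosts required to serve a forecast peak plus a safety margin.

    peak_qps_upper: upper bound of the forecast peak QPS (hypothetical input)
    per_host_qps:   measured throughput of one host (hypothetical input)
    headroom_pct:   extra buffer on top of the forecast, in whole percent
    """
    if per_host_qps <= 0:
        raise ValueError("per_host_qps must be positive")
    # Integer arithmetic keeps the result exact: scale both sides by 100.
    target_scaled = peak_qps_upper * (100 + headroom_pct)
    denominator = per_host_qps * 100
    return -(-target_scaled // denominator)  # ceiling division


# Hypothetical comparison: the same service planned from two forecasts
# whose upper bounds differ only because one forecast is less certain.
tight = hosts_needed(peak_qps_upper=120_000, per_host_qps=800)
loose = hosts_needed(peak_qps_upper=150_000, per_host_qps=800)
print(tight, loose, loose - tight)
```

With these made-up inputs, the less certain forecast requires 42 more hosts for the same buffer policy, which is the kind of excess a more precise forecast avoids.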

Forecasting is also applied in LinkedIn's Marketing Solutions, where short-term forecasts of budgets, clicks, revenue and other metrics feed into a health dashboard that helps point out potential issues. The forecasts flag deviations and provide context about which metric dimension or related metric may help explain an anomaly. Long-term forecasts support setting metric targets and running routine checks to ensure that the team is on track to meet them.

The output is interpretable, allowing visualizations of the trend, seasonality, and other effects, along with their statistical significance. The Silverkite algorithm works well on time series with (potentially time-varying) trends and seasonality, repeated events/holidays, and/or short-range effects. At LinkedIn, it was successfully applied to a wide variety of metrics in different time frequencies (hourly, daily, weekly, etc.), as well as various forecast horizons, e.g., 1 day ahead (short-term) or 1 year ahead (long-term).

Some key benefits:

  • Flexible: provides time series regressors for trend, seasonality, holidays, changepoints, and autoregression.
  • Intuitive: provides exploratory plots, templates for tuning, and explainable forecasts with clear assumptions.
  • Fast: allows for quick prototyping and deployment at scale.

A benchmark conducted by Greykite's development team found that Silverkite's out-of-the-box configuration is more accurate than Auto-Arima and Prophet for 1-day and 7-day forecast horizons. In terms of average runtime, both Greykite and Auto-Arima ran 4 times faster than Prophet, according to the results table published by LinkedIn.
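Comparing accuracy at different forecast horizons is usually done with a rolling-origin backtest: the model is refit at successive cutoff dates and its error is recorded separately for each number of steps ahead. The sketch below shows that mechanism only. It does not reproduce LinkedIn's benchmark; the seasonal-naive forecaster is a stand-in for a real model, and the error metric (MAPE), the function names and the synthetic series are assumptions.

```python
def seasonal_naive(history, horizon, period=7):
    """Forecast `horizon` steps ahead by repeating the last seasonal cycle."""
    if len(history) < period:
        raise ValueError("need at least one full period of history")
    last_cycle = history[-period:]
    return [last_cycle[(h - 1) % period] for h in range(1, horizon + 1)]


def mape_by_horizon(series, horizons, min_train, period=7):
    """Mean absolute percentage error for each horizon over rolling origins."""
    max_h = max(horizons)
    errors = {h: [] for h in horizons}
    # Each origin is a cutoff: train on series[:origin], forecast what follows.
    for origin in range(min_train, len(series) - max_h + 1):
        forecast = seasonal_naive(series[:origin], max_h, period)
        for h in horizons:
            actual = series[origin + h - 1]
            errors[h].append(abs(actual - forecast[h - 1]) / abs(actual))
    return {h: 100.0 * sum(e) / len(e) for h, e in errors.items()}


# Synthetic daily series: a weekly pattern plus a slow upward trend.
weekly_pattern = [50, 52, 55, 54, 53, 40, 38]
series = [weekly_pattern[i % 7] + i for i in range(28)]
scores = mape_by_horizon(series, horizons=(1, 7), min_train=14)
print(scores)
```

Swapping `seasonal_naive` for another forecaster while keeping the same origins and horizons is what makes a comparison between algorithms fair.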

Besides Silverkite, Greykite also supports Facebook Prophet, and there are plans to support other open source algorithms in the future.
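Switching between the two algorithms is a matter of changing the model template in the forecast configuration. The sketch below follows the pattern shown in the library's published quickstart; treat the module paths, parameter names and template strings as assumptions to verify against the installed version. The wrapper function, the column names `ts` and `y`, and the `SUPPORTED_TEMPLATES` tuple are hypothetical, and the Prophet template is assumed to need the Prophet package installed separately.

```python
# The two algorithms named in the article; hypothetical helper constant.
SUPPORTED_TEMPLATES = ("SILVERKITE", "PROPHET")


def forecast_with_template(df, template_name="SILVERKITE", horizon=7):
    """Fit and forecast with Greykite (requires `pip install greykite`).

    df is expected to be a pandas DataFrame with a time column "ts"
    and a value column "y" (assumed names for this example).
    """
    if template_name not in SUPPORTED_TEMPLATES:
        raise ValueError(f"unknown template: {template_name!r}")

    # Imported lazily so the module loads even without Greykite installed.
    from greykite.framework.templates.autogen.forecast_config import (
        ForecastConfig,
        MetadataParam,
    )
    from greykite.framework.templates.forecaster import Forecaster

    config = ForecastConfig(
        model_template=template_name,  # selects Silverkite or Prophet
        forecast_horizon=horizon,      # number of periods to forecast
        coverage=0.95,                 # width of the prediction intervals
        metadata_param=MetadataParam(time_col="ts", value_col="y", freq="D"),
    )
    return Forecaster().run_forecast_config(df=df, config=config)
```

Calling `forecast_with_template(df, "PROPHET")` instead of the default would run the same pipeline with the other algorithm, which makes side-by-side comparisons straightforward.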

LinkedIn's open sourcing of Greykite provides a tool for anybody who wants to be better prepared for the future. It continues the series of tools the company has released to date: Dagli, an ML library for Java; Lift, a library for measuring the fairness of AI models; GDMix, a framework for training AI personalization models; Ambry, an object store for media files; and others. Greykite is available on GitHub and PyPI.

Greykite promises to provide accurate forecasts over both short-term and long-term horizons.