Streaming Auto-scaling in Google Cloud Dataflow
by Manuel Fahndrich on May 01, 2016

Manuel Fahndrich describes how they tackled one particular resource allocation aspect of Google Cloud Dataflow pipelines, namely, horizontal scaling of worker pools as a function of pipeline input rate. Managing the redistribution of key ranges across new pool sizes and the associated persistent data storage was particularly challenging.

Manuel Fahndrich earned his Ph.D. in C.S. from UC Berkeley in 1999. He spent the next 15 years as a Research Scientist at Microsoft, working on static and dynamic verification tools for object-oriented programs and system software. Since joining Google in 2014, he has worked on data-parallel infrastructure, in particular auto-scaling for batch and streaming pipelines.

Software is Changing the World. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.
