The Mechanics of Testing Large Data Pipelines

by Mathieu Bastian on Apr 24, 2016

Mathieu Bastian explores the mechanics of testing large, complex data workflows and tries to identify the most common challenges developers face. He looks at good practices to develop unit, integration, data and performance tests for data workflows. In terms of tools, he looks at what exists today for Hadoop, Pig and Spark with code examples.

Mathieu Bastian spent the last four years building data products at LinkedIn. He joined LinkedIn's Data Science team in late 2010 and first worked on InMaps, a system that served millions of complex graph visualizations to users. Previously, Bastian co-founded and led the development of Gephi, award-winning open-source software for visualizing and analyzing large graphs.

Software is Changing the World. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.
