40:26
Data Science

Machine Learning in Academia and Industry

Posted by Deborah Hanus on Oct 10, 2017

Deborah Hanus discusses some of the challenges that can arise when working with data.

39:23
Data Science

AI-Based Data Extraction

Posted by George Roth on May 28, 2017

George Roth presents the challenges of data extraction from unstructured content in the context of preparing the data for Data Analytics.

45:00
Data Science

Data Preparation for Data Science: A Field Guide

Posted by Casey Stella on Apr 23, 2017

Casey Stella presents a utility written with Apache Spark that automates data preparation by discovering missing values, values with skewed distributions, and likely errors within data.

46:03
Data Science

Straggler Free Data Processing in Cloud Dataflow

Posted by Eugene Kirpichov on Apr 11, 2017

Eugene Kirpichov describes the theory and practice behind Cloud Dataflow's approach to straggler elimination, and the associated non-obvious challenges, benefits, and implications of the technique.

46:03
Architecture & Design

Scaling up Near Real-Time Analytics @Uber & LinkedIn

Posted by Chinmay Soman and Yi Pan on Mar 30, 2017

Chinmay Soman and Yi Pan discuss how Uber and LinkedIn use Apache Samza, Calcite and Pinot along with the analytics platform AthenaX to transform data to make it available for querying in minutes.

45:22
Data Science

Effective Data Pipelines: Data Mngmt from Chaos

Posted by Katharine Jarmul on Mar 29, 2017

Katharine Jarmul discusses implementation decisions for those looking for a practical recommendation on the "what" and "how" of data automation workflows.

48:49
Data Science

Building Data Pipelines in Python

Posted by Marco Bonzanini on Mar 28, 2017

Marco Bonzanini discusses the process of building data pipelines and all the steps necessary to prepare data, focusing on data plumbing and going from prototype to production.

40:48
Data Science

Data Science in the Cloud @StitchFix

Posted by Stefan Krawczyk on Feb 17, 2017

Stefan Krawczyk discusses how StitchFix used the cloud to enable over 80 data scientists to be productive and have easy access, covering prototyping, algorithms used, keeping schema in sync, etc.

44:06
Data Science

Scaling the Data Infrastructure @Spotify

Posted by Mārtiņš Kalvāns and Matti Pehrs on Jan 28, 2017

Mārtiņš Kalvāns and Matti Pehrs give an overview of the data infrastructure at Spotify, diving into some of its components, such as Event Delivery, Datamon and Styx.

01:07:25
Data Science

Data Microservices in the Cloud

Posted by Mark Pollack on Jan 08, 2017

Mark Pollack introduces Spring Cloud Data Flow, which enables creating pipelines for data ingestion, real-time analytics and data import/export, and demos apps deployed onto multiple runtimes.

39:14
Data Science

Targeting Your Audience: Data Visualization to Communicate Data Insights

Posted by Randy Krum on Dec 16, 2016

Randy Krum explains how to use the power of data visualization to convey actionable insights to an audience, making data clear and memorable by showing the audience what the data means.

50:44
Data Science

Ingest & Stream Processing - What Will You Choose?

Posted by Pat Patterson and Ted Malaska on Aug 14, 2016

Pat Patterson and Ted Malaska talk about current and emerging data processing technologies, and the various ways of achieving "at least once" and "exactly once" timely data processing.
