Big Data Content on InfoQ

Presentations

AI, ML & Data Engineering

Streaming SQL Foundations: Why I ❤Streams+Tables

Tyler Akidau explores the relationship between the Beam Model and stream & table theory, and stream processing in SQL with Apache Beam, Calcite, Flink, Kafka KSQL and Apache Spark’s Structured Streaming.

Tyler Akidau
on Feb 17, 2018

51:39
AI, ML & Data Engineering

Bias in BigData/AI and ML

Leslie Miley discusses how inherent bias in data sets has affected things from the 2016 Presidential race to criminal sentencing in the United States.

Leslie Miley
on Dec 23, 2017

25:30
AI, ML & Data Engineering

Scaling with Apache Spark

Holden Karau looks at Apache Spark from a performance/scaling point of view and what’s needed to handle large datasets.

Holden Karau
on Aug 05, 2017

46:58
Architecture & Design

Serverless Design Patterns with AWS Lambda: Big Data with Little Effort

Tim Wagner discusses Big Data on serverless, showing working examples and how to set up a CI/CD pipeline, demonstrating AWS Lambda with the Serverless Application Model (SAM).

Tim Wagner
on Jul 29, 2017

50:45
AI, ML & Data Engineering

Scio: Moving Big Data to Google Cloud, a Spotify Story

Neville Li tells Spotify’s story of migrating their big data infrastructure to Google Cloud, replacing Hive and Scalding with BigQuery and Scio, which helped them iterate faster.

Neville Li
on May 26, 2017

54:50
AI, ML & Data Engineering

Data Preparation for Data Science: A Field Guide

Casey Stella presents a utility written with Apache Spark to automate data preparation by discovering missing values, values with skewed distributions and likely errors within data.

Casey Stella
on Apr 23, 2017

45:00
AI, ML & Data Engineering

AI from an Investment Perspective

The panelists discuss AI from an investment perspective, the challenges, the risks, trends, the role of Deep Learning, successful AI use cases, and more.

Yashwanth Hemaraj, Kartik Gada, Pankaj Mitra, Kiersten Stead, Sanjit Dang, Leonard Speiser, Doug Dooley
on Apr 18, 2017

42:48
AI, ML & Data Engineering

Big Data Infrastructure @ LinkedIn

Shirshanka Das describes LinkedIn’s Big Data Infrastructure and its evolution through the years, including details on the motivation and architecture of Gobblin, Pinot and WhereHows.

Shirshanka Das
on Apr 02, 2017

50:48
AI, ML & Data Engineering

Real-Time Recommendations Using Spark Streaming

Elliot Chow discusses the data pipeline that they built with Kafka, Spark Streaming, and Cassandra to process Netflix user activities in real time for the Trending Now row.

Elliot Chow
on Mar 30, 2017

47:03
AI, ML & Data Engineering

Building a Data Science Capability from Scratch

Victor Hu covers the challenges, both technical and cultural, of building a data science team and capability in a large, global company.

Victor Hu
on Mar 23, 2017

49:06
AI, ML & Data Engineering

Data Science in the Cloud @StitchFix

Stefan Krawczyk discusses how StitchFix used the cloud to enable over 80 data scientists to be productive and have easy access, covering prototyping, algorithms used, keeping schema in sync, etc.

Stefan Krawczyk
on Feb 17, 2017

40:48
DevOps

Petabytes Scale Analytics Infrastructure @Netflix

Tom Gianos and Dan Weeks discuss Netflix's overall big data platform architecture, focusing on Storage and Orchestration, and how they use Parquet on AWS S3 as their data warehouse storage layer.

Tom Gianos, Dan Weeks
on Feb 15, 2017

45:26
