Culture & Methods

Privacy Ethics – A Big Data Problem

Posted by Raghu Gollamudi on Aug 23, 2018

Raghu Gollamudi covers data management best practices, from mapping enterprise data to applying data protection rules such as GDPR at petabyte scale.

Culture & Methods

What is a Data Citizen?

Posted by Caitlin McDonald on Aug 16, 2018

Caitlin McDonald discusses how big data affects people online and the ethics to be considered when dealing with data.

Culture & Methods

When Data Kills

Posted by Cori Crider on Aug 10, 2018

Cori Crider shares insights from her investigations of US drone strikes in Yemen and Pakistan, and explores how misuse of mass surveillance data has claimed innocent lives.

AI, ML & Data Engineering

Streaming SQL Foundations: Why I ❤Streams+Tables

Posted by Tyler Akidau on Feb 17, 2018

Tyler Akidau explores the relationship between the Beam Model and stream and table theory, and covers stream processing in SQL with Apache Beam, Calcite, Flink, Kafka KSQL, and Apache Spark’s Structured Streaming.

AI, ML & Data Engineering

Bias in BigData/AI and ML

Posted by Leslie Miley on Dec 23, 2017

Leslie Miley discusses how inherent bias in data sets has affected everything from the 2016 Presidential race to criminal sentencing in the United States.

AI, ML & Data Engineering

Scaling with Apache Spark

Posted by Holden Karau on Aug 05, 2017

Holden Karau looks at Apache Spark from a performance/scaling point of view and what’s needed to handle large datasets.

Architecture & Design

Serverless Design Patterns with AWS Lambda: Big Data with Little Effort

Posted by Tim Wagner on Jul 29, 2017

Tim Wagner discusses Big Data on serverless, showing working examples and how to set up a CI/CD pipeline, demonstrating AWS Lambda with the Serverless Application Model (SAM).

AI, ML & Data Engineering

Scio: Moving Big Data to Google Cloud, a Spotify Story

Posted by Neville Li on May 26, 2017

Neville Li tells the story of Spotify migrating its big data infrastructure to Google Cloud, replacing Hive and Scalding with BigQuery and Scio, which helped them iterate faster.

AI, ML & Data Engineering

Data Preparation for Data Science: A Field Guide

Posted by Casey Stella on Apr 23, 2017

Casey Stella presents a utility written with Apache Spark to automate data preparation by discovering missing values, values with skewed distributions, and likely errors within data.

AI, ML & Data Engineering

AI from an Investment Perspective

Posted by Yashwanth Hemaraj, Doug Dooley, Kartik Gada, Leonard Speiser, Kiersten Stead, Sanjit Dang, Pankaj Mitra on Apr 18, 2017

The panelists discuss AI from an investment perspective: the challenges, risks, and trends, the role of Deep Learning, successful AI use cases, and more.

AI, ML & Data Engineering

Big Data Infrastructure @ LinkedIn

Posted by Shirshanka Das on Apr 02, 2017

Shirshanka Das describes LinkedIn’s Big Data Infrastructure and its evolution through the years, including details on the motivation and architecture of Gobblin, Pinot and WhereHows.

AI, ML & Data Engineering

Real-Time Recommendations Using Spark Streaming

Posted by Elliot Chow on Mar 30, 2017

Elliot Chow discusses the data pipeline built with Kafka, Spark Streaming, and Cassandra to process Netflix user activity in real time for the Trending Now row.
