41:42
Data Science

Orchestrating Chaos: Applying Database Research in the Wild

Posted by Peter Alvaro on Aug 10, 2017

Peter Alvaro describes the theoretical roots of LDFI (Lineage-driven Fault Injection) in database research, presenting early results from the field and opportunities for near- and long-term future research.

31:56
Development

Power of the Log: LSM & Append Only Data Structures

Posted by Ben Stopford on Jun 15, 2017

Ben Stopford talks about the beauty of sequential access and append-only data structures in the context of “Log Structured Merge Trees”.

01:09:27
Architecture & Design

Applied Distributed Research in Apache Cassandra

Posted by Jonathan Ellis on Jun 10, 2017

Jonathan Ellis explains the challenges and successes Cassandra has had in creating transactions, materialized views, and strongly consistent cluster membership within its peer-to-peer paradigm.

54:50
Data Science

Scio: Moving Big Data to Google Cloud, a Spotify Story

Posted by Neville Li on May 26, 2017

Neville Li tells the story of how Spotify migrated its big data infrastructure to Google Cloud, replacing Hive and Scalding with BigQuery and Scio, which helped the company iterate faster.

47:56
Architecture & Design

In-Memory Caching: Curb Tail Latency with Pelikan

Posted by Yao Yue on May 02, 2017

Yao Yue introduces Pelikan, a framework for implementing distributed caches such as Memcached and Redis. She discusses the system aspects that are important to the performance of such services.

45:00
Data Science

Data Preparation for Data Science: A Field Guide

Posted by Casey Stella on Apr 23, 2017

Casey Stella presents a utility written with Apache Spark to automate data preparation by discovering missing values, values with skewed distributions, and likely errors within the data.

50:39
Architecture & Design

Building Reliability in an Unreliable World

Posted by Greg Murphy on Apr 20, 2017

Greg Murphy describes how GameSparks has designed their platform to be tolerant of many things: unreliable and slow internet connectivity, cloud resources that can fail without warning, and more.

42:48
Data Science

AI from an Investment Perspective

Posted by Sanjit Dang, Kiersten Stead, Yashwanth Hemaraj, Pankaj Mitra, Leonard Speiser, Kartik Gada, Doug Dooley on Apr 18, 2017

The panelists discuss AI from an investment perspective, the challenges, the risks, trends, the role of Deep Learning, successful AI use cases, and more.

49:40
Architecture & Design

Causal Consistency for Large Neo4j Clusters

Posted by Jim Webber on Apr 07, 2017

Jim Webber explores the new Causal Clustering architecture for Neo4j and how it allows users to read their own writes straightforwardly, and explains why this is difficult to achieve in distributed systems.

50:48
Data Science

Big Data Infrastructure @ LinkedIn

Posted by Shirshanka Das on Apr 02, 2017

Shirshanka Das describes LinkedIn’s Big Data Infrastructure and its evolution through the years, including details on the motivation and architecture of Gobblin, Pinot and WhereHows.

46:03
Architecture & Design

Scaling up Near Real-Time Analytics @Uber & LinkedIn

Posted by Chinmay Soman, Yi Pan on Mar 30, 2017

Chinmay Soman and Yi Pan discuss how Uber and LinkedIn use Apache Samza, Calcite and Pinot along with the analytics platform AthenaX to transform data to make it available for querying in minutes.

47:03
Data Science

Real-Time Recommendations Using Spark Streaming

Posted by Elliot Chow on Mar 30, 2017

Elliot Chow discusses the data pipeline that Netflix built with Kafka, Spark Streaming, and Cassandra to process user activities in real time for the Trending Now row.
