40:48

Data Science in the Cloud @StitchFix

Posted by Stefan Krawczyk on Feb 17, 2017

Stefan Krawczyk discusses how StitchFix used the cloud to enable over 80 data scientists to be productive and have easy access, covering prototyping, algorithms used, keeping schema in sync, etc.

33:53

Streaming Live Data and the Hadoop Ecosystem

Posted by Oleg Zhurakousky on Jan 29, 2017

Oleg Zhurakousky discusses the Hadoop ecosystem (Hadoop, HDFS, YARN) and how projects such as Hive, Atlas, and NiFi interact and integrate to support the variety of data used for analytics.

30:29

Achieving Mega-Scale Business Intelligence through Speed of Thought Analytics on Hadoop

Posted by Ian Fyfe on Oct 26, 2016

Ian Fyfe discusses the different options for implementing speed-of-thought business analytics and machine learning tools directly on top of Hadoop.

51:04

The Game of Big Data: Scalable, Reliable Analytics Infrastructure at KIXEYE

Posted by Randy Shoup on Jul 19, 2014

Randy Shoup describes KIXEYE's analytics infrastructure from Kafka queues through Hadoop 2 to Hive and Redshift, built for flexibility, experimentation, iteration, testability, and reliability.

38:11

REEF: Retainable Evaluator Execution Framework

Posted by Rusty Sears on Dec 10, 2013

Rusty Sears introduces REEF along with examples of computational frameworks, including interactive sessions, iterative graph processing, bulk synchronous computations, Hive queries, and MapReduce.

38:16

Apache Tez: Accelerating Hadoop Query Processing

Posted by Bikas Saha, Arun Murthy on Dec 05, 2013

Bikas Saha and Arun Murthy detail the design of Tez, highlighting some of its features and sharing some of the initial results obtained by Hive on Tez.

52:24

Big Data Platform as a Service at Netflix

Posted by Jeff Magnusson on Nov 18, 2013

Jeff Magnusson details some of Netflix's key services: Franklin, Sting, and Lipstick.

Petabyte Scale Data at Facebook

Posted by Dhruba Borthakur on Dec 17, 2012

Dhruba Borthakur discusses the different types of data used by Facebook and how they are stored, including graph data, semi-OLTP data, immutable data for pictures, and Hadoop/Hive for analytics.

Hadoop and Cassandra, Sitting in a Tree ...

Posted by Jake Luciani on May 30, 2012

Jake Luciani introduces Brisk, a Hadoop and Hive distribution that uses Cassandra for core services and storage, and presents the benefits of running Hadoop in a masterless, peer-to-peer architecture.
