Mohammad Quraishi explains how to implement a Big Data initiative, detailing preparation, goal evaluation, convincing executives, and post-implementation evaluation.
Marcel Kornacker presents a case study of an EDW built on Impala running on 45 nodes, reducing processing time from hours to seconds and consolidating multiple data sets into a single view.
Akmal B. Chaudhri introduces Apache™ Hadoop® 2.0 and Yet Another Resource Negotiator (YARN).
Brenden Matthews describes the infrastructure built at Airbnb using Mesos to support Hadoop and Storm.
Matthias Broecheler discusses graph computing, introducing the Aurelius graph cluster, which enables graph computing at scale by building on distributed systems like Cassandra, HBase, and Hadoop.
Bikas Saha and Arun Murthy detail the design of Tez, highlighting some of its features and sharing some of the initial results obtained by Hive on Tez.
Oleg Zhurakousky discusses architectural tradeoffs and alternative implementations of real-time, high-speed data ingest into Hadoop.
Uri Laserson reviews the different available Python frameworks for Hadoop, including a comparison of performance, ease of use/installation, differences in implementation, and other features.
Michael Kopp explains how to run performance code at scale with Hadoop and how to analyze and optimize Hadoop jobs.
Hairong Kuang explains how Facebook uses HDFS to store and analyze over 100PB of user log data.
Nikita Ivanov shows how to add real-time capabilities to Hadoop through a demo application that performs streaming word counting on a 2-node cluster.
Kathleen Ting details 8 misconfigurations that can bring ZooKeeper down.