Hairong Kuang explains how Facebook uses HDFS to store and analyze over 100PB of user log data.
Nikita Ivanov shows how to add real-time capabilities to Hadoop through a demo application performing streaming word counting on a two-node cluster.
Kathleen Ting details 8 misconfigurations that can bring ZooKeeper down.
Costin Leau discusses Big Data, the tools currently available for dealing with it, and how Spring can be used to create Big Data pipelines.
Nathan Marz introduces Twitter Storm, outlining its architecture and use cases, and looks at features planned for future releases.
Rob Lancaster explains the steps taken by Orbitz to bridge the gap between the data in their data warehouse and the data in Hadoop.
Eli Collins introduces Hadoop: why it came about, the benefits it provides, its history, its architecture, use cases, and applications.
Dhruba Borthakur discusses the different types of data used by Facebook and how they are stored, including graph data, semi-OLTP data, immutable picture data, and analytics data in Hadoop/Hive.
Yaniv Rodenski introduces Hadoop, then covers running Hadoop on Azure and the tools and frameworks available for it.
Dean Wampler discusses the strengths and weaknesses of MapReduce, along with newer alternatives for big data processing: Pregel and Storm.
Parand Tony Darugar provides an overview of Hadoop, its processing model, and the associated ecosystem and tools, discussing real-life uses of Hadoop for analyzing and processing large amounts of data.
Ashish Thusoo presents the data scalability issues at Facebook and the evolution of its data architecture from EDW to Hadoop to Puma.