Bikas Saha and Arun Murthy detail the design of Tez, highlighting some of its features and sharing some of the initial results obtained by Hive on Tez.
Oleg Zhurakousky discusses architectural tradeoffs and alternative implementations of real-time, high-speed data ingestion into Hadoop.
Uri Laserson reviews the different available Python frameworks for Hadoop, including a comparison of performance, ease of use/installation, differences in implementation, and other features.
Michael Kopp explains how to run performant code at scale with Hadoop and how to analyze and optimize Hadoop jobs.
Hairong Kuang explains how Facebook uses HDFS to store and analyze over 100PB of user log data.
Nikita Ivanov shows how to add real-time capabilities to Hadoop through a demo application that performs streaming word counting on a two-node cluster.
Kathleen Ting details 8 misconfigurations that can bring ZooKeeper down.
Nathan Marz introduces Twitter Storm, outlining its architecture and use cases, and looks at planned future features.
Eli Collins introduces Hadoop: why it came about, the benefits it produces, its history, its architecture, use cases and applications.
Yaniv Rodenski introduces Hadoop, then covers running Hadoop on Azure and the available tools and frameworks.
Dean Wampler discusses the strengths and weaknesses of MapReduce, and the newer variants for big data processing: Pregel and Storm.
Parand Tony Darugar gives an overview of Hadoop, its processing model, and the associated ecosystem and tools, and discusses some real-life uses of Hadoop for analyzing and processing large amounts of data.