Uri Laserson reviews the available Python frameworks for Hadoop, comparing their performance, ease of use and installation, implementation differences, and other features.
Michael Kopp explains how to run high-performance code at scale with Hadoop and how to analyze and optimize Hadoop jobs.
Nathan Marz shares lessons learned building Storm, an open-source, distributed, real-time computation system.
Eli Collins gives an overview of how to build new applications with Hadoop and how to integrate Hadoop with existing applications, providing an update on the state of the Hadoop ecosystem, frameworks, and APIs.
Dean Wampler advocates using functional programming and its core operations to process large amounts of data, explaining why Java’s dominance in Hadoop is harming Big Data’s progress.
Alex Robbins introduces Cascalog, a Clojure library for writing declarative Hadoop jobs.
Josh Suereth designs a distributed search service with Akka using Actors, covering message passing, designing topologies, handling failure, service overload detection, and tracking user sessions.
Ramnivas Laddad sketches the architecture of Cloud Foundry, explaining how it performs hot swaps without application downtime and sharing lessons applicable to distributed environments in general.
Adrian Cockcroft presents Netflix's globally distributed architecture, the benchmarks used, scalability issues, and the open-source components its implementation is based upon.
Scott Andreas discusses creating fault-tolerant distributed applications and demos Ordasity, a framework for building self-organizing systems with services.
Hairong Kuang explains how Facebook uses HDFS to store and analyze over 100PB of user log data.
Pieter Hintjens explains how to use contracts and rapid iterative design cycles to architect large-scale distributed systems with ZeroMQ.