Shirshanka Das describes LinkedIn’s Big Data Infrastructure and its evolution through the years, including details on the motivation and architecture of Gobblin, Pinot and WhereHows.
Tom Gianos and Dan Weeks discuss Netflix' overall big data platform architecture, focusing on Storage and Orchestration, and how they use Parquet on AWS S3 as their data warehouse storage layer.
Randy Shoup describes KIXEYE's analytics infrastructure from Kafka queues through Hadoop 2 to Hive and Redshift, built for flexibility, experimentation, iteration, testability, and reliability.
Brenden Matthews describes the infrastructure built at Airbnb using Mesos in order to support Hadoop and Storm.
Mike Nolet shares lessons learned scaling AppNexus and architectural details of their system processing 30TB/day: Hadoop, DNS built in GSLB and Keepalived, and real-time data streaming built in C.
Manvir Singh Grewal and Brandon Byars propose a business intelligence workflow along with Lean principles and practices for implementing a data warehouse and reporting capability.
Bhaven Avalani and Yuri Finklestein discuss 4 aspects encountered at eBay when dealing with monitoring data: reduction of data entropy, robust data distribution, metric extraction, efficient storage.
Cliff Click discusses RAIN, H2O, JMM, Parallel Computation, Fork/Joins in the context of performing big data analysis on tons of commodity hardware.
Serkan Piantino discusses news feeds at Facebook: the basics, infrastructure used, how feed data is stored, and Centrifuge – a storage solution.
Bruce Durling discusses the impact of cloud computing on the climate and what can be done to reduce the amount of CO2 generated by data centers in order to process big data.
Ram C Singh discusses using Big Data for infrastructure telemetry along with good practices and an autonomic engine to create an autonomic computing infrastructure that might prevent downtime.