Neha Narkhede of Kafka fame shares the experience of building LinkedIn's powerful and efficient data pipeline infrastructure around Apache Kafka and Samza to process billions of events every day.
The authors discuss patterns and technologies needed to scale large enterprise mobile systems, covering network connectivity, data reliability, and real-time communication.
Brian Degenhardt discusses lessons that Twitter learned managing a high rate of change and complexity, and how those can be applied anywhere.
Sean Owen provides examples of operational analytics projects, presenting a reference architecture and algorithm design choices for a successful implementation based on his experience with Oryx/Cloudera.
Josh Wills discusses using Hadoop technologies to build real-time data analysis models with a focus on strategies for data integration, large-scale machine learning, and experimentation.
Chris Riccomini discusses Samza's feature set, how Samza integrates with YARN and Kafka, how it's used at LinkedIn, and what's next on the roadmap.
Avi Bryant discusses how the laws of group theory provide a useful codification of the practical lessons of building efficient distributed and real-time aggregation systems.
Dan Frank discusses stream data processing and introduces NSQ, Bitly’s open-source queuing system, and other new technologies used for communication between streaming programs.
Oleg Zhurakousky discusses architectural tradeoffs and alternative implementations of real-time, high-speed data ingestion into Hadoop.
Mike Nolet shares lessons learned scaling AppNexus and architectural details of their system processing 30 TB/day: Hadoop, DNS built in GSLB and Keepalived, and real-time data streaming built in C.
Bijan Vaez discusses building large-scale cross-platform mobile apps with HTML5, including offline support, real-time interactivity, and device APIs (camera, GPS).