Eugene Dvorkin provides an introduction to the Storm framework and explains how to build real-time applications on top of Storm with Groovy, including how to process data from Twitter in real time.
Garrett Wampole describes an experimental methodology, developed by MITRE, for applying Enterprise Integration Patterns to the near real-time processing of surveillance radar data.
Neha Narkhede of Kafka fame shares the experience of building LinkedIn's powerful and efficient data pipeline infrastructure around Apache Kafka and Samza to process billions of events every day.
The authors discuss patterns and technologies needed to scale large enterprise mobile systems, covering handling network connectivity, data reliability and real-time communication.
Brian Degenhardt discusses lessons that Twitter learned managing a high rate of change and complexity, and how those can be applied anywhere.
Sean Owen provides examples of operational analytics projects, presenting a reference architecture and algorithm design choices for a successful implementation based on his experience with Oryx/Cloudera.
Josh Wills discusses using Hadoop technologies to build real-time data analysis models with a focus on strategies for data integration, large-scale machine learning, and experimentation.
Chris Riccomini discusses Samza's feature set, how Samza integrates with YARN and Kafka, how it's used at LinkedIn, and what's next on the roadmap.
Avi Bryant discusses how the laws of group theory provide a useful codification of the practical lessons of building efficient distributed and real-time aggregation systems.
Dan Frank discusses stream data processing and introduces NSQ – Bitly’s open source queuing system – and other new technologies used for communication between streaming programs.
Oleg Zhurakousky discusses architectural tradeoffs and alternative implementations of real-time, high-speed data ingestion into Hadoop.