Sean Owen provides examples of operational analytics projects in the field, presenting a reference architecture and algorithm design choices for a successful implementation based on his experience with customers and Oryx/Cloudera.
Josh Wills discusses using Hadoop technologies to build real-time data analysis models with a focus on strategies for data integration, large-scale machine learning, and experimentation.
Chris Riccomini discusses Samza's feature set, how Samza integrates with YARN and Kafka, how it's used at LinkedIn, and what's next on the roadmap.
Avi Bryant discusses how the laws of group theory provide a useful codification of the practical lessons of building efficient distributed and real-time aggregation systems.
Dan Frank discusses stream data processing and introduces NSQ – Bitly’s open source queuing system – and other new technologies used for communication between streaming programs.
Oleg Zhurakousky discusses architectural tradeoffs and alternative implementations of real-time high speed data ingest into Hadoop.
Mike Nolet shares lessons learned scaling AppNexus and architectural details of their system processing 30 TB/day: Hadoop, a load-balancer-free DNS architecture built with GSLB and Keepalived, and real-time data streaming built in C.
Bijan Vaez discusses building large-scale cross-platform mobile apps with HTML5 including offline support, real-time interactivity, and device APIs (camera, GPS).
Gustavo Garcia explores actual use cases for real-time communication in verticals ranging from telepresence to healthcare, where WebRTC fits and where it falls short, and what developers can do.
Charles Cai and Ashwani Roy discuss a robust, cost-effective, hypothetical solution to address extreme challenges in financial institutions, from decision-making support to pricing and risk management.
Nikita Ivanov shows how to add real-time capabilities to Hadoop through a demo application that performs streaming word counting on a two-node cluster.