Sameer Farooqui demos connecting to the live stream of Wikipedia edits, building a dashboard showing what’s happening with Wikipedia datasets and how people are using them in real time.
Sudhir Tonse discusses using stream processing at Uber: indexing and querying of geospatial data, aggregation and computing of streaming data, extracting patterns, time-series analyses and predictions.
Joe Stein introduces developers to why and how to use Apache Kafka. Apache Kafka is a publish-subscribe messaging system rethought as a distributed commit log.
Sharad Murthy & Tony Ng present Pulsar, a real-time streaming system that can scale to millions of events per second with high availability and 4GL support.
Ajit Jaokar discusses data science and IoT: sensor data, real-time processing, cognitive computing, integration of IoT analytics with hardware, IoT’s impact on healthcare, automotive, wearables, etc.
Vaclav Petricek discusses how to train models and how to architect and build a scalable system powered by Storm, Hadoop, Spark, Spring Boot and Vowpal Wabbit that meets SLAs measured in tens of milliseconds.
Trisha Gee uses Java 8 streams and lambdas to build an app consuming a real-time feed of high-velocity data, using services to make sense of the data, and presenting it in a JavaFX dashboard.
This session explores the power of Spring XD in the context of the Internet of Things (IoT).
Eugene Mandel discusses challenges of conforming data sources and compares processing stacks: Hadoop+Redshift vs Spark, showing how the technology drives the way the problem is modeled.
Garrett Wampole describes an experimental methodology, developed by MITRE, for applying Enterprise Integration Patterns to the near-real-time processing of surveillance radar data.
Sean Owen provides examples of operational analytics projects, presenting a reference architecture and algorithm design choices for a successful implementation based on his experience with Oryx/Cloudera.
Josh Wills discusses using Hadoop technologies to build real-time data analysis models with a focus on strategies for data integration, large-scale machine learning, and experimentation.