Josh Wills discusses using Hadoop technologies to build real-time data analysis models with a focus on strategies for data integration, large-scale machine learning, and experimentation.
Brenden Matthews describes the infrastructure built at Airbnb using Mesos in order to support Hadoop and Storm.
Matthias Broecheler discusses graph computing and introduces the Aurelius graph cluster, which enables graph computing at scale by building on distributed systems such as Cassandra, HBase, and Hadoop.
Rusty Sears introduces REEF along with examples of computational frameworks, including interactive sessions, iterative graph processing, bulk synchronous computations, Hive queries, and MapReduce.
Bikas Saha and Arun Murthy detail the design of Tez, highlighting some of its features and sharing some of the initial results obtained by Hive on Tez.
Ben Johnson discusses Raft, a distributed consensus protocol, and how it works.
Steve Pember discusses creating Grails applications integrating message broker technologies, especially RabbitMQ, and applying SOA principles.
Sebastian Kanthak gives an overview of Spanner, covering how it relies on GPS and atomic clocks to provide two of its most innovative features: lock-free strong (current) reads and global snapshots that are consistent with external events.
Avi Bryant discusses how the laws of group theory provide a useful codification of the practical lessons of building efficient distributed and real-time aggregation systems.
Kyle Kingsbury discusses some of the limitations of distributed systems and how some of those systems behave under partitioning.
Jeff Magnusson takes a deep dive into key services of Netflix’s “data platform as a service” architecture, including RESTful services that provide comprehensive metadata management across data sources (Franklin), enable visualization and caching of results of Hadoop jobs (Sting), and visualize the execution plans produced by languages such as Pig and Hive (Lipstick).
Joshua Suereth designs a scalable distributed search service with Akka and Scala using actors, covering practical aspects of how to scale out with Akka’s clustering API.