Pedro Canahuati describes how Facebook's operations team maintains its infrastructure, including challenges faced and lessons learned: prioritizing calls, managing technical debt, and managing incidents.
Dianne Marsh presents the open source tools used by Netflix to keep the continuous delivery wheels spinning.
Indrajit Roy presents HP Labs’ attempts at scaling R to efficiently perform distributed machine learning and graph processing on industrial-scale data sets.
Ben Christensen describes how the Netflix API evolved from a typical one-size-fits-all RESTful API designed to support public developers into a web service platform optimized to handle the diversity and variability of each device and user experience. The talk also addresses the challenges of operations, deployment, performance, fault tolerance, and rate of innovation at massive scale.
Rusty Sears introduces REEF along with examples of computational frameworks, including interactive sessions, iterative graph processing, bulk synchronous computations, Hive queries, and MapReduce.
Mike Krieger discusses Instagram's best and worst infrastructure decisions in building and deploying scalable and extensible services.
Nick Kolegraff discusses common problems, an architecture that supports all phases of data science, and how to start a data science initiative, sharing lessons from Accenture, Best Buy, and Rackspace.
Jeff Magnusson takes a deep dive into key services of Netflix’s “data platform as a service” architecture, including RESTful services that: provide comprehensive metadata management across data sources (Franklin); enable visualization and caching of results of Hadoop jobs (Sting); and visualize the execution plans produced by languages such as Pig and Hive (Lipstick).
Crista Lopes writes a program in multiple styles (monolithic, OOP, continuations, relational, Pub-Sub, Monads, AOP, Map-reduce), showing the value of using more than one style in large-scale systems.
Dan Frank discusses stream data processing and introduces NSQ – Bitly’s open source queuing system – and other new technologies used for communication between streaming programs.
Peter Niederwieser discusses building a continuous delivery pipeline with Gradle and Jenkins.
Ken Collier discusses Agile Analytics, a combination of sophisticated analytics techniques, lean learning principles, agile delivery methods, and "big data" technologies.