Julien Lavigne du Cadet discusses how Criteo uses Druid, an open-source, real-time data store designed to power interactive applications at scale, and covers Druid's architecture and internals.
Evelina Gabasova explains how to run a social network analysis on Twitter and how to use data science tools to find out more about followers.
Thore Thomassen shares from experience how to combine structured data in a DWH with unstructured data in NoSQL, and how to use parallel data warehouse appliances to boost analytical capabilities.
Bob Kelly presents case studies on how Platfora uses Hadoop to do analytics for several of their customers.
Seth Juarez shares insight on how to create applications that use dashboards to drive value, convert raw data into answers, and simplify business processes.
Wesley Chow presents Chartbeat's real-time analytics platform and how it handles requests cost-efficiently using a custom-written analytics engine in C and Lua.
Bryan Nehl gives an introduction to data science: data formats, ETL tools, NoSQL databases, languages, libraries, techniques, and approaches for exploring data and extracting value from it.
Stefan Edlich discusses big data systems (Spanner, Presto) and the future of data persistence, data analytics, data formats, and NoSQL/NewSQL in general.
Erik Hinton discusses the successes and failures of making a cultural shift in the newsroom at NYT to accept Haskell and some of the projects Haskell has been used for.
Indrajit Roy presents HP Labs’ attempts at scaling R to efficiently perform distributed machine learning and graph processing on industrial-scale data sets.
Paco Nathan reviews an example data analysis application written in Cascalog used for a recommender system based on City of Palo Alto Open Data.
Avi Bryant discusses how the laws of group theory provide a useful codification of the practical lessons of building efficient distributed and real-time aggregation systems.
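
As a rough illustration of the idea in the last abstract (a sketch, not code from the talk), the snippet below shows why algebraic laws matter for distributed aggregation: if partial results combine associatively and have an identity element, shards can be merged in any grouping and empty shards are harmless. The `Stats` type, the `aggregate` helper, and the sample numbers are all hypothetical. Count and sum also have inverses, so they form a group; max does not, so the combined structure here is only a monoid.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Stats:
    """Partial aggregate: count, sum, and max of the values seen so far."""
    count: int = 0
    total: int = 0
    maximum: float = float("-inf")  # identity element for max

    @staticmethod
    def of(value):
        # Lift a single observation into a partial aggregate.
        return Stats(1, value, value)

    def combine(self, other):
        # Associative merge: each field is merged independently.
        return Stats(
            self.count + other.count,
            self.total + other.total,
            max(self.maximum, other.maximum),
        )


EMPTY = Stats()  # identity: combining with it changes nothing


def aggregate(values):
    # Fold a sequence of observations into one partial aggregate.
    return reduce(Stats.combine, (Stats.of(v) for v in values), EMPTY)


# Hypothetical per-shard results, e.g. computed on separate machines.
shard_a = aggregate([3, 5])
shard_b = aggregate([10])
shard_c = aggregate([])  # an empty shard yields the identity

# Associativity: the grouping of merges does not affect the result.
merged_left = shard_a.combine(shard_b).combine(shard_c)
merged_right = shard_a.combine(shard_b.combine(shard_c))
```

Both merge orders give the same aggregate as processing all values on a single machine, which is what allows the work to be split across nodes or applied incrementally as new data arrives.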