Erik Hinton discusses the successes and failures of making a cultural shift in the newsroom at NYT to accept Haskell and some of the projects Haskell has been used for.
Cliff Click introduces a coding style & API for in-memory analytics that handles datasets from 1K to 1TB without changing a line of code and clusters with TB of RAM and hundreds of CPUs.
Josh Wills discusses using Hadoop technologies to build real-time data analysis models with a focus on strategies for data integration, large-scale machine learning, and experimentation.
Indrajit Roy presents HP Labs’ attempts at scaling R to efficiently perform distributed machine learning and graph processing on industrial-scale data sets.
Nick Kolegraff discusses common problems and architecture to support all the phases of data science and how to start a data science initiative, sharing lessons from Accenture, Best Buy, and Rackspace.
Paco Nathan reviews an example data analysis application written in Cascalog used for a recommender system based on City of Palo Alto Open Data.
Avi Bryant discusses how the laws of group theory provide a useful codification of the practical lessons of building efficient distributed and real-time aggregation systems.
Ken Collier discusses Agile Analytics, a combination of sophisticated analytics techniques, lean learning principles, agile delivery methods, and "big data" technologies.
Mike Nolet shares lessons learned scaling AppNexus and architectural details of their system processing 30TB/day: Hadoop, DNS built in GSLB and Keepalived, and real-time data streaming built in C.
Michael Hausenblas introduces Apache Drill, a distributed system for interactive analysis of large-scale datasets, including its architecture and typical use cases.
Jim Webber explains how to understand the forces and tensions within a graph structure and to apply graph theory in order to predict how the graph will evolve over time.
Uri Laserson reviews the different available Python frameworks for Hadoop, including a comparison of performance, ease of use/installation, differences in implementation, and other features.