Dmitriy Ryaboy shares some of the lessons learned scaling Twitter’s analytics infrastructure: Data loves a schema, Make data sources discoverable, and Make costs visible.
Dhruba Borthakur discusses the different types of data used by Facebook and how they are stored, including graph data, semi-OLTP data, immutable data for pictures, and Hadoop/Hive for analytics.
Scott Vokes presents several lesser-known data structures and their advantages: skiplists, difference lists, rolling hashes, and jumpropes.
Sean Cribbs discusses Convergent Replicated Data Types, data structures that tolerate eventual consistency.
Ian Plosker explains why a data model needs to follow the query patterns when using a NoSQL storage solution.
Rich Hickey discusses the complexity introduced by a database into a system, and a way to deal with it by using Datomic. He also discusses immutability, epochal time, and persistent data structures.
Stuart Sierra discusses using a data-oriented programming approach to create programs that are easier to write and test. The session is accompanied by Clojure code samples.
Frank Tarsillo, John Davies, Jon Vernon and Ari Zilka (moderator) discuss the technologies and architectures used these days to manage large amounts of sensitive data in top financial institutions.
Jim Webber talks about today's data, what integrated data looks like, how to model it using actual data stores, and the implications of this modeling.
James Spooner discusses the need to make good use of the underlying silicon, using dataflow computing and parallelism to improve throughput and latency in data processing.
Ashish Thusoo presents the data scalability issues at Facebook and the data architecture evolution from EDW to Hadoop to Puma.
Baishampayan Ghose discusses creating custom data types in Clojure, covering types vs. records, interfaces and their corresponding protocols, mutable types, and example implementations.