Benjamin Hindman discusses Apache Mesos, focusing on the Mesos API and how the primitives provided by Mesos can make it easier to build new stateful services and frameworks.
Julien Le Dem discusses the advantages of a columnar data layout, specifically the features and design choices Apache Parquet uses to achieve goals of interoperability, space and query efficiency.
Yann Yu discusses how Solr and Hadoop complement each other, and how to use Solr as a real-time, analytical, full-text search front-end to data stored in Hadoop.
Paco Nathan keynotes on how Spark fits into the big data landscape, describing what other systems work with Spark, and explaining why Spark is needed in the future.
This talk provides a broad overview of the new features introduced in the latest Spring Data release trains: recent additions in Spring Data Commons and the latest features of individual store modules.
Camille Fournier explains what projects ZooKeeper is useful for, the common challenges of running it as a service, and advice to consider when architecting a system that uses it.
In this solutions track talk, sponsored by DataStax, Johnny Miller introduces the Cassandra native protocol, native drivers and CQL, explaining how to query Cassandra without Thrift or RPC.
Bikas Saha and Arun Murthy detail the design of Tez, highlighting some of its features and sharing some of the initial results obtained by Hive on Tez.
Chris Riccomini discusses Samza's feature set, how Samza integrates with YARN and Kafka, how it's used at LinkedIn, and what's next on the roadmap.
Michael Hausenblas introduces Apache Drill, a distributed system for interactive analysis of large-scale datasets, including its architecture and typical use cases.
Michael Brunton-Spall shares his experience re-architecting The Guardian’s Content API from a system based on Solr to a message queue cloud service based upon Elasticsearch, without any downtime.
Kumar Palaniapan and Scott Fleming present how NetApp deals with big data using Hadoop, HBase, Flume, and Solr, collecting and analyzing TBs of log data with Think Big Analytics.