Elliot Chow discusses the data pipeline that they built with Kafka, Spark Streaming, and Cassandra to process Netflix user activities in real time for the Trending Now row.
Mark Paluch and John Blum present the changes in Spring Data Cassandra and what to expect from the upcoming version.
Ben Lackey and Cornelia Davis start with the use cases for on-demand, dedicated DSE clusters, cover the solution design, and demo the system, also touching on Spring's support for Cassandra.
Helena Edelson addresses new architectures emerging for large-scale streaming analytics based on Spark, Mesos, Akka, Cassandra and Kafka (SMACK), Apache Flink, or GearPump.
Piotr Kołaczkowski discusses how they integrated Spark with Cassandra, how the integration works in practice, and why it is better than using a Hadoop intermediate layer.
Emily Green looks at how SoundCloud uses Cassandra, describing a couple of Cassandra instances from the point of view of the products and functionality they support.
Julien Le Dem discusses the advantages of a columnar data layout, specifically the features and design choices Apache Parquet uses to achieve its goals of interoperability, space efficiency, and query efficiency.
Eric Redmond explains the differences and commonalities amongst many kinds of databases and takes a stab at the marketing term “NoSQL.”
John Leach explains using HBase co-processors to support a full ANSI SQL RDBMS without modifying the core HBase source, showing how Hadoop/HBase can replace traditional RDBMS solutions.
The authors focus on POJO persistence over Cassandra, including automatic Cassandra schema generation and Spring context configuration using both XML and Java.
This talk goes over the design motivation for Zen and describes its internals, including the API, type system, and HBase backend.
Jayesh Thakrar shows what can be done with irb and how to exploit JRuby-Java integration, and demonstrates how the Shell can be used in Hadoop streaming to perform complex, large-volume batch jobs.