Helena Edelson addresses new architectures emerging for large-scale streaming analytics based on Spark, Mesos, Akka, Cassandra and Kafka (SMACK), Apache Flink, or GearPump.
Piotr Kołaczkowski discusses how Spark was integrated with Cassandra, how the integration works in practice, and why it is better than using a Hadoop intermediate layer.
Emily Green takes a look at how SoundCloud uses Cassandra. She describes a couple of Cassandra instances from the point of view of the products and functionality they support.
Julien Le Dem discusses the advantages of a columnar data layout, specifically the features and design choices Apache Parquet uses to achieve goals of interoperability, space and query efficiency.
Eric Redmond explains the differences and commonalities amongst many kinds of databases and takes a stab at the marketing term “NoSQL.”
John Leach explains using HBase co-processors to support a full ANSI SQL RDBMS without modifying the core HBase source, showing how Hadoop/HBase can replace traditional RDBMS solutions.
The authors focus on POJO persistence over Cassandra, including automatic Cassandra schema generation and Spring context configuration using both XML and Java.
This talk goes over the design motivation for Zen and describes its internals, including the API, type system and HBase backend.
Jayesh Thakrar shows what can be done with irb, how to exploit JRuby-Java integration, and demonstrates how the Shell can be used in Hadoop streaming to perform complex and large volume batch jobs.
In this solutions track talk, sponsored by DataStax, Johnny Miller introduces the Cassandra native protocol, native drivers and CQL, explaining how to query Cassandra without Thrift or RPC.
Details on Pinterest's architecture, its systems (Pinball, Frontdoor), and its stack: MongoDB, Cassandra, Memcache, Redis, Flume, Kafka, EMR, Qubole, Redshift, Python, Java, Go, Nutcracker, Puppet, etc.
Matthias Broecheler discusses graph computing, introducing the Aurelius graph cluster enabling graph computing at scale by building on distributed systems like Cassandra, HBase, and Hadoop.