Josh Wills discusses using Hadoop technologies to build real-time data analysis models with a focus on strategies for data integration, large-scale machine learning, and experimentation.
Details on Pinterest's architecture, its systems (Pinball, Frontdoor), and its stack: MongoDB, Cassandra, Memcache, Redis, Flume, Kafka, EMR, Qubole, Redshift, Python, Java, Go, Nutcracker, Puppet, etc.
Burt Beckwith discusses performing transactions in Grails, covering services, customizing transaction attributes (isolation, propagation levels), two-phase commit, using JMS, and testing the code.
Matthias Broecheler discusses graph computing, introducing the Aurelius graph cluster, which enables graph computing at scale by building on distributed systems like Cassandra, HBase, and Hadoop.
Indrajit Roy presents HP Labs’ attempts at scaling R to efficiently perform distributed machine learning and graph processing on industrial-scale data sets.
Craig Andera explains Datomic from the perspective gained in implementing and optimizing a real-world production system, detailing the Datomic indexing process.
Garrett Eardley explores how Riot Games is leveraging Riak for their stats system, discussing why they chose Riak, the data model and indexes, and strategies for working with eventually consistent data.
Rusty Sears introduces REEF along with examples of computational frameworks, including interactive sessions, iterative graph processing, bulk synchronous computations, Hive queries, and MapReduce.
Nick Kolegraff discusses common problems, an architecture to support all the phases of data science, and how to start a data science initiative, sharing lessons from Accenture, Best Buy, and Rackspace.
Sebastian Kanthak gives an overview of Spanner, covering details of how Spanner relies on GPS and atomic clocks to provide two of its most innovative features: lock-free strong (current) reads and global snapshots that are consistent with external events.
Paul King presents working with databases in Groovy, covering datasets, GMongo, Neo4J, raw JDBC, Groovy-SQL, CRUD, Hibernate, caching, Spring Data technologies, etc.
Paco Nathan reviews an example data analysis application written in Cascalog used for a recommender system based on City of Palo Alto Open Data.