Paco Nathan reviews an example data analysis application, written in Cascalog, used for a recommender system based on City of Palo Alto Open Data.
Avi Bryant discusses how the laws of group theory provide a useful codification of the practical lessons of building efficient distributed and real-time aggregation systems.
Jeff Magnusson takes a deep dive into key services of Netflix’s “data platform as a service” architecture, including RESTful services that: provide comprehensive metadata management across data sources (Franklin); enable visualization and caching of results of Hadoop jobs (Sting); and visualize the execution plans produced by languages such as Pig and Hive (Lipstick).
Tamar Bercovici presents Box’s transition from a single MySQL database to a fully sharded MySQL architecture, all the while serving 2 billion queries per day.
Crista Lopes writes a program in multiple styles (monolithic, OOP, continuations, relational, Pub-Sub, Monads, AOP, Map-reduce), showing the value of using more than one style in large-scale systems.
Dan Frank discusses stream data processing and introduces NSQ – Bitly’s open source queuing system – and other new technologies used for communication between streaming programs.
Jeff Scott Brown demos creating a web application with Grails 2 using the command line, GORM and Hibernate, GSP, and Spring Integration.
Ken Collier discusses Agile Analytics, a combination of sophisticated analytics techniques, lean learning principles, agile delivery methods, and "big data" technologies.
Oleg Zhurakousky discusses architectural tradeoffs and alternative implementations of real-time, high-speed data ingestion into Hadoop.
Mike Nolet shares lessons learned scaling AppNexus and architectural details of their system processing 30TB/day: Hadoop, load balancer-free DNS architecture built in GSLB and Keepalived, and real-time data streaming built in C.
Michael Hausenblas introduces Apache Drill, a distributed system for interactive analysis of large-scale datasets, including its architecture and typical use cases.
Peter Boros discusses a MySQL architecture suitable for the majority of projects, covering backup, online schema changes, reliability and scalability issues, and the basics of sharding.