39:22

Spanner - Google's Distributed Database

Posted by Sebastian Kanthak  on  Nov 25, 2013

Sebastian Kanthak gives an overview of Spanner, detailing how it relies on GPS and atomic clocks to provide two of its most innovative features: lock-free strong (current) reads and global snapshots that are consistent with external events.

37:49

Add ALL the Things: Abstract Algebra Meets Analytics

Posted by Avi Bryant on Nov 20, 2013

Avi Bryant discusses how the laws of group theory provide a useful codification of the practical lessons of building efficient distributed and real-time aggregation systems.

43:00

Partitions for Everyone!

Posted by Kyle Kingsbury on Nov 19, 2013

Kyle Kingsbury discusses some of the limitations found in distributed systems and the way some of them behave under partitioning.

52:24

Big Data Platform as a Service at Netflix

Posted by Jeff Magnusson  on  Nov 18, 2013

Jeff Magnusson takes a deep dive into key services of Netflix’s “data platform as a service” architecture, including RESTful services that: provide comprehensive metadata management across data sources (Franklin); enable visualization and caching of results of Hadoop jobs (Sting); and visualize the execution plans produced by languages such as Pig and Hive (Lipstick).

59:10

Scaling out with Akka Actors

Posted by Joshua Suereth on Oct 31, 2013

Joshua Suereth designs a scalable distributed search service with Akka and Scala using actors, and covers practical aspects of how to scale out with Akka’s clustering API.

53:38

High Speed Smart Data Ingest into Hadoop

Posted by Oleg Zhurakousky  on  Oct 24, 2013

Oleg Zhurakousky discusses architectural tradeoffs and alternative implementations of real-time high speed data ingest into Hadoop.

33:44

The Free Lunch Is Over, Again

Posted by Andy Gross  on  Oct 20, 2013

Andy Gross discusses the challenges introduced by distributed systems and the need for developing new skills and tools for dealing with them.

28:12

A Guide to Python Frameworks for Hadoop

Posted by Uri Laserson  on  Oct 03, 2013

Uri Laserson reviews the available Python frameworks for Hadoop, comparing their performance, ease of use and installation, implementation differences, and other features.

35:50

Leveraging Your Hadoop Cluster Better - Running Performant Code at Scale

Posted by Michael Kopp  on  Aug 16, 2013

Michael Kopp explains how to run performant code at scale with Hadoop and how to analyze and optimize Hadoop jobs.

44:03

Lessons Learned Building Storm

Posted by Nathan Marz on Aug 11, 2013

Nathan Marz shares lessons learned building Storm, an open-source, distributed, real-time computation system.

30:33

Building Applications using Apache Hadoop

Posted by Eli Collins  on  Aug 11, 2013

Eli Collins gives an overview of how to build new applications with Hadoop and how to integrate Hadoop with existing applications, providing an update on the state of the Hadoop ecosystem, frameworks, and APIs.

46:43

Copious Data, the "Killer App" for Functional Programming

Posted by Dean Wampler on Aug 03, 2013

Dean Wampler argues for using functional programming and its core operations to process large amounts of data, explaining why Java’s dominance in Hadoop is harming Big Data’s progress.

InfoQ.com and all content copyright © 2006-2014 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.