42:05

Why Spark Is the Next Top (Compute) Model

Posted by Dean Wampler  on  Dec 15, 2014

Dean Wampler argues that Spark/Scala is a better data processing engine than MapReduce/Java because tools inspired by mathematics, such as functional programming (FP), are ideal for working with data.

01:21:15

Spring XD for Real-time Hadoop Workload Analysis

Posted by Vineet Goel, Girish Lingappa, Rodrigo Meneses  on  Nov 30, 2014

The authors explain how the Pivotal team leveraged familiar SQL-based queries to analyze fine-grained cluster utilization using Spring XD.

19:49

Getting Real with the MapR Platform

Posted by Jim Scott  on  Nov 09, 2014

Jim Scott keynotes on the history of Hadoop and the difficulties this technology has gone through, and explores why enterprises need to evaluate their targets and prepare for the future.

37:17

JS Optimization Techniques

Posted by Guillaume Lathoud  on  Jun 19, 2014

Guillaume Lathoud suggests expanding JavaScript with mutual tail-call optimization, map/filter/reduce and math computations to obtain faster code.

51:26

Scaling Pinterest

Posted by Yash Nelapati, Marty Weiner  on  Dec 30, 2013

Details on Pinterest's architecture, its systems (Pinball, Frontdoor), and its stack: MongoDB, Cassandra, Memcache, Redis, Flume, Kafka, EMR, Qubole, Redshift, Python, Java, Go, Nutcracker, Puppet, etc.

38:11

REEF: Retainable Evaluator Execution Framework

Posted by Rusty Sears  on  Dec 10, 2013

Rusty Sears introduces REEF along with examples of computational frameworks, including interactive sessions, iterative graph processing, bulk synchronous computations, Hive queries, and MapReduce.

39:26

Exercises in Style

Posted by Crista Lopes  on  Nov 13, 2013

Crista Lopes writes a program in multiple styles (monolithic, OOP, continuations, relational, Pub-Sub, Monads, AOP, Map-reduce), showing the value of using more than one style in large-scale systems.

MapReduce and Its Discontents

Posted by Dean Wampler  on  Oct 05, 2012

Dean Wampler discusses the strengths and weaknesses of MapReduce, and the newer variants for big data processing: Pregel and Storm.

Approachable Concurrency for the JVM with Groovy Parallel Systems

Posted by Dierk König  on  Mar 16, 2012

Dierk König introduces GPars, Groovy’s library for concurrent programming, explaining a simpler and less error-prone way to use fork/join, map/reduce, actors, and dataflow in Java and Groovy.

Wrap Your SQL Head Around Riak MapReduce

Posted by Sean Cribbs  on  Feb 03, 2012

Sean Cribbs explains what Map-Reduce and Riak are, why and how to use Map-Reduce with Riak, and how to convert SQL queries into their Map-Reduce equivalents.

Large Scale Map-Reduce Data Processing at Quantcast

Posted by Ron Bodkin  on  Dec 21, 2010

Ron Bodkin presents the architecture Quantcast uses to process hundreds of terabytes of data daily with Hadoop on dedicated systems, covering the applications, the type of data processed, and the infrastructure used.

Abstractions at Scale–Our Experiences at Twitter

Posted by Marius Eriksen  on  Dec 14, 2010

Marius Eriksen argues that leaky abstractions lead to scalability issues, while those providing narrow access to explicit resources (map-reduce, shared-nothing web apps, big table) scale better.

InfoQ.com and all content copyright © 2006-2014 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.