MapReduce Content on InfoQ

Presentations

AI, ML & Data Engineering

Hydrator: Open Source, Code-Free Data Pipelines

Jonathan Gray introduces Hydrator, an open source framework and user interface for creating data lakes and for building and managing data pipelines on Spark, MapReduce, Spark Streaming and Tigon.

Jonathan Gray
on Oct 23, 2016

41:39
Translating Imperative Code to MapReduce

The authors present an approach for automatic translation of sequential, imperative code into a parallel MapReduce framework using Mold, translating Java code to run on Apache Spark.

Cosmin Radoi Manu Sridharan Stephen J Fink Rodric Rabbah
on Jun 10, 2015

19:02
Hadoop 201 -- Deeper into the Elephant

Roman Shaposhnik discusses more advanced features of HDFS and how YARN has enabled businesses to scale their systems massively beyond what was previously possible.

Roman Shaposhnik
on Dec 28, 2014

01:31:56
Why Spark Is the Next Top (Compute) Model

Dean Wampler argues that Spark/Scala is a better data processing engine than MapReduce/Java because tools inspired by mathematics, such as functional programming (FP), are ideal for working with data.

Dean Wampler
on Dec 15, 2014

42:05
Spring XD for Real-time Hadoop Workload Analysis

The authors explain how the Pivotal team leveraged familiar SQL-based queries to analyze fine-grained cluster utilization using Spring XD.

Vineet Goel Rodrigo Meneses Girish Lingappa
on Nov 30, 2014

01:21:15
Getting Real with the MapR Platform

Jim Scott keynotes on the history of Hadoop and the difficulties this technology has gone through, and explores why enterprises need to evaluate their targets and prepare for the future.

Jim Scott
on Nov 09, 2014

19:49
JS Optimization Techniques

Guillaume Lathoud suggests expanding JavaScript with mutual tail-call optimization, map/filter/reduce and math computations to obtain faster code.

Guillaume Lathoud
on Jun 19, 2014

37:17
Scaling Pinterest

Details on Pinterest's architecture, its systems (Pinball, Frontdoor), and its stack: MongoDB, Cassandra, Memcache, Redis, Flume, Kafka, EMR, Qubole, Redshift, Python, Java, Go, Nutcracker, Puppet, etc.

Marty Weiner Yash Nelapati
on Dec 30, 2013

51:26
REEF: Retainable Evaluator Execution Framework

Rusty Sears introduces REEF along with examples of computational frameworks, including interactive sessions, iterative graph processing, bulk synchronous computations, Hive queries, and MapReduce.

Rusty Sears
on Dec 10, 2013

38:11
Exercises in Style

Crista Lopes writes a program in multiple styles (monolithic, OOP, continuations, relational, Pub-Sub, Monads, AOP, Map-reduce), showing the value of using more than one style in large-scale systems.

Crista Lopes
on Nov 13, 2013

39:26
MapReduce and Its Discontents

Dean Wampler discusses the strengths and weaknesses of MapReduce, and the newer variants for big data processing: Pregel and Storm.

Dean Wampler
on Oct 05, 2012

48:41
Wrap Your SQL Head Around Riak MapReduce

Sean Cribbs explains what Map-Reduce and Riak are, why and how to use Map-Reduce with Riak, and how to convert SQL queries into their Map-Reduce equivalents.

Sean Cribbs
on Feb 03, 2012

34:34