41:39
Data Science

Hydrator: Open Source, Code-Free Data Pipelines

Posted by Jonathan Gray on Oct 23, 2016

Jonathan Gray introduces Hydrator, an open source framework and user interface for creating data lakes and for building and managing data pipelines on Spark, MapReduce, Spark Streaming and Tigon.

19:02

Translating Imperative Code to MapReduce

Posted by Cosmin Radoi, Rodric Rabbah, Stephen J Fink, Manu Sridharan on Jun 10, 2015

The authors present an approach for automatic translation of sequential, imperative code into a parallel MapReduce framework using Mold, translating Java code to run on Apache Spark.

01:31:56

Hadoop 201 -- Deeper into the Elephant

Posted by Roman Shaposhnik on Dec 28, 2014

Roman Shaposhnik discusses more advanced features of HDFS, in addition to how YARN has enabled businesses to massively scale their systems beyond what was previously possible.

42:05

Why Spark Is the Next Top (Compute) Model

Posted by Dean Wampler on Dec 15, 2014

Dean Wampler argues that Spark/Scala is a better data processing engine than MapReduce/Java because tools inspired by mathematics, such as FP, are ideal tools for working with data.

01:21:15

Spring XD for Real-time Hadoop Workload Analysis

Posted by Vineet Goel, Girish Lingappa, Rodrigo Meneses on Nov 30, 2014

The authors explain how the Pivotal team leveraged familiar SQL-based queries to analyze fine-grained cluster utilization using Spring XD.

19:49

Getting Real with the MapR Platform

Posted by Jim Scott on Nov 09, 2014

Jim Scott keynotes on the history of Hadoop and the difficulties this technology has gone through, exploring why enterprises need to evaluate their targets and prepare for the future.

37:17

JS Optimization Techniques

Posted by Guillaume Lathoud on Jun 19, 2014

Guillaume Lathoud suggests expanding JavaScript with mutual tail-call optimization, map/filter/reduce and math computations to obtain faster code.

51:26

Scaling Pinterest

Posted by Yash Nelapati, Marty Weiner on Dec 30, 2013

Details on Pinterest's architecture, its systems (Pinball, Frontdoor), and its stack: MongoDB, Cassandra, Memcache, Redis, Flume, Kafka, EMR, Qubole, Redshift, Python, Java, Go, Nutcracker, Puppet, etc.

38:11

REEF: Retainable Evaluator Execution Framework

Posted by Rusty Sears on Dec 10, 2013

Rusty Sears introduces REEF along with examples of computational frameworks, including interactive sessions, iterative graph processing, bulk synchronous computations, Hive queries, and MapReduce.

39:26

Exercises in Style

Posted by Crista Lopes on Nov 13, 2013

Crista Lopes writes a program in multiple styles (monolithic, OOP, continuations, relational, Pub-Sub, Monads, AOP, Map-reduce), showing the value of using more than one style in large scale systems.

MapReduce and Its Discontents

Posted by Dean Wampler on Oct 05, 2012

Dean Wampler discusses the strengths and weaknesses of MapReduce, and the newer variants for big data processing: Pregel and Storm.

Approachable Concurrency for the JVM with Groovy Parallel Systems

Posted by Dierk König on Mar 16, 2012

Dierk König introduces GPars, Groovy’s library for concurrent programming, explaining a simpler and less error-prone way to use fork/join, map/reduce, actors, and dataflow in Java and Groovy.
