45:01

The Next Wave of SQL-on-Hadoop: The Hadoop Data Warehouse

Posted by Marcel Kornacker  on  Jul 09, 2014

Marcel Kornacker presents a case study of an EDW built on Impala running on 45 nodes, reducing processing time from hours to seconds and consolidating multiple data sets into a single view.

45:22

Next Gen Hadoop

Posted by Akmal B. Chaudhri  on  Apr 22, 2014

Akmal B. Chaudhri introduces Apache™ Hadoop® 2.0 and Yet Another Resource Negotiator (YARN).

46:49

Data & Infrastructure at Airbnb

Posted by Brenden Matthews  on  Dec 31, 2013

Brenden Matthews describes the infrastructure built at Airbnb using Mesos in order to support Hadoop and Storm.

36:51

Graph Computing at Scale

Posted by Matthias Broecheler  on  Dec 27, 2013

Matthias Broecheler discusses graph computing and introduces the Aurelius graph cluster, which enables graph computing at scale by building on distributed systems like Cassandra, HBase, and Hadoop.

38:16

Apache Tez: Accelerating Hadoop Query Processing

Posted by Bikas Saha, Arun Murthy  on  Dec 05, 2013

Bikas Saha and Arun Murthy detail the design of Tez, highlighting some of its features and sharing some of the initial results obtained by Hive on Tez.

53:38

High Speed Smart Data Ingest into Hadoop

Posted by Oleg Zhurakousky  on  Oct 24, 2013

Oleg Zhurakousky discusses architectural tradeoffs and alternative implementations of real-time high speed data ingest into Hadoop.

28:12

A Guide to Python Frameworks for Hadoop

Posted by Uri Laserson  on  Oct 03, 2013

Uri Laserson reviews the different available Python frameworks for Hadoop, including a comparison of performance, ease of use/installation, differences in implementation, and other features.

35:50

Leveraging Your Hadoop Cluster Better - Running Performant Code at Scale

Posted by Michael Kopp  on  Aug 16, 2013

Michael Kopp explains how to run performant code at scale with Hadoop and how to analyze and optimize Hadoop jobs.

44:36

Running the Largest Hadoop DFS Cluster

Posted by Hairong Kuang  on  Mar 15, 2013

Hairong Kuang explains how Facebook uses HDFS to store and analyze over 100PB of user log data.

37:29

Making Hadoop Real Time with Scala & GridGain

Posted by Nikita Ivanov  on  Mar 04, 2013

Nikita Ivanov shows how to add real-time capabilities to Hadoop through a demo application that performs streaming word counting on a two-node cluster.

24:56

Building an Impenetrable ZooKeeper

Posted by Kathleen Ting  on  Feb 13, 2013

Kathleen Ting details 8 misconfigurations that can bring ZooKeeper down.

Storm: Distributed and Fault-Tolerant Real-time Computation

Posted by Nathan Marz  on  Jan 04, 2013

Nathan Marz introduces Twitter Storm, outlining its architecture and use cases, and takes a look at future features to be made available.

InfoQ.com and all content copyright © 2006-2014 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.