Hadoop Content on InfoQ
-
MapReduce and Its Discontents
Dean Wampler discusses the strengths and weaknesses of MapReduce, and the newer variants for big data processing: Pregel and Storm.
-
Hadoop: Scalable Infrastructure for Big Data
Parand Tony Darugar gives an overview of Hadoop, its processing model, and the associated ecosystem and tools, and discusses some real-life uses of Hadoop for analyzing and processing large amounts of data.
-
Big Data Architectures at Facebook
Ashish Thusoo presents the data scalability issues at Facebook and the data architecture evolution from EDW to Hadoop to Puma.
-
NetApp Case Study
Kumar Palaniapan and Scott Fleming present how NetApp deals with big data using Hadoop, HBase, Flume, and Solr, collecting and analyzing TBs of log data with Think Big Analytics.
-
Hadoop and Cassandra, Sitting in a Tree ...
Jake Luciani introduces Brisk, a Hadoop and Hive distribution using Cassandra for core services and storage, presenting the benefits of running Hadoop in a peer-to-peer masterless architecture.
-
Grid Gain vs. Hadoop. Why Elephants Can't Fly
Dmitriy Setrakyan introduces GridGain, compares it with Hadoop, and outlines the cases where it is a better fit, accompanied by a live demo showing how to set up a GridGain job.
-
Distributed Data Analysis with Hadoop and R
Jonathan Seidman and Ramesh Venkataramaiah present how they run R on Hadoop in order to perform distributed analysis on large data sets, including some alternatives to their solution.
-
Panel: Hadoop for the Enterprise Architect
Peter Sirota, Amr Awadallah, Eric Baldeschwieler, Ted Dunning, Guy Bayes, and moderator Ron Bodkin discuss various existing Hadoop use cases, ecosystems, and disaster recovery.
-
NoSQL at Twitter
Ryan King presents how Twitter uses NoSQL technologies - Gizzard, Cassandra, Hadoop, Redis - to deal with growing data volumes that force it to scale out beyond what traditional SQL has to offer.
-
NoSQL at Twitter
Kevin Weil presents how Twitter does data analysis using Scribe for logging, base analysis with Pig/Hadoop, and specialized data analysis with HBase, Cassandra, and FlockDB.
-
Large Scale Map-Reduce Data Processing at Quantcast
Ron Bodkin presents the architecture Quantcast uses to process hundreds of TB of data daily with Hadoop on dedicated systems, covering the applications, the types of data processed, and the infrastructure used.
-
Social Networks: Getting Distributed Web Services Done with NoSQL
Lars George and Fabrizio Schmidt present Germany’s largest social networks (Schuelervz, Studivz and Meinvz), covering the initial architecture, why it didn’t work, and how they solved it with a NoSQL solution.