InfoQ

Hadoop Content on InfoQ


Latest featured content about Hadoop

All things Hadoop

Topics
NoSQL,
Big Data

In this interview, Ted Dunning talks about Hadoop, its current usage, and its future. He explains the reasons for Hadoop's success and makes recommendations on how to start using it.

News about Hadoop

MapReduce Patterns, Algorithms, and Use Cases

Topics
Map-Reduce,
Design Pattern,
Big Data

In his new article “MapReduce Patterns, Algorithms, and Use Cases”, Ilya Katsov gives a systematic view of the different MapReduce patterns, algorithms, and techniques that can be found on the web or in scientific articles, along with several practical case studies.

Apache Hadoop 1.0.0 Supports Kerberos Authentication, Apache HBase and RESTful API to HDFS

Topics
Announcements,
NoSQL,
Big Data

After six years of gestation, the big data framework Apache Hadoop 1.0.0 was recently released. Core features in the release include Kerberos authentication, support for Apache HBase, and a RESTful API to HDFS. InfoQ spoke with Arun Murthy, VP of Apache Hadoop, about the new release.

Articles about Hadoop

Exploring Hadoop OutputFormat

Topics
Big Data

As more companies adopt Hadoop, its integration with other applications is becoming more important. A key to such integration is choosing an appropriate OutputFormat, which lets a job produce its output data in the form best suited to those applications.

Extending Oozie

Topics
Java,
Business Process Management,
Big Data

In this article, the authors show how to leverage Oozie's extensibility to implement custom language extensions. This approach can be viewed as specializing the workflow language for a given company or line of business.

Presentations about Hadoop

NoSQL at Twitter

Topics
Operations,
NoSQL,
Performance & Scalability,
Architecture

Ryan King presents how Twitter uses NoSQL technologies (Gizzard, Cassandra, Hadoop, Redis) to deal with growing data volumes that force it to scale out beyond what traditional SQL databases have to offer.

NoSQL at Twitter

Topics
NoSQL,
Architecture

Kevin Weil presents how Twitter does data analysis using Scribe for logging, base analysis with Pig/Hadoop, and specialized data analysis with HBase, Cassandra, and FlockDB.

Interviews about Hadoop

Ville Tuulos on Big Data and Map/Reduce in Erlang and Python with Disco

Topics
Map-Reduce,
Dynamic Languages,
Open Source,
Parallel Programming,
Ruby,
Language,
Big Data,
Fault Tolerance,
Architecture,
Performance & Scalability

Ville Tuulos talks about Disco, the Map/Reduce framework for Python and Erlang, real-world data mining with Python, the advantages of Erlang for distributed and fault tolerant software, and more.

Ron Bodkin on Big Data and Analytics

Topics
Map-Reduce,
Machine Learning,
Operations,
Big Data,
Architecture

Ron Bodkin discusses big data architecture, real-time analytics, batch processing, map-reduce, and data science.