InfoQ

All content and news on InfoQ about Hadoop


Latest featured content about Hadoop

Yahoo's Doug Cutting on MapReduce and the Future of Hadoop

Community: Java
Topics: Data Access, Clustering & Caching

InfoQ's lead Java editor, Scott Delap, recently caught up with Hadoop project lead Doug Cutting. Hadoop is an open source distributed computing platform that includes implementations of MapReduce and a distributed file system. In this special InfoQ interview, Cutting discusses how Hadoop is used at Yahoo, the challenges of its development, and the future direction of the project.

News about Hadoop

HBase Leads Discuss Hadoop, BigTable and Distributed Databases

Community: Java
Topics: Data Access, Cloud Computing, Database Design

Google's recent introduction of Google App Engine has created renewed interest in alternative database technologies. InfoQ recently sat down with the leads of HBase, an open-source, distributed data store modeled after Google's BigTable.

Hypertable Lead Discusses Hadoop and Distributed Databases

Community: Architecture, Java
Topics: Data Access, Cloud Computing, Clustering & Caching

Two open source projects related to Hadoop, HBase and Hypertable, provide BigTable-inspired scalable database implementations. InfoQ sat down with Doug Judd, Principal Search Architect at Zvents, Inc. and Hypertable project founder, to discuss Hypertable's implementation.

Lucene 2.3: Large indexing performance improvements, new machine-learning project

Community: Java
Topics: Search, Open Source

The Apache Lucene project, a high-performance full-featured text search engine library written entirely in Java, released version 2.3 today. InfoQ spoke with committer and Project Management Committee (PMC) member Grant Ingersoll to learn more about this release and the future plans for Lucene.

MapReduce A Step Backwards: Is Comparison to Relational Databases Fair?

Community: Java
Topics: Grid Computing

A recent article on the Database Column by David J. DeWitt and Michael Stonebraker attempts to compare the increasingly popular MapReduce programming paradigm to a relational database. The blogosphere has quickly called foul on the comparison and its reasoning.

Interview: Yahoo's Doug Cutting on MapReduce and the Future of Hadoop

Community: Java
Topics: Data Access, Clustering & Caching

In this special InfoQ interview, Hadoop project lead Doug Cutting discusses MapReduce, the benefits of open source, and the future direction of the project.

Open Source Google-Like Infrastructure Project Hadoop Gains Momentum

Community: Java
Topics: Clustering & Caching, Grid Computing

While it has been in existence for over a year, the open source Google-like infrastructure project Hadoop is only now receiving wider notice from the development community. Recently, Yahoo's Jeremy Zawodny provided a status update showing benchmark performance improving by 20x in the last year.

MapReduce Gaining Traction: Tools Plugin Released for Eclipse and Amazon EC2 Support

Community: Java
Topics: Performance & Scalability, Clustering & Caching

IBM's alphaWorks website has released an Eclipse plugin to simplify the development of applications using Hadoop, the open source Java MapReduce framework. Work has also been done to allow Hadoop applications to run easily on Amazon's EC2 and S3 platforms for processing and storage.

Run Your Own Google Style Computing Cluster with Hadoop and Amazon EC2

Community: Java
Topics: Clustering & Caching, Grid Computing

Amazon's Elastic Compute Cloud (EC2) allows developers to acquire computing power at the rate of $0.10 per hour consumed. Work has been done to allow Hadoop, an open source MapReduce implementation written in Java, to run on EC2. This combination will allow developers to write scalable algorithms and then bring up large numbers of servers to run them as needed.