InfoQ

MapReduce Gaining Traction: Tools Plugin Released for Eclipse and Amazon EC2 Support

Posted by Scott Delap on Mar 28, 2007 10:56 AM

IBM's alphaWorks website has released an Eclipse plugin that simplifies the development of applications using Hadoop, the open source Java MapReduce framework. Hadoop, which was originally created to support Nutch, includes a distributed filesystem and an implementation of MapReduce, the programming model used extensively by Google for parallel processing of large data sets across a cluster. Integration work this year also makes it easy to run Hadoop MapReduce applications on Amazon's EC2 platform and to use Amazon's S3 platform for storage. The Amazon Web Services blog notes: "Because bandwidth between EC2 instances and data stored in S3 is not metered or billed, this is a very cost-effective way to process large amounts of data."
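For readers new to the model, the following is a minimal sketch of the MapReduce idea in plain Java. It deliberately does not use the Hadoop API: the class name WordCountSketch and its methods are hypothetical, and everything runs in a single process rather than across a cluster. A map step emits (word, 1) pairs, the pairs are grouped by key, and a reduce step sums each group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, single-process illustration of the MapReduce model.
// This is NOT the Hadoop API; it only mirrors the map/group/reduce flow.
public class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every word in one input line.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Reduce phase: combine all values collected for one key into a total.
    public static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Driver: run map on every line, group values by key, then reduce each group.
    public static Map<String, Integer> wordCount(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("the quick brown fox", "the lazy dog", "the fox");
        // prints {brown=1, dog=1, fox=2, lazy=1, quick=1, the=3}
        System.out.println(wordCount(lines));
    }
}
```

In a framework such as Hadoop, the map and reduce calls would be distributed across machines and the grouping step handled by the framework; the sketch only shows the shape of the computation.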

The IBM MapReduce plugin supports the following features:

  • the ability to package and deploy a Java™ project as a JAR (Java Archive) file to a Hadoop server (local and remote)
  • cheat sheets that assist with the development process
  • a separate perspective with a view of Hadoop servers, the Hadoop distributed file system (DFS), and current job status
  • wizards for facilitating the development of classes based on the MapReduce framework.

The plugin also includes improved cheat sheets and full OS X compatibility. It uses SCP and SSH to interact with Hadoop servers and HTTP to poll job status.
