InfoQ

MapReduce Gaining Traction: Tools Plugin Released for Eclipse and Amazon EC2 Support

Posted by Scott Delap on Mar 28, 2007

Sections: Development, Architecture & Design
Topics: Performance & Scalability, Clustering & Caching, Java
Tags: Amazon, EC2, Hadoop

IBM's alphaWorks website has released an Eclipse plugin that simplifies the development of applications using Hadoop, the open source Java MapReduce framework. Hadoop, which was originally created to support Nutch, includes a distributed filesystem and an implementation of the MapReduce programming model used extensively by Google for parallel processing of large data sets across a cluster. This year, integration work has made it easy to run Hadoop MapReduce applications on Amazon's EC2 platform and to use Amazon's S3 platform for storage. The Amazon Web Services blog notes: "Because bandwidth between EC2 instances and data stored in S3 is not metered or billed, this is a very cost-effective way to process large amounts of data."
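
For readers new to the MapReduce model mentioned above, the following is a minimal single-process sketch of its three steps (map, group by key, reduce) applied to word counting. It deliberately uses only the Java standard library and not the Hadoop API; the class name `WordCountSketch` and its methods are hypothetical names chosen for illustration, and a real Hadoop job would distribute the same steps across a cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a plain-Java imitation of the MapReduce steps,
// NOT Hadoop's API. All names here are assumptions for this sketch.
public class WordCountSketch {

    // Map step: turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Reduce step: combine every value emitted for a single key.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Driver: run map over each line, group values by key, then reduce.
    static Map<String, Integer> run(List<String> lines) {
        // Grouping step (the "shuffle" in a distributed setting).
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("the quick brown fox", "the lazy dog", "The fox");
        // prints {brown=1, dog=1, fox=2, lazy=1, quick=1, the=3}
        System.out.println(run(lines));
    }
}
```

Because each call to `map` depends only on its own line and each call to `reduce` only on one key's values, both steps can be run in parallel on different machines, which is the property a framework such as Hadoop exploits.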

The IBM MapReduce plugin supports the following features:

  • the ability to package and deploy a Java™ project as a JAR (Java Archive) file to a Hadoop server (local and remote)
  • cheat sheets that assist with the development process
  • a separate perspective with a view of Hadoop servers, the Hadoop distributed file system (DFS), and current job status
  • wizards for facilitating the development of classes based on the MapReduce framework.

The plugin also includes improved cheat sheets and full OS X compatibility. It uses SCP and SSH to interact with Hadoop servers and HTTP to poll job status.
