
Google Open Sources MapReduce Framework for C to Run Native Code in Hadoop

by Srini Penchikala on Feb 25, 2015. Estimated reading time: 1 minute

Last week Google announced the release of MR4C, an open source MapReduce framework for C that allows developers to run native code in the Hadoop framework.

The MR4C framework brings together the performance and flexibility of natively developed algorithms with the scalability and throughput provided by the Hadoop execution framework. The goal of the project is to abstract away the details of the MapReduce framework so that users can focus on developing custom algorithms.

The framework was originally developed by the Skybox team for satellite image processing and geospatial data science use cases. The team wanted to combine image processing libraries written in C and C++ with Hadoop's job tracking and cluster management capabilities, which are well suited to scalable data handling.

In MR4C, algorithms are stored in native shared objects that access data from a local file or a uniform resource identifier (URI). Input/output datasets, runtime parameters, and any external libraries are configured using JavaScript Object Notation (JSON) files. Mapper splitting and resource allocation can be configured with Apache YARN based tools (for Hadoop v2) or at the cluster level for MapReduce Version 1 (MRv1). Workflows of multiple algorithms can be chained together using an automatically generated configuration. The framework also supports callbacks for logging and progress reporting, which can be viewed in the Hadoop JobTracker interface. Workflows can be tested on a local machine using the same interface employed on the target Hadoop cluster.
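The idea of JSON-configured algorithms chained into a workflow can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the field names (`algorithm`, `input`, `output`, `params`), the `ALGORITHMS` registry, and the `run_workflow` helper are hypothetical and do not reflect MR4C's actual JSON schema or API, and plain Python functions stand in for the native shared objects.

```python
import json

# Hypothetical stand-ins: in MR4C the algorithms live in native shared
# objects; ordinary Python functions play that role in this sketch.
def scale(values, params):
    """Multiply every value by the configured factor."""
    return [v * params["factor"] for v in values]

def threshold(values, params):
    """Keep only values at or above the configured minimum."""
    return [v for v in values if v >= params["min"]]

# Hypothetical registry mapping a configured name to an algorithm.
ALGORITHMS = {"scale": scale, "threshold": threshold}

# Illustrative configuration only; NOT the real MR4C JSON format.
# Each step names an algorithm, its input and output datasets, and parameters.
WORKFLOW_JSON = """
[
  {"algorithm": "scale", "input": "raw", "output": "scaled",
   "params": {"factor": 2}},
  {"algorithm": "threshold", "input": "scaled", "output": "kept",
   "params": {"min": 5}}
]
"""

def run_workflow(config_text, datasets):
    """Run each configured step in order, reading and writing named datasets."""
    datasets = dict(datasets)  # work on a copy so the caller's dict is untouched
    for step in json.loads(config_text):
        algorithm = ALGORITHMS[step["algorithm"]]
        datasets[step["output"]] = algorithm(
            datasets[step["input"]], step["params"]
        )
    return datasets

result = run_workflow(WORKFLOW_JSON, {"raw": [1, 2, 3, 4]})
print(result["kept"])  # prints [6, 8]
```

Because the second step reads the dataset the first step writes, the two algorithms form a small workflow driven entirely by configuration, which is the general shape the article describes.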

For more details on the framework, see the documentation and source code on the MR4C GitHub site. If you are interested in contributing to the project, the team has created a web page to help project contributors.

 


Mapreduce Framework by Sonam Gupta

Thanks for your post! Google first formulated the framework for the purpose of serving Google’s Web page indexing, and the new framework replaced earlier indexing algorithms. Beginner developers find the MapReduce framework beneficial because library routines can be used to create parallel programs without any worries about intra-cluster communication, task monitoring or failure handling processes. More at www.youtube.com/watch?v=1jMR4cHBwZE
