In his new article, Josh Wills introduces Crunch, a new Apache incubating project that provides a Java library for creating MapReduce pipelines. Crunch is built on a set of high-level abstractions that simplify MapReduce application design, and it provides a library of patterns for common tasks such as data joins, aggregations, and sorting.
Hadoop MapReduce jobs have a unique code architecture that raises interesting issues for test-driven development. In this article, Michael Spicuzza shows how to use MRUnit to solve these problems.
Stefan Edlich reviews NoSQL, considering its evolution, financial impact, standards or the lack of them, the current landscape, books, the leaders, and some newcomers, and concludes that NoSQL is here to stay.
In this article, authors Arun Viswanathan and Shruthi Kumar discuss how to implement common aggregation functions on a MongoDB document database using its MapReduce functionality.
The InputFormat class provides a powerful mechanism for tighter control over map execution in MapReduce jobs. In this article, the authors show how to leverage this mechanism to solve specific problems.
Matrix presents a white paper on using the open source tool Hadoop to implement a MapReduce strategy and a cloud computing strategy for solving business intelligence problems.
In this article, Boris Lublinsky explains how Grid computing can be used in the overall SOA architecture, and introduces a programming model for Grid utilization in the implementation of SOA services.