In his new article, Josh Wills introduces Crunch, a new Apache incubating project that provides a Java library for creating MapReduce pipelines. Crunch is built on a set of high-level abstractions that simplify MapReduce application design, and it provides a library of patterns for implementing common tasks such as data joins, aggregations, and sorting.
Hadoop MapReduce jobs have a unique code architecture that raises interesting issues for test-driven development. In this article, Michael Spicuzza provides a real-world example that uses MRUnit, Mockito, and PowerMock to solve these problems.
Stefan Edlich reviews NoSQL, considering its evolution, financial impact, standards or the lack thereof, the current landscape, books, the leaders, and some newcomers, and concludes that NoSQL is here to stay.
In this article, authors Arun Viswanathan and Shruthi Kumar discuss how to implement common aggregation functions on a MongoDB document database using its MapReduce functionality.
The InputFormat class provides a powerful mechanism for tighter control over map execution in MapReduce jobs. In this article, the authors show how to leverage this mechanism to solve specific problems.
Matrix presents a white paper on using the open source tool Hadoop to implement a MapReduce strategy and a cloud computing strategy for solving business intelligence problems.
In this article, Boris Lublinsky explains how Grid computing can be used in the overall SOA architecture, and introduces a programming model for Grid utilization in the implementation of SOA services.