Hadoop MapReduce jobs have a unique code architecture that raises interesting issues for test-driven development. In this article Michael Spicuzza provides a real-world example using MRUnit, Mockito, and PowerMock to solve these problems.
Stefan Edlich, Senior Lecturer at Beuth HS of Technology Berlin, Germany, reviews NoSQL, considering its evolution, its financial impact, standards or the lack thereof, the current landscape, books, the leaders, and some newcomers, and concludes that NoSQL is here to stay.
In this virtual panel, InfoQ talks to several Hadoop vendors and users about their views on the current and future state of Hadoop.
Rich Hickey, the author of Clojure, explains the architecture of Datomic, a new database designed as a composition of simple services that combines the capabilities of an RDBMS with the scalability of NoSQL.
Version 2 of Apache Nutch, the open source web-search framework, supports a link-graph database and HTML parsing. InfoQ spoke with Julien Nioche, VP of the Apache Nutch project, about the new features.
In his new article, Jonathan Natkins explains how to use components of Apache Hadoop, including Flume, Hive, and Oozie, to implement a typical data management system.
This article answers the question, is cloud computing really all that hard?
The new Apache HCatalog provides a metadata and table management system for the Hadoop ecosystem, simplifying data interoperability between different data processing tools.
This article contains an interview with Dipti Borkar, Director of Product Management at Couchbase, on the challenges, benefits, and process of migrating from an RDBMS to NoSQL.
In this article, authors Arun Viswanathan and Shruthi Kumar discuss how to implement common aggregation functions on a MongoDB document database using its MapReduce functionality.
Approaches to integrating data are changing with the emergence of cloud computing.