Rich Hickey, the author of Clojure, explains the information model of Datomic - a new database designed as a composition of simple services, combining the capabilities of RDBMS and scalability of NoSQL.
In his new article, Josh Wills introduces Crunch - a new Apache incubating project providing a Java library for creating MapReduce pipelines. Crunch is based on a set of high-level abstractions that simplify the design of MapReduce applications, and it provides a library of patterns for implementing common tasks like data joins, aggregations, and sorting.
Hadoop MapReduce jobs have a unique code architecture that raises interesting issues for test-driven development. In this article Michael Spicuzza provides a real-world example using MRUnit, Mockito, and PowerMock to solve these problems.
InfoQ spoke with the authors of the book NoSQL Distilled, Pramod Sadalage and Martin Fowler, about the NoSQL database space and emerging trends in NoSQL.
Stefan Edlich reviews NoSQL, considering its evolution, financial impact, standards or the lack thereof, current landscape, books, the leaders and some newcomers, concluding that NoSQL is here to stay.
In this virtual panel, InfoQ talks to several Hadoop vendors and users about their views on the current and future state of Hadoop.
Rich Hickey, the author of Clojure, explains the architecture of Datomic - a new database designed as a composition of simple services, combining the capabilities of RDBMS and scalability of NoSQL.
Open source web-search framework Apache Nutch version 2 supports link-graph database and HTML parsing. InfoQ spoke with Julien Nioche, VP of the Apache Nutch project, about the new features.
In his new article, Jonathan Natkins explains how to use components of Apache Hadoop, including Flume, Hive, and Oozie, to implement a typical data management system.
Most ORM libraries make you write a new class for each item you want to keep in the database, extending this and that for no apparent reason. arrayDB looks at simplifying the whole process.
This article answers the question, is cloud computing really all that hard?
The new Apache HCatalog project provides a metadata and table management system for the Hadoop ecosystem, simplifying data interoperability between different data processing tools.