Wrap Your SQL Head Around Riak MapReduce
Sean Cribbs explains what Map-Reduce and Riak are, why and how to use Map-Reduce with Riak, and how to convert SQL queries into their Map-Reduce equivalents.
Sean Cribbs explains what Map-Reduce and Riak are, why and how to use Map-Reduce with Riak, and how to convert SQL queries into their Map-Reduce equivalents.
After six years of gestation, Big data framework Apache Hadoop 1.0.0 was recently released. Core features in the release include Kerberos Authentication, support for Apache HBase and RESTful API to HDFS. InfoQ spoke with Arun Murthy, VP of Apache Hadoop, about the new release.
Corporations are increasingly using social media to learn more about what their customers are saying about their products. This presents unique challenges as unstructured content needs analytic techniques to interpret the sentiment embodied in the blog posts. InfoQ caught up with Subramanian Kartik to learn more about the blog sentiment analysis project his team worked on.

Matrix presents a white paper on using the open source tool, Hadoop, to implement the MapReduce strategy and a Cloud computing strategy to solve business intelligence problems.

Grid technology for improving scalability, high availability and throughput in SOA implementations. In this article, Boris Lublinsky explains how Grid computing can be used in the overall SOA architecture and introduces a programming model for Grid utilization in service implementation. He also introduces an experimental Grid implementation that can support this proposed architecture.
Ron Bodkin presents the architecture used by Quantcast to process 100s of TB of data daily using Hadoop on dedicated systems, the applications, the type of data processed, and the infrastructure used.
Marius Eriksen considers that scalability problems appear when leaky abstractions are used, exemplifying with RDBMS, GC, and threads, presenting abstractions that help dealing with scalability issues: map-reduce, shared-nothing web applications, big table, all providing narrow access to explicit resources.

Ron Bodkin of Big Data Analytics discusses early adoption of Hadoop, NoSQL and big data technologies. He discusses common patterns and explains how developers can write low-level primitives to optimize MapReduce function. Other topics include Hive, Pig, multi tenancy, and security.

In this interview Ted Dunning talk about Hadoop, its current usage and its future. He explains the reasons for Hadoop's success and make recommendations on how to start using it.