  • Hadoop and Metadata (Removing the Impedance Mis-match)

    by Alan Gates, Russell Jurney on  Sep 26, 2012

    The new Apache HCatalog project is a table and storage management layer for Hadoop that lets different data processing tools – Pig, MapReduce, and Hive – share data more easily. HCatalog presents users with a relational view of data, so they need not worry about where their data is stored or in what format – RCFile, text files, or sequence files.

  • Transitioning from RDBMS to NoSQL. Interview with Couchbase’s Dipti Borkar

    by Abel Avram on  Sep 08, 2012

    Relational databases have been used for decades to store data and remain a viable solution for many use cases, but NoSQL is increasingly chosen today, especially for scalability and performance reasons. This article contains an interview with Dipti Borkar, Director of Product Management at Couchbase, on the challenges, benefits, and process of migrating from RDBMS to NoSQL.

  • Implementing Aggregation Functions in MongoDB

    by Arun Viswanathan, Shruthi Kumar on  Jun 20, 2012

    In this article, authors Arun Viswanathan and Shruthi Kumar discuss how to implement common aggregation functions on a MongoDB document database using its MapReduce functionality. They also discuss a typical application of aggregations: business reporting of sales data.

Evolution in Data Integration From EII to Big Data

Posted by JP Morgenthal on  Feb 22, 2012

Approaches to integrating data are changing with the emergence of cloud computing.

Implementing Lucene Spatial Support

Posted by Boris Lublinsky on  Jan 13, 2012

In this article, Boris Lublinsky shows how to extend an HBase-based Lucene implementation with geospatial search support.

Exploring Hadoop OutputFormat

Posted by Jim.Blomo on  Dec 07, 2011

Using a custom Hadoop OutputFormat makes it possible to produce output data in the form most appropriate for other applications.

Uncovering mysteries of InputFormat: Providing better control for your Map Reduce execution.

Posted by Boris Lublinsky, Mike Segel on  Nov 04, 2011

The InputFormat class provides a powerful mechanism for tighter control of map execution in MapReduce jobs. In this article, the authors show how to leverage this mechanism to solve specific problems.

Extending Oozie

Posted by Boris Lublinsky, Mike Segel on  Aug 02, 2011

In this article, the authors show how to extend Oozie by introducing custom actions specific to a given company or line of business.

Oozie by Example

Posted by Boris Lublinsky, Mike Segel on  Jul 18, 2011

A complete Oozie example demonstrating language features and their usage in real-world scenarios.

Data Mining in the Swamp: Taming Unruly Data With Cloud Computing

Posted by John Brothers on  Aug 13, 2010

Matrix presents a white paper on using the open source tool Hadoop to implement the MapReduce strategy, together with a cloud computing strategy, to solve business intelligence problems.

SOA Agents: Grid Computing meets SOA

Posted by Boris Lublinsky on  Dec 11, 2008

In this article, Boris Lublinsky explains how Grid computing can be used in the overall SOA architecture, and introduces a programming model for Grid utilization in the implementation of SOA services.

Editorial and all content copyright © 2006-2013 C4Media Inc.