Cloudera Seeks to Make Hadoop More Accessible With Packaged Distribution

by Scott Delap on Mar 16, 2009. Estimated reading time: 1 minute


Numerous projects have sprouted up around Hadoop, the popular open source implementation of MapReduce, in the last year. Now Cloudera is releasing the Cloudera Distribution for Hadoop, an open source product that seeks to make it easier for companies to begin using Hadoop. From the press release:

...The Cloudera Distribution for Hadoop is freely available for download and immediate use. The product is distributed as a pre-packaged RPM bundle for Red Hat Linux systems or an Amazon EC2 image. To make Hadoop easy to install and use, Cloudera is launching a new portal called my.cloudera.com where people can use a Web-based configuration tool to create custom packages that are optimized to their specific needs. Settings for the cluster can also be saved on the portal to enable automatic updates. There is no charge to use my.cloudera.com. The RPM packages and EC2 images are freely distributed under the Apache 2 software license...

Cloudera is also making a pre-configured VMware image freely available for evaluation and use with their free online training (http://www.cloudera.com/hadoop-training). People who want to test the Cloudera Distribution for Hadoop or learn more about Hadoop and Cloudera’s online training can download the image and run it on their Linux, Mac or Windows desktop. The image ships with example code and all the components needed to use the Cloudera Distribution for Hadoop, including a master server and single node.

All forms of the distribution currently include:

    HDFS - Hadoop Distributed File System, a distributed and fault-tolerant file system designed to run on commodity hardware. HDFS assumes that hardware failure is normal and provides quick detection and automatic recovery.

    MapReduce - divides applications into many small blocks of work for automatic parallelization and execution on large clusters (a minimal example follows this list).

    Hive - a data warehousing infrastructure built on top of Hadoop that provides tools for easy data summary generation, ad hoc querying, and analysis. Hive comes with HQL, a simple query language based on SQL.
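
To give a rough sense of the MapReduce programming model the distribution ships, the sketch below shows a minimal word-count job written against the standard Hadoop Java API. The class names (WordCount, TokenizerMapper, IntSumReducer) and the command-line input/output paths are illustrative choices for this example, not part of Cloudera's packaging.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Mapper: emits (word, 1) for every token in each input line.
      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {

        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);
          }
        }
      }

      // Reducer: sums the counts emitted for each word.
      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {

        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory, e.g. in HDFS
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory, must not exist yet
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

In a typical setup such a job would be packaged as a jar, submitted with the hadoop command-line tool, and would read its input from and write its results to HDFS, with the framework handling the parallel execution across the cluster.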

More information can be found at www.cloudera.com/hadoop.
