
Big Data Hadoop Solutions, State of Affairs in Q1/2014

by Boris Lublinsky on Mar 04, 2014

According to a new Forrester report, many companies are looking for deeper insights from the massive amounts of structured, unstructured, semi-structured, and binary data that they already have. One of the report's conclusions is that

Most firms estimate that they are only analyzing 12% of the data that they already have, leaving 88% of it on the cutting-room floor. Repressive data silos and a lack of analytics capabilities are key reasons for this. In addition, it’s often impossible to judge what data is valuable and what isn’t. In the age of big data, you have to capture and store it all. Data that might seem completely irrelevant to your business now, such as mobile GPS data, might be a gold mine in the future.

As a result, companies are looking at Hadoop in order to solve the following problems:

  • Capturing and storing all data relevant for the company’s business functionality
  • Supporting advanced analytics capabilities including business intelligence, advanced visualization and predictive analytics to explore data in new ways.
  • Sharing data quickly with all those who need it. Combining data from multiple silos can help an organization find answers to complex questions that no one has previously asked or even knew how to ask.
  • Continuously accommodating greater data volumes and new data sources. Hadoop allows solutions to scale quickly and cost-effectively, enabling them to handle the increasing volume, velocity, and variety of data.
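To make the processing model behind these use cases concrete: Hadoop's MapReduce splits work into a map phase (emit key/value pairs) and a reduce phase (aggregate by key). The sketch below imitates that flow in plain Python with a word-count example; it runs locally for illustration only and is not a real Hadoop job, which would normally be written against the Hadoop Streaming or Java MapReduce APIs.

```python
from collections import defaultdict

def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in an input line."""
    for word in line.split():
        yield word.lower(), 1

def reducer(pairs):
    """Reduce phase: sum the counts emitted for each word (grouping by key)."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

# Illustrative input standing in for lines read from HDFS.
lines = ["big data big insights", "data silos"]
pairs = [p for line in lines for p in mapper(line)]
print(reducer(pairs))  # {'big': 2, 'data': 2, 'insights': 1, 'silos': 1}
```

In a real cluster the framework shuffles the mapper output so that all pairs for a given key reach the same reducer, which is what lets the same code scale across the growing volumes the report describes.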

According to the report, the Hadoop buying cycle is currently on the upswing, attracting more and more vendors into this space. Although Hadoop is an Apache open source project that anyone can download for free, the majority of consumers prefer to get packaged solutions from vendors. In addition to packaging all Hadoop components and ensuring that their versions are compatible, vendors typically provide enterprise-level support, extend and augment core Apache Hadoop (Common, HDFS, MapReduce) with additional implementations, and add differentiated features to make their solutions attractive to enterprises.

Forrester’s report takes a closer look at nine vendors: Amazon Web Services, Cloudera, Hortonworks, IBM, Intel, MapR Technologies, Microsoft, Pivotal Software, and Teradata, evaluating them based on the following criteria:

  • Current offering, including the solution’s architecture, data and processing features, setup, management, and monitoring tools, and compatibility and community features.
  • Strategy, including the vendor’s plan to meet current customer demands and fill gaps for enterprise deployments. Strategy evaluation included acquisition options, internal ability to execute the strategy, product roadmap, and customer support capabilities.
  • Market presence, including the company’s financials, global presence and installed base, and strategic partnerships with other software vendors, professional services firms, and software-as-a-service (SaaS)/cloud/hosting providers.

The report’s main findings are:

  • There are lots of Leaders, but none dominates.
The Leaders in this big data Hadoop solution evaluation are Amazon Web Services, Cloudera, Hortonworks, IBM, MapR Technologies, Pivotal Software, and Teradata. Vendors start with the Apache open source project and then add packaging, support, integration, and innovations that fill the Hadoop enterprise gaps. All of the Leaders have done this, albeit in slightly different ways — as the individual vendor scorecards and the vendor profiles make clear.
  • Younger solutions match well with Strong Performers.
The Strong Performers in this big data Hadoop solution evaluation are Intel and Microsoft. Microsoft has a robust road map for HDInsight that will make it as compelling as any of the Leaders. Microsoft HDInsight is also engineered for Azure, so it is the best solution for Microsoft customers wishing to implement Hadoop on Azure. Intel has focused most of its innovation at the chip level; it needs to beef up its strategy and enterprise tools to make more inroads as an enterprise solution.

Although, according to the report, this evaluation of the big data Hadoop solutions market is intended only as a starting point, it provides excellent initial information for any company trying to navigate the complex landscape of Hadoop vendors.


HPCC Systems by Azana Baksh

Very informative article, Boris. One other open source technology worth mentioning is HPCC Systems from LexisNexis, a data-intensive supercomputing platform for processing and solving big data analytical problems. Its open source Machine Learning Library and matrix processing algorithms assist data scientists and developers with business intelligence and predictive analytics. Its integration with Hadoop, R, and Pentaho extends its capabilities further, providing a complete solution for data ingestion, processing, and delivery. In fact, both libhdfs and webhdfs implementations are available. More at hpccsystems.com/h2h
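For context on the webhdfs integration the comment mentions: WebHDFS is Hadoop's REST API that exposes HDFS over plain HTTP, so any language with an HTTP client can read files without native libhdfs bindings. The sketch below only builds the documented request URL for a file read (op=OPEN); the host, port, path, and user shown are placeholders, not a real cluster.

```python
def webhdfs_open_url(host, port, path, user=None):
    """Build the WebHDFS REST URL for reading a file (op=OPEN).

    Follows the documented scheme http://<host>:<port>/webhdfs/v1/<path>?op=...
    The host/port/path used by callers here are illustrative placeholders.
    """
    url = f"http://{host}:{port}/webhdfs/v1/{path.lstrip('/')}?op=OPEN"
    if user:
        # Optional pseudo-authentication parameter supported by WebHDFS.
        url += f"&user.name={user}"
    return url

# An HTTP GET against this URL (e.g. with urllib or curl) would stream the file.
print(webhdfs_open_url("namenode.example.com", 50070, "/data/input.csv", user="hdfs"))
```

This is what makes cross-platform bridges like the Hadoop integration mentioned above practical: the contract is just HTTP plus a well-known URL layout.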

