
Hadoop Gets Better Security, Several Operational Improvements

by Roopesh Shenoy on  Apr 18, 2014

Hadoop 2.4.0 was recently released with several enhancements to both HDFS and YARN. These include support for access control lists, native support for rolling upgrades, full HTTPS support for HDFS, automatic failover for YARN, and other operational improvements.

Hydra Takes On Hadoop

by Rags Srinivas on  Apr 11, 2014

The social-networking company AddThis recently announced that it has open-sourced Hydra under the Apache License, Version 2.0. Hydra grew out of an in-house platform created to process semi-structured social data as live streams and to run efficient queries on those data sets.

Spark Officially Graduates From Apache Incubator

by Alex Giamas on  Feb 28, 2014

Spark recently graduated from the Apache Incubator. Spark claims speed improvements of up to 100x over Apache Hadoop on in-memory datasets, falling back gracefully to a 10x improvement for on-disk workloads. Based on Scala, it can run SQL queries and be used directly from R. It provides machine learning, graph database capabilities, and other features discussed further in the article.

Interactive SQL in Apache Hadoop with Impala and Hive

by Alex Giamas on  Feb 07, 2014

In the race for interactive SQL in Big Data environments, there are two open-source front-runners: Impala, and Hive with the Stinger project. Cloudera recently announced that Impala is up to 69 times faster than Hive 0.12 and can outperform a DBMS. Beyond raw speed, we look at other considerations in choosing a SQL engine for Hadoop, and at Tez, an application framework for YARN.

Google Improves Hadoop Performance with New Cloud Storage Connector

by Richard Seroter on  Jan 20, 2014

With a new connector, Hadoop can now run directly against Google Cloud Storage instead of using its default distributed file system. This results in lower storage costs, fewer data replication activities, and a simpler overall process.

Cascading 2.5 Supports Hadoop 2

by Boris Lublinsky on  Nov 19, 2013

The new version of Cascading, released this week, incorporates Hadoop 2 support and includes Cascading Lingual, an open source project that provides a comprehensive ANSI SQL interface for accessing Hadoop-based data.

Best Practices for Amazon EMR

by Boris Lublinsky on  Aug 16, 2013

In his new whitepaper, Best Practices for Amazon EMR, Parviz Deyhim outlines best practices for using Amazon EMR, including moving data to AWS; strategies for collecting, compressing, and aggregating data; and common architectural patterns for setting up and configuring Amazon EMR clusters for processing.

DataStax Brings Enterprise Security To Cassandra, Hadoop, Solr

by Roopesh Shenoy on  Mar 18, 2013

DataStax Enterprise 3.0 was announced last month with several enterprise security features for clusters using Cassandra, Hadoop, and Solr. InfoQ caught up with Robin Schumacher, VP of Products at DataStax, to learn more.

Concurrent Releases Lingual, a SQL DSL for Hadoop

by Boris Lublinsky on  Feb 28, 2013

Concurrent, Inc., the enterprise Big Data application platform company, today announced Lingual, an open source project enabling fast and simple Big Data application development on Apache Hadoop using SQL.

Greenplum Pivotal HD Combines the Strengths of SQL and Hadoop

by Abel Avram on  Feb 27, 2013

EMC Greenplum has announced Pivotal HD, a new Hadoop distribution that includes a fully compliant SQL MPP database running on HDFS, claimed to be “hundreds of times faster than Hive”.

Competition between Real-time Hadoop Implementations Heats Up

by Boris Lublinsky on  Feb 25, 2013

Hortonworks’ new Stinger initiative joins Apache Drill and Cloudera Impala in competition for the best real-time Hadoop implementation.

A Look at Oracle’s NoSQL Database

by Jonathan Allen on  Feb 08, 2013

Oracle’s key-value database, known simply as “Oracle NoSQL Database”, has hit version 2.0. Oracle NoSQL Database is essentially a distributed frontend for Berkeley DB, but it offers much more than that. Support for SQL queries, both absolute and eventual consistency, and the option to reduce storage space using Avro schemas set it apart.

Managing Hadoop with Apache Ambari

by Boris Lublinsky on  Dec 19, 2012

In his new blog post, Hortonworks Vice President of Corporate Strategy Shaun Connolly discusses the importance of the Apache Ambari incubation project and the main milestones it achieved in 2012: simplified cluster provisioning, pre-configured key operational metrics, job execution visualization, a RESTful API, and an intuitive UI.

News from O’Reilly Strata Conference + Hadoop World 2012: Azure HDInsight, Cloudera Impala, MapR M7

by Boris Lublinsky on  Oct 29, 2012

Several new Hadoop-based frameworks were announced at this year's O’Reilly Strata Conference + Hadoop World in New York last week.

InfoQ.com and all content copyright © 2006-2013 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.