This Hadoop Summit Day One report covers the important trends and changes since last year's summit, along with the day's main announcements as they relate to this year's trending topics. The report focuses on platform-specific innovations and announcements rather than the broader partner ecosystem, which will be covered in the next few days.
Hortonworks announced the release of Hive 0.13, which marks the completion of the Stinger initiative. The new release includes performance improvements and some new SQL features. Hive is an open source SQL engine built on top of Hadoop that lets users query big data warehouses by writing SQL queries instead of MapReduce jobs.
In the race for interactive SQL in Big Data environments, there are two open source front-runners: Impala, and Hive with the Stinger project. Cloudera recently announced that Impala is up to 69 times faster than Hive 0.12 and can outperform a DBMS. Beyond raw speed, we look at other considerations in choosing a SQL engine for Hadoop, as well as at Tez, an application framework for YARN.
EMC Greenplum has announced Pivotal HD, a new Hadoop distribution that includes a fully compliant SQL MPP database running on HDFS, which the company describes as “hundreds of times faster than Hive”.
Hortonworks’ new Stinger initiative joins Apache Drill and Cloudera Impala in competition for the best real-time Hadoop implementation.
Yahoo spun out its core Hadoop team, forming a new company, Hortonworks. CEO Eric Baldeschwieler presented the company's vision of easing adoption of Hadoop and making core engineering improvements for availability, performance, and manageability. Hortonworks will sell support, training, and certification, primarily indirectly through partners.