Hadoop Summit 2014 Day Two - On the Path to Enterprise Grade Hadoop

by Jeevak Kasarkod on  Jun 05, 2014

Hadoop Summit Day Two report covers the important trends and changes from last year's summit. It also covers the important announcements of the day in relation to this year's trending topics. This report shares an analysis of the Hadoop market by leading analysts, competing benchmarks by vendors and platform specific innovations and announcements.

Hadoop Summit 2014 Day One - On the Path to Enterprise Grade Hadoop

by Jeevak Kasarkod on  Jun 04, 2014

Hadoop Summit Day One report covers the important trends and changes from last year's summit. It also covers the important announcements of the day in relation to this year's trending topics. This report focuses on the platform specific innovations and announcements and not the broader partner ecosystem, which will be covered in the next few days.

Community the Focus at ApacheCON NA 2014

by Carlos Sanchez on  May 15, 2014

This year's ApacheCON North America conference saw key speakers focus on open source and its community. With more than 400 attendees, over 70 projects represented and 180 conference sessions, it covered topics as diverse as the Apache Software Foundation's projects themselves.

Twitter's Manhattan: A Real-time, Multi-tenant Distributed Database

by Michael Hausenblas on  May 15, 2014

Twitter Engineering has released details about Manhattan, its real-time, multi-tenant distributed database.

Hortonworks Announces Hive 0.13 with Vectorized Query Execution and Hive on Tez

by Matt Kapilevich on  May 13, 2014

Hortonworks announced the release of Hive 0.13, which marks the completion of the Stinger initiative. The new release also includes performance improvements as well as some new SQL features. Hive is an open source SQL engine built on top of Hadoop that lets users query big data warehouses by writing SQL queries instead of MapReduce jobs.

Introducing Microsoft Avro

by Jonathan Allen on  May 08, 2014

Microsoft has announced its implementation of the Apache Avro wire protocol. Avro is described as a “compact binary data serialization format similar to Thrift or Protocol Buffers” with additional features needed for distributed processing environments such as Hadoop.

Coverity Scan Gets Better with Java, Apache Hadoop, HBase and Cassandra Support

by Anand Narayanaswamy on  May 02, 2014

According to the recently released open source scan report by Coverity, the issues detected and fixed were mainly resource leaks, null pointer and control flow issues, among several others. Coverity also scanned the source code of Linux, where several bugs were fixed.

Cloudera Partners with MongoDB to Store Hadoop Data on Their NoSQL DB

by Abel Avram on  Apr 29, 2014

Starting from the premise that today “80 percent of enterprise data is unstructured and growing at twice the rate of structured data”, Cloudera and MongoDB have announced a “strategic” partnership meant to provide customers the option to combine Cloudera’s Apache-based Big Data platform with MongoDB’s NoSQL solution.

A Roundup of Cloudera Distribution Containing Apache Hadoop 5

by Alex Giamas on  Apr 18, 2014

Cloudera recently released the latest version of its software distribution, CDH5. It arrives almost 20 months after the last major version, CDH4, which seems like ages in the Big Data world. We take a look at the new features this release brings and the future direction of Cloudera after the latest round of investment from Intel and Google Ventures.

Hadoop Gets Better Security, Several Operational Improvements

by Roopesh Shenoy on  Apr 18, 2014

Hadoop 2.4.0 was recently released with several enhancements to both HDFS and YARN. These include support for Access Control Lists, native support for rolling upgrades, full HTTPS support for HDFS, automatic failover of YARN and other operational improvements.

Hydra Takes On Hadoop

by Rags Srinivas on  Apr 11, 2014

The social-networking company AddThis open-sourced Hydra under the Apache License, Version 2.0 in a recent announcement. Hydra grew from an in-house platform created to process semi-structured social data as live streams and to run efficient queries on those data sets.

Spark Gets a Dedicated Big Data Platform

by Charles Menguy on  Apr 03, 2014

Spark users can now use a new Big Data platform provided by intelligence company Atigeo, which bundles most of the UC Berkeley stack into a unified framework optimized for low-latency data processing that can provide significant improvements over more traditional Hadoop-based platforms.

Rebecca Parsons on the ThoughtWorks Technology Radar

by Shane Hastie on  Mar 28, 2014

In January ThoughtWorks released the latest version of their Technology Radar, in which they track what's interesting in the software development ecosystem. The big themes this year are (1) early warning systems and recovery in production, (2) the tension between privacy and big data, (3) the JavaScript ecosystem and (4) the blurring of the line between the physical and virtual worlds.

Big Data Hadoop Solutions, State of Affairs in Q1/2014

by Boris Lublinsky on  Mar 04, 2014

According to a new Forrester report, Hadoop’s momentum is unstoppable. Its usage in the enterprise is continuously growing due to its ability to offer companies new ways to store, process, analyze, and share big data. The report takes a look at Hadoop vendors and ranks them.

Spark Officially Graduates From Apache Incubator

by Alex Giamas on  Feb 28, 2014

Recently, Spark graduated from the Apache Incubator. Spark claims up to 100x speed improvements over Apache Hadoop on in-memory datasets, gracefully falling back to a 10x speed improvement for on-disk performance. Based on Scala, it can run SQL queries and be used directly in R. It provides machine learning and graph database capabilities, along with other features discussed further in the article.

InfoQ.com and all content copyright © 2006-2014 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.