Google announced last week the release of open source MapReduce framework for C, called MR4C, that allows developers to run native code in Hadoop framework. MR4C framework brings together the performance and flexibility of natively developed algorithms with the scalability and throughput provided by Hadoop execution framework.
MapR Technologies, provider of the Apache Hadoop distribution, has open sourced their MapR-DB NoSQL database for unlimited production use. MapR-DB is a Wide Column NoSQL database with native integration to Hadoop and support for strong consistency and ACID transactions.
Hunk is a relatively new product from Splunk for exploring and visualizing Hadoop and other NoSQL data stores. New in this release is support for Amazon’s Elastic MapReduce.
MapR recently announced including Apache Drill in its latest release of MapR distribution. Apache Drill is the open source version of Google’s Dremel. Dremel is the infrastructure on which BigQuery is based upon. Drill is offering a low latency SQL-on-Hadoop interface. While this puts it in the same space as several other technologies around Hadoop, Drill has some unique characteristics setting it
The social-networking company AddThis open-sourced Hydra under the Apache version 2.0 License in a recent announcement. Hydra grew from an in-house platform created to process semi-structured social data as live streams and do efficient query processing on those data sets.
Hazelcast, an open source in-memory data grid solution introduces a MapReduce API for its offering.
Twitter has open sourced their MapReduce streaming framework, called Summingbird. Available under the Apache 2 license, Summingbird is a large-scale data processing system enabling developers to uniformly execute code in either batch-mode (Hadoop/MapReduce-based) or stream-mode (Storm-based) or a combination thereof, called hybrid mode.
2013 has been rich in announcements for new programs, degrees and grants for aspiring data scientists and Big Data practitioners.
The MapReduce paradigm is not always ideal when dealing with large computationally intensive algorithms. A small team of entrepreneurs is building a product called ParallelX to solve that bottleneck by harnessing the power of GPUs to give Hadoop jobs a significant boost.
EC2 users can now automate the deployment of Apache Mesos, an open-source tool to share cluster resources between multiple data processing frameworks, at scale through a new web service called Elastic Mesos provided by Big Data startup Mesosphere.
A new Apache incubator project, Tez, generalizes the MapReduce paradigm to execute a complex DAG (directed acyclic graph) of tasks.
Big Data analytics startup QuantCell Research has announced the release of the first public beta of what they are positioning as their "Big Data" spreadsheet.
ThoughtWorks's latest "Technology Radar" focuses on mobile, accessible analytics, simple architectures, reproducible environments, and data persistence done right.
Microsoft recently revealed new pricing structure for Windows Azure Storage along with several improvements.
LinkedIn engineering releases SenseiDB 1.0.0, a NoSQL database focused on high update rates and complex semi-structured search queries, already used in production by LinkedIn in its search related pages (e.g. People/Company search)