Hydra Takes On Hadoop

by Rags Srinivas on Apr 11, 2014

The social-networking company AddThis recently announced that it has open-sourced Hydra under the Apache License, Version 2.0. Hydra grew from an in-house platform created to process semi-structured social data as live streams and to run efficient queries on those data sets.

Hazelcast Introduces MapReduce API

by Michael Hausenblas on Feb 18, 2014

Hazelcast, an open-source in-memory data grid solution, introduces a MapReduce API for its offering.

Twitter Open-Sources its MapReduce Streaming Framework Summingbird

by Michael Hausenblas on Jan 16, 2014

Twitter has open-sourced its MapReduce streaming framework, Summingbird. Available under the Apache 2 license, Summingbird is a large-scale data processing system that lets developers execute the same code in batch mode (Hadoop/MapReduce-based), in stream mode (Storm-based), or in a combination of the two, called hybrid mode.

New Education Opportunities for Data Scientists

by Charles Menguy on Jan 14, 2014

The year 2013 was rich in announcements of new programs, degrees and grants for aspiring data scientists and Big Data practitioners.

Hadoop Jobs on GPU with ParallelX

by Charles Menguy on Dec 26, 2013

The MapReduce paradigm is not always ideal for large, computationally intensive algorithms. A small team of entrepreneurs is building a product called ParallelX to address that bottleneck by harnessing the power of GPUs to give Hadoop jobs a significant boost.

Elastic Mesos service automates Mesos cluster deployment in EC2

by Charles Menguy on Dec 17, 2013

EC2 users can now automate the deployment of Apache Mesos, an open-source tool for sharing cluster resources among multiple data processing frameworks, at scale through Elastic Mesos, a new web service from Big Data startup Mesosphere.

Apache Tez - a Generalization of the MapReduce Data Processing

by Boris Lublinsky on Sep 20, 2013

A new Apache Incubator project, Tez, generalizes the MapReduce paradigm to execute a complex DAG (directed acyclic graph) of tasks.

QuantCell Research Announces First Public Beta of their Java-Aware Big-Data Spreadsheet

by Victor Grazi on Aug 21, 2013

Big Data analytics startup QuantCell Research has announced the first public beta of what it is positioning as its "Big Data" spreadsheet.

Trends in the latest Technology Radar

by Aslan Brooke on Jan 18, 2013

ThoughtWorks's latest "Technology Radar" focuses on mobile, accessible analytics, simple architectures, reproducible environments, and data persistence done right.

Windows Azure Storage New Pricing Structure Revealed

by Anand Narayanaswamy on Dec 11, 2012

Microsoft recently revealed a new pricing structure for Windows Azure Storage, along with several improvements.

LinkedIn Engineering Releases SenseiDB 1.0.0

by Kostis Kapelonis on Mar 19, 2012

LinkedIn Engineering releases SenseiDB 1.0.0, a NoSQL database focused on high update rates and complex semi-structured search queries. LinkedIn already uses it in production for its search-related pages (e.g. People/Company search).

MapReduce Patterns, Algorithms, and Use Cases

by Boris Lublinsky on Feb 08, 2012

In his new article “MapReduce Patterns, Algorithms, and Use Cases”, Ilya Katsov gives a systematic view of the different MapReduce patterns, algorithms and techniques that can be found on the web or in scientific articles along with several practical use case studies.

Apache Hadoop 1.0.0 Supports Kerberos Authentication, Apache HBase and RESTful API to HDFS

by Srini Penchikala on Jan 13, 2012

After six years of gestation, the Big Data framework Apache Hadoop has reached version 1.0.0. Core features in the release include Kerberos authentication, support for Apache HBase, and a RESTful API to HDFS. InfoQ spoke with Arun Murthy, VP of Apache Hadoop, about the new release.

Blog Sentiment Analysis Using NoSQL Techniques

by Srini Penchikala on Dec 28, 2011

Corporations are increasingly using social media to learn more about what their customers are saying about their products. This presents unique challenges as unstructured content needs analytic techniques to interpret the sentiment embodied in the blog posts. InfoQ caught up with Subramanian Kartik to learn more about the blog sentiment analysis project his team worked on.

HPCC Systems Launches Big Data Delivery Engine on EC2

by Jean-Jacques Dubray on Dec 01, 2011

HPCC Systems, which is part of LexisNexis, is launching its Thor Data Refinery Cluster on Amazon EC2 this week. HPCC Systems is an enterprise-grade, open-source Big Data analytics platform capable of ingesting vast amounts of data and then transforming, linking and indexing that data, with parallel processing power spread across the nodes.

InfoQ.com and all content copyright © 2006-2013 C4Media Inc.