
HBase 0.98 Introduces Cell-based Security

by Rags Srinivas on  Mar 21, 2014

Apache has released HBase 0.98, which primarily addresses convergence with Apache Accumulo by adding cell-based security modeled after Accumulo's. The release also resolves over 230 JIRA issues.

Graph Processing Using Big Data Technologies

by Charles Menguy on  Mar 17, 2014

Processing extremely large graphs has been and remains a challenge, but recent advances in Big Data technologies have made this task more practical. Tapad, a startup based in NYC focused on cross-device content delivery, has made graph processing the heart of their business model using Big Data to scale to terabytes of data.

Domino: Datascience-as-a-Service

by Michael Hausenblas on  Mar 11, 2014

Domino, a Platform-as-a-Service for data science, enables people to do analytical work using languages such as Python or R in the cloud (EC2).

Big Data Hadoop Solutions, State of Affairs in Q1/2014

by Boris Lublinsky on  Mar 04, 2014

According to a new Forrester report, Hadoop’s momentum is unstoppable. Its usage in the enterprise is continuously growing due to its ability to offer companies new ways to store, process, analyze, and share big data. The report takes a look at Hadoop vendors and ranks them.

IBM Launches Contest for Cognitive Mobile Apps using Watson

by Sergio De Simone on  Mar 03, 2014

At the Mobile World Congress, IBM announced a contest for developers to create mobile consumer and business apps powered by the IBM Watson cognitive computing platform. The winners of the IBM Watson Mobile Developer Challenge will receive design consulting and support from IBM to gain access to the market.

Spark Officially Graduates From Apache Incubator

by Alex Giamas on  Feb 28, 2014

Spark recently graduated from the Apache Incubator. Spark claims up to 100x speed improvements over Apache Hadoop on in-memory datasets, gracefully falling back to a 10x improvement for on-disk processing. Based on Scala, it can run SQL queries and be used directly from R. It provides machine learning and graph capabilities, along with other features discussed further in the article.

Hazelcast Introduces MapReduce API

by Michael Hausenblas on  Feb 18, 2014

Hazelcast, an open source in-memory data grid solution, introduces a MapReduce API for its offering.

Elasticsearch 1.0.0 released

by Ralph Winzinger on  Feb 14, 2014

Elasticsearch released version 1.0.0 of its self-titled, open-source analytics tool. Elasticsearch is a distributed search engine which allows for real-time data analysis in big-data environments. The new version comes with various functional enhancements and changes to the API to make Elasticsearch more intuitive and powerful to use.

Running Spark on R with SparkR

by Charles Menguy on  Feb 11, 2014

UC Berkeley’s AMPLab announced a developer preview of their new project SparkR to use Apache Spark natively from R.

Interactive SQL in Apache Hadoop with Impala and Hive

by Alex Giamas on  Feb 07, 2014

In the race for interactive SQL in Big Data environments, there are two open source front-runners: Impala, and Hive with the Stinger project. Cloudera recently announced that Impala is up to 69 times faster than Hive 0.12 and can outperform DBMS. Beyond raw speed, we take a look at other considerations in choosing a SQL engine for Hadoop, as well as Tez, an application framework for YARN.

DataFu Enters Incubation Status at Apache

by Charles Menguy on  Feb 04, 2014

LinkedIn’s DataFu project, a collection of libraries for Hadoop, officially entered incubation at the Apache Software Foundation (ASF) in the first week of January.

Google Acquires Nest: Big Data Comes to Energy

by Michael Hausenblas on  Feb 04, 2014

Google has acquired Nest, maker of smart thermostats and smoke detectors, for $3.2 billion in cash, gaining another major data source that will help Google understand how people live.

Spark, Storm and Real Time Analytics

by Alex Giamas on  Jan 31, 2014

Hadoop is definitely the platform of choice for Big Data analysis and computation. As data Volume, Variety and Velocity increase, however, Hadoop as a batch processing framework cannot cope with the requirement for real-time analytics. Spark, Storm and the Lambda Architecture can help bridge the gap between batch and event-based processing.

Presto-as-a-Service: Interactive SQL Queries on AWS

by Charles Menguy on  Jan 24, 2014

Presto, a technology from Facebook enabling interactive SQL queries on petabytes of data, has now taken a first step into mainstream adoption. Big Data startup Qubole has launched its Presto-as-a-Service alpha with integration to Amazon Web Services.

Big Data: Do Languages Really Matter?

by Charles Menguy on  Jan 20, 2014

Big Data is a field where even a single millisecond loss can be significant over billions of events. Yet, languages often regarded as slow like Python have gained a lot of popularity in the past year. Recent articles and discussions in the Big Data community have started reigniting the debate around the choice of a programming language for data science and Big Data.

InfoQ.com and all content copyright © 2006-2014 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.