BT

Mac OS X and BTRFS support in Docker

by Chris Swan on  Feb 17, 2014

As part of the 0.8 release, the Docker.io team have announced support for installation on Mac OS X and for the use of BTRFS as an alternative to AUFS.

Elasticsearch 1.0.0 released

by Ralph Winzinger on  Feb 14, 2014

Elasticsearch released version 1.0.0 of its self-titled, open-source analytics tool. Elasticsearch is a distributed search engine which allows for real-time data analysis in big-data environments. The new version comes with various functional enhancements and changes to the API to make Elasticsearch more intuitive and powerful to use.

Running Spark on R with SparkR

by Charles Menguy on  Feb 11, 2014

UC Berkeley’s AMPLab announced a developer preview of their new project SparkR to use Apache Spark natively from R.

DevOps Cafe Podcast on the QCon London 2014 DevOps Track

by João Miranda on  Feb 09, 2014

The latest DevOps Cafe Podcast (Episode 47) previewed the QCon London 2014 DevOps track. Manuel Pais and Shane Hastie, the track hosts, explained the rationale behind the track's session selection, and the speakers introduced their talks. There was still time to discuss other topics, such as the importance of the scientific method and how agile's definition of "done" must be adapted in a DevOps world.

Interactive SQL in Apache Hadoop with Impala and Hive

by Alex Giamas on  Feb 07, 2014

In the race for interactive SQL in Big Data environments, there are two open-source front-runners: Impala, and Hive with the Stinger project. Cloudera recently announced that Impala is up to 69 times faster than Hive 0.12 and can outperform a DBMS. Beyond raw speed, we look at other considerations in choosing a SQL engine for Hadoop, as well as Tez, an application framework for YARN.

DataFu Enters Incubation Status at Apache

by Charles Menguy on  Feb 04, 2014

LinkedIn’s DataFu project, a collection of libraries for Hadoop, officially entered incubation at the Apache Software Foundation (ASF) in the first week of January.

Google Acquires Nest: Big Data Comes to Energy

by Michael Hausenblas on  Feb 04, 2014

Google has acquired Nest, maker of smart thermostats and smoke detectors, for $3.2 billion in cash, making Nest another major data source that will help Google understand how people live.

DevOps Adoption Cultural Challenges

by João Miranda on  Feb 03, 2014

Oliver White, Head of Rebel Labs, recently discussed the difficulties of DevOps adoption at IT organizations, even when there is a growing body of evidence that highlights the benefits of DevOps. InfoQ took the opportunity to interview Oliver and review some of the reports that study this topic.

Spark, Storm and Real Time Analytics

by Alex Giamas on  Jan 31, 2014

Hadoop is definitely the platform of choice for Big Data analysis and computation. As data Volume, Variety and Velocity increase, however, Hadoop as a batch processing framework cannot cope with the requirement for real-time analytics. Spark, Storm and the Lambda Architecture can help bridge the gap between batch and event-based processing.

IDC Study: How Many Software Developers Are Out There?

by Abel Avram on  Jan 31, 2014

IDC has published the “2014 Worldwide Software Developer and ICT-Skilled Worker Estimates” document, a study estimating the number of professional software developers, hobbyist developers and Information and Communications Technology (ICT)-skilled workers in the world at the start of 2014. The 90 countries covered in the study represent 97% of the world’s GDP.

Presto-as-a-Service: Interactive SQL Queries on AWS

by Charles Menguy on  Jan 24, 2014

Presto, a technology from Facebook enabling interactive SQL queries on petabytes of data, has now taken a first step toward mainstream adoption. Big Data startup Qubole has launched its Presto-as-a-Service alpha with Amazon Web Services integration.

Big Data: Do Languages Really Matter?

by Charles Menguy on  Jan 20, 2014

Big Data is a field where losing even a single millisecond per event can be significant over billions of events. Yet languages often regarded as slow, such as Python, have gained a lot of popularity in the past year. Recent articles and discussions in the Big Data community have reignited the debate around the choice of a programming language for data science and Big Data.

Big Data Revolution and Genomics Analysis

by Alex Giamas on  Jan 17, 2014

Curoverse and Tute Genomics each secured $1.5 million in seed funding in the past month, aiming to bring gene sequencing to the masses. Illumina, Seven Bridges Genomics, Complete Genomics and others are offering researchers and private parties the opportunity to map the full genome sequence for a four-figure price. Illumina recently announced HiSeq X Ten, promising the long-awaited $1,000 genome.

Twitter Open-Sources its MapReduce Streaming Framework Summingbird

by Michael Hausenblas on  Jan 16, 2014

Twitter has open-sourced its MapReduce streaming framework, called Summingbird. Available under the Apache 2 license, Summingbird is a large-scale data processing system that lets developers uniformly execute code in batch mode (Hadoop/MapReduce-based), in stream mode (Storm-based), or in a combination of the two, called hybrid mode.

New Education Opportunities for Data Scientists

by Charles Menguy on  Jan 14, 2014

The year 2013 was rich in announcements of new programs, degrees and grants for aspiring data scientists and Big Data practitioners.

InfoQ.com and all content copyright © 2006-2013 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.