Hortonworks, IBM and Pivotal to Support Open Data Platform in Their Big Data Solutions

by Srini Penchikala on  Apr 24, 2015

Big data vendors Hortonworks, IBM, and Pivotal recently announced that their Hadoop-based platform products will use the common Open Data Platform (ODP). The announcement, made at the recent Hadoop Summit Europe conference, covers an open platform that includes Apache Hadoop 2.6 (HDFS, YARN, and MapReduce) and Apache Ambari.

Google Enhances Data and Network Services for its Cloud Platform

by Janakiram MSV on  Apr 18, 2015

Google announced the general availability of Cloud DNS, expanded locations for load balancing, additional carrier providers for peering, and beta availability of Cloud Dataflow and VPN services.

Amazon Web Services launches Machine Learning Service

by Mikio Braun on  Apr 17, 2015

Amazon Web Services has recently launched its Amazon Machine Learning service, which allows users to train predictive models in the cloud. After Google with Prediction API and Microsoft with Azure Machine Learning, Amazon is the latest major cloud service provider to launch such a service.

Twitter Cuts Off Firehose Partner DataSift

by Saul Caganoff on  Apr 14, 2015

Twitter recently announced that it has cut off its firehose data distributor DataSift. This move echoes Twitter's controversial 2012 API changes, which restricted the Twitter client ecosystem. There is much speculation as to whether this latest announcement is an attempt to control the tweet analytics space, and whether this is behaviour befitting a platform provider.

Apache HBase Hits 1.0

by Benjamin Darfler on  Apr 07, 2015

After three developer previews, six release candidates, and over 1500 closed tickets, the Apache Software Foundation has announced version 1.0 of Apache HBase, a NoSQL database in the Hadoop ecosystem. After more than 7 years of active development, the team behind HBase felt that the project had matured and stabilized enough to warrant a 1.0 version.

Real-time Data Analytics at Pinterest using MemSQL and Spark Streaming

by Srini Penchikala on  Mar 29, 2015

Pinterest, the company behind the visual bookmarking tool that helps you discover and save creative ideas, is using real-time data analytics for data-driven decision making. It’s experimenting with MemSQL and Spark technologies for real-time user engagement across the globe.

Microsoft Acquires Revolution Analytics

by Jérôme Serrano on  Mar 25, 2015

Microsoft increased its foothold in the data science community last winter by acquiring Revolution Analytics, a major provider of software and services based on the open-source R project for computational statistics. The deal is expected to bring R capabilities to the Microsoft suite of products and facilitate the adoption of R-based solutions in the enterprise environment.

Apache Spark 1.3 Released, Data Frames, Spark SQL, and MLlib Improvements

by Mikio Braun on  Mar 16, 2015

The Apache Spark project has released version 1.3. The main improvements are the addition of the DataFrames API, a more mature Spark SQL, a number of new methods in the machine learning library MLlib, and better integration of Spark Streaming with Apache Kafka.

Google Open Sources MapReduce Framework for C to Run Native Code in Hadoop

by Srini Penchikala on  Feb 25, 2015

Google announced last week the release of an open source MapReduce framework for C, called MR4C, that allows developers to run native code in the Hadoop framework. The MR4C framework brings together the performance and flexibility of natively developed algorithms with the scalability and throughput provided by the Hadoop execution framework.

MongoDB 3.0 - WiredTiger Storage Engine and Updated MMS

by Alex Giamas on  Feb 20, 2015

Some time ago, when MongoDB 2.6 was released, Kelly Stirman, Director of Products at MongoDB, answered our questions regarding that release. Now, with MongoDB 3.0 announced for March and MongoDB 3.0 RC-8 already available, it’s time to see in more detail what the WiredTiger storage engine, the new and improved MMS, and storage compression can bring to NoSQL users.

Pivotal Open Sources Their Big Data Suite

by Abel Avram on  Feb 19, 2015

Pivotal has decided to open source core components of their Big Data Suite and has announced the Open Data Platform, an initiative promoting open source and standardization for Big Data.

Project Pachyderm Aims to Build a "Modern" Hadoop on Docker

by Matt Kapilevich on  Feb 17, 2015

Project Pachyderm aims to build a "modern" Hadoop using Docker and CoreOS.

Apache Hive 1.0 Released, HiveServer2 Becomes Main Engine, Stable API Defined

by Mikio Braun on  Feb 11, 2015

The Apache Hive project released version 1.0 on February 6th, 2015. Originally planned as version 0.14.1, the release was renumbered to 1.0.0 after a community vote, to reflect the maturity the project has reached.

EMRFS Brings Consistency to Amazon S3

by Jérôme Serrano on  Jan 27, 2015

Amazon recently announced EMRFS, an implementation of HDFS that allows EMR clusters to use S3 with a stronger consistency model. When enabled, this new feature keeps track of operations performed on S3 and provides list consistency, delete consistency, and read-after-write consistency for any cluster created with Amazon Machine Image (AMI) version 3.2.1 or greater.

Apache Flink 0.8.0 Released, Roadmap for 2015 Published

by Mikio Braun on  Jan 22, 2015

The Apache Flink project has released version 0.8.0. Besides the usual performance, compatibility, and stability improvements, it adds a streaming Scala API, filling a gap where streaming capabilities had so far been missing. Apache Flink was also recently promoted to a top-level Apache project, roughly nine months after joining the incubator.
