Parquet Becomes Top-Level Apache Project

by Jérôme Serrano on Jun 11, 2015

Apache Parquet, the open-source columnar storage format for Hadoop, recently graduated from the Apache Software Foundation Incubator and became a top-level project. Initially created by Cloudera and Twitter in 2012 to speed up analytical processing, Parquet is now openly available for Apache Spark, Apache Hive, Apache Pig, Impala, native MapReduce, and other key components of the Hadoop ecosystem.

MemSQL 4 Database Supports Community Edition, Geospatial Intelligence and Spark Integration

by Srini Penchikala on May 30, 2015

The latest version of MemSQL, an in-memory database with support for transactions and analytics, includes a new Community Edition that organizations can use for free. MemSQL 4, released last week, also supports integration with Apache Spark, Hadoop Distributed File System (HDFS), and Amazon S3.

Glenn Tamkin on Applying Apache Hadoop to NASA's Big Climate Data

by Srini Penchikala on May 06, 2015

The NASA Center for Climate Simulation (NCCS) is using Apache Hadoop for high-performance data analytics. Glenn Tamkin of the NASA team recently spoke at ApacheCon and shared details of the platform they built for climate data analysis with Hadoop.

Hortonworks, IBM and Pivotal to Support Open Data Platform in Their Big Data Solutions

by Srini Penchikala on Apr 24, 2015

Big data vendors Hortonworks, IBM, and Pivotal recently announced that their Hadoop-based platform products will use the common Open Data Platform (ODP). They announced the open platform, which includes Apache Hadoop 2.6 (HDFS, YARN, and MapReduce) and Apache Ambari, at the recent Hadoop Summit Europe conference.

Apache HBase Hits 1.0

by Benjamin Darfler on Apr 07, 2015

After three developer previews, six release candidates, and over 1500 closed tickets, the Apache Software Foundation has announced version 1.0 of Apache HBase, a NoSQL database in the Hadoop ecosystem. After more than seven years of active development, the team behind HBase felt that the project had matured and stabilized enough to warrant a 1.0 version.

Spring XD 1.1: Simplifying Big Data like Spring Did for Java EE

by Matt Raible on Mar 05, 2015

Pivotal recently released Spring XD 1.1 GA with new features including stream processing with Reactor, RxJava, Spark Streaming, and Python. The release also adds support for Kafka, batching and compression with RabbitMQ, and container group management when running on YARN.

Google Open Sources MapReduce Framework for C to Run Native Code in Hadoop

by Srini Penchikala on Feb 25, 2015

Google last week announced the release of MR4C, an open source MapReduce framework for C that allows developers to run native code in the Hadoop framework. MR4C brings together the performance and flexibility of natively developed algorithms with the scalability and throughput provided by the Hadoop execution framework.

Pivotal Open Sources Their Big Data Suite

by Abel Avram on Feb 19, 2015

Pivotal has decided to open source core components of their Big Data Suite and has announced the Open Data Platform, an initiative promoting open source and standardization for Big Data.

Project Pachyderm Aims to Build a "Modern" Hadoop on Docker

by Matt Kapilevich on Feb 17, 2015

Project Pachyderm aims to build a "modern" Hadoop using Docker and CoreOS.

Project Myriad: Mesos and YARN Working Together

by Boris Lublinsky on Feb 14, 2015

An article by Jin Scott, "A tale of two clusters: Mesos and YARN", describes the hardware silos created by running different resource managers on separate hardware clusters, the most popular being Mesos and YARN, and introduces Myriad, a solution for running a YARN cluster on Mesos.

Apache Hive 1.0 Released, HiveServer2 Becomes Main Engine, Stable API Defined

by Mikio Braun on Feb 11, 2015

Apache Hive released version 1.0 on February 6, 2015. The release was originally planned as version 0.14.1, but the community voted to change the version number to 1.0.0 to reflect the maturity the project has reached.

EMRFS Brings Consistency to Amazon S3

by Jérôme Serrano on Jan 27, 2015

Amazon recently announced EMRFS, an implementation of HDFS that allows EMR clusters to use S3 with a stronger consistency model. When enabled, this new feature keeps track of operations performed on S3 and provides list consistency, delete consistency, and read-after-write consistency for any cluster created with Amazon Machine Image (AMI) version 3.2.1 or greater.

Splice Machine Version 1.0 Supports Integration with Hadoop and Analytic Window Functions

by Srini Penchikala on Dec 18, 2014

Splice Machine version 1.0 supports analytic window functions and integration with the Hadoop ecosystem. The Splice Machine team recently released their Hadoop-based RDBMS, a data management solution that can be used for transactional workloads on Hadoop.

LinkedIn Open Sources Cubert With an Eye To Big Data Analytics

by Alex Giamas on Dec 17, 2014

LinkedIn recently open sourced Cubert, its High Performance Computation Engine for Complex Big Data Analytics. Cubert is a framework written with analysts and data scientists in mind. Developed completely in Java and expressed as a scripting language, Cubert is designed for the complex joins and aggregations that frequently arise in the reporting world.

Gobblin, LinkedIn's Unified Data Ingestion Platform

by Mikio Braun on Dec 15, 2014

At the 2014 QCon San Francisco conference, LinkedIn's Lin Qiao gave a talk on the Gobblin project (also summarized in a blog post), a unified data ingestion system for LinkedIn's internal and external data sources.
