Hadoop Content on InfoQ

News

Apache HBase Hits 1.0

After three developer previews, six release candidates and over 1500 closed tickets, the Apache Software Foundation has announced version 1.0 of Apache HBase, a NoSQL database in the Hadoop ecosystem. After more than 7 years of active development, the team behind HBase felt that the project had matured and stabilized enough to warrant a 1.0 version.

Benjamin Darfler
on Apr 07, 2015
Spring XD 1.1: Simplifying Big Data like Spring Did for Java EE

Pivotal recently released Spring XD 1.1 GA with new features including stream processing with Reactor, RxJava, Spark Streaming and Python. Additionally, support for Kafka, batching and compression with RabbitMQ, and container group management when running on YARN are now featured.

Matt Raible
on Mar 05, 2015
Google Open Sources MapReduce Framework for C to Run Native Code in Hadoop

Google announced last week the release of an open source MapReduce framework for C, called MR4C, that allows developers to run native code in the Hadoop framework. The MR4C framework brings together the performance and flexibility of natively developed algorithms with the scalability and throughput provided by the Hadoop execution framework.

Srini Penchikala
on Feb 25, 2015
Project Pachyderm Aims to Build a "Modern" Hadoop on Docker

Project Pachyderm aims to build a "modern" Hadoop using Docker and CoreOS.

Matt Kapilevich
on Feb 17, 2015
Apache Hive 1.0 Released, HiveServer2 Becomes Main Engine, Stable API Defined

Apache Hive released version 1.0 of the project on February 6th, 2015. Originally planned as version 0.14.1, the release was renumbered to 1.0.0 after a community vote, to reflect the maturity the project has reached.

Mikio Braun
on Feb 11, 2015
Splice Machine Version 1.0 Supports Integration with Hadoop and Analytic Window Functions

Splice Machine version 1.0 supports analytic window functions and integration with the Hadoop ecosystem. The Splice Machine team recently released its Hadoop-based RDBMS data management solution, which can be used for transactional workloads on Hadoop.

Srini Penchikala
on Dec 18, 2014
LinkedIn Open Sources Cubert With an Eye To Big Data Analytics

LinkedIn recently open sourced Cubert, its High Performance Computation Engine for Complex Big Data Analytics. Cubert is a framework written with analysts and data scientists in mind. Developed completely in Java and expressed as a scripting language, Cubert is designed for complex joins and aggregations that frequently arise in the reporting world.

Alex Giamas
on Dec 17, 2014
Gobblin, LinkedIn's Unified Data Ingestion Platform

At the 2014 QCon San Francisco conference, LinkedIn's Lin Qiao gave a talk on the company's Gobblin project (also summarized in a blog post), a unified data ingestion system for its internal and external data sources.

Mikio Braun
on Dec 15, 2014
Stripe Open Sources Tools For Apache Hadoop

Stripe, the internet payments infrastructure company, recently announced that it is open sourcing a set of internally developed tools based on Apache Hadoop. Timberlake, Brushfire, Sequins and Herringbone all contribute to enriching the available tools for building an Apache Hadoop stack.

Alex Giamas
on Dec 09, 2014
Spark Sets New Record in Sort Performance

Databricks has recently announced a new record in the Daytona GraySort contest using the Spark processing engine. The Daytona GraySort contest is a third-party benchmark measuring how fast a system can sort 100 terabytes of data. Databricks posted a throughput of 4.27 TB/min over a cluster of 206 machines for their official run.

Benjamin Darfler
on Nov 26, 2014
Hortonworks Data Platform Makes an Enterprise Push

Hortonworks Data Platform (HDP) version 2.2, with features based around Hadoop and YARN, offers better support for enterprise features such as security and compliance.

Rags Srinivas
on Nov 14, 2014
Microsoft Expands Azure Machine Learning and Real Time Analytics Offering

Microsoft recently announced new machine learning capabilities for the Microsoft Azure platform. Developers can also create their own web services and publish them to Azure Marketplace. Microsoft also announced availability of Apache Storm for Azure. Azure Stream Analytics, Data Factory and Event Hubs for Azure were all announced in the past few weeks by Microsoft. In this article we explore more about

Alex Giamas
on Oct 31, 2014
Big Data Analytics: Using Hunk with Hadoop and Elastic MapReduce

Hunk is a relatively new product from Splunk for exploring and visualizing Hadoop and other NoSQL data stores. New in this release is support for Amazon’s Elastic MapReduce.

Jonathan Allen
on Oct 07, 2014
Data Encryption in Apache Hadoop with Project Rhino - Q&A with Steven Ross

Cloudera recently released an update on Project Rhino and data-at-rest encryption in Apache Hadoop. Project Rhino is an effort by Cloudera, Intel and the Hadoop community to bring a comprehensive security framework for data protection. InfoQ recently talked to Steven Ross from the Cloudera team to learn more about the project.

Abhishek Sharma
on Aug 14, 2014
Cloudbreak, New Hadoop as a Service API, Enters Open Beta

Cloudbreak, a new open-source and cloud-agnostic Hadoop as a Service API, is now open for beta access to application developers and enterprises. SequenceIQ, Cloudbreak's maker, claims that its freely available product will make it easier to manage and monitor on-demand Hadoop clusters while also abstracting their provisioning.

Sergio De Simone
on Jul 22, 2014
