Hadoop Content on InfoQ

Articles

Nikita Ivanov on GridGain’s In-Memory Accelerator for Hadoop

GridGain recently announced the In-Memory Accelerator for Hadoop, which brings the benefits of in-memory computing to Hadoop-based applications. It includes two components: an in-memory file system and a MapReduce implementation. InfoQ spoke with Nikita Ivanov, CTO of GridGain, about the architecture of the product.

Srini Penchikala
on Sep 08, 2014
Rich Reimer on SQL-on-Hadoop Databases and Splice Machine

SQL-on-Hadoop technologies provide a SQL layer or a SQL database over Hadoop. These solutions have recently become popular because they address the data management issues of Hadoop and provide a scale-out alternative to traditional RDBMSs. InfoQ spoke with Rich Reimer, VP of Marketing and Product Management at Splice Machine, about the architecture and data patterns for SQL-on-Hadoop databases.

Srini Penchikala
on Jun 19, 2014
Lambda Architecture: Design Simpler, Resilient, Maintainable and Scalable Big Data Solutions

Lambda Architecture proposes a simpler, more elegant paradigm designed to store and process large amounts of data. In this article, author Daniel Jebaraj presents the motivation behind the Lambda Architecture and reviews its structure with the help of a sample Java application.

Daniel Jebaraj
on Mar 12, 2014
Big Data Analytics for Security

In this article, the authors discuss the role of big data and Hadoop in the security analytics space and how to use MapReduce to efficiently process data for security analysis in use cases such as Security Information and Event Management (SIEM) and fraud detection.

Sreeranga P. Rajan, Alvaro A. Cárdenas, Pratyusa K. Manadhata
on Feb 11, 2014
Building Applications With Hadoop

When building applications using Hadoop, it is common to have input data arriving from various sources in various formats. In his presentation, “New Tools for Building Applications on Apache Hadoop”, Eli Collins gives an overview of how to build better products with Hadoop and of various tools that can help, such as Apache Avro, Apache Crunch, Cloudera ML and the Cloudera Development Kit.

Roopesh Shenoy
on Jan 30, 2014
Building a Real-time, Personalized Recommendation System with Kiji

In this article, Jon Natkins explains how to create a personalized recommendation system fed with large amounts of real-time data using Kiji, which leverages HBase, Avro, MapReduce and Scalding.

Jon Natkins
on Dec 26, 2013
Costin Leau on Elasticsearch, BigData and Hadoop

Elasticsearch is an open source, distributed, real-time search and analytics engine for the cloud. The first milestone of elasticsearch-hadoop, 1.3.M1, was released last month. InfoQ spoke with Costin Leau about Elasticsearch and how it integrates with Hadoop and other Big Data technologies.

Srini Penchikala
on Nov 15, 2013
Spoilt for Choice – How to choose the right Big Data / Hadoop Platform?

In his new article, Kai Wähner compares several alternatives for installing a version of Hadoop and implementing big data processes. He compares distributions and tooling from Apache and many other vendors, including Cloudera, Hortonworks, MapR, Amazon, IBM, Oracle and Microsoft. He also describes the pros and cons of each distribution and provides a decision tree for choosing the most appropriate one.

Kai Wähner
on Jul 09, 2013
Interview and Video Review: Working with Big Data: Infrastructure, Algorithms, and Visualizations

Paul Dix leads a practical exploration of Big Data in this video training series. The first five lessons of the training span multiple server systems, with a focus on the end-to-end processing of large quantities of XML data from real Stack Exchange posts. He completes the training with a lesson on developing visualizations for gaining insights from the macro-level analysis of Big Data.

Aslan Brooke
on May 02, 2013
Hadoop Virtual Panel

In this virtual panel, InfoQ talks to several Hadoop vendors and users about their views on the current and future state of Hadoop and the things that are most important for Hadoop’s further adoption and success.

Boris Lublinsky
on Nov 20, 2012
Interview with Arun Murthy on Apache YARN

Apache Hadoop YARN, a new Hadoop resource manager, has just been promoted to a high-level Hadoop subproject. InfoQ had the chance to discuss YARN with Arun Murthy, founder and architect at Hortonworks.

Boris Lublinsky
on Aug 17, 2012
Generating Avro Schemas from XML Schemas Using JAXB

Apache Avro is an up-and-coming binary marshalling framework. In his new article, Benjamin Fagin explains how one can leverage existing XSD tooling to create data definitions and then use an XJC plugin to directly generate Avro schemas and marshalling classes.

Benjamin Fagin
on Mar 06, 2012
