Facilitating the Spread of Knowledge and Innovation in Professional Software Development

Hadoop Content on InfoQ

  • Generating Avro Schemas from XML Schemas Using JAXB

    Apache Avro is an up-and-coming binary marshalling framework. In his new article, Benjamin Fagin explains how to leverage existing XSD tooling to create data definitions and then use an XJC plugin to directly generate Avro schemas and marshalling classes.

  • Exploring Hadoop OutputFormat

    As more companies adopt Hadoop, its integration with other applications is becoming more important. A key to such integration is choosing the appropriate OutputFormat, which makes it possible to produce output data in the form best suited to other applications.

  • Extending Oozie

    In this article, the authors show how to leverage Oozie's extensibility to implement custom language extensions. This approach can be viewed as specializing the workflow language for a given company or line of business.

  • An Open, Interoperable Cloud

    This article describes how interoperable clouds can be created today by integrating open standards such as the Open Cloud Computing Interface (OCCI), the Open Virtualisation Format (OVF) and CDMI. Together they provide the means to package virtual infrastructure deployments, an API for the runtime management of storage infrastructure and an API for the runtime management of infrastructure as a service.

  • Oozie by Example

    An end-to-end Oozie example, including process design, resource coordinator and workflow implementation.

  • Introduction to Oozie

    A basic introduction to Oozie, a framework for combining multiple Map/Reduce jobs into a logical unit of work.

  • Using Apache Avro

    Boris Lublinsky presents an introduction to Avro and evaluates its use for schema componentization, inheritance and polymorphism. He also discusses backward compatibility issues and Avro's solutions to them.

  • Data Mining in the Swamp: Taming Unruly Data With Cloud Computing

    Matrix presents a white paper on using the open source tool Hadoop to implement a MapReduce strategy and a cloud computing strategy to solve business intelligence problems.

  • Clojure and Rails - the Secret Sauce Behind FlightCaster

    FlightCaster, a real-time flight delay site, uses Clojure and Hadoop for its statistical analysis. The web frontend is built with Ruby on Rails and hosted on Heroku. We talked to Bradford Cross about Clojure, functional programming and tips for OOP developers interested in making the jump.

  • Yahoo's Doug Cutting on MapReduce and the Future of Hadoop

    InfoQ's lead Java editor, Scott Delap, recently caught up with Hadoop project lead Doug Cutting. Hadoop is an open source distributed computing platform that includes implementations of MapReduce and a distributed file system. In this special InfoQ interview, Cutting discusses how Hadoop is used at Yahoo, the challenges of its development, and the future direction of the project.
