
InfoQ eMag: Hadoop

by InfoQ on May 19, 2014

About the Author

InfoQ.com is facilitating the spread of knowledge and innovation in enterprise software development. InfoQ content is currently published in English, Chinese, Japanese and Brazilian Portuguese. With a readership base of over 800,000 unique visitors per month reading content from 100 locally-based editors across the globe, we continue to build localized communities.

Hadoop 2, a major update over Hadoop 1, is no longer just about MapReduce. This eMag will delve into the various updates in Hadoop 2, including new projects such as Storm, Tez, Spark and others. Through various case studies, it will examine Hadoop architectures and some useful frameworks, and look at how teams use Hadoop in real-world projects.

Contents of the Hadoop eMag include:

  • Introduction - Apache Hadoop is an open-source framework that runs applications on large clusters of servers. It is designed to scale from a single server to thousands of machines, with a very high degree of fault tolerance.
  • Building Applications With Hadoop - When building applications using Hadoop, it is common to have input data arriving from various sources in various formats. In his presentation, “New Tools for Building Applications on Apache Hadoop”, Eli Collins gives an overview of how to build better products with Hadoop and of various tools that can help, such as Apache Avro, Apache Crunch, Cloudera ML and the Cloudera Development Kit.
  • What is Apache Tez? - Apache Tez is a new distributed execution framework that is targeted towards data-processing applications on Hadoop. But what exactly is it? How does it work? In the presentation, “Apache Tez: Accelerating Hadoop Query Processing”, Bikas Saha and Arun Murthy discuss Tez’s design, highlight some of its features and share initial results obtained by making Hive use Tez instead of MapReduce.
  • Modern Healthcare Architectures Built with Hadoop - We have heard plenty in the news lately about healthcare challenges and the difficult choices faced by hospital administrators, technology and pharmaceutical providers, researchers, and clinicians. At the same time, consumers are experiencing increased costs without a corresponding increase in health security or in the reliability of clinical outcomes.
  • How LinkedIn Uses Apache Samza - Apache Samza is a stream processor LinkedIn recently open-sourced. In his presentation, “Samza: Real-time Stream Processing at LinkedIn”, Chris Riccomini discusses Samza's feature set, how Samza integrates with YARN and Kafka, how it's used at LinkedIn, and what's next on the roadmap.

About InfoQ eMags

InfoQ eMags are professionally designed, downloadable collections of popular InfoQ content - articles, interviews, presentations, and research - covering the latest software development technologies, trends, and topics.

InfoQ.com and all content copyright © 2006-2014 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.