Yahoo uses Hadoop for different use cases in big data & machine learning areas. They also use deep learning techniques in their products like Flickr. InfoQ spoke with Peter Cnudde on how Yahoo leverages big data platform technologies.
Machine learning research scientist John Langford talks to QCon chair Wesley Reisz about his ML system Vowpal Wabbit used for news personalisation on MSN. They also discuss how to get started in the field and its shift from academic research to industry use.
Data Lake-as-a-Service solutions provide big data processing in the cloud for faster business outcomes in a very cost effective way. InfoQ spoke with Lovan Chetty and Hannah Smalltree from Cazena team about how Data Lake as a Service works.
A new Eclipse Oozie plugin allows to significantly simplify implementation of Oozie processes by allowing to define them graphically. An article introduces plugin and provides an example of its usage. 1
An interview with Google's William Vambenepe, who's lead product manager for Big Data services, to ask him about the shift from products to services when working with Big Data.
In this article Monica Beckwith, starting from core Hadoop components, investigates the design of a highly available, fault tolerant Hadoop cluster, adding security and data-level isolation.
The new “Hadoop in Practice. 2 Edition" book by Alex Holmes covers a lot of topics building Hadoop code and organizing data to support code simplicity and execution speed.
Datameer, a big data analytics application for Hadoop, introduced Datameer 5.0 with Smart Execution to enhance the data analytics. InfoQ spoke with Matt Schumpert from Datameer about the new product.
This article discusses what stream processing is, how it fits into a big data architecture with Hadoop and a data warehouse (DWH), and what technologies and products you can choose from. 8
GridGain announced In-Memory Accelerator for Hadoop, offering benefits of in-memory computing to Hadoop applications. InfoQ spoke with Nikita Ivanov from GridGain about the product's architecture.
InfoQ spoke with Rich Reimer, VP of Marketing and Product Management at Splice Machine about the architecture and data patterns for SQL-on-Hadoop technologies.
Lambda Architecture proposes a simpler, elegant paradigm designed to process large amounts of data. In this article, author discusses Lambda Architecture with the help of a sample Java application. 20