Machine learning research scientist John Langford talks to QCon chair Wesley Reisz about his ML system Vowpal Wabbit used for news personalisation on MSN. They also discuss how to get started in the field and its shift from academic research to industry use.
The new “Hadoop in Practice. Second Edition” book by Alex Holmes provides a deep insight into Hadoop ecosystem covering a wide spectrum of topics such as data organization, layouts and serialization, data processing, including MapReduce and big data patterns, special structures along with their usage to simplify big data processing, and SQL on Hadoop data.
This article discusses what stream processing is, how it fits into a big data architecture with Hadoop and a data warehouse (DWH), when stream processing makes sense, and what technologies and products you can choose from.
Apache Hadoop YARN – a new Hadoop resource manager - has just been promoted to a high level Hadoop subproject. InfoQ had the chance to discuss YARN with Arun Murthy - founder of Hortonworks. 1