Adopting Big Data and Data Science technologies in an organisation is a transformative project, comparable to an agile transformation and facing many of the same challenges. In this article, the author describes such a project for a FTSE 100 financial services company.
Logging and aggregation are crucial tools for today's complex, distributed systems. They provide rich insights which keep time to recover short. We must therefore make sure we test logging adequately.
Yahoo uses Hadoop for a range of big data and machine learning use cases. It also uses deep learning techniques in products such as Flickr. InfoQ spoke with Peter Cnudde about how Yahoo leverages big data platform technologies.
Data Lake-as-a-Service provides big data processing in the cloud for business outcomes in a cost-effective way. InfoQ spoke with Lovan Chetty and Hannah Smalltree from Cazena about how these solutions work.
A new Eclipse Oozie plugin significantly simplifies the implementation of Oozie processes by allowing them to be defined graphically. This article introduces the plugin and provides an example of its usage.
Elixir in Action aims to introduce readers to Elixir and the Erlang virtual machine while also discussing concurrent programming, fault tolerance, and high availability.
An interview with Google's William Vambenepe, lead product manager for Big Data services, about the shift from products to services when working with Big Data.
F# Deep Dives is a new book that aims to show the business value F# brings in practice. It presents 11 industrial scenarios and their solutions in F# using a functional-first approach.
The new book, The Practice of Cloud System Administration: Designing and Operating Large Distributed Systems, looks at a wide range of considerations for cloud-scale systems.
In this article, Monica Beckwith, starting from core Hadoop components, investigates the design of a highly available, fault-tolerant Hadoop cluster, adding security and data-level isolation.
The new book “Hadoop in Practice, Second Edition” by Alex Holmes covers many topics on building Hadoop code and organizing data to support code simplicity and execution speed.
Datameer, a big data analytics application for Hadoop, introduced Datameer 5.0 with Smart Execution to enhance data analytics. InfoQ spoke with Matt Schumpert from Datameer about the new product.