
Three Experts on Big Data Engineering

Posted by Clemens Szyperski, Martin Petitclerc, Roger Barga on Mar 12, 2017

Clemens Szyperski (Microsoft), Martin Petitclerc (IBM), and Roger Barga (Amazon Web Services) talk about the challenges of building scalable big data systems and how to address them.

Data Preprocessing vs. Data Wrangling in Machine Learning Projects

Posted by Kai Wähner on Mar 05, 2017

This article compares alternative techniques for preparing data, including extract-transform-load (ETL) batch processing, streaming ingestion, and data wrangling.

Learning Paths: QCon London Expert Recommendations

Posted by Wesley Reisz on Feb 16, 2017

Advice on the best talks to attend at QCon London 2017 from London Thought Leaders.

Q&A with Immuta on the Implications of EU’s General Data Protection Regulation (GDPR)

Posted by Manuel Pais on Feb 10, 2017

InfoQ talked with Immuta’s Andrew Burt and Steve Touw to better understand the implications and challenges of the EU’s General Data Protection Regulation, which will come into effect in May 2018.

Cassandra: The Definitive Guide, 2nd Edition Book Review and Interview

Posted by Srini Penchikala on Jan 05, 2017

Cassandra: The Definitive Guide, 2nd Edition, by Jeff Carpenter and Eben Hewitt, covers version 3.0 of the Cassandra NoSQL database. InfoQ spoke with co-author Jeff Carpenter.

Article Series: Getting a Handle on Data Science

Posted by Francine Bennett on Dec 05, 2016

In this series we explore ways of making sense of data science - understanding where it’s needed and where it’s not, and how to make it an asset for you, from people who’ve been there and done it.

Peter Cnudde on How Yahoo Uses Hadoop, Deep Learning and Big Data Platform

Posted by Srini Penchikala on Oct 13, 2016

Yahoo uses Hadoop for a range of big data and machine learning use cases. InfoQ spoke with Peter Cnudde about how Yahoo leverages big data technologies.

Traffic Data Monitoring Using IoT, Kafka and Spark Streaming

Posted by Amit Baghel on Sep 28, 2016

The Internet of Things (IoT) is an emerging technology, and connected vehicles are one of its application areas. In this article, we'll use Spark and Kafka to analyse and process data from IoT connected vehicles.

Big Data Processing with Apache Spark - Part 5: Spark ML Data Pipelines

Posted by Srini Penchikala on Sep 24, 2016

In this fifth installment of the Apache Spark article series, author Srini Penchikala discusses the Spark ML package and how to use it to create and manage machine learning data pipelines.
