With support for machine learning data pipelines, the Apache Spark framework is a great choice for building a unified use case that combines ETL, batch analytics, streaming data analysis, and machine learning. In this fifth installment of the Apache Spark article series, author Srini Penchikala discusses the Spark ML package and how to use it to create and manage machine learning data pipelines.
In this week's podcast, InfoQ’s editor-in-chief Charles Humble talks to data scientist Cathy O’Neil. Topics discussed include her book “Weapons of Math Destruction,” predictive policing models, the teacher value-added model, approaches to auditing algorithms, and whether government regulation of the field is needed.
The book “Spark GraphX in Action” from Manning Publications, authored by Michael Malak and Robin East, provides tutorial-based coverage of Spark GraphX, the graph data processing library of the Apache Spark framework. InfoQ spoke with the authors about the book and the Spark GraphX library, as well as the overall Spark framework and what's coming up in the area of graph data processing and analytics.
“Big Data has plenty of evangelists, but I’m not one of them,” writes Cathy O’Neil, a blogger (mathsbabe.org) and former quantitative analyst at the hedge fund DE Shaw.
Christine Doig spoke at the OSCON conference about data science as a team discipline and how to navigate the Python data science ecosystem. InfoQ spoke with Christine about the challenges data science teams face.
Machine learning research scientist John Langford talks to Wesley Reisz about his ML system Vowpal Wabbit, used for news personalisation on MSN.
In this fourth installment of the Apache Spark article series, author Srini Penchikala discusses machine learning concepts and the Spark MLlib library for running predictive analytics, using a sample application.
In the book Spark in Action, authors Petar and Marko discuss using Apache Spark to process batch and streaming data. InfoQ spoke with them about the Spark framework, developer tools, and upcoming features.
These days, people worry about whether AI will surpass human intelligence. Prof. Juergen Schmidhuber will answer your questions and tell you more about deep learning as well as the latest trends in AI.
In this article, the author discusses survival prediction for colorectal cancer as a multi-class classification problem and how to solve that problem using Apache Spark's MLlib Java API.
This article covers machine learning and cognitive computing and how they relate to artificial intelligence (AI). Panelists discuss how this technology is applied in the digital marketing space.