Spark MLlib Content on InfoQ
Articles
Big Data Processing with Apache Spark - Part 5: Spark ML Data Pipelines
With its support for machine learning data pipelines, the Apache Spark framework is well suited to building a unified solution that combines ETL, batch analytics, streaming data analysis, and machine learning. In this fifth installment of the Apache Spark article series, author Srini Penchikala discusses the Spark ML package and how to use it to create and manage machine learning data pipelines.