AI, ML & Data Engineering

Simplifying ML Workflows with Apache Beam

Posted by Tyler Akidau on Jun 19, 2018

Tyler Akidau discusses how Apache Beam is simplifying pre- and post-processing for ML pipelines.

AI, ML & Data Engineering

Orchestrating Data Microservices with Spring Cloud Data Flow

Posted by Mark Pollack on Jun 09, 2018

Mark Pollack discusses how to create data integration and real-time data processing pipelines using Spring Cloud Data Flow and deploy them to multiple platforms – Cloud Foundry, Kubernetes, and YARN.

AI, ML & Data Engineering

Data Pipelines for Real-Time Fraud Prevention at Scale

Posted by Mikhail Kourjanski on May 23, 2018

Mikhail Kourjanski discusses the architecture of PayPal’s data service, which combines a Big Data approach with real-time data delivery for decision making in fraud detection.

AI, ML & Data Engineering

Developing Data and ML Pipelines at Stitch Fix

Posted by Jeff Magnusson on May 15, 2018

Jeff Magnusson shares thoughts and guidelines on how Stitch Fix develops, schedules, and maintains its data and ML pipelines.

Architecture & Design

Streaming Reactive Systems & Data Pipes w. Squbs

Posted by Anil Gursel, Akara Sucharitakul on May 04, 2018

Anil Gursel and Akara Sucharitakul focus on modeling and building software that treats all input and all output as streams of events, and introduce Squbs.

Architecture & Design

Scaling Uber's Elasticsearch Clusters

Posted by Danny Yuan on Apr 11, 2018

Danny Yuan talks about how Uber scaled its Elasticsearch clusters and ingestion pipelines, covering ingestion, queries, data storage, and operations, with a three-person team.

AI, ML & Data Engineering

Effective Data Pipelines: Data Mngmt from Chaos

Posted by Katharine Jarmul on Mar 29, 2017

Katharine Jarmul discusses implementation decisions for those looking for a practical recommendation on the "what" and "how" of data automation workflows.

AI, ML & Data Engineering

Building Data Pipelines in Python

Posted by Marco Bonzanini on Mar 28, 2017

Marco Bonzanini discusses the process of building data pipelines and all the steps necessary to prepare data, focusing on data plumbing and going from prototype to production.

AI, ML & Data Engineering

Cloud Native Streaming and Event-driven Microservices

Posted by Marius Bogoevici on Jan 14, 2017

Marius Bogoevici demonstrates how to create complex data processing pipelines that bridge big data and enterprise integration, and how to orchestrate them with Spring Cloud Data Flow.

AI, ML & Data Engineering

Spring and Big Data

Posted by Thomas Risberg on Jan 08, 2017

Thomas Risberg discusses developing big data pipelines with Spring, focusing on the code needed, and covers how to set up a test environment both locally and in the cloud.

AI, ML & Data Engineering

Data Microservices in the Cloud

Posted by Mark Pollack on Jan 08, 2017

Mark Pollack introduces Spring Cloud Data Flow, which enables creating pipelines for data ingestion, real-time analytics, and data import/export, and demos apps deployed onto multiple runtimes.

AI, ML & Data Engineering

Hydrator: Open Source, Code-Free Data Pipelines

Posted by Jonathan Gray on Oct 23, 2016

Jonathan Gray introduces Hydrator, an open source framework and user interface for creating data lakes and for building and managing data pipelines on Spark, MapReduce, Spark Streaming, and Tigon.
