The InfoQ eMag: Modern Data Architectures, Pipelines, & Streams

In this eMag on “Modern Data Architectures, Pipelines, & Streams”, you’ll find up-to-date case studies and real-world data architectures from technology SMEs and leading data practitioners in the industry.

“Building & Operating High-Fidelity Data Streams” by Sid Anand highlights the importance of reliable and resilient data stream architectures. He explains how to build high-fidelity, loosely coupled data stream solutions from the ground up, with scalability, reliability, and operability built in, using messaging technologies like Apache Kafka.

Sharma Podila’s article on “Microservices to Async Processing Migration at Scale” emphasizes the importance of asynchronous processing and shows how it can improve the availability of a web service by relieving backpressure, using Apache Kafka to implement a durable queue between service layers.

“Streaming-First Infrastructure for Real-Time Machine Learning” by Chip Huyen nicely captures the benefits of streaming-first infrastructure for real-time ML scenarios like online prediction and continual learning.

And “Building End-to-End Field Level Lineage for Modern Data Systems”, authored by Mei Tao, Xuanzi Han, and Helena Muñoz, describes data lineage as a critical component of data pipeline root cause and impact analysis workflows, and explains how automating lineage creation and abstracting metadata to the field level helps with root cause analysis.

We at InfoQ hope that you find value in the articles and other resources shared in this eMag, and that you can apply the design patterns and techniques discussed in your own data architecture projects and initiatives.

We would love to receive your feedback about this eMag via editors@infoq.com or on Twitter. We hope you have a great time reading it!

The InfoQ eMag - Modern Data Architectures, Pipelines, & Streams includes:

  • Building & Operating High-Fidelity Data Streams - At QCon Plus 2021 last November, Sid Anand, chief architect at Datazoom and PMC Member at Apache Airflow, presented on building high-fidelity nearline data streams as a service within a lean team. In this talk, Anand provides a master class on building high-fidelity data streams from the ground up.
  • Migrating Netflix's Viewing History from Synchronous Request-Response to Async Events - Sharma Podila shares lessons from migrating to asynchronous processing at scale, which requires attention to managing data loss, highly available infrastructure, and elasticity to handle bursts.

  • Streaming-First Infrastructure for Real-Time Machine Learning - The benefits of streaming-first infrastructure for real-time ML are online prediction for fast responses and continual learning for adapting to changes in data distributions in production.

  • Building End-to-End Field Level Lineage for Modern Data Systems - In this article, the authors discuss data lineage as a critical component of data pipeline root cause and impact analysis workflows, and how automating lineage creation helps with root cause analysis.

InfoQ eMags are professionally designed, downloadable collections of popular InfoQ content - articles, interviews, presentations, and research - covering the latest software development technologies, trends, and topics.
