
Introducing Reactive Streams

Modern software increasingly operates on data in near real-time. There is business value in sub-second responses to changing information, a speed that traditional batch-based architectures can’t match. Stream processing is one way to help turn data into knowledge as fast as possible, Kevin Webber explains in an introduction to Reactive Streams. Examples of systems he considers suitable for stream processing include ETL (extract, transform, load) and complex event processing (CEP) systems, as well as other reporting and analytics systems.

Webber, enterprise advocate at Typesafe, describes a stream as a series of elements emitted over time, potentially without beginning or end. An array of numbers has a fixed size, so calculating its mean is straightforward; a stream, by contrast, may be an infinite series of numbers, which makes how and when to calculate an average far less obvious. This highlights an important aspect of streams: it’s expected that not every single element of a stream will be processed. For Webber it also highlights the need to think in terms of movement, with ever-changing sets of data in constant motion.
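Because a stream may never end, an average can only ever describe the elements seen so far and must be updated as each new element arrives. A minimal sketch of this idea (the class and method names here are illustrative, not from the article or the specification):

```java
import java.util.stream.DoubleStream;

// Running mean over elements that arrive one at a time. Unlike a fixed-size
// array, we never see "all" the data, so the mean is revised per element.
public class RunningMean {
    private long count = 0;
    private double mean = 0.0;

    // Incorporate one element using the incremental mean update:
    // mean_n = mean_{n-1} + (x - mean_{n-1}) / n
    public void accept(double x) {
        count++;
        mean += (x - mean) / count;
    }

    // The mean of the elements processed so far.
    public double mean() {
        return mean;
    }

    public static void main(String[] args) {
        RunningMean rm = new RunningMean();
        // Simulate elements emitted over time; a real stream may never end.
        DoubleStream.of(1, 2, 3, 4, 5).forEach(rm::accept);
        System.out.println(rm.mean()); // prints 3.0
    }
}
```

The point is that the computation must be reframed: instead of "the mean of the data", it is "the mean of what has flowed past so far".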

Reactive Streams is a specification, and developers building systems work with an implementation of that specification. Its goal is to raise the level of abstraction: instead of leaving developers to deal with the low-level plumbing of streams, the specification defines the common problems, and the library implementations solve them.
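The specification itself is deliberately small: it consists of four Java interfaces plus a set of rules implementations must obey. In outline (in the actual library each is a public type in its own file in the org.reactivestreams package; they are shown package-private here so the sketch fits in one compilation unit):

```java
// A provider of a potentially unbounded number of sequenced elements.
interface Publisher<T> {
    void subscribe(Subscriber<? super T> s);
}

// A receiver of elements and of the stream's terminal signals.
interface Subscriber<T> {
    void onSubscribe(Subscription s); // handshake; gives access to request()
    void onNext(T t);                 // one element; never more than requested
    void onError(Throwable t);        // terminal failure signal
    void onComplete();                // terminal completion signal
}

// The link between one Publisher and one Subscriber.
interface Subscription {
    void request(long n); // back pressure: ask for up to n more elements
    void cancel();        // stop receiving elements
}

// A processing stage that is both a Subscriber and a Publisher.
interface Processor<T, R> extends Subscriber<T>, Publisher<R> {}
```

Everything else (threading, buffering, operators) is left to implementations, which is what is meant by the specification defining the problems and the libraries solving them.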

One major objective of the Reactive Streams specification is asynchronous boundaries as a means of decoupling the components of a system in time. In a synchronous world each function or operation is processed in order, one after another, with each one, except the first, depending on the previous one to complete its work. For Webber this creates a maintenance burden and hinders building fully responsive systems; he sees such a design as a responsiveness anti-pattern that hurts scalability and resilience. To fully exploit today’s multi-core CPU architectures, Webber argues, we need a completely different model that decouples components and thereby lets the functions in the synchronous example execute in parallel.
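A rough sketch of the difference, using the JDK's own CompletableFuture rather than a Reactive Streams library (the step names and values are invented for illustration): two independent computations that a synchronous design would run back-to-back can, once separated by an asynchronous boundary, run concurrently and be combined when both complete.

```java
import java.util.concurrent.CompletableFuture;

public class AsyncBoundary {
    // Two independent computations; in a synchronous design the second
    // would wait for the first even though it does not need its result.
    static int stepA() { return 21; }
    static int stepB() { return 2; }

    public static void main(String[] args) {
        // Each supplyAsync call crosses an asynchronous boundary: the work
        // is handed off to a thread pool and the caller is not blocked.
        CompletableFuture<Integer> a = CompletableFuture.supplyAsync(AsyncBoundary::stepA);
        CompletableFuture<Integer> b = CompletableFuture.supplyAsync(AsyncBoundary::stepB);

        // Combine the two results once both are available.
        int result = a.thenCombine(b, (x, y) -> x * y).join();
        System.out.println(result); // prints 42
    }
}
```

The caller no longer dictates when each step runs; the steps are decoupled in time, which is the property the specification's asynchronous boundaries are meant to provide between whole components.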

Another major objective is defining a model for dealing with back pressure. The ideal paradigm for stream processing is to push data from publisher to subscriber, letting the publisher operate as fast as possible; back pressure is a way to ensure that a fast publisher doesn’t overwhelm a slow subscriber. It provides resilience by working with flow control to ensure a steady state of operation and graceful degradation.
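The mechanism behind this is demand signalling: the subscriber tells the publisher how many elements it is ready for, and the publisher never emits more than that. A simplified, single-threaded sketch of the idea (illustrative only; the names here are not the actual Reactive Streams API, whose Subscription.request(long) plays the role of request below):

```java
import java.util.ArrayList;
import java.util.List;

public class BackPressureDemo {
    // Stand-in for a slow subscriber's element callback.
    interface Sink { void onNext(int element); }

    // A source of numbers that only emits while the sink has signalled demand.
    static class NumberSource {
        private int next = 0;    // next element to emit
        private long demand = 0; // outstanding, unfulfilled demand
        private final Sink sink;

        NumberSource(Sink sink) { this.sink = sink; }

        // The sink asks for up to n more elements (cf. Subscription.request(n)).
        void request(long n) {
            demand += n;
            // Emit only while demand remains: a fast source cannot overrun
            // a slow sink, because emission is bounded by what was requested.
            while (demand > 0) {
                sink.onNext(next++);
                demand--;
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> received = new ArrayList<>();
        NumberSource source = new NumberSource(received::add);
        source.request(3); // slow consumer asks for a small batch
        System.out.println(received); // prints [0, 1, 2]
        source.request(2); // asks for more only when it is ready
        System.out.println(received); // prints [0, 1, 2, 3, 4]
    }
}
```

Because the subscriber controls the pace, a backlog shows up as unserved demand at the publisher rather than an unbounded buffer at the subscriber, which is what makes the degradation graceful.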

In a recent blog post Jamie Allen, senior director at Typesafe, describes washing dishes from an asynchronous, non-blocking and concurrent perspective and compares this with how a reactive system would deal with an increasing workload.