Reactive Streams Releases First Stable Version for JVM
After more than a year on the drawing board, Reactive Streams has released version 1.0 of their API for several different platforms, Java among them. This library won't add capabilities that weren't previously available, but it will provide a common framework to standardise reactive patterns.
When Reactive Streams started their work, following the path established by the Reactive Manifesto, they indicated that their aim was to provide "a standard for asynchronous stream processing with non-blocking back pressure". The main challenge, however, wasn't to find a solution, in particular since there were already a number of them out there. The challenge was to coalesce the different existing patterns into a common one to maximise interoperability. More precisely, the objective of Reactive Streams was “to find a minimal set of interfaces, methods and protocols that will describe the necessary operations and entities to achieve the goal—asynchronous streams of data with non-blocking back pressure”.
The concept of "back pressure" is key. When an asynchronous consumer subscribes to receive messages from a producer, it will typically provide some form of callback method to be invoked whenever a new message becomes available. If the producer emits messages at a higher rate than the consumer can handle, the consumer could be forced to seize an increasing amount of resources and potentially crash. To prevent this, consumers need a mechanism to notify producers that the message rate must be reduced; this mechanism is called back pressure. Producers can then adopt one of several strategies to comply.
Blocking back pressure is easy to achieve. If, for instance, producer and consumer are running on the same thread, the execution of one will block the execution of the other. This means that, while the consumer is being executed, the producer cannot emit any new messages, so input and output balance naturally. However, there are scenarios where blocking back pressure is undesirable (for instance when a producer has multiple consumers, not all of them consuming messages at the same rate) or simply unattainable (for instance when consumer and producer run in different environments). In these cases the back pressure mechanism must work in a non-blocking way.
The way to achieve non-blocking back pressure is to move from a push strategy, where the producer sends messages to the consumer as soon as they are available, to a pull strategy, where the consumer requests a number of messages from the producer, which sends at most that many and then waits for further requests before sending any more. This is the strategy chosen by Reactive Streams, as can be deduced by analysing the interfaces.
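The request-then-send exchange described above can be sketched as follows. This is only an illustration, not the project's reference code: it uses the java.util.concurrent.Flow interfaces from Java 9 and later, which mirror the Publisher, Subscriber and Subscription interfaces of Reactive Streams, so that it runs without external dependencies. The names PullDemo, RangePublisher and BatchSubscriber are hypothetical, and everything runs on a single thread to keep the order of events easy to follow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

final class PullDemo {

    /** Hypothetical publisher: emits 1..count, never more than has been requested. */
    static final class RangePublisher implements Flow.Publisher<Integer> {
        private final int count;

        RangePublisher(int count) { this.count = count; }

        @Override
        public void subscribe(Flow.Subscriber<? super Integer> subscriber) {
            subscriber.onSubscribe(new Flow.Subscription() {
                private int next = 1;       // next value to emit
                private long demand = 0;    // items requested but not yet sent
                private boolean draining = false;
                private boolean done = false;

                @Override
                public void request(long n) {
                    if (done) return;
                    if (n <= 0) {
                        done = true;
                        subscriber.onError(new IllegalArgumentException("request must be positive"));
                        return;
                    }
                    // Accumulate demand, capping at Long.MAX_VALUE on overflow.
                    demand = (demand + n < 0) ? Long.MAX_VALUE : demand + n;
                    if (draining) return;   // re-entrant call from onNext: outer loop sees the new demand
                    draining = true;
                    while (!done && demand > 0 && next <= count) {
                        demand--;
                        subscriber.onNext(next++);
                    }
                    draining = false;
                    if (!done && next > count) {
                        done = true;
                        subscriber.onComplete();
                    }
                }

                @Override
                public void cancel() { done = true; }
            });
        }
    }

    /** Hypothetical subscriber: asks for `batch` items at a time and logs every signal. */
    static final class BatchSubscriber implements Flow.Subscriber<Integer> {
        final List<String> log = new ArrayList<>();
        private final int batch;
        private int outstanding;
        private Flow.Subscription subscription;

        BatchSubscriber(int batch) { this.batch = batch; }

        @Override
        public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            requestBatch();
        }

        @Override
        public void onNext(Integer item) {
            log.add("onNext(" + item + ")");
            if (--outstanding == 0) requestBatch();  // only ask for more once the batch is handled
        }

        @Override
        public void onError(Throwable t) { log.add("onError"); }

        @Override
        public void onComplete() { log.add("onComplete"); }

        private void requestBatch() {
            outstanding = batch;
            log.add("request(" + batch + ")");
            subscription.request(batch);
        }
    }

    /** Wires a publisher of `count` items to a subscriber pulling `batch` at a time. */
    static List<String> run(int count, int batch) {
        BatchSubscriber subscriber = new BatchSubscriber(batch);
        new RangePublisher(count).subscribe(subscriber);
        return subscriber.log;
    }

    public static void main(String[] args) {
        run(5, 2).forEach(System.out::println);
    }
}
```

With five items and a batch size of two, the log shows each request followed by at most two onNext signals, and the publisher stays idle until the subscriber asks again. A real implementation would also have to cope with signals arriving from different threads, which this sketch deliberately leaves out.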
It is worth noting that non-blocking back pressure was already available to some degree in Java 8 through the newly introduced java.util.stream API. As explained by Raoul-Gabriel Urma, co-author of “Java 8 in Action”, the new streams in Java 8, together with all the operations that can be executed upon them, already follow the aforementioned pull strategy, therefore providing a form of non-blocking back pressure. However, this isn't incompatible with defining a standard for reactive systems. In fact, work is already under way to make the Java implementation of Reactive Streams part of Java 9: Doug Lea, leader of JSR 166, which added concurrency utilities to Java, has proposed a new Flow class that will include the interfaces currently provided by Reactive Streams. Doug justified his proposal by noting that “there is no single best fluent async/parallel API. CompletableFuture/CompletionStage best supports continuation-style programming on futures, and java.util.stream best supports (multi-stage, possibly-parallel) ‘pull’ style operations on the elements of collections. Until now, one missing category was ‘push’ style operations on items as they become available from an active source”.
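The pull behaviour of Java 8 streams mentioned above can be observed with a small experiment. The sketch below (the class name LazyStreamDemo and its method are hypothetical) builds a stream over an infinite source and counts how many elements the source actually produces when the terminal operation only needs a few of them.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class LazyStreamDemo {

    /**
     * Returns the first n squares, incrementing `produced` once for every
     * element that the infinite source hands to the pipeline.
     */
    static List<Integer> firstSquares(int n, AtomicInteger produced) {
        return Stream.iterate(1, i -> i + 1)               // infinite source: 1, 2, 3, ...
                .peek(i -> produced.incrementAndGet())     // count what the source emits
                .map(i -> i * i)
                .limit(n)                                  // downstream only wants n elements
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        AtomicInteger produced = new AtomicInteger();
        List<Integer> squares = firstSquares(3, produced);
        System.out.println(squares + " from " + produced.get() + " source elements");
    }
}
```

Although the source is unbounded, a sequential stream only draws as many elements from it as the downstream stages ask for, so the counter stops at the limit instead of growing indefinitely.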
Typesafe, one of the main contributors to Reactive Streams, indicated during a public webinar some of the most common applications of this technology: bulk data transfers, real-time data sources, batch processing of large data sets, monitoring and analytics. However, the most interesting and promising applications of Reactive Streams may come from the field of distributed computing.
There are two main paradigms in distributed computing: the radial paradigm, where a central system distributes workload among node computers and then gathers the results, and the mesh paradigm, where there is no central entity, just a number of potentially heterogeneous devices connected to each other. Although both types of distributed computing networks could benefit from Reactive Streams, it is the second type that can benefit the most.
Radial networks tend to be planned for, and therefore carefully engineered and monitored. Mesh networks, by contrast, tend to form spontaneously through the connection of lower-end devices, like a network of sensors or people's mobile phones. Due to the inherent unpredictability of mesh networks, the resiliency provided by Reactive Streams is key to preventing the collapse of the network. We spoke with José Luis Fernandez-Marquez, lead developer of the SAPERE project at the Institute of Services Science of the University of Geneva (Switzerland). José Luis talked about five main groups of applications where mesh networks could potentially benefit from Reactive Streams:
Massive congregations of people and/or devices where traditional antennae saturate and communications aren’t possible, such as traffic jams, music festivals and other kinds of events. In these situations mobile phones could enable communications by creating a spontaneous mesh network, where only some of the devices would talk to the main antennae, and the rest would establish communication indirectly through the spontaneous network.
Scenarios where basic communication infrastructure is unavailable, such as after natural disasters or in underdeveloped areas. Communication could be made possible by setting up a mesh network with whatever devices are available.
Ultra-low latency applications, such as car-to-car communications or hovering information. Vehicles could inform each other of crashes, ice formations and other threats, provided a network has been formed among them.
Localised data of localised interest, more specifically when data is only of interest in the same area where it is produced, like live-traffic data. Since users of city A are unlikely to need live-traffic data of city B, there is no need to establish central servers to collate and reproduce these data; instead, users in city A could share this information through a mesh network.
Privacy. A mesh network could be used to allow communication exchanges without going through a central (and potentially censoring) entity. Applications like FireChat already follow this pattern.
All the above, and probably more, would produce a network of unpredictable traffic with unpredictable bottlenecks. Without the relevant mechanism to handle communications, nodes could saturate and drop easily, potentially affecting the network itself. José Luis thinks Reactive Streams has the potential to be part of that mechanism.