One of the main mechanisms behind Reactive Programming is backpressure. In this article we’ll take a look at the concept and some of the advantages it brings when working with Reactive Streams.
The concept of backpressure in Reactive Streams is as elegant as it is powerful. It enables slow Consumers to take part in Reactive applications without them “choking up” on too much information.
Reactive Streams are inherently message-driven. They can best be compared to pipelines through which information flows from one point to another. The pipelines can be merged or split up, and the data flowing through them can be filtered, mapped, or combined with data from another Reactive Stream that gets added on.
In Project Reactor these pipelines are mainly made up of Fluxes. A Flux generates or receives information from a source. This source will often be another Flux, or information streaming in through a Reactive driver. Think for example of user input, messages streaming in from a messaging system like RabbitMQ or Kafka, or even the result set of a database query.
This all works in a message-driven way. Using event loops and only a couple of Threads, Project Reactor ensures that only the Fluxes that actually have work to do spend any time working. It achieves this by having upstream Fluxes pass data to downstream Fluxes, which react to the new events streaming in. Subscribers expose an “onNext”, “onComplete”, and “onError” method through which new data is passed on.
The “onNext” method passes on a new piece of data, like some user input, so the Flux can act on it. The “onComplete” method is called when the upstream Flux has finished, for example at the end of streaming the result set of a query. Finally, the “onError” method is called in case an error condition occurs upstream.
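The JDK’s java.util.concurrent.Flow API defines the same Subscriber interface that Project Reactor builds on, so it makes a handy way to sketch these three callbacks without pulling in Reactor itself. The class name below is my own, not part of any library:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;

// Minimal Subscriber sketch showing the three Reactive Streams callbacks.
class CollectingSubscriber implements Flow.Subscriber<String> {
    final List<String> received = new ArrayList<>();
    final CountDownLatch done = new CountDownLatch(1);
    private Flow.Subscription subscription;

    @Override
    public void onSubscribe(Flow.Subscription subscription) {
        this.subscription = subscription;
        subscription.request(1);     // signal demand for the first element
    }

    @Override
    public void onNext(String item) {
        received.add(item);          // a new piece of data streamed in
        subscription.request(1);     // ask for the next one
    }

    @Override
    public void onComplete() {
        done.countDown();            // the upstream source has finished
    }

    @Override
    public void onError(Throwable error) {
        done.countDown();            // an error condition occurred upstream
    }
}
```

A SubmissionPublisher from the same package can play the role of the upstream source: subscribing this class to it will deliver each submitted item through “onNext” and finish with “onComplete” once the publisher is closed.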
This setup works very well and offers a lot of scalability. Especially when dealing with a “slow producer”, like a user’s keyboard or a file being loaded, data can be processed quickly.
That’s different when we deal with slow Consumers, however. Take for example a computationally expensive mapping function, an update to a database, or saving a file. In those cases, so much data might be streamed into the Consumer that it completely “chokes up” on everything flowing in, and crashes with an OutOfMemoryError.
Project Reactor prevents this through the use of backpressure. When a Consumer subscribes to a Producer, it receives a Subscription. This enables a feedback mechanism from the Consumer of the data stream back to its Producer. Through it, the Consumer can signal how many data events it is able to handle.
When the Consumer signals that it can handle 5 data events, the Producer will invoke the Consumer’s “onNext” method at most 5 times. After consuming these 5 events, the Consumer asks the Producer for more, until an “onComplete” or “onError” call occurs instead.
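This request-in-batches behaviour can be sketched with the same Flow API: the Subscriber below (a hypothetical name, not from Reactor) requests 5 events up front, and only asks for the next 5 once the current batch has been consumed:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of demand signalling: request events in batches of 5.
class BatchedSubscriber implements Flow.Subscriber<Integer> {
    static final int BATCH = 5;           // demand signalled per request
    final AtomicInteger received = new AtomicInteger();
    final CountDownLatch done = new CountDownLatch(1);
    private Flow.Subscription subscription;
    private int pending;                  // events left in the current batch

    @Override
    public void onSubscribe(Flow.Subscription subscription) {
        this.subscription = subscription;
        pending = BATCH;
        subscription.request(BATCH);      // “I can handle 5 events”
    }

    @Override
    public void onNext(Integer item) {
        received.incrementAndGet();
        if (--pending == 0) {             // batch consumed: ask for 5 more
            pending = BATCH;
            subscription.request(BATCH);
        }
    }

    @Override
    public void onComplete() { done.countDown(); }

    @Override
    public void onError(Throwable error) { done.countDown(); }
}
```

The Producer never calls “onNext” more often than the outstanding demand allows, so the Consumer controls the pace at which data arrives.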
Because of this feedback mechanism, a Consumer will never be overloaded with messages from a Producer, but can instead work through them at its own pace. When the Consumer gets through the events it is passed quickly, it will start requesting them at a faster pace, until an equilibrium is reached.
When a Producer is a Consumer as well, as is the case for most Fluxes, it will of course also have backpressure built in. This means the concept of backpressure runs through the entire Reactive Pipeline, slowing the Reactive Stream down to the pace of the slowest Consumer.
There are many additional ways to configure how your Reactive Streams handle backpressure from downstream Subscribers. A buffer size can be supplied, together with a strategy to trigger in case an overflow would happen, like ERROR, DROP_LATEST, and DROP_OLDEST.
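As a rough illustration of what a DROP_OLDEST strategy does on overflow, here is a minimal bounded buffer in plain Java. The class is my own sketch, not Reactor’s implementation: when a slow Consumer lets the buffer fill up, the oldest element is discarded to make room for the newest one:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical helper: a bounded buffer applying a DROP_OLDEST policy.
class DropOldestBuffer<T> {
    private final int capacity;
    private final Deque<T> buffer = new ArrayDeque<>();

    DropOldestBuffer(int capacity) {
        this.capacity = capacity;
    }

    void offer(T item) {
        if (buffer.size() == capacity) {
            buffer.pollFirst();      // DROP_OLDEST: discard the oldest element
        }
        buffer.addLast(item);        // the newest element always gets in
    }

    List<T> drain() {
        List<T> out = List.copyOf(buffer);
        buffer.clear();
        return out;
    }
}
```

With a capacity of 3, offering the values 1 through 5 leaves only the three newest elements in the buffer; an ERROR strategy would instead fail the stream, and DROP_LATEST would discard the incoming element rather than the oldest one.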
We can conclude that the concept of backpressure is essential to Project Reactor. By enabling a feedback mechanism in the messaging between the Reactive Pipeline components, it adds a great deal of stability and flexibility to our Reactive applications.