The Oldest Trick in Distributed Systems Is Still the One Engineers Keep Forgetting

Backpressure — the practice of letting slower consumers signal upstream producers to slow down — solves a class of system failures that queues, retries, and autoscaling alone cannot.

Written by Iris Vale · Bureau Tech · Jun 1, 2026

What backpressure actually means

Backpressure is a flow-control technique. When a downstream component — a database, a queue consumer, a microservice — is processing work more slowly than it is receiving it, backpressure gives it a mechanism to signal that fact upstream, so the producer can slow down, pause, or shed load rather than continuing to flood the system.

The term comes from fluid dynamics, where it describes resistance in a pipe that pushes back against flow. In computing, the analogy holds: without resistance, a fast producer and a slow consumer will eventually exhaust memory, saturate a queue, or trigger cascading failures across dependent services.

Why this keeps being rediscovered

Backpressure is not a new idea. TCP's flow control, in which the receiver advertises a window that caps how much unacknowledged data the sender may have in flight, is backpressure baked into one of the internet's foundational protocols. TCP's congestion window applies the same idea on behalf of the network path between the two hosts. Unix pipes implement it at the operating system level: a writer blocks when the pipe buffer is full until the reader catches up.
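The pipe behavior can be observed directly. The sketch below assumes a Unix-like system and puts the write end in non-blocking mode, so the kernel reports a full buffer as an error instead of putting the writer to sleep. The helper name fill_pipe and the chunk size are illustrative choices, and the buffer size itself varies by platform.

```python
import os


def fill_pipe(write_fd, chunk=b"x" * 1024):
    """Write to a non-blocking pipe until the kernel buffer is full.

    Returns the number of bytes accepted before the kernel pushed back.
    """
    os.set_blocking(write_fd, False)  # report "full" instead of blocking
    accepted = 0
    try:
        while True:
            accepted += os.write(write_fd, chunk)
    except BlockingIOError:
        # The pipe buffer is full: this is the kernel's backpressure signal.
        # A writer in the default blocking mode would be suspended here.
        return accepted


read_fd, write_fd = os.pipe()
accepted = fill_pipe(write_fd)

# Once the reader drains data, the writer can make progress again.
drained = os.read(read_fd, accepted)
resumed = os.write(write_fd, b"y")

os.close(read_fd)
os.close(write_fd)
```

Nothing in the writer's code sets a limit. The limit comes from the reader's pace, enforced by the buffer between them.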

Reactive programming frameworks, including those following the Reactive Streams specification (a standard for asynchronous stream processing with non-blocking backpressure, formalized in 2015 and later incorporated into Java 9), made backpressure a first-class concept at the application layer.
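The core idea of that specification is that a subscriber states how many items it is ready for, and the publisher may not exceed that demand. The following is a simplified, synchronous Python analogy of that contract, not the Reactive Streams or java.util.concurrent.Flow API. The class DemandDrivenSource and its request method are hypothetical names.

```python
class DemandDrivenSource:
    """Hypothetical source that emits only as many items as were requested."""

    def __init__(self, items):
        self._items = iter(items)
        self._completed = False

    def request(self, n, on_next, on_complete):
        """Deliver at most n items to on_next; signal completion once."""
        for _ in range(n):
            if self._completed:
                return
            try:
                item = next(self._items)
            except StopIteration:
                self._completed = True
                on_complete()
                return
            on_next(item)


received = []
events = []
source = DemandDrivenSource(range(5))

source.request(2, received.append, lambda: events.append("complete"))
first_batch = list(received)  # the source stops here until asked again
source.request(2, received.append, lambda: events.append("complete"))
source.request(2, received.append, lambda: events.append("complete"))
```

The consumer controls the pace: no item is produced unless it has been asked for, so there is nothing to buffer or drop.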

And yet, as Lucas F. Costa argues in a detailed technical post, the pattern is routinely absent from the service architectures engineers build today. The reasons are structural: microservices communicate over HTTP or message queues that, by default, do not propagate capacity signals backward through a call chain. A service that is overwhelmed returns errors or times out; it does not, unless explicitly designed to, tell its callers to slow down.

The failure modes backpressure prevents

The most common failure pattern in its absence is the retry storm. A slow downstream service begins returning errors. Callers, following standard reliability guidance, retry. Retries increase load on an already-struggling service. The service degrades further. More retries follow. The cycle is self-reinforcing and can take down services that were healthy before the incident began.

Unbounded queues create a related problem. A queue that accepts work indefinitely appears to absorb load spikes gracefully, until it doesn't. When the consumer falls far enough behind, the queue grows until memory is exhausted, or until the wait before any given item is processed becomes so long that the result is effectively useless. The queue provides the illusion of resilience without the substance of it.
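The alternative is a queue with a fixed capacity that tells the producer when it is full. A minimal sketch using Python's standard queue module follows. The helper name offer and the capacity of 2 are assumptions chosen for illustration.

```python
import queue


def offer(work_queue, item):
    """Try to enqueue without blocking; return False if the queue is full."""
    try:
        work_queue.put_nowait(item)
        return True
    except queue.Full:
        # The bound has been reached: the caller learns this immediately
        # and can slow down, retry later, or drop the item on purpose.
        return False


jobs = queue.Queue(maxsize=2)  # bounded; maxsize=0 would be unbounded
results = [offer(jobs, n) for n in range(3)]
```

The third offer is refused rather than silently accepted, which turns a hidden memory problem into an explicit signal the producer has to handle.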

Autoscaling, a common proposed remedy, addresses neither problem directly. It adds capacity reactively, after a signal (CPU utilization, queue depth, request latency) crosses a threshold. The lag between that signal and new capacity coming online, often tens of seconds to several minutes depending on how instances are provisioned, is long enough for a spike to cause significant damage. Backpressure, by contrast, shapes load in real time.
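A back-of-the-envelope calculation shows why the lag matters. The rates and the two-minute lag below are invented for illustration, and the function names are hypothetical.

```python
def backlog_after_lag(arrival_rate, service_rate, lag_seconds):
    """Requests left unserved while waiting for new capacity to come online."""
    excess = max(0, arrival_rate - service_rate)
    return excess * lag_seconds


def drain_time(backlog, service_rate, arrival_rate):
    """Seconds needed to clear a backlog once capacity exceeds arrivals."""
    spare = service_rate - arrival_rate
    if spare <= 0:
        return float("inf")  # the backlog never drains
    return backlog / spare


# Assumed numbers: 1,500 req/s arriving, 1,000 req/s capacity, 120 s lag.
backlog = backlog_after_lag(arrival_rate=1500, service_rate=1000, lag_seconds=120)

# Assumed: capacity doubles to 2,000 req/s once scaling completes.
recovery = drain_time(backlog, service_rate=2000, arrival_rate=1500)
```

Under these assumptions, 60,000 requests pile up before the new capacity arrives, and clearing them takes another two minutes even after capacity has doubled.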

What implementing it actually requires

Backpressure is not a library you add; it is a design decision that propagates through a system. A service that wants to apply backpressure must first be able to measure its own capacity — how much work it can accept without degrading. It must then expose that signal in a form callers can act on: an HTTP 429 (Too Many Requests) response, a queue that stops accepting messages, a gRPC flow-control signal.
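One simple way to approximate "how much work can I accept" is a cap on concurrent requests. The sketch below is framework-agnostic and illustrative only: AdmissionController, its handle method, and the one-second Retry-After hint are assumptions, and a real service would tune the limit from measurements.

```python
import threading


class AdmissionController:
    """Hypothetical limiter: at most max_in_flight requests run at once."""

    def __init__(self, max_in_flight, retry_after_seconds=1):
        self._slots = threading.BoundedSemaphore(max_in_flight)
        self._retry_after = retry_after_seconds

    def handle(self, work):
        """Run work() if a slot is free; otherwise answer 429 without queuing."""
        if not self._slots.acquire(blocking=False):
            # Over capacity: tell the caller to back off instead of piling on.
            return 429, {"Retry-After": str(self._retry_after)}, None
        try:
            return 200, {}, work()
        finally:
            self._slots.release()


controller = AdmissionController(max_in_flight=1)

# Simulate a second request arriving while the first is still in progress.
nested = controller.handle(lambda: controller.handle(lambda: "inner"))

# After the first request finishes, the slot is free again.
after = controller.handle(lambda: "ok")
```

The rejected request costs almost nothing to refuse, which is the point: the service protects the work it has already accepted.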

Callers must be designed to receive and respect that signal. This is the part that is most often missing. A service that returns 429 to a caller that simply retries immediately has not implemented backpressure; it has implemented a more expensive version of the original problem.
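A caller that respects the signal might look like the sketch below. It assumes a hypothetical send callable that returns a (status, headers, body) tuple. It honors a Retry-After header given in seconds, and otherwise falls back to exponential backoff with jitter. The function name and default values are illustrative.

```python
import random
import time


def call_with_backpressure(send, max_attempts=5, base_delay=0.5, max_delay=30.0,
                           sleep=time.sleep, rng=random.random):
    """Call send() until it stops returning 429 or attempts run out.

    send() is a hypothetical callable returning (status, headers, body).
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # give up rather than keep adding load
        retry_after = headers.get("Retry-After")
        if retry_after is not None and retry_after.isdigit():
            # The server's explicit hint wins. Retry-After may also be an
            # HTTP date; this sketch only handles the seconds form.
            delay = float(retry_after)
        else:
            # Exponential backoff with full jitter, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            delay = rng() * ceiling
        sleep(delay)
    return 429, None


# Demonstration with a fake transport and a recorded (not real) sleep.
responses = iter([(429, {"Retry-After": "2"}, None), (429, {}, None), (200, {}, "ok")])
delays = []
result = call_with_backpressure(lambda: next(responses),
                                sleep=delays.append, rng=lambda: 1.0)
```

The jitter spreads out callers that were rejected at the same moment, so they do not all return at once and recreate the spike.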

Finally, there is the question of what happens at the edge of the system — the point where load cannot be pushed further upstream. At that boundary, a system must make an explicit choice: shed load (drop requests, return errors to end users), buffer (accept the work but delay it), or surface the constraint visibly so that operators can respond. Each choice has tradeoffs. None of them is free. The value of backpressure is not that it eliminates that choice, but that it makes the choice deliberate rather than accidental.

The broader argument

Costa's framing — that backpressure is 'all you need' — is deliberately provocative. The stronger version of the claim is that a large proportion of distributed systems reliability failures are, at their root, backpressure failures: systems that lacked a mechanism to say 'not yet' and paid for it when load exceeded capacity.

That framing is speculative in its generality, and should be read as such. What is less speculative is the narrower point: backpressure is a well-understood, well-proven mechanism that remains underimplemented in application-layer service design, and the cost of that gap shows up reliably in incident reports.

Key takeaways

Backpressure lets a slow or overwhelmed consumer tell its producer to pause or reduce throughput, rather than silently dropping work or crashing.
Without backpressure, unbounded queues and aggressive retry logic can amplify load spikes into full system failures. The retry-driven version of this pattern is sometimes called a 'retry storm.'
The mechanism is well-established in protocols like TCP and in reactive programming frameworks, but is frequently absent from application-layer service design.
Autoscaling is not a substitute: it adds capacity reactively and too slowly to absorb sudden spikes, whereas backpressure shapes load before it becomes a problem.
Implementing backpressure requires explicit design choices, including what to do at the edge of the system, where load cannot be pushed further upstream: shed load, buffer, or surface the constraint so that operators can respond.

FAQ

Is backpressure the same as rate limiting?

They are related but distinct. Rate limiting is typically enforced by the receiver and caps how much traffic it will accept, often on a per-client basis. Backpressure is a signaling mechanism: the receiver communicates its current capacity to the sender so the sender can adjust its behavior. Rate limiting is a policy; backpressure is a feedback loop. In practice, a 429 response can serve both functions, but the intent and implementation differ.

Does using a message queue solve the backpressure problem?

Not automatically. A queue decouples producer and consumer in time, which helps absorb short bursts. But if the queue is unbounded and the consumer is persistently slower than the producer, the queue grows indefinitely. Backpressure requires the queue to signal — either by refusing new messages, slowing producers, or surfacing queue depth as a metric that triggers upstream action. The queue is a tool; backpressure is a design property.
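One way to turn queue depth into an upstream signal is a pair of thresholds: pause producers when depth reaches a high mark, and resume only after it falls to a low mark. The sketch below is illustrative, and the class name WatermarkGate and the thresholds 20 and 80 are assumptions.

```python
class WatermarkGate:
    """Hypothetical gate: pause producers at a high mark, resume at a low mark."""

    def __init__(self, low, high):
        if not 0 <= low < high:
            raise ValueError("expected 0 <= low < high")
        self.low = low
        self.high = high
        self.paused = False

    def update(self, depth):
        """Record the current queue depth; return True if producers may send."""
        if depth >= self.high:
            self.paused = True
        elif depth <= self.low:
            self.paused = False
        # Between the marks the previous state is kept (hysteresis), so
        # producers do not flap on and off around a single threshold.
        return not self.paused


gate = WatermarkGate(low=20, high=80)
decisions = [gate.update(depth) for depth in (10, 79, 80, 50, 21, 20, 50)]
```

In this run the gate closes when depth hits 80 and stays closed at 50 and 21, reopening only once depth has fallen back to 20.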

Which protocols or frameworks have backpressure built in?

TCP implements backpressure through flow control: the receiver advertises a window based on its free buffer space, and the sender may not exceed it. TCP's congestion window applies a similar limit on behalf of the network path. The Reactive Streams specification (adopted in Java 9 as java.util.concurrent.Flow) defines a standard API for backpressure in asynchronous stream processing. HTTP/2 includes flow control through WINDOW_UPDATE frames, and gRPC, which runs over HTTP/2, inherits it at the transport layer. At the application layer, most REST-over-HTTP architectures do not implement backpressure by default and require explicit design to do so.

What should a system do when backpressure reaches the edge — the point where load cannot be pushed further upstream?

At the system boundary, the options are: shed load (reject requests, return errors to callers or end users), buffer (accept work but delay processing, accepting higher latency), or block (pause the producer, which may not be possible for external traffic). Each approach has tradeoffs in user experience, resource consumption, and operational complexity. The important thing is that the choice is made explicitly, not by accident when a queue fills or a service crashes.

Citations

Backpressure is all you need. Most reliability problems in distributed systems are, at their core, backpressure problems; backpressure allows slower consumers to signal upstream producers to reduce throughput.
Reactive Streams Specification. The Reactive Streams specification defines a standard for asynchronous stream processing with non-blocking backpressure, later incorporated into Java 9 as java.util.concurrent.Flow.
Hacker News discussion: Backpressure is all you need. Bureau research source: Hacker News surfaced this post as a notable item in technical discussion.
RFC 5681: TCP Congestion Control. TCP's congestion window limits unacknowledged data in flight so that a sender does not overwhelm the network path; the receiver's advertised window, TCP's flow-control mechanism, plays the same role for the receiver.