THEORY: Little's Law and Applying Back Pressure When Overloaded

Applying Back Pressure When Overloaded

[...]

Let’s assume we have asynchronous transaction services fronted by input and output queues, or similar FIFO structures. If we want the system to meet a response-time quality-of-service (QoS) guarantee, then we need to consider the following three variables:

  1. The time taken for individual transactions on a thread
  2. The number of threads in a pool that can execute transactions in parallel
  3. The length of the input queue to set the maximum acceptable latency
max latency  = (transaction time / number of threads) * queue length
queue length = max latency / (transaction time / number of threads)
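
A quick worked example (the numbers are illustrative, not from the article): with 10 ms per transaction and 8 threads, a 500 ms latency target allows a queue of 400 entries.

```java
public class QueueSizing {
    public static void main(String[] args) {
        double transactionTimeMs = 10.0; // time per transaction on one thread (illustrative)
        int threads = 8;                 // threads executing transactions in parallel
        double maxLatencyMs = 500.0;     // target worst-case response time

        // queue length = max latency / (transaction time / number of threads)
        long queueLength = Math.round(maxLatencyMs / (transactionTimeMs / threads));
        System.out.println("bounded queue length = " + queueLength); // prints 400
    }
}
```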

If the queue is allowed to be unbounded, latency will continue to increase. So if we want to enforce a maximum response time, we need to limit the queue length.

By bounding the input queue we block the thread receiving network packets, which applies back pressure upstream. If the network protocol is TCP, similar back pressure is applied to the sender via the filling of network buffers. This process can repeat all the way back via the gateway to the customer. For each service we need to configure the queues so that they do their part in achieving the required quality-of-service for the end-to-end customer experience.
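
One way to realize this pattern on the JVM (a sketch under assumed names and sizes, not code from the article) is a `ThreadPoolExecutor` over a bounded `ArrayBlockingQueue`: when the queue fills, `CallerRunsPolicy` makes the submitting thread execute the task itself, so the thread reading from the network stops accepting new work and pressure propagates upstream.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedService {
    public static void main(String[] args) {
        int threads = 8;        // workers executing transactions in parallel
        int queueLength = 400;  // from the sizing formula above

        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueLength),      // bounded input queue
                new ThreadPoolExecutor.CallerRunsPolicy()); // when full, the caller runs the task

        // The network-facing thread submits work here. Once the queue is full it
        // ends up processing the task itself, so it stops reading new requests and
        // the sender's TCP buffers fill: back pressure, hop by hop.
        pool.execute(() -> { /* process one transaction */ });
        pool.shutdown();
    }
}
```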

One of the biggest wins I often find is to improve the time taken to process individual transactions. This helps in the best and worst case scenarios.

[...]

rponte commented Jan 8, 2025

  • ⭐️ Google SRE Book: Handling Overload
    • In a majority of cases (although certainly not in all), we've found that simply using CPU consumption as the signal for provisioning works well, for the following reasons:

      • In platforms with garbage collection, memory pressure naturally translates into increased CPU consumption.
      • In other platforms, it's possible to provision the remaining resources in such a way that they're very unlikely to run out before CPU runs out.
    • Our larger services tend to be deep stacks of systems, which may in turn have dependencies on each other. In this architecture, requests should only be retried at the layer immediately above the layer that is rejecting them. When we decide that a given request can't be served and shouldn't be retried, we use an "overloaded; don't retry" error and thus avoid a combinatorial retry explosion.

  • ⭐️ Google SRE Book: Addressing Cascading Failures
    • A cascading failure is a failure that grows over time as a result of positive feedback.

    • Limit retries per request. Don’t retry a given request indefinitely.

    • Consider having a server-wide retry budget. For example, only allow 60 retries per minute in a process, and if the retry budget is exceeded, don’t retry; just fail the request. [...] (a minimal sketch of this appears after this list)

    • Think about the service holistically and decide if you really need to perform retries at a given level. In particular, avoid amplifying retries by issuing retries at multiple levels: [...]

    • Use clear response codes and consider how different failure modes should be handled. For example, separate retriable and nonretriable error conditions. Don’t retry permanent errors or malformed requests in a client, because neither will ever succeed. Return a specific status when overloaded so that clients and other layers back off and do not retry.

    • If handling a request is performed over multiple stages (e.g., there are a few callbacks and RPC calls), the server should check the deadline left at each stage before attempting to perform any more work on the request. For example, if a request is split into parsing, backend request, and processing stages, it may make sense to check that there is enough time left to handle the request before each stage.
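
A minimal sketch of the server-wide retry budget quoted above (the class and numbers are illustrative; the SRE book does not prescribe an implementation): allow at most 60 retries per one-minute window per process, and fail requests outright once the budget is spent.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Process-wide retry budget: at most budgetPerMinute retries per one-minute window. */
public class RetryBudget {
    private final int budgetPerMinute;
    private final AtomicInteger used = new AtomicInteger();
    private final AtomicLong windowStartMs = new AtomicLong(System.currentTimeMillis());

    public RetryBudget(int budgetPerMinute) {
        this.budgetPerMinute = budgetPerMinute;
    }

    /** Returns true if a retry may be attempted; false means fail the request instead. */
    public boolean tryAcquire() {
        long now = System.currentTimeMillis();
        long start = windowStartMs.get();
        // Roll over to a new window; slightly approximate under contention, which
        // is acceptable for a budget meant to cap retry storms, not meter exactly.
        if (now - start >= 60_000 && windowStartMs.compareAndSet(start, now)) {
            used.set(0);
        }
        return used.incrementAndGet() <= budgetPerMinute;
    }
}

// Usage: if a call failed retriably and budget.tryAcquire() returns true, retry once;
// otherwise fail the request rather than amplifying the overload.
```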

rponte commented Mar 21, 2025

YouTube | ScyllaDB: Resilient Design Using Queue Theory: This talk discusses backpressure, load shedding, and how to optimize latency and throughput.

@rafaelpontezup

The #1 rule of scalable systems is to avoid congestion collapse - by @jamesacowling
https://x.com/jamesacowling/status/1934991944234770461


A good metaphor for congestion collapse is to imagine you're a barista at a coffee shop that just got popular. The cashier keeps taking orders and stacking them up higher and higher but you can't make coffees any faster. [...] - by @jamesacowling
https://x.com/jamesacowling/status/1935812480254787819

rponte commented Jul 27, 2025

RabbitMQ: Queue Overflow Behavior

RabbitMQ's "stack overflow behavior" refers to how the system handles situations where a queue's capacity is exceeded, leading to an "overflow" of messages. This is particularly relevant when the rate of messages being published to a queue is significantly higher than the rate at which consumers can process them.


Key Aspects of RabbitMQ Queue Overflow Behavior

📏 Maximum Queue Length

RabbitMQ allows configuring a max-length for queues. This limit dictates the maximum number of messages a queue can hold. Setting this limit is crucial for preventing queues from consuming excessive memory and resources.
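
For example, with the RabbitMQ Java client a length limit can be set at declaration time via the `x-max-length` argument (the queue name and limit below are illustrative):

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import java.util.Map;

public class DeclareBoundedQueue {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory(); // assumes a broker on localhost
        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {
            // Hold at most 10,000 messages; the overflow strategy defaults to drop-head.
            channel.queueDeclare("orders", true, false, false,
                    Map.of("x-max-length", 10_000));
        }
    }
}
```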


⚙️ Overflow Strategies

When a queue reaches its max-length, RabbitMQ employs a defined overflow strategy to determine how new messages are handled:

  • drop-head (Default)
    This strategy, the default for classic queues, discards the oldest message in the queue when a new message arrives and the queue is full. This ensures new messages are accepted while maintaining the queue's length limit.

  • reject-publish
    With this strategy, newly published messages are rejected (nacked) by the broker when the queue is full. This signals to the publisher that the message could not be enqueued (sketched in code after this list).

  • reject-publish-dlx
    Similar to reject-publish, but rejected messages are routed to a Dead Letter Exchange (DLX) if configured. This allows for specific handling of rejected messages, such as logging or retrying.
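
Note that with reject-publish the publisher only observes the rejection if publisher confirms are enabled; the broker then nacks the publish. A sketch with the Java client (queue name, limit, and timeout are illustrative):

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.MessageProperties;
import java.io.IOException;
import java.util.Map;

public class RejectPublishExample {
    public static void main(String[] args) throws Exception {
        try (Connection conn = new ConnectionFactory().newConnection();
             Channel channel = conn.createChannel()) {
            channel.confirmSelect(); // enable publisher confirms so nacks are observable
            channel.queueDeclare("orders", true, false, false,
                    Map.of("x-max-length", 10_000, "x-overflow", "reject-publish"));

            channel.basicPublish("", "orders",
                    MessageProperties.PERSISTENT_TEXT_PLAIN, "payload".getBytes());
            try {
                channel.waitForConfirmsOrDie(5_000); // throws IOException on a broker nack
            } catch (IOException nacked) {
                // Queue was full and the publish was rejected: back off, shed load,
                // or route the message elsewhere instead of retrying blindly.
            }
        }
    }
}
```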


🚦 Flow Control

Beyond explicit overflow strategies, RabbitMQ implements a credit-based flow control mechanism. This system can temporarily block publishing connections when internal components (such as queue processes) cannot keep up with the publish rate, preventing the broker from becoming overwhelmed and potentially crashing due to memory exhaustion.

Note: This is distinct from queue overflow but contributes to overall system stability.
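
Credit flow itself is internal to the broker and not directly visible to clients, but broker-initiated blocking (for example, during memory or disk alarms) is surfaced to publishers as `connection.blocked` notifications. A sketch with the Java client (the callback bodies are illustrative):

```java
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class FlowControlAwarePublisher {
    public static void main(String[] args) throws Exception {
        Connection conn = new ConnectionFactory().newConnection();
        // Fired when the broker blocks/unblocks this connection (e.g. a memory alarm);
        // a real publisher would pause and resume its send loop here.
        conn.addBlockedListener(
                reason -> System.out.println("publishing blocked: " + reason),
                () -> System.out.println("publishing unblocked"));
    }
}
```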


⏱️ Message TTL (Time-To-Live)

While not an overflow strategy in itself, setting a message-ttl can help manage queue size by automatically expiring messages after a specified time, regardless of whether they have been consumed. This can prevent queues from growing indefinitely due to slow consumers or unconsumed messages.
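
TTL can be set per queue (the `x-message-ttl` argument) or per message (the `expiration` property, a string of milliseconds). A sketch with the Java client (names and values are illustrative):

```java
import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import java.util.Map;

public class TtlExample {
    public static void main(String[] args) throws Exception {
        try (Connection conn = new ConnectionFactory().newConnection();
             Channel channel = conn.createChannel()) {
            // Queue-level TTL: unconsumed messages expire after 60 seconds.
            channel.queueDeclare("orders", true, false, false,
                    Map.of("x-message-ttl", 60_000));

            // Per-message TTL: the expiration property is a string of milliseconds.
            AMQP.BasicProperties props = new AMQP.BasicProperties.Builder()
                    .expiration("60000")
                    .build();
            channel.basicPublish("", "orders", props, "payload".getBytes());
        }
    }
}
```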


⚠️ Consequences of Overflow

  • Message Loss: If drop-head is used, older messages are discarded.
  • Publisher Blocking/Errors: If reject-publish or flow control is active, publishers may experience delays or receive errors when attempting to send messages.
  • Resource Consumption: Without proper limits and strategies, an overflowing queue can consume excessive memory on the RabbitMQ server, potentially impacting performance or leading to crashes.

✅ Addressing Queue Overflow

  • Increase Consumer Capacity
    Scale up the number of consumers or optimize consumer processing logic to handle messages more efficiently.

  • Optimize Queue Length
    Adjust the max-length based on system requirements and expected message volume.

  • Implement Overflow Strategies
    Choose the appropriate x-overflow strategy (for classic queues) based on desired message handling behavior.

  • Utilize Message TTL
    Set TTLs for messages or queues to prevent indefinite message accumulation.

  • Monitor and Alert
    Implement monitoring to detect queue overflow conditions and trigger alerts for proactive intervention.
