Created
February 9, 2024 14:59
A circuit breaker using Pecorino leaky buckets
# frozen_string_literal: true

# Pecobox is a Circuitbox-like class which uses Pecorino for
# measuring error rates. It is less resilient than Circuitbox
# because your error stats are stored in the database, but guess what:
# if your database is down your product is already down, so there is
# nothing to circuit-protect. And having a shared data store for
# the circuits allows us to track across machines and processes, so
# we do not have to have every worker hammer at an already failing
# resource right after start.
#
# Assumes ActiveSupport (`10.minutes`, `Array.wrap`), Pecorino and Appsignal are already loaded.
class Pecobox
  class CircuitOpen < StandardError
  end

  def initialize(service, max_errors: 70, over_time: 10.minutes, open_for: 2.minutes, exceptions: RuntimeError)
    @service = service
    @max_errors = max_errors
    @over_time = over_time
    @open_for = open_for
    @exception_matchers = Array.wrap(exceptions)
  end

  # Tells whether this Pecobox monitors a particular instance of Exception, either using matchers or other means
  def may_open_because_of?(exception)
    @exception_matchers.any? { |exception_class_or_matcher| exception_class_or_matcher === exception }
  end

  def run!
    Appsignal.increment_counter("pecobox.calls_total", 1, service: @service)

    # Check whether the throttle can accept our call.
    # That check can be non-atomic and it only does 1 or 2 SELECTs, it won't write anything.
    circuit_breaker_failure_throttle = Pecorino::Throttle.new(key: "pecobox-#{@service}", capacity: @max_errors, over_time: @over_time, block_for: @open_for)
    unless circuit_breaker_failure_throttle.able_to_accept?
      # Circuit breaker state must be a value in a sample, because different processes may get a different
      # view of the circuit breaker state at roughly the same time. So it is more useful to record the
      # different states together and average them during display.
      Appsignal.add_distribution_value("pecobox.cb_open", 1.0, service: @service)
      # Raise unconditionally: re-checking the throttle here would cost another query and,
      # if the block expired in between, would return nil without ever calling the block.
      raise CircuitOpen, "Circuit for #{@service.inspect} is open"
    end

    Appsignal.add_distribution_value("pecobox.cb_open", 0.0, service: @service)
    yield.tap do
      Appsignal.increment_counter("pecobox.calls_ok", 1, service: @service)
    end
  rescue => e
    # Add one error to the error rate bucket and do not allow it to raise Throttled
    circuit_breaker_failure_throttle.request(1) if may_open_because_of?(e)
    raise
  end
end
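
The open/closed behaviour above can be tried without a database. What follows is a minimal sketch, not Pecorino's implementation: `SketchThrottle`, `run_guarded` and the injected `clock` are hypothetical names made up for illustration, and the rule "block as soon as the bucket is full" is a simplifying assumption. It only mirrors the shape of `Pecobox#run!`: check the throttle, yield, and add one unit to the bucket when a matching exception is raised.

```ruby
# Hypothetical in-memory stand-in for a leaky-bucket throttle (NOT Pecorino's API).
class SketchThrottle
  def initialize(capacity:, over_time:, block_for:, clock:)
    @capacity = capacity.to_f
    @leak_per_second = @capacity / over_time # bucket drains fully in over_time seconds
    @block_for = block_for
    @clock = clock
    @level = 0.0
    @updated_at = clock.call
    @blocked_until = 0.0
  end

  # True while no block is in effect, i.e. the circuit is closed.
  def able_to_accept?
    @clock.call >= @blocked_until
  end

  # Records n errors and never raises. Assumption: a block starts once the bucket is full.
  def request(n = 1)
    now = @clock.call
    leaked = (now - @updated_at) * @leak_per_second
    @level = [@level - leaked, 0.0].max + n
    @updated_at = now
    if @level >= @capacity
      @level = @capacity
      @blocked_until = now + @block_for
    end
    self
  end
end

class CircuitOpen < StandardError; end

# Same control flow as Pecobox#run!, minus metrics.
def run_guarded(throttle, matchers: [RuntimeError])
  raise CircuitOpen, "circuit is open" unless throttle.able_to_accept?
  yield
rescue => e
  # CircuitOpen is a StandardError but not a RuntimeError, so it is not counted here.
  throttle.request(1) if matchers.any? { |matcher| matcher === e }
  raise
end

now = 0.0 # fake clock, in seconds, so the example needs no sleeping
throttle = SketchThrottle.new(capacity: 3, over_time: 60, block_for: 10, clock: -> { now })

outcomes = []
5.times do
  begin
    run_guarded(throttle) { raise "upstream failed" }
  rescue CircuitOpen
    outcomes << :rejected
  rescue RuntimeError
    outcomes << :failed
  end
end

now += 11 # move past the 10 second block
outcomes << run_guarded(throttle) { :ok }

p outcomes
```

Three failures fill the bucket, the next two calls are rejected without running the block, and the call made after the block expires goes through again. With the real class the equivalent call would look like `Pecobox.new("payments").run! { ... }`, where `"payments"` is just an example service name.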