Created
July 31, 2019 18:35
## lib/active_record/postgresql_adapter_reconnect_patch.rb
# Monkey patch the Postgres adapter to attempt to reconnect connections that
# have been rendered dead by various means, but that ActiveRecord doesn't know about yet.
#
# We see errors like
# ActiveRecord::StatementInvalid: PG::ConnectionBad: PQsocket() - cannot get socket descriptor
# ActiveRecord::StatementInvalid: PG::TRSerializationFailure: ERROR: canceling statement due to conflict with recovery DETAIL: User query might have needed to see row versions that must be removed.
#
# These seem to be caused by upstream cancellations of the connection/socket.
# Whenever you pull a connection from a pool in ActiveRecord, it calls #active?
# on it first to determine if it's alive.
# At some point in the Rails 4.x line, it was changed from
#
#   def active?
#     @connection.query 'SELECT 1'
#     true
#   rescue PGError
#     false
#   end
#
# to something like:
#
#   def connection_active?
#     @connection.status == PGconn::CONNECTION_OK
#   rescue PGError
#     false
#   end
#
# So before, it was an active check (a round trip to the server); now it just
# checks the state of the conn in the local C lib. However, there are
# circumstances in which the conn can be dead but the C lib doesn't know that yet.
#
# So we catch these specific errors, attempt to reconnect, and retry.
#
module ActiveRecord
  module PostgreSQLAdapterReconnectPatch
    MAX_ATTEMPTS = 3

    class ReconnectAttemptsExhaustedError < StandardError; end

    def exec_no_cache(*args)
      super(*args)
    rescue ActiveRecord::StatementInvalid => e
      # >= (not >) so that we give up after exactly MAX_ATTEMPTS failed
      # reconnects, matching the message reported below.
      if (@_reconnect_attempts ||= 0) >= MAX_ATTEMPTS
        # We've failed to reconnect, time to commit seppuku.
        # exit! kills immediately, so we sleep a sec to give it time to log & report to Rollbar.
        $statsd.increment('api.postgres.reconnect.exhausted')
        $stdout.puts("Too many bad connection reconnect attempts, terminating :(")
        Exceptions.notify(ReconnectAttemptsExhaustedError.new("Tried #{MAX_ATTEMPTS} times to reconnect bad pg connection, terminating"))
        sleep 1
        exit!
      end

      # Only these two wrapped PG errors indicate a dead connection; anything else is re-raised.
      if e.original_exception.is_a?(PG::TRSerializationFailure) ||
         e.original_exception.is_a?(PG::ConnectionBad)
        $statsd.increment('api.postgres.reconnect.attempt')
        @_reconnect_attempts += 1
        reconnect!
        # Retry the statement once on the fresh connection. If it raises again,
        # the error propagates and the attempt counter is left incremented.
        super(*args).tap do
          $statsd.increment('api.postgres.reconnect.success')
          @_reconnect_attempts = 0
        end
      else
        $statsd.increment('api.postgres.reconnect.skipped')
        raise e
      end
    end
  end
end
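
The file above does not show how the module gets mixed into the adapter. Because the patch calls `super`, it presumably has to sit in front of the adapter's own `exec_no_cache`, which `Module#prepend` would do. The sketch below illustrates that mechanism, along with the "status says OK but the socket is dead" situation described in the header comment, using plain Ruby stand-ins. `FakeConnection`, `FakeAdapter`, `ReconnectOnce`, and `kill_socket!` are hypothetical names invented for this illustration; they are not part of ActiveRecord or the pg gem, and the retry logic is reduced to a single attempt with no metrics.

```ruby
# Hypothetical stand-in for a driver connection. Its cached status can claim
# everything is fine while the underlying socket is actually dead.
class FakeConnection
  class ConnectionBad < StandardError; end

  def initialize
    @socket_alive = true
  end

  # Passive check: reads local state only and never talks to the server.
  def status
    :ok
  end

  # Simulates an upstream cancellation that the local state never notices.
  def kill_socket!
    @socket_alive = false
  end

  def reset
    @socket_alive = true
  end

  def query(sql)
    raise ConnectionBad, "cannot get socket descriptor" unless @socket_alive
    "result of #{sql}"
  end
end

# Hypothetical stand-in for the adapter, with the passive style of active? check.
class FakeAdapter
  def initialize(connection)
    @connection = connection
  end

  def active?
    @connection.status == :ok
  end

  def reconnect!
    @connection.reset
  end

  def exec_no_cache(sql)
    @connection.query(sql)
  end
end

# Reduced version of the patch: rescue the dead-connection error,
# reconnect, and retry the statement once.
module ReconnectOnce
  def exec_no_cache(*args)
    super(*args)
  rescue FakeConnection::ConnectionBad
    reconnect!
    super(*args)
  end
end

# prepend puts the module ahead of the class in the method lookup chain,
# so `super` inside the module reaches FakeAdapter#exec_no_cache.
FakeAdapter.prepend(ReconnectOnce)

conn = FakeConnection.new
adapter = FakeAdapter.new(conn)
conn.kill_socket!

puts adapter.active?                    # passive check still reports true
puts adapter.exec_no_cache("SELECT 1")  # succeeds after a transparent reconnect
```

For the real patch, the equivalent step would presumably be prepending `ActiveRecord::PostgreSQLAdapterReconnectPatch` onto the PostgreSQL adapter class from an initializer. That wiring is an assumption, since it is not shown in the file.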