By default, Postgres does not enforce any connection or lock timeouts. While this is a sensible default, if you are sharing a single database across many apps, as we are, it allows one misbehaving application to take down the entire system.
What we've done instead is to instrument Active Record's connection setup to enforce limits:
- a hard connection timeout of 20 seconds (any connection lasting longer than this is killed)
- a hard lock timeout of 19 seconds (any lock held longer than this is killed)
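As a rough illustration of the limits listed above, here is a minimal Ruby sketch that applies session-level timeouts to a connection. It is not our actual instrumentation: the names `TIMEOUTS_MS`, `timeout_statements`, and `apply_timeouts` are hypothetical, and mapping the "connection timeout" to Postgres's `statement_timeout` setting is an assumption made for the example. Note also that Postgres's `lock_timeout` limits how long a statement waits to acquire a lock.

```ruby
# Hypothetical sketch; these names are illustrative only.
# Assumption: the 20-second "connection timeout" is expressed through
# Postgres's statement_timeout setting. Values are in milliseconds.
TIMEOUTS_MS = {
  "statement_timeout" => 20_000,
  "lock_timeout"      => 19_000,
}.freeze

# Build one SET statement per setting. Integer() rejects non-numeric values
# so nothing unexpected is interpolated into the SQL.
def timeout_statements(timeouts = TIMEOUTS_MS)
  timeouts.map { |name, ms| "SET #{name} = #{Integer(ms)}" }
end

# Apply the settings to any object that responds to #execute(sql),
# such as an Active Record connection.
def apply_timeouts(connection, timeouts = TIMEOUTS_MS)
  timeout_statements(timeouts).each { |sql| connection.execute(sql) }
end
```

In a Rails app this might be called as `apply_timeouts(ActiveRecord::Base.connection)`. Because `SET` only affects the current session, it has to run for every new connection, which is why hooking into connection setup matters. Rails' PostgreSQL adapter also accepts a `variables:` key in `database.yml` that can set the same session parameters declaratively.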
These limits are very high, but we had to start somewhere. If an application suddenly experiences poor performance, it cannot hold onto database connections for the duration of the outage. That application will still be in trouble, but the other applications sharing the database will not.