When backfilling millions of records, updating them all at once by hand is dangerous. This technique mitigates some of that danger with a testable script that is sensitive to the current queue depth. It is most useful for data backfills, but it applies to any type of Sidekiq work that requires millions of jobs.
The goal of this technique is to allow running backfills at any time, including peak hours, without stressing the Procore infrastructure. It has three aims:
- Maximize dispersion of work - we want to disperse the work over time in a way that correlates with the amount of strain currently put on Procore
- Minimize heavy database use - when the database is under heavy load, it can cause lock contention and queuing
- Decrease variability - we want the results of the backfill to be as predictable as possible
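
As a rough illustration of what "sensitive to the current queue depth" could look like, here is a minimal sketch in plain Ruby. It is not the actual script: the class name `ThrottledBackfill`, the thresholds, and the injected callables are all assumptions made for this example. The queue depth check and the enqueue step are passed in as lambdas so the throttling logic can be tested without a running Sidekiq or Redis.

```ruby
# Hypothetical sketch: class name, defaults, and parameters are assumptions,
# not the real backfill script.
class ThrottledBackfill
  # queue_depth: callable returning the current number of queued jobs
  # enqueue:     callable that enqueues one job for a record id
  # max_depth:   pause while the queue holds more jobs than this
  # batch_size:  number of ids enqueued between queue depth checks
  # pause:       seconds to wait before rechecking a deep queue
  # sleeper:     injectable so tests do not actually sleep
  def initialize(queue_depth:, enqueue:, max_depth: 1_000, batch_size: 100,
                 pause: 5, sleeper: ->(seconds) { sleep(seconds) })
    @queue_depth = queue_depth
    @enqueue = enqueue
    @max_depth = max_depth
    @batch_size = batch_size
    @pause = pause
    @sleeper = sleeper
  end

  # Enqueues ids in batches, waiting before each batch until the queue
  # is at or below max_depth. Returns the number of ids enqueued.
  def run(ids)
    enqueued = 0
    ids.each_slice(@batch_size) do |batch|
      @sleeper.call(@pause) while @queue_depth.call > @max_depth
      batch.each { |id| @enqueue.call(id) }
      enqueued += batch.size
    end
    enqueued
  end
end
```

With Sidekiq loaded (`require "sidekiq/api"`), the callables might be wired as `queue_depth: -> { Sidekiq::Queue.new("backfill").size }` and `enqueue: ->(id) { BackfillWorker.perform_async(id) }`, where the `"backfill"` queue name and `BackfillWorker` are placeholders. Because work is only added when the queue has room, the backfill slows down on its own during peak hours and speeds up when the system is quiet.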