This contains commands to demo the fix in this patch:
Postgres documents the types of locking it uses here:
https://www.postgresql.org/docs/current/explicit-locking.html
This tells us that UPDATE
queries use the ROW EXCLUSIVE
lock.
To show the behavior of interest, we manually obtain that lock type. This is modeled off the docs at:
https://www.postgresql.org/docs/current/sql-lock.html
To do this manually, use awx-manage dbshell
and type in the following commands.
Stop after the LOCK TABLE
command so that it holds the lock indefinetely.
BEGIN WORK;
LOCK TABLE main_jobevent IN ROW EXCLUSIVE MODE;
COMMIT WORK;
While the lock is active, move onto the next stage.
Now in another terminal tab go into awx-manage shell_plus
.
Run these commands as setup:
from awx.main.utils.common import create_partition
from datetime import timedelta
j = Job.objects.order_by('-created').first()
Now for the critical test, we will create a new partition.
To do this, pick a time in the future from the last job that ran.
This gives 1
hour after the last job run, but after you run that,
you will need to pick the next hour to test again, so continue with
2
, 3
, and so-on as needed.
create_partition('main_jobevent', j.created + timedelta(hours=1))
If you are on current AWX devel
, you should find that it creates the partition quickly.
This will happen even if the lock is held in another terminal tab.
Before that patch, our create_partition
method was written as follows:
def create_partition(tblname, start=None):
"""Creates new partition table for events. By default it covers the current hour."""
if start is None:
start = now()
start = start.replace(microsecond=0, second=0, minute=0)
end = start + timedelta(hours=1)
start_timestamp = str(start)
end_timestamp = str(end)
partition_label = start.strftime('%Y%m%d_%H')
try:
with transaction.atomic():
with connection.cursor() as cursor:
cursor.execute(
f'CREATE TABLE IF NOT EXISTS {tblname}_{partition_label} '
f'PARTITION OF {tblname} '
f'FOR VALUES FROM (\'{start_timestamp}\') to (\'{end_timestamp}\');'
)
except ProgrammingError as e:
logger.debug(f'Caught known error due to existing partition: {e}')
If you run the same thing again in this code state, you should find that it hangs when creating the partition, if the other terminal tab is holding the lock.