Skip to content

Instantly share code, notes, and snippets.

@MarkusH
Last active March 11, 2025 23:24
Show Gist options
  • Save MarkusH/4c03d0867ba37b6b89379bced12cb3ae to your computer and use it in GitHub Desktop.
Save MarkusH/4c03d0867ba37b6b89379bced12cb3ae to your computer and use it in GitHub Desktop.
A script that outputs the raw SQL to convert Django's serial columns to identity columns.
WITH tab AS (
SELECT
a.attrelid::regclass::text AS t,
a.attname AS c,
pg_get_serial_sequence(a.attrelid::regclass::text, a.attname) AS s,
nextval(pg_get_serial_sequence(a.attrelid::regclass::text, a.attname)) AS v
FROM pg_attribute a
JOIN pg_class c ON c.oid = a.attrelid
WHERE
a.attrelid::regclass::text LIKE '%'
AND c.relkind IN ('r', 'p') /* regular and partitioned tables */
AND a.attnum > 0
AND NOT a.attisdropped
AND a.atttypid = ANY ('{int,int8,int2}'::regtype[])
AND EXISTS (
SELECT FROM pg_attrdef ad
WHERE
ad.adrelid = a.attrelid
AND ad.adnum = a.attnum
AND (
pg_get_expr(ad.adbin, ad.adrelid)
=
'nextval('''
|| (
pg_get_serial_sequence(a.attrelid::regclass::text, a.attname)
)::regclass
|| '''::regclass)'
)
)
ORDER BY a.attnum
)
SELECT
'BEGIN; '
|| 'LOCK TABLE ' || quote_ident(t) || ' IN ACCESS EXCLUSIVE MODE; '
|| 'ALTER TABLE ' || quote_ident(t) || ' ALTER COLUMN ' || quote_ident(c) || ' DROP DEFAULT; '
|| 'DROP SEQUENCE ' || s || '; '
|| 'ALTER TABLE ' || quote_ident(t) || ' ALTER ' || quote_ident(c) || ' ADD GENERATED BY DEFAULT AS IDENTITY (RESTART ' || v || '); '
|| 'COMMIT;'
FROM tab;
@rob4226
Copy link

rob4226 commented Apr 22, 2023

Cool! Thank you for posting this! I just had to remove the single quotes around table names (or use double quotes) in the generated sql, then it worked perfect!

I found this gist from a great article that helped me understand this change in Django 4.1

Does any know if any changes need to be made to the migrations of Django after runing the SQL manually to change the columns? Like if I ran the migrations to recreate my database, would it create the tables using the old serial type of columns?

@MarkusH
Copy link
Author

MarkusH commented Apr 25, 2023

Does any know if any changes need to be made to the migrations of Django after runing the SQL manually to change the columns? Like if I ran the migrations to recreate my database, would it create the tables using the old serial type of columns?

If you recreate the database with Django 4.1 or higher, you do not need to rerun the SQL.

@crgwbr
Copy link

crgwbr commented Sep 12, 2024

Unless I'm mistaken, this appears to get the nextval of the old sequence at the the SQL is generated, rather than at the time the generated SQL itself it run. That means that if any rows are inserted in-between the time the SQL is generated and when it's run, the new sequence value will be incorrect. Shouldn't the call to nextval be part of the generated SQL (and inside the exclusive table lock)? That'd also mean re-ordering commands so that the DROP SEQUENCE statement is after the last ALTER TABLE.

Update: Here's an updated version of this script that fixes the issue w/ nextval timing.

WITH tab AS (
    SELECT
        a.attrelid::regclass::text AS table_name,
        a.attname AS pk_col,
        pg_get_serial_sequence(a.attrelid::regclass::text, a.attname) AS seq_name
    FROM pg_attribute a
    JOIN pg_class c ON c.oid = a.attrelid
    WHERE
        a.attrelid::regclass::text LIKE '%'
        AND c.relkind IN ('r', 'p')  /* regular and partitioned tables */
        AND a.attnum > 0
        AND NOT a.attisdropped
        AND a.atttypid = ANY ('{int,int8,int2}'::regtype[])
        AND EXISTS (
            SELECT FROM pg_attrdef ad
            WHERE
                ad.adrelid = a.attrelid
                AND ad.adnum = a.attnum
                AND (
                    pg_get_expr(ad.adbin, ad.adrelid)
                    =
                    'nextval('''
                    || (
                        pg_get_serial_sequence(a.attrelid::regclass::text, a.attname)
                    )::regclass
                    || '''::regclass)'
                )
        )
    ORDER BY a.attnum
)
SELECT
    'BEGIN; '
    || 'DO $$ DECLARE nextSeqVal bigint; '
    || 'BEGIN '
    || 'LOCK TABLE ' || quote_ident(table_name) || ' IN ACCESS EXCLUSIVE MODE; '
    || 'ALTER TABLE ' || quote_ident(table_name) || ' ALTER COLUMN ' || quote_ident(pk_col) || ' DROP DEFAULT; '
    || 'nextSeqVal := nextval(''' || seq_name || '''); '
    || 'DROP SEQUENCE ' || seq_name || '; '
    || 'EXECUTE format(''ALTER TABLE ' || quote_ident(table_name) || ' ALTER ' || quote_ident(pk_col) || ' ADD GENERATED BY DEFAULT AS IDENTITY (RESTART %s);'', nextSeqVal); '
    || 'END $$; '
    || 'COMMIT;'
FROM tab;

As well, here's an example bash script of how you can generate and run the migration code on a batch of DBs:

#!/usr/bin/env bash

set -euxo pipefail

DB_URL="postgres://foo:[email protected]:5432"

for DB in foo bar baz
do
    psql \
        -d "${DB_URL}/${DB}" \
        -f serial-to-identity.sql \
        | grep 'BEGIN;' \
        | awk '{$1=$1};1' \
            > "migrate-${DB}.sql";
    psql -d "${DB_URL}/${DB}" \
        -f "migrate-${DB}.sql";
    echo "Finished ${DB}";
done

@johanneskares
Copy link

johanneskares commented Mar 11, 2025

This should execute directly

DO $$
DECLARE
    rec record;
    table_ident text;
    seq_ident text;
    max_val bigint;
BEGIN
    FOR rec IN
        SELECT
            a.attrelid::regclass::text AS t,
            a.attname AS c,
            pg_get_serial_sequence(a.attrelid::regclass::text, a.attname) AS s
        FROM pg_attribute a
        JOIN pg_class c ON c.oid = a.attrelid
        WHERE
            a.attrelid::regclass::text LIKE '%'
            AND c.relkind IN ('r', 'p')  -- regular and partitioned tables
            AND a.attnum > 0
            AND NOT a.attisdropped
            AND a.atttypid = ANY ('{int,int8,int2}'::regtype[])
            AND EXISTS (
                SELECT 1
                FROM pg_attrdef ad
                WHERE
                    ad.adrelid = a.attrelid
                    AND ad.adnum = a.attnum
                    AND (
                        pg_get_expr(ad.adbin, ad.adrelid)
                        =
                        'nextval('''
                        || (pg_get_serial_sequence(a.attrelid::regclass::text, a.attname))::regclass
                        || '''::regclass)'
                    )
            )
        ORDER BY a.attnum
    LOOP
        -- Handle schema-qualified table names
        table_ident := CASE 
                         WHEN position('.' in rec.t) > 0 
                           THEN quote_ident(split_part(rec.t, '.', 1)) || '.' || quote_ident(split_part(rec.t, '.', 2))
                         ELSE quote_ident(rec.t)
                       END;
        -- Handle schema-qualified sequence names
        seq_ident := CASE 
                       WHEN position('.' in rec.s) > 0 
                         THEN quote_ident(split_part(rec.s, '.', 1)) || '.' || quote_ident(split_part(rec.s, '.', 2))
                       ELSE quote_ident(rec.s)
                     END;
        
        -- Lock the table to prevent concurrent modifications
        EXECUTE 'LOCK TABLE ' || table_ident || ' IN ACCESS EXCLUSIVE MODE';
        
        -- Get the max value of the column to set the new identity restart value
        EXECUTE 'SELECT COALESCE(MAX(' || quote_ident(rec.c) || '), 0) + 1 FROM ' || table_ident INTO max_val;
        
        -- Drop the default and sequence
        EXECUTE 'ALTER TABLE ' || table_ident || ' ALTER COLUMN ' || quote_ident(rec.c) || ' DROP DEFAULT';
        EXECUTE 'DROP SEQUENCE ' || seq_ident;
        
        -- Convert the column to GENERATED ALWAYS AS IDENTITY
        EXECUTE 'ALTER TABLE ' || table_ident || ' ALTER COLUMN ' || quote_ident(rec.c)
                || ' ADD GENERATED ALWAYS AS IDENTITY (RESTART WITH ' || max_val || ')';
    END LOOP;
END $$;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment