To follow best practices for running Alembic migrations on two separate PostgreSQL databases while using SQLModel, you should:

  1. Assign Models to Specific Databases: Define separate metadata objects to distinguish models belonging to each database.
  2. Use Separate Migration Folders: Store migration scripts in distinct directories to avoid conflicts and keep each database's schema changes independent.
  3. Modify env.py to Handle Multiple Databases: Configure Alembic to run migrations sequentially for both databases by looping through their configurations and applying changes to the correct metadata.
  4. Ensure Autogeneration Works Properly: Implement logic in env.py to apply --autogenerate migrations to the correct metadata.
  5. Use Async and Sync Engines Correctly: Handle async database connections properly in the Alembic migration flow.

I'll prepare a detailed setup with env.py modifications, migration folder structures, and best practices to ensure Alembic knows which models belong to which database.

Best Practices: Alembic Migrations on Two PostgreSQL Databases with SQLModel

When working with multiple PostgreSQL databases in one project, you need to isolate each database’s models and migration history. Below is a step-by-step guide to configure Alembic for two separate databases using SQLModel, covering model metadata separation, migration file structure, Alembic configuration, autogeneration, async vs. sync engines, and implementation details.

Step 1: Define Models with Separate metadata for Each Database

By default, SQLModel uses a single SQLModel.metadata for all models. To target two different databases, create distinct metadata objects so each model set is bound to the correct database. The easiest way is to subclass SQLModel to make separate base classes, each with its own MetaData (How to use two different metadata properly to connect to two databases? · Issue #264 · fastapi/sqlmodel · GitHub):

from sqlmodel import SQLModel, Field
from sqlalchemy import MetaData

# Define two base classes with separate metadata
class BaseDB1(SQLModel, table=False):
    """Base class for models in Database 1"""
    metadata = MetaData()

class BaseDB2(SQLModel, table=False):
    """Base class for models in Database 2"""
    metadata = MetaData()

# Example models for DB1
class User(BaseDB1, table=True):
    id: int = Field(primary_key=True)
    name: str
    email: str

# Example models for DB2
class Product(BaseDB2, table=True):
    id: int = Field(primary_key=True)
    title: str
    price: float

In this setup, User is mapped to BaseDB1.metadata and Product to BaseDB2.metadata. Each base class’s metadata will collect only its respective tables. This separation ensures Alembic can generate and run migrations for one database without affecting the other.

Why separate metadata? If all models shared one metadata, Alembic’s autogenerate would see a combined model state and might not correctly detect changes per database (or could attempt to apply all tables to both databases). Using distinct metadata objects for each DB avoids this conflict (python - Using alembic with multiple databases - Stack Overflow). Make sure to import all model classes in your Alembic env script so that each metadata is populated with its tables (otherwise Alembic might think tables were removed if models aren’t imported).
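
A quick way to confirm the split is to import the model modules and inspect each metadata. The module paths below (myapp.models.db1 and myapp.models.db2) are an assumed layout, matching the imports used in env.py later in this guide:

from myapp.models.db1 import BaseDB1  # assumed module containing User
from myapp.models.db2 import BaseDB2  # assumed module containing Product

# Each metadata should only know about its own tables
print(sorted(BaseDB1.metadata.tables))  # expected: ['user']
print(sorted(BaseDB2.metadata.tables))  # expected: ['product']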

Step 2: Organize Separate Migration Directories for Each Database

Next, set up your Alembic migration environment to keep each database’s migrations isolated. Create a folder structure with distinct subfolders for each database’s migration scripts. For example:

project/
├── alembic.ini
└── migrations/
    ├── env.py
    ├── script.py.mako
    ├── db1/        (migration scripts for Database 1)
    │   └── <revision_files_for_db1>.py
    └── db2/        (migration scripts for Database 2)
        └── <revision_files_for_db2>.py

In this layout, alembic.ini lives at the project root next to the migrations folder (so the %(here)s paths used in the next step resolve correctly), and migrations/env.py is a single env script used for both databases. We'll configure Alembic to place revision files in either migrations/db1/ or migrations/db2/ depending on which database we target. Each subfolder acts as a separate version history for its database, with its own alembic_version table in the respective database. This isolation means each DB's schema changes are tracked independently (preventing one database's migrations from appearing in another's history). It's generally recommended to maintain separate migration streams for multiple databases (Cookbook — Alembic 1.14.1 documentation).
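
Note that alembic init creates a versions/ folder but not these per-database folders, so create db1/ and db2/ yourself; a tiny sketch, assuming the layout above, could be:

import os

# Create the per-database revision directories referenced by version_locations
for sub in ("db1", "db2"):
    os.makedirs(os.path.join("migrations", sub), exist_ok=True)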

Step 3: Configure Alembic for Multiple Databases (alembic.ini)

Open alembic.ini and set up configurations for each database. Use Alembic’s support for multiple named databases by adding separate config sections and distinct version locations. For example:

[alembic]
script_location = %(here)s/migrations
databases = db1, db2

[db1]
script_location = %(here)s/migrations
sqlalchemy.url = postgresql+psycopg2://user:password@localhost:5432/database1
version_locations = %(here)s/migrations/db1

[db2]
script_location = %(here)s/migrations
sqlalchemy.url = postgresql+asyncpg://user:password@localhost:5432/database2
version_locations = %(here)s/migrations/db2

Key points:

  • The [alembic] section holds the defaults used when no --name is given: script_location points at the migrations folder containing env.py, and databases is a comma-separated list of database names (a custom option that our env.py logic reads). Because Alembic reads its options from whichever section --name selects, script_location is repeated in each database section.
  • Each database has its own section ([db1], [db2]) with a sqlalchemy.url for that DB and a version_locations pointing to its migration subfolder. Alembic will create and look for revision files in those directories for each DB.
  • In this example, db1 uses a standard (sync) PostgreSQL driver, and db2 uses an async driver (asyncpg). We will handle the async engine in the env.py. If you prefer, you could use a sync URL for both to simplify (e.g., use psycopg2 for migrations even if your app uses asyncpg).

This multi-config setup allows using Alembic’s --name flag to target a specific database’s migration context. For instance, --name db1 will apply settings under [db1] (Managing Alembic Migrations with a single alembic.ini & env.py – LearningToTest). We’ll leverage this in our commands and env.py logic.
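
To see how these settings are consumed, here is a small sketch that reads the custom databases option and the per-database sections through Alembic's Config API; these are the same values our env.py pulls out in the next step:

from alembic.config import Config

cfg = Config("alembic.ini")
print(cfg.get_main_option("databases"))                    # "db1, db2"
print(cfg.get_section_option("db1", "sqlalchemy.url"))     # postgresql+psycopg2://...
print(cfg.get_section_option("db2", "version_locations"))  # .../migrations/db2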

Step 4: Implement env.py to Handle Multiple Databases Sequentially

Now, modify migrations/env.py so that Alembic knows how to run migrations for both databases. The env.py will need to:

  • Import the models for both databases (to load table definitions into each metadata).
  • Determine which database’s context is being run (or if all should run).
  • Set Alembic’s target_metadata to the appropriate metadata for autogeneration.
  • Create an engine/connection for each database and run migrations sequentially.

Below is a sample env.py that accomplishes this:

import asyncio
from logging.config import fileConfig

from sqlalchemy import create_engine, pool
from sqlalchemy.ext.asyncio import create_async_engine
from alembic import context

# Interpret the config file
config = context.config
fileConfig(config.config_file_name)

# Import models to populate metadata
from myapp.models.db1 import BaseDB1, User  # models for DB1
from myapp.models.db2 import BaseDB2, Product  # models for DB2

# Map database names to their metadata
METADATA_MAP = {
    "db1": BaseDB1.metadata,
    "db2": BaseDB2.metadata,
}

def do_run_migrations(connection, target_metadata):
    """Configure the migration context on an open connection and run migrations."""
    context.configure(
        connection=connection,
        target_metadata=target_metadata,
        compare_type=True   # include type changes in autogen
    )
    with context.begin_transaction():
        context.run_migrations()

async def run_async_migrations(url: str, target_metadata):
    """Run migrations for an asyncpg URL using Alembic's run_sync bridge."""
    connectable = create_async_engine(url, poolclass=pool.NullPool)
    async with connectable.connect() as connection:
        await connection.run_sync(do_run_migrations, target_metadata)
    await connectable.dispose()

# Utility function to run migrations for a given database name
def run_migrations_for(db_name: str):
    # Get database section config
    section_config = config.get_section(db_name) or {}
    url = section_config.get("sqlalchemy.url")
    if url is None:
        raise RuntimeError(f"No database URL found for '{db_name}'")

    target_metadata = METADATA_MAP[db_name]

    if url.startswith("postgresql+asyncpg"):
        # Async engine: drive the migration through an event loop and run_sync()
        asyncio.run(run_async_migrations(url, target_metadata))
    else:
        # Sync engine: connect and run migrations directly
        connectable = create_engine(url, poolclass=pool.NullPool)
        with connectable.connect() as connection:
            do_run_migrations(connection, target_metadata)

# Determine which databases to migrate
section = config.config_ini_section  # e.g. "db1", "db2", or "alembic"
databases = config.get_main_option("databases")  # e.g. "db1, db2" from alembic.ini

if section == "alembic" and databases:
    # No specific --name given: run all databases sequentially
    for name in [db.strip() for db in databases.split(",")]:
        run_migrations_for(name)
else:
    # Only run the specified database (with --name or default section)
    run_migrations_for(section)

How this works: When you run Alembic, the config.config_ini_section will be set to the target name (e.g. "db1" if you run alembic --name db1 ...). The code above checks if no specific name was provided (meaning section == "alembic" which is the default section) – in that case it loops through all configured databases and runs each sequentially. Otherwise, it runs migrations for just the specified DB section.

Inside run_migrations_for, we retrieve the proper database URL and look up the corresponding target_metadata. For a sync URL we simply create an engine with create_engine, open a connection, and call do_run_migrations, which configures the migration context and runs the migrations. For an async URL (asyncpg) we cannot connect from Alembic's blocking context directly, because the asyncpg dialect only works inside an async/greenlet context; instead we follow Alembic's asyncio recipe: create the engine with create_async_engine, open an async connection, and call connection.run_sync(do_run_migrations, target_metadata), with asyncio.run() driving the coroutine from the otherwise synchronous env.py. In both paths the context is configured with the metadata for that one database. The compare_type=True flag is included to ensure Alembic detects type changes; you can adjust Alembic config arguments as needed (e.g. compare_server_default=True).

Sequential migration: Each call to run_migrations_for opens its own connection and runs in its own transaction, so all migrations on DB1 finish completely before DB2 is touched; the two schemas never share a migration run. Step 6 covers how to apply migrations in practice: one database at a time with --name, or both together via a small wrapper.

Target metadata selection: We explicitly choose the metadata for the target database on each run. This is crucial for Alembic’s autogeneration to work properly for multiple databases (python - Using alembic with multiple databases - Stack Overflow). If you were to use the same target_metadata for both, Alembic would either miss differences or attempt to combine changes incorrectly. By mapping each Alembic “environment” to only the tables of one metadata, --autogenerate will compare the models and the actual database schema correctly for that one DB at a time.
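
If a database ever contains tables that are not managed by the metadata being migrated (for example, tables owned by another application on the same server), you can additionally filter what autogenerate considers with Alembic's include_object hook. A minimal sketch, passed to context.configure alongside target_metadata, that keeps autogenerate from emitting DROP TABLE for such tables:

def include_object(object, name, type_, reflected, compare_to):
    # Skip tables that exist in the database but have no counterpart in the
    # target metadata, so autogenerate will not try to drop them.
    if type_ == "table" and reflected and compare_to is None:
        return False
    return True

# context.configure(connection=connection,
#                   target_metadata=target_metadata,
#                   include_object=include_object)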

Step 5: Autogenerate Migrations for Each Database

With the above configuration, you can autogenerate new migrations for each database separately. Use Alembic’s --name flag to specify which DB you are targeting when creating revisions. For example:

$ alembic --name db1 revision --autogenerate -m "Create user table for DB1"
$ alembic --name db2 revision --autogenerate -m "Create product table for DB2"

Each command will:

  • Load env.py, set config.config_ini_section to the given name (db1 or db2), and apply the logic we wrote.
  • Alembic will call context.configure(... target_metadata=METADATA_MAP["dbX"], ...) for the specified DB. This means only the models for that DB will be considered.
  • The generated revision file will be placed in the corresponding migrations/db1/ or migrations/db2/ folder (because we set version_locations for each).

Each migration file includes the usual upgrade() and downgrade() functions with operations only for that database’s schema. For example, the DB1 revision might have an op.create_table('user', ...) for the User model, and DB2’s revision will have op.create_table('product', ...) for the Product model.
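
For illustration, the DB1 revision generated from the User model would look roughly like the sketch below. The revision identifiers are placeholders, and the exact column types depend on your SQLModel version (SQLModel typically emits its AutoString type for str fields):

"""Create user table for DB1"""
from alembic import op
import sqlalchemy as sa

# revision identifiers, used by Alembic (placeholder values here)
revision = "a1b2c3d4e5f6"
down_revision = None
branch_labels = None
depends_on = None

def upgrade():
    op.create_table(
        "user",
        sa.Column("id", sa.Integer(), nullable=False),
        sa.Column("name", sa.String(), nullable=False),
        sa.Column("email", sa.String(), nullable=False),
        sa.PrimaryKeyConstraint("id"),
    )

def downgrade():
    op.drop_table("user")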

Verify autogeneration: Since we import all models and set the correct metadata, Alembic’s autogenerate should correctly detect changes for each DB. Ensure that before running the revision --autogenerate, the target database is up to date (or empty for initial migration) and the models reflect the desired state. If a model is not imported in env.py, Alembic might think it was removed and generate a drop statement, so double-check that all models for that DB are imported into the environment. When done properly, autogeneration will work independently for both databases, producing separate migration scripts as expected.

Step 6: Applying Migrations to Both Databases

Finally, run the migrations on each database. You can apply them one at a time, or let env.py handle both sequentially:

  • Individual upgrades: You can upgrade each DB separately by specifying its name:

    alembic --name db1 upgrade head   # apply all migrations to Database 1
    alembic --name db2 upgrade head   # apply all migrations to Database 2

    This will execute the upgrade commands using the respective migration files in each folder.

  • Sequential upgrade in one command: The env.py loop also lets you run:

    alembic upgrade head

    without specifying --name. The env.py loop then connects to db1, applies its pending migrations, and repeats for db2, which can be convenient for dev environments or simplified deployment scripts. Two caveats apply: Alembic does not loop over databases= on its own (the loop lives in our env.py), and Alembic resolves the revision scripts from the config section chosen at startup, which for the default [alembic] section defines no version_locations. With this per-directory layout, the --name commands are therefore the dependable path; if you want a single "upgrade everything" entry point, the small wrapper sketched below invokes Alembic once per name and sidesteps the issue.
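
If you prefer an explicit "upgrade everything" entry point, a small wrapper over Alembic's command API does the job while letting Alembic resolve each database's script and version locations correctly; the section names db1 and db2 are assumed from alembic.ini:

from alembic import command
from alembic.config import Config

# Upgrade each configured database in turn, equivalent to running
# "alembic --name <db> upgrade head" once per database.
for name in ("db1", "db2"):
    cfg = Config("alembic.ini", ini_section=name)
    command.upgrade(cfg, "head")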

After running the upgrades, check each database’s schema to confirm that:

  • Database1 has the user table (and any other objects defined in DB1 models) and its own alembic_version table with the latest revision ID for db1’s migration history.
  • Database2 has the product table (and any other DB2 objects) and an alembic_version entry for the latest db2 revision.

Each database’s alembic_version table will reflect only its migration timeline. Downgrades can be run similarly with alembic --name db1 downgrade -1, etc., targeting each DB as needed.
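
A small sketch for checking this from Python with Alembic's MigrationContext (sync psycopg2 URLs are assumed for both checks, since we only read one row):

from sqlalchemy import create_engine
from alembic.runtime.migration import MigrationContext

# Print the revision recorded in each database's alembic_version table
for name, url in {
    "db1": "postgresql+psycopg2://user:password@localhost:5432/database1",
    "db2": "postgresql+psycopg2://user:password@localhost:5432/database2",
}.items():
    with create_engine(url).connect() as conn:
        current = MigrationContext.configure(conn).get_current_revision()
        print(f"{name}: current revision = {current}")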

Step 7: Handling Async vs. Sync Engines in the Migration Flow

When using asynchronous SQLAlchemy engines (e.g., with asyncpg), Alembic migrations require some special handling because Alembic itself runs in a synchronous context. In our env.py above, we addressed this with Alembic's asyncio recipe:

  • We detect the URL scheme (postgresql+asyncpg), create an async engine with create_async_engine, open an async connection, and hand it to the ordinary migration logic via await connection.run_sync(do_run_migrations, target_metadata). run_sync executes the synchronous function on top of the async connection (using the greenlet bridge SQLAlchemy relies on for async dialects), so context.configure, begin_transaction, and run_migrations behave exactly as they do against a sync engine. asyncio.run() drives the coroutine, keeping the rest of env.py synchronous. Note that simply taking async_engine.sync_engine and calling .connect() on it does not work here: the asyncpg dialect can only operate inside that async/greenlet context, and a direct blocking connect raises a MissingGreenlet error.

  • An alternative approach is to avoid asyncio entirely and run migrations over a synchronous driver even though the application uses asyncpg: keep a psycopg2 URL in alembic.ini for migrations (as noted in Step 3), or rewrite the asyncpg URL before creating the engine (see the sketch below). This is also the option to reach for if migrations are triggered programmatically from code that already has a running event loop, since asyncio.run() cannot be nested inside one. Either way, Alembic compares metadata and generates the same migrations, because the schema operations do not depend on which driver executes them.

In summary, our configuration lets autogeneration and upgrades work transparently against the async database. You can keep using async sessions in your application, while Alembic briefly runs its own event loop per migration via asyncio.run(). The engine_from_config helper won't create async engines for you, which is why env.py builds each engine manually. If both of your databases use sync drivers, this extra handling isn't needed, but it's good to know how to manage an async DB in Alembic migrations.
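
If you would rather skip asyncio during migrations altogether, here is a minimal sketch of the driver-swap alternative mentioned above; it assumes psycopg2 is installed alongside asyncpg, and run_migrations_for would call this helper before create_engine() instead of branching on the URL scheme:

# Hypothetical helper: reuse the application's asyncpg URL but hand Alembic a
# synchronous psycopg2 URL, so migrations run without any event loop.
def sync_url(url: str) -> str:
    return url.replace("postgresql+asyncpg://", "postgresql+psycopg2://")

# sync_url("postgresql+asyncpg://user:password@localhost:5432/database2")
# -> "postgresql+psycopg2://user:password@localhost:5432/database2"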


Putting it all together, the above steps provide a robust setup for multiple databases:

  • Distinct model metadata: Each DB’s models are grouped under their own Base (metadata), preventing cross-database table mix-ups.
  • Separate migration folders: Each DB has its own versions directory, so Alembic tracks revisions independently (no accidental application of a migration to the wrong DB).
  • Unified Alembic configuration: A single alembic.ini and env.py can handle both databases. We use the --name flag or a loop to direct Alembic to the right place.
  • Autogeneration for each DB: By switching target_metadata per database, alembic revision --autogenerate produces correct migration scripts for each schema.
  • Async vs Sync: The async database is handled with Alembic's run_sync bridge (or, if you prefer, a sync driver used only for migrations), so schema operations still run in the familiar blocking flow.

Following this guide, you can maintain migrations on two PostgreSQL databases in parallel without conflicts, and confidently apply schema changes to each. This structure is scalable to more than two databases as well (just add more base classes, config sections, and loop entries as needed). Each database will evolve on its own migration timeline, under one project umbrella, using Alembic’s powerful tooling.
