Skip to content

Instantly share code, notes, and snippets.

@kissgyorgy
Last active June 26, 2024 20:01
Show Gist options
  • Save kissgyorgy/e2365f25a213de44b9a2 to your computer and use it in GitHub Desktop.
Save kissgyorgy/e2365f25a213de44b9a2 to your computer and use it in GitHub Desktop.
Python: py.test fixture for SQLAlchemy test in a transaction, create tables only once!
from sqlalchemy import create_engine
from sqlalchemy.orm import Session
from myapp.models import BaseModel
import pytest
@pytest.fixture(scope="session")
def engine():
return create_engine("postgresql://localhost/test_database")
@pytest.fixture(scope="session")
def tables(engine):
BaseModel.metadata.create_all(engine)
yield
BaseModel.metadata.drop_all(engine)
@pytest.fixture
def dbsession(engine, tables):
"""Returns an sqlalchemy session, and after the test tears down everything properly."""
connection = engine.connect()
# begin the nested transaction
transaction = connection.begin()
# use the connection with the already started transaction
session = Session(bind=connection)
yield session
session.close()
# roll back the broader transaction
transaction.rollback()
# put back the connection to the connection pool
connection.close()
@aqc-carlodri
Copy link

@kissgyorgy thanks for the feedback, will test further then, for some reason with your code only the first test works, the others fail to see anything in the DB even if I add stuff.

@kissgyorgy
Copy link
Author

Yes, that's the whole point of these fixtures. Tests should be independent from each other, DB should be empty at the start of every test. The nested transaction what makes it possible to commit anything within one test case but as it will be rolled back at the end, there is no effect on other tests.

@codebydant
Copy link

codebydant commented Mar 13, 2023

In my case, this is my fixture

@pytest.fixture(scope="session")
def dbsession(tables) -> Generator[TestingSessionLocal, Any, None]:
    """fixture that returns an sqlalchemy session, and after the test tears down everything properly."""
    connection = engines["TEST"].connect()
    # begin the nested transaction
    transaction = connection.begin()
    # use the connection with the already started transaction
    session = TestingSessionLocal()

    yield session

    session.close()
    # roll back the broader transaction
    transaction.rollback()
    # put back the connection to the connection pool
    connection.close()

and in the endpoint to avoid committing a transaction I use flush according to sqlachemy docs

https://docs.sqlalchemy.org/en/20/orm/session_api.html#sqlalchemy.orm.Session.flush

def seed_database(dbsession, records: List[object]) -> None:
    """function to add dummy data to the database

    Args:
        dbsession (fixture): database connection
        records (list[object]): list of items to be added in the database
    """
    dbsession.bulk_save_objects(records)
    dbsession.flush()

and in the endpoint

def test_mytable(dbsession, test_client, new_data):

    # GIVEN 3 records
    record1 = new_data(probability=0.1)
    record2 = new_data(probability=0.2)
    record3 = new_data(probability=0.3)
    seed_database(dbsession, records=[record1, record2, record3])

    # WHEN the endpoint is requested (GET)
    response = test_client.get(url=f"/myendpoint")

    # THEN check that the response is valid 
    assert response.status_code == 200
    assert len(response.json()) == expected

With this approach, every new test execution will have an empty database table for adding new registers

@kissgyorgy
Copy link
Author

@dtcMLOps if you you only flush in test code, there is no point using a nested transaction, as without commit, closing the session will clean up everything. The point of my snippet is that you can real code with transaction handling, committing and everything and tests will still be isolated and leave no junk in the database with the nested transaction rollback.

@codebydant
Copy link

codebydant commented Mar 20, 2023

@dtcMLOps if you you only flush in test code, there is no point using a nested transaction, as without commit, closing the session will clean up everything. The point of my snippet is that you can real code with transaction handling, committing and everything and tests will still be isolated and leave no junk in the database with the nested transaction rollback.

Hi @kissgyorgy, your approach to running a test is not correct if I want to run tests concurrently. Commiting to the database will make the next test case that is pointing to the same table fail since there are "registers" committed from the previous test case. My approach allows me to run the test cases concurrently.

@pxu-mdsol
Copy link

pxu-mdsol commented Mar 31, 2023

@kissgyorgy are you sure your dbsession.commit() in your code really commiting to the database? I set the breakpoint right before and after dbsession.add(...), and dbsession.commit(), while monitoring the database status (I am using a database in container). I do see your line 14 and 16 create and drop tables in the database, but I never see dbsession.commit() really do anything in the database.

Overall it seems that all tests passed, however I have doubts whether physically "committing to the database" is true.

Could you post one of your test cases?

@kissgyorgy
Copy link
Author

@pxu-mdsol the whole point of this is that things are committed to a sub-transaction only, so nothing will be saved in the database after the test case. It depends on isolation level settings if committed things can be seen from other connections/transactions. See: https://en.wikipedia.org/wiki/Isolation_(database_systems)

@pxu-mdsol
Copy link

@kissgyorgy Thanks for the quick response. I know that "nothing will be saved in the database after the test case", how about during the test case? I assume you meant that it does "change the database", it's just everything rollback at the end of each test case.

As for as whether I can monitor that "change" in the database with a separate session in the database (such as a mysql query in a mysql client) concurrently with the test session, your point is that depends on the isolation level settings?

@pxu-mdsol
Copy link

pxu-mdsol commented Mar 31, 2023

Ok, I realized that I missed the role of "transaction" in line 24. It seems that if this "transaction" is created, then dbsession.commit() in the test case will not commit changes to the database, unless we literally call transaction.commit().

So the direct answer to my question:

"nothing will be saved in the database after the test case, how about during the test case?"

is NO. Even during the test case, changes never commit to database. They all stay in transaction, and finally canceled by your line 32.

@kissgyorgy
Copy link
Author

@pxu-mdsol

"nothing will be saved in the database after the test case, how about during the test case?"
is NO. Even during the test case, changes never commit to database. They all stay in transaction, and finally canceled by your line 32.

That is False. I'm not familiar with database implementation specifics, but I'm pretty sure you can find files on disk related to transactions for certain databases and engines. A concrete example is SQLite in WAL mode. Also as I mentioned, you can see the data committed during test based on isolation levels, so the data is definitely saved in the database somewhere.

I think the more precise way to put it is that "during test case, data will be saved in the database in a transaction and after the test case, everything will be deleted by rolling back the transaction."

@pxu-mdsol
Copy link

@kissgyorgy I agree you -- transaction is saved in the database. Thanks for the discussion!

@bashtian-fr
Copy link

its been a long time since this gist was posted, but I am having really inconsistent behaviors where one time it will succeed my tests, then another run it will complain that there is no such table. Like the create_all and drop_all weren't called or that the table already exists, Did anyone experience this behavior? I only have 3 tests in a class.

@kissgyorgy
Copy link
Author

The database and table creation fixtures are session scoped, so they are only created at the start of the first test which uses it and destroyed at the end of the test suite.
These SQLAlchemy APIs didn't change.
The only problem I can guess is that the test run was abrupted and the database tables were not deleted. You can try starting the tables fixture with drop_all() , so the tables are deleted BEFORE every test run.

@bashtian-fr
Copy link

Thanks for the answer, I tried that too but that would not help.
Altho I found the problematic part but I don't know how to solve it yet. The issue is that I have a lot of Core CPUs and I had pytest --numprocesses/-n on auto. My assumption is that 2 (or more) of the tests from the same class - which are using these db fixtures - were running on different processes which would create new test session. Setting it to 1 (or not adding it) solved the problem.

@kissgyorgy
Copy link
Author

With pytest-xdist, you should make a different test database for every process.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment