Skip to content

Instantly share code, notes, and snippets.

@chaitanyagupta
Last active March 16, 2021 13:20
Show Gist options
  • Save chaitanyagupta/3b98f7ddd9f4a62fffee2c60c24b5e91 to your computer and use it in GitHub Desktop.
Save chaitanyagupta/3b98f7ddd9f4a62fffee2c60c24b5e91 to your computer and use it in GitHub Desktop.
Simple problems to learn how django's querysets work

Simple Querysets

Objective of this exercise is to implement some of the interesting behaviours exhibited by Django querysets:

  • Lazy loading or on-demand evaluation
  • Results caching
  • Filters and chaining

Getting Started

You will need PostgreSQL, and the package psycopg2 or psycopg2-binary (the latter is easier to install).

Create a file called simple_queryset.py and add the following code to it.

from psycopg2 import sql, connect

DSN = None


def set_dsn(dsn):
    global DSN
    DSN = dsn


class SimpleQueryset:
    def __init__(self, table_name):
        self.table_name = table_name

    def evaluate(self):
        with connect(DSN) as connection:
            with connection.cursor() as cursor:
                cursor.execute(
                    # We need to use psycopg2.sql.SQL() in order to correctly
                    # use dynamic identifiers (table, column names, etc.) in our
                    # query. This is different from using query arguments, where
                    # you use %s in the query string and pass the query
                    # arguments to cursor.execute(). See
                    # https://www.psycopg.org/docs/sql.html#module-psycopg2.sql
                    sql.SQL(
                        'SELECT * FROM {}'
                    ).format(
                        sql.Identifier(self.table_name)
                    )
                )
                return cursor.fetchall()

Now test this out by running a Python interpreter. The following assumes that you have a table named employees in your database.

>>> from simple_queryset import SimpleQueryset, set_dsn

>>> set_dsn('dbname=test user=postgres password=secret')

>>> sq = SimpleQueryset('employees')  # 'employees' is name of the table

>>> sq.evaluate()  # Our employees table has two columns - id and name
[(1, 'lem e tweakit'), (2, 'alyssa p hacker'), (3, 'ben bitdiddle'), (4, 'louis reasoner'), (5, 'cy d fect')]

Once the above is running, solve the following problems.

Evaluation and Caching

  • Iteration - make SimpleQueryset iterable, and execute the database query the first time you iterate over it. Subsequent iterations over the queryset should use the cached results.
  • Slicing - SimpleQueryset should support slicing. Like django querysets, slicing an unevaluated simple queryset should return another unevaluated simple queryset i.e. slicing should not cache results.
  • Printing querysets - When a simple queryset is printed, it should evaluate its results. But should the results be cached? Observe the behaviour of django querysets - why are the results not cached when printing django querysets? Is it because django specifically cares about not caching querysets when printed, or is it due to some other aspect of querysets that the printing methods use?
    • When an object is printed using the print() function, what method on that object is called to get the print representation?
    • To print an object in the interactive interpreter, which function is called - print() or something else? If it's something else, then what is it, and which method on the object does it call to get the print representation?
  • len() - A simple queryset should be evaluated and cached when you call len() on it. This should return the number of results.
  • Testing a simple queryset in a boolean context such as an if statement should cause the queryset to be evaluated and cached. If there is at least one result, the queryset is True else it is False.

Instance Methods

The following instance methods should be defined for SimpleQueryset.

  • all() - Should return an unevaluated copy of the current queryset. If the current queryset cached its results, and the database might have changed since it was evaluated, you can get updated results for the same query by calling all() on it and evaluating the newly returned queryset.
  • filter() - Needs to support exact lookups only e.g. SimpleQueryset('employees').filter(bu='acme', department='engg'). Like all(), this should also return an unevaluated queryset every time it is called.
  • count() - Should return a count of the queryset using the SQL count() function. Does not evaluate the queryset.
  • exists() - Returns True fi the queryset contains any results, False otherwise. Does not evaluate the queryset.

Chaining

SimpleQueryset should support chainnig. For example, all of the following should work:

SimpleQueryset('employees').filter(bu='acme').filter(department='engg')

SimpleQueryset('employees').filter(bu='acme').all()

SimpleQueryset('employees').all().filter(department='engg')

SimpleQueryset('employees').filter(department='engg').count()

SimpleQueryset('employees').filter(bu='acme').exists()

Like django querysets, a chain of filter()s should also return an unevaluated queryset.

Bonus - Q objects

Q objects allow you to include OR and NOT clauses in the SQL WHERE clause. Implement Q objects for SimpleQueryset.

SimpleQueryset('employees').filter(Q(name='alyssa p hacker') | Q(name='lem e tweakit'))

SimpleQueryset('employees').filter(~Q(department='finance'))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment