@ptone
Created August 6, 2012 23:59

App-loading refactor reviewer's notes

The concept of the app-loading refactor has been around for 4-5 years. From Django's earliest days, there has always been a tight coupling between the notion of a Django "app" and its models module. In fact, internally the term "app" essentially meant the models module. As Django has matured, an "app" has come to be understood as a rough abstraction that encapsulates some set of features for a project, and that may or may not involve models. To Django this abstraction is generally a Python package created by startapp, with modules that are imported explicitly in user code, registered through interfaces in Django (admin), or found by agreed-upon filesystem location (templates, static). Modules may follow naming conventions not enforced or used by the framework (views, urls, forms etc).

The primary goal of app-loading is to define a Python object that can represent this encapsulation.

Current Benefits:

  • configure app specific settings in installed apps, not in global settings
  • have a defined place where code for an app runs once, and runs early - no more sticking it in models.py so that it executes at import time
  • apps no longer require a models.py
  • translatable app label for use in the admin
  • allow configuration in settings.py of things like db_prefix
  • a foundation for a place to define app interfaces, for use by both Django and between coordinating 3rd party apps (think CMS plugins)

How it works:

A global instance of an AppCache is created and populated early on in the startup process of Django.

https://github.com/ptone/django/commit/de94a482d834539b9735d84afb2efdf31b8f0999

populating the app_cache consists of iterating over settings.INSTALLED_APPS in a threadsafe way

each item in INSTALLED_APPS is either:

  1. A string consisting of the fully qualified name of an app package (The current convention)
  2. A tuple consisting of
    • an app name as in 1 or a fully qualified name to a subclass of django.apps.base.App
    • a dictionary of kwargs passed to the __init__ method of the app for this entry

https://github.com/ptone/django/blob/app-loading/django/apps/cache.py#L89
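
As a rough sketch of the two entry forms described above (Python 3 syntax; the myproject names and the kwargs are made up for illustration, and normalize_entry is a hypothetical helper, not something taken from the branch):

```python
def normalize_entry(entry):
    """Return (name, kwargs) for one INSTALLED_APPS entry.

    Hypothetical helper: a plain string gets empty kwargs, a tuple is
    unpacked into its dotted name and its dictionary of kwargs.
    """
    if isinstance(entry, str):
        return entry, {}
    name, kwargs = entry
    return name, dict(kwargs)


INSTALLED_APPS = (
    # Form 1: fully qualified name of an app package.
    "django.contrib.admin",
    # Form 2: (app package name, kwargs for the app's __init__).
    ("myproject.blog", {"db_prefix": "blog_"}),
    # Form 2 with a dotted path to an App subclass (hypothetical class).
    ("myproject.shop.app.ShopApp", {"currency": "EUR"}),
)

normalized = [normalize_entry(entry) for entry in INSTALLED_APPS]
```

Either form ends up as an app name plus a (possibly empty) dictionary of kwargs, which is what gets handed on to the loading step.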

the app name and kwargs are then passed to django.apps.cache.AppCache.load_app

https://github.com/ptone/django/blob/app-loading/django/apps/cache.py#L165

This method determines whether a subclass of App is explicitly specified in INSTALLED_APPS, and if not, it will create one on the fly from the app name using a class method.

The App class resembles the pattern in Django models, using a metaclass to establish an options object at <instance>._meta which is used to contain bits related to django's internal implementation.
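
To make the pattern concrete, here is a minimal sketch of a metaclass that attaches an options object, plus a class method that builds a subclass on the fly. It is written in Python 3 syntax, and AppOptions, the app_name attribute and from_name are assumed names for illustration; the branch's actual classes may differ.

```python
class AppOptions:
    """Holds framework-internal details about an app (assumed shape)."""

    def __init__(self, name):
        self.name = name
        # The label defaults to the last component of the dotted path.
        self.label = name.rsplit(".", 1)[-1]
        self.models = {}


class AppBase(type):
    """Metaclass that builds _meta for every App subclass naming an app."""

    def __new__(mcs, cls_name, bases, attrs):
        app_name = attrs.pop("app_name", None)
        new_class = super().__new__(mcs, cls_name, bases, attrs)
        if app_name is not None:
            new_class._meta = AppOptions(app_name)
        return new_class


class App(metaclass=AppBase):
    def __init__(self, **kwargs):
        # kwargs stand in for the dictionary given in INSTALLED_APPS.
        for key, value in kwargs.items():
            setattr(self, key, value)

    @classmethod
    def from_name(cls, name):
        """Create an App subclass on the fly from a dotted app name."""
        label = name.rsplit(".", 1)[-1]
        return type(cls)(label.capitalize() + "App", (cls,), {"app_name": name})


BlogApp = App.from_name("myproject.blog")
blog = BlogApp(db_prefix="blog_")
```

In this sketch _meta lives on the class and is reached through the instance, the same way model instances reach their class's _meta.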

During the app instantiation, models are not loaded; that happens later on, during the population of the app_cache.

As each app is loaded, the app_cache is populated with an 'App' instance for each installed app.

After all apps are instantiated, they are looped over and each app:

  • attempts to re-associate any already loaded models
  • imports the models module, if one exists for that app
  • attempts to call a post_load method, if one is defined on the app
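
The three steps above could be sketched roughly as follows. This is a simplified stand-in, not the branch's code: the App class here only records which hooks ran, and import_models_module is a hypothetical helper.

```python
import importlib


class App:
    """Stand-in app object that records which hooks were called."""

    def __init__(self, name):
        self.name = name
        self.models_module = None
        self.log = []

    def relocate_models(self):
        self.log.append("relocate_models")

    def post_load(self):
        self.log.append("post_load")


def import_models_module(app_name):
    """Import <app_name>.models, or return None if the import fails.

    Note: catching ImportError this broadly would also hide import
    errors raised inside an existing models module.
    """
    try:
        return importlib.import_module(app_name + ".models")
    except ImportError:
        return None


def populate(apps):
    """Second pass over apps that have already been instantiated."""
    for app in apps:
        app.relocate_models()
        app.models_module = import_models_module(app.name)
        post_load = getattr(app, "post_load", None)
        if callable(post_load):
            post_load()


# "json" is used only because it is importable and has no models module.
apps = [App("json")]
populate(apps)
```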

At this point a digression is needed about how models and apps are related, and how that is handled during both model import time and during the population of the app_cache.

App loading takes the opinion that all models are stored in some instance of an App object. When a model is imported, the models metaclass will register models with the app_cache via the historical route of register_models:

https://github.com/ptone/django/blob/app-loading/django/apps/cache.py#L433

For this case, and in app-loading in general, an app_label acts as the 'primary key' for the apps eco-system.

Previously the app cache was, in part, just a dictionary of {<app_label>:{<model name>:<model>,…}}

Now that we have a set of instantiated app objects, models should be associated with those apps.

For the register_models method (called via ModelBase.__new__): if a matching app is found, the model is instantiated and then added to the app's models collection. If an app is not found, a 'naive' app is created. This is an app that has been created just for organization of imported models. If the app is in fact also in INSTALLED_APPS, this is a transient instance of an app, and the model will be moved to its proper app at the end of cache population.

We now return in more detail to the app_cache at the stage after all apps have been instantiated. At the end of the app_cache population, each app's 'relocate_models' method is called, which reviews already installed apps for ones with the same label that may have been naively created by model imports prior to app_cache population. If it finds them - it moves them onto itself, updates any related model._meta info as appropriate, and then removes the transient app instance that was created by model import.

https://github.com/ptone/django/blob/app-loading/django/apps/base.py#L82
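
A toy version of this naive-app dance, to show the moving parts. All names are simplified assumptions: models are plain classes, model.app stands in for model._meta.app, and register_model takes one model rather than mirroring the real register_models signature.

```python
class App:
    def __init__(self, label, naive=False):
        self.label = label
        self.naive = naive  # True for apps created only to hold models
        self.models = {}

    def relocate_models(self, cache):
        """Adopt models held by naive apps that share this app's label."""
        for other in list(cache.loaded_apps):
            if other is self or not other.naive or other.label != self.label:
                continue
            for key, model in other.models.items():
                self.models[key] = model
                model.app = self  # stands in for updating model._meta
            cache.loaded_apps.remove(other)


class AppCache:
    def __init__(self):
        self.loaded_apps = []

    def find_app(self, label):
        for app in self.loaded_apps:
            if app.label == label:
                return app
        return None

    def register_model(self, app_label, model):
        app = self.find_app(app_label)
        if app is None:
            # No app yet: create a transient 'naive' app for the model.
            app = App(app_label, naive=True)
            self.loaded_apps.append(app)
        app.models[model.__name__.lower()] = model
        model.app = app


class Article:
    pass


cache = AppCache()
cache.register_model("blog", Article)  # imported before the app exists
naive_app = cache.find_app("blog")

real_app = App("blog")                 # created later from INSTALLED_APPS
cache.loaded_apps.append(real_app)
real_app.relocate_models(cache)
```

After relocate_models runs, only the real app remains in loaded_apps and the model points at it.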

This all happens very early in runtime - so the number of these naively loaded models should be minimal, and I believe cases of models being imported that aren't associated with an installed app are also rare.

Now a brief note about the model._meta attributes

first a new 'app' attribute is added to the model._meta to reference the 'owning' app instance

the 'installed' attribute is now proxied to the owning app via a property

the db_table is set using an app's configuration of db_prefix if available, and this will be updated at the end of populate if needed by the app's register_models method.
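
A small sketch of how these three _meta attributes could hang together. It computes db_table lazily through a property, which is a simplification of "set, then updated at the end of populate"; the fallback prefix of label plus underscore is an assumption modelled on Django's usual table naming.

```python
class App:
    def __init__(self, label, installed=True, db_prefix=None):
        self.label = label
        self.installed = installed
        self.db_prefix = db_prefix


class Options:
    """Simplified stand-in for a model's _meta."""

    def __init__(self, object_name, app):
        self.object_name = object_name
        self.app = app  # the 'owning' app instance

    @property
    def installed(self):
        # Proxied to the owning app rather than stored on the model.
        return self.app.installed

    @property
    def db_table(self):
        prefix = self.app.db_prefix
        if prefix is None:
            prefix = self.app.label + "_"  # assumed default
        return prefix + self.object_name.lower()


default_meta = Options("Article", App("blog"))
prefixed_meta = Options("Article", App("blog", db_prefix="news_"))
```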

Structuring the population of the app_cache in this sequence allows for a model to access configuration information from an app object - since hopefully most models will be loaded after all app objects are instantiated. This is relevant to implementing #3011 on top of app-loading, as well as any feature that might require configuration via an app at the time of model class instantiation.

the result is a global app_cache instance of django.apps.cache.AppCache that contains a single 'loaded_apps' structure which is a list of instances of django.apps.base.App - each of which has a _meta.models dictionary of {object_name:modelclass_instance}.

For accessing a given app - there are two ways:

app_cache.get_app_instance(<app_label>) can be used by app_cache machinery before the app_cache is fully populated.

however for convenience, after the cache has been populated, a named tuple is set on the app_cache under the attribute 'apps' which allows dotted path access to individual app instance attributes.

so the following are equivalent:

my_app = app_cache.get_app_instance('myapp')
my_var = my_app.some_attribute

and:

my_var = app_cache.apps.myapp.some_attribute
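
For illustration, the named tuple could be built along these lines once population is done. This is a sketch with stand-in classes; finish_populate is a hypothetical method name.

```python
from collections import namedtuple


class App:
    def __init__(self, label, **attrs):
        self.label = label
        for key, value in attrs.items():
            setattr(self, key, value)


class AppCache:
    def __init__(self):
        self.loaded_apps = []
        self.apps = None  # only available after population

    def get_app_instance(self, app_label):
        for app in self.loaded_apps:
            if app.label == app_label:
                return app
        raise LookupError("no app with label %r" % app_label)

    def finish_populate(self):
        # Field names are the app labels, so each app becomes an attribute.
        labels = [app.label for app in self.loaded_apps]
        Apps = namedtuple("Apps", labels)
        self.apps = Apps(*self.loaded_apps)


app_cache = AppCache()
app_cache.loaded_apps.append(App("myapp", some_attribute=42))
app_cache.loaded_apps.append(App("auth"))
app_cache.finish_populate()
```

One consequence of this approach is that labels have to be valid, unique Python identifiers, which labels derived from package names already are.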

the following is a discussion of methods on app_cache

get_app and get_apps have been deprecated and replaced by app_cache.get_models_module and get_models_modules to better represent the new semantic meaning of what an 'app' is. Their exposure in models.loading is retained but deprecated; all internal code has been updated to use the new method names.
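
The usual shape of such a rename is a thin wrapper that warns and delegates. A sketch using the standard warnings module follows; the warning category and message are assumptions, not taken from the branch.

```python
import warnings


class AppCache:
    def __init__(self):
        self._models_modules = {}  # app_label -> models module

    def get_models_module(self, app_label):
        return self._models_modules[app_label]

    def get_app(self, app_label):
        """Deprecated alias kept for backwards compatibility."""
        warnings.warn(
            "get_app is deprecated; use get_models_module instead.",
            PendingDeprecationWarning,
            stacklevel=2,
        )
        return self.get_models_module(app_label)


deprecation_demo_cache = AppCache()
deprecation_demo_cache._models_modules["blog"] = "blog models module stand-in"
```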

get_model and get_models have revised implementations, but retain their existing contract. Their exposure as module level names in models.loading is deprecated.

These are currently also exposed as module level names in db.models.__init__. It remains an open question whether these should be deprecated at this location (currently they are not). On one hand it would be good to consolidate all use of these methods to their true location; on the other, they do relate to models specifically - and it might make sense to retain a models namespace representation of them.

get_app_instance is used to locate the app instance for a given app label

load_app has historically returned the models module of the loaded app. I'd like to propose a backwards incompatible change: that it return the app instance instead. The current return value has never been used internally in Django, and it is very unlikely anything is using it in the wild - the app instance simply feels like the right return value.

Future Benefits:

In addition to the benefits available as of the current implementation, the concept of an App base class and a cache of instantiated apps provides the foundation for defining interfaces and conventions for apps that can be used by Django itself, or by coordinated sets of 3rd party apps. The classic example is a CMS and a set of plugins.

Benefits hoped for, but not easily realized:

Multiple instances of the same app - this has always been a feature discussed when app-loading comes up. However it proves to be exceptionally problematic. First - it is very hard to support regular Python import statements like:

from myapp.models import Article

and have that be interpreted in the context of one app or another - it would require importing through some sort of utility.

But even harder than that, is how in a general purpose way - you tell Django when you would use one instance of an app, and when you would use another. Would it be in the URLs - at the ORM? There is just no simple way. I'm not ready to say it is impossible, but just that it shouldn't be a required feature to get the other benefits of an app-loading refactor.

There is currently a pre_apps_loaded signal that fires before any apps load, but because the apps use the metaclass pattern like models, there is no instance that exists to call a pre_load hook on until after the __metaclass__.__new__ has run - and much of the action occurs here. So a pre_load hook becomes kind of pointless. Really though - app_loading should happen early enough in the process that there isn't much room for something to happen that early on.

So some specific questions or areas where input is requested:

exposure at models.__init__

https://github.com/ptone/django/blob/app-loading/django/db/models/__init__.py#L1

should the app_cache methods continue to be exposed as module level names - or be deprecated as has been done for models.loading? See the associated discussion on get_model(s) above.

There has been a historical (added in 2007) feature of the app cache to store "app_errors". This was a dictionary stored on the app cache, and any entries present were printed to stdout as part of the report from validation.

There is no documentation about how an app or models file is to add anything to this data structure, and I can't imagine it has been used by anyone.

I'd like to propose deprecating it, or removing it outright as I believe it to be out and out cruft - but deprecation is fine if I'm wrong. The concept could be properly introduced and documented as part of: #16905. It's also related to #8579, and the eternally proposed (but never accepted) Summer of Code project.

https://github.com/ptone/django/blob/app-loading/django/core/management/validation.py#L30

supporting "models" package out of the box:

https://github.com/ptone/django/commit/f0cabee4027a686fea594cc0ae700f219cedf1db

This was a bit orthogonal - and so could be easily reverted. It supports having a "models" package instead of a module - the main incompatibility this raises is that it would no longer be valid to have an "app package" named "models".

[minor] should the convention for app specific configuration also be all caps?

Some areas left to explore for this round:

Fully decouple the app label from the module path.

While currently and historically you could explicitly declare a model to be associated with a specific app via the app_label Meta option, you cannot do the same for apps. Currently the app label is hardcoded from the fully qualified python path module name. However, I don't see a reason why a "label" option couldn't be supported in the app's Meta options. This would allow a truly mix and match landscape at the app instance level. There is a check for duplicate labels in cache.AppCache.load_app. So right now loading django.contrib.comments and mymodule.apps.comments cannot happen - even if they use different db_prefixes (duplicates of which are also checked for - and which is based on the app_label by default).
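
To illustrate the duplicate checks and what an explicit label would buy, here is a small standalone sketch. DuplicateAppError and check_unique are hypothetical names, and the default prefix of label plus underscore is an assumption.

```python
class DuplicateAppError(Exception):
    pass


def check_unique(loaded, label, db_prefix=None):
    """Accept an app if neither its label nor its db_prefix is taken.

    `loaded` is a list of (label, db_prefix) pairs accepted so far.
    """
    if db_prefix is None:
        db_prefix = label + "_"  # assumed default, derived from the label
    for other_label, other_prefix in loaded:
        if other_label == label:
            raise DuplicateAppError("duplicate app label: %s" % label)
        if other_prefix == db_prefix:
            raise DuplicateAppError("duplicate db_prefix: %s" % db_prefix)
    loaded.append((label, db_prefix))


loaded = []
check_unique(loaded, "comments")  # e.g. django.contrib.comments

# mymodule.apps.comments clashes on the label, whatever its db_prefix.
try:
    check_unique(loaded, "comments", db_prefix="mycomments_")
    clash = False
except DuplicateAppError:
    clash = True

# With an explicit (hypothetical) "label" option it could load alongside.
check_unique(loaded, "my_comments")
```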

What this allows for is more promise in the future of having something like multiple app instances, where a reusable app can be designed in a way similar to abstract models. It would in fact ship with abstract models, views, and an App base class. You would create concrete versions of the models, subclass the App class, give it all a unique label, and it would all just plug in. That is all in the future, but this would provide some flexibility there. Of course, allowing a "label" meta option could also be deferred to the future.

Allow discovery of a blessed app subclass for a given package

There is an app subclass at contrib.auth.app.AuthApp

however there is no way currently, given an installed app of django.contrib.auth, to discover its existence and use it.

This makes it harder to implement an easy migration towards potential new features such as a swappable auth user which would depend on an auth App - as to take advantage, people would need to change their installed apps to point to "django.contrib.auth.app.AuthApp".

While it may verge on too implicit, what I'd like to consider is looking at django.contrib.auth.__init__ for a module level "default_app_class" attribute or something, that could tell the app loading machinery to use a certain class.
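
One way such a lookup could work, sketched with importlib. The attribute name default_app_class comes from the proposal above, while the assumption that it holds a dotted path string, and the resolve_app_class helper itself, are illustrative only.

```python
import importlib


def resolve_app_class(package_name, fallback):
    """Return the App class a package asks for, or `fallback` if none.

    Assumes the package may define a module level `default_app_class`
    string holding the dotted path of the class to use.
    """
    package = importlib.import_module(package_name)
    dotted_path = getattr(package, "default_app_class", None)
    if dotted_path is None:
        return fallback
    module_name, class_name = dotted_path.rsplit(".", 1)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


class GenericApp:
    """Stand-in for the generic app class built from the package name."""


# "json" defines no default_app_class, so the fallback is returned.
chosen = resolve_app_class("json", GenericApp)
```

With this in place, an INSTALLED_APPS entry of django.contrib.auth could stay as it is, and the package itself would opt in to its App subclass.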

Draft User Docs

https://github.com/ptone/django/blob/app-loading/docs/topics/application-objects.txt
