Skip to content

Instantly share code, notes, and snippets.

@ogier
Created April 3, 2012 05:03
Show Gist options
  • Save ogier/2289395 to your computer and use it in GitHub Desktop.
Save ogier/2289395 to your computer and use it in GitHub Desktop.
auth.User: A sane replacement strategy.

Replacing auth.User with as little magic as possible

Here is my proposal for replacing auth.User in a sane, backwards-compatible, non-magical fashion.

  1. Split off the current functionality of auth.User into reusable components. Proof of concept that this can be done transparently is available: https://github.com/ogier/django/blob/auth-mixins/django/contrib/auth/mixins.py . The changes there, with independent password, profile, groups/permissions and identity support being shipped as abstract models, are totally transparent to Django's test suite.
  2. Ask users to reimplement a User model if they want to change anything. This, while arguably invasive, already has precedent in nearly every other batteries-included framework out there (See Ruby on Rails, ASP.NET MVC, Play, etc.). In addition, this has the advantage of making AUTH_PROFILE a dinosaur. Why implement a one-to-one model accessible indirectly via method on auth.User when you can just stick your fields where they belong.
  3. Break any hard dependencies on auth.User. Other proposals have suggested this is a bad idea, and have proposed various degrees of magic to solve this: mount a user-specified model at auth.User and then use a new LazyForeignKey to attach to it, or similar. This seems like too much magic to me. Just specify a new model via settings, eg USER_MODEL = 'myauth.SuperDuperUser'. Apps that need to foreign key to your User can use the existing string-based mechanism to bind to you. Apps that don't can just access request.user and see the runtime model.

Alternate proposals, and why they are too magical

Here is a collection of proposals that Russell Keith-Magee has kindly curated: https://code.djangoproject.com/wiki/ContribAuthImprovements .

My solution is Solution 2a, with a nice set of orthogonal primitives. I see it as far and away the easiest to implement (i.e. using the least magic) and for this reason the most viable and elegant solution. If done right, the circular dependencies and late-binding should be non-issues. models <-> settings dependencies are largely a red herring because it is my understanding that importing django.db.models causes settings to be imported anyways (if this is incorrect can someone slap me straight sooner rather than later?), and models <-> models dependencies already have a tried-and-true resolution strategy of just using strings in ForeignKey fields. So long as the solution doesn't involve having a class available at exactly django.contrib.auth.models.User, then lazy ModelForms aren't necessary any more than they are in existing cases.

Sorry if I am putting words into your mouth, Russell, but you seem to be in support of solution 2, that is, you see LazyForeignKeys and mounting a new model at auth.User as necessary prerequisites to a viable solution in the vein of my proposal, presumably so that existing apps can support the new pluggable user. For my part I don't think breaking off with existing apps is such a horrible thing. Many apps don't have hard dependencies on auth.User, only soft ones on request.user. The ones that do may break subtly when faced with a replacement user that doesn't quite fit the contract expected. Breaking noisily (perhaps by suppressing the auth.User model entirely when not in use) seems better to me.

Jacob's profile proposal

Jacob Kaplan Moss has a proposal based upon the extension and generalization of profiles to provide pluggable behavior. Although I think the community is seeing this proposal as a breath of fresh air because it doesn't depart as far from the status quo of User model + profile, I think anyone who takes a step back and looks at our user community can see that a break from the status quo is exactly what is needed. Sorry to be so harsh, but here are some specific problems I have with the solution:

  1. Efficient conglomeration of fields requires cooperation with the ORM. Automatically calling select_related() on all profile fields is basically a necessity.
  2. Jacob argues against a swappable user model: "I'm convinced that such an idea is ultimately a bad idea: it allows apps exert action at a distance over other apps. It would allow the idea of a user to completely change without any warning simply by modifying a setting." Jacob's solution in fact creates the runtime version of this same tarpit: An AUTH_PROFILES setting that, when changed, alters the core contract of the user object (if only at runtime). This is admittedly superior to having app settings modify the static database layout of said User, but I do think it is inferior to having a concrete model present in an installed app that defines a User.
  3. Jacob's solution forces a migration on every single app that uses auth.User. No matter how ugly auth.User is, it serves its stated purpose, so I prefer a solution that allows an opt-in migration to developer-controlled User models.
  4. It's undiscoverable. By this I mean that examination of the django/contrib/auth/models.py file is unlikely to be enlightening. The definition of the User object is an amalgation of an AUTH_PROFILES setting, various apps scattered over the file system, and the profile loading hooks somewhere in the guts of a metaclass. By way of contrast, my solution has one canonical source for a User: the model. Every bit of behavior can be traced from there by following imports and standard python inheritance.
  5. The identifier attribute is too strong. As an arbitrary length unique character field, there is great potential for it to be abused. For example, consider the following: Sally registers for a site using a Username and Password scheme with the username Sally456, which gets stored as the identifier. Eve sees this, and signs up for the Twitter handle Sally456. Eve then logs in to the Django site with Twitter's OAuth mechanism via an auth module that confirms that Eve owns the Twitter handle Sally456 then authenticates her to the account with identifier "Sally456", and she has pwned Sally's account. Stripping the User down to just an identifier and password sounds nice, but it just leads to a big ball of namespace collisions akin to what we find in other places like cache keys, except that every collision is a potential security hole in basic authentication.
  6. From a data integrity standpoint, it is less transparently secure to have core authentication in profile models that may or may not exist. Doing so forces all integrity into the application, which is now a necessary component in order to correctly initiate transactions that encompass the User+profiles. It is very difficult to specify cross-table integrity constraints in a profile table that itself doesn't need to exist. For example, how do I specify that a field is "Unique, Not-Null" when in fact the whole profile table entry is nullable?
  7. It needlessly complicates the database layout. Even if the database representation is nicely unified when it reaches your application, Django may not be the only system that wants to access the database. In a system with a bare User and eight profile modules, I would imagine it is nearly impossible to run a reasonable worker process on authenticated users, or administer users with phpMyAdmin.
  8. auth.User is one of the few areas in Django that prevents people from painlessly migrating to Django without requiring a data migration. For the most part, you can model your existing database in Python, and then run Django on top of it, but you can't do that with auth.User. Extending and generalizing the profile system will further entrench this impossibility: if you want to run admin, you will damn well have to make your data conform to Django's notions of users.
  9. It's magic. For example, the word "sugar" appears six times :(. There's a whole new Profile class, with requisite Meta options.

That's basically the gist of the problems I have with other magical solutions. Even though my proposal may seem like more of an upheaval, I hope I showed that it's really removing cruft, not adding it.

GSoC proposal

Scratch this. I was originally planning to submit this as a GSoC proposal, but it's become clear to me that the technical changes are minimal compared to the social friction that this discussion is causing (at least, assuming we go with a solution with no magic... hint hint). I'd rather just get this change in properly without having to justify Google providing money.

@freakboy3742
Copy link

To clarify my position: I make a distinction between "Replacing the auth.User model" and "providing a way to get access to a User model, which will be auth.User in the default case, but could be something else".

I'm in favour of the latter. I don't think auth.User should always refer to the currently active User model -- I think the developer should be forced to be explicit about the fact that they're dealing with a model that might vary. This will require developers to opt-in in their projects (ForeignKey(User) won't magically fix itself), but I see this as a good thing.

In my ideal world, my preferred option is actually Option 3; however, it seems that the App Refactor isn't as close to being trunk ready as I originally thought. That being the case, a variant on 2a is probably my preferred option. At the core, the two are actually pretty close -- the only real difference is the mechanism for specifying the pluggable model. My only real criticism of 2a is that it is User specific, rather than solving the generic problem of pluggable models. I've given my thoughts on this on django-dev; I'll try and get the ideas together on the wiki.

Regarding GSoC; I wouldn't completely give up on that. I suspect that a well documented, well tested, generic swappable model approach would probably fit nicely into a GSoC period. I know there isn't much time left to get a proposal in, but you've been actively engaged in the design discussion around this; if you can put together a quick summary of what you're planning, with plenty of references to the mailing list, I suspect you might still find yourself selected as a project.

@ogier
Copy link
Author

ogier commented Apr 4, 2012

Isn't ForeignKey(settings.USER_MODEL) explicit enough? If the purpose of LazyForeignKey is purely to abstract away the fact that the string setting is a model that should be registered at a certain location in the model cache then I don't think it's necessary. The term LazyForeignKey suggests to me a model whose class definition will be delayed above and beyond the usual time of foreign key resolution, not a foreignkey to an already defined class object whose attributes will later change.

@freakboy3742
Copy link

A LFK to a key, and a FK to a setting are essentially the same thing. I have a vague preference for a LFK because (1) it's ever so slightly less typing (no import, no 'settings.'), and more importantly, (2) it forces the user to be very clear that they're dealing with a swappable model. You can't just make a LFK to any random model or setting -- it needs to be one declared as swappable.

If you think a name other than LazyForeignKey will express that better, then go ahead; however, at this point, we're getting into bikeshed painting territory :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment