Unexpected UUIDField upset
That moment when you realise your Universally Unique Identifier... isn't.
@stevejalim – YJ Tech Lead
We like UUIDs at YJ. One use of them we're particularly fond of is sticking specific yet meaning-free labels on objects, so we don't have to worry about exposing 'walkable' integer PKs or human-readable usernames or any of that awkward stuff.
As such, we've used our own UUIDField implementation since pretty much the start of YunoJuno. But with Django 1.8 we saw the arrival of django.db.models.fields.UUIDField. It's nice: it uses the native uuid type on Postgres and a real Python UUID in, well, Python (instead of the hex UUID string our homespun version uses). We're now using it across lots of our models, and will be migrating our own UUIDFields to the new implementation.
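To make the difference between the two representations concrete, here is a minimal standard-library sketch. The variable names are purely illustrative, and this is not our homespun field's code; it only shows a real UUID object next to its hex-string form.

```python
import uuid

# A real UUID object, the kind of value the built-in field gives you in Python.
native = uuid.uuid4()

# The 32-character hex form, the kind of value a string-based field stores.
as_hex = native.hex

# Converting between the two is lossless.
round_tripped = uuid.UUID(hex=as_hex)

print(type(native).__name__, len(as_hex), len(str(native)))
print(round_tripped == native)
```

The hex form drops the hyphens (32 characters against 36 for str(native)), which is worth remembering if existing hex values ever need comparing with the new field's values.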
However, there's a gotcha, and it's (I think) all too easy to get caught by. Django's own UUIDField does not set a unique value per row in a Migration. That's to say, if you add a Django UUIDField to a model that already has records in the database, the Django Migration will give the same UUID to every object affected. New instances made after the migration get unique UUIDs, no problem, but the ones retro-fitted in the Migration all get the same value, with no warning.
The Django docs for models.UUIDField say:
The database will not generate the UUID for you, so it is recommended to use [the] default [keyword]
So, you dutifully set a callable (default=uuid.uuid4), which you might reasonably expect to be called.
And it is, but only once. The value is then reused, resulting in the same UUID for every existing row of the affected model.
Let's pause to allow that to sink in.
Now, this isn't a bug – it's the intended behaviour. And it stops being a gotcha if you sit and think about it at a deeper level: if you were doing it in SQL, a DB could only take a single argument for the DEFAULT clause when adding a new column, so the Django Migration can only provide a single value – hence why the callable is only called once. (Hat-tip: I'm paraphrasing Andrew Godwin talking about this in a Django ticket.)
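The single-value DEFAULT point can be seen without Django at all. The sketch below uses the standard-library sqlite3 module as a stand-in for Postgres, with a made-up mymodel table; it is not the SQL that Django emits, just the same idea: the callable is resolved once, the resulting literal goes into the DEFAULT clause, and every existing row picks it up.

```python
import sqlite3
import uuid

COUNT_DISTINCT = "SELECT COUNT(DISTINCT uuid) FROM mymodel"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO mymodel (name) VALUES (?)", [("a",), ("b",), ("c",)]
)

# Resolve the callable once: a DEFAULT clause needs one literal value.
default_value = uuid.uuid4().hex
conn.execute(
    "ALTER TABLE mymodel ADD COLUMN uuid TEXT DEFAULT '%s'" % default_value
)

# Every pre-existing row now reports the same value.
distinct_after_add = conn.execute(COUNT_DISTINCT).fetchone()[0]

# Backfill row by row, calling uuid4() once per row this time.
ids = [row[0] for row in conn.execute("SELECT id FROM mymodel").fetchall()]
for pk in ids:
    conn.execute(
        "UPDATE mymodel SET uuid = ? WHERE id = ?", (uuid.uuid4().hex, pk)
    )

distinct_after_backfill = conn.execute(COUNT_DISTINCT).fetchone()[0]
print(distinct_after_add, distinct_after_backfill)
```

The first count is 1 and the second matches the number of rows, which is the whole gotcha in miniature: the column default cannot vary per row, so uniqueness has to come from a separate pass over the data.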
You may think "Could the Migrations code know to run the callable multiple times, for each UUIDField addition? Via a lambda or special-case handling?". Something's surely possible, though it doesn't sound likely for Django right now – it'd be patch-it-yourself-and-make-a-persuasive-case territory. However, it does look like the docs will be improved to help people avoid the pitfall.
Anyway, what do you do about it with the current cut of the code? There's a How-To for such matters, handily, which boils down to something along these lines:
# -*- coding: utf-8 -*-
# foo/migrations/0002_example.py
from __future__ import unicode_literals

from django.db import models, migrations
import uuid


def set_uuids(apps, schema_editor):
    """
    Setting default=uuid.uuid4 is not enough as Django
    will set the same uuids to all models
    """
    MyModel = apps.get_model('foo', 'MyModel')
    for instance in MyModel.objects.all().order_by('-id'):
        instance.uuid = uuid.uuid4()
        instance.save(update_fields=['uuid'])
        print(
            "MyModel #%s uuid set to %s" %
            (instance.id, instance.uuid)
        )


def noop(apps, schema_editor):
    """Fake backwards migration."""
    pass


class Migration(migrations.Migration):

    dependencies = [
        ('foo', '0001_initial'),
    ]

    operations = [
        migrations.AddField(
            model_name='mymodel',
            name='uuid',
            field=models.UUIDField(default=uuid.uuid4, editable=False),
        ),
        migrations.RunPython(
            set_uuids,
            noop,
        )
    ]
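Once a migration like this has run, it may be worth checking that the backfill really did leave every row with its own value. Here is a small sketch of such a check; find_duplicate_uuids is a hypothetical helper, not part of Django or of our codebase, and it works on any iterable of values.

```python
import uuid
from collections import Counter


def find_duplicate_uuids(values):
    """Map each value that occurs more than once to its count (hypothetical helper)."""
    counts = Counter(values)
    return {value: n for value, n in counts.items() if n > 1}


# What the table looks like straight after AddField: one shared value.
shared = uuid.uuid4()
before_backfill = [shared, shared, shared]

# What it should look like after the backfill: one fresh value per row.
after_backfill = [uuid.uuid4() for _ in range(3)]

print(find_duplicate_uuids(before_backfill))
print(find_duplicate_uuids(after_backfill))
```

In a Django shell the input could plausibly be something like MyModel.objects.values_list('uuid', flat=True), assuming the model and field names from the migration above. An empty dict back means no duplicates were found.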
PS: for the record, our homespun UUIDField would always set a fresh uuid.uuid4().hex if no default was present (unless you configured it not to). That approach completely ducked this issue. It wasn't perfect, but that little YJ UUIDField has served us well.