Referencing external variables in Django data migrations - django

When using models in migrations, in Django we can use apps.get_model() to make sure that the migration will use the right "historical" version of the model (as it was when the migration was defined).
But how do we deal with "regular" variables (not models) imported from the codebase?
If we import a variable from another module, we will probably face issues in the future. For example:
If someday in the future we delete the variable (because we changed the implementation) this will break migrations. So we won't be able to re-run migrations locally to re-create a database from scratch.
If we modify the variable (e.g. we change the values in a list) this will produce unexpected effects when we run the reverse operation on an existing db.
So the question is: what's the best practice for writing migrations? Should we always hard-code values without importing external variables?
Example
Suppose I want to simply modify the value of a field with a variable that I've defined somewhere in the codebase. For example, I want to turn all normal users into admins. I stored user roles in an enum (UserRoles). One way to write the migration would be this:
from django.db import migrations
from user_roles import UserRoles
def change_user_role(apps, schema_editor):
    User = apps.get_model('users', 'User')
    users = User.objects.filter(role=UserRoles.NORMAL_USER.value)
    for user in users:
        user.role = UserRoles.ADMIN.value
    User.objects.bulk_update(users, ["role"])
def revert_user_role_changes(apps, schema_editor):
    User = apps.get_model('users', 'User')
    users = User.objects.filter(role=UserRoles.ADMIN.value)
    for user in users:
        user.role = UserRoles.NORMAL_USER.value
    User.objects.bulk_update(users, ["role"])
class Migration(migrations.Migration):
    dependencies = [
        ('users', '0015_auto_20220612_0824'),
    ]
    operations = [
        migrations.RunPython(change_user_role, revert_user_role_changes)
    ]
As you see, this will have the issues I mentioned above.
I've used the enum example, but this can be applied to every variable referenced inside the migrations.
So the question again: what's the best practice for migrations? Should we always hard-code values without referencing external variables that might change?
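One way to sidestep both problems, sketched here under assumptions rather than offered as an established rule, is to copy the values the migration needs into the migration file itself, so that it no longer depends on UserRoles. The string literals below are hypothetical stand-ins for whatever UserRoles.NORMAL_USER.value and UserRoles.ADMIN.value were when the migration was written.

```python
# Values frozen at the time this migration was written. They are deliberately
# NOT imported from user_roles, so later edits to (or removal of) the enum
# cannot change what this migration does.
# Both literals are assumptions made for illustration.
NORMAL_USER = "normal_user"
ADMIN = "admin"


def change_user_role(apps, schema_editor):
    # Historical model, exactly as in the original example.
    User = apps.get_model("users", "User")
    # A single UPDATE statement instead of loading every row into memory.
    User.objects.filter(role=NORMAL_USER).update(role=ADMIN)


def revert_user_role_changes(apps, schema_editor):
    User = apps.get_model("users", "User")
    # Caveat: this also demotes users who were already admins before the
    # forward step ran; the original example has the same limitation.
    User.objects.filter(role=ADMIN).update(role=NORMAL_USER)
```

The two functions plug into migrations.RunPython(change_user_role, revert_user_role_changes) in the same way as in the example above.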

Related

ForeignKey to AnonymousUser

I would like to follow this guideline and avoid a nullable ForeignKey.
I have a ForeignKey to the django User model.
If I try to store request.user in an instance of this model, I get this error:
ValueError: Cannot assign "<SimpleLazyObject: <django.contrib.auth.models.AnonymousUser>>":
"MyModel.user" must be a "User" instance.
I think it should be feasible to keep the data schema conditionless (that is, to avoid a nullable foreign key).
How could I solve this?
Those guidelines suggest you create a User instance that represents an anonymous user, otherwise known as a sentinel value. You'd have to keep track of it via some unique key, likely the username. Then you'd have to make sure it exists and that nobody else has actually created a user with that key; otherwise you run into other problems.
However, because of those outlined issues above I disagree with those guidelines. If your data model allows for optional relationships, then you absolutely should use NULL values.
Regarding the comment:
If there is no NULL in your data, then there will be no NullPointerException in your source code while processing the data :-)
Simply because there are no NULL fields, doesn't mean those conditions don't exist. You still are handling these edge cases, but changing the names and some of the syntax. You're still vulnerable to bugs because you still have as many conditions (and potentially more given that you have to now make sure your sentinel value is unique).
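If the foreign key is declared nullable, the view only needs to map an anonymous request to None before saving. A minimal sketch, assuming MyModel.user was declared with null=True; the helper name user_or_none is made up for illustration:

```python
def user_or_none(request):
    """Return the logged-in user, or None for an anonymous request."""
    user = request.user
    # Both User and AnonymousUser expose is_authenticated, so no
    # isinstance() check against AnonymousUser is needed.
    return user if user.is_authenticated else None
```

A view could then call something like MyModel.objects.create(user=user_or_none(request)), which stores NULL for anonymous visitors instead of raising the ValueError shown in the question.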
Hey, it's my first attempt at answering a question! I'm a newbie, but I had a similar error recently. I suppose this is a naive version of Nigel222's answer, which calls for doing this with a migration, but maybe there is something of value here nonetheless for another newbie who needs a simpler solution. I was influenced by an answer to this post by Ayman Al-Absi that suggests that you may need to reference this user by its auto-generated primary key.
By default, request.user is AnonymousUser when not authenticated. It seems from the error message, AnonymousUser can't be used as value for your foreign key in the User table.
Proposed solution:
from django.contrib.auth.models import User
# Start by creating a user called something like "anon" in some kind of initialization method.
# create_user() already saves the new row, so no separate save() call is needed.
tempUser = User.objects.create_user(username="anon", email="none", first_name="none", last_name="none")
# When the user is unauthenticated, before calling a method that takes request as a parameter, do:
if request.user.is_anonymous:
    request.user = User.objects.get(username='anon')
Another comment for the newbie. I made my own table called User in models.py. This became confusing. I had to import it with an alias:
from .models import User as my_user_table
It would have been better just to call it my_user_table to begin with.
Create a special instance of User for this purpose. The best place to do so is in a data migration for the model which will rely on being able to create a ForeignKey to this special User object. When you deploy your app and run makemigrations and migrate, it will create the special user object before there are any actual users in the DB.
There's a lot of detail on creating data migrations here
Here's an example of making sure that some Group objects will exist as of this migration for any future deployment.
# Generated by Django 2.2.8 on 2020-03-05 09:53
from django.db import migrations
def apply_migration(apps, schema_editor):
    Group = apps.get_model("auth", "Group")
    Group.objects.bulk_create(
        [Group(name="orderadmin"),
         Group(name="production"),
         Group(name="shipping")]
    )
def revert_migration(apps, schema_editor):
    Group = apps.get_model("auth", "Group")
    Group.objects.filter(name__in=["orderadmin", "production", "shipping"]).delete()
class Migration(migrations.Migration):
    dependencies = [
        ('jobs', '0034_auto_20200303_1810'),
    ]
    operations = [
        migrations.RunPython(apply_migration, revert_migration)
    ]
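The same pattern could be applied to the sentinel user itself. The following is only a sketch under stated assumptions: the project uses Django's default auth.User model, the migration that runs these functions depends (directly or indirectly) on the auth app's migrations, and "anon" is the sentinel username borrowed from the earlier answer. The function names are made up for illustration.

```python
ANON_USERNAME = "anon"  # assumed sentinel key; must never be given to a real user


def create_anonymous_user(apps, schema_editor):
    User = apps.get_model("auth", "User")
    # get_or_create keeps the step idempotent if the row already exists.
    # Historical models have no set_unusable_password(), so the password is
    # set to "!" directly; Django treats hashes starting with "!" as unusable,
    # which means nobody can log in as this user.
    User.objects.get_or_create(
        username=ANON_USERNAME,
        defaults={"password": "!", "is_active": False},
    )


def remove_anonymous_user(apps, schema_editor):
    User = apps.get_model("auth", "User")
    # Caveat: rows pointing at this user through a cascading ForeignKey
    # are deleted along with it.
    User.objects.filter(username=ANON_USERNAME).delete()
```

These would be passed to migrations.RunPython(create_anonymous_user, remove_anonymous_user) in the operations list, as in the Group example.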

How can I initialise group names in Django every time the program runs?

I have this code and I want it to just create the groups every time the program runs so that if the database is deleted it will still be a sufficient program itself and someone won't have to create groups again, do you know an easy way to do this?
system_administrator = Group.objects.get_or_create(name='system_administrator')
manager = Group.objects.get_or_create(name='manager')
travel_advisor = Group.objects.get_or_create(name='travel_advisor')
If you lose your DB, you'd have to rerun migrations on a fresh DB before the program could run again, so I think a data migration might be a good solution for this. A data migration is a migration that runs Python code to alter the data in the DB, not the schema as a normal migration does.
You could do something like this:
In a new migration file (you can run python manage.py makemigrations --empty yourappname to create an empty migration file for an app):
from django.db import migrations
def generate_groups(apps, schema_editor):
    # Assuming Group is Django's built-in model: it lives in
    # django.contrib.auth, so its app label is 'auth', not your own app's.
    Group = apps.get_model('auth', 'Group')
    Group.objects.get_or_create(name="system_administrator")
    Group.objects.get_or_create(name="manager")
    Group.objects.get_or_create(name="travel_advisor")
class Migration(migrations.Migration):
    dependencies = [
        ('yourappname', 'previous migration'),
        # Makes the auth models available in the historical app registry.
        ('auth', '0001_initial'),
    ]
    operations = [
        migrations.RunPython(generate_groups),
    ]
Worth reading the docs on this https://docs.djangoproject.com/en/3.0/topics/migrations/#data-migrations
You can do it in the ready method of one of your apps.
from django.apps import AppConfig
class YourApp(AppConfig):
    def ready(self):
        # Important: do the import inside the method
        from something import Group
        Group.objects.get_or_create(name='system_administrator')
        Group.objects.get_or_create(name='manager')
        Group.objects.get_or_create(name='travel_advisor')
The problem with the data migration approach is that it is only useful for populating the database the first time. If the groups are deleted after the data migration has run, you will need to populate them again.
Also remember that get_or_create returns a tuple.
group, created = Group.objects.get_or_create(name='manager')
# group is an instance of Group
# created is a boolean
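Because of that tuple, the assignments in the question bind (group, created) pairs rather than Group objects. A small helper that unpacks the result and can safely be run repeatedly might look like the sketch below; ensure_groups and GROUP_NAMES are names invented for illustration, and the manager is passed in so the helper works with either Group.objects or a historical model's manager.

```python
GROUP_NAMES = ("system_administrator", "manager", "travel_advisor")


def ensure_groups(group_manager, names=GROUP_NAMES):
    """Create any missing groups and return the names that were created."""
    created_names = []
    for name in names:
        # get_or_create returns (object, created_flag), so unpack both.
        group, created = group_manager.get_or_create(name=name)
        if created:
            created_names.append(name)
    return created_names
```

Calling ensure_groups(Group.objects) a second time should return an empty list, since every group already exists by then.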

Data migrations in Django

I am working on a data migration for a Django app to populate
the main table in the db with data that will form the mainstay of
the app - this is persistent/permanent data that may be added to but
never deleted.
My reference is the Django 1.7 documentation and in particular an
example on page
https://docs.djangoproject.com/en/1.7/ref/migration-operations/#django.db.migrations.operations.RunPython
with a custom method called forwards_func:
def forwards_func(apps, schema_editor):
    # We get the model from the versioned app registry;
    # if we directly import it, it'll be the wrong version
    Country = apps.get_model("myapp", "Country")
    db_alias = schema_editor.connection.alias
    Country.objects.using(db_alias).bulk_create([
        Country(name="USA", code="us"),
        Country(name="France", code="fr"),
    ])
I am assuming the argument to bulk_create is a list of Country model objects not namedtuple objects, although the format looks exactly the same. Is this the case, and could someone please explain what db_alias is?
Also, if I wish to change or remove existing entries in a table using a data migration what are the methods corresponding to bulk_create to do this?
Thanks in advance for any help.
Country behaves just like the class you would get from from app.models import Country. The only difference is that the import always gives you the latest model, whereas apps.get_model in a migration gives you the model as it was at the time of the migration. It continues to edit the model within the initial migration.
About bulk_create: its argument is indeed a list of unsaved Country objects, which it uses to do one large insert into your db. More information about bulk_create can be found here: https://docs.djangoproject.com/en/1.7/ref/models/querysets/#bulk-create.
About db_alias: it is the name of the database as set in your settings. Most of the time it is default, so you can leave it in your code even if you just use one database. The function will probably be called more than once if you have more databases set in your settings. More info about using: https://docs.djangoproject.com/en/1.7/ref/models/querysets/#using.
A bulk delete is actually quite simple: you just filter your countries and call delete on the queryset. So something like:
Country.objects.filter(continent="Europe").delete()
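For changing existing rows, the queryset counterpart is update(). A rough sketch of a migration function that does both, using only the name and code fields from the question's example; the function name and the specific values are made up for illustration:

```python
def change_and_remove_countries(apps, schema_editor):
    Country = apps.get_model("myapp", "Country")
    db_alias = schema_editor.connection.alias
    countries = Country.objects.using(db_alias)
    # Bulk change: a single UPDATE over all matching rows.
    # Note that update() does not call each object's save() method.
    countries.filter(code="us").update(name="United States")
    # Bulk removal: delete every row matched by the filter.
    countries.filter(code="fr").delete()
```

It would be registered with migrations.RunPython in the same way as forwards_func.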
About the persistent/permanent data question, I don't really have a solution for that one. One thing you can do, I think, is override the .delete() method on the model and its Manager.

OneToOne Fields on users causing some ID problems

I have a bit of a problem using django-registration and signals.
The basic setup is that I have a django 1.4.3 setup, with django-south and django-registration (and the db is SQLite for what it's worth).
EDIT: I changed the question a bit because the effect is the same in a shell, so the registration is not in cause (edits are in italic).
One of my models is related to the User model in the following way:
class MyUserProfile(models.Model):
    user = models.OneToOneField(User)
    #additional fields
I initialized the base using south.
When I do a little sqlall to check the sql that should be in it, I can clearly see:
CREATE TABLE "myApp_myuserprofile" (
    "id" integer NOT NULL PRIMARY KEY,
    "user_id" integer NOT NULL UNIQUE REFERENCES "auth_user" ("id"),
    #other fields
)
After that, I wanted to initialize the data if the user activated its account.
So in models.py I put
from django.dispatch import receiver
from registration.signals import user_activated
#Models....
@receiver(user_activated)
def createMyProfile(sender, **kwargs):
    currentUser = kwargs['user']
    profile = MyUserProfile(user=currentUser)  # plus default values for the other fields
    profile.save()
    #And now the reverse relation:
    currentUser.myuserprofile = profile
    currentUser.save()
While I am in there, everything seems alright, if I print the ids (both for the user and the profile), and if I travel back and forth between the 2, I see something that seems correct.
If I disable this part of the code and do the same kind of initialization using the shell, I get the same result.
But after that, everything is wrong.
If I open a shell and import the relevant matters, I have the following for every X value
MyUserProfile.objects.get(pk=X)
#DoesNotExist Exception
User.objects.get(pk=X).myuserprofile.pk
1
MyUserProfile.objects.all()[X].pk
1
Seems a bit weird no?
And now if I go to the sql shell
select id from myApp_myuserprofile;
1
1
1
1
...
So I have a primary key column filled with the same value in every row, which is, well, embarrassing to say the least (and does lead to problems, because every profile has the same id).
Any idea to what could be the cause of the problem and how I could solve it?
P.S.: Note that the foreign keys of the relation are correct and their uniqueness is preserved.
Well looks like the problem was indeed coming from the use of SQLite and South.
The doc states that:
SQLite doesn’t natively support much schema altering at all, but South has workarounds to allow deletion/altering of columns. Unique indexes are still unsupported, however; South will silently ignore any such commands.
Which was (I think) the case here, as I hadn't created this relation from the start but added it in a later migration. I just reset the database and the migrations, and voilà.
See What's the recommended approach to resetting migration history using Django South? for the migrations and a simple ./manage.py reset myApp for the base reset.

Django south migration error with unique field in postgresql database

Edit: I understand why this happened. It was because of the `initial_data.json` file. Apparently, South tries to load those fixtures after migrating but fails because of the unique constraint on a field.
I changed my model from this:
class Setting(models.Model):
    anahtar = models.CharField(max_length=20,unique=True)
    deger = models.CharField(max_length=40)
    def __unicode__(self):
        return self.anahtar
To this:
class Setting(models.Model):
    anahtar = models.CharField(max_length=20,unique=True)
    deger = models.CharField(max_length=100)
    def __unicode__(self):
        return self.anahtar
Schema migration command completed successfully, but, trying to migrate gives me this error:
IntegrityError: duplicate key value violates unique constraint
"blog_setting_anahtar_key" DETAIL: Key (anahtar)=(blog_baslik) already
exists.
I want to keep that field unique, but still migrate the field. By the way, data loss on that table is acceptable, so long as other tables in DB stay intact.
It's actually the default behavior of syncdb to run initial_data.json each time. From the Django docs:
If you create a fixture named initial_data.[xml/yaml/json], that fixture will be loaded every time you run syncdb. This is extremely convenient, but be careful: remember that the data will be refreshed every time you run syncdb. So don't use initial_data for data you'll want to edit.
See: docs
Personally, I think the use case for initial data that needs to be reloaded each and every time a change occurs is misguided, so I never use initial_data.json.
The better method, since you're using South, is to manually call loaddata on a specific fixture necessary for your migration. In the case of initial data, that would go in your 0001_initial.py migration.
def forwards(self, orm):
    from django.core.management import call_command
    call_command("loaddata", "my_fixture.json")
See: http://south.aeracode.org/docs/fixtures.html
Also, remember that the path to your fixture is relative to the project root. So, if your fixture is at "myproject/myapp/fixtures/my_fixture.json" call_command would actually look like:
call_command('loaddata', 'myapp/fixtures/my_fixture.json')
And, of course, your fixture can't be named 'initial_data.json', otherwise, the default behavior will take over.