In the documentation it says,
"Pass the RunPython.noop method to code or reverse_code when you want the operation not to do anything in the given direction. This is especially useful in making the operation reversible."
Sometimes it is possible that you want to revert a migration. For example you have added a field, but now you want to bring the database back to the state before the migration. You can do this by reverting the migration [Django-doc] with the command:
python3 manage.py migrate app_name previous_migration_name
Then Django will look how it can migrate back the the previous_migration_name and perform the operations necessary. For example if you renamed a field from foo to bar, then Django will rename it from bar back to foo.
Other operations are not reversible. For example if you remove a field in the migration, and that field has no default and is non-NULLable, then this can not be reversed. This makes sense since the reverse of removing a field is adding a field, but since there is no value to take for the existing records, what should Django fill in for that field that is recreated for the existing records?
A RunPython command is by default not reversible. In general in computer science one can not computationally determine the reverse of a function if any exists. This is a consequence of Rice's theorem [wiki]. But sometimes it is possible. If we for example constructed a migration where we incremented a certain field with one, then the reverse is to decrement all the fields with one, for example:
from django.db.models import F
from django.db import migrations
def forwards_func(apps, schema_editor):
MyModel = apps.get_model('my_app', 'MyModel')
db_alias = schema_editor.connection.alias
MyModel.objects.using(db_alias).all().update(
some_field=F('some_field')+1
])
def reverse_func(apps, schema_editor):
MyModel = apps.get_model('my_app', 'MyModel')
db_alias = schema_editor.connection.alias
MyModel.objects.using(db_alias).all().update(
some_field=F('some_field')-1
])
class Migration(migrations.Migration):
dependencies = []
operations = [
migrations.RunPython(code=forwards_func, reverse_code=reverse_func),
]
But sometimes it is possible that a (data)migration does nothing when you migrate it forward, or more common when you migrate it backwards. Instead of each time implementing an empty function, you can then pass a reference to noop, which does nothing:
from django.db.models import F
from django.db import migrations
def forwards_func(apps, schema_editor):
# … some action …
pass
class Migration(migrations.Migration):
dependencies = []
operations = [
migrations.RunPython(code=forwards_func, reverse_code=migrations.RunPython.noop),
]
Related
When using models in migrations, in Django we can use apps.get_model() to make sure that the migration will use the right "historical" version of the model (as it was when the migration was defined).
But how do we deal with "regular" variables (not models) imported from the codebase?
If we import a variable from another module, we will probably face issues in the future. For example:
If someday in the future we delete the variable (because we changed the implementation) this will break migrations. So we won't be able to re-run migrations locally to re-create a database from scratch.
If we modify the variable (e.g. we change the values in a list) this will produce unexpected effects when we run the reverse operation on an existing db.
So the question is: what's the best practice for writing migrations? Should we always hard-code values without importing external variables?
Example
Suppose I want to simply modify the value of a field with a variable that I've defined somewhere in the codebase. For example, I want to turn all normal users into admins. I stored user roles in an enum (UserRoles). One way to write the migration would be this:
from django.db import migrations
from user_roles import UserRoles
def change_user_role(apps, schema_editor):
User = apps.get_model('users', 'User')
users = User.objects.filter(role=UserRoles.NORMAL_USER.value)
for user in users:
user.role = UserRoles.ADMIN.value
User.objects.bulk_update(users, ["role"])
def revert_user_role_changes(apps, schema_editor):
User = apps.get_model('users', 'User')
users = User.objects.filter(role=UserRoles.ADMIN.value)
for user in users:
user.role = UserRoles.NORMAL_USER.value
User.objects.bulk_update(users, ["role"])
class Migration(migrations.Migration):
dependencies = [
('users', '0015_auto_20220612_0824'),
]
operations = [
migrations.RunPython(change_user_role, revert_user_role_changes)
]
As you see, this will have the issues I mentioned above.
I've used the enum example, but this can be applied to every variable referenced inside the migrations.
So the question again: what's the best practice for migrations? Should we always hard-code values without referencing external variables that might change?
I have written a data migration that initializes new tables with some rows. The objects have foreign keys to other tables, so I have to check that the foreign ids exist. If they don't, I would like to stop the migration with an error message.
I have written two functions: forward and reverse. What is a recommended way of stopping a migration from within forward?
def forward(apps, schema_editor):
...
def reverse(apps, schema_editor):
...
class Migration(migrations.Migration):
dependencies = [
("my_app", "0001_initial"),
]
operations = [
migrations.RunPython(code=forward, reverse_code=reverse)
]
I am trying to change the field type of one of attributes from CharField to DecimalField by doing an empty migrations and populate the new field by filling the migrations log with the following:
from __future__ import unicode_literals
from django.db import migrations
from decimal import Decimal
def populate_new_col(apps, schema_editor): #Plug data from 'LastPrice' into 'LastPrice_v1' in the same model class 'all_ks'.
all_ks = apps.get_model('blog', 'all_ks')
for ks in all_ks.objects.all():
if float(ks.LastPrice): #Check if conversion to float type is possible...
print ks.LastPrice
ks.LastPrice_v1, created = all_ks.objects.get_or_create(LastPrice_v1=Decimal(float(ks.LastPrice)*1.0))
else: #...else insert None.
ks.LastPrice_v1, created = all_ks.objects.get_or_create(LastPrice_v1=None)
ks.save()
class Migration(migrations.Migration):
dependencies = [
('blog', '0027_auto_20190301_1600'),
]
operations = [
migrations.RunPython(populate_new_col),
]
But I kept getting an error when I tried to migrate:
TypeError: Tried to update field blog.All_ks.LastPrice_v1 with a model instance, <All_ks: All_ks object>. Use a value compatible with DecimalField.
Is there something I missed converting string to Decimal?
FYI, ‘LastPrice’ is the old attribute with CharField, and ‘LastPrice_v1’ is the new attribute with DecimalField.
all_ks.objects.get_or_create() returns an All_ks object which you assign to the DecimalField LastPrice_v1. So obviously Django complains. Why don't you assign the same ks's LastPrice?
ks.LastPrice_v1 = float(ks.LastPrice)
That said, fiddling around with manual migrations seems a lot of trouble for what you want to achieve (unless you're very familiar with migration code). If you're not, you're usually better off
creating the new field in code
migrating
populating the new field
renaming the old field
renaming the new field to the original name
removing the old field
migrating again
All steps are vanilla Django operations, with the bonus that you can revert until the very last step (nice to have when things can take unexpected turns as you've just experienced).
I am adding a new field to an existing db table. it is to be auto-generated with strings.
Here is my code:
from django.utils.crypto import get_random_string
...
Model:
verification_token = models.CharField(max_length=60, null=False, blank=False, default=get_random_string)
I generate my migration file with ./manage.py makemigrations and a file is generated.
I verify the new file has default set to field=models.CharField(default=django.utils.crypto.get_random_string, max_length=60)
so all seems fine.
Proceed with ./manage.py migrate it goes with no error from terminal.
However when i check my table i see all the token fields are filled with identical values.
Is this something i am doing wrong?
How can i fix this within migrations?
When a new column is added to a table, and the column is NOT NULL, each entry in the column must be filled with a valid value during the creation of the column. Django does this by adding a DEFAULT clause to the column definition. Since this is a single default value for the whole column, your function will only be called once.
You can populate the column with unique values using a data migration. The procedure for a slightly different use-case is described in the documentation, but the basics of the data migrations are as follows:
from django.db import migrations, models
from django.utils.crypto import get_random_string
def generate_verification_token(apps, schema_editor):
MyModel = apps.get_model('myapp', 'MyModel')
for row in MyModel.objects.all():
row.verification_token = get_random_string()
row.save()
class Migration(migrations.Migration):
dependencies = [
('myapp', '0004_add_verification_token_field'),
]
operations = [
# omit reverse_code=... if you don't want the migration to be reversible.
migrations.RunPython(generate_verification_token, reverse_code=migrations.RunPython.noop),
]
Just add this in a new migration file, change the apps.get_model() call and change the dependencies to point to the previous migration in the app.
It maybe the token string to sort, so django will save some duplicates values. But, i'm not sure it is your main problem.
Anyway, I suggest you to handle duplicates values using while, then filter your model by generated token, makesure that token isn't used yet. I'll give you exampe such as below..
from django.utils.crypto import get_random_string
def generate_token():
token = get_random_string()
number = 2
while YourModel.objects.filter(verification_token=token).exists():
token = '%s-%d' % (token, number)
number += 1
return token
in your field of verification_token;
verification_token = models.CharField(max_length=60, unique=True, default=generate_token)
I also suggest you to using unique=True to handle duplicated values.
I have a simple data migration, which creates a Group, and which looks like this :
def make_manager_group(apps, schema_editor):
Group = apps.get_model("auth", "Group")
managers_group = Group(name="managers")
managers_group.save()
class Migration(migrations.Migration):
dependencies = [
('my_app', '0001_initial'),
('auth', '0006_require_contenttypes_0002'),
]
operations = [
migrations.RunPython(make_manager_group, reverse_code=lambda *args, **kwargs: True)
]
and a simple functional test app containing the following tests :
from django.contrib.auth.models import Group
from django.contrib.staticfiles.testing import StaticLiveServerTestCase
class FunctionalTest(StaticLiveServerTestCase):
def setUp(self):
print("Groups : {}".format(Group.objects.all()))
def test_2(self):
pass
def test_1(self):
pass
When I run the tests, I get :
Creating test database for alias 'default'...
Groups : [<Group: managers>]
.Groups : []
.
Clearly, the group is being created when the test db is created, but when this db is reset between tests, it is reset to an empty db and not to the state it was after all the migrations were applied.
The model itself doesn't contain anything special (I only created one for the migration not to be the first, as in the project I'm working on, but I'm not sure it is needed at all).
Is this a bug, or am I missing something about data migration, to be able to have my group created when every single test starts?
Edit 1 : I'm using Django 1.8.3
Edit 2 : Quick'n'dirty hack added to the setUp of the test class :
from django.contrib.auth.models import Group, Permission
if not Group.objects.all():
managers_group = Group(name="managers")
managers_group.save()
managers_group.permissions.add(
Permission.objects.get(codename='add_news'),
Permission.objects.get(codename='change_news'),
Permission.objects.get(codename='delete_news')
)
This is all but DRY, but until now, I couldn't find another way...
I answer my own question :
It seems to be a filed bug which has became documented
It says that using TransactionTestCase and its subclasses (like in my case LiveServerTestCase) doesn't insert the data migrations before every test. It is just done once for the first of them.
It also says that we could set serialized_rollback to True, which should force the rollback to the filled database. But in my case, I'm having the same error as the last message in the bug report.
So I'm going to stick with my dirty hack for now, and maybe create a data fixture, as it says that fixtures are used everytime.