How to fail a custom data migration? - django

I have written a data migration that initializes new tables with some rows. The objects have foreign keys to other tables, so I have to check that the foreign ids exist. If they don't, I would like to stop the migration with an error message.
I have written two functions: forward and reverse. What is a recommended way of stopping a migration from within forward?
from django.db import migrations
def forward(apps, schema_editor):
    ...
def reverse(apps, schema_editor):
    ...
class Migration(migrations.Migration):
    dependencies = [
        ("my_app", "0001_initial"),
    ]
    operations = [
        migrations.RunPython(code=forward, reverse_code=reverse)
    ]
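One common approach is to raise an exception inside forward: any uncaught exception aborts the migrate command, and on databases with transactional DDL (PostgreSQL, for example) the changes made by the migration are rolled back. The following is only a sketch; the Category model, the list of ids, and the MissingForeignKeys exception name are hypothetical placeholders, not something taken from the question.

```python
class MissingForeignKeys(Exception):
    """Raised to abort the migration when referenced rows are absent."""


def find_missing_ids(model, required_ids):
    """Return the ids from required_ids that have no row in the model's table."""
    existing = set(
        model.objects.filter(pk__in=required_ids).values_list("pk", flat=True)
    )
    return sorted(set(required_ids) - existing)


def forward(apps, schema_editor):
    # "my_app", "Category" and the ids below are hypothetical placeholders.
    Category = apps.get_model("my_app", "Category")
    missing = find_missing_ids(Category, [1, 2, 3])
    if missing:
        # An uncaught exception stops the migration and is shown as an error.
        raise MissingForeignKeys(f"Category rows not found: {missing}")
    # ... create the new rows here ...
```

The function is passed to migrations.RunPython exactly as in the migration above. Doing the check before creating any rows keeps the failure cheap even on databases that cannot roll back the whole migration.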

Related

Is django data migration immediately applied?

I read the following text in the docs:
"""
Django’s default behavior is to run in autocommit mode. Each query is immediately committed to the database, unless a transaction is active. See below for details.
"""
and I'm running the following data migration:
from django.db import migrations
def fill_query(apps, schema_editor):
    Result = apps.get_model('monitoring', 'Result')
    for r in Result.objects.all():
        r.query = r.monitored_search.query
        r.user_id = r.monitored_search.user_id
        r.save()
class Migration(migrations.Migration):
    dependencies = [
        ('monitoring', '0006_searchresult_user_id'),
    ]
    operations = [
        migrations.RunPython(fill_query),
    ]
But when I query Result, all rows still have query and user_id set to null, and the data migration is still running (there are more than 2 million rows in the table).
Will the changes only be applied once the data migration finishes, or is my data migration not working?
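A likely explanation is that the RunPython function runs inside a transaction, so other connections do not see the updated rows until the migration finishes and commits. If intermediate progress should be visible, one option is to write in batches. The sketch below assumes Django 2.2 or later (for bulk_update), assumes that monitored_search is a foreign key on Result, and uses an arbitrary batch_size of 1000.

```python
def fill_query(apps, schema_editor, batch_size=1000):
    Result = apps.get_model("monitoring", "Result")
    batch = []
    # select_related avoids one extra query per row for monitored_search;
    # iterator() streams rows instead of loading all of them into memory.
    rows = Result.objects.select_related("monitored_search").iterator(
        chunk_size=batch_size
    )
    for r in rows:
        r.query = r.monitored_search.query
        r.user_id = r.monitored_search.user_id
        batch.append(r)
        if len(batch) >= batch_size:
            # One UPDATE statement per batch instead of one per row.
            Result.objects.bulk_update(batch, ["query", "user_id"])
            batch = []
    if batch:
        # Flush the final, partially filled batch.
        Result.objects.bulk_update(batch, ["query", "user_id"])
```

For each batch to be committed as it is written, the Migration class would also need atomic = False; otherwise the whole migration still commits at the end.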

What does RunPython.noop() do?

In the documentation it says,
"Pass the RunPython.noop method to code or reverse_code when you want the operation not to do anything in the given direction. This is especially useful in making the operation reversible."
Sometimes it is possible that you want to revert a migration. For example you have added a field, but now you want to bring the database back to the state before the migration. You can do this by reverting the migration [Django-doc] with the command:
python3 manage.py migrate app_name previous_migration_name
Then Django will work out how to migrate back to previous_migration_name and perform the necessary operations. For example, if you renamed a field from foo to bar, then Django will rename it from bar back to foo.
Other operations are not reversible. For example, if you remove a field in the migration, and that field has no default and is non-NULLable, then this cannot be reversed. This makes sense, since the reverse of removing a field is adding a field, but there is no value to take for the existing records: what should Django fill in for the recreated field?
A RunPython command is by default not reversible. In general, one cannot computationally determine the reverse of a function, if any exists. This is a consequence of Rice's theorem [wiki]. But sometimes a reverse can be written by hand. If, for example, we constructed a migration that increments a certain field by one, then the reverse is to decrement that field by one, for example:
from django.db.models import F
from django.db import migrations
def forwards_func(apps, schema_editor):
    MyModel = apps.get_model('my_app', 'MyModel')
    db_alias = schema_editor.connection.alias
    MyModel.objects.using(db_alias).all().update(
        some_field=F('some_field') + 1
    )
def reverse_func(apps, schema_editor):
    MyModel = apps.get_model('my_app', 'MyModel')
    db_alias = schema_editor.connection.alias
    MyModel.objects.using(db_alias).all().update(
        some_field=F('some_field') - 1
    )
class Migration(migrations.Migration):
    dependencies = []
    operations = [
        migrations.RunPython(code=forwards_func, reverse_code=reverse_func),
    ]
But sometimes a (data) migration does nothing when you migrate it forward or, more commonly, when you migrate it backwards. Instead of implementing an empty function each time, you can pass a reference to noop, which does nothing:
from django.db import migrations
def forwards_func(apps, schema_editor):
    # … some action …
    pass
class Migration(migrations.Migration):
    dependencies = []
    operations = [
        migrations.RunPython(code=forwards_func, reverse_code=migrations.RunPython.noop),
    ]

How to solve the problem of Dual behavior of django CustomUser in migration?

I have a data migration as below in which I want to use create_user method of CustomUser, get an instance of the created user, and use this instance to create instance of Partner model.
It is worth mentioning that I have a Partner model that has a one-to-one relationship with CustomUser.
I have two options:
# Option One:
def populate_database_create_partner(apps, schema_editor):
    Partner = apps.get_model('partners', 'Partner')
    CustomUser.objects.create_user(
        id=33,
        email='test_email#email.com',
        password='password',
        first_name='test_first_name',
        last_name="test_last_name",
        is_partner=True,
    )
    u = CustomUser.objects.get(id=33)
    partner = Partner.objects.create(user=u, )
class Migration(migrations.Migration):
    dependencies = [
        ('accounts', '0006_populate_database_createsuperuser'),
    ]
    operations = [
        migrations.RunPython(populate_database_create_partner),
    ]
In option one, I see this error:
ValueError: Cannot assign "<CustomUser: test_email#email.com>": "Partner.user" must be a "CustomUser" instance.
I then test this:
# Option Two:
def populate_database_create_partner(apps, schema_editor):
    Partner = apps.get_model('partners', 'Partner')
    CustomUser = apps.get_model('accounts', 'CustomUser')
    CustomUser.objects.create_user(
        id=33,
        email='test_email#email.com',
        password='password',
        first_name='test_first_name',
        last_name="test_last_name",
        is_partner=True,
    )
    u = CustomUser.objects.get(id=33)
    partner = Partner.objects.create(user=u, )
class Migration(migrations.Migration):
    dependencies = [
        ('accounts', '0006_populate_database_createsuperuser'),
    ]
    operations = [
        migrations.RunPython(populate_database_create_partner),
    ]
I then see this error:
CustomUser.objects.create_user(
AttributeError: 'Manager' object has no attribute 'create_user'
The create_user method does not work.
If I do not use the create_user method and simply use CustomUser.objects.create(...), I will not be able to set the password properly here.
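Django only keeps limited historical information about each version of your models. One of the things it doesn't keep track of, as the migrations documentation notes, is custom model managers. That is why the historical CustomUser returned by apps.get_model has a plain Manager without create_user.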
The good news is that there's a way to force the migrations system to use your custom manager:
You can optionally serialize managers into migrations and have them available in RunPython operations. This is done by defining a use_in_migrations attribute on the manager class.
As noted, this just allows your migration to use the version of the manager that exists when the migration is run; so, if you later make changes to it, you could break the migration. A safer alternative is to just copy the relevant create_user code into the migration itself.
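A sketch of that safer alternative, assuming create_user does little more than hash the password before saving (the actual CustomUser manager is not shown, so this is a guess at its behavior). It uses make_password from django.contrib.auth.hashers and takes both models from apps, which also avoids the ValueError from option one.

```python
def populate_database_create_partner(apps, schema_editor):
    # Imported inside the function only to keep this sketch self-contained.
    from django.contrib.auth.hashers import make_password

    # Both models come from the historical app registry, so their types match.
    CustomUser = apps.get_model("accounts", "CustomUser")
    Partner = apps.get_model("partners", "Partner")
    user = CustomUser.objects.create(
        id=33,
        email="test_email#email.com",
        # Store a hash, as create_user would, rather than the raw password.
        password=make_password("password"),
        first_name="test_first_name",
        last_name="test_last_name",
        is_partner=True,
    )
    Partner.objects.create(user=user)
```

If the real create_user also normalizes the email or sets other fields, those steps would need to be copied as well.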

Data migration only executed for the first test

I have a simple data migration, which creates a Group, and which looks like this :
from django.db import migrations
def make_manager_group(apps, schema_editor):
    Group = apps.get_model("auth", "Group")
    managers_group = Group(name="managers")
    managers_group.save()
class Migration(migrations.Migration):
    dependencies = [
        ('my_app', '0001_initial'),
        ('auth', '0006_require_contenttypes_0002'),
    ]
    operations = [
        migrations.RunPython(make_manager_group, reverse_code=lambda *args, **kwargs: True)
    ]
and a simple functional test app containing the following tests :
from django.contrib.auth.models import Group
from django.contrib.staticfiles.testing import StaticLiveServerTestCase
class FunctionalTest(StaticLiveServerTestCase):
    def setUp(self):
        print("Groups : {}".format(Group.objects.all()))
    def test_2(self):
        pass
    def test_1(self):
        pass
When I run the tests, I get :
Creating test database for alias 'default'...
Groups : [<Group: managers>]
.Groups : []
.
Clearly, the group is being created when the test db is created, but when this db is reset between tests, it is reset to an empty db and not to the state it was after all the migrations were applied.
The model itself doesn't contain anything special (I only created one for the migration not to be the first, as in the project I'm working on, but I'm not sure it is needed at all).
Is this a bug, or am I missing something about data migration, to be able to have my group created when every single test starts?
Edit 1 : I'm using Django 1.8.3
Edit 2 : Quick'n'dirty hack added to the setUp of the test class :
from django.contrib.auth.models import Group, Permission
if not Group.objects.all():
    managers_group = Group(name="managers")
    managers_group.save()
    managers_group.permissions.add(
        Permission.objects.get(codename='add_news'),
        Permission.objects.get(codename='change_news'),
        Permission.objects.get(codename='delete_news')
    )
This is anything but DRY, but so far I couldn't find another way...
Answering my own question:
It turns out to be a reported bug that has since been documented.
The documentation says that with TransactionTestCase and its subclasses (such as LiveServerTestCase in my case), the data inserted by migrations is not restored before every test. It is only available for the first one.
It also says that setting serialized_rollback to True should force a rollback to the populated database. But in my case, I get the same error as in the last message of the bug report.
So I'm going to stick with my dirty hack for now, and maybe create a data fixture, since fixtures are reportedly loaded every time.
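One way to make the hack less repetitive, as a sketch only: put the group creation in a small helper that takes the model class as an argument, so the migration and the test setUp can share it. The helper name ensure_managers_group is hypothetical, and the permissions from the hack above are left out for brevity.

```python
def ensure_managers_group(group_model):
    """Create the "managers" group if it is missing and return it."""
    # get_or_create returns an (object, created) pair.
    group, _created = group_model.objects.get_or_create(name="managers")
    return group


def make_manager_group(apps, schema_editor):
    # In the migration, pass the historical Group model.
    ensure_managers_group(apps.get_model("auth", "Group"))
```

In setUp, the same helper could be called with the real model, as in ensure_managers_group(Group). Because it uses get_or_create, calling it before every test is harmless when the group already exists.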

Do django db_index migrations run concurrently?

I'm looking to add a multi-column index to a postgres database. I have a non blocking SQL command to do this which looks like this:
CREATE INDEX CONCURRENTLY shop_product_fields_index ON shop_product (id, ...);
When I add db_index to my model and run the migration, will it also run concurrently or will it block writes? Is a concurrent migration possible in django?
Since Django 3.0 there are AddIndexConcurrently and RemoveIndexConcurrently operations:
https://docs.djangoproject.com/en/dev/ref/contrib/postgres/operations/#django.contrib.postgres.operations.AddIndexConcurrently
Create a migration and then change migrations.AddIndex to AddIndexConcurrently. Import it from django.contrib.postgres.operations. The migration must also be non-atomic (atomic = False), because PostgreSQL cannot run CREATE INDEX CONCURRENTLY inside a transaction.
Since Django 1.10 you can create a concurrent index by using RunSQL and disabling the wrapping transaction, which is done by setting atomic = False as an attribute on the migration:
from django.db import migrations
class Migration(migrations.Migration):
    atomic = False  # disable transaction
    dependencies = []
    operations = [
        migrations.RunSQL('CREATE INDEX CONCURRENTLY ...')
    ]
RunSQL: https://docs.djangoproject.com/en/stable/ref/migration-operations/#runsql
Non-atomic Migrations: https://docs.djangoproject.com/en/stable/howto/writing-migrations/#non-atomic-migrations
You could use the SeparateDatabaseAndState migration operation to provide a custom SQL command for creating the index. The operation accepts two lists of operations:
state_operations are operations to apply to the Django model state. They do not affect the database.
database_operations are operations to apply to the database.
An example migration may look like this:
from django.db import migrations, models
class Migration(migrations.Migration):
    atomic = False
    dependencies = [
        ('myapp', '0001_initial'),
    ]
    operations = [
        migrations.SeparateDatabaseAndState(
            state_operations=[
                # operation generated by `makemigrations` to create an ordinary index
                migrations.AlterField(
                    # ...
                ),
            ],
            database_operations=[
                # operation to run custom SQL command (check the output of `sqlmigrate`
                # to see the auto-generated SQL, edit as needed)
                migrations.RunSQL(sql='CREATE INDEX CONCURRENTLY ...',
                                  reverse_sql='DROP INDEX ...'),
            ],
        ),
    ]
For Django 1.10 and later, use the non-atomic migration approach (atomic = False) described above.
For earlier versions of Django, I have had success with a more verbose subclassing method:
from django.db import migrations, models
class RunNonAtomicSQL(migrations.RunSQL):
    def _run_sql(self, schema_editor, sqls):
        if schema_editor.connection.in_atomic_block:
            schema_editor.atomic.__exit__(None, None, None)
        super(RunNonAtomicSQL, self)._run_sql(schema_editor, sqls)
class Migration(migrations.Migration):
    dependencies = [
    ]
    operations = [
        RunNonAtomicSQL(
            "CREATE INDEX CONCURRENTLY",
        )
    ]
You can do something like
import django.contrib.postgres.indexes
from django.db import migrations, models
from django.contrib.postgres.operations import AddIndexConcurrently
class Migration(migrations.Migration):
    atomic = False
    dependencies = [
        ("app_name", "parent_migration"),
    ]
    operations = [
        AddIndexConcurrently(
            model_name="mymodel",
            index=django.contrib.postgres.indexes.GinIndex(
                fields=["field1"],
                name="field1_idx",
            ),
        ),
        AddIndexConcurrently(
            model_name="mymodel",
            index=models.Index(
                fields=["field2"], name="field2_idx"
            ),
        ),
    ]
Ref: https://docs.djangoproject.com/en/dev/ref/contrib/postgres/operations/#django.contrib.postgres.operations.AddIndexConcurrently
Before Django 3.0, there was no built-in support for PostgreSQL concurrent index creation in Django.
Here is the ticket that requested this feature - https://code.djangoproject.com/ticket/21039
In those versions, you can instead manually specify a custom RunSQL operation in the migration -
https://docs.djangoproject.com/en/stable/ref/migration-operations/#runsql