Django migrations using RunPython to commit changes

I want to alter a foreign key in one of my models that can currently have NULL values so that it is no longer nullable.
I removed null=True from the field and ran makemigrations.
Because I'm altering a table that already has rows containing NULL values in that field, I was asked to either provide a one-off default value right away or to edit the migration file and add a RunPython operation.
My RunPython operation is listed BEFORE the AlterField operation and performs the required update so the field no longer contains NULL values (it only touches rows that currently hold NULL).
But, the migration still fails with this error:
django.db.utils.OperationalError: cannot ALTER TABLE "my_app_site" because it has pending trigger events
Here's my code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.db import models, migrations


def add_default_template(apps, schema_editor):
    Template = apps.get_model("my_app", "Template")
    Site = apps.get_model("my_app", "Site")
    accept_reject_template = Template.objects.get(name="Accept/Reject")
    Site.objects.filter(template=None).update(template=accept_reject_template)


class Migration(migrations.Migration):

    dependencies = [
        ('my_app', '0021_auto_20150210_1008'),
    ]

    operations = [
        migrations.RunPython(add_default_template),
        migrations.AlterField(
            model_name='site',
            name='template',
            field=models.ForeignKey(to='my_app.Template'),
            preserve_default=False,
        ),
    ]
If I understand correctly, this error may occur when a field is altered to be not-nullable while it still contains NULL values.
In that case, the only reason I can think of is that the RunPython operation's transaction didn't "commit" the changes to the database before the AlterField ran.
If this is indeed the reason, how can I make sure the changes are reflected in the database?
If not, what could be causing the error?
Thanks!

This happens because Django creates constraints as DEFERRABLE INITIALLY DEFERRED:
ALTER TABLE my_app_site
ADD CONSTRAINT "[constraint_name]"
FOREIGN KEY (template_id)
REFERENCES my_app_template(id)
DEFERRABLE INITIALLY DEFERRED;
This tells PostgreSQL that the foreign key does not need to be checked right after every command, but can be deferred until the end of the transaction.
So when a transaction modifies both content and structure, the constraint checks would either run in parallel with the structure changes or be scheduled to run after the structure is altered. Both of these states are ambiguous, and the database will abort the transaction instead of making any assumptions.
You can instruct PostgreSQL to check constraints immediately in the current transaction by calling SET CONSTRAINTS ALL IMMEDIATE, so structure changes won't be a problem (refer to SET CONSTRAINTS documentation). Your migration should look like this:
operations = [
    migrations.RunSQL('SET CONSTRAINTS ALL IMMEDIATE',
                      reverse_sql=migrations.RunSQL.noop),
    # ... the actual migration operations here ...
    migrations.RunSQL(migrations.RunSQL.noop,
                      reverse_sql='SET CONSTRAINTS ALL IMMEDIATE'),
]
The first operation is for applying (forward) migrations, and the last one is for unapplying (backwards) migrations.
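Applied to the migration from the question, the full operations list would look something like this (a sketch; add_default_template and the AlterField are copied from the question):

operations = [
    # Check deferred constraints immediately from here on, so the data
    # update leaves no pending trigger events behind for the ALTER TABLE.
    migrations.RunSQL('SET CONSTRAINTS ALL IMMEDIATE',
                      reverse_sql=migrations.RunSQL.noop),
    migrations.RunPython(add_default_template),
    migrations.AlterField(
        model_name='site',
        name='template',
        field=models.ForeignKey(to='my_app.Template'),
        preserve_default=False,
    ),
    migrations.RunSQL(migrations.RunSQL.noop,
                      reverse_sql='SET CONSTRAINTS ALL IMMEDIATE'),
]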
EDIT: Constraint deferring is useful to avoid insertion sorting, especially for self-referencing tables and tables with cyclic dependencies. So be careful when bending Django.
LATE EDIT: on Django 1.7 and newer there is a special SeparateDatabaseAndState operation that allows combining data changes and structure changes in the same migration. Try this operation before resorting to the "set constraints all immediate" method above. Example:
operations = [
    migrations.SeparateDatabaseAndState(database_operations=[
        # put your SQL, Python, or other data migrations here
    ],
    state_operations=[
        # field/model changes go here
    ]),
]

Yes, I'd say it's the transaction bounds that are preventing the data change in your migration from being committed before the ALTER is run.
I'd do as @danielcorreia says and implement it as two migrations, as it looks like even the SchemaEditor is bound by transactions, via the context manager you'd be obliged to use.
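A sketch of the two-migration approach (file names are hypothetical; the data function and field are the ones from the question):

# 0022_fill_default_template.py - data migration, committed on its own
from django.db import migrations


def add_default_template(apps, schema_editor):
    Template = apps.get_model("my_app", "Template")
    Site = apps.get_model("my_app", "Site")
    accept_reject_template = Template.objects.get(name="Accept/Reject")
    Site.objects.filter(template=None).update(template=accept_reject_template)


class Migration(migrations.Migration):

    dependencies = [
        ('my_app', '0021_auto_20150210_1008'),
    ]

    operations = [
        migrations.RunPython(add_default_template),
    ]

# 0023_template_not_nullable.py - schema migration, runs in a fresh transaction
from django.db import models, migrations


class Migration(migrations.Migration):

    dependencies = [
        ('my_app', '0022_fill_default_template'),
    ]

    operations = [
        migrations.AlterField(
            model_name='site',
            name='template',
            field=models.ForeignKey(to='my_app.Template'),
            preserve_default=False,
        ),
    ]

Since each migration runs in its own transaction, the data change is committed before the ALTER TABLE in the second migration is attempted.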

Adding null=True to the field that is giving you a problem should fix it; in your case, the "template" field. The migration should then look like this:
class Migration(migrations.Migration):

    dependencies = [
        ('my_app', '0021_auto_20150210_1008'),
    ]

    operations = [
        migrations.RunPython(add_default_template),
        migrations.AlterField(
            model_name='site',
            name='template',
            field=models.ForeignKey(to='my_app.Template', null=True),
            preserve_default=False,
        ),
    ]

Related

Django UniqueConstraint not working as expected

At the moment I am using the following check to avoid duplicates: MyModel.objects.filter(other_model=self, ended__isnull=True, started__isnull=False).exists(). This works, but due to unfortunate caching etc. there have been duplicate instances in the past. I want to move this exact check to the database level with Django. I tried different constraints, both a simple unique_together ([['other_model', 'ended']]) constraint and
constraints = [
    models.UniqueConstraint(fields=['other_model', 'ended'], name='unique_name',
                            condition=models.Q(ended__isnull=True))
]
This is my test case, which runs without a problem (it shouldn't):
ts = timezone.now()
MyModel.objects.create(other_model=same_instance, ended=ts, started=ts)
MyModel.objects.create(other_model=same_instance, ended=ts, started=ts)
This is the latest migration (I even reset the entire DB and reapplied all migrations):
migrations.AlterUniqueTogether(
    name='mymodel',
    unique_together={('other_model', 'ended')},
),
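For reference, a UniqueConstraint declared in a model's Meta.constraints normally shows up in the generated migration as an AddConstraint operation rather than AlterUniqueTogether; a sketch of what that would look like (names taken from the question):

migrations.AddConstraint(
    model_name='mymodel',
    constraint=models.UniqueConstraint(
        fields=['other_model', 'ended'],
        name='unique_name',
        condition=models.Q(ended__isnull=True),
    ),
),

Note also that the conditional constraint only guards rows where ended IS NULL, so two rows with the same non-NULL ended (as in the test case above) would not violate it.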

Emptying a table and filling with fixtures

I'm working on a big project (tons of migrations already exist, etc...) and I'm at a point in which I must empty an entire table and then fill it again with ~10 elements. What is the Django approach to this situation? Fixtures? RunPython?
Deleting data from tables (Django)
A default Django-generated migration has nothing to do with populating your table. A default migration changes your database layout, e.g. creates or deletes tables, adds columns, changes parameters of columns, etc. (Read until the end for how to use manual migrations to delete data from a table!)
Deleting data once
What you want to do is delete the entries in a table, not the table itself. Of course, you could remove the model from your models.py and then migrate, which would drop the table, but that might result in unwanted behaviour and errors (e.g. other models with ForeignKeys to this table would probably prevent you from deleting it). You have two options:
Manually connect to the database and run
DELETE FROM your_table;
Use Python to do the job for you. You can open the Django shell by executing python manage.py shell. Then import your model and call .delete() on its queryset. That would look like this:
$ python manage.py shell
# We are in the Django Python shell now...
>>> from app.models import Model_to_empty
>>> Model_to_empty.objects.all().delete()
Deleting data from tables with manual migration files
If you want to do it as a migration, you can write the migration file yourself. To make sure everything is smooth, run
python manage.py makemigrations
python manage.py migrate
first, to migrate any changes that might have been made in between. Now, create your manual migration file like this:
If your last migration was number 0180, name your file something like 0181_manual_deletion_through_migration.py and put it in app/migrations where app is the app that contains the model that needs to be emptied and refilled.
You can use the migrations.RunSQL class in your migration; it executes the SQL statement given as an argument when migrating.
Here is an example migration file taken from one of my projects:
from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ('beer', '0001_initial'),
    ]

    operations = [
        migrations.AddField(
            model_name='beer',
            name='beer_type',
            field=models.CharField(default=0, max_length=30),
            preserve_default=False,
        ),
    ]
Let's break it down:
dependencies = [
    ('beer', '0001_initial'),
]
This points at the previous migration in the chain: 'beer' is the name of the app and '0001_initial' is the previous migration. Set the app to the one containing the model you want to empty, and the migration name to that app's latest migration.
operations = [
    migrations.AddField(
        model_name='beer',
        name='beer_type',
        field=models.CharField(default=0, max_length=30),
        preserve_default=False,
    ),
]
Inside operations goes what needs to be done. In my example it was adding a field, hence migrations.AddField. Remember migrations.RunSQL? We can use it here like this:
operations = [
    migrations.RunSQL("DELETE FROM your_table;"),
    # run SQL statements to populate your table again.
]
where instead of a comment you put the SQL statements that will populate the table with entries you want.
When you finish editing the manual migration file, just execute python manage.py migrate (do NOT run python manage.py makemigrations!).
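Since the question also mentions RunPython: a complete manual migration that empties the table and repopulates it in Python rather than raw SQL could look like this sketch (all names here are hypothetical):

from django.db import migrations

# Hypothetical seed data; replace with your ~10 real elements.
SEED_NAMES = ["alpha", "beta", "gamma"]


def empty_and_refill(apps, schema_editor):
    # Always fetch the historical model via apps.get_model().
    MyModel = apps.get_model('app', 'MyModel')
    MyModel.objects.all().delete()
    MyModel.objects.bulk_create(
        [MyModel(name=name) for name in SEED_NAMES]
    )


class Migration(migrations.Migration):

    dependencies = [
        ('app', '0180_previous_migration'),
    ]

    operations = [
        migrations.RunPython(empty_and_refill,
                             reverse_code=migrations.RunPython.noop),
    ]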

Commit manually in Django data migration

I'd like to write a data migration where I modify all rows in a big table in smaller batches in order to avoid locking issues. However, I can't figure out how to commit manually in a Django migration. Every time I try to commit I get:
TransactionManagementError: This is forbidden when an 'atomic' block is active.
AFAICT, the database schema editor always wraps Postgres migrations in an atomic block.
Is there a sane way to break out of the transaction from within the migration?
My migration looks like this:
from django.db import migrations, transaction


def modify_data(apps, schema_editor):
    counter = 0
    BigData = apps.get_model("app", "BigData")
    for row in BigData.objects.iterator():
        # Modify row [...]
        row.save()
        # Commit every 1000 rows
        counter += 1
        if counter % 1000 == 0:
            transaction.commit()
    transaction.commit()


class Migration(migrations.Migration):

    operations = [
        migrations.RunPython(modify_data),
    ]
I'm using Django 1.7 and Postgres 9.3. This used to work with South and older versions of Django.
The best workaround I found is manually exiting the atomic scope before running the data migration:
def modify_data(apps, schema_editor):
    schema_editor.atomic.__exit__(None, None, None)
    # [...]
In contrast to resetting connection.in_atomic_block manually, this allows using the atomic context manager inside the migration. There doesn't seem to be a much saner way.
One can contain the (admittedly messy) transaction break-out logic in a decorator to be used with the RunPython operation:
from functools import wraps


def non_atomic_migration(func):
    """
    Close a transaction from within code that is marked atomic. This is
    required to break out of a transaction scope that is automatically wrapped
    around each migration by the schema editor. This should only be used when
    committing manually inside a data migration. Note that it doesn't re-enter
    the atomic block afterwards.
    """
    @wraps(func)
    def wrapper(apps, schema_editor):
        if schema_editor.connection.in_atomic_block:
            schema_editor.atomic.__exit__(None, None, None)
        return func(apps, schema_editor)
    return wrapper
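Usage is then just a matter of decorating the data migration function (a sketch reusing modify_data from the question):

@non_atomic_migration
def modify_data(apps, schema_editor):
    # ... batched updates with manual transaction.commit() calls ...
    ...


class Migration(migrations.Migration):

    operations = [
        migrations.RunPython(modify_data),
    ]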
Update
Django 1.10 will support non-atomic migrations.
From the documentation about RunPython:
By default, RunPython will run its contents inside a transaction on databases that do not support DDL transactions (for example, MySQL and Oracle). This should be safe, but may cause a crash if you attempt to use the schema_editor provided on these backends; in this case, pass atomic=False to the RunPython operation.
So, instead of what you've got:
class Migration(migrations.Migration):

    operations = [
        migrations.RunPython(modify_data, atomic=False),
    ]
For others coming across this: you can have both data (RunPython) and schema operations in the same migration. Just make sure all the ALTER TABLE operations go first; you cannot run the RunPython before any ALTER TABLE.
First you need to set Migration.atomic = False
class Migration(migrations.Migration):

    atomic = False
Then, in your function, you can wrap specific blocks of code in transaction.atomic() to make only those blocks atomic:
from django.db import transaction

for row in rows:
    with transaction.atomic():
        do_something(row)
    # Changes made by `do_something` will be committed by this point
Here's the relevant documentation: https://docs.djangoproject.com/en/4.1/howto/writing-migrations/#non-atomic-migrations
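Putting it together for the original question, a batched data migration might look like this sketch (BigData and the batch size come from the question; snapshotting the primary keys first is an assumption, so the iteration does not rely on a server-side cursor surviving across the intermediate commits):

from django.db import migrations, transaction


def modify_data(apps, schema_editor):
    BigData = apps.get_model("app", "BigData")
    # Snapshot the primary keys up front; each batch is then fetched fresh.
    pks = list(BigData.objects.values_list("pk", flat=True))
    for start in range(0, len(pks), 1000):
        with transaction.atomic():
            for row in BigData.objects.filter(pk__in=pks[start:start + 1000]):
                # Modify row [...]
                row.save()
        # The batch is committed here before the next one starts.


class Migration(migrations.Migration):

    atomic = False

    operations = [
        migrations.RunPython(modify_data),
    ]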
Gotcha: migrations.RunPython(forwards_func, atomic=False) does NOT do what you want. It prevents Django from manually putting your migration code inside a transaction, which it doesn't do for PostgreSQL anyway. This atomic=False option is meant for databases that don't support DDL transactions, as stated in the documentation: https://docs.djangoproject.com/en/4.1/ref/migration-operations/#runpython
By default, RunPython will run its contents inside a transaction on databases that do not support DDL transactions (for example, MySQL and Oracle). This should be safe, but may cause a crash if you attempt to use the schema_editor provided on these backends; in this case, pass atomic=False to the RunPython operation.
On databases that do support DDL transactions (SQLite and PostgreSQL), RunPython operations do not have any transactions automatically added besides the transactions created for each migration.

Stale content type prompt deleting all model instances after renaming django model with permissions

I had two models called CombinedProduct and CombinedProductPrice which I renamed to Set and SetPrice respectively. I did this by changing their model names in models.py and replacing all occurrences. This also included renaming a ForeignKey field in another model from combined_product to set (pointing to a CombinedProduct).
When running makemigrations, Django properly detected the renaming and asked if I had renamed all three of those things, and I answered 'yes' to all. However, when running migrate, after applying some operations, I get asked:
The following content types are stale and need to be deleted:
product | combinedproduct
product | combinedproductprice
Any objects related to these content types by a foreign key will also
be deleted. Are you sure you want to delete these content types?
If you're unsure, answer 'no'.
I backed up my data and entered 'yes' which deleted all instances of Set (previously CombinedProduct) and SetPrice (previously CombinedProductPrice). If I roll back and tick no, then this question comes up every time I migrate.
This is weird since I don't use the Django ContentType framework anywhere. When inspecting which models point to ContentType, however, I see that auth.permission points to it, and I use permissions for those models. So maybe the deletion cascades from old permissions pointing to the old model names, which in turn would delete my instances? If that is the case, how can I prevent this situation?
This is the migration that was generated:
operations = [
    migrations.RenameModel(
        old_name='CombinedProduct',
        new_name='Set',
    ),
    migrations.RenameModel(
        old_name='CombinedProductPrice',
        new_name='SetPrice',
    ),
    migrations.AlterModelOptions(
        name='setprice',
        options={'ordering': ('set', 'vendor', 'price'), 'verbose_name': 'Set price', 'verbose_name_plural': 'Set prices'},
    ),
    migrations.RenameField(
        model_name='setprice',
        old_name='combined_product',
        new_name='set',
    ),
]
If you want to rename your table, take a look at RenameModel. Django does not always detect the renamed model, so you may need to add this operation manually.
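A common way to avoid the stale-content-type prompt entirely (a workaround, not something the answer above spells out; app and model names come from the question) is to rename the ContentType rows in a data migration, so permissions keep pointing at the renamed models instead of being cascade-deleted:

from django.db import migrations


def rename_content_types(apps, schema_editor):
    # Point the existing content types at the new model names.
    ContentType = apps.get_model('contenttypes', 'ContentType')
    ContentType.objects.filter(app_label='product',
                               model='combinedproduct').update(model='set')
    ContentType.objects.filter(app_label='product',
                               model='combinedproductprice').update(model='setprice')


class Migration(migrations.Migration):

    dependencies = [
        ('contenttypes', '0002_remove_content_type_name'),
        # ... plus the rename migration shown above ...
    ]

    operations = [
        migrations.RunPython(rename_content_types,
                             reverse_code=migrations.RunPython.noop),
    ]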

Django South - schema and data migration at the same time

Isn't it possible to do something like the following with South in a schemamigration?
def forwards(self, orm):
    ## CREATION
    # Adding model 'Added'
    db.create_table(u'something_added', (
        (u'id', self.gf('django.db.models.fields.AutoField')(primary_key=True)),
        ('foo', self.gf('django.db.models.fields.related.ForeignKey')(to=orm['something.Foo'])),
        ('bar', self.gf('django.db.models.fields.related.ForeignKey')(to=orm['something.Bar'])),
    ))
    db.send_create_signal(u'something', ['Added'])

    ## DATA
    # Create Added for every Foo
    for f in orm.Foo.objects.all():
        self.prev_orm.Added.objects.create(foo=f, bar=f.bar)

    ## DELETION
    # Deleting field 'Foo.bar'
    db.delete_column(u'something_foo', 'bar_id')
Note the prev_orm, which would let me access f.bar and do it all in one migration. Having to write three migrations for this feels pretty heavy...
I know this is not the "way to do it", but to my mind this would honestly be much cleaner.
Would there be a real problem with doing it this way?
I guess your objective is to ensure that deletion does not run before the data-migration. For this you can use the dependency system in South.
You can break the above into three parts:
001_app1_addition_migration (in app 1)
then
001_app2_data_migration (in app 2, where the Foo model belongs)
and then
002_app1_deletion_migration (in app 1) with something like the following:
class Migration:

    depends_on = (
        ("app2", "001_app2_data_migration"),
    )

    def forwards(self):
        ## DELETION
        # Deleting field 'Foo.bar'
        db.delete_column(u'something_foo', 'bar_id')
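For completeness, the middle data migration (001_app2_data_migration) might look something like this sketch (it assumes South's DataMigration base class and the 'something' app/model names from the question):

# 001_app2_data_migration.py
from south.v2 import DataMigration


class Migration(DataMigration):

    depends_on = (
        ("app1", "001_app1_addition_migration"),
    )

    def forwards(self, orm):
        # Foo.bar still exists here, because the deletion migration
        # depends on this one and therefore runs later.
        for f in orm['something.Foo'].objects.all():
            orm['something.Added'].objects.create(foo=f, bar=f.bar)

    def backwards(self, orm):
        orm['something.Added'].objects.all().delete()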
First of all, the orm provided by South is the one that you are migrating to. In other words, it matches the schema after the migration is complete. So you can just write orm.Added instead of self.prev_orm.Added. The other implication of this fact is that you cannot reference foo.bar since it is not present in the final schema.
The way to get around that (and to answer your question) is to skip the ORM and just execute raw SQL directly.
In your case, the create statement that accesses the deleted row would look something like:
from django.db import connection

cursor = connection.cursor()
cursor.execute('SELECT "id", "bar_id" FROM "something_foo"')
for foo_id, bar_id in cursor.fetchall():
    orm.Added.objects.create(foo_id=foo_id, bar_id=bar_id)
South migrations use transaction management.
When doing several migrations at once, the code is similar to:
for migration in migrations:
    south.db.db.start_transaction()
    try:
        migration.forwards(migration.orm)
        south.db.db.commit_transaction()
    except:
        south.db.db.rollback_transaction()
        raise
so... while it is not recommended to mix schema and data migrations, once you commit the schema with db.commit_transaction() the tables should be available for you to use. Be mindful to provide a backwards() method that performs the correct steps in reverse.
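Under that assumption, a combined forwards() could commit the schema part before the data part, using South's transaction helpers (a sketch stitched together from the snippets above; treat it as illustrative, not recommended practice):

from django.db import connection
from south.db import db


def forwards(self, orm):
    ## CREATION - schema first
    db.create_table(u'something_added', (
        (u'id', self.gf('django.db.models.fields.AutoField')(primary_key=True)),
        ('foo', self.gf('django.db.models.fields.related.ForeignKey')(to=orm['something.Foo'])),
        ('bar', self.gf('django.db.models.fields.related.ForeignKey')(to=orm['something.Bar'])),
    ))
    db.send_create_signal(u'something', ['Added'])

    # Make the new table visible, then reopen a transaction for the rest.
    db.commit_transaction()
    db.start_transaction()

    ## DATA - raw SQL, since the final orm no longer knows about Foo.bar
    cursor = connection.cursor()
    cursor.execute('SELECT "id", "bar_id" FROM "something_foo"')
    for foo_id, bar_id in cursor.fetchall():
        orm['something.Added'].objects.create(foo_id=foo_id, bar_id=bar_id)

    ## DELETION - finally drop the old column
    db.delete_column(u'something_foo', 'bar_id')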