Django South - schema and data migration at the same time - django

Isn't it possible to do something like the following with South in a schemamigration?
def forwards(self, orm):
## CREATION
# Adding model 'Added'
db.create_table(u'something_added', (
(u'id', self.gf('django.db.models.fields.AutoField')(primary_key=True)),
('foo', self.gf('django.db.models.fields.related.ForeignKey')(to=orm['something.Foo'])),
('bar', self.gf('django.db.models.fields.related.ForeignKey')(to=orm['something.Bar'])),
))
db.send_create_signal(u'something', ['Added'])
## DATA
# Create Added for every Foo
for f in orm.Foo.objects.all():
self.prev_orm.Added.objects.create(foo=f, bar=f.bar)
## DELETION
# Deleting field 'Foo.bar'
db.delete_column(u'something_foo', 'bar_id')
See the prev_orm that would allow me to access to f.bar, and do all in one. I find that having to write 3 migrations for that is pretty heavy...
I know this is not the "way to do" but to my mind this would be honestly much cleaner.
Would there be a real problem to do so btw?

I guess your objective is to ensure that deletion does not run before the data-migration. For this you can use the dependency system in South.
You can break the above into three parts:
001_app1_addition_migration (in app 1)
then
001_app2_data_migration (in app 2, where the Foo model belongs)
and then
002_app1_deletion_migration (in app 1) with something like following:
class Migration:
depends_on = (
("app2", "001_app2_data_migration"),
)
def forwards(self):
## DELETION
# Deleting field 'Foo.bar'
db.delete_column(u'something_foo', 'bar_id')

First of all, the orm provided by South is the one that you are migrating to. In other words, it matches the schema after the migration is complete. So you can just write orm.Added instead of self.prev_orm.Added. The other implication of this fact is that you cannot reference foo.bar since it is not present in the final schema.
The way to get around that (and to answer your question) is to skip the ORM and just execute raw SQL directly.
In your case, the create statement that accesses the deleted row would look something like:
cursor.execute('SELECT "id", "bar_id" FROM "something_foo"')
for foo_id, bar_id in cursor.fetchall()
orm.Added.ojbects.create(foo_id=foo_id, bar_id=bar_id)

South migrations are using transaction management.
When doing several migrations at once, the code is similar to:
for migration in migrations:
south.db.db.start_transaction()
try:
migration.forwards(migration.orm)
south.db.db.commit_transaction()
except:
south.db.db.rollback_transaction()
raise
so... while it is not recommended to mix schema and data migrations, once you commit the schema with db.commit_transaction() the tables should be available for you to use. Be mindful to provide a backwards() method that does that correct steps backwards.

Related

Create DB Constraint via Django

I have a Django model which looks like this:
class Dummy(models.Model):
...
system = models.CharField(max_length=16)
I want system never to be empty or to contain whitespace.
I know how to use validators in Django.
But I would enforce this at database level.
What is the easiest and django-like way to create a DB constraint for this?
I use PostgreSQL and don't need to support any other database.
2019 Update
Django 2.2 added support for database-level constrains. The new CheckConstraint and UniqueConstraint classes enable adding custom database constraints. Constraints are added to models using the Meta.constraints option.
Your system validation would look like something like this:
from django.db import models
from django.db.models.constraints import CheckConstraint
from django.db.models.query_utils import Q
class Dummy(models.Model):
...
system = models.CharField(max_length=16)
class Meta:
constraints = [
CheckConstraint(
check=~Q(system="") & ~Q(system__contains=" "),
name="system_not_blank")
]
First issue: creating a database constraint through Django
A)
It seems that django does not have this ability build in yet. There is a 9-year-old open ticket for it, but I wouldn't hold my breath for something that has been going on this long.
Edit: As of release 2.2 (april 2019), Django supports database-level check constraints.
B) You could look into the package django-db-constraints, through which you can define constraints in the model Meta. I did not test this package, so I don't know how useful it really is.
# example using this package
class Meta:
db_constraints = {
'price_above_zero': 'check (price > 0)',
}
Second issue: field system should never be empty nor contain whitespaces
Now we would need to build the check constraint in postgres syntax to accomplish that. I came up with these options:
Check if the length of system is different after removing whitespaces. Using ideas from this answer you could try:
/* this check should only pass if `system` contains no
* whitespaces (`\s` also detects new lines)
*/
check ( length(system) = length(regexp_replace(system, '\s', '', 'g')) )
Check if the whitespace count is 0. For this you could us regexp_matches:
/* this check should only pass if `system` contains no
* whitespaces (`\s` also detects new lines)
*/
check ( length(regexp_matches(system, '\s', 'g')) = 0 )
Note that the length function can't be used with regexp_matches because the latter returns a set of text[] (set of arrays), but I could not find the proper function to count the elements of that set right now.
Finally, bringing both of the previous issues together, your approach could look like this:
class Dummy(models.Model):
# this already sets NOT NULL to the field in the database
system = models.CharField(max_length=16)
class Meta:
db_constraints = {
'system_no_spaces': 'check ( length(system) > 0 AND length(system) = length(regexp_replace(system, "\s", "", "g")) )',
}
This checks that the fields value:
does not contain NULL (CharField adds NOT NULL constraint by default)
is not empty (first part of the check: length(system) > 0)
has no whitespaces (second part of the check: same length after replacing whitespace)
Let me know how that works out for you, or if there are problems or drawbacks to this approach.
You can add CHECK constraint via custom django migration. To check string length you can use char_length function and position to check for containing whitespaces.
Quote from postgres docs (https://www.postgresql.org/docs/current/static/ddl-constraints.html):
A check constraint is the most generic constraint type. It allows you
to specify that the value in a certain column must satisfy a Boolean
(truth-value) expression.
To run arbitrary sql in migaration RunSQL operation can be used (https://docs.djangoproject.com/en/2.0/ref/migration-operations/#runsql):
Allows running of arbitrary SQL on the database - useful for more
advanced features of database backends that Django doesn’t support
directly, like partial indexes.
Create empty migration:
python manage.py makemigrations --empty yourappname
Add sql to create constraint:
# Generated by Django A.B on YYYY-MM-DD HH:MM
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
('yourappname', '0001_initial'),
]
operations = [
migrations.RunSQL('ALTER TABLE appname_dummy ADD CONSTRAINT syslen '
'CHECK (char_length(trim(system)) > 1);',
'ALTER TABLE appname_dummy DROP CONSTRAINT syslen;'),
migrations.RunSQL('ALTER TABLE appname_dummy ADD CONSTRAINT syswh '
'CHECK (position(' ' in trim(system)) = 0);',
'ALTER TABLE appname_dummy DROP CONSTRAINT syswh;')
]
Run migration:
python manage.py migrate yourappname
I modify my answer to reach out your requirements.
So, if you would like to run a DB constraint try this one :
import psycopg2
def your_validator():
conn = psycopg2.connect("dbname=YOURDB user=YOURUSER")
cursor = conn.cursor()
query_result = cursor.execute("YOUR QUERY")
if query_result is Null:
# Do stuff
else:
# Other Stuff
Then use the pre_save signal.
In your models.py file add,
from django.db.models.signals import pre_save
class Dummy(models.Model):
...
#staticmethod
def pre_save(sender, instance, *args, **kwargs)
# Of course, feel free to parse args in your def.
your_validator()

Commit manually in Django data migration

I'd like to write a data migration where I modify all rows in a big table in smaller batches in order to avoid locking issues. However, I can't figure out how to commit manually in a Django migration. Everytime I try to run commit I get:
TransactionManagementError: This is forbidden when an 'atomic' block is active.
AFAICT, the database schema editor always wraps Postgres migrations in an atomic block.
Is there a sane way to break out of the transaction from within the migration?
My migration looks like this:
def modify_data(apps, schema_editor):
counter = 0
BigData = apps.get_model("app", "BigData")
for row in BigData.objects.iterator():
# Modify row [...]
row.save()
# Commit every 1000 rows
counter += 1
if counter % 1000 == 0:
transaction.commit()
transaction.commit()
class Migration(migrations.Migration):
operations = [
migrations.RunPython(modify_data),
]
I'm using Django 1.7 and Postgres 9.3. This used to work with South and older versions of Django.
The best workaround I found is manually exiting the atomic scope before running the data migration:
def modify_data(apps, schema_editor):
schema_editor.atomic.__exit__(None, None, None)
# [...]
In contrast to resetting connection.in_atomic_block manually this allows using atomic context manager inside the migration. There doesn't seem to be a much saner way.
One can contain the (admittedly messy) transaction break out logic in a decorator to be used with the RunPython operation:
def non_atomic_migration(func):
"""
Close a transaction from within code that is marked atomic. This is
required to break out of a transaction scope that is automatically wrapped
around each migration by the schema editor. This should only be used when
committing manually inside a data migration. Note that it doesn't re-enter
the atomic block afterwards.
"""
#wraps(func)
def wrapper(apps, schema_editor):
if schema_editor.connection.in_atomic_block:
schema_editor.atomic.__exit__(None, None, None)
return func(apps, schema_editor)
return wrapper
Update
Django 1.10 will support non-atomic migrations.
From the documentation about RunPython:
By default, RunPython will run its contents inside a transaction on databases that do not support DDL transactions (for example, MySQL and Oracle). This should be safe, but may cause a crash if you attempt to use the schema_editor provided on these backends; in this case, pass atomic=False to the RunPython operation.
So, instead of what you've got:
class Migration(migrations.Migration):
operations = [
migrations.RunPython(modify_data, atomic=False),
]
For others coming across this. You can have both data (RunPython), in the same migration. Just make sure all the alter tables goes first. You cannot do the RunPython before any ALTER TABLE.
First you need to set Migration.atomic = False
class Migration(migrations.Migration):
atomic = False
Then in your function you can wrap certain block of code inside of transaction.atomic() to make only that block atomic
from django.db import transaction
for row in rows:
with transaction.atomic():
do_something(row)
# Changes made by `do_something` will be committed by this point
Here's the relevant documentation: https://docs.djangoproject.com/en/4.1/howto/writing-migrations/#non-atomic-migrations
Gotcha: migrations.RunPython(forwards_func, atomic=False) does NOT do what you want. It prevents django from manually putting your migration code inside a transaction, which it doesn't do for Postgresql anyway. This atomic=False option is meant for DBs that don't support DDL transaction, as stated in their documentation: https://docs.djangoproject.com/en/4.1/ref/migration-operations/#runpython
By default, RunPython will run its contents inside a transaction on databases that do not support DDL transactions (for example, MySQL and Oracle). This should be safe, but may cause a crash if you attempt to use the schema_editor provided on these backends; in this case, pass atomic=False to the RunPython operation.
On databases that do support DDL transactions (SQLite and PostgreSQL), RunPython operations do not have any transactions automatically added besides the transactions created for each migration.

Django migrations using RunPython to commit changes

I want to alter a foreign key in one of my models that can currently have NULL values to not be nullable.
I removed the null=True from my field and ran makemigrations
Because I'm an altering a table that already has rows which contain NULL values in that field I am asked to provide a one-off value right away or edit the migration file and add a RunPython operation.
My RunPython operation is listed BEFORE the AlterField operation and does the required update for this field so it doesn't contain NULL values (only rows who already contain a NULL value).
But, the migration still fails with this error:
django.db.utils.OperationalError: cannot ALTER TABLE "my_app_site" because it has pending trigger events
Here's my code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
from django.db import models, migrations
def add_default_template(apps, schema_editor):
Template = apps.get_model("my_app", "Template")
Site = apps.get_model("my_app", "Site")
accept_reject_template = Template.objects.get(name="Accept/Reject")
Site.objects.filter(template=None).update(template=accept_reject_template)
class Migration(migrations.Migration):
dependencies = [
('my_app', '0021_auto_20150210_1008'),
]
operations = [
migrations.RunPython(add_default_template),
migrations.AlterField(
model_name='site',
name='template',
field=models.ForeignKey(to='my_app.Template'),
preserve_default=False,
),
]
If I understand correctly this error may occur when a field is altered to be not-nullable but the field contains null values.
In that case, the only reason I can think of why this happens is because the RunPython operation transaction didn't "commit" the changes in the database before running the AlterField.
If this is indeed the reason - how can I make sure the changes reflect in the database?
If not - what can be the reason for the error?
Thanks!
This happens because Django creates constraints as DEFERRABLE INITIALLY DEFERRED:
ALTER TABLE my_app_site
ADD CONSTRAINT "[constraint_name]"
FOREIGN KEY (template_id)
REFERENCES my_app_template(id)
DEFERRABLE INITIALLY DEFERRED;
This tells PostgreSQL that the foreign key does not need to be checked right after every command, but can be deferred until the end of transactions.
So, when a transaction modifies content and structure, the constraints are checked on parallel with the structure changes, or the checks are scheduled to be done after altering the structure. Both of these states are bad and the database will abort the transaction instead of making any assumptions.
You can instruct PostgreSQL to check constraints immediately in the current transaction by calling SET CONSTRAINTS ALL IMMEDIATE, so structure changes won't be a problem (refer to SET CONSTRAINTS documentation). Your migration should look like this:
operations = [
migrations.RunSQL('SET CONSTRAINTS ALL IMMEDIATE',
reverse_sql=migrations.RunSQL.noop),
# ... the actual migration operations here ...
migrations.RunSQL(migrations.RunSQL.noop,
reverse_sql='SET CONSTRAINTS ALL IMMEDIATE'),
]
The first operation is for applying (forward) migrations, and the last one is for unapplying (backwards) migrations.
EDIT: Constraint deferring is useful to avoid insertion sorting, specially for self-referencing tables and tables with cyclic dependencies. So be careful when bending Django.
LATE EDIT: on Django 1.7 and newer versions there is a special SeparateDatabaseAndState operation that allows data changes and structure changes on the same migration. Try using this operation before resorting to the "set constraints all immediate" method above. Example:
operations = [
migrations.SeparateDatabaseAndState(database_operations=[
# put your sql, python, whatever data migrations here
],
state_operations=[
# field/model changes goes here
]),
]
Yes, I'd say it's the transaction bounds which are preventing the data change in your migration being committed before the ALTER is run.
I'd do as #danielcorreia says and implement it as two migrations, as it looks like the even the SchemaEditor is bound by transactions, via the the context manager you'd be obliged to use.
Adding null to the field giving you a problem should fix it. In your case the "template" field. Just add null=True to the field. The migrations should than look like this:
class Migration(migrations.Migration):
dependencies = [
('my_app', '0021_auto_20150210_1008'),
]
operations = [
migrations.RunPython(add_default_template),
migrations.AlterField(
model_name='site',
name='template',
field=models.ForeignKey(to='my_app.Template', null=True),
preserve_default=False,
),
]

Django queryset filter on boolean not working

I have a model:
class mymodel(models.Model):
order_closed = models.BooleanField(default=False)
I added this field to my development mysqllite db manually since its a new field for a model/table that already existed. I then tried:
mymodel.objects.filter(order_closed=False) #and with True
and its producing incorrect or unpredictable results. I saw some mention that is could be a sqllite thing but I'm not sure? The templates seem to understand whether its a true or false value but python code doesn't. To clarify with some examples:
{{mymodel.order_closed}} will print 0 after I set the default to 0 in sqllite. but using .filter(order_closed=value) will still return every record.
I think you make some mistake in wrirting SQL . If you have db which have some important informations use
south
http://south.aeracode.org/
When you will have it you can easly upgrate/edit your database .
If you dont wanna install new 'plugins' try that .
1. Delete this field from DB manualy .
2. write : python manage.py sql 'name of app'
It will return the CREATE TABLE SQL statements for the app.
Then you can upr your database manualy with some of CReate command.

Before syncdb, delete field from standard Django model

This is a follow-up question on Delete field from standard Django model. In short: a field can be dynamically deleted from a model that is already created, in this case the field User.email . So field email would be deleted from User without changing the code for User. See code below for example.
I can dynamically delete a a field from a model(1), but that happens when the server starts and is undone when it exists. Since syncdb doesn't require the server to be running, and generally seems to ignore the deletion code (somehow), this approach doesn't prevent the field from appearing in the database(2).
Is there a way to do delete the field from the model (without changing the file it's in, as it's a Django model), in a way that also makes it not appear in the database?
Thanks in advance!
Mark
EDIT: I problem is not that I am deleting the text "m = models.IntegerField()" from the model file and want the field removed from the database. The problem is that I am using the code below to remove a field from a model that has already been declared in another file. I do not think that creating a migration with South for every time I run syncdb is a solution(3).
Additional information:
1) Currently, the code is in models.py, but I suppose Where to put Django startup code? works.
2) I can delete it on post_syncdb signal with custom query, but I hope for something more elegant... Or elegant at all, more accurately.
3) If it even works at all, because obviously syncdb is still seeing the 'removed' field), so I think South would to as it's still somehow there.
This is the code (models.py):
class A(models.Model):
m = models.IntegerField()
for i, f in enumerate(A._meta.fields):
if f.name == 'm':
del A._meta.fields[i]
break
class B(A):
n = models.IntegerField()
for i, f in enumerate(B._meta.fields):
if f.name == 'm':
del B._meta.fields[i]
break
EDIT: I checked (with print) and the deletion code is executed on syncdb. It is executed before tables are created
django does a lot of meta class magic and i would guess that the meta class is responsible for defining the database table to back your model. Subsequently just deleting the field is not enough to alter the generated table.
as several people have pointed out, south is the way to deal with these problems.