Adding a non-null ForeignKey field in Django+South

I use Django and South for my database. Now I want to add a new Model and a field in an existing model, referencing the new model. For example:
class NewModel(models.Model):
    # a new model
    # ...

class ExistingModel(models.Model):
    # ... existing fields
    new_field = models.ForeignKey(NewModel)  # adding this now
Now South obviously complains that I added a non-null field and asks me to enter a one-off value. But what I really want is to create a new NewModel instance for every existing ExistingModel instance, thus fulfilling the database requirements. Is that possible somehow?

The easiest way to do this is to write a schema migration that makes the column change, and then write a data migration to correctly fill in the value. Depending on the database you're using, you'll have to do this in slightly different ways.
SQLite
For SQLite, you can add a sentinel value for the relation and use a data migration to fill it in without any issue:
0001_schema_migration_add_foreign_key_to_new_model_from_existing_model.py
from south.db import db
from south.v2 import SchemaMigration
from django.db import models

class Migration(SchemaMigration):

    def forwards(self, orm):
        db.add_column('existing_model_table', 'new_model',
            self.gf('django.db.models.fields.related.ForeignKey')(default=0, to=orm['appname.new_model']),
            keep_default=False)
0002_data_migration_for_new_model.py:
from south.v2 import DataMigration

class Migration(DataMigration):

    def forwards(self, orm):
        for m in orm['appname.existing_model'].objects.all():
            m.new_model = ...  # custom criteria here
            m.save()
This will work just fine, with no issues.
Postgres and MySQL
With MySQL, you have to give the column a valid default. If 0 isn't actually a valid ForeignKey, you'll get errors telling you so.
You could default to 1, but there are instances where that isn't a valid foreign key either (this happened to me because we have different environments, and some environments publish to other databases, so the IDs rarely match up; we use UUIDs for cross-database identification, as God intended).
The second issue is that South and MySQL don't play well together, partly because MySQL doesn't have the concept of DDL transactions.
To get around the issues you will inevitably face (including the error mentioned above, and South asking you to mark ORM lookups in a SchemaMigration as no-dry-run), you need to change the 0001 script above to do the following:
0001_schema_migration_add_foreign_key_to_new_model_from_existing_model.py
from south.db import db
from south.v2 import SchemaMigration
from django.db import models

class Migration(SchemaMigration):

    def forwards(self, orm):
        default_id = 0
        if not db.dry_run:
            # use the first existing NewModel instance (if any) as the default
            new_models = orm['appname.new_model'].objects.all()[:1]
            if new_models:
                default_id = new_models[0].id
        db.add_column('existing_model_table', 'new_model',
            self.gf('django.db.models.fields.related.ForeignKey')(default=default_id, to=orm['appname.new_model']),
            keep_default=False)
And then you can run the 0002_data_migration_for_new_model.py file as normal.
I advise using this second example for both Postgres and MySQL. I don't remember any issues offhand with Postgres and the first example, but I'm certain the second example works for both (tested).

You want a data migration to supplement your schema migration in this scenario.
South has a nice step-by-step tutorial on how to achieve this in the docs.
It's not uncommon in South to spread the desired outcome over two or three schema/data migrations, as it's not always possible to do it in one big hit (it sometimes depends on whether the underlying database will tolerate adding a non-null column with no default). So in this case you might add a schema migration that has a default, then a data migration with your object manipulation, then a final schema migration, as sketched below.
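For illustration, a rough sketch of what that final schema migration might look like, assuming the column was first added as nullable and then populated by the data migration; the table and column names here (existing_model_table, new_field_id) and the app label appname are placeholders, not taken from the question:

# 0003_make_new_field_non_null.py -- hypothetical final step
from south.db import db
from south.v2 import SchemaMigration

class Migration(SchemaMigration):

    def forwards(self, orm):
        # Every row now has a real value (set by the data migration),
        # so the column can be tightened to NOT NULL.
        db.alter_column('existing_model_table', 'new_field_id',
            self.gf('django.db.models.fields.related.ForeignKey')(
                to=orm['appname.new_model'], null=False))

    def backwards(self, orm):
        # Relax the constraint again so earlier migrations can be re-run.
        db.alter_column('existing_model_table', 'new_field_id',
            self.gf('django.db.models.fields.related.ForeignKey')(
                to=orm['appname.new_model'], null=True))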

Related

how to add db_index=True to email field of django auth_user

Suddenly, my page has so many users in the db that a filter on email over the auth_user table is almost failing because of the extremely large number of users.
Since the table comes built-in, I need to add db_index=True to columns in this table. Any idea how to do this?
One quick and easy way would be to manually add the index using RunSQL in a migration.
operations = [
    migrations.RunSQL("CREATE INDEX..."),
]
It's not very elegant. For one thing, the migration will be for a different app (since you don't control the auth migrations). For another, the schema will technically be out of sync with the database. However, I don't think there are any negative consequences in this case, since Django doesn't do anything with db_index other than create the index.
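As a rough sketch of such a migration (the index name auth_user_email_idx, the app label myapp and its dependency are placeholders, not taken from the question; the DROP INDEX form shown is Postgres/SQLite style, MySQL needs DROP INDEX ... ON auth_user):

# myapp/migrations/0002_auth_user_email_index.py -- hypothetical example
from django.db import migrations

class Migration(migrations.Migration):

    dependencies = [
        ('myapp', '0001_initial'),
    ]

    operations = [
        migrations.RunSQL(
            "CREATE INDEX auth_user_email_idx ON auth_user (email);",
            reverse_sql="DROP INDEX auth_user_email_idx;",
        ),
    ]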
One possibility is to substitute the user model with a custom one, which will have proper indices and any other fields that you require. There is extensive documentation in the Django docs: Substituting a custom User model on how to achieve this. This is how I did it in a particular case with a similar issue.
Another possibility is to extend the user model, which could have a particular field repeated from the original model, on which there is an index. Disclaimer: I am genuinely against that for obvious reasons, but I have seen this happening, as this approach is easier to code than the first. This would be very bad though if there are many fields.
This is a good question imo. I would love to know if there is another possibility which I miss.
I had the same problem, but with an additional twist: I already had an index created by South in my table. So if I added a RunSQL("CREATE INDEX ...") to my migration, it would add a second index to my database. But at the same time, if I don't include some create-index action in the migrations, then I won't be able to properly spin up new databases.
Here's my solution: some Python code that checks for the existence of an index, using some of the private-ish methods on schema_editor.
project/appname/migrations/0002.py:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.db import models, migrations
from django.db.migrations import RunPython


def forward__auth_user__email__index(apps, schema_editor):
    auth_user = apps.get_model("auth", "User")
    target_column_to_index = 'email'
    se = schema_editor
    # if there aren't any indexes already, create one.
    index_names = se._constraint_names(auth_user, [target_column_to_index], index=True)
    if len(index_names) == 0:
        se.execute(se._create_index_sql(auth_user, [auth_user._meta.get_field('email')]))


def reverse__auth_user__email__index(apps, schema_editor):
    auth_user = apps.get_model("auth", "User")
    target_column_to_index = 'email'
    se = schema_editor
    # delete any indexes for this table / column.
    index_names = se._constraint_names(auth_user, [target_column_to_index], index=True)
    for index_name in index_names:
        se.execute(se._delete_constraint_sql(se.sql_delete_index, auth_user, index_name))


class Migration(migrations.Migration):

    dependencies = [
        ('frontend', '0001_initial'),
    ]

    operations = [
        RunPython(forward__auth_user__email__index, reverse_code=reverse__auth_user__email__index),
    ]

Migrating existing auth.User data to new Django 1.5 custom user model?

I'd prefer not to destroy all the users on my site. But I want to take advantage of Django 1.5's custom pluggable user model. Here's my new user model:
class SiteUser(AbstractUser):
    site = models.ForeignKey(Site, null=True)
Everything works with my new model on a new install (I've got other code, along with a good reason for doing this--all of which are irrelevant here). But if I put this on my live site and syncdb & migrate, I'll lose all my users or at least they'll be in a different, orphaned table than the new table created for my new model.
I'm familiar with South, but based on this post and some trials on my part, it seems its data migrations are not currently a fit for this specific migration. So I'm looking for some way to either make South work for this or for some non-South migration (raw SQL, dumpdata/loaddata, or otherwise) that I can run on each of my servers (Postgres 9.2) to migrate the users once the new table has been created while the old auth.User table is still in the database.
South is more than able to do this migration for you, but you need to be smart and do it in stages. Here's the step-by-step guide (this guide presupposes you subclass AbstractUser, not AbstractBaseUser):
Before making the switch, make sure that South support is enabled in the application that contains your custom user model (for the sake of the guide, we'll call it accounts and the model User).
At this point you should not yet have a custom user model.
$ ./manage.py schemamigration accounts --initial
Creating migrations directory at 'accounts/migrations'...
Creating __init__.py in 'accounts/migrations'...
Created 0001_initial.py.
$ ./manage.py migrate accounts [--fake if you've already syncdb'd this app]
Running migrations for accounts:
- Migrating forwards to 0001_initial.
> accounts:0001_initial
- Loading initial data for accounts.
Create a new, blank user migration in the accounts app.
$ ./manage.py schemamigration accounts --empty switch_to_custom_user
Created 0002_switch_to_custom_user.py.
Create your custom User model in the accounts app, but make sure it is defined as:
class SiteUser(AbstractUser): pass
Fill in the blank migration with the following code.
# encoding: utf-8
from south.db import db
from south.v2 import SchemaMigration

class Migration(SchemaMigration):

    def forwards(self, orm):
        # Fill in the destination name with the table name of your model
        db.rename_table('auth_user', 'accounts_user')
        db.rename_table('auth_user_groups', 'accounts_user_groups')
        db.rename_table('auth_user_user_permissions', 'accounts_user_user_permissions')

    def backwards(self, orm):
        db.rename_table('accounts_user', 'auth_user')
        db.rename_table('accounts_user_groups', 'auth_user_groups')
        db.rename_table('accounts_user_user_permissions', 'auth_user_user_permissions')

    models = { ....... }  # Leave this alone
Run the migration
$ ./manage.py migrate accounts
- Migrating forwards to 0002_switch_to_custom_user.
> accounts:0002_switch_to_custom_user
- Loading initial data for accounts.
Make any changes to your user model now.
# settings.py
AUTH_USER_MODEL = 'accounts.SiteUser'

# accounts/models.py
class SiteUser(AbstractUser):
    site = models.ForeignKey(Site, null=True)
create and run migrations for this change
$ ./manage.py schemamigration accounts --auto
+ Added field site on accounts.User
Created 0003_auto__add_field_user_site.py.
$ ./manage.py migrate accounts
- Migrating forwards to 0003_auto__add_field_user_site.
> accounts:0003_auto__add_field_user_site
- Loading initial data for accounts.
Honestly, if you already have good knowledge of your setup and already use South, it should be as simple as adding the following migration to your accounts module.
# encoding: utf-8
from south.db import db
from south.v2 import SchemaMigration
from django.db import models

class Migration(SchemaMigration):

    def forwards(self, orm):
        # Fill in the destination name with the table name of your model
        db.rename_table('auth_user', 'accounts_user')
        db.rename_table('auth_user_groups', 'accounts_user_groups')
        db.rename_table('auth_user_user_permissions', 'accounts_user_user_permissions')
        # == YOUR CUSTOM COLUMNS ==
        db.add_column('accounts_user', 'site_id',
            models.ForeignKey(orm['sites.Site'], null=True, blank=False))

    def backwards(self, orm):
        db.rename_table('accounts_user', 'auth_user')
        db.rename_table('accounts_user_groups', 'auth_user_groups')
        db.rename_table('accounts_user_user_permissions', 'auth_user_user_permissions')
        # == YOUR CUSTOM COLUMNS ==
        db.remove_column('accounts_user', 'site_id')

    models = { ....... }  # Leave this alone
EDIT 2/5/13: added rename for the auth_user_groups table. FKs will auto-update to point at the correct table due to db constraints, but M2M fields' table names are generated from the names of the two end tables and will need manual updating in this manner.
EDIT 2: Thanks to @Tuttle & @pix0r for the corrections.
My incredibly lazy way of doing this:
Create a new model (User), extending AbstractUser. In the new model's Meta, override db_table and set it to 'auth_user' (see the sketch below).
Create an initial migration using South.
Migrate, but fake the migration, using --fake when running migrate.
Add new fields, create migration, run it normally.
This is beyond lazy, but works. You now have a 1.5 compliant User model, which just uses the old table of users. You also have a proper migration history.
You can fix this later on with manual migrations to rename the table.
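A minimal sketch of what that model might look like (the class and app names are just placeholders; the key part is the db_table override):

# accounts/models.py -- hypothetical sketch of the "lazy" approach
from django.contrib.auth.models import AbstractUser
from django.db import models

class User(AbstractUser):
    # New fields go here once the initial migration has been faked,
    # e.g. site = models.ForeignKey('sites.Site', null=True)

    class Meta:
        db_table = 'auth_user'  # keep using the existing user table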
I think you've correctly identified that a migration framework like South is the right way to go here. Assuming you're using South, you should be able to use the Data Migrations functionality to port the old users to your new model.
Specifically, I would add a forwards method to copy all rows in your user table to the new table. Something along the lines of:
def forwards(self, orm):
    for user in orm.User.objects.all():
        new_user = SiteUser(<initialize your properties here>)
        new_user.save()
You could also use the bulk_create method to speed things up.
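For illustration, a hedged sketch of the bulk_create variant, mirroring the snippet above (SiteUser is assumed to be importable in the migration, and the copied fields are just examples; copy whichever attributes your model actually needs):

def forwards(self, orm):
    # Build all the new users in memory, then insert them in one query.
    new_users = [
        SiteUser(
            username=user.username,
            email=user.email,
            password=user.password,  # already hashed, so it can be copied as-is
            is_active=user.is_active,
        )
        for user in orm.User.objects.all()
    ]
    SiteUser.objects.bulk_create(new_users)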
I got tired of struggling with South so I actually ended up doing this differently and it worked out nicely for my particular situation:
First, I made it work with ./manage.py dumpdata, fixing up the dump, and then ./manage.py loaddata, which worked. Then I realized I could do basically the same thing with a single, self-contained script that only loads necessary django settings and does the serialization/deserialization directly.
Self-contained python script
## userconverter.py ##
import json

from django.conf import settings

settings.configure(
    DATABASES={
        # copy DATABASES configuration from your settings file here, or import it directly
        # from your settings file (but not from django.conf.settings) or use dj_database_url
    },
    SITE_ID=1,  # because my custom user implicates contrib.sites (which is why it's in INSTALLED_APPS too)
    INSTALLED_APPS=['django.contrib.sites', 'django.contrib.auth', 'myapp'])

# some things you have to import after you configure the settings
from django.core import serializers
from django.contrib.auth.models import User

# this isn't optimized for huge amounts of data -- use streaming techniques rather than loads/dumps if that is your case
old_users = json.loads(serializers.serialize('json', User.objects.all()))

for user in old_users:
    user['pk'] = None
    user['model'] = "myapp.siteuser"
    user['fields']["site"] = settings.SITE_ID

for new_user in serializers.deserialize('json', json.dumps(old_users)):
    new_user.save()
With dumpdata/loaddata
I did the following:
1) ./manage.py dumpdata auth.User
2) Script to convert auth.user data to new user. (or just manually search and replace in your favorite text editor or grep) Mine looked something like:
def convert_user_dump(filename, site_id):
    file = open(filename, 'r')
    contents = file.read()
    file.close()
    user_list = json.loads(contents)
    for user in user_list:
        user['pk'] = None  # it will auto-increment
        user['model'] = "myapp.siteuser"
        user['fields']["site"] = site_id
    contents = json.dumps(user_list)
    file = open(filename, 'w')
    file.write(contents)
    file.close()
3) ./manage.py loaddata filename
4) set AUTH_USER_MODEL
Side note: one critical part of doing this type of migration, regardless of which technique you use (South, serialization/modification/deserialization, or otherwise), is that as soon as you set AUTH_USER_MODEL to your custom model in the current settings, Django cuts you off from auth.User, even if the table still exists.
We decided to switch to a custom user model in our Django 1.6/Django-CMS 3 project, perhaps a little bit late because we had data in our database that we didn't want to lose (some CMS pages, etc).
After we switched AUTH_USER_MODEL to our custom model, we had a lot of problems that we hadn't anticipated, because a lot of other tables had foreign keys to the old auth_user table, which wasn't deleted. So although things appeared to work on the surface, a lot of things broke underneath: publishing pages, adding images to pages, adding users, etc. because they tried to create an entry in a table that still had a foreign key to auth_user, without actually inserting a matching record into auth_user.
We found a quick and dirty way to rebuild all the tables and relations, and copy our old data across (except for users):
do a full backup of your database with mysqldump
do another backup with no CREATE TABLE statements, and excluding a few tables that won't exist after the rebuild, or will be populated by syncdb --migrate on a fresh database:
south_migrationhistory
auth_user
auth_user_groups
auth_user_user_permissions
auth_permission
django_content_types
django_site
any other tables that belong to apps that you removed from your project (you might only find this out by experimenting)
drop the database
recreate the database (e.g. manage.py syncdb --migrate)
create a dump of the empty database (to make it faster to go round this loop again)
attempt to load the data dump that you created above
if it fails to load because of a duplicate primary key or a missing table, then:
edit the dump with a text editor
remove the statements that lock, dump and unlock that table
reload the empty database dump
try to load the data dump again
repeat until the data dump loads without errors
The commands that we ran (for MySQL) were:
mysqldump <database> > ~/full-backup.sql
mysqldump <database> \
--no-create-info \
--ignore-table=<database>.south_migrationhistory \
--ignore-table=<database>.auth_user \
--ignore-table=<database>.auth_user_groups \
--ignore-table=<database>.auth_user_user_permissions \
--ignore-table=<database>.auth_permission \
--ignore-table=<database>.django_content_types \
--ignore-table=<database>.django_site \
> ~/data-backup.sql
./manage.py sqlclear
./manage.py syncdb --migrate
mysqldump <database> > ~/empty-database.sql
./manage.py dbshell < ~/data-backup.sql
(edit ~/data-backup.sql to remove data dumped from a table that no longer exists)
./manage.py dbshell < ~/empty-database.sql
./manage.py dbshell < ~/data-backup.sql
(repeat until clean)

How to load fixtures in Django south migrations properly?

I am using Django 1.5b1 and south migrations and life has generally been great. I have some schema updates which create my database, with a User table among others. I then load a fixture for ff.User (my custom user model):
def forwards(self, orm):
    from django.core.management import call_command
    fixture_path = "/absolute/path/to/my/fixture/load_initial_users.json"
    call_command("loaddata", fixture_path)
All had been working great until I added another field to my ff.User model, much further down the migration line. My fixture load now breaks:
DatabaseError: Problem installing fixture 'C:\<redacted>create_users.json':
Could not load ff.User(pk=1): (1054, "Unknown column 'timezone_id' in 'field list'")
Timezone is the field (ForeignKey) which I added to my user model.
The ff.User model differs from what is in the database, so the Django ORM gives up with a DB error. Unfortunately, I cannot specify my model in my fixture as orm['ff.User'], which seems to be the South way of doing things.
How should I load fixtures properly using south so that they do not break once the models for which these fixtures are for gets modified?
I found a Django snippet that does the job!
https://djangosnippets.org/snippets/2897/
It loads the data according to the frozen models rather than the actual model definitions in your app's code! Works perfectly for me.
I proposed a solution that might interest you too:
https://stackoverflow.com/a/21631815/797941
Basically, this is how I load my fixture:
from south.v2 import DataMigration
import json

class Migration(DataMigration):

    def forwards(self, orm):
        json_data = open("path/to/your/fixture.json")
        items = json.load(json_data)
        for item in items:
            # Be careful, this lazy line won't resolve foreign keys
            obj = orm[item["model"]](**item["fields"])
            obj.save()
        json_data.close()
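If your fixture also contains foreign keys, one possible workaround is sketched below; it assumes the fixture stores raw primary-key values the way Django's JSON serializer does, and that there are no many-to-many fields:

def forwards(self, orm):
    import json
    with open("path/to/your/fixture.json") as json_data:
        items = json.load(json_data)
    for item in items:
        model = orm[item["model"]]
        obj = model()
        for name, value in item["fields"].items():
            field = model._meta.get_field(name)
            # For ForeignKeys, attname is the raw column ("user_id"),
            # so the integer from the fixture can be assigned directly.
            setattr(obj, field.attname, value)
        obj.save()

Many-to-many values would still need to be applied with .add() after each save.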
This was a frustrating part of using fixtures for me as well. My solution was to make a few helper tools. One creates fixtures by sampling data from a database and includes South migration history in the fixtures.
There's also a tool to add South migration history to existing fixtures.
The third tool checks out the commit at which the fixture was last modified, loads the fixture, then checks out the most recent commit, runs a South migration, and dumps the migrated db back to the fixture. This is done in a separate database so your default db doesn't get stomped on.
The first two can be considered beta code, and please treat the third as usable alpha, but they're already being quite helpful to me.
Would love to get some feedback from others:
git@github.com:JivanAmara/django_fixture_tools.git
Currently, it only supports projects using git as the RCS.
The most elegant solution I've found is here, whereby Django's models.get_model function is switched out to supply the model from the supplied orm instead. It's then set back after the fixture is applied.
from django.db import models
from django.core.management import call_command

def load_fixture(file_name, orm):
    original_get_model = models.get_model

    def get_model_southern_style(*args):
        try:
            return orm['.'.join(args)]
        except:
            return original_get_model(*args)

    models.get_model = get_model_southern_style
    call_command('loaddata', file_name)
    models.get_model = original_get_model
You call it with load_fixture('my_fixture.json', orm) from within your forwards definition.
Generally South handles migrations using forwards() and backwards() functions. In your case you should either:
alter the fixtures to contain proper data, or
import the fixture before the migration that breaks it (or within the same migration, but before altering the schema).
In the second case, before the migration that adds (or, as in your case, removes) the column, you should perform a migration that explicitly loads the fixtures, similarly to this (docs):
def forwards(self, orm):
    from django.core.management import call_command
    call_command("loaddata", "create_users.json")
I believe this is the easiest way to accomplish what you need. Also make sure you don't make simple mistakes such as trying to import data with the new structure before applying the older migrations.
Reading the following two posts has helped me come up with a solution:
http://andrewingram.net/2012/dec/common-pitfalls-django-south/#be-careful-with-fixtures
http://news.ycombinator.com/item?id=4872596
Specifically, I rewrote my data migrations to use output from 'dumpscript'.
I needed to modify the resulting script a bit to work with South. Instead of doing
from ff.models import User
I do
User = orm['ff.User']
This works exactly like I wanted it to. Additionally, it has the benefit of not hard-coding IDs, like fixtures require.
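To illustrate the shape of the result (a hypothetical sketch; the field values are invented, and a real dumpscript run would emit one block like this per existing object):

# A South data migration adapted from dumpscript output
from south.v2 import DataMigration

class Migration(DataMigration):

    def forwards(self, orm):
        # Use the frozen model, not a direct import from ff.models
        User = orm['ff.User']
        u = User()
        u.username = 'alice'
        u.email = 'alice@example.com'
        u.is_active = True
        u.save()

    def backwards(self, orm):
        orm['ff.User'].objects.filter(username='alice').delete()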

Django syncdb not running custom SQL

I'm trying to get my app to run some custom SQL on syncdb with the official method of placing some INSERT statements into <appname>/sql/<modelname>.sql.
Now, when I run "manage.py sqlall <appname>", all the SQL I want run is there.
However, after running syncdb, the data I want is nowhere to be found in the database! Am I missing something?
EDIT: The app I want to make the inserts for is using South migrations, and this may be why the initial SQL is skipped. Does anyone know how I can force it to run the SQL after the migrations, perhaps?
Initial SQL data doesn't get run for South-managed applications. From what I understand this is by design, though I couldn't find any formal confirmation of that, except this mailing list post.
To achieve the same behavior you can create a migration and convert your SQL to Python. This is how I did it for installing a view:
Create and edit the migration:
$ python manage.py schemamigration app1 install_foo_view --empty
$ vim app1/migrations/*_install_foo_view.py
Here's the migration itself:
from south.db import db
from south.v2 import SchemaMigration

class Migration(SchemaMigration):

    def forwards(self, orm):
        db.execute("CREATE VIEW app1_foo AS SELECT col1, col2 FROM app1_bar")

    def backwards(self, orm):
        db.execute("DROP VIEW app1_foo")

    # autogenerated models attribute goes here

    complete_apps = ['app1']
For data migrations the flow is similar:
$ python manage.py datamigration app1 install_bars
$ vim app1/migrations/*_install_bars.py
Here's the migration itself:
from south.db import db
from south.v2 import DataMigration

class Migration(DataMigration):

    def forwards(self, orm):
        orm.Bar.objects.create(col1='val1', col2='val2')

    def backwards(self, orm):
        orm.Bar.objects.filter(col1='val1', col2='val2').delete()

    # autogenerated models attribute goes here

    complete_apps = ['app1']
The approach in the official South docs for data fixtures is flawed:
It doesn't scale well with respect to schema changes.
You would have to copy and adjust the fixture file each time you change the Bar model's fields, so you would end up with "my_fixture_v1.json", "my_fixture_v2.json" and so on.
Otherwise, if you don't keep versioning them and just keep the latest version, there will be mismatches between previous versions of the model and the fixture, and you won't be able to navigate migrations back and forth.
It doesn't support backwards migration.
The suggested approach takes care of both of these issues:
Since you use orm.Bar you have the set of fields that is applicable to Bar right at that point in history. You won't get any missing or extra fields.
There's no need to keep copying anything with Bar changes.
It is possible to migrate backwards.
Note that the filter().delete() combination is used instead of get().delete(). This way you won't delete Bars that have been changed by the user, nor will you fail with Bar.DoesNotExist when they are not found.
I am not exactly sure what you are doing, but if you want to prepopulate the database, you need to use fixtures.
https://docs.djangoproject.com/en/1.3/howto/initial-data/

Utility of managed=False option in Django models

In Django models we have an option named managed which can be set to True or False.
According to the documentation, the only difference this option makes is whether the table will be managed by Django or not. Does it make any difference whether the table is managed by Django or by us?
Are there any pros and cons of using one option rather than the other?
I mean, why would we opt for managed=False? Will it give some extra control or power which affects my code?
The main reason for using managed=False is if your model is backed by something like a database view, instead of a table - so you don't want Django to issue CREATE TABLE commands when you run syncdb.
Right from Django docs:
managed=False is useful if the model represents an existing table or a database view that has been created by some other means. This is the only difference when managed=False. All other aspects of model handling are exactly the same as normal
Whenever we create a Django model, managed=True is implicitly set by default. As we know, when we run python manage.py makemigrations, a migration file is created in the app's migrations folder, and applying that migration creates or updates the table (schema) in the database.
With managed=False, we stop Django from creating or altering the table (schema) for that model and the fields specified in the migration file.
Why would we use it?
Case 1: Sometimes we use two databases for a project, for example db1 (the default) and db2, and we don't want a particular model to create its schema or table in db1, or we want to customise the database view.
Case 2: In the Django ORM, a database table is tied to an ORM model; managed=False helps bind a database view to a Django ORM model.
We can also add the raw SQL for the database view in a migration file.
The raw SQL in the migration looks like this, in 0001_initial.py:
from __future__ import unicode_literals

from django.db import migrations, models

class Migration(migrations.Migration):

    initial = True

    dependencies = [
    ]

    operations = [
        migrations.RunSQL("""
            CREATE OR REPLACE VIEW app_test AS
            SELECT row_number() OVER () as id,
                   ci.user_id,
                   ci.company_id,
                   ...
        """),
    ]
The code above is just an overview of what such a migration file looks like.
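To complete the picture, a hypothetical companion model bound to that app_test view via managed=False (the model name and field types are guesses for illustration; match them to whatever your view actually returns):

# models.py -- unmanaged model reading from the view created above
from django.db import models

class Test(models.Model):
    user_id = models.IntegerField()
    company_id = models.IntegerField()

    class Meta:
        managed = False        # Django will never CREATE/ALTER/DROP this "table"
        db_table = 'app_test'  # point the model at the view created by RunSQL

You can then query it like any other model, e.g. Test.objects.filter(company_id=42), while the view's schema remains entirely your responsibility.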