Django Migrations fail during django initialization - django

The setup
A few months ago I upgraded an app from Django 1.6 to 1.7 then 1.8 and now I'm working on getting up to 1.9.
While wrangling with migrations I uncovered a pretty nasty instance of database state dependency - a manager method on the (not so great) custom user model queries django.contrib.auth.models.Group. Yikes. So when I set up the Continuous Integration pipeline I managed to solve the problem with a data migration and everything was great.
The migration looks like this:
# import statements left out for brevity
def seed_groups(apps, schema_editor):
    Group = apps.get_model('auth', 'Group')
    Group.objects.get_or_create(name='group1')
    Group.objects.get_or_create(name='group2')
    Group.objects.get_or_create(name='group3')

class Migration(migrations.Migration):
    dependencies = [('auth', '0001_initial')]
    operations = [migrations.RunPython(seed_groups)]
Okay, so that's not totally great - the usage of get_or_create here allows us to hook up a database that already has data in it without making Postgres get super upset about asking it to insert rows it already has. This works though and we've been happy campers with tests running merrily and none of our environments have had any issues.
The twist
So I've been running my tests and fixing deprecations, updating libraries, blah blah blah. So it came as a surprise that my CI environment (a moderately popular service) is failing to build at the migration step, only since the Django version changed from 1.8 to 1.9.
I've checked that there isn't some sort of cached dependency chain issue (we're loading all the right libraries) but the traceback from the error is very familiar....
django.db.utils.ProgrammingError: relation "auth_group" does not exist
LINE 1: ...ELECT "auth_group"."id", "auth_group"."name" FROM "auth_grou...
^
Full traceback available here: https://gist.github.com/alexkahn/b63c41904809cbe53962dc104e4067f0
This error is cropping up from running python manage.py migrate --no-input
Things I've done in futile attempts to resolve the issue:
Modifying the seed_groups function like so:
def seed_groups(apps, schema_editor):
    # same
    db_alias = schema_editor.connection.alias
    Group.objects.using(db_alias).get_or_create(name='group1')
    # etc...
Adding an initial = True class attribute to my 0001 migration for this app.
Squashing all of the migrations into one.
Pointing my installed apps list directly to this app's AppConfig subclass.
What I was thinking for some of those, I don't know.
Bottom Line
Anyone have a clue why this would suddenly change? Is there something super obvious that I am thinking too hard about?

So this pulled out some new havoc from a...not so great way to do things with Django.
So we have a model manager here:
class UserManager(BaseUserManager):
    def users_in_group1(self):
        return Group.objects.get(name='group1').user_set.filter()
It calls Group.objects.get(), which hits the auth_group table immediately instead of building a lazy queryset. That tight coupling to auth.models.Group meant Django needed to query the table before any tables were created.
A simple change to:
def users_in_group1(self):
    return self.filter(groups__name='group1')
Allows the migrations to run without issue.

The Django system checks run before the migrations. The URL system checks added in 1.9 check your URL config, which causes your views to be imported.
This part of the traceback shows that SuperUserAccountForm is causing queries when the module is loaded. This causes an error when you are migrating, because the table has not been created yet.
  File "/home/rof/src/github.com/myapp/myapp/myapp/accounts/views.py", line 28, in <module>
    from myapp.accounts.forms import (
  File "/home/rof/src/github.com/myapp/myapp/myapp/accounts/forms.py", line 351, in <module>
    class SuperUserAccountForm(forms.ModelForm):
  File "/home/rof/src/github.com/myapp/myapp/myapp/accounts/forms.py", line 358, in SuperUserAccountForm
    queryset=Account.objects.group1(),
  File "/home/rof/src/github.com/myapp/myapp/myapp/accounts/models.py", line 71, in group1
    return Group.objects.get(name='group1').user_set.filter()
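The traceback above comes down to a class body running a lookup while the module is being imported. A minimal sketch of the difference between doing that work in the class body and doing it at instantiation, using plain Python stand-ins (group1_users, EagerForm and LazyForm are hypothetical names, not Django APIs):

```python
# Stand-ins for illustration only; none of these names are Django APIs.
queries_run = []


def group1_users():
    """Pretend to hit the database and record that a query happened."""
    queries_run.append("SELECT ... FROM auth_group")
    return ["alice", "bob"]


class EagerForm:
    # Evaluated once, while the class body executes (i.e. at import time).
    choices = group1_users()


class LazyForm:
    def __init__(self):
        # Evaluated only when an instance is created (i.e. per request).
        self.choices = group1_users()


queries_after_import = len(queries_run)  # EagerForm has already run one query
form = LazyForm()
queries_after_instantiation = len(queries_run)
```

In a real ModelForm the usual equivalent is to assign the field's queryset inside __init__, or, as in the fix above, to make the manager method return a lazy queryset (self.filter(...)) so that nothing executes until the queryset is evaluated.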

Related

Django pytest running many migrations after tests

I am running pytest-django with a legacy database that was not created by Django (e.g. all my models use managed=False). In production, everything is fine but in testing Django wants to apply a bunch of curious migrations.
For testing, I have a pre-populated test database and I want my tests to commit changes to the database (because we have logic in db views and triggers that needs to get run). All that is working fine but afterwards a ton of migrations are run and it takes my test suite time from 1s to 70s.
The migrations are overwhelmingly this type: ALTER TABLE [DeprecatedLegacyTable] CHECK CONSTRAINT [FK_dbo.DeprecatedLegacyTable_dbo.User_DeprecatedServiceId]. The tables aren't even in any models.py, so I guess Django is digging this up with inspectdb.
I've looked around a bit and it seems this is a "feature" of Django but it is hurting my workflow. Is there any way to apply these migrations once and for all rather than replay them every test run? I've run makemigrations and showmigrations and there is nothing to apply.
EDIT:
I think that everything is related to TransactionTestCase. pytest-django actually warns that using transaction=True will be slow. Also, I don't think that these are migrations; it is the database flush procedure. The queries being run are the same as when I do django-admin sqlflush. So, I guess I am trying to override that flush behavior.
EDIT2:
What a ride. I see that Django defers to the vendor database backend for flush functionality, meaning each vendor can do it differently. I'm using mssql and they chose some questionable operations. Here's the part where they do the ALTER TABLE on every constraint:
COLUMNS = "TABLE_NAME, CONSTRAINT_NAME"
WHERE = "CONSTRAINT_TYPE not in ('PRIMARY KEY','UNIQUE')"
cursor.execute(
    "SELECT {} FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS WHERE {}".format(COLUMNS, WHERE))
fks = cursor.fetchall()
sql_list = ['ALTER TABLE %s NOCHECK CONSTRAINT %s;' %
            (self.quote_name(fk[0]), self.quote_name(fk[1])) for fk in fks]
In the end, I decided to try to monkeypatch the sql_flush functionality to return an empty list since I don't need any actual flushing done.
This is from a conftest.py:
import os

import pytest
from django.conf import settings


@pytest.fixture(scope="session")
def django_db_setup():
    # Turn the database flush procedure into a no-op
    def mock_flush(*args, **kwargs):
        return []

    import django.core.management.sql
    django.core.management.sql.sql_flush = mock_flush

    # Point the tests at the pre-populated test database
    settings.DATABASES["default"] = {
        "ENGINE": "mssql",
        "HOST": os.environ["SERVER_URL"],
        "NAME": os.environ["TEST_DATABASE"],
    }

Run migrations without loading views/urls

I have the following code in one of my views:
@ratelimit(method='POST', rate=get_comment_rate())
def post_comment_ajax(request):
    ...
However, on the initial ./manage.py migrate, get_comment_rate() requires a table in the database, so I'm unable to run the migrations to create the tables. I ended up with the following error:
django.db.utils.ProgrammingError: relation .. does not exist
Is it possible to run migrations without loading views or is there a better way?
Running migrations triggers the system checks to run, which causes the views to load. There isn't an option to disable this.
It looks like the ratelimit library allows you to pass a callable.
@ratelimit(method='POST', rate=get_comment_rate)
def post_comment_ajax(request):
    ...
This would call get_comment_rate when the view runs, rather than when the module loads. This could be an advantage (the value won't be stale) or a disadvantage (running the SQL query every time the view runs could affect performance).
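To make the timing difference concrete, here is a small sketch with a toy decorator. It is not the real ratelimit library; the decorator, its behaviour, and the '5/m' value are assumptions chosen only to show when the rate function gets called:

```python
import functools

calls = []


def get_comment_rate():
    """Hypothetical stand-in for the asker's function, which queries the database."""
    calls.append("query")
    return "5/m"


def ratelimit(rate):
    """Toy decorator, NOT the real ratelimit library: accepts a string or a callable."""
    def decorator(view):
        @functools.wraps(view)
        def wrapper(request):
            # A callable rate is resolved only when the view is actually invoked.
            current = rate() if callable(rate) else rate
            return view(request), current
        return wrapper
    return decorator


@ratelimit(rate=get_comment_rate)  # passes the function itself; nothing is called yet
def post_comment_ajax(request):
    return "ok"


calls_at_import = len(calls)
result = post_comment_ajax(request=None)
calls_after_request = len(calls)
```

Writing rate=get_comment_rate() instead would call the function while the decorator line is being evaluated, which is the import-time query that breaks migrate.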
In general, you want to avoid database queries when modules load. As well as causing issues with migrations, it can cause issues when running tests -- queries can go to the live db before the test database has been created.
If you are OK with this risk, one option would be to catch the exception inside get_comment_rate and fall back to a default:
from django.db.utils import ProgrammingError

def get_comment_rate():
    try:
        ...
    except ProgrammingError:
        return '1/m'  # or some other default

Test failure because model couldn't be imported

The application I am working on is proprietary, so I will try to provide as much information as I can.
When running python manage.py test, which runs all the tests, only one application among many others fails. Too many hours have been burned on this.
The output is:
ImportError: Failed to import test module: app.aom.apps.forum.tests
After this, the traceback is listed, followed by one line saying that the problem occurs when importing models into the tests.py file, that is:
from .models import ForumSectionGroup, ForumSection, ForumThread, ForumPost
and the last line of the output is:
RuntimeError: Model class app.aom.apps.forum.models.ForumSectionGroup doesn't declare an explicit app_label and isn't in an application in INSTALLED_APPS.
I have Googled and researched what could cause this problem, and the conclusion is: either I am importing the module before the application is loaded, or I don't have the application listed in INSTALLED_APPS. But neither of these seems to be the problem. Maybe the testing mechanism somehow skips a few steps and leaves the model unloaded before importing it.
Explicitly assigning app_label as part of class Meta in the model results in conflict, because the model ends up registered twice, when I force it. I was driven to this conclusion by looking at the code at line 111, https://github.com/django/django/blob/master/django/db/models/base.py
I ran into this same issue. For me what fixed it was changing
from .models import Model1, Model2
to
from app.models import Model1, Model2
The from .models import syntax works fine in views.py, etc. but in the tests it was not working. This only seems to be the case when using a non-standard structure as pointed out in a comment above.
In my specific case I was using Django 1.11.

How to unittest a django database migration?

We've changed our database, using django migrations (django v1.7+).
The data that exists in the database is no longer valid.
Basically I want to test a migration by, inside a unittest, constructing the pre-migration database, adding some data, applying the migration, then confirming everything went smoothly.
How does one:
hold back the new migration when loading the unittest
I found some stuff about overriding settings.MIGRATION_MODULES but couldn't work out how to use it. When I inspect executor.loader.applied_migrations it still lists everything. The only way I could prevent the new migration was to actually remove the file; not a solution I can use.
create a record in the unittest database (using the old model)
If we can prevent the migration then this should be pretty straightforward. myModel.objects.create(...)
apply the migration
I think I can probably work this out now that I've found the test_executor: set a plan pointing to the migration file and execute it? Um, right? Got any code for that :-D
confirm the old data in the database now matches the new model
Again, I expect this should be pretty easy: just fetch the instance created before the migration and confirm it has changed in all the right ways.
So the challenge is really just working out how to prevent the unittest from applying the latest migration script and then applying it when we're ready?
Perhaps I have the wrong approach? Should I create fixtures, and just confirm that they're all good at the end? Do fixtures get loaded before the migrations are applied, or after they're all done?
By using the MigrationExecutor and picking out specific migrations with .migrate I've been able to, maybe?, roll it back to a specific state, then roll forward one-by-one. But that is popping up doubts; currently chasing down sqlite fudging around due to the lack of an actual ALTER TABLE instruction. Jury still out.
I wasn't able to prevent the unittest from starting with the current database schema, but I did find it is quite easy to revert to earlier points in the migration history:
Where "0014_nulls_permitted" is a file in the migrations directory...
from django.db import connection
from django.db.migrations.executor import MigrationExecutor

executor = MigrationExecutor(connection)
executor.migrate([("workflow_engine", "0014_nulls_permitted")])
executor.loader.build_graph()
NB: running the executor.loader.build_graph between invocations of executor.migrate seems to be a very important part of completing the migration and making things behave as one might expect
The migrations which are currently applicable to the database can be checked with something like:
print([x[1] for x in sorted(executor.loader.applied_migrations)])
[u'0001_initial', u'0002_fix_foreignkeys', ... u'0014_nulls_permitted']
I created a model instance via the ORM then ensured the database was in the old state by running some SQL directly:
job = Job.objects.create(....)
from django.db import connection
cursor = connection.cursor()
cursor.execute('UPDATE workflow_engine_job SET next_job_state=NULL')
Great. Now I know I have a database in the old state, and can test the forwards migration. So where 0016_nulls_banished is a migration file:
executor.migrate([("workflow_engine", "0016_nulls_banished")])
executor.loader.build_graph()
Migration 0015 goes through the database converting all the NULL fields to a default value. Migration 0016 alters the schema. You can scatter some print statements around to confirm things are happening as you think they should be.
And now the test can confirm that the migration has worked. In this case by ensuring there are no nulls left in the database.
jobs = Job.objects.all()
self.assertTrue(all([j.next_job_state is not None for j in jobs]))
We have used the following code in settings_test.py to ignore the migration for the tests:
MIGRATION_MODULES = dict(
    (app.split('.')[-1], '.'.join([app, 'nonexistent_django_migrations_module']))
    for app in INSTALLED_APPS
)
The idea here being that none of the apps have a nonexistent_django_migrations_module folder, and thus django will simply find no migrations.
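As a rough illustration of what that expression produces, here it is evaluated against a sample INSTALLED_APPS list. The app names are made up, and the sketch assumes every entry is a plain dotted module path whose last component is the app label (an AppConfig path would not fit that assumption):

```python
# Hypothetical app list, used only to show the shape of the resulting mapping.
INSTALLED_APPS = [
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "myproject.accounts",
]

# Same expression as in the settings file above: map each app label to a
# module path that does not exist, so no migrations are found for it.
MIGRATION_MODULES = dict(
    (app.split(".")[-1], ".".join([app, "nonexistent_django_migrations_module"]))
    for app in INSTALLED_APPS
)

# Assumption: variant for Django versions whose settings reference documents
# None as a MIGRATION_MODULES value meaning "no migrations for this app".
MIGRATION_MODULES_NONE = {label: None for label in MIGRATION_MODULES}

for label, module in sorted(MIGRATION_MODULES.items()):
    print(label, "->", module)
```

Whether the None form is available depends on the Django version in use, so it is worth checking the MIGRATION_MODULES entry in the settings reference before relying on it.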

Django: Loaddata command after syncdb fails

I'm trying to use fixtures as a DB-agnostic way to get the data into my database, but this is much harder than it should be. I'm wondering what I'm doing wrong...
Specifically, when I do a syncdb followed by a migrate followed by a loaddata I run into trouble, since syncdb already creates data that loaddata tries to read from the dump. This leads to double entries and hence a crashing script.
This seems to be the same problem as described here: https://code.djangoproject.com/ticket/15926
But it's weird to me that this seems to be an ignored issue. Are fixtures not meant to actually put real (live) data in?
If so: is there any Django-format that is meant for this? Or is everyone just dumping data as SQL? And, if so, how would one migrate development data in SQLite to a production database?
syncdb will also load data from fixtures if you have the fixtures named correctly and in the correct location. See this link for more info.
https://docs.djangoproject.com/en/1.3/howto/initial-data/#automatically-loading-initial-data-fixtures
If you do not want the data to load on every syncdb then you will need to change the name of the fixture.
Fixtures are an OK way to load your data; I have used them on a number of projects. On projects with a ton of data I sometimes write a special load script that takes the data from my data source and loads up my new Django models. The custom script is a little more work, but gives you more flexibility.
I tend to stay away from using SQL to load data if I can, since SQL is usually DB-specific. If you have to worry about loading into different databases or database versions, avoid it if you can.
"In general, using a fixture is a cleaner method since it’s database-agnostic, but initial SQL is also quite a bit more flexible."
OP here; this is what I came up with so far:
# some_app/management/commands/delete_all_objects.py
from django.core.management.base import BaseCommand, CommandError
from django.db.models import get_models

class Command(BaseCommand):
    help = 'Deletes all objects'

    def handle(self, *args, **options):
        for model in get_models():
            model.objects.all().delete()
And then just run delete_all_objects after syncdb and migrate, and before loaddata. I'm not sure I like it, and I'm very surprised it's necessary, but it works.