Mix data and schema migrations in one migration file (Django)?

I've heard the opinion that mixing data migrations and structure migrations is bad practice in Django, even if you specify atomic=False in your Migration class. But I could not find any information on this topic. Even my more experienced colleagues could not answer this question.
So, is it bad to mix data and structure migrations? If so, why? What exactly may happen if I do it?

There is an actual reason for not mixing data and schema migrations in one migration. It is mentioned in the entry for the RunPython operation in the Django docs:
On databases that do support DDL transactions (SQLite and PostgreSQL), RunPython operations do not have any transactions automatically added besides the transactions created for each migration. Thus, on PostgreSQL, for example, you should avoid combining schema changes and RunPython operations in the same migration or you may hit errors like OperationalError: cannot ALTER TABLE "mytable" because it has pending trigger events.
It should also be noted that on databases that do not support DDL transactions (for example, MySQL and Oracle), it may be easier to fix the database after a failed migration attempt when data and schema operations are not mixed together, because Django can roll back the data migration operations automatically while the schema changes stay in place.
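A minimal sketch of the split, assuming a hypothetical app shop whose Order model gains a status column. Only the data step is live code; the two Migration classes it would belong to are outlined in comments, since they need a Django project to run. Every app, model, and field name here is illustrative.

```python
# shop/migrations/0002_order_status.py  (schema only; hypothetical names)
#     operations = [
#         migrations.AddField(
#             model_name="order",
#             name="status",
#             field=models.CharField(max_length=20, default="new"),
#         ),
#     ]
#
# shop/migrations/0003_backfill_order_status.py  (data only)
#     dependencies = [("shop", "0002_order_status")]
#     operations = [
#         migrations.RunPython(backfill_status, migrations.RunPython.noop),
#     ]


def backfill_status(apps, schema_editor):
    """Data step for the second migration; the column already exists by now."""
    # Ask for the historical model instead of importing it from models.py.
    Order = apps.get_model("shop", "Order")
    Order.objects.filter(is_paid=True).update(status="paid")
```

Because the ALTER TABLE and the UPDATE end up in different migrations, each runs in its own transaction on PostgreSQL, which is the situation the quoted docs recommend.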

In the past the best practice was to keep them separate. The second sentence in this section in the docs says:
Migrations that alter data are usually called “data migrations”;
they’re best written as separate migrations, sitting alongside your
schema migrations.
But it doesn't list any reasons why. Since Django ~2.0 I've been allowing small data migrations to occur alongside schema migrations. However, there have been times when the data migration simply couldn't run together with the schema migration. There are two main cases that I've run into.
The data migration takes a long time and shouldn't be a migration in the first place. The resolution was to simply run a script that did what the data migration would have, but in batches.
Attempting to add/update data, then creating an index. This forced me into splitting the migrations into two separate files. I don't remember the exact error, but it simply wouldn't migrate. This shouldn't cause problems for you unless there are non-atomic migrations running which would leave your DB in an unexpected state.
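The first case (a long-running data change done in batches outside the migration system) might look like the sketch below. It uses sqlite3 from the standard library so that it runs anywhere; the item table, the status column, and the batch size are assumptions. In a Django project the same loop would more likely live in a management command and use the ORM.

```python
import sqlite3


def backfill_in_batches(conn, batch_size=1000):
    """Set status='active' where it is NULL, one small transaction per batch."""
    total = 0
    while True:
        with conn:  # commits on success, rolls back if the UPDATE raises
            cur = conn.execute(
                "UPDATE item SET status = 'active' "
                "WHERE id IN (SELECT id FROM item WHERE status IS NULL LIMIT ?)",
                (batch_size,),
            )
        if cur.rowcount == 0:
            return total  # nothing left to update
        total += cur.rowcount
```

Each batch commits on its own, so a failure partway through leaves the earlier batches in place and the script can simply be run again.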

Related

Confusion about migration files in the official Django documentation

The Migration Files part of the official Django documentation reads:
Paragraph 1: The operations are the key; they are a set of declarative instructions which tell Django what schema changes need to be made. Django scans them and builds an in-memory representation of all of the schema changes to all apps, and uses this to generate the SQL which makes the schema changes.
Paragraph 2: That in-memory structure is also used to work out what the differences are between your models and the current state of your migrations; Django runs through all the changes, in order, on an in-memory set of models to come up with the state of your models last time you ran makemigrations. It then uses these models to compare against the ones in your models.py files to work out what you have changed
Q1: In "the differences are between your models and the current state of your migrations", what exactly do "your models" and "the current state of your migrations" refer to? Which one refers to the database version before the new migration file is applied, and which one to the database version after it is applied?
Q2: "That in-memory structure is also used to work out what the differences are between your models and the current state of your migrations"
Isn't this already accomplished by "Django scans them and builds an in-memory representation of all of the schema changes to all apps, and uses this to generate the SQL which makes the schema changes" in Paragraph 1? If so, why does it say "is also used to", which makes the reader think it is something different from Paragraph 1?
Q3: As a best practice, should we delete all migration files every time after a successful migration? I just fixed some errors that popped up when running makemigrations, and that had bothered me for days, by deleting all existing migration files and running makemigrations again.

Am I required to use django data migrations?

I migrated a database that was manipulated via SQL to use Django's migration module: https://docs.djangoproject.com/en/3.0/topics/migrations/.
Initially I thought of using migrations only for changes in the model such as changes in columns or new tables, performing deletes, inserts and updates through SQL, as they are constant and writing a migration for each case would be a little impractical.
Could using the migration module without using data migrations cause me some kind of problem or inconsistency in the future?
A data migration is worth considering when a change also requires a manual fix to the data.
Maybe you decided to normalize something in the database, e.g. to split a name column into a first name and a last name. If you have only one instance of the application with one database and you are the only developer, you probably won't write a data migration. But if you want to make the change on a live production site with 24/7 traffic, or you work with other developers, you will probably prepare a data migration for their databases, or thoroughly test the change on a copy of live data to confirm that the update works on the production site without issues and with minimal downtime. If you don't write a data migration and hit no problem right away, that is OK, and no worse than a possibly ill-conceived data migration.
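As an illustration of the name-splitting case, here is a minimal sketch of the function a RunPython operation could call. The people app, the Person model, and its name, first_name, and last_name fields are all hypothetical, and the splitting rule (everything after the first run of whitespace is the last name) is only one possible choice.

```python
def split_name(full_name):
    """Split on the first run of whitespace; the remainder is the last name."""
    parts = full_name.strip().split(None, 1)
    first = parts[0] if parts else ""
    last = parts[1] if len(parts) > 1 else ""
    return first, last


def split_names(apps, schema_editor):
    """Forward function for migrations.RunPython (hypothetical model)."""
    Person = apps.get_model("people", "Person")
    for person in Person.objects.all():
        person.first_name, person.last_name = split_name(person.name)
        person.save(update_fields=["first_name", "last_name"])
```

Keeping the splitting rule in a plain function makes it easy to try against a copy of live data before the migration ever runs.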

Schema migration commit changes

I have the following situation:
more than one schema migration
one data migration
It would be simple enough but I encountered a problem with the data migration. It sends a query for a specific ContentType which I need for django-taggit. The problem is that the model I want to query didn't exist until the migration that preceded it. That errors out with an empty result from that query.
However, when I run all migrations up to the data migration and then I run the data migration itself, everything works well. I've noticed that a migration process doesn't save changes until all of the migrations are finished which doesn't work for this.
One solution I came up with was to manually commit/save the changes to the database, but I haven't been able to find a way to do it. Of course, if there are any other ideas or a better solution, I'd be happy to hear them.
This is the code where the data migration errors out:
# ChallengeContest ContentType
challenge_contest_ct = ContentType.objects.get(model='challengecontest')
As you can see, the model challengecontest is the one that was created in a migration preceding the data migration.
I have found data migrations to be more trouble than they're worth. In my last two jobs we abandoned them, replacing them with writing one-off management commands.
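A possible explanation for the behaviour in the question, offered as an assumption since the full migration is not shown: Django creates ContentType rows in a post_migrate signal handler, which fires only after the migrate command has finished applying migrations. The row for a model created earlier in the same run therefore does not exist yet, but it does exist if migrate is run in two steps. A minimal sketch of a data step that tolerates this, assuming a hypothetical app label challenges:

```python
# The migration holding this function would also depend on
# ("contenttypes", "0002_remove_content_type_name") so that the historical
# ContentType model is available.


def load_challenge_contest_ct(apps, schema_editor):
    """Return the ContentType row for ChallengeContest, creating it if needed."""
    ContentType = apps.get_model("contenttypes", "ContentType")
    # get_or_create instead of get: the row may not exist yet in this run.
    challenge_contest_ct, _created = ContentType.objects.get_or_create(
        app_label="challenges",  # hypothetical app label
        model="challengecontest",
    )
    return challenge_contest_ct
```

In a real data migration the returned row would be used by the rest of the forward function rather than returned, since RunPython ignores the return value.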

Django won't create a table

I have two different databases in django. Initially, I had a table called cdr in my secondary database. I decided to get rid of the second database and just add the cdr table to the first database.
I deleted references (all of them, I think) to the secondary database in the settings file and throughout my app. I deleted all of the migration files and ran makemigrations fresh.
The table that used to be in the secondary database is not created when I run migrate even though it doesn't exist on my postgres database.
I simply cannot for the life of me understand why the makemigrations function will create the migration file for the table when I add it back in to the model definition and I have verified that it is in the migration file. When I run migrate, it tells me there are no migrations to apply.
Why is this so? I have confirmed that I have managed=True. I have confirmed that the table is not in my postgres database by logging into the first database and running \dt.
Why does Django still think that this table still exists such that it is telling me no migrations to apply even though it shows a create command in the migrations file? I even dropped the secondary database to make sure it wasn't somehow being referenced.
I suspect code isn't needed to explain this to me but I will post if needed. I figure I am missing something simple here.
Why does Django still think that this database still exists such that
it is telling me no migrations to apply even though it shows a create
command in the migrations file
Because Django maintains a table called django_migrations in your database, which lists all the migrations that have been applied. Since you are almost starting afresh, clear out this table and then run the migrations.
If this still doesn't work, and assuming you really are starting fresh, it's a simple matter to drop all the tables (or even the whole database) and run the migrations again. On the other hand, if you have data you want to keep, look at the --fake and --fake-initial options of migrate.
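To make the bookkeeping concrete, the sketch below builds a stand-in django_migrations table in an in-memory SQLite database, with the same app, name, and applied columns Django uses, so it runs without a Django project. The app label calls and the row are made up. Django matches migrations by app and name, so a regenerated 0001_initial counts as already applied while a row with that name remains.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE django_migrations ("
    "id INTEGER PRIMARY KEY, app TEXT, name TEXT, applied TEXT)"
)
conn.execute(
    "INSERT INTO django_migrations (app, name, applied) "
    "VALUES ('calls', '0001_initial', '2020-01-01 00:00:00')"
)
conn.commit()


def applied_migrations(conn, app_label):
    """List the migration names recorded as applied for one app."""
    rows = conn.execute(
        "SELECT name FROM django_migrations WHERE app = ? ORDER BY id",
        (app_label,),
    )
    return [name for (name,) in rows]


def forget_app(conn, app_label):
    """Delete one app's bookkeeping rows so migrate would apply it again."""
    with conn:
        conn.execute("DELETE FROM django_migrations WHERE app = ?", (app_label,))
```

Removing only the rows for the affected app is a narrower variant of clearing the whole table; rows for other apps whose tables still exist stay untouched.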

Is there an easy way to compare Django's models and migration chain against the db to verify consistency?

I've had some migration issues over time and occasionally have run into a case where a field will not have been correctly migrated (almost certainly because I tried some fake migration to get my dev db in a working state).
Doing an automatic schema migration will check the migration chain against the model, but not check either of those against the actual db.
Is there a way to easily compare the database against the current models or migration chain and verify that the db, the models, and migration chain are consistent?
As a straw man imagine you delete your migrations, create a new initial migration, and fake migrate to that initial while deleting the ghost migrations.
Is it trivially possible to verify that the database is in sync with that initial migration?
The django-extensions application provides the sqldiff management command, which shows the differences between the current database and your models. So if your database and models differ (the migrations should match the models after running makemigrations), you will see it.