Adding a non-nullable field on a production Django deployment

I have a production Django deployment (Django 1.11) with a PostgreSQL database. I'd like to add a non-nullable field to one of my models:
class MyModel(models.Model):
    new_field = models.BooleanField(default=False)
In order to deploy, I need to either update the code on the servers or run migrations first, but because this is a production deployment, requests can (and will) happen in between my updating the database and my updating the server. If I update the server first, I will get an OperationalError ("no such column"), so I clearly need to update the database first.
However, when I update the database first, I get the following error from requests made on the server before it is updated with the new code:
django.db.utils.IntegrityError: NOT NULL constraint failed: myapp_mymodel.new_field
On the surface, this makes no sense because the field has a default. Digging into this further, it appears that defaults are applied by Django logic alone and not actually stored at the SQL level. If the server doesn't have the updated code, it will not include the column in its INSERT statements, which the database interprets as NULL.
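For example, running python manage.py sqlmigrate on such an AddField migration shows (roughly, on Postgres) that the default is only used transiently while the column is added, then dropped again:
ALTER TABLE "myapp_mymodel" ADD COLUMN "new_field" boolean DEFAULT false NOT NULL;
ALTER TABLE "myapp_mymodel" ALTER COLUMN "new_field" DROP DEFAULT;
So old code that omits the column gets no help from the database.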
Given this, how do I deploy this new non-nullable field to my application without my users getting any errors?

Migrations should always be run at the beginning of deployments or else you get other problems. The solution to this problem is to split the changes into two deployments.
In deployment 1, the field needs to be nullable (either a NullBooleanField or null=True). You should make a migration for the code in this state and make sure the rest of your code will not crash if the value of the field is None. This is necessary because requests can go to servers that do not yet have the new code; if those servers create instances of the model, they will create them with the field set to NULL.
In deployment 2, you set the field to be not nullable, make a migration for this, and remove any extra code you wrote to handle cases where the value of the field is None. If the field does not have a default, the migration you make for this second deployment will need to fill in values for objects that have None in this field.
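A minimal sketch of what the deployment 2 migration could look like, assuming the field lives on MyModel in an app called myapp (the migration names here are illustrative, not generated output):

from django.db import migrations, models

def fill_new_field(apps, schema_editor):
    # Backfill rows written by old servers while the field was still nullable.
    MyModel = apps.get_model('myapp', 'MyModel')
    MyModel.objects.filter(new_field__isnull=True).update(new_field=False)

class Migration(migrations.Migration):
    dependencies = [
        ('myapp', '0002_mymodel_new_field'),  # hypothetical deployment 1 migration
    ]
    operations = [
        migrations.RunPython(fill_new_field, migrations.RunPython.noop),
        migrations.AlterField(
            model_name='mymodel',
            name='new_field',
            field=models.BooleanField(default=False),
        ),
    ]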
The two deployments technique is needed to safely delete fields as well, although it looks a bit different. For this, use the library django-deprecate-fields. In the first deployment, you deprecate the field in your models file and remove all references to it from your code. Then, in deployment 2, you actually delete the field from the database.
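A sketch of what deployment 1 of a field removal might look like, assuming django-deprecate-fields' deprecate_field helper (check the library's README for the exact API):

from django.db import models
from django_deprecate_fields import deprecate_field

class MyModel(models.Model):
    # Deployment 1: the column still exists in the database, but reading
    # this attribute now returns None, and no code should reference it.
    old_field = deprecate_field(models.BooleanField(default=False))

In deployment 2 you delete the attribute entirely and run makemigrations, which generates the actual RemoveField.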

You can accomplish this by starting with a NullBooleanField:
1. Add new_field = models.NullBooleanField(default=False) to your model
2. Create schema migration 1 with makemigrations
3. Change the model to have new_field = models.BooleanField(default=False)
4. Create schema migration 2 with makemigrations
5. Run schema migration 1
6. Update production code
7. Run schema migration 2
If the old production code writes to the table between steps 5 and 6, a null value of new_field will be written. There will also be a window between steps 6 and 7 where there can be null values for the BooleanField, and reading the field will return None. If your code can handle this, you'll be OK, and then step 7 will convert all of those null values to False. If your new code can't handle these null values, you can perform these steps instead:
1. Add new_field = models.NullBooleanField(default=False) to your model
2. Create schema migration 1 with makemigrations
3. Run schema migration 1
4. Update production code
5. Change the model to have new_field = models.BooleanField(default=False)
6. Create schema migration 2 with makemigrations
7. Run schema migration 2
8. Update production code
Note: these methods were only tested with Postgres.
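For reference, schema migration 2 should be roughly this auto-generated AlterField (app and migration names are illustrative):

from django.db import migrations, models

class Migration(migrations.Migration):
    dependencies = [
        ('myapp', '0002_mymodel_new_field'),  # hypothetical schema migration 1
    ]
    operations = [
        migrations.AlterField(
            model_name='mymodel',
            name='new_field',
            field=models.BooleanField(default=False),
        ),
    ]

As described above, applying it on Postgres fills the remaining null rows with the default before the NOT NULL constraint is added.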

Typically, a Django upgrade process looks as follows:
LOCAL DEVELOPMENT ENV:
Change your model locally
Create migrations for the model (python manage.py makemigrations)
Test your changes locally
Commit & push your changes to (git) server
ON THE PRODUCTION SERVER:
Set ENV parameters
Pull from your version control system (git fetch --all; git reset --hard origin/master)
Update Python dependencies (e.g. pip install -r requirements.txt)
Migrate (python manage.py migrate; migrate_schemas is the django-tenant-schemas variant)
Update static files (python manage.py collectstatic)
Restart the application server (depends on your setup; python manage.py runserver is for development only, in production this usually means restarting gunicorn or uWSGI)
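Those server-side steps as one sketch of a deploy script (the gunicorn service name and the master branch are assumptions; adjust for your setup):

#!/bin/sh
set -e                                    # stop on the first failing step
git fetch --all
git reset --hard origin/master            # pull the deployed revision
pip install -r requirements.txt          # update Python dependencies
python manage.py migrate --noinput       # apply migrations
python manage.py collectstatic --noinput # update static files
sudo systemctl restart gunicorn          # assumed app server, not runserver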

Related

Column Already Exist Error on Django Rest Framework on Heroku Migration

I have updated my model, but I can't run the migrations. The problem is:
I am getting an error like: column "blah blah" of relation "blah blah blah" already exists
The mentioned column does exist in the DB, but it shouldn't be in the migration file: I did not add or modify that model field. It was already created successfully by one of the previous migrations and has been used frequently without any error.
When I run the migration with --fake, the field that genuinely doesn't exist yet (the one defined by this model update) doesn't get created either.
This is deployed on Heroku, and it may be caused by rollbacks of the code layer: after a rollback the code reverts to an older version but the DB stays the same.
What is the best way to fix this without losing any data from the production DB?
The timezone, endtime and start time fields already existed on the model and in the DB before this migration; they were created in one of the previous successful migrations.
Thanks
You should not be running makemigrations on Heroku. Do it locally, then commit the result, deploy, and then run migrate only.
As it is, your migrations and database have got completely out of sync; if you don't have any data you need to keep, the easiest thing to do is to delete your DB and start again.
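The intended split looks roughly like this (Heroku CLI assumed; app and commit message are illustrative):

python manage.py makemigrations          # locally, against your models
git add myapp/migrations/
git commit -m "Add new field migration"
git push heroku master                   # deploy code plus migration files
heroku run python manage.py migrate      # on Heroku: apply only, never makemigrations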

Load fixtures + add Page: IntegrityError (duplicate key value)

I have a migration that loads a fixture for populating the database with a basic site structure (from Loading initial data with Django 1.7 and data migrations). After that migration ran, my test adds a (custom) NewsPage. This yields an IntegrityError at /admin/pages/add/website/newspage/5/:
duplicate key value violates unique constraint "wagtailcore_page_pkey"
DETAIL: Key (id)=(3) already exists.
The same happens when I add the page through the admin interface.
It's a bit suspicious that the Page with pk=3 is the first one that is created in my fixture. The other two pk's were already created by Wagtail's migrations.
I've read up about fixtures and migrations, and it seems Postgres won't reset the primary key sequences. I'm assuming this is also my problem here.
I found a possible solution in Django: loaddata in migrations errors, but that failed (with psycopg2.ProgrammingError: syntax error at or near "BEGIN", LINE 1: BEGIN;). Trying to execute the gist of it, I ran the sqlsequencereset management command (./manage.py sqlsequencereset wagtailcore myapp), but I still get the error, although now for id=4.
Is my assumption correct that Postgres not resetting the primary key sequences is my problem here?
Does anyone know how to reliably fix that from/after a migration that loads fixtures?
Would it maybe be easier / more reliable to create content in Python code?
Edit (same day):
If I don't follow the example in Loading initial data with Django 1.7 and data migrations, but just run the management command, it works:
from io import StringIO
from django.core.management import call_command

def load_fixture(fixture_file):
    """Load a fixture via the loaddata management command."""
    output = StringIO()
    call_command('loaddata', fixture_file, stdout=output)
Edit 2 (also same day):
OK, I do know: the fixtures will be loaded based on the current model state, not the state the migration was written for, so this will likely break if your models change.
I converted the whole thing to Python code. That works and will likely keep working. Today I learned: don't load fixtures in migrations. (Pity, it would have been a nice shortcut.)
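For completeness, resetting the Postgres sequence can also be done as its own RunPython step after the fixture load; a sketch (the wagtailcore_page table matches the error above, and the dependency name is hypothetical):

from django.db import migrations

def reset_page_sequence(apps, schema_editor):
    # Bump the pk sequence past the ids the fixture inserted explicitly;
    # COALESCE guards against an empty table.
    with schema_editor.connection.cursor() as cursor:
        cursor.execute(
            "SELECT setval(pg_get_serial_sequence('wagtailcore_page', 'id'), "
            "COALESCE((SELECT MAX(id) FROM wagtailcore_page), 1))"
        )

class Migration(migrations.Migration):
    dependencies = [
        ('myapp', '0003_load_initial_pages'),  # hypothetical fixture-loading migration
    ]
    operations = [
        migrations.RunPython(reset_page_sequence, migrations.RunPython.noop),
    ]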

Manually altering postgres schema leads to the error: current transaction aborted, commands ignored until end of transaction block (django app)

I have a Django model called Message:
class Message(models.Model):
    text = models.CharField(max_length=50)
    sending_time = models.DateTimeField(auto_now_add=True)
I added sender = models.ForeignKey(User) to this model in my Django code. Next, I wanted to explore making the changes in the PostgreSQL table manually, without invoking syncdb or South.
So I logged into psql and did the following:
ALTER TABLE message ADD COLUMN "sender" INTEGER DEFAULT 1;
ALTER TABLE message ADD CONSTRAINT fk_message_user FOREIGN KEY (sender) REFERENCES user(id);
With those two commands, I successfully added a new column with default value 1, containing a foreign key to Django's built-in User model.
After I had done that, I figured I had successfully approximated what Django does with syncdb or south. I ran my web app - everything looked good. But then as soon as I hit the first database transaction in my code, the following error resulted:
DatabaseError: current transaction is aborted, commands ignored until
end of transaction block
I'm unable to get it to work unless I delete the manually added column (or roll back the transaction). syncdb is an option, but let's exclude it for this question.
What did I miss when I manually provisioned that foreignkey field? Can someone walk me through the process? I'm assuming running a migration simply automates steps one could have done manually too. I need to understand the anatomy of that process, hence doing it by hand. That's what this question's about.
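For comparison, the SQL that Django itself generates for such a ForeignKey looks roughly like the following (the app name myapp and the constraint and index names are illustrative): note the myapp_message table name, the sender_id column name, and the quoted "auth_user" reference, since user is a reserved word in Postgres and the built-in User model's table is auth_user:

ALTER TABLE "myapp_message" ADD COLUMN "sender_id" integer NOT NULL DEFAULT 1;
ALTER TABLE "myapp_message" ADD CONSTRAINT "myapp_message_sender_id_fk"
    FOREIGN KEY ("sender_id") REFERENCES "auth_user" ("id") DEFERRABLE INITIALLY DEFERRED;
CREATE INDEX "myapp_message_sender_id_idx" ON "myapp_message" ("sender_id");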

Is there an easy way to compare Django's models and migration chain against the db to verify consistency?

I've had some migration issues over time and occasionally have run into a case where a field will not have been correctly migrated (almost certainly because I tried some fake migration to get my dev db in a working state).
Doing an automatic schema migration will check the migration chain against the model, but not check either of those against the actual db.
Is there a way to easily compare the database against the current models or migration chain and verify that the db, the models, and migration chain are consistent?
As a straw man imagine you delete your migrations, create a new initial migration, and fake migrate to that initial while deleting the ghost migrations.
Is it trivially possible to verify that the database is in sync with that initial migration?
The django-extensions application provides the sqldiff management command, which shows the difference between the current database and your models. So if there is a difference between your database and models (the migrations should match the models after running makemigrations), you will see it.
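Typical usage, assuming django-extensions is installed and in INSTALLED_APPS (verify the flags against your version's docs):

python manage.py sqldiff myapp   # diff one app's models against the database
python manage.py sqldiff -a      # diff all installed apps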

Django: flush command doesn't completely clear database, reset fails

I rewrote a lot of my models, and since I am just running a test server, I do ./manage.py reset myapp to reset the db tables and everything has been working fine.
But when I tried to do it this time, I got an error.
The full error: constraint "owner_id_refs_id_9036cedd" of relation "myapp_tagger" does not exist
So I figured I would just nuke the whole site and start fresh. I did ./manage.py flush and then a syncdb. This did not raise an error and deleted all my data, but it did not update the schema: when I try to access any of my_app's objects, I get a column-not-found error. I thought that flush was supposed to drop all tables. The syncdb said that no fixtures were added.
I assume the error is related to the fact that I changed the tagger model to have a ForeignKey named owner tied to another object.
I have tried adding related_name to the ForeignKey arguments and nothing seems to be working.
I thought that flush was supposed to drop all tables.
No. According to the documentation, manage.py flush doesn't drop the tables. Instead it does the following:
Returns the database to the state it was in immediately after syncdb was executed. This means that all data will be removed from the database, any post-synchronization handlers will be re-executed, and the initial_data fixture will be re-installed.
As stated in chapter 10 of The Django Book in the "Making Changes to a Database Schema" section,
syncdb merely creates tables that don't yet exist in your database — it does not sync changes in models or perform deletions of models. If you add or change a model's field, or if you delete a model, you’ll need to make the change in your database manually.
Therefore, to solve your problem you will need to do one of the following:
Delete the database and reissue manage.py syncdb. This is the process that I use when I'm still developing the database schema. I use an initial_data fixture to install some test data, which also needs to be updated when the database schema changes.
Manually issue the SQL commands to modify your database schema.
Use South.
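For the first option, the syncdb-era recipe to drop and recreate a single app's tables was to pipe sqlclear into dbshell (these commands exist in Django versions before 1.9; back up any data you care about first):

./manage.py sqlclear myapp | ./manage.py dbshell   # print and execute the DROP TABLE statements
./manage.py syncdb                                 # recreate the tables from the current models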