I can think of three reasons why:
providing users with the flexibility on "when" to commit model changes
debugging modularity
perhaps resource consumption in larger
databases
However, it does seem that migrate always follows shortly after migration (tutorials/youtube videos).
so is there a philosophy behind this that I'm missing?
Of course there are some reasons.
First of all, 'makemigrations' doesn't touch the real database. It just records how your models (the DB schema) have changed, so you can see what is going to happen when you run 'migrate'.
This makes Django safer to work with.
It also gives you the chance to supply default values for new fields and other schema changes.
Another reason is 'revert'.
If you want to roll back the database schema, you can just tell Django to roll back to a specific migration file.
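To make this concrete: after you add a field to a model, makemigrations only writes a file like the sketch below under yourapp/migrations/ (the app, model, and field names are invented); nothing touches the database until you run migrate. If the new column needs a one-off default, makemigrations asks you for it and records it in this file.

# yourapp/migrations/0002_profile_nickname.py -- what makemigrations records (hypothetical example)
from django.db import migrations, models

class Migration(migrations.Migration):

    dependencies = [
        ('yourapp', '0001_initial'),
    ]

    operations = [
        # The change is only described here; 'migrate' issues the actual ALTER TABLE.
        migrations.AddField(
            model_name='profile',
            name='nickname',
            field=models.CharField(max_length=50, default=''),
        ),
    ]

Rolling back is then just python manage.py migrate yourapp 0001, which unapplies everything after the named migration.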
Another reason is the 'reusable app' principle.
If you create an app with Django, it can be reused without any direct database interaction: when you deploy the app (or the whole project) to another project or server, it only needs the 'migrations' files, not the real database.
I'm just beginning my journey with the Django framework, and I read that the Django developers have made using migrations mandatory beginning from version 2.0. I might be old school, but I like my database separate from my code, and I have always kept my database separate from my code's models. I also think migrations won't scale with engineering team size.
So my question is two-fold.
Can you use Django 2.0 without migrations? I don't think they will scale well or fit a CI/CD pipeline.
If we can't avoid the migrations, how can we integrate them into a robust CI/CD pipeline where a model can be changed by different developers from different teams?
Yes, you can. You can create your tables manually and tell Django not to manage them.
After your Django project is configured, just run python manage.py inspectdb > models.py in your terminal, and Django will generate models from the tables in the configured database. This is particularly useful if your project will use an already existing or legacy database.
Then you can tell Django not to manage those tables via the model's Meta options:
from django.db import models

class MyModel(models.Model):
    # your fields here

    class Meta:
        # Tell Django to leave this table's schema alone (no migrations).
        managed = False
See the docs here
But, unless you have a very good way to keep track of your table changes, I must say this is a mistake. Django migrations help you keep track of your model changes along the way. They are really helpful if you need to roll back or understand your database history.
Migrations are not mandatory; it's not clear what you think has changed in 2.0 to make them so.
Migrations are intended for large teams. If you avoid them, you'll make things much much harder for yourself and your fellow team members.
Our product has a RESTful API and a server-rendered app (the CMS). Both share the database, and both are written in Django.
The fields and models needed in the two are not mutually exclusive: some are particular to the API, some to the CMS, and some are common.
My question is: if I run migrations in one of the repos, will they try to drop the fields that aren't present in the models of that particular repo but are needed by the other? Will running the migrations individually in both repos keep the database up to date and not pose a problem?
The only other valid option IMHO (besides merging projects) is turning off automation of Django migrations on common models (Meta.managed = False) and taking table creation & versioning into your own hands. You still can write migration scripts using django.db.migrations but makemigrations command won't do anything for these tables.
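For example, a hand-written migration for one of those unmanaged tables might look like the sketch below (the app label, table, and column names are made up):

from django.db import migrations

class Migration(migrations.Migration):

    dependencies = [
        ('common', '0001_initial'),
    ]

    operations = [
        # Hand-written: for managed = False models, auto-generated operations
        # would not touch the database, so the SQL is spelled out explicitly.
        migrations.RunSQL(
            sql="ALTER TABLE common_article ADD COLUMN summary text",
            reverse_sql="ALTER TABLE common_article DROP COLUMN summary",
        ),
    ]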
This was solved by using a schema migration tool external to Django's own. We use yoyo migrations to migrate our schema now.
Will running the migrations individually in both repos keep the database up to date and not pose a problem?
Unfortunately, no. As you suspected, changes in one will attempt to override the other.
The easiest thing to do is merge the two projects into one so this problem goes away entirely.
If this isn't an option, can the code be organised in such a way that both projects share the same models.py files? You could do this by perhaps having the models.py files and migrations folders only exist in one project. The second project could have a symlink across to each models.py file it uses. The trick (and the difficult part) will be to make sure you never create migrations for the app which uses the symlinks.
I think the best thing to do would be to have one repo that contains all the models and fields. That project will be responsible for applying the migrations.
In the other projects, you'll need a database router with an allow_migrate method that returns False for those model classes.
Also, having different database users with different permissions can prevent the other projects from altering the tables.
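A minimal sketch of such a router might look like this (the app label 'shared' and the module path are assumptions; hook it up via DATABASE_ROUTERS in settings):

# myproject/db_routers.py (illustrative)
class SharedModelsRouter:
    """Blocks this repo from migrating the app whose schema the other repo owns."""

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        if app_label == 'shared':
            # Refuse to create or alter these tables from this project.
            return False
        # None means "no opinion"; other routers or the default behaviour decide.
        return None

and in settings.py: DATABASE_ROUTERS = ['myproject.db_routers.SharedModelsRouter'].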
I read somewhere that you should never run syncdb on a database after its initial run.
Is this true?
I don't see what the problem could be. Do you?
Running syncdb will not make changes to tables for any models already in the database (even if you have changed them).
For managing changes to models, consider South.
Syncdb will create tables that don't exist, but it will not modify existing tables, so it's fairly safe to run in production. However, it's not a reliable way to maintain a database schema. Look at the South package for a way to reliably manage changes to your database schema between development and production. It should be part of the Django standard, IMHO.
On Heroku, as soon as you push new code, the web-serving instances restart... even if the underlying database schema additions/changes (via syncdb or south migrate) haven't yet been applied.
In many cases, this might just cause harmless errors until the syncdb/migrate is run soon afterward. But I'm concerned that in some cases, new code might half-work, making unexpected changes in the pre-migration database.
What's the right way to be safe against this risk?
One technique might be to add the syncdb/migrate to the Procfile so it's run before the web restart. But in the case of multiple instances, or maybe even a case where the one old-code instance is left running until the moment the one new-code instance is known to be up, there's still a variant of the issue where code is talking to a DB with a mismatched schema.
Is there a 'hold all web instances' feature (or common best practice) for letting the migrate complete without web traffic?
Or am I being overly concerned about a risk that is negligible in practice?
The safest way to handle migrations of this nature, Heroku or no, is to strictly adopt a compatibility approach with your schema and code:
Every additive or transformative schema change must be backwards-compatible (see the sketch after this list);
Every destructive schema change must be performed after the code that depends on it has been removed;
Every code change must either be:
durable against the possibility that associated schema changes have not yet been made (for instance, removing a model or a field on a model) or
made only after the associated schema change has been performed (adding a model or a field on a model)
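As a concrete example of the first rule, a new column usually goes in as nullable (or with a default) in one deploy, and only a later deploy starts relying on it. A minimal sketch with an invented field:

from django.db import models

class Order(models.Model):
    total = models.DecimalField(max_digits=10, decimal_places=2)
    # Added as nullable so code deployed before this schema change,
    # which never supplies a value, can keep inserting rows.
    coupon_code = models.CharField(max_length=40, null=True, blank=True)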
If you need to make a significant transformation of a model, this approach might require the following steps:
Create a new database table to hold your new model structure, and deploy that migration
Create a new model with the new structure, and code to copy changes from the old model to the new model when the old model changes, and deploy that code
Execute a migration or code action to copy all old model data to the new model
Update your codebase to use the new model rather than the old model, deleting the old model, and deploy that code
Execute a migration to delete the old model structure from the database
With some thought and planning, it can be used for more drastic changes as well:
Deploy code that completely removes dependence on a section of the database, presumably replacing those sections of the site with maintenance pages
Deploy a migration that makes drastic changes that would not for whatever reason work with the above dual-model workflow
Deploy code that brings the affected sections back with the new model structure supported
This can be hard to organize and requires strict discipline and firm understanding of your code's interaction with your database, but in practice, it does allow for most changes to be made with no more downtime than the server restart itself imposes.
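The "copy all old model data to the new model" step above is typically a data migration. A rough sketch using Django's built-in migration framework (the app and model names are invented; with South the idea is the same, written as a forwards() data migration):

from django.db import migrations

def copy_forward(apps, schema_editor):
    # Use the historical model state, not the live models.
    OldProfile = apps.get_model('accounts', 'OldProfile')
    NewProfile = apps.get_model('accounts', 'NewProfile')
    for old in OldProfile.objects.all().iterator():
        # Skip rows the dual-write code has already copied.
        NewProfile.objects.get_or_create(
            legacy_id=old.pk,
            defaults={'display_name': old.name},
        )

class Migration(migrations.Migration):

    dependencies = [
        ('accounts', '0005_newprofile'),
    ]

    operations = [
        # Reverse is a no-op: the old table still holds the original data.
        migrations.RunPython(copy_forward, migrations.RunPython.noop),
    ]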
Looks like fast database changeovers are the way to go, but they require a dedicated database.
http://devcenter.heroku.com/articles/fast-database-changeovers
Alternatively, here's a tutorial for copying the data from one database (e.g., production) to another database (e.g., staging), doing the schema/data migration (e.g., using django/south), then switching the app to use the newly-updated database instance.
http://devcenter.heroku.com/articles/migrating-data-between-plans
Seems reasonable, but potentially slow if there's a large amount of data.
The recommended method is this:
Add database changes for your new features to your existing code
Make the existing code compatible with the new schema
Deploy
Add the new features to your codebase
Deploy
This means that your database changes are already in place when the code starts to require them.
However....
There are a couple of issues with this. The first is that I know of no development shop organised enough to handle it, since features just get built ad hoc; the second is that you're not really saving anything.
Generally speaking, unless you're making big changes to a massive database, your changes won't take long to apply; they're usually over in a couple of seconds, which a developer can work around quite happily, issuing restarts etc. when needed. The risk is that a user might get an error page. If the changes are larger, you have some alternatives. One is using maintenance mode to turn the site off for a few seconds.
To be honest, there is no clear cut way for how to handle this nicely as by definition your code needs to be in place for your database changes to start. The best way I've found to approach the problem is to look at each change individually and work out the smoothest path for each on a case by case basis.
Rehearsing deployments on a staging environment will mitigate the risk of a deploy going bad, and give you an idea of the impact.
Heroku recently released "buildpacks" which are the scripts they use to set up an environment for your application, from managing dependencies to restarting the instances. Essentially it's a more comprehensive Procfile which you can customize.
You can fork the Python buildpack and modify the script to run in the sequence you want. Append the command you use to run syncdb to the end of bin/steps/django. Commit and push this repo to GitHub.
Unfortunately as of now it's not possible to modify the buildpack of an existing Heroku app, so you'll have to delete it and recreate one that points to your buildpack repo:
heroku create --stack cedar --buildpack git@github.com:...
This is the best solution because it
Doesn't cost anything at all
Doesn't require you to adapt your code to Heroku
Only syncs the db once per deployment
Hope this helps.
Possible Duplicate:
update django database to reflect changes in existing models
I've used Django in the past, and one of the frustrations I've had with it as an ORM tool is the inability to update an existing database with changes in the model. (Hibernate does this very well and makes it really easy to update and heavily modify a model and apply that to an existing database.) Is there a way to do this without wiping the database every time? It gets really old having to regenerate admin users and sites after every change to a model that I'd like to play with.
You will want to look into South. It provides a migrations system to migrate both schema changes as well as data from one version to the next.
It's quite powerful, and the vast majority of changes can be handled simply by running
manage.py schemamigration --auto
manage.py migrate
The auto functionality does have its limits, and especially if the change is eventually going to run on a production system, you should check the code that --auto generated to be sure it's doing what you expect.
South has a great guide to getting started and is well documented. You can find it at http://south.aeracode.org
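For reference, the file that schemamigration --auto writes looks roughly like the sketch below (the model and field names are invented, and the frozen models = {...} dictionary South appends to the file is omitted for brevity):

from south.db import db
from south.v2 import SchemaMigration

class Migration(SchemaMigration):

    def forwards(self, orm):
        # This is the part worth reviewing before it runs against production.
        db.add_column('myapp_article', 'subtitle',
                      self.gf('django.db.models.fields.CharField')(default='', max_length=100),
                      keep_default=False)

    def backwards(self, orm):
        db.delete_column('myapp_article', 'subtitle')

    complete_apps = ['myapp']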
No.
As the documentation of the syncdb command states:
Syncdb will not alter existing tables

syncdb will only create tables for models which have not yet been installed. It will never issue ALTER TABLE statements to match changes made to a model class after installation. Changes to model classes and database schemas often involve some form of ambiguity and, in those cases, Django would have to guess at the correct changes to make. There is a risk that critical data would be lost in the process.

If you have made changes to a model and wish to alter the database tables to match, use the sql command to display the new SQL structure and compare that to your existing table schema to work out the changes.
South seems to be how most people solve this problem, but a really quick and easy way to do this is to change the db directly through your database's interactive shell. Just launch your db shell (usually just dbshell) and manually alter, add, drop the fields and tables you need changed using your db syntax.
You may want to run manage.py sqlall appname to see the sql statements Django would run if it was creating the updated table, and then use those to alter the database tables and fields as required.
The Making Changes to a Database Schema section of the Django book has a few examples of how to do this: http://www.djangobook.com/en/1.0/chapter05/
I manually go into the database - whatever that may be for you: MySQL, PostgreSQL, etc. - to change database info, and then I adjust the models.py accordingly for reference. I know there is Django South, but I didn't want to bother with using another 3rd party application.