Django running a migration on multi-tenant DB - django

On a Django app with some self-made (but close to available plugin methods) multi-tenant implementation, I would like to run a migration (a simple add_column this time) with South that could apply on all schemas. I have a configuration very close to this one.
I would like to skip any pure SQL queries if possible. I can get a list of the schemas name from the ORM properly, but then I wonder if I have the possibility to access the tables from the various schemas in a somehow propre way.
I have a hook to be able to change the DB_HOST and DB_SCHEMA via parameters at some level, but I think not to be able to loop cleanly this way inside the forwards migration method of South.
This question is quite high-level, but I mainly wonder if somebody had to face the same kind of question and I am curious to know if there is some clever way to handle it !
Regards,
Matt

This is an outline of a solution, as posted on the South mailing list. The question as phrased is a little different from the one that was posted on the list: There, it was also mentioned that there are "common" tables, shared between all tenants, in a separate schema. Rmatt's own answer refers to this as the public schema.
The basic idea of my solution: Save the migration history for each database (schema) in the schema. To do this, we need to employ some database and Django tricks.
This implies that history records for migrations of apps on the public schema are saved in the public schema, while history for migrations of tenant apps is saved in the tenant schema -- effectively sharding the migration history table. Django does not really support this kind of sharding; it is easy enough to set up the writing by instance content, but there's no way to set up the reading.
So I suggested to create, per tenant, a "tenant-helper" schema, containing one view, named south_migrationhistory, which is a union of the south_migrationhistory tables from the public and tenant schemata. Then, set up a database router for the South MigrationHistory model, instructing it to:
syncdb to both public and tenant schemata
read always from tenant-helper schema
write to public or tenant schema, according to the app the migration belongs to
The result allows proper treatment of dependencies from the tenant app migrations to the public app migrations; and it means all you need to do, to migrate forwards, is loop on the migrate --all (or syncdb --migrate) command -- no need to fake backward migrations. The migrations for the public schema will be run with the migrations for the first tenant in the loop, and all other tenants will "see" them.
As an afterthought, it is probably possible to do this also without a helper schema - by renaming the south_migrationhistory table in the tenant schema, and installing a view with that name in the schema that returns the above-mentioned union when queried, and has an "instead-of insert" trigger to write to the renamed table.

Fine, not so many people seem to have experience or to be concerned with this quite specific problem. I have tried some things here and there and I also got some support from the South mailing-list that helped me to understand some points.
Basically, the solution I implemented is the following:
I have a quite normal migration file autogenerated via South's schemamigration. But I have changed the table name of the add_column and delete_column to schema.table_name. The schema is provided by importing the multi-tenant middleware.
The migration is then applied only if the schema is not run against the public schema. It is actually not meant to be run standalone, or only with database and schema kwargs, but rather from a migration runner that is a new django command.
The runner has unfortunately to call the migration externally, in order to go through the middleware each time again. One other trick is that we have to get the previous state of migration, in order to fake it back to the previous state for south after each tenant migration.
Here is my snippet :
from subprocess import call
import os
from django.core.management.base import BaseCommand
from south.models import MigrationHistory
from myapp.models import MyModel
class Command(BaseCommand):
def handle(self, *args, **options):
#the only allowed arg is the prefix version and it should have a length of 4 (i.e. 0002)
applied = MigrationHistory.objects.filter(app_name='myapp').latest('applied')
current_version = applied.migration[:4]
call_args = ['python', os.path.join('bin', 'manage.py'), 'migrate', 'myorderbird.app.backups']
if len(args) == 1 and len(args[0]) == 4:
call_args.append(args[0])
obje_call_args = None
for obje in MyModel.objects.all():
if obje.schema_exists:
# fake the migration of the previous venue back to the current version
if obje_call_args:
obje_call_args = obje_call_args[:4] + [current_version, '--fake'] + obje_call_args[len(obje_call_args)-3:]
call(obje_call_args)
# migrate the venue in the loop
obje_call_args = list(call_args)
obje_call_args.extend(['--database={}'.format(obje.db), '--schema={}'.format(obje.schema)])
call(venue_call_args)

Related

Django - Reading from live Database, created by separate application

I have an sqlite database located at /home/pi/Desktop/Databsaes/data.db and wish to access it from my models.py script.
To view table contents in a normal command prompt I would execute:
sqlite3
sqlite> .open data.db
sqlite> SELECT * from table1
I have been reading through this official tutorial, but I do not understand how to access my local db and perform the above.
In SQL terms, a QuerySet equates to a SELECT statement,
... but how can I perform something of the sort directly in my models.py script?
Models.py is still untouched:
from __future__ import unicode_literals
from django.db import models
# Create your models here
From what I am gathering, field lookups may be used, which specify arguments for QuerySet methods (such as get()).
EDIT
I followed this tutorial to import the database into Django.
However my question is: Will new data added to this database by a separate process be visible from Django's side after the import?
If you do want to allow Django to manage the table’s lifecycle, you’ll need to change the managed option above to True (or simply remove it because True is its default value).
I don not know whether "manage the table's lifecycle" means to update the database with newer data once this is added.
I think the statement of managed=False might be the main point that confusing you. Here is the description on Model._meta.managed:
If False, no database table creation or deletion operations will be performed for this model. This is useful if the model represents an existing table or a database view that has been created by some other means. This is the only difference when managed=False. All other aspects of model handling are exactly the same as normal. This includes
This means new migrations will not be generated (through makemigrations) for the modification of models schema with managed=False. This implies that you are telling Django, "I'm not going to change these model's schema through Django" (But through other way, maybe through another service).
Note that all we were talking about are just the effect on schema changed, which is nothing to do with your real data. After the link of your database and Django model has been established, just as #DanielRoseman's comments, any data that is there will be visible to Django on each query.
Since according to your statement the question is for newly-added data, the answer should be yes. But, if you were meaning that new tables are created through other service (not through the Django service above), of course you still have to add the corresponding model to Django (with managed=False) then you will be able to access data through Django.

How to create a new model (kind) in google cloud Datastore

I am using Google Cloud Datastore(not NDB) for my project.
python2.7 and Django.
I want to create a new model, lets say Tag model.
class Tag(db.Model):
name = ndb.StringProperty()
feature = ndb.StringProperty(default='')
I have added property to a model many times, but not yet created new model.
My question is when I have changed model schema in Django for my another project using mySQL, I always executed manage.py migrate.
Do I have to execute the migration command for Datastore as well?
Or just defining the model is all I have to do?
Thanks in advance!
Unlike SQL databases like MySQL, Cloud Datastore doesn't require you to create kinds (similar to tables) in advance. Other than defining it in your code, no admin steps are required to create the kind.
When you write the first entity of that kind, it's created implicitly for you.
You can even query for kinds that don't exist yet without an error, you'll just get no entities back:
Of course you have to migrate, except if you are using the same database from the another project. Anyway if you type migrate it will create the tables from your models but if you are working with a existing database nothing is going to happen

Generally, when does Django model change require South?

I am new to Django. I have completed the tutorial and am reading the documentation for more learning. As I try to add to my understanding, say, in new Managers or ModelForms I am curious as to what needs South (or even just scrapping it and rewriting the app).
update django database to reflect changes in existing models
The link above says basically that any column change it is necessary, while the link below is more of what I am asking. Can someone generalize when it is not needed (eg: Adding a new Form/ModelForm based on an existing Model? Adding a Manager?) If no changes are made to the columns of the database South is not then not necessary?
Does changing a django models related_name attribute require a south migration?
There are two types of south migrations: schema and data.
Data migrations are used to change data in the DB and not the schema of the DB.
The schema migrations are the ones you are interested in. They are used to keep track of changes to the DB schema and one should accompany any changes to your models that result in DB schema change (create table, drop table, drop column, change null constrain e.g.)
Some great insights might be found if you read two consecutive migrations for a django app.
In each of them you can find code that applies the migreation, code that reverts the migrations and a snapshot of the DB schema.
P.S. It is quite easy to check if south migration is needed for a particular change in your models. Just run a schemamigration for the modified django app and delete the newly created migration if such was created. As creating a south migration is different that running it, this is a great way to test and learn.
Keep in mind that south is a piece of software like any other and it does 'support' bugs.
Related_name attribute changes only affect your project and django uses it to make queries.
Changes like blank = True/False, null = True/False, symmetrical = True/False require database changes although symmetrical = True/False does not trigger update by south, but the setting definitely makes difference at field creation.
Column changes, like the link in your post shows, require updates in database and that's what south does very good.

Django unit-testing with loading fixtures for several dependent applications problems

I'm now making unit-tests for already existing code. I faced the next problem:
After running syncdb for creating test database, Django automatically fills several tables like django_content_type or auth_permissions.
Then, imagine I need to run a complex test, like check the users registration, that will need a lof ot data tables and connections between them.
If I'll try to use my whole existing database for making fixtures (that would be rather convinient for me) - I will receive the error like here. This happens because, Django has already filled tables like django_content_type.
The next possible way is to use django dumpdata --exclude option for already filled with syncdb tables. But this doesn't work well also, because if I take User and User Group objects from my db and User Permissions table, that was automatically created by syncdb, I can receive errors, because the primary keys, connecting them are now pointing wrong. This is better described here in part 'fixture hell', but the solution shown there doensn't look good)
The next possible scheme I see is next:
I'm running my tests; Django creates test database, makes syncdb and creates all those tables.
In my test setup I'm dropping this database, creating the new blank database.
Load data dump from existing database also in test setup
That's how the problem was solved:
After the syncdb has created the test database, in setUp part of the tests I use os.system to access shell from my code. Then I'm just loading the dump of the database, which I want to use for tests.
So this works like this: syncdb fills contenttype and some other tables with data. Then in setUp part of tests loading the sql dump clears all the previously created data and i get a nice database.
May be not the best solution, but it works=)
My approach would be to first use South to make DB migrations easy (which doesn't help at all, but is nice), and then use a module of model creation methods.
When you run
$ manage.py test my_proj
Django with South installed with create the Test DB, and run all your migrations to give you a completely updated test db.
To write tests, first create a python module calle, test_model_factory.py In here create functions that create your objects.
def mk_user():
User.objects.create(...)
Then in your tests you can import your test_model_factory module, and create objects for each test.
def test_something(self):
test_user = test_model_factory.mk_user()
self.assert(test_user ...)

Can I use a database view as a model in Django?

i'd like to use a view i've created in my database as the source for my django-view.
Is this possible, without using custom sql?
******13/02/09 UPDATE***********
Like many of the answers suggest, you can just make your own view in the database and then use it within the API by defining it in models.py.
some warning though:
manage.py syncdb will not work anymore
the view need the same thing at the start of its name as all the other models(tables) e.g if your app is called "thing" then your view will need to be called thing_$viewname
Just an update for those who'll encounter this question (from Google or whatever else)...
Currently Django has a simple "proper way" to define model without managing database tables:
Options.managed
Defaults to True, meaning Django will create the appropriate database tables in syncdb and remove them as part of a reset management command. That is, Django manages the database tables' lifecycles.
If False, no database table creation or deletion operations will be performed for this model. This is useful if the model represents an existing table or a database view that has been created by some other means. This is the only difference when managed is False. All other aspects of model handling are exactly the same as normal.
Since Django 1.1, you can use Options.managed for that.
For older versions, you can easily define a Model class for a view and use it like your other views. I just tested it using a Sqlite-based app and it seems to work fine. Just make sure to add a primary key field if your view's "primary key" column is not named 'id' and specify the view's name in the Meta options if your view is not called 'app_classname'.
The only problem is that the "syncdb" command will raise an exception since Django will try to create the table. You can prevent that by defining the 'view models' in a separate Python file, different than models.py. This way, Django will not see them when introspecting models.py to determine the models to create for the app and therefor will not attempt to create the table.
I just implemented a model using a view with postgres 9.4 and django 1.8.
I created custom migration classes like this:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
('myapp', '0002_previousdependency'),
]
sql = """
create VIEW myapp_myview as
select your view here
"""
operations = [
migrations.RunSQL("drop view if exists myapp_myview;"),
migrations.RunSQL(sql)
]
I wrote the model as I normally would. It works for my purposes.
Note- When I ran makemigrations a new migration file was created for the model, which I manually deleted.
Full disclosure- my view is read only because I am using a view derived from a jsonb data type and have not written an ON UPDATE INSTEAD rule.
We've done this quite extensively in our applications with MySQL to work around the single database limitation of Django. Our application has a couple of databases living in a single MySQL instance. We can achieve cross-database model joins this way as long as we have created views for each table in the "current" database.
As far as inserts/updates into views go, with our use cases, a view is basically a "select * from [db.table];". In other words, we don't do any complex joins or filtering so insert/updates trigger from save() work just fine. If your use case requires such complex joins or extensive filtering, I suspect you won't have any problems for read-only scenarios, but may run into insert/update issues. I think there are some underlying constraints in MySQL that prevent you from updating into views that cross tables, have complex filters, etc.
Anyway, your mileage may vary if you are using a RDBMS other than MySQL, but Django doesn't really care if its sitting on top of a physical table or view. It's going to be the RDBMS that determines whether it actually functions as you expect. As a previous commenter noted, you'll likely be throwing syncdb out the window, although we successfully worked around it with a post-syncdb signal that drops the physical table created by Django and runs our "create view..." command. However, the post-syncdb signal is a bit esoteric in the way it gets triggered, so caveat emptor there as well.
EDIT: Of course by "post-syncdb signal" I mean "post-syncdb listener"
From Django Official Documentation, you could call the view like this:
#import library
from django.db import connection
#Create the cursor
cursor = connection.cursor()
#Write the SQL code
sql_string = 'SELECT * FROM myview'
#Execute the SQL
cursor.execute(sql_string)
result = cursor.fetchall()
Hope it helps ;-)