Django Accessing external Database to get data into project database - django

i'm looking for a "best-practice" guide/solution to the following situation.
I have a Django project with a MySql DB which i created and manage. I have to import data, every 5 minutes, from a second (external, not managed by me) db in order to do some actions. I have read rights for the external db and all the necessary information.
I have read the django docs regarding the usage of multiple database: register the db in settings.py, migrate using the --database flag, query/access data by routing to the db (short version) and multiple question on this matter on stackoverflow.
So my plan is:
Register the second database in settings.py, use inspectdb to add to the model, migrate, define a method which reads data from the external db and add it to the internal (own) db.
However I do have some questions:
Do i have to register the external db if i don't manage it?
(Most probably yes in order to use ORM or the cursors to access the data)
How can i migrate the model if I don't manage the DB and don't have write permissions? I also don't need all the tables (around 250, but only 5 needed).
(is fake migration an option worth considering? I would use inspectdb and migrate only the necessary tables.)
Because I only need to retrieve data from the external db and not to write back, would it suffice to have a method that constantly gets the latest data like the second solution suggested in this answer
Any thoughts/ideas/suggestions are welcomed!

I would not use Django's ORM for it, but rather just access the DB with psycopg2 and SQL, get the columns you care about into dicts, and work with those. Otherwise any minor change to that external DB's tables may break your Django app, because the models don't match anymore. That could create more headaches than an ORM is worth.

Related

Am I required to use django data migrations?

I migrated a database that was manipulated via SQL to use Django's migration module: https://docs.djangoproject.com/en/3.0/topics/migrations/.
Initially I thought of using migrations only for changes in the model such as changes in columns or new tables, performing deletes, inserts and updates through SQL, as they are constant and writing a migration for each case would be a little impractical.
Could using the migration module without using data migrations cause me some kind of problem or inconsistency in the future?
You can think about data migration if you made a change that required also a manual fix on the data.
Maybe you decided to normalize something in the database, e.g. to split a name column to the first name and the last name. If you have only one instance of the application with one database and you are the only developer then you will also not write a data migration, but if you want to change it on a live production site with 24 x 7 hours traffic or you cooperate with other developers then you probably prepare a data migration for their databases or you will thoroughly test the migration on a copy of live data that the update will work on the production site correctly without issues and with minimal shutdown. If you don't write a data migration and you had no problem immediately then it is OK and will be not worse then a possible ill-conceived data migration.

manipulating Django database directly

I was wondering this could produce any problem if I directly add rows or remove some from a model table. I thought maybe Django records the number of rows in all tables? or this could mess up the auto-generated id's?
I don't think it matters but I'm using MySql.
No, it's not a problem because Django does the same that you do "directly" to the database, it execute SQL statements, and the auto generated id is handled by the database server (MySql server in this case), no matter where that SQL queries comes from, whatever it is Mysql Client or Django.
Since you can have Django work on a pre-existing database (one that wasn't created by Django), I don't think you will have problems if you access/write the tables of your own app (you might want to avoid modifying Django's internal tables like auth, permission, content_type etc until you are familiar with them)
When you create a model through Django, Django doesn't store the count or anything (unless your app does), so it's okay if you create the model with Django on the SQL database, and then have another app write/read from that same SQL table
If you use Django signals, those will not be triggered by modifying the SQL table directly through the DB, so you might want to pay attention to side effects like that.
Your RDBMS handles it's own auto generated IDs and referential integrity, counts etc, so you don't have to worry about messing it up.

Modify the django models

I just tested it myself. I had Django models, and there have already been instances of the models in the database.
Then I added a dummy integer field to a model and ran manage.py syncdb. Checked the database, and nothing happened to the table. I don't see the extra field added in.
Is this the expected behavior? What's the proper way of modifying the model, and how will that alter the data that's already in the database?
Django will not alter already existing tables, they even say so in the documentation. The reason for this is that django can not guarantee that there will be no information lost.
You have two options if you want to change existing tables. Either drop them and run syncdb again, but you will need to store your data somehow if you want to keep it. The other options is to use a migrations tool to do this for you. Django can show you the SQL for the new database schema and you can diff that to the current version of the database to create the update script.
You could even update your database mannually if it is a small change and you don't want to bother with migrations tools, though I would recommend to use one.
Please use south for any kind of changes to get reflected to your database tables,
here goes the link for using south
Link for South documentation

Django switch from sqlite3 to Postgresql issues

I've recently changed the database server on my project from sqlite3 to Postgresql and I have a few questions that I hope will give an answer to my issues.
I understand that switching from sqlite to Postgres implies that I create the new database and the tables inside it, right? I've done that but I haven't seen any new files created in my project to show me that the database I've made is visible. (Btw, I've changed the database name in settings.py)
I probably should mention that I'm working in a virtual environment and I would like to know if that affects my references in any way. I've tried to import the tables in Django to try and count the number of records in a table but I get the error: "No module named psdemo". (psdemo is my database name and i'm trying to import the table with:
from ps.psdemo import Product
where ps is my application, psdemo is my database and Product the table in the database.
In conclusion I'm trying to get access to my database and tables but I can't manage to find them. I repeat, there is no new database file in my project or in my virtual environment (I've searched thoroughly) but if I use a terminal connection I can connect to my virtual environment and change directories to get to the application folder then if I connect to the Postgresql server I can create the database, the tables and can Insert into them, make queries etc, but I cannot access them from the Django code.
I understand that switching from sqlite to Postgres implies that I create the new database and the tables inside it, right? I've done that but I haven't seen any new files created in my project to show me that the database I've made is visible. (Btw, I've changed the database name in settings.py)
All you have to do with postgres is create the database. Not the tables. Django will create the tables, and anything else it thinks are useful, once you call syncdb.
You won't have any new files in your project like you did in sqlite. If you want to view your database, you should download and install pgadminIII (which I would recommend in any event)
I probably should mention that I'm working in a virtual environment and I would like to know if that affects my references in any way. I've tried to import the tables in Django to try and count the number of records in a table but I get the error: "No module named psdemo". (psdemo is my database name and i'm trying to import the table with:
Here, you import models via normal python syntax and it then references your tables. Each model should represent a single table. You define your models first, and then call
python manage.py syncdb
In conclusion I'm trying to get access to my database and tables but I can't manage to find them.
See above, but you should definitely read about postgres installation from the postgres docs, and read the psycopg2 docs as well as the Django docs for setting up a postgres database.
I understand that switching from sqlite to Postgres implies that I
create the new database and the tables inside it, right? I've done
that but I haven't seen any new files created in my project to show me
that the database I've made is visible. (Btw, I've changed the
database name in settings.py)
Database files are not created in the project directory with postgresql. They are created in the database server data directory (like /var/lib/postgres it depends on the distribution). You should generally query it through a PostgreSQL client that connects to the PostgreSQL server rather than messing with the files directly.
You can for example run command:
manage.py dbshell
As to your first issue, see #jpic's answer.
On your second issue, your database is not a package, and you do not import models from your database. If you were able to import your models correctly before you made any changes, change your import statements back to how they were.

Django unit-testing with loading fixtures for several dependent applications problems

I'm now making unit-tests for already existing code. I faced the next problem:
After running syncdb for creating test database, Django automatically fills several tables like django_content_type or auth_permissions.
Then, imagine I need to run a complex test, like check the users registration, that will need a lof ot data tables and connections between them.
If I'll try to use my whole existing database for making fixtures (that would be rather convinient for me) - I will receive the error like here. This happens because, Django has already filled tables like django_content_type.
The next possible way is to use django dumpdata --exclude option for already filled with syncdb tables. But this doesn't work well also, because if I take User and User Group objects from my db and User Permissions table, that was automatically created by syncdb, I can receive errors, because the primary keys, connecting them are now pointing wrong. This is better described here in part 'fixture hell', but the solution shown there doensn't look good)
The next possible scheme I see is next:
I'm running my tests; Django creates test database, makes syncdb and creates all those tables.
In my test setup I'm dropping this database, creating the new blank database.
Load data dump from existing database also in test setup
That's how the problem was solved:
After the syncdb has created the test database, in setUp part of the tests I use os.system to access shell from my code. Then I'm just loading the dump of the database, which I want to use for tests.
So this works like this: syncdb fills contenttype and some other tables with data. Then in setUp part of tests loading the sql dump clears all the previously created data and i get a nice database.
May be not the best solution, but it works=)
My approach would be to first use South to make DB migrations easy (which doesn't help at all, but is nice), and then use a module of model creation methods.
When you run
$ manage.py test my_proj
Django with South installed with create the Test DB, and run all your migrations to give you a completely updated test db.
To write tests, first create a python module calle, test_model_factory.py In here create functions that create your objects.
def mk_user():
User.objects.create(...)
Then in your tests you can import your test_model_factory module, and create objects for each test.
def test_something(self):
test_user = test_model_factory.mk_user()
self.assert(test_user ...)