Django multitenant architecture options: how do they affect database performance?

I'm designing a website where data security is a concern.
I've read this book: https://books.agiliq.com/projects/django-multi-tenant/en/latest/index.html
I'm still thinking about the right database structure for users.
I'm hesitating between a shared database with isolated schemas, isolated databases with a shared app, and completely isolated tenants using Docker.
Since data security is a concern, I would like to avoid putting all users in the same table, or even in different schemas of the same database. However, I don't fully understand whether I should give each user a separate database (for example, one SQLite database per user, though I don't know how well that would work alongside Postgres). What is the best practice here in terms of security?
I'm wondering how these options affect database performance compared to a shared database with a shared schema, which was the basic configuration in the Django course I attended.
I don't know databases well, so help on the performance question would be much appreciated!
Also, if I want to compute statistics across tenants' data, how difficult is it to query completely isolated tenants (Docker containers or isolated databases), in particular if each user has a separate container or database?
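For illustration, the "isolated databases with a shared app" option maps onto Django's multiple-database support roughly as follows. This is a minimal sketch, assuming one Postgres database per tenant; the tenant aliases, the set_current_tenant helper, and the middleware hook are hypothetical, not part of any library.

```python
# settings.py -- one alias per tenant plus a default (tenant names are
# hypothetical examples; each alias could also point at a separate server).
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "shared",
    },
    "tenant_alpha": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "tenant_alpha",
    },
    "tenant_beta": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "tenant_beta",
    },
}
DATABASE_ROUTERS = ["routers.TenantRouter"]

# routers.py
from threading import local

_state = local()  # remembers the tenant chosen for the current request

def set_current_tenant(alias):
    """Hypothetical helper, called from middleware once the tenant is known."""
    _state.alias = alias

class TenantRouter:
    """Route every read and write to the current tenant's own database."""

    def db_for_read(self, model, **hints):
        return getattr(_state, "alias", None)  # None falls back to "default"

    def db_for_write(self, model, **hints):
        return getattr(_state, "alias", None)
```

Cross-tenant statistics then mean looping over the aliases (e.g. Model.objects.using(alias) per tenant) and aggregating in Python, since the ORM cannot join across databases; that is exactly the query-difficulty trade-off the question raises.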

Related

Using Amazon Redshift for analytics for a Django app with Postgresql as the database

I have a working Django web application that currently uses Postgresql as the database. Moving forward, I would like to perform some analytics on the data and also generate reports etc. I would like to use Amazon Redshift as the data warehouse for these goals.
In order not to affect the performance of the existing Django web application, I was thinking of writing a NEW Django application that would leverage a READ-ONLY replica of the Postgresql database and continuously write data from that replica to Amazon Redshift. My thinking is that the NEW Django application could handle some or all of the Extract, Transform and Load functions.
My questions are as follows:
1. Does the Django ORM work well with Amazon Redshift? If yes, how does one handle the model schema translations? Any pointers in this regard would be greatly appreciated.
2. Is there any better alternative to achieve the goals listed above?
Thanks in advance.
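For what it's worth, the read-replica half of this plan maps directly onto Django's multiple-database support. A minimal sketch, assuming a Postgres replica (aliases, hosts and the Event model are placeholders, not a confirmed recipe):

```python
# settings.py of the hypothetical reporting/ETL project: a second alias
# points at the read-only replica, so extracts never touch the primary.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "app",
        "HOST": "primary.example.com",
    },
    "replica": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "app",
        "HOST": "replica.example.com",
    },
}

# etl.py -- the extract step pins its queries to the replica alias.
from myapp.models import Event  # hypothetical model

def extract_events():
    return Event.objects.using("replica").values("id", "user_id", "created_at")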

Multiple databases - each database writes to itself and reads from itself

I tried to Google as much as I could but couldn't find what I was looking for.
So the idea is: there is one Master database that handles user authentication. Then there are many other databases, each keeping its information (its users, files and so on) to itself: each writes to and reads from itself only, but the Master can reach them all. And if the Master makes a change to, say, the database structure, all fields should change everywhere, but the data in those databases should stay (although the Master can change any of that data too).
It's like multi-master, except I don't want the other "masters" to be able to reach each other's databases; each should only write to itself.
Any tips?
The question is not clear, but if you want to use multiple databases with Django, look at THIS: https://docs.djangoproject.com/en/2.1/topics/db/multi-db/
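To sketch how that multi-db machinery could express "auth on the Master, everything else local": a minimal example under assumptions (the aliases are made up, and each node would configure its own local alias).

```python
# settings.py on one node: "default" is the Master (authentication),
# "node1" is this node's own database (hypothetical names).
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "master",
    },
    "node1": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "node1",
    },
}
DATABASE_ROUTERS = ["routers.MasterNodeRouter"]

# routers.py
class MasterNodeRouter:
    """Auth lives on the Master; all other models stay on the local node."""

    def db_for_read(self, model, **hints):
        return "default" if model._meta.app_label == "auth" else "node1"

    def db_for_write(self, model, **hints):
        return "default" if model._meta.app_label == "auth" else "node1"

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Allow schema changes on every database, so a structure change
        # made centrally propagates to all nodes, as the question requires.
        return True
```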

Django two projects on two separate servers with one database

I have a Django application that does quite expensive computation and then populates a database. I would like to serve the results using Django Rest Framework. I would like to keep the application and the API on separate server instances so that I can ensure the outgoing API is never starved of resources. This means both projects will use the same tables in a shared database. Is there a way to achieve this without at least one of the projects having to perform raw SQL queries? That is, can I share both the database and the models (with querysets etc.) between two projects on separate servers? Or is that not possible?
Note: I've looked at a lot of similar questions but the answers were not useful.
Thanks!
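One pattern that seems to match what is being asked here, offered as a sketch rather than a confirmed answer (all names are hypothetical): put the models in a small reusable app that both projects install, and point both settings files at the same database.

```python
# shared_models/models.py -- packaged (e.g. via pip) and added to
# INSTALLED_APPS in BOTH projects, so each gets identical ORM models
# and querysets with no raw SQL.
from django.db import models

class Result(models.Model):
    value = models.FloatField()
    created = models.DateTimeField(auto_now_add=True)

# Both projects' settings.py then share one database:
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "shared",
        "HOST": "db.example.com",  # placeholder host
    }
}
# Only one of the two projects should run migrations, so the two
# deployments never race over the same schema.
```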

How to move from one database backend to another in a production Django project?

I would like to move a Django project's database from one backend to another (in this case Azure SQL to Postgresql, but I want to treat it as a generic situation). I can't use a dump since the backends are different.
I was thinking of something at the Django level, like dumpdata, but depending on the available memory and the size of the database it can be unreliable and crash.
I have seen solutions that try to break the process into smaller parts the memory can handle, but they were from a few years ago, so I was hoping to find other solutions.
So far my searches have failed, since they always lead to South, which deals with schema migrations, not with moving data.
I have not implemented this before, but what about the following:
Django supports multiple databases, so just configure DATABASES in your settings file with both the old Azure SQL database and the new Postgresql database. Then create a small script that uses bulk_create, reading the data from one DB and writing it to the other, as sketched below.
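A minimal sketch of such a script, assuming both aliases are already configured in DATABASES (the names "azure" and "default" are placeholders):

```python
# migrate_data.py -- copy one model at a time in fixed-size batches,
# so memory stays bounded (unlike dumpdata on the whole database).
from django.apps import apps

BATCH = 1000

def copy_model(model, source="azure", target="default"):
    """Stream rows from the source alias and bulk-insert into the target."""
    buffer = []
    qs = model.objects.using(source).order_by("pk")
    for obj in qs.iterator(chunk_size=BATCH):
        buffer.append(obj)  # pk is kept, so foreign keys stay valid
        if len(buffer) >= BATCH:
            model.objects.using(target).bulk_create(buffer)
            buffer = []
    if buffer:
        model.objects.using(target).bulk_create(buffer)

# Models must be copied in dependency order (parents before children)
# so foreign keys resolve; apps.get_models() does not guarantee that.
for model in apps.get_models():
    copy_model(model)
```

After the copy, Postgres sequences will need resetting (manage.py sqlsequencereset), because the rows were inserted with explicit primary keys.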

Data Warehouse and Django

This is more of an architectural question than a technological one per se.
I am currently building a business website/social network that needs to store large volumes of data and use that data to draw analytics (consumer behavior).
I am using Django and a PostgreSQL database.
Now my question is: I want to expand this architecture to include a data warehouse. The ideal would be: the operational DB would be the current Django PostgreSQL database, and the data warehouse would be something additional, preferably in a multidimensional model.
We are still in a very early phase, we are going to test with 50 users, so something primitive such as a one-column table for starters would be enough.
I would like to know if somebody has experience with this situation and could recommend a framework for creating a data warehouse, while keeping the operational DB with the Django models for ease of use (if possible).
Thank you in advance!
Here are some cool Open Source tools I used recently:
Kettle - great ETL tool, you can use this to extract the data from your operational database into your warehouse. Supports any database with a JDBC driver and makes it very easy to build e.g. a star schema.
Saiku - nice Web 2.0 frontend built on Pentaho Mondrian (MDX implementation). This allows your users to easily build complex aggregation queries (think Pivot table in Excel), and the Mondrian layer provides caching etc. to make things go fast.
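For readers who haven't met the term: a star schema is one central fact table of measures surrounded by the dimension tables it references. Expressed as (hypothetical) Django models, it might look like this:

```python
from django.db import models

class DateDim(models.Model):       # dimension: one row per calendar day
    date = models.DateField(unique=True)

class ProductDim(models.Model):    # dimension: descriptive attributes
    name = models.CharField(max_length=100)
    category = models.CharField(max_length=100)

class SalesFact(models.Model):     # fact table: one FK per dimension + measures
    date = models.ForeignKey(DateDim, on_delete=models.PROTECT)
    product = models.ForeignKey(ProductDim, on_delete=models.PROTECT)
    quantity = models.IntegerField()
    revenue = models.DecimalField(max_digits=12, decimal_places=2)
```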
My answer does not necessarily apply to data warehousing. In your case I see the possibility of implementing a NoSQL database alongside your OLTP relational storage, which in this case is PostgreSQL.
Why consider NoSQL? Beyond the obvious scalability benefits, NoSQL databases offer a number of advantages that will probably apply to your scenario: for instance, the flexibility of records with different sets of fields, and key-based access.
Since you're still in the "trial" stage, you might find it easiest to choose a NoSQL database based on your hosting provider. For instance, AWS has SimpleDB, Google App Engine provides its own Datastore, etc. There are also plenty of other NoSQL solutions with nice Python bindings.