I am building a Django app that has several "user spaces" or "client domains", and I am wondering how I can prevent each client from accessing the others' data. So far I have come up with several options:
Distribute one project per client, each with its own database.
Keep one database and one Django instance running (two options):
a) Add a top-level client table whose ID is referenced in every model. But then how do I restrict someone from impersonating someone else's ID and accessing their data? (See the sketch after this list.)
b) Build an authorization layer on top of requests, using a header or a query parameter to determine which data the user can access (but I don't really know how to do that).
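To make option (a) concrete, here is a rough sketch with hypothetical Client/Profile/Document models. The point is that the tenant ID comes from the authenticated session, never from a header or query parameter supplied by the caller, so it cannot be spoofed:

```python
from django.conf import settings
from django.db import models
from django.http import JsonResponse

class Client(models.Model):
    # The top-level tenant table referenced by every other model.
    name = models.CharField(max_length=100)

class Profile(models.Model):
    # Ties each authenticated user to exactly one tenant.
    user = models.OneToOneField(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    client = models.ForeignKey(Client, on_delete=models.CASCADE)

class Document(models.Model):
    # Example of a tenant-owned model.
    client = models.ForeignKey(Client, on_delete=models.CASCADE)
    title = models.CharField(max_length=200)

def list_documents(request):
    # Filter on the session's tenant, not on anything the client sends.
    docs = Document.objects.filter(client=request.user.profile.client)
    return JsonResponse({"documents": list(docs.values("id", "title"))})
```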
What is the state of the art for serving multiple clients from the same Django app? Are my solutions appropriate, and if so, how do I implement them?
I have seen few related posts / articles on the internet, so I'm resorting to asking here.
Thanks in advance!
The problem background
I am working on a Django project that uses PostgreSQL and is hosted on Heroku (also using Heroku Postgres). Over time the amount of data has grown very large, and that slows down the application.
I tried replication in order to read from multiple databases; that helped reduce the queue and connection time, but the big table is still the problem.
I can split the data based on groups of users, since different groups do not need to interact with each other.
I have read up on sharding, but since we use Heroku Postgres it is hard to customize the sharding.
So I have come up with the two ideas below.
1. The app cluster with multi-db
Please see the design here
We can use middleware and database routers to achieve this (a rough sketch is below), but I am not sure how well this plays with Django.
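Here is roughly what I have in mind for design 1. The shard aliases and the group-to-shard mapping are assumptions and would have to match the DATABASES setting; the router methods (db_for_read, db_for_write, allow_relation) are the standard Django router API:

```python
from contextvars import ContextVar

# Set by the middleware on each request, read by the router.
current_shard = ContextVar("current_shard", default="default")

class ShardMiddleware:
    # Must run after AuthenticationMiddleware so request.user is set.
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # Hypothetical attribute mapping the user's group to a DB alias.
        shard = getattr(request.user, "shard_alias", "default")
        token = current_shard.set(shard)
        try:
            return self.get_response(request)
        finally:
            current_shard.reset(token)

class ShardRouter:
    def db_for_read(self, model, **hints):
        return current_shard.get()

    def db_for_write(self, model, **hints):
        return current_shard.get()

    def allow_relation(self, obj1, obj2, **hints):
        # Only allow relations within the same shard.
        return obj1._state.db == obj2._state.db

# settings.py would then register the router:
# DATABASE_ROUTERS = ["myapp.routers.ShardRouter"]
```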
2. The gateway app with multiple sub-apps
Please see the design here
This requires less effort than the previous design.
It would also make region-based sub-apps possible in the future.
My question is: which of the two is more Django-friendly and better for scalability in the long run?
I need to write an API that provides access to data currently served as HTML documents from a web server, and my users need to be able to perform queries over the data.
Say a web site has a page that lists items and their owners, and an additional set of profile pages that give each owner's reputation. An example query I may need to answer is "Give me the IDs and owners of all items submitted in 2013 whose owners have a reputation of at least 10".
Given a query to answer, I need to screen-scrape only the parts of the web site required for answering that query, and ideally cache the obtained information for use with future queries.
I have no problem writing the screen-scraping part, but I am struggling with designing the storage/query/cache part. Is there something about Clojure/Datomic that makes it an especially suitable technology choice for this kind of data processing? I have been pointed in this direction before.
It seems like a nice challenge, but I'm not sure about a few things: a) do you want to expose a Datalog query box to your users, and so make them learn a Datalog-like syntax? b) What exactly do you want to cache: raw DB responses, HTML-formatted text, JSON?
Anyway, I suggest you install and play a little with the Datomic console to get a grasp of it if you haven't before, as it seems to me the closest thing to what you want to achieve at the moment: https://www.youtube.com/watch?v=jyuBnl0XQ6s http://blog.datomic.com/2013/10/datomic-console.html
For the API, I suggest you use http://clojure-liberator.github.io/liberator/ as it provides sane defaults for implementing REST services and lets you focus on your app's behaviour.
I am in the beginning stages of planning a web application using ColdFusion and SQL Server 2012.
After researching the pros and cons of multiple databases (one per customer) versus one large database, I have decided that multiple databases are the best approach for my purposes.
With this in mind, I am now wondering how best to handle logging clients in. I have two thoughts here:
I could use sub-domains, one per client, with the sub-domain also serving as the datasource name.
I could have a single sign-on page, with each client's datasource stored in a universal users table.
I like option 2 best, but I am wondering how it would work in the real world. Requiring every username to be globally unique would not be ideal (although I suppose I could key it off an email address instead of a username).
I was thinking of adding something along the lines of a "company code" that would need to be entered along with the username and password.
I feel that may be asking too much of clients, though.
With all of this said, would you advise option 1 or option 2? I would also love to hear any differing thoughts or ideas.
Thanks!
If you are expecting a large amount of data per client, it may be a good idea to split each client into their own database.
You can create a global database that contains each client's information, datasource, settings, etc., and then set the client's database in Application.cfc.
This also makes it easier later on if a client requests their data or if you want to remove a client from the system.
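The lookup pattern itself is language-agnostic; here is a rough sketch in Python just to illustrate the shape of the global "directory" database (the table, column, and file names are all made up):

```python
import sqlite3

def datasource_for(company_code):
    """Ask the global database which client database serves this tenant."""
    directory = sqlite3.connect("global.db")  # the global client directory
    try:
        row = directory.execute(
            "SELECT datasource FROM clients WHERE company_code = ?",
            (company_code,),
        ).fetchone()
    finally:
        directory.close()
    return row[0] if row else None
```

At request time you would resolve the datasource once (in Application.cfc in the ColdFusion case) and use it for every query in that request.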
I am currently trying to figure out the best practice for designing web services between a Django-administered database (plus images) and a mobile app. My main concern is how to separate a bulk update (sending every record in the database and all the files on the server) from a lighter, smaller update containing only the new and/or modified objects (images or data).
I have had access to a working code base that used a cron job and a state for each data field (new, modified, up to date) to generate either a reference data file or an update file. I found it very redundant and somewhat inelegant, at odds with the DRY spirit of Django (there are tons of lines of code, making it nearly unmaintainable).
I find it very surprising that this aspect is almost undocumented, since web traffic is a crucial matter in mobile development. Fetching all the data every time quickly becomes unsustainable as the database grows.
I would be very grateful for any leads or advice you could give me :-) Thanks in advance!
Just have a last_modified DateTimeField in your table, and a last_synchronized DateTimeField in the user's profile. When the mobile app wants to synchronize, send the data that was modified after the last synchronization run, then update the last_synchronized field in the user's profile.
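A minimal sketch of what I mean, with hypothetical Item and Profile model names (auto_now refreshes last_modified on every save, and JsonResponse's default encoder handles datetimes):

```python
from django.conf import settings
from django.db import models
from django.http import JsonResponse
from django.utils import timezone

class Item(models.Model):
    name = models.CharField(max_length=200)
    last_modified = models.DateTimeField(auto_now=True)  # updated on every save

class Profile(models.Model):
    user = models.OneToOneField(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    last_synchronized = models.DateTimeField(null=True)  # None means never synced

def sync(request):
    profile = request.user.profile
    qs = Item.objects.all()
    if profile.last_synchronized is not None:
        # The light update: only objects touched since the last sync.
        qs = qs.filter(last_modified__gt=profile.last_synchronized)
    payload = list(qs.values("id", "name", "last_modified"))
    profile.last_synchronized = timezone.now()
    profile.save(update_fields=["last_synchronized"])
    return JsonResponse({"items": payload})
```

The first call (while last_synchronized is still None) naturally becomes the bulk update, and every later call is incremental.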
I want to trace users' actions on my web site by logging their requests to the database as plain text in Django.
I am considering writing a custom decorator and placing it on every view that I want to trace.
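Roughly what I have in mind (the RequestLog model and the trace decorator are hypothetical):

```python
import functools
from django.db import models
from django.utils import timezone

class RequestLog(models.Model):
    path = models.CharField(max_length=500)
    ip = models.GenericIPAddressField(null=True)
    params = models.TextField(blank=True)  # e.g. search keywords or an item ID
    created = models.DateTimeField(default=timezone.now)

def trace(view):
    @functools.wraps(view)
    def wrapper(request, *args, **kwargs):
        # One log row per traced request.
        RequestLog.objects.create(
            path=request.path,
            ip=request.META.get("REMOTE_ADDR") or None,
            params=request.META.get("QUERY_STRING", ""),
        )
        return view(request, *args, **kwargs)
    return wrapper
```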
However, I have some trouble with the design.
First of all, is such a logging mechanism reasonable, or will the rapidly growing log table cause performance problems?
Secondly, how should my log table be designed?
I want to keep the search keywords if the user calls the search view, and the item's ID if the user calls the item detail view.
Also, users' IP addresses should be kept, but how can I separate users who connect via a single IP address, as in many companies?
I am glad to explain in more detail if you think my question is unclear.
Thanks
I wouldn't do that. If this is a production service then you've got a proper web server running in front of it, right? Apache, or nginx, or something. That can do the logging, and do it well, writing in a form that won't bloat your database, and there is a wealth of analytical tooling for log analysis.
You are going to have to duplicate a lot of that functionality in your decorator, such as switching it on or off or changing the log level. The only thing you gain by doing it all in Django is the possibility of ultra-fine control, such as only logging views of blog posts with IDs greater than X. But generally you wouldn't want that level of detail; you'd log everything and do any stripping at the analysis stage. You haven't given any reason so far why you need to do this from Django.
If you really want the logs in an RDBMS, reading an Apache log file into Postgres or MySQL (or one of those expensive ones) is fairly trivial; a rough sketch is below.
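For example, in Python with psycopg2, assuming the access_log table already exists and using a simplified regex for Apache's combined log format:

```python
import re
import psycopg2

# Simplified pattern for Apache's combined log format.
LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

conn = psycopg2.connect("dbname=logs")  # placeholder DSN
cur = conn.cursor()
with open("access.log") as f:
    for line in f:
        m = LINE.match(line)
        if m:
            cur.execute(
                "INSERT INTO access_log (ip, ts, request, status) "
                "VALUES (%s, to_timestamp(%s, 'DD/Mon/YYYY:HH24:MI:SS'), %s, %s)",
                # Drop the timezone offset to match the format string.
                (m["ip"], m["ts"].split()[0], m["request"], int(m["status"])),
            )
conn.commit()
conn.close()
```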
One thing you should keep in mind is that SQL databases don't offer very good write performance (compared with reads), so if you are experiencing heavy load you should probably look at an in-memory solution (e.g. a key-value store like Redis).
But keep in mind, especially if you use a non-SQL solution, that you should know what you want to do with the collected data: just display something like a log, or do more in-depth searching/querying on it.
If you want to distinguish different users coming from the same IP address, you should probably look at a cookie-based solution (if you are using Django's session framework, sessions are identified through a cookie by default, so you could simply use sessions; a sketch is below). Another option is to do the logging 'asynchronously' via JavaScript after the page has loaded in the browser, which gives you more ways to identify the user and avoids extra load when generating the page.
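A rough sketch of the Redis variant as middleware (the key layout and logged fields are just placeholders):

```python
import json
import time
import redis

r = redis.Redis()  # assumes a local Redis instance

class RedisLogMiddleware:
    # Must run after SessionMiddleware so request.session is available.
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        entry = {
            "path": request.path,
            "ip": request.META.get("REMOTE_ADDR", ""),
            # The session key distinguishes users behind a shared company IP.
            "session": request.session.session_key or "",
            "ts": time.time(),
        }
        # One Redis list per day; cheap appends, easy bulk export later.
        r.rpush(time.strftime("log:%Y-%m-%d"), json.dumps(entry))
        return self.get_response(request)
```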