Django + PostgreSQL with bi-directional replication

First, let me introduce my use case: I am working on a Django application (a GraphQL API using Graphene) which runs in the cloud but also has local instances on customers' networks.
For example: one application in the cloud and 3 instances (a local Django app instance with a PostgreSQL server with BDR enabled) on local networks. When there is a network connection we use bi-directional replication to keep the data fresh; when there is no connectivity we fall back to the local instances.
So, if I want to use BDR I can't do DELETE and UPDATE operations through the ORM. I have to generate UUIDs for my entities, and every change is just a new record with updated data for the same UUID. The latest record for a given UUID is the valid one; removal is just another flag. So far everything seems fine. The problem starts when I want to use, for example, a many-to-many relationship: the relationship relies on database primary keys, and I have to handle removal somehow. Can you please help me find the best way to solve this issue? I have a few ideas, but I do not want to make a bad decision:
I can try to override ManyToManyField to work with my UUIDs and the special removal flag. It looks like a nice idea because everything should work as before (Graphene will find the relations, etc.), but I am afraid of "invisible" consequences.
Create my own models to simulate the many-to-many relationship. It's much more work, but it should work just fine; see the sketch below.
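To make the second idea concrete, here is a minimal sketch of what I have in mind; the model and field names are hypothetical. Every change is a new row, links reference logical UUIDs instead of primary keys, and the newest row per pair wins, with is_removed acting as a tombstone:

import uuid
from django.db import models

class AppendOnlyModel(models.Model):
    # Stable logical identity shared by every version of the same record;
    # the auto primary key only orders the versions.
    uuid = models.UUIDField(default=uuid.uuid4, db_index=True)
    created_at = models.DateTimeField(auto_now_add=True)
    is_removed = models.BooleanField(default=False)  # tombstone flag

    class Meta:
        abstract = True

class Article(AppendOnlyModel):
    title = models.CharField(max_length=200)

class Tag(AppendOnlyModel):
    name = models.CharField(max_length=50)

class ArticleTag(AppendOnlyModel):
    # Hand-rolled many-to-many row: it stores the logical UUIDs, not
    # primary keys, so replicated copies of the same link converge.
    article_uuid = models.UUIDField(db_index=True)
    tag_uuid = models.UUIDField(db_index=True)

Unlinking would then be a new ArticleTag row for the same pair with is_removed=True, and reads would take the newest row per (article_uuid, tag_uuid).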
Have you had to solve a similar issue before? Is there some kind of good practice, or is this just building a highway to hell (AC/DC is pretty cool)?
Or, if you think there is a better way to build the service architecture, I would love to hear your ideas.
Thanks in advance.

Related

Django: Moving lookup table to Redis

I have a Django app with Redis, which is currently used as the broker for Celery and nothing beyond that.
I would like to utilize it further for lookup caching.
Let's say I had a widely used table in my database that I keep hitting for lookups. For the sake of example, let's say it's a mapping of U.S. zip codes to city/state names, or any lookup that may actually change over time and is important to my application.
My questions are:
Once the server starts (in my case, Gunicorn), how do I do a one-time load of the data from the database table into Redis? I mean: where and how do I make this one-time call? Is there a place in the Django framework for such "on load" calls? Or do I simply trigger it lazily, upon the first request, which will be served from the database but trigger a Redis load of the entire table?
What about updates? If the database table is updated somehow (e.g. a row deleted, updated, or added), how do I catch that in order to update the Redis representation of it?
Is there a best-practice or library already geared toward exactly that?
how do I one-time load
For the one-time load you can find an answer here (of those answers, only the urls.py one worked for me). But I prefer another scenario: I would create a management command and run it from wherever you start Gunicorn. For example, if you're using systemd, you could add it to the service config. You can also combine the two: add the command and call it from urls.py.
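A minimal sketch of such a command, assuming redis-py and a hypothetical ZipCode lookup model:

# myapp/management/commands/load_lookup_cache.py
import json

import redis
from django.core.management.base import BaseCommand

from myapp.models import ZipCode  # hypothetical lookup model

class Command(BaseCommand):
    help = "One-time load of the zip code lookup table into Redis."

    def handle(self, *args, **options):
        r = redis.Redis(host="localhost", port=6379, db=0)
        pipe = r.pipeline()  # batch the writes into one round trip
        for row in ZipCode.objects.all():
            pipe.hset("zipcodes", row.zip,
                      json.dumps({"city": row.city, "state": row.state}))
        pipe.execute()
        self.stdout.write(self.style.SUCCESS("Lookup table cached."))

With systemd you could run it just before Gunicorn starts, e.g. via an ExecStartPre= line in the unit file.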
What about updates
It really depends on your database. For example, if you use PostgreSQL, you can create triggers for update/insert/delete and expose Redis as a foreign table. Django also has a signal mechanism, so you can implement it in Django as well. You could also write a custom wrapper that implements your operations plus the syncing with Redis, and call the wrapper instead of the model methods directly. But I prefer the first scenario.
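A sketch of the signals approach, reusing the same hypothetical ZipCode model and Redis hash as above:

import json

import redis
from django.db.models.signals import post_delete, post_save
from django.dispatch import receiver

from myapp.models import ZipCode

r = redis.Redis(host="localhost", port=6379, db=0)

@receiver(post_save, sender=ZipCode)
def update_cache(sender, instance, **kwargs):
    # Fires on every insert/update, keeping the hash in step with the table.
    r.hset("zipcodes", instance.zip,
           json.dumps({"city": instance.city, "state": instance.state}))

@receiver(post_delete, sender=ZipCode)
def evict_cache(sender, instance, **kwargs):
    r.hdel("zipcodes", instance.zip)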
Is there a best-practice or library already geared toward exactly that?
Sorry I can't help you with this one.

Can I use bigchainDB server with django instead of using sqlite?

I am creating a degree verification process using a blockchain approach, which contains six main entities. By entities I mean that the consensus mechanism will revolve around these six entities, so for this I need to build a distributed database. Two approaches came to mind:
One approach is to build everything from scratch: a separate SQLite database for each node, and then connect the nodes with some type of query.
Another approach is to use the BigchainDB server, which is a distributed database server based on blockchain.
Now my question is: which approach is feasible? I don't know whether the BigchainDB server is compatible with Django or not, since they haven't mentioned anything about it in their docs.
If anyone has used BigchainDB, please help me out. I am really confused as to which approach I should follow.
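For what it's worth, BigchainDB is reached over HTTP through its own Python driver rather than through Django's database backend, so ORM compatibility isn't really the blocker. A minimal sketch, assuming bigchaindb-driver is installed, a node is listening on localhost:9984, and the degree payload is made up:

from bigchaindb_driver import BigchainDB
from bigchaindb_driver.crypto import generate_keypair

bdb = BigchainDB("http://localhost:9984")
university = generate_keypair()

# Record a degree as a CREATE transaction on the chain.
tx = bdb.transactions.prepare(
    operation="CREATE",
    signers=university.public_key,
    asset={"data": {"degree": "BSc Computer Science", "student_id": "12345"}},
)
signed = bdb.transactions.fulfill(tx, private_keys=university.private_key)
bdb.transactions.send_commit(signed)

Code like this can live in an ordinary Django view or service module, with the relational database kept for everything else.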

Django MongoDB automatic failover from Primary (master) to Secondary (slave)

I've developed a web-app in django, and have used MongoDB for backend.
I'm not sure how to do an automatic failover for the database.
My requirement is that when the Primary node of MongoDB goes down, Django should automatically connect to a Secondary node.
How can this be achieved?
I found this library, https://github.com/brianjaystanley/django-failover
which is for Django 1.3, but I want it for Django 1.5.
What settings do I need to change, or is there any library available for the rescue? Any solutions on the floor?
Thanks
You should not need to set up anything in your application to handle this, and the library you linked is not appropriate for use with MongoDB, as it is a relational-backend solution.
The first question here is: do you actually have a Replica Set configuration for MongoDB? I can only answer presuming that you do, but the link is worthwhile reading, as from your question you probably do not yet have a core understanding of MongoDB replication concepts.
What is explained there is that there is no fixed Secondary for your application to fail over to; what actually happens is that the Replica Set itself elects among its members which node will become the Primary.
Going on with the answer: you configure your application to handle the failover by setting up the Connection String for the driver. Read through that documentation and you will find that, among other useful things, you are basically providing a list of hostnames that are members of the Replica Set. You don't need all the members, just enough to form a seed list so the other nodes can be discovered. That would happen anyway with the correct options, but it is good practice to have more than one host to contact even to get that information. Here's a sample:
mongodb://<Primary>,<Secondary>/<database>
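In PyMongo terms that looks something like this (hostnames and the replica set name are placeholders):

from pymongo import MongoClient

# Several seed hosts let the driver discover the full replica set and
# follow elections automatically when the primary goes down.
client = MongoClient(
    "mongodb://mongo1.example.com:27017,mongo2.example.com:27017/"
    "mydb?replicaSet=rs0"
)
db = client.get_default_database()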
You may also want to take a look at MongoEngine, since you have experience with Django and it uses modelling concepts you will be familiar with, while still allowing access to MongoDB features. If memory serves, there is documentation there on setting up Replica Set connections as well.
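If you do go with MongoEngine, the equivalent connection is a one-liner (same placeholder hosts):

from mongoengine import connect

connect(host="mongodb://mongo1.example.com:27017,mongo2.example.com:27017/mydb?replicaSet=rs0")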

Using Django as a custom Database Management Tool

I am relatively new to Django and this is a more general 'concept' question.
For a client I need to construct an expansive database holding data returned from a series of questionnaires, as well as some basic biological data. The idea is to move away from the traditional tools (i.e. Microsoft Access) and manage the data in a MySQL database through a basic CRUD interface. Initially the project doesn't need to live on the web, but the next phase will be to have a centralized db with a login and admin page.
I have started building the db with Django models, which is great, and I want to use the Django admin for the management of the data.
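For illustration, what I have in mind on the admin side is little more than registrations like this (the model names are made up):

from django.contrib import admin

from .models import Participant, QuestionnaireResponse  # hypothetical models

@admin.register(Participant)
class ParticipantAdmin(admin.ModelAdmin):
    list_display = ("name", "date_of_birth")
    search_fields = ("name",)

@admin.register(QuestionnaireResponse)
class QuestionnaireResponseAdmin(admin.ModelAdmin):
    list_display = ("participant", "submitted_at")
    list_filter = ("submitted_at",)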
My question is: is this a good use of Django? Is there anything I should consider before relying on Django for the whole process? And is it advisable to use the Django runserver for db admin on a client's local machine (before we get to the web phase)?
Any advice would be much appreciated.
Actually, your description sounds exactly like the sort of thing for which Django is an ideal solution. It sounds more complex and customized than a CMS, and if it's as straightforward as your description then the ORM is definitely a good tool for this. Then again, this sounds exactly like an appserver-ready problem, so Rails, Express for Node.js, or even ChicagoBoss (if you're brave) would be good platforms for this kind of application.
And sure, Django is solid enough that you can run it with the test server for local clients before you go whole-hog and run the thing on the web. For the web phase, though, I recommend Apache/mod_wsgi, and if you're going to be fault tolerant there are diamond architectures (one front-end proxy with monitoring failover, two or more appserver machines, one database with a hot spare) and more complex architectural layouts (see: sharding) you can approach later.
If you're going to run it in a client's local setting, and you're not running Windows, I recommend looking into the screen program. It will allow you to detach the running job into the background while making diagnostics accessible in an ongoing fashion.

Determining popularity of an item within a given date/time range

I'm working on a Django website that needs to track the popularity of items within a given date/time range. I'll need the ability to show most viewed today, this week, all time, etc.
There is a "django-popularity" app on github that looks promising but only works with mysql (I'm using postgresql).
My initial thought is to create a generic ViewCounter model that logs views for all tracked objects, and then run a cron job that crunches those numbers into the relevant time-based statistics for each item; a rough sketch is below.
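A rough sketch of that idea, using Django's contenttypes framework (field names are just a guess):

from django.contrib.contenttypes.fields import GenericForeignKey
from django.contrib.contenttypes.models import ContentType
from django.db import models

class ViewCounter(models.Model):
    # One row per view of any tracked object.
    content_type = models.ForeignKey(ContentType, on_delete=models.CASCADE)
    object_id = models.PositiveIntegerField()
    item = GenericForeignKey("content_type", "object_id")
    viewed_at = models.DateTimeField(auto_now_add=True, db_index=True)

The cron would then crunch queries along these lines:

from datetime import timedelta

from django.db.models import Count
from django.utils import timezone

week_ago = timezone.now() - timedelta(days=7)
top_this_week = (
    ViewCounter.objects.filter(viewed_at__gte=week_ago)
    .values("content_type", "object_id")
    .annotate(views=Count("id"))
    .order_by("-views")[:10]
)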
Looking forward to hearing your ideas.
Did you try django-popularity with Postgres? The GitHub page just says that the developer has not tested it with anything other than MySQL.
The app has only been tested with MySQL, but it should fully work for Postgres with a few adjustments. If you do manage to get it to work, please inform me. (I'm the developer.) I would love to be able to tell people this product is usable for Postgres as well.
Moreover, all the functionality relying on raw SQL checks whether there is actually a MySQL database in use. If not, it should throw an assertion error.
Also, the generic view counter is already in my package (it's called ViewTracker, but hell). The cron job seems like too much of a hassle to me when we could do it with either SQL or Django caching instead.
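For example, the caching variant could be as simple as atomic increments, assuming a cache backend that supports incr (Redis or Memcached):

from django.core.cache import cache

def bump_view_count(item_key):
    # add() is a no-op if the key already exists, so this is safe to race.
    cache.add(item_key, 0)
    cache.incr(item_key)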