Django Model Table temporary data vs. permanent data - django

I am writing a trip planner, and I have users. For the purposes of this question, let's assume my models are as simple as a "Trip" model and a "UserProfile" model.
One feature of the site lets users search for routes (via external APIs) and then dynamically assembles those into "trips", which we then display. A new search deletes all the old "trips" and works out new ones.
My problem is this: I want to save some of these trips to the user profile. If the user selects a trip, I want it to be permanently associated with that profile. Currently I have a ManyToMany field for Trips in my UserProfile, but when the trips are "cleaned/flushed", all trips are deleted, and that association is useless. I need a user to be able to go back a month later and see that trip.
I'm looking for an easy way to duplicate that trip data, or make it static once I add it to a profile, but I don't quite know where to start. Currently there is a trips_profile table that has a foreign key to the "trips" table, which would be fine if we weren't deleting/flushing the trips table all the time.
Help appreciated.

It's hard to say exactly without your models, but given the following layout:
class UserProfile(models.Model):
    trips = models.ManyToManyField(Trip)
You can clear out useless Trips by doing:
Trip.objects.filter(userprofile__isnull=True).delete()
Which will only delete Trips not assigned to a UserProfile.
However, given the following layout:
class Trip(models.Model):
    users = models.ManyToManyField(User)
You could kill the useless trips with:
Trip.objects.filter(users__isnull=True).delete()
The second method has the side benefit of not requiring any changes to UserProfile, or even a UserProfile at all, since you can then get a user's trips with:
some_user.trip_set.all()

Related

Determine how many times a Django model instance has been updated

I'm trying to find a generic way to get a count of how many times an instance of a model has had any of its fields updated. In other words, in Django, how do I get a count of how many times a specific row in a table has been updated? I'm aiming to show a count of how many updates have been made.
Let's say I have:
class MyModel(models.Model):
    field = models.CharField()
    another_field = models.IntegerField()
    ...
and I have an instance of the model:
my_model = MyModel.objects.get(id=1)
Is there a way to find out how many times my_model has had any of its fields updated? Or would I need to create a field like update_count and increment it each time a field is updated? Hopefully there is some kind of mechanism available in Django so I don't have to go that route.
Hopefully this isn't too basic of a question, I'm still learning Django and have been struggling with how to figure this out on my own.
There is no generic way to get this. As mentioned by wim you can use some "versioning package" to track whole history of changes. I've personally used the same suggestion: django-reversion, but there are other alternatives.
If you need to track only some fields then you may program some simpler mechanism yourself:
- create a model/field to store the tracking information
- use something like FieldTracker to detect changes to specific fields
- create a post_save signal handler (or just override the model's save method) to save the data
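The steps above can be sketched in plain Python, without Django, to show the logic only. TrackedRecord, tracked_fields and update_count are hypothetical names for illustration; in a real project the counter would be a model field and the comparison would live in save() or a signal handler.

```python
class TrackedRecord:
    """Plain-Python stand-in for a model instance (hypothetical, not a Django model)."""

    # Only changes to these attributes count as an update.
    tracked_fields = ("field", "another_field")

    def __init__(self, field, another_field):
        self.field = field
        self.another_field = another_field
        self.update_count = 0
        # Remember the values as they were at the last save.
        self._saved = self._snapshot()

    def _snapshot(self):
        return {name: getattr(self, name) for name in self.tracked_fields}

    def save(self):
        """Increment the counter only when a tracked field actually changed."""
        current = self._snapshot()
        if current != self._saved:
            self.update_count += 1
            self._saved = current
```

Saving without changing anything leaves update_count untouched, so the counter reflects real changes rather than the number of save() calls.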
You may also use something like "table audit". I haven't tried anything like that myself but there are some packages for that too:
https://github.com/StefanKjartansson/django-postgres-audit
https://github.com/torstenrudolf/django-audit-trigger
https://github.com/kvesteri/postgresql-audit

Should I use JSONField over ForeignKey to store data?

I'm facing a dilemma: I'm creating a new product and I don't want to mess up the way I organise the information in my database.
I have two choices for my models. The first is to use foreign keys to link them together:
class Page(models.Model):
    data = JSONField()

class Image(models.Model):
    page = models.ForeignKey(Page)
    data = JSONField()

class Video(models.Model):
    page = models.ForeignKey(Page)
    data = JSONField()

etc...
The second is to keep everything in Page's JSONField:
class Page(models.Model):
    data = JSONField()  # videos, pictures, etc. are stored here
Is one better than the other, and why? This would be a huge help for how I organise my databases in the future.
I thought the second option might be slower, since every time something changes the whole JSON would be overwritten. Does that make a big difference, or is what I am saying wrong?
A JSONField obfuscates the underlying data, making it difficult to write readable code and fully use Django's built-in ORM, validations and other niceties (ModelForms for example). While it gives flexibility to save anything you want to the db (e.g. no need to migrate the db when adding new fields), it takes away the clarity of explicit fields and makes it easy to introduce errors later on.
For example, if you start saving a new key in your data and then try to access that key in your code, older objects won't have it and you might find your app crashing depending on which object you're accessing. That can't happen if you use a separate field.
I would always try to avoid it unless there's no other way.
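To illustrate the missing-key problem described above, here is a minimal sketch that uses plain dictionaries as stand-ins for the contents of a JSONField. The key names and both helper functions are hypothetical and only serve the example.

```python
# Plain dicts stand in for the contents of Page.data (hypothetical keys).
old_page_data = {"title": "Home"}                             # saved before "subtitle" existed
new_page_data = {"title": "About", "subtitle": "Who we are"}  # saved after


def subtitle_unsafe(data):
    """Direct indexing: raises KeyError for objects saved before the key existed."""
    return data["subtitle"]


def subtitle_of(data):
    """Defensive access: falls back to an empty string for older objects."""
    return data.get("subtitle", "")
```

With an explicit model field, the database guarantees every row has the column, so this kind of defensive access is not needed.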
Typically I use a JSONField in two cases:
To save a response from 3rd party APIs (e.g. as an audit trail)
To save references to archived objects (e.g. when the live products in my db change but I still have orders referencing the product).
If you use PostgreSQL, remember that it is a relational database and is optimised to perform well on JOINs, so using ForeignKeys is actually a good thing. Use select_related and prefetch_related in your code to reduce the number of queries made; the queries themselves will scale well even for millions of entries.

What is the best way to have user specific numbering in Django?

I'm making a web site for a friend for a small business, and for each user, I want them to be able to access their orders by number which starts from 1 for each user, but in the backend this should be a global numbering. So for each user, their first order will be at /orders/1/ and so on. Is there a consensus on how this should be achieved in general? Way I see it, I can do this 2 ways:
Store the number in another column in the orders table. I'd prefer not to do this because I'm not entirely sure how to handle deletions without going through and updating all the records of the user. If someone knows the edge cases I need to handle, I might go with this.
OR
For every queryset I make when getting the orders page for each user I handle the numbering, benefit of this is that it will always give the correct numbering, especially if I just do it in the template. Right now this seems easier, but I have a feeling this would give rise to problems in the future. Main problem I see is I'm not sure how to make it link to the correct url without the primary key being in that url.
I recommend storing MyUser in a separate app, say accounts:
class MyUser(BaseUser):
    # extra fields
    pass
And storing Order in a separate app, say order:
from django.db import models
from accounts.models import MyUser

class Order(models.Model):
    user = models.ForeignKey(MyUser)
    order_num = models.IntegerField()  # per-user order number, starting at 1
    # other fields
Set order_num from the number of orders the user has already made. To get that count:
count = Order.objects.filter(user=request.user).count()
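One edge case with a count-based number is deletion: if a user has orders 1, 2 and 3 and order 2 is deleted, the count is 2, so the next order would be numbered 3 again. A minimal sketch of one possible alternative, assuming numbers are never reused, is to take the highest existing number plus one. next_order_num is a hypothetical helper; in Django the input could come from something like values_list('order_num', flat=True) on the user's orders.

```python
def next_order_num(existing_nums):
    """Return the next per-user order number, never reusing a deleted one.

    existing_nums: iterable of the order numbers the user currently has.
    """
    return max(existing_nums, default=0) + 1
```

This sketch ignores concurrent requests; two orders created at the same moment could still get the same number unless the database enforces uniqueness on (user, order_num).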

Trying to minimize the number of trips to a database voting table

I use django 1.10.1, postgres 9.5 and redis.
I have a table that stores users' votes and looks like:
==========================
object | user | created_on
==========================
where object and user are foreign keys to the id column of their own tables respectively.
The problem is that in many situations, I have to list many objects in one page. If the user is logged in or authenticated, I have to check for every object whether it was voted or not (and act depending on the result, something like show vote or unvote button). So in my template I have to call such function for every object in the page.
def is_obj_voted(obj_id, usr_id):
    return ObjVotes.objects.filter(object_id=obj_id, user_id=usr_id).exists()
Since I may have tens of objects on one page, I found, using django-debug-toolbar, that the database access alone can take more than one second, because each query fetches just one row and the queries run serially for all objects on the page. To make it worse, I run similar queries against that table on other pages (i.e. filtering by user only or by object only).
What I am trying to achieve, and what I think is the right thing to do, is to access the database just once to fetch all objects voted on by a given user (maybe when the user logs in, or at the first page hit requiring such database access), and then filter that further depending on what each page needs. Since I use redis and the django-cacheops app, can they help me do that?
In your case I would get a list of object IDs and query all of the user's votes for those IDs in one go, something like:
object_ids = [o.id for o in Object.objects.filter(YOUR CONDITIONS)]
votes = set(v.object_id for v in ObjVotes.objects.filter(object_id__in=object_ids, user_id=usr_id))

def is_obj_voted(obj_id, votes):
    return obj_id in votes
This will make only one additional database query for getting votes by user per page.
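As a small extension of the idea above, the voted flag for every object on the page can be computed once in the view, so the template does not need to call a function per object. This is a plain-Python sketch; annotate_voted is a hypothetical helper, and the two ID lists stand in for the results of the two queries.

```python
def annotate_voted(object_ids, voted_ids):
    """Map each object ID on the page to True/False depending on whether the user voted.

    object_ids: IDs of the objects shown on the page.
    voted_ids:  IDs of the objects the current user has voted on (one query's worth).
    """
    voted = set(voted_ids)  # set membership is O(1) per lookup
    return {obj_id: obj_id in voted for obj_id in object_ids}
```

The resulting dictionary can be passed to the template context, or the flag can be attached to each object before rendering.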

Is it possible to improve the process of instance creation/deletion in Django using querysets?

So I have a list of unique pupils (pupil is the primary_key in an LDAP database), each with an associated teacher, which can be the same for several pupils.
There is a box in an edit form for each teacher's pupils, where a user can add or remove a pupil, and the database is then updated accordingly using the function below. My current function is as follows (teacher is the teacher associated with the edit page form, and updated_list is the list of pupils' names that was submitted and passed to this function):
def update_pupils(teacher, updated_list):
    old_pupils = Pupil.objects.filter(teacher=teacher)
    for pupils in old_pupils:
        if pupil.name not in updated_list:
            pupil.delete()
        else:
            updated_list.remove(pupil.name)
    for pupil in updated_list:
        if not Pupil.objects.filter(name=name):
            new_pupil = pupil(name=name, teacher=teacher)
            new_pupil.save()
As you can see the function basically finds what was the old pupil list for the teacher, looks at those and if an instance is not in our new updated_list, deletes it from the database. We then remove those deleted from the updated_list (or at least their names)...meaning the ones left are the newly created ones, which we then iterate over and save.
Now ideally, I would like to access the database as infrequently as possible if that makes sense. So can I do any of the following?
In the initial iteration, can I simply mark those pupils up for deletion and potentially do the deleting and saving together, at a later date? I know I can bulk delete items but can I somehow mark those which I want to delete, without having to access the database which I know can be expensive if the number of deletions is going to be high...and then delete a lot at once?
In the second iteration, is it possible to create the various instances and then save them all in one go? Again, I see in Django 1.4 that you can use bulk_create but then how do you save these? Plus, I'm actually using Django 1.3 :(...
I am kinda assuming that the above steps would actually help with the performance of the function?...But please let me know if that's not the case.
I have of course been reading this https://docs.djangoproject.com/en/1.3/ref/models/querysets/ So I have a list of unique items, each with an associated email address, which can be the same for several items.
First, in this line
if not Pupil.objects.filter(name=name):
It looks like the name variable is undefined, no?
Then here is a shortcut for your code I think:
def update_pupils(teacher, updated_list):
    # Step 1: delete this teacher's pupils that are not in the updated list
    Pupil.objects.filter(teacher=teacher).exclude(name__in=updated_list).delete()
    # Step 2: update or create
    # either
    for name in updated_list:
        # if a pupil with this name exists, update its teacher; otherwise create it
        Pupil.objects.update_or_create(name=name, defaults={'teacher': teacher})
    # or (but I'm not sure this one will work)
    Pupil.objects.update_or_create(name__in=updated_list, defaults={'teacher': teacher})
Another solution, if your Pupil object only has those 2 attributes and isn't referenced by a foreign key in another relation, is to delete all the Pupil instances of this teacher and then use bulk_create. That needs only 2 database accesses, but it's ugly.
EDIT: in the first loop, pupil is also undefined (the loop variable is named pupils).
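The delete-then-create idea can also be worked out in memory first, so that the database is only touched for rows that really change. This is a plain-Python sketch with a hypothetical helper, diff_pupils. It assumes the existing names have already been fetched in one query; the first result could then feed a single filter(name__in=...).delete() and the second a bulk_create (available from Django 1.4).

```python
def diff_pupils(existing_names, updated_names):
    """Split two name lists into the names to delete and the names to create.

    existing_names: names currently stored for the teacher.
    updated_names:  names submitted through the edit form.
    Returns (to_delete, to_create) as sets; names in both lists are left alone.
    """
    existing = set(existing_names)
    updated = set(updated_names)
    to_delete = existing - updated   # stored but no longer submitted
    to_create = updated - existing   # submitted but not yet stored
    return to_delete, to_create
```

Unlike the original loop, this does not mutate updated_list while working through it.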