Determine how many times a Django model instance has been updated - django

I'm trying to find a generic way to get a count of how many times an instance of a model has had any of its fields updated. In other words, in Django, how do I get a count of how many times a specific row in a table has been updated? I'm aiming to show a count of how many updates have been made.
Let's say I have:
class MyModel(models.Model):
    field = models.CharField()
    another_field = models.IntegerField()
    ...
and I have an instance of the model:
my_model = MyModel.objects.get(id=1)
Is there a way to find out how many times my_model has had any of its fields updated? Or would I need to create a field like update_count and increment it each time a field is updated? Hopefully there is some kind of mechanism available in Django so I don't have to go that route.
Hopefully this isn't too basic of a question, I'm still learning Django and have been struggling with how to figure this out on my own.

There is no generic way to get this. As wim mentioned, you can use a "versioning package" to track the whole history of changes. I've personally used the same suggestion, django-reversion, but there are other alternatives.
If you only need to track some fields, you can program a simpler mechanism yourself:
create a model or field to hold the tracking information
use something like FieldTracker to detect changes to specific fields
create a post_save signal handler (or override the model's save method) to store the data
You may also use something like "table audit". I haven't tried anything like that myself but there are some packages for that too:
https://github.com/StefanKjartansson/django-postgres-audit
https://github.com/torstenrudolf/django-audit-trigger
https://github.com/kvesteri/postgresql-audit
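To give a feel for the trigger-based "table audit" idea, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table names (my_model, my_model_audit), the trigger name and the update_count helper are all made up for illustration; the packages above have their own schemas and APIs.

```python
import sqlite3

def make_db():
    # In-memory SQLite database standing in for the real one (assumption).
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE my_model (
            id INTEGER PRIMARY KEY,
            field TEXT,
            another_field INTEGER
        );
        -- Hypothetical audit table: one row per UPDATE of a my_model row.
        CREATE TABLE my_model_audit (
            row_id INTEGER NOT NULL,
            changed_at TEXT DEFAULT CURRENT_TIMESTAMP
        );
        CREATE TRIGGER my_model_log_update
        AFTER UPDATE ON my_model
        BEGIN
            INSERT INTO my_model_audit (row_id) VALUES (NEW.id);
        END;
    """)
    return conn

def update_count(conn, row_id):
    # Number of UPDATE statements that touched the given row.
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM my_model_audit WHERE row_id = ?", (row_id,)
    ).fetchone()
    return count

conn = make_db()
conn.execute("INSERT INTO my_model (id, field, another_field) VALUES (1, 'a', 0)")
conn.execute("UPDATE my_model SET field = 'b' WHERE id = 1")
conn.execute("UPDATE my_model SET another_field = 5 WHERE id = 1")
print(update_count(conn, 1))  # 2
```

Note that a trigger like this counts every UPDATE that touches the row, including ones that write the same values back, and it also catches bulk updates that bypass the model's save method.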

Related

Should I use JSONField over ForeignKey to store data?

I'm facing a dilemma. I'm creating a new product and I don't want to mess up the way I organise the information in my database.
I have two choices for my models. The first would be to use foreign keys to link them together:
class Page(models.Model):
    data = JSONField()
class Image(models.Model):
    page = models.ForeignKey(Page)
    data = JSONField()
class Video(models.Model):
    page = models.ForeignKey(Page)
    data = JSONField()
etc...
The second is to keep everything in Page's JSONField:
class Page(models.Model):
    data = JSONField()  # videos, pictures, etc. are stored here
Is one better than the other, and why? This would be a huge help for how I organise my databases in the future.
I thought the second option might be slower, since every time something changes the whole JSON document would be rewritten. Does that make a big difference, or is my assumption wrong?
A JSONField obfuscates the underlying data, making it difficult to write readable code and fully use Django's built-in ORM, validations and other niceties (ModelForms for example). While it gives flexibility to save anything you want to the db (e.g. no need to migrate the db when adding new fields), it takes away the clarity of explicit fields and makes it easy to introduce errors later on.
For example, if you start saving a new key in your data and then try to access that key in your code, older objects won't have it and you might find your app crashing depending on which object you're accessing. That can't happen if you use a separate field.
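As a small illustration of that failure mode, here is a sketch with plain dicts standing in for the contents of a JSONField. The "videos" key and both helper functions are hypothetical.

```python
# Plain dicts standing in for the JSON stored on two Page rows (assumption).
old_page = {"title": "Home"}                               # saved before "videos" existed
new_page = {"title": "About", "videos": ["intro.mp4"]}     # saved after

def video_count(data):
    # Assumes the key is always present: raises KeyError for old rows.
    return len(data["videos"])

def safe_video_count(data):
    # Defensive access: every reader has to remember to supply a default.
    return len(data.get("videos", []))

print(safe_video_count(old_page))  # 0
print(safe_video_count(new_page))  # 1
```

With an explicit model field, the default lives in one place (the field definition and its migration) instead of in every piece of code that reads the data.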
I would always try to avoid it unless there's no other way.
Typically I use a JSONField in two cases:
To save a response from 3rd party APIs (e.g. as an audit trail)
To save references to archived objects (e.g. when the live products in my db change but I still have orders referencing the product).
If you use PostgreSQL, as a relational database, it's optimised to be super-performant on JOINs so using ForeignKeys is actually a good thing. Use select_related and prefetch_related in your code to optimise the number of queries made, but the queries themselves will scale well even for millions of entries.

How to fetch and display data from 2 models in a single queryset ordered by time, Django

I want to build a Facebook-style news feed, for which I need to fetch data from two different models, ordered by time.
The models look something like this:
class User_image(models.Model):
    user = models.ForeignKey(User_info)
    profile_pic = models.ImageField(upload_to='user_images')
    created = models.DateTimeField(auto_now_add=True)
class User_status(models.Model):
    user = models.ForeignKey(User_info)
    status = models.CharField(max_length=1)
    created = models.DateTimeField(auto_now_add=True)
Because of my requirements, I cannot merge these two models into one.
I need simple code for the view and template that displays profile pictures and statuses in the news feed, ordered by time.
Thanks.
The simplest way of achieving this is to have a base model, call it Base_event,
class Base_event(models.Model):
    user = models.ForeignKey(User_info)
    created = models.DateTimeField(auto_now_add=True)
and derive both of your models from it. This way you write less code and you achieve your objective. Note that you have to make an implementation choice: how the models will inherit from the base (Django offers abstract base classes, multi-table inheritance and proxy models). I advise reading the Django documentation on model inheritance to choose according to what you want to do.
EDIT:
I should note that the accepted answer has a caveat: it sorts the data in Python rather than in MySQL, which hurts performance. The whole point of the database's ORDER BY is to avoid fetching every row and then sorting in application code. For instance, if you only want the first 10 elements in sorted order, the accepted solution has to extract all the entries and only then decide which ones are the first 10.
Something like Base_event.objects.filter().order_by(...)[:10] would extract only 10 rows from the database, instead of the whole filtered table.
The easy solution now becomes the problem later.
Try building a combined list with itertools.chain:
from itertools import chain
import operator
feed = list(chain(User_image.objects.all(), User_status.objects.all()))
feed = sorted(feed, key=operator.attrgetter('created'))
For those who say this is not correct, see:
https://stackoverflow.com/a/434755/2301434
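A possible middle ground between the two answers, sketched here under assumptions: let the database order and limit each source separately, then merge the two already-sorted sequences lazily in Python. The Event dataclass and the latest helper are hypothetical stand-ins; in Django the inputs would presumably be something like User_image.objects.order_by('-created')[:limit] and the equivalent for User_status.

```python
from dataclasses import dataclass
from datetime import datetime
from heapq import merge
from itertools import islice

@dataclass
class Event:
    # Stand-in for a model instance with a `created` timestamp (assumption).
    kind: str
    created: datetime

def latest(images, statuses, limit=10):
    """Merge two sequences that are each already sorted newest-first."""
    merged = merge(images, statuses, key=lambda e: e.created, reverse=True)
    # islice stops after `limit` items, so the rest is never consumed.
    return list(islice(merged, limit))

images = [Event("image", datetime(2020, 1, 3)), Event("image", datetime(2020, 1, 1))]
statuses = [Event("status", datetime(2020, 1, 2))]
feed = latest(images, statuses, limit=2)
print([e.kind for e in feed])  # ['image', 'status']
```

Since at most `limit` items can come from either source, fetching `limit` rows from each table is enough to build the first page of the feed.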

Django .order_by() with .distinct() using postgres

I have a Read model that is related to an Article model. What I would like to do is make a queryset where articles are unique and ordered by date_added. Since I'm using postgres, I'd prefer to use the .distinct() method and specify the article field. Like so:
articles = Read.objects.order_by('article', 'date_added').distinct('article')
However this doesn't give the desired effect and orders the queryset by the order they were created. I am aware of the note about .distinct() and .order_by() in Django's documentation, but I don't see that it applies here since the side effect it mentions is there will be duplicates and I'm not seeing that.
# To actually sort by date added I end up doing this
articles = sorted(articles, key=lambda x: x.date_added, reverse=True)
This executes the entire query before I actually need it and could potentially get very slow if there are lots of records. I've already optimized using select_related().
Is there a better, more efficient, way to create a query with uniqueness of a related model and order_by date?
UPDATE
The output would ideally be a queryset of Read instances where each related article is unique in the queryset, produced using only the Django ORM (i.e. without sorting in Python).
Is there a better, more efficient, way to create a query with uniqueness of a related model and order_by date?
Possibly. It's hard to say without the full picture, but my assumption is that you are using Read to track which articles have and have not been read, and probably tying this to a User instance to determine whether a particular user has read an article. If that's the case, your approach is flawed. Instead, you should do something like:
class Article(models.Model):
    ...
    read_by = models.ManyToManyField(User, related_name='read_articles')
Then, to get a particular user's read articles, you can just do:
user_instance.read_articles.order_by('date_added')
That takes the need to use distinct out of the equation, since there will not be any duplicates now.
UPDATE
To get all articles that are read by at least one user:
Article.objects.filter(read_by__isnull=False)
Or, if you want to set a threshold for popularity, you can use annotations:
from django.db.models import Count
Article.objects.annotate(read_count=Count('read_by')).filter(read_count__gte=10)
Which would give you only articles that have been read by at least 10 users.
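As for why the query in the question comes back ordered by article rather than by date: PostgreSQL's DISTINCT ON keeps the first row of each group in the ORDER BY order, and that order has to start with the DISTINCT ON column. The following is only a rough pure-Python model of that behaviour, using made-up rows and integer dates, not what Django or PostgreSQL actually execute.

```python
from itertools import groupby
from operator import itemgetter

# Made-up Read rows; date_added is an integer here for brevity (assumption).
reads = [
    {"article": 1, "date_added": 3},
    {"article": 2, "date_added": 5},
    {"article": 1, "date_added": 6},
    {"article": 2, "date_added": 7},
]

def latest_read_per_article(rows):
    # Emulates ORDER BY article, date_added DESC with DISTINCT ON (article):
    # sort newest-first, then stable-sort by article, then keep the first
    # row of every article group.
    by_date = sorted(rows, key=itemgetter("date_added"), reverse=True)
    by_article = sorted(by_date, key=itemgetter("article"))
    return [next(group) for _, group in groupby(by_article, key=itemgetter("article"))]

unique = latest_read_per_article(reads)
print([r["article"] for r in unique])        # [1, 2]  (ordered by article)

# Ordering the unique rows by date is a second, separate step.
newest_first = sorted(unique, key=itemgetter("date_added"), reverse=True)
print([r["article"] for r in newest_first])  # [2, 1]
```

The point of the model is that "unique per article" and "ordered by date" are two different sorts, so a single order_by next to distinct('article') cannot express both.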

Django Model Table temporary data vs. permanent data

I am writing a trip planner, and I have users. For the purposes of this question, let's assume my models are as simple as a "Trip" model and a "UserProfile" model.
The site has a feature that searches for routes (via external APIs) and then dynamically assembles them into "trips", which we then display. A new search deletes all the old "trips" and works out new ones.
My problem is this: I want to save some of these trips to the user profile. If the user selects a trip, I want it to be permanently associated with that profile. Currently I have a ManyToMany field for Trips in my UserProfile, but when the trips are "cleaned/flushed", all trips are deleted, and that association is useless. I need a user to be able to go back a month later and see that trip.
I'm looking for an easy way to duplicate that trip data, or make it static once I add it to a profile... I don't quite know where to start. Currently there is a trips_profile table that has a foreign key to the "trips" table, which would be fine if we weren't deleting/flushing the trips table all the time.
Help appreciated.
It's hard to say exactly without your models, but given the following layout:
class UserProfile(models.Model):
    trips = models.ManyToManyField(Trip)
You can clear out useless Trips by doing:
Trip.objects.filter(userprofile__isnull=True).delete()
Which will only delete Trips not assigned to a UserProfile.
However, given the following layout:
class Trip(models.Model):
    users = models.ManyToManyField(User)
You could kill the useless trips with:
Trip.objects.filter(users__isnull=True).delete()
The second method has the side benefit of not requiring any changes to UserProfile, or even a UserProfile at all, since you can then get a User's trips with:
some_user.trip_set.all()
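At the database level, the cleanup amounts to "delete every trip that has no row in the many-to-many join table". Here is a minimal sketch of that effect with sqlite3; the table names (trip, trip_users), the sample data and the purge_unsaved_trips helper are assumptions, and the SQL is not claimed to be what Django generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trip (id INTEGER PRIMARY KEY, name TEXT);
    -- Hypothetical join table behind the ManyToManyField.
    CREATE TABLE trip_users (
        trip_id INTEGER NOT NULL REFERENCES trip(id),
        user_id INTEGER NOT NULL
    );
    INSERT INTO trip (id, name) VALUES (1, 'saved'), (2, 'temp'), (3, 'also temp');
    INSERT INTO trip_users (trip_id, user_id) VALUES (1, 42);
""")

def purge_unsaved_trips(conn):
    # Delete only trips that no user has saved; return how many were removed.
    cur = conn.execute("""
        DELETE FROM trip
        WHERE NOT EXISTS (
            SELECT 1 FROM trip_users WHERE trip_users.trip_id = trip.id
        )
    """)
    return cur.rowcount

deleted = purge_unsaved_trips(conn)
remaining = conn.execute("SELECT id, name FROM trip").fetchall()
print(deleted, remaining)  # 2 [(1, 'saved')]
```

Running the purge before each new search keeps saved trips intact while still clearing out the temporary ones.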

Designing a database for a user/points system? (in Django)

First of all, sorry if this isn't an appropriate question for StackOverflow. I've tried to make it as generalisable as possible.
I want to create a database (MySQL, site running Django) that has users, who can be allocated a certain number of points for various types of action - it's a collaborative game. My requirements are to obtain:
the number of points a user has
the user's ranking compared to all other users
and the overall leaderboard (i.e. all users ranked in order of points)
This is what I have so far, in my Django models.py file:
class SiteUser(models.Model):
    name = models.CharField(max_length=250)
    email = models.EmailField(max_length=250)
    date_added = models.DateTimeField(auto_now_add=True)
    def points_total(self):
        # Sum the points of every action credited to this user.
        points_added = PointsAdded.objects.filter(user=self)
        points_total = 0
        for point in points_added:
            points_total += point.points()
        return points_total
class PointsAdded(models.Model):
    user = models.ForeignKey('SiteUser')
    action = models.ForeignKey('Action')
    date_added = models.DateTimeField(auto_now_add=True)
    def points(self):
        # The point value lives on the related Action row.
        return self.action.points
class Action(models.Model):
    points = models.IntegerField()
    action = models.CharField(max_length=36)
However it's rapidly becoming clear to me that it's actually quite complex (in Django query terms at least) to figure out the user's ranking and return the leaderboard of users. At least, I'm finding it tough. Is there a more elegant way to do something like this?
This question seems to suggest that I shouldn't even have a separate points table - what do people think? It feels more robust to have separate tables, but I don't have much experience of database design.
This is old, but I'm not sure exactly why you have two separate tables (PointsAdded and Action). It's late, so maybe my mind isn't ticking, but it seems like you just split one table into two for some reason. It doesn't seem like you get any benefit out of it. It's not like there's a one-to-many relationship in it, right?
So first of all, I would combine those two tables. Secondly, you are probably better off storing points_total as a value in your site_user table. This is what I think Demitry is trying to allude to, but didn't say explicitly. This way, instead of running an additional query (pulling everything a user has ever done on the site is expensive) and then looping over the results (even more expensive), you can just read it as one field. It's denormalizing the data for a greater good.
Just be sure to update the value every time you add something that has points. You can use Django's post_save signal to do that.
It's a bit more difficult to have points saved in the same table, but it's totally worth it. You can do very simple ordering/filtering operations if you have computed points total on user model. And you can count totals only when something changes (not every time you want to show them). Just put some validation logic into post_save signals and make sure to cover this logic with tests and you're good.
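A rough sketch of the denormalized layout, using sqlite3 in place of MySQL and plain functions in place of Django models and signals. The schema, the add_points helper (which plays the role a post_save handler would play) and the sample data are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE site_user (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        points_total INTEGER NOT NULL DEFAULT 0  -- denormalized running total
    );
    CREATE TABLE points_added (
        user_id INTEGER NOT NULL REFERENCES site_user(id),
        action TEXT NOT NULL,
        points INTEGER NOT NULL
    );
    INSERT INTO site_user (id, name) VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
""")

def add_points(conn, user_id, action, points):
    # Log the action and bump the stored total in a single transaction.
    with conn:
        conn.execute(
            "INSERT INTO points_added (user_id, action, points) VALUES (?, ?, ?)",
            (user_id, action, points),
        )
        conn.execute(
            "UPDATE site_user SET points_total = points_total + ? WHERE id = ?",
            (points, user_id),
        )

def leaderboard(conn):
    # All users ranked by points; name breaks ties deterministically.
    return conn.execute(
        "SELECT name, points_total FROM site_user ORDER BY points_total DESC, name"
    ).fetchall()

def rank(conn, user_id):
    # Rank = 1 + number of users with strictly more points.
    (higher,) = conn.execute(
        "SELECT COUNT(*) FROM site_user WHERE points_total > "
        "(SELECT points_total FROM site_user WHERE id = ?)",
        (user_id,),
    ).fetchone()
    return higher + 1

add_points(conn, 1, "post", 5)
add_points(conn, 2, "post", 10)
add_points(conn, 1, "comment", 7)
print(leaderboard(conn))  # [('ann', 12), ('bob', 10), ('cy', 0)]
print(rank(conn, 2))      # 2
```

With the total stored on the user row, the leaderboard and the ranking each become a single simple query; the trade-off is that every code path that awards points must go through the one place that keeps the total in sync.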
p.s. denormalization on wiki.