Django - How to annotate QuerySet using multiple field values? - django

I have a model called "Story" that has two integer fields called "views" and "votes". When I retrieve all the Story objects I would like to annotate the returned QuerySet with a "ranking" field that is simply "views"/"votes". Then I would like to sort the QuerySet by "ranking". Something along the lines of...
Story.objects.annotate( ranking=CalcRanking('views','votes') ).sort_by(ranking)
How can I do this in Django? Or should it be done after the QuerySet is retrieved in Python (like creating a list that contains the ranking for each object in the QuerySet)?
Thanks!
PS: In my actual program, the ranking calculation isn't as simple as above and depends on other filters to the initial QuerySet, so I can't store it as another field in the Story model.

In Django, the things you can pass to annotate (and aggregate) must be subclasses of django.db.models.aggregates.Aggregate. You can't just pass arbitrary Python objects to it, since the aggregation/annotation actually happens inside the database (that's the whole point of aggregate and annotate). Note that writing custom aggregations is not supported in Django (there is no documentation for it). All information available on it is this minimal source code: https://code.djangoproject.com/browser/django/trunk/django/db/models/aggregates.py
This means you either have to store the calculations in the database somehow, figure out how the aggregation API works or use raw sql (raw method on the Manager) to do what you do.

Related

Best practice for using Django templates and connect it to database

I am trying to build a database for my website. There are currently three entries with different attributes in my database. I have not created these entries in order, but I have assigned a 'Chapter number' attribute which indicates the order 1,2,3.
I am now trying to inject this using 'context' and 'render' function in my views. I am using the method 'objects.all()' to add all objects to my context. I have a simple Html file where I am inserting the data from the database by looping over (a simple for loop) these added objects.
Now the output that is being generated (naturally) is that it is following the order in which I created the database. I am not sure how I can have the loop run in such a way that I get these chapters in correct order. Thank you for patiently reading my question. Any help will be appreciated.
You may use the order_by method which is included in Djangos QuerySet API:
https://docs.djangoproject.com/en/3.0/ref/models/querysets/
If you offer some more information of your specific data I might provide you with an example.
For orientation purposes, sorting queried objects by date would work as follows:
most_recent = Entry.objects.order_by('-timestamp')
You can sort by any field like so:
sorted_by_field = Entry.objects.order_by('custom_field')

Values list vs Iterating

I have to extract value of id field from each model instance in queryset. What is more efficient - iterating through queryset with use of list comprehension or values list method with flat argument setted to true and then converted to list?
values_list will be more performant as it will only fetch the requested fields from the database and it will not instantiate model instances.
Quoting the Django documentation:
It is useful when you know you’re only going to need values from a small number of the available fields and you won’t need the functionality of a model instance object. It’s more efficient to select only the fields you need to use.

django: saving a query set to session

I am trying to save query result obtained in one view to session, and retrieve it in another view, so I tried something like below:
def default (request):
equipment_list = Equipment.objects.all()
request.session['export_querset'] = equipment_list
However, this gives me
TypeError at /calbase/
<QuerySet [<Equipment: A>, <Equipment: B>, <Equipment: C>]> is not JSON serializable
I am wondering what does this mean exactly and how should I go about it? Or maybe there is alternative way of doing what I want besides using session?
If this is what you are saving:
equipment_list = Equipment.objects.all()
You shouldn't or wouldn't need to use sessions. Why? Because this is a simple query without any filtering. equipment_list would be common to all the users. This can quite easily be saved in the cache
from django.core.cache import cache
equipment_list = cache.get('equipment_list')
if not equipment_list:
equipment_list = Equipment.objects.all()
cache.set('equipment_list',equipment_list)
Note that a queryset can be saved in the cache without it having to be converted to values first.
Update:
One of the other answers mention that a querysets are not json serializable. That's only applicable when you are trying to pass that off as a json response. Isn't applicable when you are trying to cache it because django.core.cache does not use json serialization it uses pickling.
'e4c5' raises a concern which is perfectly valid. From the limited code we can see, it makes no sense to put in the results of that query into the session. Unless ofcourse you have some other plans which we cant quite see here. I'll ignore this and assume you ABSOLUTELY MUST save the query results into the session.
With this assumption, you must understand that the queryset instance which Django is giving you is a python object. You can move this around WITHIN your Django application without any hassles. However, whenever you attempt to send such an entity over the wire into some other data store/application (in your case, saving it into the session, which involves sending this data over to your configured session store), it must be serializable to some format which:
your application knows how to serialize objects into
the data store at the other end knows how to de-serialize. In this case, the accepted format seems to be JSON. (this is optional, the JSON string can be stored directly)
The problem is, the queryset instance not only contains the rows returned from the table, it also contains a bunch of other attributes and meta attributes which come in handy to you when you use the Django ORM API. When you try to send the queryset instance over the wire to your session store, the system knows no better and tries to serialize all these attributes into JSON. This fails because there are attributes within the queryset that are not serializable into JSON.
As far as a solution is concerned, if you must save data into the session, as some people have suggested, simply performing objects.all().values() and saving it into your session may not always work. A simple case is when your table returns datetime objects. Datetime objects are by default, not JSON serializable.
So what should you do? What you need is some sort of serializer which accepts a queryset, and safely iterates over the returned rows converting each python native datatype into a JSON safe equivalent, and then returning that. In case of datetime.datetime objects, you would need to call obj.isoformat() to transform it into an ISO format datetime string.
You cannot save a QuerySet instance in session, cause well as you said, they're not JSON Serializable. Read This for more information.
To save your queryset, you can use values and values_list methods to get your desired fields, then you cast them to a list and then save the list into session.(most of the time saving only the PKs does the job though).
so basically:
qset = Model.objects.values_list("pk", "field_one", "field_two") # Gives you a ValuesListQuerySet object which's still not serializable.
cache_results = list(qset)
# Now you cache the cache_results variable however you want.
redis.setex("cached:user_id:querytype", 10 * 60, json.dumps(cache_results))
It's also better to change the way you save this special result (values_list) so you can have better lookups, a dictionary might be a good choice.
Saving query sets in django sessions requires them to be serialized and that causes the error. One way of easily moving the query set by saving them in sessions is to make a list of the id of the Equipments model. (Or any other field that serves as the primary key of the model), like:
equipments = [equipment.id for equipment in Equipment.objects.all()]
request.session['export_querset'] = equipments
And then whenever you need the Equipments, traverse this list and get the corresponding Equipment.
equipments = [Equipment.objects.get(id=id) for id in request.session['export_querset']]
Note: This method is inefficient and is not recommended for large query sets, but for small query sets, it can be used without worries.

Django QuerySet way to select from sql table-function

everybody.
I work with Django 1.3 and Postgres 9.0. I have very complex sql query which extends simple model table lookup with some extra fields. And it wrapped in table function since it is parameterized.
A month before I managed to make it work with the help of raw query but RawQuerySet lacks a lot of features which I really need (filters, count() and clone() methods, chainability).
The idea looks simple. QuerySet lets me to perform this query:
SELECT "table"."field1", ... ,"table"."fieldN" FROM "table"
whereas I need to do this:
SELECT "table"."field1", ... ,"table"."fieldN" FROM proxy(param1, param2)
So the question is: How can I do this? I've already started to create custom manager but can't substitute model.db_table with custom string (because it's being quoted and database stops to recognize the function call).
If there's no way to substitute table with function I would like to know if I can create QuerySet from RawQuerySet (Not the cleanest solution but simple RawQuerySet brings so much pain in... one body part).
If your requirement is that you want to use Raw SQL for efficiency purposes and still have access to model methods, what you really need is a library to map what fields of the SQL are the columns of what models.
And that library is, Unjoinify

How to limit columns returned by Django query?

That seems simple enough, but all Django Queries seems to be 'SELECT *'
How do I build a query returning only a subset of fields ?
In Django 1.1 onwards, you can use defer('col1', 'col2') to exclude columns from the query, or only('col1', 'col2') to only get a specific set of columns. See the documentation.
values does something slightly different - it only gets the columns you specify, but it returns a list of dictionaries rather than a set of model instances.
Append a .values("column1", "column2", ...) to your query
The accepted answer advising defer and only which the docs discourage in most cases.
only use defer() when you cannot, at queryset load time, determine if you will need the extra fields or not. If you are frequently loading and using a particular subset of your data, the best choice you can make is to normalize your models and put the non-loaded data into a separate model (and database table). If the columns must stay in the one table for some reason, create a model with Meta.managed = False (see the managed attribute documentation) containing just the fields you normally need to load and use that where you might otherwise call defer(). This makes your code more explicit to the reader, is slightly faster and consumes a little less memory in the Python process.