Django Query, Distinct and Order_By combination not working - django

There are similar questions here but I haven't been able to find one that helps me.
I have two models, Chat and Post.
There are multiple chats, and each chat has multiple posts attached to it.
I'm trying to get the latest post for each chat.
Post.objects.order_by('-id').distinct('Chat')
Order the posts by ID (so the newest post is first), and then grab the distinct ones based on the chats.
But since order_by and distinct don't match, I'm getting the error:
SELECT DISTINCT ON expressions must match initial ORDER BY expressions
So how exactly do I go about doing this? Raw SQL? Thanks!

If you use distinct() on a related model field, the ordering must start with that same field:
Post.objects.order_by('chat', '-id').distinct('chat')
Also you can look at this question
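To see why the ordering has to start with the distinct field, here is a minimal plain-Python sketch of what ORDER BY chat, id DESC followed by DISTINCT ON (chat) effectively does: sort the rows, then keep the first row of each group. The posts list is made-up sample data, not the asker's actual models.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows: (post_id, chat_id)
posts = [(1, "a"), (2, "b"), (3, "a"), (4, "c"), (5, "b")]

# ORDER BY chat ASC, id DESC
ordered = sorted(posts, key=lambda p: (p[1], -p[0]))

# DISTINCT ON (chat): keep the first row of each run of equal chat values
latest_per_chat = [next(rows) for _, rows in groupby(ordered, key=itemgetter(1))]

print(latest_per_chat)
```

If the rows were sorted by id alone, posts of the same chat would not sit next to each other, so "the first row per chat" would not be well defined. That is roughly the situation the PostgreSQL error guards against.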

Related

Cannot use distinct w/ a different Order by using Django ORM

I have seen a few posts that reference this issue on Stack Overflow, but I cannot figure out how to meet the business need if the query has to change.
I am trying to get the last 10 contacts that have sent a message to the organization
messages = (Message.objects.order_by('-created_at').distinct('contact_id'))
However, I get this error:
SELECT DISTINCT ON expressions must match initial ORDER BY expressions
LINE 1: SELECT DISTINCT ON ("messages"."contact_id") "messages"."id"...
I see that the distinct column must match the order by column, but a distinct on created_at isn't what the business needs to solve the problem.
Any suggestions?
The field in distinct() must be included and must be the first one in order_by(), in your case contact_id:
Message.objects.order_by('contact_id', '-created_at').distinct('contact_id')
Docs: https://docs.djangoproject.com/en/dev/ref/models/querysets/#distinct
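Note that the query above yields one row per contact, sorted by contact_id, so getting "the last 10 contacts" still needs a second sort by recency and a limit. The sketch below shows that idea with the standard-library sqlite3 module, using GROUP BY with MAX instead of DISTINCT ON (which SQLite does not support). The table and column names mirror the question; the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, contact_id INTEGER, created_at TEXT)"
)
# Hypothetical sample data: contact 1 wrote twice, most recently of all
conn.executemany(
    "INSERT INTO messages (contact_id, created_at) VALUES (?, ?)",
    [(1, "2020-01-01"), (2, "2020-01-05"), (1, "2020-01-09"), (3, "2020-01-03")],
)

# One row per contact, then sort by that contact's latest message and cut at 10
recent_contacts = conn.execute(
    """
    SELECT contact_id, MAX(created_at) AS last_at
    FROM messages
    GROUP BY contact_id
    ORDER BY last_at DESC
    LIMIT 10
    """
).fetchall()

print(recent_contacts)
```

In the ORM, a comparable result could probably be reached by using the DISTINCT ON queryset as a subquery and ordering the outer query by -created_at, but that variant is not tested here.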

Obtain the last value from every sensor on my django model

I am working with Django and I am a bit lost on how to extract information from models (tables).
I have a table containing different information from various sensors. What I would like to know is whether it is possible, using the Django models, to obtain the last row of data for each sensor (each sensor has an identifier), based on the timestamp column.
In SQL it would be something like this (the query is probably not correct, but I think you can understand what I'm trying to do):
SELECT sensorID,timestamp,sensorField1,sensorField2
FROM sensorTable
GROUP BY sensorID
ORDER BY max(timestamp);
I have seen that the group_by() function exists and also latest(), but I don't get anything coherent and I'm also not sure whether I'm choosing the best approach.
Can anyone help me get started with this topic? I imagine it is very easy but it is a new world and it is difficult to start.
Greetings!
When you use a PostgreSQL database, you can use the queryset's .distinct(..) method and pass it the fields that determine on what the rows should be distinct.
So you can obtain the latest row for each sensor in Django with:
SensorModel.objects.order_by('sensor', '-timestamp').distinct('sensor')
We thus order by sensor (which is required for a .distinct(..)), and then in case of a tie (so two times the same sensor), we order on the timestamp in descending order, hence we pick the latest SensorModel object for that sensor.
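Since .distinct(..) with field names only works on PostgreSQL, here is a hedged sketch of a more portable SQL formulation of "latest row per sensor": join the table to a subquery that holds the maximum timestamp per sensor. It uses the standard-library sqlite3 module; the table and column names follow the question, while the integer timestamps and readings are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sensorTable (sensorID INTEGER, timestamp INTEGER, sensorField1 REAL)"
)
# Hypothetical readings: two per sensor, inserted out of order on purpose
conn.executemany(
    "INSERT INTO sensorTable VALUES (?, ?, ?)",
    [(1, 100, 20.5), (1, 200, 21.0), (2, 150, 18.2), (2, 120, 18.0)],
)

# Keep only the rows whose timestamp equals the per-sensor maximum
latest_rows = conn.execute(
    """
    SELECT t.sensorID, t.timestamp, t.sensorField1
    FROM sensorTable AS t
    JOIN (
        SELECT sensorID, MAX(timestamp) AS max_ts
        FROM sensorTable
        GROUP BY sensorID
    ) AS m ON m.sensorID = t.sensorID AND m.max_ts = t.timestamp
    ORDER BY t.sensorID
    """
).fetchall()

print(latest_rows)
```

If two rows of one sensor share the same maximum timestamp, this formulation returns both, whereas DISTINCT ON keeps exactly one.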

How to write the query for this requirement?

I have several hundred thousand svn commit records in my Django database; each record stores the related info of one commit (like BugID, LinesChanged, SubmitWeek ...).
I want to summarize each field of the records and create a report grouped by the SubmitWeek field, like the following:
Currently I iterate over the records and process the relevant field values myself. I want to know if there is a more succinct way to define the query and extract the summary. Many thanks.
Your question is a bit vague.
If you are looking for a way to make your queries more specific, so that Django does more joins and fewer separate queries, have a look at:
values() and values_list() of the QuerySet
If you want to make Django fetch related objects at once and not in separate queries, have a look at:
prefetch_related() and select_related()
If you want to update data more efficiently, have a look at:
F() https://docs.djangoproject.com/en/1.9/ref/models/expressions/#django.db.models.F
After referring to the manual, I used the following statements and they seem to work well. Thanks Risadinha anyway :)
# Sum the LinesChanged value over all records
SVN_Commit.objects.filter(my filter).aggregate(Sum('LinesChanged'))
# Get the unique SubmitWeek List
SVN_Commit.objects.filter(my filter).values_list('SubmitWeek', flat=True).order_by('SubmitWeek').distinct()
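For the per-week report itself, the grouping can be pushed into the database instead of iterating in Python. The sketch below shows the underlying GROUP BY with the standard-library sqlite3 module; in the ORM the counterpart would be along the lines of .values('SubmitWeek').annotate(Sum('LinesChanged')). The svn_commit table name, the week labels and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE svn_commit (SubmitWeek TEXT, BugID INTEGER, LinesChanged INTEGER)"
)
# Hypothetical commit records
conn.executemany(
    "INSERT INTO svn_commit VALUES (?, ?, ?)",
    [
        ("2016-W01", 101, 40),
        ("2016-W01", 102, 15),
        ("2016-W02", 101, 7),
        ("2016-W02", 103, 30),
        ("2016-W02", 104, 3),
    ],
)

# One summary row per week: commit count, total lines changed, distinct bugs
weekly_report = conn.execute(
    """
    SELECT SubmitWeek,
           COUNT(*) AS commits,
           SUM(LinesChanged) AS lines_changed,
           COUNT(DISTINCT BugID) AS bugs
    FROM svn_commit
    GROUP BY SubmitWeek
    ORDER BY SubmitWeek
    """
).fetchall()

for row in weekly_report:
    print(row)
```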

Musicbrainz querying artist and release

I am trying to get an artist and their albums. After reading this page https://musicbrainz.org/doc/Development/XML_Web_Service/Version_2 I created the following query to get Michael Jackson's albums:
http://musicbrainz.org/ws/2/artist/?query=artist:michael%20jackson?inc=releases+recordings
My understanding is that adding ?inc=releases+recordings at the end of the URL should return Michael Jackson's albums. However, this doesn't seem to return the correct results, or I can't seem to narrow down the results. I then thought of using the {MBID}, but that isn't returned by the artist query either (which is why I'm trying to use inc in my query):
http://musicbrainz.org/ws/2/artist/?query=artist:michael%20jackson
Can anyone suggest where I'm going wrong with this?
You're not searching for the correct entity. What you want is the discography, not the artist's information. Additionally, the query field syntax is not correct (you must use Lucene search syntax).
Here is what you're looking for:
http://musicbrainz.org/ws/2/release-group/?query=artist:"michael jackson" AND primarytype:"album"
We're targeting the release-group entity to get the albums, searching for a specific artist and filtering the results to limit them to albums. (accepted values are: album, single, ep, other)
There are more options to fit your needs; for example, you can filter the type of albums using the secondarytype field. Here is the query to retrieve only live albums:
http://musicbrainz.org/ws/2/release-group/?query=artist:"michael jackson" AND primarytype:"album" AND secondarytype:"live"
Here is the doc:
https://musicbrainz.org/doc/Development/XML_Web_Service/Version_2/Search
Note that to be able to use MB's API you need to understand how it is structured, especially, the relations between release_group, release and medium.
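The search URLs above contain spaces and quotes, which need to be percent-encoded when the request is made from code. The following is a small sketch that assembles such a release-group search URL with the standard library; release_group_search_url and its parameters are hypothetical helper names, and no request is actually sent.

```python
from urllib.parse import urlencode

BASE_URL = "https://musicbrainz.org/ws/2/release-group/"


def release_group_search_url(artist, primary_type="album", secondary_type=None):
    """Build a release-group search URL from Lucene-style field:"value" terms."""
    terms = [f'artist:"{artist}"', f'primarytype:"{primary_type}"']
    if secondary_type:
        terms.append(f'secondarytype:"{secondary_type}"')
    # Join the terms with AND and let urlencode escape spaces, colons and quotes
    return BASE_URL + "?" + urlencode({"query": " AND ".join(terms)})


url = release_group_search_url("michael jackson", secondary_type="live")
print(url)
```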

Filter on a list of tags

I'm trying to select all the songs in my Django database whose tag is any of those in a given list. There is a Song model, a Tag model, and a SongTag model (for the many to many relationship).
This is my attempt:
taglist = ["cool", "great"]
tags = Tag.objects.filter(name__in=taglist).values_list('id', flat=True)
song_tags = SongTag.objects.filter(tag__in=list(tags))
At this point I'm getting an error:
DatabaseError: MultiQuery does not support keys_only.
What am I getting wrong? If you can suggest a completely different approach to the problem, it would be more than welcome too!
EDIT: I should have mentioned I'm using Django on Google AppEngine with django-nonrel
You shouldn't use an m2m relationship with AppEngine. NoSQL databases (and BigTable is one of them) generally don't support JOINs, and the programmer is expected to denormalize the data structure. This is a deliberate design decision: while your database will contain redundant data, your read queries will be much simpler (no need to combine data from 3 tables), which in turn makes the design of the DB server much simpler as well (this is done for the sake of optimization and scaling).
In your case you should probably get rid of the Tag and SongTag models and just store the tag in the Song model as a string. I am assuming that the Tag model only contains id and name; if Tag in fact contains more data, you should keep the Tag model, and the Song model should then contain both tag_id and tag_name. The idea, as explained above, is to introduce redundancy for the sake of simpler queries.
Please, please let the ORM build the query for you:
song_tags = SongTag.objects.filter(tag__name__in=taglist)
You should try to use only one query, so that Django also generates only one query using a join.
Something like this should work:
Song.objects.filter(tags__name__in=taglist)
You may need to change some names from this example (most likely the tags in tags__name__in), see https://docs.djangoproject.com/en/1.3/ref/models/relations/.
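On a relational backend, a single query across the many-to-many table boils down to a join, and a song that carries several of the requested tags comes back once per matching tag; in Django, .distinct() is commonly appended for that reason. The sketch below illustrates the effect with the standard-library sqlite3 module. The table layout and sample songs are assumptions for illustration, and it does not apply to the django-nonrel setup from the question, where joins are unavailable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE song (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE songtag (song_id INTEGER, tag_id INTEGER);
    INSERT INTO song VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO tag VALUES (1, 'cool'), (2, 'great'), (3, 'dull');
    INSERT INTO songtag VALUES (1, 1), (1, 2), (2, 2), (3, 3);
    """
)

taglist = ["cool", "great"]
placeholders = ", ".join("?" for _ in taglist)
join_sql = f"""
    FROM song
    JOIN songtag ON songtag.song_id = song.id
    JOIN tag ON tag.id = songtag.tag_id
    WHERE tag.name IN ({placeholders})
    ORDER BY song.id
"""

# Plain join: 'Alpha' matches both tags and therefore appears twice
with_duplicates = conn.execute("SELECT song.id, song.title" + join_sql, taglist).fetchall()
# DISTINCT collapses the repeated rows
unique_songs = conn.execute("SELECT DISTINCT song.id, song.title" + join_sql, taglist).fetchall()

print(with_duplicates)
print(unique_songs)
```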