Convert the value of a field in a django RawQueryset to a different django field type - django

I have a rather complex query that's generating a Django RawQuerySet. This specific query returns some fields that aren't part of the model that the RawQuerySet is based on, so I'm using .annotate(field_name=models.Value('field_name')) to attach it as an attribute to individual records in the RawQuerySet. The most important custom field is actually a uuid, which I use to compose URLs using Django's {% url %} functionality.
Here's the problem: I'm not using standard UUIDs inside my app, I'm using SmallUUIDs (compressed UUIDs). These are stored in the database as native uuidfields, then converted to shorter strings in Python. So I need to somehow convert the uuid returned as part of the RawQuerySet to a SmallUUID for use inside a template to generate a URL.
My code looks somewhat like this:
query = """
    SELECT othertable.uuid_field AS my_uuid
    FROM myapp_mymodel
    JOIN othertable ON myapp_mymodel.x = othertable.x
"""
MyModel.objects.annotate(
    my_uuid=models.Value('my_uuid'),
).raw(query)
Now there is a logical solution here: models.Value takes an optional kwarg called output_field, which makes the code look like this:
MyModel.objects.annotate(
    my_uuid=models.Value('my_uuid', output_field=SmallUUIDField()),
).raw(query)
But it doesn't work! That kwarg is completely ignored, and the type of the attribute is based on the type returned from the database and not on what's in output_field. In my case, I'm getting a uuid output because Postgres is returning a UUID type, but if I were to change the query to SELECT cast(othertable.uuid_field as text) as my_uuid I'd get the attribute as a string. It appears that Django (at least version 1.11.12) doesn't actually care what is in that kwarg in this instance.
So here's what I'm thinking are my potential solutions, in no particular order:
Change the way the query is formatted somehow (either in Django or in the SQL)
Change the resulting RawQuerySet in some way before it's passed to the view
Change something inside the templates to convert the UUID to a smalluuid for use in the URL reverse process.
What are my best next steps here?

A couple of issues with your current approach:
1. Value() isn't doing what you think it is - your annotation is literally just annotating each row with the value "my_uuid" because that is what you have passed to it. It isn't looking up the field of that name (to do that you need to use F expressions).
2. Point 1 above doesn't matter anyway because as soon as you use raw() the annotation is ignored - which is why you see no effect coming from it.
Bottom line is that trying to annotate a RawQuerySet isn't going to be easy. There is a translations argument that it accepts, but I can't think of a way to get that to work with the type of join you are using.
The next best suggestion that I can think of is that you just manually convert the field into a SmallUUID object when you need it - something like this:
from smalluuid import SmallUUID

objects = MyModel.objects.raw(query)
for o in objects:
    # Take the hex string obtained from the database and convert it to a SmallUUID object.
    # If your database has a built-in UUID type you will need to do
    # SmallUUID(small=o.my_uuid) instead.
    my_uuid = SmallUUID(hex=o.my_uuid)
(I'm doing this in a loop just to illustrate - depending on where you need this you can do it in a template tag or view).
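As a minimal sketch of the "convert it in the view" option: the helper below normalises whatever the database driver returned (a uuid.UUID from a native UUID column, or a hex string) and attaches a short form to each row before it reaches the template. The names to_short_uuid, attach_short_uuids and my_short_uuid are hypothetical, and the URL-safe base64 encoding is only a standard-library stand-in for SmallUUID; in the real app the SmallUUID conversion would go in its place. SimpleNamespace objects stand in for the rows of the RawQuerySet.

```python
import base64
import uuid
from types import SimpleNamespace


def to_short_uuid(value):
    """Return a 22-character URL-safe form of a UUID.

    Stand-in for SmallUUID (assumption): accepts a uuid.UUID or a hex string.
    """
    u = value if isinstance(value, uuid.UUID) else uuid.UUID(hex=str(value))
    # 16 bytes encode to 24 base64 characters, the last two being '=' padding.
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")


def attach_short_uuids(rows, source="my_uuid", target="my_short_uuid"):
    """Materialise the rows and add the converted value as a new attribute."""
    rows = list(rows)
    for row in rows:
        setattr(row, target, to_short_uuid(getattr(row, source)))
    return rows


# Stand-ins for the rows of MyModel.objects.raw(query).
rows = attach_short_uuids([
    SimpleNamespace(my_uuid=uuid.UUID("12345678-1234-5678-1234-567812345678")),
    SimpleNamespace(my_uuid="12345678123456781234567812345678"),
])
```

Because the list is built in the view, the template only has to read the extra attribute (here my_short_uuid) when reversing the URL.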

Related

Django ORM Cast() returning double quoted string from JSON field

I need to annotate a value that is saved in a json field in the same model. (Not the smartest but it is what it is).
I am annotating the value as such:
class SomeModel(BaseModel):
    reference_numbers = JSONField(blank=True, null=True)

SomeModel.objects.annotate(
    reference=Cast(
        F("reference_numbers__some_id"),
        output_field=models.CharField(),
    )
)
I need it to be cast to text/char in the query because a subsequent search will only work on text/char (trigram similarity).
It works, sort of, but the result adds an extra quote to my string. Like so:
queryset[0].reference -> '"666999"'
Any ideas on how to get the correct string from the query?
I've also tried using just an ExpressionWrapper with the output field, but since it doesn't cast the type in the SQL, the code breaks afterwards when trying to execute the search because it still uses the jsonb field.
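The extra quotes are what you get when a JSON string is serialised to text rather than extracted as text. The plain-Python analogy below uses only the standard json module (no Django) to show the difference. On PostgreSQL the corresponding distinction is between `->` followed by a cast and the `->>` operator, and Django versions that ship django.db.models.fields.json include a KeyTextTransform that may be worth trying for this. The variable names are illustrative.

```python
import json

# Stand-in for the JSON column of one row.
reference_numbers = {"some_id": "666999"}

# Serialising the JSON value to text keeps the JSON string quotes,
# which is the shape of the result seen above.
serialised = json.dumps(reference_numbers["some_id"])

# Extracting the value as text yields the bare string.
extracted = reference_numbers["some_id"]

# If the quoted form has already been fetched, decoding it removes the quotes.
decoded = json.loads(serialised)
```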

How to create custom db model function in Django like `Greatest`

I have a scenario where I want the greatest value together with the name of the field it came from. I can get the greatest value using the Greatest db function which Django provides, but I am not able to get its field name. For example:
from django.db.models.functions import Greatest

emps = Employee.objects.annotate(my_max_value=Greatest('date_time_field_1', 'date_time_field_2'))
for e in emps:
    print(e.my_max_value)
Here I get the value using e.my_max_value, but I am unable to find out the name of the field that value came from.
You have to annotate a Conditional Expression using Case() and When().
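from django.db.models import Case, CharField, F, Value, When

emps = Employee.objects.annotate(
    greatest_field=Case(
        # Wrap the names in Value(): a bare string passed to then= or default=
        # is treated as a field reference, not as a literal.
        When(datetime_field_1__gt=F("datetime_field_2"),
             then=Value("datetime_field_1")),
        When(datetime_field_2__gt=F("datetime_field_1"),
             then=Value("datetime_field_2")),
        default=Value("equal"),
        output_field=CharField(),
    )
)
for e in emps:
    print(e.greatest_field)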
If you want the database query to tell you which of the fields was larger, you'll need to add another annotated column, using case/when logic to return one field name or the other. (See https://docs.djangoproject.com/en/4.0/ref/models/conditional-expressions/#when)
Unless you're really trying to offload work onto the database, it'll be much simpler to do the comparison work in Python.
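The in-Python comparison can be sketched as follows. greatest_field is a hypothetical helper, the field names are borrowed from the answer above, and a SimpleNamespace stands in for an Employee instance, so nothing here touches the database.

```python
from datetime import datetime
from types import SimpleNamespace


def greatest_field(obj, field_names):
    """Return (name, value) for the field with the largest non-null value.

    Ties go to the first name listed; returns (None, None) if every field is None.
    """
    candidates = [(name, getattr(obj, name)) for name in field_names]
    candidates = [(name, value) for name, value in candidates if value is not None]
    if not candidates:
        return None, None
    return max(candidates, key=lambda pair: pair[1])


# Stand-in for an Employee instance.
emp = SimpleNamespace(
    datetime_field_1=datetime(2022, 1, 1, 9, 0),
    datetime_field_2=datetime(2022, 3, 15, 17, 30),
)
name, value = greatest_field(emp, ["datetime_field_1", "datetime_field_2"])
```

Unlike the Case/When version, this handles any number of fields without adding a branch per pair, at the cost of doing the work after the rows have been fetched.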

Query Django's HStoreField values using LIKE

I have a model with some HStoreField attributes and I can't seem to use Django's ORM HStoreField to query those values using LIKE.
When doing Model.objects.filter(hstoreattr__values__contains=['text']), the queryset only contains rows in which hstoreattr has any value that matches text exactly.
What I'm looking for is a way to search by, say, te instead of text and those same rows be returned as well. I'm aware this is possible in a raw PostgreSQL query but I'm looking for a solution that uses Django ORM.
If you want to check whether the value of a particular key contains 'te', you can do:
Model.objects.filter(hstoreattr__your_key__icontains='te')
If you want to check if any key in your hstore field contains 'te', you will need to create your own lookup in django, because by default django won't do such thing. Refer to custom lookups in django docs for more info.
As far as I can remember, you cannot filter across all values. To filter on a value, you have to name the key that the value belongs to. If you want the match to be case insensitive, use __icontains.
Although you cannot filter by all values, you can filter by all keys, just as you showed in your code.
If you want to search for 'text' in all objects under a key named, let's say, 'fo', just do something like this:
Model.objects.filter(hstoreattr__icontains={'fo': 'text'})

Django Array contains a field

I am using Django with mongoengine. I have a model Classes with an inscriptions list, and I want to get the docs that have a given id in that list.
classes = Classes.objects.filter(inscriptions__contains=request.data['inscription'])
Here's a general explanation of querying ArrayField membership:
Per the Django ArrayField docs, the __contains operator checks if a provided array is a subset of the values in the ArrayField.
So, to filter on whether an ArrayField contains the value "foo", you pass in a length 1 array containing the value you're looking for, like this:
# matches rows where myarrayfield is something like ['foo','bar']
Customer.objects.filter(myarrayfield__contains=['foo'])
The Django ORM produces the @> postgres operator, as you can see by printing the query:
print(Customer.objects.filter(myarrayfield__contains=['foo']).only('pk').query)
>>> SELECT "website_customer"."id" FROM "website_customer" WHERE "website_customer"."myarrayfield_" @> ['foo']::varchar(100)[]
If you provide something other than an array, you'll get a cryptic error like DataError: malformed array literal: "foo" DETAIL: Array value must start with "{" or dimension information.
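The subset semantics described above can be mimicked in plain Python. This is only an illustration of the rule, not how Django or PostgreSQL implement it; array_contains is a hypothetical helper and the rows list stands in for stored array values.

```python
def array_contains(stored, wanted):
    """Mimic ArrayField's __contains: every element of `wanted` must be in `stored`."""
    if isinstance(wanted, (str, bytes)):
        # Mirrors the point above: the lookup expects a list, not a bare value.
        raise TypeError("pass a list such as ['foo'], not a bare string")
    return set(wanted).issubset(stored)


# Stand-ins for the array stored on each row.
rows = [["foo", "bar"], ["bar"], ["foo"]]
matches = [row for row in rows if array_contains(row, ["foo"])]
```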
Perhaps I'm missing something...but it seems that you should be using .filter():
classes = Classes.objects.filter(inscriptions__contains=request.data['inscription'])
This answer is in reference to your comment on rnevius's answer.
In the Django ORM, a database call generally returns a QuerySet. Some methods return something else instead, such as a model instance for get() or a number for count().
The result of a method that returns a QuerySet can be refined further, for example with order_by() or distinct(). QuerySets are lazy, which means they hit the database only when they are actually used, not when they are assigned. You can find more information about them here.
Methods that do not return a QuerySet cannot be chained in this way.
Take the time to go through the QuerySet documentation, which provides more in-depth explanations with examples. Understanding this behavior helps make your application more efficient.

Django GROUP BY including unnecessary columns?

I have Django code as follows
qs = Result.objects.only('time')
qs = qs.filter(organisation_id=1)
qs = qs.annotate(Count('id'))
And it gets translated into the following SQL:
SELECT "myapp_result"."id", "myapp_result"."time", COUNT("myapp_result"."id") AS "id__count" FROM "myapp_result" WHERE "myapp_result"."organisation_id" = 1 GROUP BY "myapp_result"."id", "myapp_result"."organisation_id", "myapp_result"."subject_id", "myapp_result"."device_id", "myapp_result"."time", "myapp_result"."tester_id", "myapp_result"."data"
As you can see, the GROUP BY clause starts with the field I intended (id) but then it goes on to list all the other fields as well. Is there any way I can persuade Django not to specify all the individual fields like this?
As you can see, even with .only('time') that doesn't stop Django from listing all the other fields anyway, but only in this GROUP BY clause.
The reason I want to do this is to avoid the issue described here where PostgreSQL doesn't support annotation when there's a JSON field involved. I don't want to drop native JSON support (so I'm not actually using django-jsonfield). The query works just fine if I manually issue it without the reference to "myapp_result"."data" (the only JSON field on the model). So if I could just persuade Django not to refer to it, I'd be fine!
only() only defers the loading of certain fields, i.e. it allows for lazy loading of big or unused fields. It should generally not be used unless you know exactly what you're doing and why you need it, as it is nothing more than a performance booster that often decreases performance when used improperly.
What you're looking for is values() (or values_list()), which actually excludes certain fields instead of just lazy loading. This will return a dictionary (or list) instead of a model instance, but this is the only way to tell Django to not take other fields into account:
from django.db.models import Count

qs = (Result.objects.filter(organisation_id=1)
      .values('time').annotate(Count('id')))
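To see the shape of the grouping this is aiming for, here is a small sketch using sqlite3 from the standard library rather than Django or PostgreSQL. The table is a cut-down, assumed version of myapp_result, and the SQL is hand-written for illustration; it is not the statement Django generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myapp_result ("
    "id INTEGER PRIMARY KEY, organisation_id INTEGER, time TEXT, data TEXT)"
)
conn.executemany(
    "INSERT INTO myapp_result (organisation_id, time, data) VALUES (?, ?, ?)",
    [
        (1, "2020-01-01", "{}"),
        (1, "2020-01-01", "{}"),
        (1, "2020-01-02", "{}"),
        (2, "2020-01-01", "{}"),
    ],
)

# Only "time" is selected and grouped on; the "data" column is never mentioned.
grouped = conn.execute(
    "SELECT time, COUNT(id) AS id__count FROM myapp_result "
    "WHERE organisation_id = 1 GROUP BY time ORDER BY time"
).fetchall()
conn.close()
```

Each result row is a (time, count) pair rather than a model instance, which matches the dictionary-style rows that values() returns.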