Is there a way to query an object, 'extract' a nested piece of data from a JSONField field and then make it available as a custom, temporary field on each instance of the Queryset?
In my use case, I'm storing overflow metadata from Twitter's API in a data field for later use. I'd like to be able to access the nested field followers_count within TwitterPost.data.
I've read the docs about how to filter based on nested values but not how to extract it as a temporary field when generating a queryset.
Similarly, I've read the annotate docs for ways to create a custom temporary field, but the examples all use aggregation functions on simple fields, not on JSONFields.
Thanks in advance for any suggestions.
Example model:
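from django.contrib.postgres.fields import JSONField
from django.db import models

class TwitterPost(models.Model):
    # a field named "id" must be declared as the primary key
    id = models.IntegerField(primary_key=True)
    data = JSONField()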
Example JSON value for the data field:
{
    "followers_count": 7172,
    "default_profile_image": false,
    "profile_text_color": "000000"
}
Pseudocode for what I'd like to be able to do:
TwitterPost.objects.annotate(followers_count=instance.data.followers_count)
This is probably a late answer, but there is a way to do it
from django.contrib.postgres.fields.jsonb import KeyTransform
TwitterPost.objects.annotate(followers_count=KeyTransform('followers_count', 'data'))
Alternatively, KeyTextTransform can be used instead of KeyTransform if you want the value converted to a string.
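To make the difference between the two transforms a bit more concrete: in PostgreSQL, KeyTransform corresponds to the -> operator (the result stays JSON) and KeyTextTransform to ->> (the result is text). The sketch below is not Django code. It uses the standard-library sqlite3 module and SQLite's json_extract function as a stand-in, assuming the bundled SQLite has JSON support, just to show the idea of extracting a key into an extra per-row column. The table name and the followers_text alias are made up for the illustration.

```python
import json
import sqlite3

# In-memory database with one row holding a JSON document as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE twitterpost (id INTEGER PRIMARY KEY, data TEXT)")
conn.execute(
    "INSERT INTO twitterpost (id, data) VALUES (?, ?)",
    (1, json.dumps({"followers_count": 7172, "profile_text_color": "000000"})),
)

# Each row gets two extra columns: the extracted value with its JSON type
# (an integer here) and the same value cast to text.
rows = conn.execute(
    """
    SELECT id,
           json_extract(data, '$.followers_count') AS followers_count,
           CAST(json_extract(data, '$.followers_count') AS TEXT) AS followers_text
    FROM twitterpost
    """
).fetchall()
print(rows)
```

The typed column comes back as a Python int and the cast column as a str, which is roughly the distinction you choose between when picking one transform or the other.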
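If you want to access the data inside a JSONField, you have to use the double-underscore (__) syntax. The key path needs to be wrapped in an expression such as F(), because instance is not defined inside a query. Older Django versions may not resolve JSON keys through F(); in that case KeyTransform is the alternative. In your example it would be something like this:
from django.db.models import F
TwitterPost.objects.annotate(followers_count=F('data__followers_count'))
Take a look at the documentation here.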
I am trying to load some data into DataTables. I am trying to restrict the columns in the model.objects query by using .only(). At first glance at the resulting QuerySet, it does in fact look like the MySQL query is only asking for those columns.
However, when I try to pass the QuerySet into Paginator and/or a serializer, the result has ALL columns in it.
I cannot use .values_list() because that does not return the nested objects that I need to have serialized as part of my specific column request. I am not sure what is happening to my .only().
db_result_object = model.objects.prefetch_related().filter(qs).order_by(asc+sort_by).only(*columns_to_return)
paginated_results = Paginator(db_result_object,results_per_page)
serialized_results = serializer(paginated_results.object_list,many=True)
paginated_results.object_list = serialized_results.data
return paginated_results
This one has tripped me up too. In Django, calling only() doesn't give you bare rows the way a plain SQL column selection like this would:
SELECT col_to_return_1, ... col_to_return_n
FROM appname_model
The reason is that only() still returns full model instances rather than bare rows, and QuerySets are lazy: Django hits the database not when you construct the QuerySet, but when you first access data from it (see lazy QuerySets).
In the case of only() (a specific use of what are called deferred fields), every field is still available on each instance as usual, but only the fields named in only() are loaded by the initial query. The remaining fields are deferred: accessing one of them triggers an extra query to fetch it. A serializer that reads every field will therefore end up returning all columns. Some useful docs here.
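Here is a toy model of deferred loading, in case it helps to see the mechanism. It is not Django's implementation: it uses the standard-library sqlite3 module, and DeferredRow, only() and run() are hypothetical helpers written for this sketch. The point is only that a column left out of the first query is fetched by a second query the moment something reads it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.execute("INSERT INTO post VALUES (1, 'hello', 'long body text')")

queries = []  # every SQL statement issued by the toy helpers below


def run(sql, params=()):
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


class DeferredRow:
    """Toy row: some columns loaded up front, the rest on first access."""

    def __init__(self, pk, loaded):
        self.pk = pk
        self._loaded = dict(loaded)

    def __getattr__(self, name):
        # Reached only when normal attribute lookup fails, i.e. for columns.
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._loaded:
            # Deferred column: one extra query, then cache the value.
            # (Toy only: the column name is interpolated directly.)
            value = run(f"SELECT {name} FROM post WHERE id = ?", (self.pk,))[0][0]
            self._loaded[name] = value
        return self._loaded[name]


def only(*columns):
    # The initial query selects the primary key plus the requested columns.
    cols = ", ".join(("id",) + columns)
    return [
        DeferredRow(row[0], zip(columns, row[1:]))
        for row in run(f"SELECT {cols} FROM post")
    ]


post = only("title")[0]
print(post.title, len(queries))  # title was part of the first query
print(post.body, len(queries))   # reading body issued a second query
```

A serializer that walks over every attribute behaves like the second print: each untouched column costs another round trip, and the output ends up with all columns anyway.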
My recommendation would be to write your serializer so that it only takes care of the one specific field, likely using a SerializerMethodField with another serializer to serialize your related fields.
We all know that retrieving data from the database gives it back as a QuerySet. My question is: how can I retrieve the data without the "QuerySet" name wrapped around it?
I may not have explained that clearly, so here is an example of what I mean:
AnyObjects.objects.all().values()
This line returns the data like so:
<QuerySet [{'key': 'value'}]>
Notice the name on the left side of the returned data, "QuerySet". I need to remove that name so that the data looks as follows:
[{'key': 'value'}]
In case you wonder why: in short, I want to build a pandas DataFrame, and to pass the data to DataFrame I need it in that layout.
Any help, please!
You don't have to change it from a QuerySet to anything else; pandas.DataFrame can take any iterable as data. So
df = pandas.DataFrame(djangoapp.models.Model.objects.all().values())
gives you the DataFrame you expect (though you may want to double-check df.dtypes: if there are None values in your data, a column may end up with object dtype).
You can use list(…) to convert it to a list of dictionaries:
list(AnyObjects.objects.values())
To obtain a JSON blob, you will need to serialize it with the json package, since the single-quoted representation that Python prints is not valid JSON:
import json
json.dumps(list(AnyObjects.objects.values()))
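A small self-contained sketch of both steps, without Django: FakeQuerySet is a hypothetical stand-in that only mimics the two relevant properties of a .values() QuerySet (it is iterable and prints with a wrapper name). The default=str argument is one simple fallback, assumed here, for values such as dates that json cannot encode on its own.

```python
import datetime
import json


class FakeQuerySet:
    """Hypothetical stand-in for a .values() QuerySet: iterable, custom repr."""

    def __init__(self, rows):
        self._rows = rows

    def __iter__(self):
        return iter(self._rows)

    def __repr__(self):
        return f"<QuerySet {self._rows!r}>"


qs = FakeQuerySet([{"key": "value", "created": datetime.date(2021, 5, 1)}])

# list() consumes the iterable and leaves a plain list of dicts.
rows = list(qs)

# json.dumps produces double-quoted, valid JSON; default=str converts
# anything json cannot encode natively (the date here) to its string form.
blob = json.dumps(rows, default=str)

print(repr(qs))
print(rows)
print(blob)
```

The wrapper name is only part of how the object prints; the data you iterate over is the same list of dicts either way.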
I want to modify the some_date_field value just for filtering purposes.
Something like using models.Lookup or models.Transform, but I don't want to write a raw SQL expression.
For instance, using a raw MS SQL expression I could write:
WHERE CONVERT(date, FORMAT(some_date_field, '2021MMdd')) >= @some_var
But how can I do that with Django?
from django.db import models

class SomeModel(models.Model):
    some_date_field = models.DateField()

def replace_year(value):
    return value.replace(year=2021)

SomeModel.objects.filter(
    # replace_year(some_date_field)__gte=some_var
)
Is it possible?
You can use SomeModel.objects.filter({whatever you want to filter}).update(some_date_field={date_value})
If you have any issues, see:
https://docs.djangoproject.com/en/3.2/ref/models/querysets/#django.db.models.query.QuerySet.update
If you are trying to bulk update all of the objects returned by a queryset and you are using Django 2.2 or greater, you can use bulk_update.
See here: Django Bulk Update
If you are dynamically updating values based on another field, check out F expressions; they can be used with update() on querysets.
See here: Update dynamically with F expressions
One thing to note, though: update() does not call the model's save() method, so any logic you have inside it won't be triggered.
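The reason save() is skipped is that a queryset update() is carried out as a single SQL UPDATE statement rather than object by object, and an F expression makes the new value refer to a column instead of a Python value. As a rough illustration of that statement (plain sqlite3 from the standard library, not Django; the item table and views column are invented for the sketch):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, views INTEGER)")
conn.executemany("INSERT INTO item (views) VALUES (?)", [(1,), (5,), (10,)])

# One statement: the database computes the new value from the current
# column value for every row matching the WHERE clause.
cursor = conn.execute("UPDATE item SET views = views + 1 WHERE views >= 5")
updated = cursor.rowcount  # number of rows the statement changed

values = [v for (v,) in conn.execute("SELECT views FROM item ORDER BY id")]
print(updated, values)
```

No row is ever loaded into Python here, which is why there is no place for per-object logic to run.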
Take a look at these answers here as well
You can use the filter() and update() methods in Django.
Assume the known date to filter on is in the old_date variable and the new value is in the new_date variable:
# define a method to filter and update to the new date
def update_date(old_date, new_date):
    SomeModel.objects.filter(some_date_field=old_date).update(some_date_field=new_date)
You can find some examples using this link.
Hope this will be helpful for you.
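Note that the answers above change the stored values. If the goal is only to compare against a year-adjusted date without modifying any rows, as the question describes, the idea looks like this at the SQL level. This is a sketch with the standard-library sqlite3 module, not an ORM solution: SQLite's strftime stands in for the FORMAT call in the question, and the sample dates are made up. In Django the equivalent would have to be built from database function expressions or a custom Transform, which this sketch does not attempt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE somemodel (id INTEGER PRIMARY KEY, some_date_field TEXT)")
conn.executemany(
    "INSERT INTO somemodel (some_date_field) VALUES (?)",
    [("2019-03-10",), ("2018-11-25",), ("2020-07-04",)],
)

some_var = "2021-07-01"

# strftime rewrites each date with the year fixed to 2021 for the
# comparison only; the stored values are not modified.
matches = conn.execute(
    """
    SELECT some_date_field
    FROM somemodel
    WHERE strftime('2021-%m-%d', some_date_field) >= ?
    ORDER BY some_date_field
    """,
    (some_var,),
).fetchall()
print(matches)
```

Rows are selected by their month and day relative to some_var, and they keep their original years in the result.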
I have a model with some HStoreField attributes and I can't seem to use Django's ORM HStoreField to query those values using LIKE.
When doing Model.objects.filter(hstoreattr__values__contains=['text']), the queryset only contains rows in which hstoreattr has any value that matches text exactly.
What I'm looking for is a way to search by, say, te instead of text and those same rows be returned as well. I'm aware this is possible in a raw PostgreSQL query but I'm looking for a solution that uses Django ORM.
If you want to check, for every object, whether the value of a particular key contains 'te', you can do:
Model.objects.filter(hstoreattr__your_key__icontains='te')
If you want to check whether the value of any key in your hstore field contains 'te', you will need to create your own lookup in Django, because by default Django won't do such a thing. Refer to custom lookups in the Django docs for more info.
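To pin down what such a custom lookup would have to compute, here is the difference between the two behaviours in plain Python, with each hstore value treated as a dict of strings (values may be None). This only sketches the semantics; it is not a Django lookup, and both helper names are hypothetical.

```python
def has_exact_value(store, wanted):
    """True if some value equals `wanted` exactly (what the question observed)."""
    return wanted in store.values()


def any_value_icontains(store, fragment):
    """True if some value contains `fragment`, ignoring case (LIKE-style)."""
    fragment = fragment.lower()
    # hstore values can be NULL, which arrives in Python as None
    return any(fragment in (value or "").lower() for value in store.values())


rows = [
    {"a": "text"},
    {"a": "context"},
    {"a": "other", "b": None},
]

exact = [r for r in rows if has_exact_value(r, "text")]
partial = [r for r in rows if any_value_icontains(r, "te")]
print(exact)
print(partial)
```

The exact check keeps only the first row, while the substring check also keeps the row whose value is "context".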
As far as I can remember, you cannot do a substring filter across all values. If you want to filter on a value, you have to name the key you are referring to and pass the value. When you want the match to be case-insensitive, use __icontains.
Although you cannot filter by all values, you can filter by all keys, just like you showed in your code.
If you want to search for 'text' in all objects under a key named, let's say, 'fo', do something like this:
Model.objects.filter(hstoreattr__fo__icontains='text')
I am using Django with mongoengine. I have a model Classes with an inscriptions list, and I want to get the docs that have a given id in that list.
classes = Classes.objects.filter(inscriptions__contains=request.data['inscription'])
Here's a general explanation of querying ArrayField membership:
Per the Django ArrayField docs, the __contains operator checks if a provided array is a subset of the values in the ArrayField.
So, to filter on whether an ArrayField contains the value "foo", you pass in a length 1 array containing the value you're looking for, like this:
# matches rows where myarrayfield is something like ['foo','bar']
Customer.objects.filter(myarrayfield__contains=['foo'])
The Django ORM produces the @> PostgreSQL operator, as you can see by printing the query:
print(Customer.objects.filter(myarrayfield__contains=['foo']).only('pk').query)
SELECT "website_customer"."id" FROM "website_customer" WHERE "website_customer"."myarrayfield_" @> ['foo']::varchar(100)[]
If you provide something other than an array, you'll get a cryptic error like DataError: malformed array literal: "foo" DETAIL: Array value must start with "{" or dimension information.
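The subset rule can be written out in plain Python, which may make it easier to predict what a given __contains filter will match. This is only a model of the semantics, not Django or PostgreSQL code; array_contains is a hypothetical helper, and the TypeError it raises for a bare string is my own stand-in for the database error mentioned above.

```python
def array_contains(stored, provided):
    """Model of the subset test: every provided value must occur in stored."""
    if isinstance(provided, str):
        # A bare string is not an array; reject it instead of silently
        # treating it as a sequence of characters.
        raise TypeError("pass a list such as ['foo'], not a bare string")
    return set(provided) <= set(stored)


stored = ["foo", "bar"]
checks = {
    "one value": array_contains(stored, ["foo"]),
    "subset": array_contains(stored, ["bar", "foo"]),
    "not a subset": array_contains(stored, ["foo", "baz"]),
}
print(checks)
```

Order does not matter, and a single missing value is enough to make the whole test fail.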
Perhaps I'm missing something...but it seems that you should be using .filter():
classes = Classes.objects.filter(inscriptions__contains=request.data['inscription'])
This answer is in reference to your comment on rnevius's answer.
In the Django ORM, a database call generally returns either a QuerySet or something else, depending on the method you use: a model instance if you use get(), a number if you use count(), and so on.
The result of a method that returns a QuerySet can be refined further, for example by calling order_by() or collecting only distinct() rows. QuerySets are lazy, which means they hit the database only when they are actually used, not when they are assigned. You can find more information about them here.
Methods that don't return a QuerySet, on the other hand, cannot be refined this way.
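If it helps, here is a toy version of that behaviour in plain Python. LazyQuery and fetch_rows are hypothetical names invented for this sketch and have nothing to do with Django's real implementation. Chaining methods return a new lazy object without touching the data source, while iterating or calling count() actually runs the query.

```python
class LazyQuery:
    """Toy lazy query: chaining records steps, iteration executes them."""

    def __init__(self, source, steps=()):
        self._source = source
        self._steps = tuple(steps)

    def filter(self, predicate):
        # Returns another LazyQuery, so further chaining is possible.
        return LazyQuery(self._source, self._steps + (("filter", predicate),))

    def order_by(self, key):
        return LazyQuery(self._source, self._steps + (("order", key),))

    def __iter__(self):
        rows = self._source()  # the only place the "database" is hit
        for kind, fn in self._steps:
            if kind == "filter":
                rows = [r for r in rows if fn(r)]
            else:
                rows = sorted(rows, key=fn)
        return iter(rows)

    def count(self):
        # Returns a plain int, so nothing can be chained after it.
        return len(list(self))


calls = []  # records each time the data source is queried


def fetch_rows():
    calls.append("hit")
    return [3, 1, 2, 5]


q = LazyQuery(fetch_rows).filter(lambda n: n > 1).order_by(lambda n: n)
hits_before = len(calls)  # building the query has not fetched anything
result = list(q)          # evaluation happens here
total = q.count()         # evaluates again and returns a number
print(hits_before, result, total, len(calls))
```

Building q costs nothing, and each evaluation afterwards is a separate trip to the data source.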
Take the time to go through the QuerySet documentation; it provides more in-depth explanations with examples. Understanding this behavior is useful for making your application more efficient.