What is a Django QuerySet? - django

When I do this,
>>> b = Blog.objects.all()
>>> b
I get this:
>>>[<Blog: Blog Title>,<Blog: Blog Tile>]
When I query what type b is,
>>> type(b)
I get this:
>>> <class 'django.db.models.query.QuerySet'>
What does this mean? Is it a data type like dict, list, etc?
An example of how I can build data structure like a QuerySet will be appreciated.
I would want to know how Django builds that QuerySet (the gory details).

A django queryset is like its name says, basically a collection of (sql) queries, in your example above print(b.query) will show you the sql query generated from your django filter calls.
Since querysets are lazy, the database query isn't done immediately, but only when needed - when the queryset is evaluated. This happens for example if you call its __str__ method when you print it, if you would call list() on it, or, what happens mostly, you iterate over it (for post in b..). This lazyness should save you from doing unnecessary queries and also allows you to chain querysets and filters for example (you can filter a queryset as often as you want to).

Yes, it's just another type, built like every other type.

A QuerySet represents a collection of objects from your database. It can have zero, one or many filters. Filters narrow down the query results based on the given parameters. In SQL terms, a QuerySet equates to a SELECT statement, and a filter is a limiting clause such as WHERE or LIMIT.
https://docs.djangoproject.com/en/1.8/topics/db/queries/

A QuerySet is a list of objects of a given model, QuerySet allow you to read data from database

Related

Problem with .only() method, passing to Pagination / Serialization --- all fields are getting returned instead of the ones specified in only()

I am trying load some data into datatables. I am trying to specify columns in the model.objects query by using .only() --- at first glance at the resulting QuerySet, it does in fact look like the mySQL query is only asking for those columns.
However, When I try to pass the QuerySet into Paginator, and/or a Serializer, the result has ALL columns in it.
I cannot use .values_list() because that does not return the nested objects that I need to have serialized as part of my specific column ask. I am not sure what is happening to my .only()
db_result_object = model.objects.prefetch_related().filter(qs).order_by(asc+sort_by).only(*columns_to_return)
paginated_results = Paginator(db_result_object,results_per_page)
serialized_results = serializer(paginated_results.object_list,many=True)
paginated_results.object_list = serialized_results.data
return paginated_results
This one has tripped me up too. In Django, calling only() doesn't return data equivalent to a SQL statement like this:
SELECT col_to_return_1, ... col_to_return_n
FROM appname_model
The reason it doesn't do it like this is because Django returns data to you not when you construct the QuerySet, but when you first access data from that QuerySet (see lazy QuerySets).
In the case of only() (a specific example of what is called a deferred field) you still get all of the fields like you normally would, but the difference is that it isn't completely loaded in from the database immediately. When you access the data, it will only load the fields included in the only statement. Some useful docs here.
My recommendation would be to write your Serializer so that it is only taking care of the one specific filed, likely using a SerializerMethodField with another serializer to serialize your related fields.

How to get boolean result in annotate django?

I have a filter which should return a queryset with 2 objects, and should have one different field. for example:
obj_1 = (name='John', age='23', is_fielder=True)
obj_2 = (name='John', age='23', is_fielder=False)
Both the objects are of same model, but different primary key. I tried usign the below filter:
qs = Model.objects.filter(name='John', age='23').annotate(is_fielder=F('plays__outdoor_game_role')=='Fielder')
I used annotate first time, but it gave me the below error:
TypeError: QuerySet.annotate() received non-expression(s): False.
I am new to Django, so what am I doing wrong, and what should be the annotate to get the required objects as shown above?
The solution by #ktowen works well, quite straightforward.
Here is another solution I am using, hope it is helpful too.
queryset = queryset.annotate(is_fielder=ExpressionWrapper(
Q(plays__outdoor_game_role='Fielder'),
output_field=BooleanField(),
),)
Here are some explanations for those who are not familiar with Django ORM:
Annotate make a new column/field on the fly, in this case, is_fielder. This means you do not have a field named is_fielder in your model while you can use it like plays.outdor_game_role.is_fielder after you add this 'annotation'. Annotate is extremely useful and flexible, can be combined with almost every other expression, should be a MUST-KNOWN method in Django ORM.
ExpressionWrapper basically gives you space to wrap a more complecated combination of conditions, use in a format like ExpressionWrapper(expression, output_field). It is useful when you are combining different types of fields or want to specify an output type since Django cannot tell automatically.
Q object is a frequently used expression to specify a condition, I think the most powerful part is that it is possible to chain the conditions:
AND (&): filter(Q(condition1) & Q(condition2))
OR (|): filter(Q(condition1) | Q(condition2))
Negative(~): filter(~Q(condition))
It is possible to use Q with normal conditions like below:
(Q(condition1)|id__in=[list])
The point is Q object must come to the first or it will not work.
Case When(then) can be simply explained as if con1 elif con2 elif con3 .... It is quite powerful and personally, I love to use this to customize an ordering object for a queryset.
For example, you need to return a queryset of watch history items, and those must be in an order of watching by the user. You can do it with for loop to keep the order but this will generate plenty of similar queries. A more elegant way with Case When would be:
item_ids = [list]
ordering = Case(*[When(pk=pk, then=pos)
for pos, pk in enumerate(item_ids)])
watch_history = Item.objects.filter(id__in=item_ids)\
.order_by(ordering)
As you can see, by using Case When(then) it is possible to bind those very concrete relations, which could be considered as 1) a pinpoint/precise condition expression and 2) especially useful in a sequential multiple conditions case.
You can use Case/When with annotate
from django.db.models import Case, BooleanField, Value, When
Model.objects.filter(name='John', age='23').annotate(
is_fielder=Case(
When(plays__outdoor_game_role='Fielder', then=Value(True)),
default=Value(False),
output_field=BooleanField(),
),
)

Django Array contains a field

I am using Django, with mongoengine. I have a model Classes with an inscriptions list, And I want to get the docs that have an id in that list.
classes = Classes.objects.filter(inscriptions__contains=request.data['inscription'])
Here's a general explanation of querying ArrayField membership:
Per the Django ArrayField docs, the __contains operator checks if a provided array is a subset of the values in the ArrayField.
So, to filter on whether an ArrayField contains the value "foo", you pass in a length 1 array containing the value you're looking for, like this:
# matches rows where myarrayfield is something like ['foo','bar']
Customer.objects.filter(myarrayfield__contains=['foo'])
The Django ORM produces the #> postgres operator, as you can see by printing the query:
print Customer.objects.filter(myarrayfield__contains=['foo']).only('pk').query
>>> SELECT "website_customer"."id" FROM "website_customer" WHERE "website_customer"."myarrayfield_" #> ['foo']::varchar(100)[]
If you provide something other than an array, you'll get a cryptic error like DataError: malformed array literal: "foo" DETAIL: Array value must start with "{" or dimension information.
Perhaps I'm missing something...but it seems that you should be using .filter():
classes = Classes.objects.filter(inscriptions__contains=request.data['inscription'])
This answer is in reference to your comment for rnevius answer
In Django ORM whenever you make a Database call using ORM, it will generally return either a QuerySet or an object of the model if using get() / number if you are using count() ect., depending on the functions that you are using which return other than a queryset.
The result from a Queryset function can be used to implement further more refinement, like if you like to perform a order() or collecting only distinct() etc. Queryset are lazy which means it only hits the database when they are actually used not when they are assigned. You can find more information about them here.
Where as the functions that doesn't return queryset cannot implement such things.
Take time and go through the Queryset Documentation more in depth explanation with examples are provided. It is useful to understand the behavior to make your application more efficient.

Appending to QuerySet used as a __in lookup parameter

I'd like to retrieve all posts authored by a user OR by his/her friends:
Currently, I use Q objects:
# friends = QuerySet containing User objects
# user = User object
posts = Post.objects.filter(Q(author__in=friends) | Q(author=user))
Which generates a SQL query that looks like this:
SELECT ... WHERE ("posts_post"."author_id" IN (2, 3) OR "posts_post"."author_id" = 1)
Question:
Is it possible to append the user object to the friends QuerySet to generate a query that looks like this, instead:
SELECT ... WHERE "posts_post"."author_id" IN (2, 3, 1)
Directly using the QuerySet's .append() method does in fact work:
friends.append(user)
posts = Post.objects.filter(author__in=friends)
However, I've seen a number of answers here and elsewhere cautioning against treating QuerySets as basic lists.
Is this latter .append() technique safe? Is it efficient, particularly if the QuerySet is fairly large? Or is there another preferred method? Alternatively, feel free to tell me that I'm being silly and that there's nothing wrong with the Q objects approach!
Many thanks.
Querysets don't have an append method, so if that's working in your context friends is already a list rather than a queryset.
As for performance - my gut feeling is that a queryset that's too large to pull into memory so you can append to it is also going to be too large to do well as a subquery. But as always with performance questions, testing is the only real way to be sure.
You'll definitely take something of a performance hit either way. If you don't need friends for anything else, you could use a values_list queryset to get just the PKs into memory, append user.id, then filter on that list of PKs.
You can use this trick:
friends = Friend.objects.values_list('id', flat=True).order_by('id')
friends.append(user.pk)
posts = Post.objects.filter(author__in=friends)
As far as you save only id's, not the whole queryset this method is pretty safe.

django - reorder queryset after slicing it

I fetch the latest 5 rows from a Foo model which is ordered by a datetime field.
qs = Foo.objects.all()[:5]
In the following step, I want to reorder the queryset by some other criteria (actually, by the same datetime field in the opposite direction). But reordering after a slice is not permitted. reverse() undoes the first ordering, giving me a differet queryset. Is there a way to accomplish what I want without creating a list from the queryset and doing the ordering using it?
order_by gives you SQL in-database ordering. You're already using that, and then slicing on it. At that point, the results are retrieved into memory. If you want to change their order, you need to use Python in-memory sorting to do it, not the ORM in-database sorting.
In your case, Daniel has already given the best solution: since you simply want to sort by the same field, but in the other order, just reverse the list you have:
qs = Foo.objects.all()[:5]
objs = reversed(qs)
If you had wanted to sort by some other field, then you'd use the sorted() function with a custom key function:
qs = Foo.objects.all()[:5]
objs = sorted(qs, key=lambda o: o.some_other_field)
No, there's no way of doing that. order_by is an operation on the database, but when you slice a queryset it is evaluated and doesn't go back to the database after that.
Sounds like you already know the solution, though: run reversed() on the evaluated qs.
qs = reversed(Foo.objects.all()[:5])
Late answer, but this just worked for me:
import random
sorted(queryset[:10], key=lambda x: random.random())