How would one go about retrieving the last 1,000 values from a database via Objects.filter? My current query returns the first 1,000 rows that were entered into the database (i.e. with 10,000 rows it returns rows 1-1,000 instead of rows 9,001-10,000).
Current Code:
limit = 1000
Shop.objects.filter(ID = someArray[ID])[:limit]
Cheers
Solution:
queryset = Shop.objects.filter(id=someArray[id])
limit = 1000
count = queryset.count()
# Django rejects negative slice indexes, so clamp the start at 0 when count < limit
endoflist = queryset.order_by('timestamp')[max(count - limit, 0):]
endoflist is the queryset you want.
Efficiency:
The following is from the django docs about the reverse() queryset method.
To retrieve the "last" five items in a queryset, you could do this:
my_queryset.reverse()[:5]
Note that this is not quite the same as slicing from the end of a sequence in Python. The above example will return the last item first, then the penultimate item and so on. If we had a Python sequence and looked at seq[-5:], we would see the fifth-last item first. Django doesn't support that mode of access (slicing from the end), because it's not possible to do it efficiently in SQL.
So I'm not sure if my answer is merely inefficient, or extremely inefficient. I moved the order_by to the final query, but I'm not sure if this makes a difference.
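# reverse() only has an effect on a queryset with a defined ordering, so order by timestamp first
reversed(Shop.objects.filter(id=someArray[id]).order_by('timestamp').reverse()[:limit])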
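To make the efficiency question concrete, here is a minimal sketch that uses Python's built-in sqlite3 module instead of the Django ORM. The table `shop` and its columns are made up for illustration, and the two SQL statements only approximate what the ORM might generate for the count-and-offset slice and for the descending slice followed by reversed().

```python
import sqlite3

# Hypothetical stand-in for the Shop table: 10,000 rows with increasing timestamps.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shop (id INTEGER PRIMARY KEY, timestamp INTEGER)")
conn.executemany(
    "INSERT INTO shop (timestamp) VALUES (?)",
    [(t,) for t in range(1, 10001)],
)


def last_rows_by_offset(conn, limit):
    """Count first, then skip everything except the last `limit` rows (two statements)."""
    count = conn.execute("SELECT COUNT(*) FROM shop").fetchone()[0]
    offset = max(count - limit, 0)
    # In SQLite a negative LIMIT means "no upper bound".
    rows = conn.execute(
        "SELECT timestamp FROM shop ORDER BY timestamp LIMIT -1 OFFSET ?",
        (offset,),
    ).fetchall()
    return [row[0] for row in rows]


def last_rows_by_desc_limit(conn, limit):
    """Take the newest `limit` rows in one statement, then flip them in Python."""
    rows = conn.execute(
        "SELECT timestamp FROM shop ORDER BY timestamp DESC LIMIT ?",
        (limit,),
    ).fetchall()
    return [row[0] for row in reversed(rows)]
```

Both functions return the same rows in ascending order. The second one needs a single statement and no COUNT, which is the main reason to prefer the descending slice.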
Related
I have a list of "posts" I have to render. For each post, I must do three filter querysets, OR them together, and then count the number of objects. Is this reasonable? What factors might make this slow?
This is roughly my code:
def viewable_posts(request, post):
    private_posts = post.replies.filter(permissions=Post.PRIVATE, author_profile=request.user.user_profile).order_by('-modified_date')
    community_posts = post.replies.filter(permissions=Post.COMMUNITY, author_profile__in=request.user.user_profile.following.all()).order_by('-modified_date')
    public_posts = post.replies.filter(permissions=Post.PUBLIC).order_by('-modified_date')
    mixed_posts = private_posts | community_posts | public_posts
    return mixed_posts

def viewable_posts_count(request, post):
    return viewable_posts(request, post).count()
The biggest factor I can see is that you run the filter queries separately for each post. If possible, you should fetch the results for all posts in ONE query. As for the count, count() is the most efficient way of getting the number of results from a query, so it's unlikely to be the problem.
Try the following code:
def viewable_posts(request, post):
    private_posts = post.replies.filter(permissions=Post.PRIVATE, author_profile=request.user.user_profile).values_list('id', flat=True)
    community_posts = post.replies.filter(permissions=Post.COMMUNITY, author_profile__in=request.user.user_profile.following.all()).values_list('id', flat=True)
    public_posts = post.replies.filter(permissions=Post.PUBLIC).values_list('id', flat=True)
    # values_list() returns a queryset, so build a real list before calling extend()
    Lposts_id = list(private_posts)
    Lposts_id.extend(community_posts)
    Lposts_id.extend(public_posts)
    # post is a single object, so filter through its replies relation
    viewable_posts = post.replies.filter(id__in=Lposts_id).order_by('-modified_date')
    viewable_posts_count = post.replies.filter(id__in=Lposts_id).count()
    return viewable_posts, viewable_posts_count
It should improve the following things:
order_by runs once instead of three times.
The count method runs on a query that filters only on the id field.
The intermediate queries use values_list, so they fetch only ids rather than full objects, both for the count and for the filtering.
Depending on your database, its own cache may reuse the rows just fetched for viewable_posts when computing viewable_posts_count.
Indeed, if you can squeeze the first three filter queries into one, you will save time as well.
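As a rough illustration of squeezing the three filters into one, the sketch below uses Python's built-in sqlite3 module rather than the Django ORM. The `reply` table, its columns, and the numeric permission codes are all hypothetical; the point is only that the three visibility rules can be OR'd inside a single WHERE clause, so the count costs one statement per post.

```python
import sqlite3

# Hypothetical permission codes standing in for Post.PRIVATE / COMMUNITY / PUBLIC.
PRIVATE, COMMUNITY, PUBLIC = 0, 1, 2

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reply ("
    "id INTEGER PRIMARY KEY, post_id INTEGER, permissions INTEGER, author_id INTEGER)"
)
conn.executemany(
    "INSERT INTO reply (post_id, permissions, author_id) VALUES (?, ?, ?)",
    [
        (1, PRIVATE, 10),    # the viewer's own private reply: visible
        (1, PRIVATE, 20),    # someone else's private reply: hidden
        (1, COMMUNITY, 20),  # community reply by a followed author: visible
        (1, COMMUNITY, 30),  # community reply by an unfollowed author: hidden
        (1, PUBLIC, 30),     # public reply: visible
        (2, PUBLIC, 30),     # belongs to another post
    ],
)


def viewable_reply_count(conn, post_id, viewer_id, following_ids):
    """Count the replies the viewer may see, using one SQL statement."""
    placeholders = ",".join("?" for _ in following_ids)
    sql = f"""
        SELECT COUNT(*) FROM reply
        WHERE post_id = ? AND (
            permissions = ?
            OR (permissions = ? AND author_id = ?)
            OR (permissions = ? AND author_id IN ({placeholders}))
        )
    """
    params = [post_id, PUBLIC, PRIVATE, viewer_id, COMMUNITY, *following_ids]
    return conn.execute(sql, params).fetchone()[0]
```

In Django the same shape is usually written with Q objects combined by `|` inside a single filter() call; the exact field names would depend on your models.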
I'm trying to display a chat log in Django. I can get my entire chat log in the proper order with this query:
latest_chats_list = Chat.objects.order_by('timestamp')
I want the functionality of this line (the last 10 elements, in order), but Django doesn't allow negative indexes.
latest_chats_list = Chat.objects.order_by('timestamp')[-10:]
If I try this line, I get the messages I want, but they're in the wrong order.
latest_chats_list = Chat.objects.order_by('-timestamp')[:10]
This line gives the first 10 chats instead of the most recent.
latest_chats_list = Chat.objects.order_by('-timestamp')[:10].reverse()
last_ten = Chat.objects.all().order_by('-id')[:10]
last_ten_in_ascending_order = reversed(last_ten)
Edit (from comments)
Why not use Django's queryset.reverse()?
Because it alters the SQL query, as does queryset.order_by(). Slicing the queryset ([:10]) also alters the SQL query, adding LIMIT and OFFSET to it. The two can combine in ways that are not obvious.
On the other hand, the built-in Python function reversed(iterable) only changes the order in which the queryset's results are iterated over; it does not affect the SQL at all.
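A small sketch of why the two approaches differ, written against Python's built-in sqlite3 module rather than the ORM. The `chat` table is hypothetical, and the SQL is only an approximation of what an ORM might emit. It shows that flipping ORDER BY while LIMIT stays in place selects different rows, whereas reversing in Python keeps the same rows and only changes their order.

```python
import sqlite3

# Hypothetical chat table with 100 messages, timestamps 1..100.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chat (id INTEGER PRIMARY KEY, timestamp INTEGER)")
conn.executemany(
    "INSERT INTO chat (timestamp) VALUES (?)",
    [(t,) for t in range(1, 101)],
)


def timestamps(sql):
    """Run a query and return the timestamp column as a plain list."""
    return [row[0] for row in conn.execute(sql)]


# Newest ten messages, newest first.
newest_first = timestamps(
    "SELECT timestamp FROM chat ORDER BY timestamp DESC LIMIT 10"
)

# Flipping the ordering inside SQL while keeping LIMIT 10 picks the OLDEST ten.
flipped_in_sql = timestamps(
    "SELECT timestamp FROM chat ORDER BY timestamp ASC LIMIT 10"
)

# Reversing in Python keeps the newest ten and only changes the iteration order.
newest_in_ascending_order = list(reversed(newest_first))
```

This matches the symptom in the question, where reversing at the query level produced the first 10 chats instead of the most recent ones.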
First, I want to get the top 250 users and set top = 1 on each of them:
users = MyTable.objects.order_by('-month_length')[0: 250]
for u in users:
    u.top = 1
    u.save()
But actually, I hope there is an elegant way, like this:
MyTable.objects.all().update(top=1)
Also, from this question: Django: Cannot update a query once a slice has been taken
Does that mean it is NOT possible to write UPDATE ... WHERE ... LIMIT 5?
Until the queryset has been evaluated once (at which point it caches its results), slicing results in new querysets. Once the results are cached, slicing is done on the cached list. At least that was the case the last time I read the Django code for this, probably around Django 1.5.
You can try this
users = MyTable.objects.order_by('-month_length').values_list("id", flat = True)[0: 250]
MyTable.objects.filter(id__in = list(users)).update(top = 1)
*assuming you have a primary key 'id' in MyTable
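The two-step idea (select the ids with a LIMIT, then update by id) can be sketched with Python's built-in sqlite3 module. This is not the Django ORM; the `mytable` schema below is a made-up stand-in, and the SQL is only roughly what the two ORM calls would produce.

```python
import sqlite3

# Hypothetical table: 1,000 users with month_length 1..1000 and top initially 0.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    "id INTEGER PRIMARY KEY, month_length INTEGER, top INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO mytable (month_length) VALUES (?)",
    [(n,) for n in range(1, 1001)],
)


def mark_top(conn, n):
    """Set top = 1 on the n rows with the largest month_length, in two statements."""
    # Step 1: the LIMIT lives in a plain SELECT that only fetches ids.
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM mytable ORDER BY month_length DESC LIMIT ?", (n,)
        )
    ]
    if not ids:
        return 0
    # Step 2: one UPDATE restricted to those ids, with no LIMIT needed.
    placeholders = ",".join("?" for _ in ids)
    cursor = conn.execute(
        f"UPDATE mytable SET top = 1 WHERE id IN ({placeholders})", ids
    )
    return cursor.rowcount


updated = mark_top(conn, 250)
```

Materialising the ids in Python first (the list(users) call in the answer above) also avoids sending a LIMIT inside an IN subquery, which some databases reject.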
Which one would be better for performance?
We take a slice of products, which makes a bulk update impossible.
products = Product.objects.filter(featured=True).order_by("-modified_on")[3:]
for product in products:
    product.featured = False
    product.save()
or (invalid)
for product in products.iterator():
    product.update(featured=False)
I have also tried the QuerySet in lookup, as follows.
Product.objects.filter(pk__in=products).update(featured=False)
This line works fine on SQLite, but it raises the following exception on MySQL, so I couldn't use it.
DatabaseError: (1235, "This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'")
Edit: Also, the iterator() method causes the query to be re-evaluated, so it is bad for performance.
As @Chris Pratt pointed out in the comments, the second example is invalid because model instances don't have an update method. Your first example requires results+1 queries, since it has to save each object individually. That could be really costly if you have 1,000 products. Ideally you want to reduce this to a fixed number of queries if possible.
This is a similar situation to another question:
Django: Cannot update a query once a slice has been taken
That being said, you would have to do it in at least 2 queries, but you have to be a bit sneaky on how to construct the LIMIT...
Using Q objects for complex queries:
# get the IDs we want to exclude
products = Product.objects.filter(featured=True).order_by("-modified_on")[:3]
# flatten them into just a list of ids
ids = products.values_list('id', flat=True)
# Now use the Q object to construct a complex query
from django.db.models import Q
# This builds a list of "AND id NOT EQUAL TO i"
limits = [~Q(id=i) for i in ids]
Product.objects.filter(featured=True, *limits).update(featured=False)
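To put numbers on the results+1 cost versus the two-query approach, here is a minimal sketch using Python's built-in sqlite3 module instead of Django. The `product` table is hypothetical, and each function simply reports how many SQL statements it issued.

```python
import sqlite3


def make_db(n):
    """Build a throwaway table of n featured products with modified_on 1..n."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product ("
        "id INTEGER PRIMARY KEY, featured INTEGER, modified_on INTEGER)"
    )
    conn.executemany(
        "INSERT INTO product (featured, modified_on) VALUES (1, ?)",
        [(t,) for t in range(1, n + 1)],
    )
    return conn


def unfeature_one_by_one(conn, keep):
    """Loop-and-save style: one SELECT plus one UPDATE per row."""
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM product WHERE featured = 1 "
            "ORDER BY modified_on DESC LIMIT -1 OFFSET ?",
            (keep,),
        )
    ]
    statements = 1
    for pk in ids:
        conn.execute("UPDATE product SET featured = 0 WHERE id = ?", (pk,))
        statements += 1
    return statements


def unfeature_in_bulk(conn, keep):
    """Exclusion style: fetch the ids to keep, then one UPDATE for everything else."""
    keep_ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM product WHERE featured = 1 "
            "ORDER BY modified_on DESC LIMIT ?",
            (keep,),
        )
    ]
    placeholders = ",".join("?" for _ in keep_ids)
    conn.execute(
        "UPDATE product SET featured = 0 "
        f"WHERE featured = 1 AND id NOT IN ({placeholders})",
        keep_ids,
    )
    return 2


def still_featured(conn):
    """Return modified_on for the rows that remain featured, oldest first."""
    return [
        row[0]
        for row in conn.execute(
            "SELECT modified_on FROM product WHERE featured = 1 ORDER BY modified_on"
        )
    ]
```

With 1,000 products and the newest 3 kept, the loop issues 998 statements while the exclusion version issues 2, and both leave the same three rows featured.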
In some cases it's acceptable to cache the QuerySet in a list:
products = list(products)
Product.objects.filter(pk__in=products).update(featured=False)
Small optimization with values_list
products_id = list(products.values_list('id', flat=True))
Product.objects.filter(pk__in=products_id).update(featured=False)
When I do something like
I. objects = Model.objects.all()
and then
II. objects.filter(field_1=some_condition)
I hit the db every time I run step II with a different condition. Is there any way to get all the data in the first step and then just work with the result?
You actually don't hit the db until you evaluate the queryset; queries are lazy.
Read more here.
edit:
After re-reading your question it becomes apparent you were asking how to prevent db hits when filtering for different conditions.
qs = SomeModel.objects.all()
qs1 = qs.filter(some_field='some_value')
qs2 = qs.filter(some_field='some_other_value')
Usually you would want the database to do the filtering for you.
You could force an evaluation of the qs by converting it to a list. This would prevent further db hits; however, it would likely be slower than having your db return the filtered results.
qs_l = list(qs)
qs1_l = [element for element in qs_l if element.some_field == 'some_value']
qs2_l = [element for element in qs_l if element.some_field == 'some_other_value']
Of course you will hit the db every time. filter() is translated into an SQL statement that is executed by your db, so you can't filter without hitting it. Instead, you can retrieve all the objects you need with values() or list(Model.objects.all()) and, as zeekay suggested, use Python expressions (like list comprehensions) for additional filtering.
Why don't you just do objs = Model.objects.filter(field=condition)? That said, once the SQL query is executed you can use Python expressions to do further filtering/processing without incurring additional database hits.
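As a sketch of the fetch-once-then-filter-in-Python idea, the example below uses the built-in sqlite3 module in place of the ORM. The `item` table and its values are invented for illustration. It groups the rows once, so that each later "filter" is a dictionary lookup with no SQL involved.

```python
import sqlite3
from collections import defaultdict

# Hypothetical table standing in for the model.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, some_field TEXT)")
conn.executemany(
    "INSERT INTO item (some_field) VALUES (?)",
    [("some_value",), ("some_other_value",), ("some_value",), ("unrelated",)],
)

# One database hit: pull every row into memory.
rows = conn.execute("SELECT id, some_field FROM item ORDER BY id").fetchall()

# Group once in Python; later lookups never touch the database.
ids_by_value = defaultdict(list)
for row_id, value in rows:
    ids_by_value[value].append(row_id)

matching_first = ids_by_value["some_value"]
matching_second = ids_by_value["some_other_value"]
```

This only pays off when the whole table comfortably fits in memory and many different conditions are checked against it; otherwise letting the database do the filtering is usually the better default.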