Get multiple rows with one query in django? - django

How can I build a QuerySet that gets multiple rows in Django? I thought filter() would work, but it seems to be worse.
For example, I have two rows in the model Car, with two text attributes (license and vin). Now say I want to print the licenses and vins from these cars. How can I do that with one database call?
Here's an answer that will make two database calls:
#using get(), two total queries
a = Car.objects.get(id=1) #query here
b = Car.objects.get(id=2) #query here
print(a.license + a.vin) #no query
print(b.license + b.vin) #no query
That obviously didn't work because I made two get() queries. So next I'll try filter():
#using filter(), four total queries
c = Car.objects.filter(id__in=(1,2)) #no query
print(c[0].license + c[0].vin) #two queries
print(c[1].license + c[1].vin) #two queries
Hmmm, that's weird. Why is it making four database calls? Is there a way to get both rows in one database call?

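It seems weird because of how indexing into a queryset works: each index into an unevaluated queryset runs its own query. Evaluating the queryset once with list() avoids that: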
c = list(Car.objects.filter(id__in=(1,2))) # query
print(c[0].license + c[0].vin) #no query
print(c[1].license + c[1].vin) #no query
If you do the following, you'll only have one query too:
for car in Car.objects.filter(id__in=(1,2)):
    print(car.license + car.vin)
As @Torsten said, in your situation it appears that you're simply trying to get all the cars you've created. This can be achieved with the all() method:
for car in Car.objects.all():
    print(car.license + car.vin)
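At the SQL level, the single-query versions above come down to one SELECT with an IN clause whose rows are then read from memory. The sketch below shows that idea with the standard-library sqlite3 module rather than Django. The car table, its sample rows, and the run() helper are assumptions made for illustration, and the SQL that Django generates will differ in detail.

```python
import sqlite3

# Hypothetical stand-in for the Car model's table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car (id INTEGER PRIMARY KEY, license TEXT, vin TEXT)")
conn.executemany(
    "INSERT INTO car (id, license, vin) VALUES (?, ?, ?)",
    [(1, "ABC123", "VIN001"), (2, "XYZ789", "VIN002")],
)

query_count = 0


def run(sql, params=()):
    """Execute one statement, count it, and return all rows as a list."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


# One round trip fetches both rows; afterwards only the list is touched.
cars = run("SELECT license, vin FROM car WHERE id IN (?, ?) ORDER BY id", (1, 2))
for license_plate, vin in cars:
    print(license_plate + vin)

print("queries:", query_count)
```

Iterating a queryset or wrapping it in list() plays the same role as fetchall() here: the rows are loaded once and then reused.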

Great example. I think there's a typo in your last code block, though. It should be:
for car in Car.objects.filter(id__in=(1,2)):
    print(car.license + car.vin)
How does that method stack up

Related

How to get and update Django object in one query?

To optimize my database usage, I would like to make as few queries as possible.
I'm trying to get an object, increment its "count_limit" field, and then run an if statement on the Customer instance.
To achieve this, I wrote the following query, which works well:
Customer.objects.filter(user=user).update(count_limit=F('count_limit') + 1)
So after this query, count_limit has been incremented by 1 as I wanted.
When I try to get the Customer instance as the result of this query, it returns "1".
Is it possible to do both: update the instance and get it back as the returned object?
Thanks a lot
The update() method will return the number of updated rows. If you are using Postgres, then you can use the returning clause with the raw query.
query = 'UPDATE customer SET count_limit=(customer.count_limit + 1) WHERE customer.user_id=%s returning *'
updated_obj = Customer.objects.raw(query, [user.id])
I don't know if this can be achieved by ORM, but suggestions will be appreciated.
Make sure that the table name in the raw query is correct. If you haven't defined db_table in the Meta class of your model, then by default it will be myapp_model (the app label and the lowercased model name joined by an underscore).
And to prevent SQL injection, from the Docs:
Do not use string formatting on raw queries or quote placeholders in
your SQL strings!
Follow Docs on raw()
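The "1" that update() returns is an affected-row count, which database drivers expose in a similar way. Here is a minimal sketch with the standard-library sqlite3 module (not Django), assuming a hypothetical customer table: the increment happens inside the database, the cursor reports one changed row, and a second query reads the updated value back.

```python
import sqlite3

# Hypothetical stand-in for the Customer model's table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, user_id INTEGER, count_limit INTEGER)"
)
conn.execute("INSERT INTO customer (user_id, count_limit) VALUES (?, ?)", (42, 5))

# Increment in the database; there is no read-modify-write in Python.
cursor = conn.execute(
    "UPDATE customer SET count_limit = count_limit + 1 WHERE user_id = ?", (42,)
)
updated_rows = cursor.rowcount  # number of rows changed, not the row itself

# A second query is needed to see the new value.
new_limit = conn.execute(
    "SELECT count_limit FROM customer WHERE user_id = ?", (42,)
).fetchone()[0]

print(updated_rows, new_limit)
```

With the ORM, the analogous second step would be a get() after the update(), or refresh_from_db() on an instance you already hold.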
You are looking for F expressions: https://docs.djangoproject.com/en/3.0/ref/models/expressions/#f-expressions
An example from the documentation showing how to increment a counter:
from django.db.models import F
reporter = Reporters.objects.get(name='Tintin')
reporter.stories_filed = F('stories_filed') + 1
reporter.save()

Django QuerySet update performance

Which one would be better for performance?
We take a slice of products, which makes a bulk update impossible.
products = Product.objects.filter(featured=True).order_by("-modified_on")[3:]
for product in products:
    product.featured = False
    product.save()
or (invalid)
for product in products.iterator():
    product.update(featured=False)
I have also tried QuerySet's in lookup, as follows.
Product.objects.filter(pk__in=products).update(featured=False)
This line works fine on SQLite, but it raises the following exception on MySQL, so I couldn't use it.
DatabaseError: (1235, "This version of MySQL doesn't yet support
'LIMIT & IN/ALL/ANY/SOME subquery'")
Edit: The iterator() method also causes the query to be re-evaluated, so it is bad for performance.
As @Chris Pratt pointed out in the comments, the second example is invalid because model instances don't have an update() method. Your first example will require results+1 queries, since it has to save each object individually. That could be really costly if you have 1000 products. Ideally you want to reduce this to a more fixed cost if possible.
This is a similar situation to another question:
Django: Cannot update a query once a slice has been taken
That being said, you would have to do it in at least 2 queries, but you have to be a bit sneaky on how to construct the LIMIT...
Using Q objects for complex queries:
# get the IDs we want to exclude
products = Product.objects.filter(featured=True).order_by("-modified_on")[:3]
# flatten them into just a list of ids
ids = products.values_list('id', flat=True)
# Now use the Q object to construct a complex query
from django.db.models import Q
# This builds a list of "AND id NOT EQUAL TO i"
limits = [~Q(id=i) for i in ids]
Product.objects.filter(featured=True, *limits).update(featured=False)
In some cases it's acceptable to cache QuerySet in array
products = list(products)
Product.objects.filter(pk__in=products).update(featured=False)
Small optimization with values_list
products_id = list(products.values_list('id', flat=True))
Product.objects.filter(pk__in=products_id).update(featured=False)
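The two-query idea can be sketched outside Django as well. The example below uses the standard-library sqlite3 module with a hypothetical product table and made-up sample data. The first query collects the ids of the three most recently modified featured products, and the second clears the flag on every other featured product, so no LIMIT ever appears inside a subquery.

```python
import sqlite3

# Hypothetical stand-in for the Product model's table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, featured INTEGER, modified_on INTEGER)"
)
conn.executemany(
    "INSERT INTO product (id, featured, modified_on) VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 1, 20), (3, 1, 30), (4, 1, 40), (5, 1, 50), (6, 0, 60)],
)

# Query 1: ids of the three newest featured products (the ones to keep).
keep_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM product WHERE featured = 1 ORDER BY modified_on DESC LIMIT 3"
    )
]

# Query 2: one bulk UPDATE for everything else. Only the "?" placeholders are
# formatted into the SQL text; the ids themselves are passed as parameters.
placeholders = ", ".join("?" for _ in keep_ids)
cursor = conn.execute(
    f"UPDATE product SET featured = 0 WHERE featured = 1 AND id NOT IN ({placeholders})",
    keep_ids,
)

still_featured = [
    row[0] for row in conn.execute("SELECT id FROM product WHERE featured = 1 ORDER BY id")
]
print(cursor.rowcount, still_featured)
```

The cost stays at two statements regardless of how many products lose the flag, compared with one save() per product in the loop version.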

Using .extra() on fields created by .annotate() in Django

I want to retrieve a sum of two fields (which are aggregations themselves) for each object in a table.
The following may describe a bit better what I'm after but results in an Unknown column in field list-Error:
items = MyModel.objects.annotate(
    field1=Sum("relatedModel__someField"),
    field2=Sum("relatedModel__someField")).extra(
    select={"sum_field1_field2": "field1 + field2"})
I also tried using F() for the field lookups but that gives me an invalid sql statement.
Any ideas on how to solve this are much appreciated.
Is this what you want?
items = MyModel.objects.extra(
    select = {'sum_field1_field2': 'SUM(relatedModel__someField) + SUM(relatedModel__someField)'},
)
To make it work for many to many or for many to one (reverse) relations, you may use the following:
items = MyModel.objects.extra(
    select = {'sum_field1_field2': 'SUM("relatedModel"."someField") + SUM("relatedModel"."someField")'},
)
But this will also break if you need another annotate, for example for a count, because extra() adds the statement to the GROUP BY clause, where aggregate functions are not allowed.
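In plain SQL, the goal is a per-row sum of two aggregates computed in one grouped query. Here is a minimal sketch with the standard-library sqlite3 module, assuming hypothetical my_model and related_model tables with two numeric columns (the question aggregates the same field twice, which works the same way):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_model (id INTEGER PRIMARY KEY);
    CREATE TABLE related_model (
        id INTEGER PRIMARY KEY,
        my_model_id INTEGER REFERENCES my_model(id),
        field_a INTEGER,
        field_b INTEGER
    );
    INSERT INTO my_model (id) VALUES (1), (2);
    INSERT INTO related_model (my_model_id, field_a, field_b)
    VALUES (1, 1, 10), (1, 2, 20), (2, 3, 40);
""")

# Both aggregates are computed per my_model row and added in the SELECT list;
# only the non-aggregated column appears in GROUP BY.
rows = conn.execute("""
    SELECT m.id, SUM(r.field_a) + SUM(r.field_b) AS sum_field1_field2
    FROM my_model AS m
    JOIN related_model AS r ON r.my_model_id = m.id
    GROUP BY m.id
    ORDER BY m.id
""").fetchall()

print(rows)
```

The "Unknown column" error in the question is consistent with field1 and field2 being aliases for aggregates rather than real columns, which is why the aggregate expressions are repeated in the sum here.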

fast lookup for the last element in a Django QuerySet?

I've a model called Valor. Valor has a Robot. I'm querying like this:
Valor.objects.filter(robot=r).reverse()[0]
to get the last Valor for the r robot. Valor.objects.filter(robot=r).count() is about 200000, and getting the last item takes about 4 seconds on my PC.
How can I speed it up? Am I querying the wrong way?
The optimal mysql syntax for this problem would be something along the lines of:
SELECT * FROM table WHERE x=y ORDER BY z DESC LIMIT 1
The django equivalent of this would be:
Valor.objects.filter(robot=r).order_by('-id')[:1][0]
Notice how this solution utilizes django's slicing method to limit the queryset before compiling the list of objects.
If none of the earlier suggestions are working, I'd suggest taking Django out of the equation and running this raw SQL against your database. I'm guessing at your table names, so you may have to adjust accordingly:
SELECT * FROM valor v WHERE v.robot_id = [robot_id] ORDER BY id DESC LIMIT 1;
Is that slow? If so, make your RDBMS (MySQL?) explain the query plan to you. This will tell you if it's doing any full table scans, which you obviously don't want with a table that large. You might also edit your question and include the schema for the valor table for us to see.
Also, you can see the SQL that Django is generating by doing this (using the query set provided by Peter Rowell):
qs = Valor.objects.filter(robot=r).order_by('-id')[:1]  # slice with [:1] so qs stays a queryset; [0] would return a model instance
print(qs.query)
Make sure that SQL is similar to the 'raw' query I posted above. You can also make your RDBMS explain that query plan to you.
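Asking the database for its plan can be sketched with the standard-library sqlite3 module. MySQL's EXPLAIN is used in a similar way but prints different output. The valor table, the index name, and the sample rows below are assumptions for illustration.

```python
import sqlite3

# Hypothetical stand-in for the Valor model's table, with an index on the
# foreign-key column used in the WHERE clause.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE valor (id INTEGER PRIMARY KEY, robot_id INTEGER, value REAL)")
conn.execute("CREATE INDEX valor_robot_idx ON valor (robot_id)")
conn.executemany(
    "INSERT INTO valor (robot_id, value) VALUES (?, ?)",
    [(1, 0.5), (2, 1.5), (1, 2.5), (2, 3.5)],
)

sql = "SELECT id, robot_id, value FROM valor WHERE robot_id = ? ORDER BY id DESC LIMIT 1"

# The database sorts and limits, so only one row travels back.
last_valor = conn.execute(sql, (1,)).fetchone()
print(last_valor)

# Ask the database how it intends to run the same statement.
plan = conn.execute("EXPLAIN QUERY PLAN " + sql, (1,)).fetchall()
for step in plan:
    print(step[-1])  # the last column holds the human-readable plan detail
```

A plan step that scans the whole table without using an index is the thing to look for on a table with 200000 rows per robot.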
It sounds like your data set is going to be big enough that you may want to denormalize things a little bit. Have you tried keeping track of the last Valor object in the Robot object?
class Robot(models.Model):
    # ...
    # on_delete is required since Django 2.0; SET_NULL matches null=True
    last_valor = models.ForeignKey('Valor', null=True, blank=True, on_delete=models.SET_NULL)
And then use a post_save signal to make the update.
from django.db.models.signals import post_save
def record_last_valor(sender, **kwargs):
    if kwargs.get('created', False):
        instance = kwargs.get('instance')
        robot = instance.robot
        robot.last_valor = instance
        robot.save()  # persist the pointer; without this nothing is written
post_save.connect(record_last_valor, sender=Valor)
You will pay the cost of an extra db transaction when you create the Valor objects but the last_valor lookup will be blazing fast. Play with it and see if the tradeoff is worth it for your app.
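The tradeoff can be sketched without Django. The example below uses the standard-library sqlite3 module and a hypothetical add_valor() helper that plays the role of the post_save handler. Each insert also updates robot.last_valor_id, so the read path becomes a primary-key lookup instead of a sort over all of a robot's rows.

```python
import sqlite3

# Hypothetical tables standing in for the Robot and Valor models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE robot (id INTEGER PRIMARY KEY, last_valor_id INTEGER);
    CREATE TABLE valor (id INTEGER PRIMARY KEY, robot_id INTEGER, value REAL);
    INSERT INTO robot (id) VALUES (1);
""")


def add_valor(robot_id, value):
    """Insert a valor and record it as the robot's latest (the extra write)."""
    with conn:  # commit both statements together, roll back on error
        cursor = conn.execute(
            "INSERT INTO valor (robot_id, value) VALUES (?, ?)", (robot_id, value)
        )
        conn.execute(
            "UPDATE robot SET last_valor_id = ? WHERE id = ?",
            (cursor.lastrowid, robot_id),
        )


for sample in (1.0, 2.0, 3.0):
    add_valor(1, sample)

# Read path: follow the stored pointer, with no ORDER BY over the valor table.
last_value = conn.execute(
    "SELECT v.value FROM robot AS r JOIN valor AS v ON v.id = r.last_valor_id "
    "WHERE r.id = ?",
    (1,),
).fetchone()[0]
print(last_value)
```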
Well, there's no order_by clause so I'm wondering about what you mean by 'last'. Assuming you meant 'last added',
Valor.objects.filter(robot=r).order_by('-id')[0]
might do the job for you.
django 1.6 introduces .first() and .last():
https://docs.djangoproject.com/en/1.6/ref/models/querysets/#last
So you could simply do:
Valor.objects.filter(robot=r).last()
This should also be quite fast:
qs = Valor.objects.filter(robot=r) # <-- it doesn't hit the database
count = qs.count() # <-- first hit the database, compute a count
last_item = qs[ count-1 ] # <-- second hit the database, get specified rownum
So, in practice you execute only 2 SQL queries ;)
Model_Name.objects.first()  # get the first element
Model_Name.objects.last()  # get the last element
In my case, last() did not work because there was only one row in the database.
Maybe this is helpful for you too :)
Is there a limit clause in Django? That way you can have the database simply return a single record.
mysql
select * from table where x = y limit 1
sql server
select top 1 * from table where x = y
oracle
select * from table where x = y and rownum = 1
I realize this isn't translated into django, but someone can come back and clean this up.
The correct way of doing this is to use the built-in QuerySet method latest(), passing it the column (field name) to sort by. The drawback is that it can only sort by a single db column.
The current implementation looks like this and is optimized in the same sense as @Aaron's suggestion.
def latest(self, field_name=None):
    """
    Returns the latest object, according to the model's 'get_latest_by'
    option or optional given field_name.
    """
    latest_by = field_name or self.model._meta.get_latest_by
    assert bool(latest_by), "latest() requires either a field_name parameter or 'get_latest_by' in the model"
    assert self.query.can_filter(), \
        "Cannot change a query once a slice has been taken."
    obj = self._clone()
    obj.query.set_limits(high=1)
    obj.query.clear_ordering()
    obj.query.add_ordering('-%s' % latest_by)
    return obj.get()

How do I get the related objects In an extra().values() call in Django?

Thanks to this post I'm able to easily do count and group by queries in a Django view:
Django equivalent for count and group by
What I'm doing in my app is displaying a list of coin types and face values available in my database for a country, so coins from the UK might have a face value of "1 farthing" or "6 pence". The face_value is the 6, the currency_type is the "pence", stored in a related table.
I have the following code in my view that gets me 90% of the way there:
def coins_by_country(request, country_name):
    country = Country.objects.get(name=country_name)
    coin_values = Collectible.objects.filter(country=country.id, type=1).extra(select={'count': 'count(1)'},
        order_by=['-count']).values('count', 'face_value', 'currency_type')
    coin_values.query.group_by = ['currency_type_id', 'face_value']
    return render_to_response('icollectit/coins_by_country.html', {'coin_values': coin_values, 'country': country } )
The currency_type_id comes across as the number stored in the foreign key field (i.e. 4). What I want to do is retrieve the actual object that it references as part of the query (the Currency model, so I can get the Currency.name field in my template).
What's the best way to do that?
You can't do it with values(). But there's no need to use that - you can just get the actual Collectible objects, and each one will have a currency_type attribute that will be the relevant linked object.
And as justinhamade suggests, using select_related() will help to cut down the number of database queries.
Putting it together, you get:
coin_values = Collectible.objects.filter(country=country.id,
type=1).extra(
select={'count': 'count(1)'},
order_by=['-count']
).select_related()
select_related() got me pretty close, but it wanted me to add every field that I've selected to the group_by clause.
So I tried appending values() after the select_related(). No go. Then I tried various permutations of each in different positions of the query. Close, but not quite.
I ended up "wimping out" and just using raw SQL, since I already knew how to write the SQL query.
def coins_by_country(request, country_name):
    country = get_object_or_404(Country, name=country_name)
    cursor = connection.cursor()
    cursor.execute('SELECT count(*), face_value, collection_currency.name FROM collection_collectible, collection_currency WHERE collection_collectible.currency_type_id = collection_currency.id AND country_id=%s AND type=1 group by face_value, collection_currency.name', [country.id] )
    coin_values = cursor.fetchall()
    return render_to_response('icollectit/coins_by_country.html', {'coin_values': coin_values, 'country': country } )
If there's a way to phrase that exact query in the Django queryset language I'd be curious to know. I imagine that an SQL join with a count and grouping by two columns isn't super-rare, so I'd be surprised if there wasn't a clean way.
Have you tried select_related()? http://docs.djangoproject.com/en/dev/ref/models/querysets/#id4
I use it a lot and it seems to work well; then you can go coin_values.currency.name.
Also, I don't think you need country=country.id in your filter; country=country should do, though I am not sure what difference that makes other than less typing.