Django query optimization in iteration

class Value(models.Model):
    attribute = models.ForeignKey(Attribute)
    platform = models.ForeignKey(Platform)
    value = models.CharField(max_length=30)
class Attribute(models.Model):
    name = models.CharField(max_length=50)
    ....
1.
for attribute in attributes:
    attribute.value = Value.objects.get(Q(attribute__id=attribute.id) & Q(platform__id=platform.id))
2.
values = Value.objects.filter(platform__id=platform.id)
for attribute in attributes:
    attribute.value = values.get(attribute__id=attribute.id)
Can I say that method 2 is more efficient than method 1 because it avoids excessive DB queries?

Example 2 can be reduced to only 1 DB query like so:
values = Value.objects.filter(platform__id=platform.id)
attribute_values = {value.attribute_id: value for value in values}
for attribute in attributes:
    attribute.value = attribute_values[attribute.id]
I'm assuming that Value.attribute is a ForeignKey.
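The dictionary lookup above can be sketched in plain Python, with stand-in classes instead of Django models so it runs without a project. `FakeValue`, `FakeAttribute` and `attach_values` are hypothetical names, and using `dict.get` (so an attribute with no value for the platform gets `None` instead of raising `KeyError`) is an assumption about the desired behaviour, not something the original answer does.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeValue:
    # Stand-in for a Value row; attribute_id mirrors the ForeignKey column.
    attribute_id: int
    value: str


@dataclass
class FakeAttribute:
    # Stand-in for an Attribute row.
    id: int
    name: str
    value: Optional[FakeValue] = None


def attach_values(attributes, values):
    """Attach each attribute's value using a single pass over `values`."""
    by_attribute_id = {v.attribute_id: v for v in values}
    for attribute in attributes:
        # .get() returns None when there is no value for this attribute.
        attribute.value = by_attribute_id.get(attribute.id)
    return attributes


attributes = [
    FakeAttribute(1, "color"),
    FakeAttribute(2, "size"),
    FakeAttribute(3, "weight"),
]
values = [FakeValue(1, "red"), FakeValue(2, "3")]
attach_values(attributes, values)
```

With real models, `values` would be the evaluated queryset, so the lookups inside the loop never touch the database.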

I wouldn't say that's the case, because filter and get just build up WHERE clauses for the SQL query. You might think that Django caches the results in values because you only build that queryset once, but the query is not even evaluated when you do:
values = Value.objects.filter(platform__id=platform.id)
Every time you call get, it adds a WHERE condition on top of the filter and hits the database to fetch the result, so you don't gain anything in terms of performance.
By the way, Value.objects.get(Q(attribute__id=attribute.id) & Q(platform__id=platform.id)) is the same as:
Value.objects.get(attribute=attribute, platform=platform)
which is more readable.

Related

Django ORM. Filter many to many with AND clause

With the following models:
class Item(models.Model):
    name = models.CharField(max_length=255)
    attributes = models.ManyToManyField(ItemAttribute)
class ItemAttribute(models.Model):
    attribute = models.CharField(max_length=255)
    string_value = models.CharField(max_length=255)
    int_value = models.IntegerField()
I also have an Item which has 2 attributes, 'color': 'red', and 'size': 3.
If I do any of these queries:
Item.objects.filter(attributes__string_value='red')
Item.objects.filter(attributes__int_value=3)
I will get Item returned, works as I expected.
However, if I try to combine both conditions in one query, like:
Item.objects.filter(attributes__string_value='red', attributes__int_value=3)
All I want to do is an AND. This won't work either:
Item.objects.filter(Q(attributes__string_value='red') & Q(attributes__int_value=3))
The output is:
<QuerySet []>
Why? How can I build such a query that my Item is returned, because it has the attribute red and the attribute 3?
If it's of any use, you can chain filter expressions in Django:
query = Item.objects.filter(attributes__string_value='red').filter(attributes__int_value=3)
From the DOCS:
This takes the initial QuerySet of all entries in the database, adds a filter, then an exclusion, then another filter. The final result is a QuerySet containing all entries with a headline that starts with “What”, that were published between January 30, 2005, and the current day.
To do it with .filter() but with dynamic arguments:
args = {
    '{0}__{1}'.format('attributes', 'string_value'): 'red',
    '{0}__{1}'.format('attributes', 'int_value'): 3
}
Item.objects.filter(**args)
You can also (if you need a mix of AND and OR) use Django's Q objects.
Keyword argument queries – in filter(), etc. – are “AND”ed together. If you need to execute more complex queries (for example, queries with OR statements), you can use Q objects.
A Q object (django.db.models.Q) is an object used to encapsulate a
collection of keyword arguments. These keyword arguments are specified
as in “Field lookups” above.
You would have something like this instead of having all the Q objects within that filter:
from django.db.models import Q
from .models import Item
# assuming your conditions are in a dict called kwargs
final_q_expression = Q()
for key, value in kwargs.items():
    final_q_expression &= Q(**{key: value})
result = Item.objects.filter(final_q_expression)
This is code I haven't run; it's off the top of my head. Treat it as pseudo-code if you will.
This doesn't answer why the ways you've tried don't work, though. It probably has to do with lookups that span relationships and the tables that get joined to fetch those values. I would suggest printing the query attribute of your QuerySet to see the raw SQL being generated; that might help explain why .filter(Q() & Q()) is not working.
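For lookups that span a multi-valued relationship, Django documents that conditions inside a single filter() call must be satisfied by the same related row, while chained filter() calls may each match a different related row. The following is a plain-Python sketch of that difference, not ORM code: `items`, `single_filter` and `chained_filter` are hypothetical names, and the `int_value` of 0 on the colour row is an assumption, since the question does not say what it holds.

```python
# Each item is modelled as a list of attribute rows (dicts).
items = {
    "shirt": [
        {"attribute": "color", "string_value": "red", "int_value": 0},
        {"attribute": "size", "string_value": "", "int_value": 3},
    ],
}


def single_filter(items, string_value, int_value):
    # Like filter(a=..., b=...): ONE related row must satisfy both conditions.
    return [
        name
        for name, rows in items.items()
        if any(
            r["string_value"] == string_value and r["int_value"] == int_value
            for r in rows
        )
    ]


def chained_filter(items, string_value, int_value):
    # Like filter(a=...).filter(b=...): each condition may match a different row.
    return [
        name
        for name, rows in items.items()
        if any(r["string_value"] == string_value for r in rows)
        and any(r["int_value"] == int_value for r in rows)
    ]
```

Under these assumptions `single_filter(items, "red", 3)` finds nothing, because no single attribute row is both 'red' and 3, whereas `chained_filter(items, "red", 3)` returns the item.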

Django: Sort and filter rows by specific many to one value

In the provided schema I would like to sort Records by a specific Attribute of the record. I'd like to do this in native Django.
Example:
Query all Records (regardless of Attribute.color), but sort by Attribute.value where Attribute.color is 'red'. Obviously Records missing a 'red' Attribute can't be sorted, so they could be just interpreted as NULL or sent to the end.
Each Record is guaranteed to have one or zero Attributes of a particular color (enforced by unique_together). Given this is a one-to-many relationship, a Record can have Attributes of more than one color.
class Record(Model):
    pass
class Attribute(Model):
    color = CharField() # **See note below
    value = IntegerField()
    record = ForeignKey(Record)
    class Meta:
        unique_together = (('color', 'record'),)
I will also need to filter Records by Attribute.value and Attribute.color as well.
I'm open to changing the schema, but the schema above seems to be the simplest to represent what I need to model.
How can I:
Query all Records where it has an Attribute.color of 'red' and, say, an Attribute.value of 10
Query all Records and sort by the Attribute.value of the associated Attribute where Attribute.color is 'red'.
** I've simplified it above -- in reality the color field would be a ForeignKey to an AttributeDefinition, but I think that's not important right now.
I think something like this would work:
record_ids = Attribute.objects.filter(color='red', value=10).values_list('record', flat=True)
and
record_ids = Attribute.objects.filter(color='red').order_by('value').values_list('record', flat=True)
That will give you IDs of records. Then, you can do this:
records = Record.objects.filter(id__in=record_ids)
Hope this helps!
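One caveat with the second query: a filter on id__in does not guarantee that the rows come back in the order of record_ids, so the sorted order has to be re-applied afterwards. A minimal plain-Python sketch of that step follows, with records modelled as dicts; `order_by_id_list` is a hypothetical helper, and sending records without a 'red' Attribute to the end is one of the two options the question allows.

```python
def order_by_id_list(records, ordered_ids):
    """Sort records to follow ordered_ids; records not listed go to the end."""
    position = {record_id: index for index, record_id in enumerate(ordered_ids)}
    missing = len(ordered_ids)  # sort key for records that have no 'red' Attribute
    # sorted() is stable, so unlisted records keep their relative order.
    return sorted(records, key=lambda record: position.get(record["id"], missing))


records = [{"id": 1}, {"id": 2}, {"id": 3}]
record_ids = [3, 1]  # e.g. ids already ordered by the 'red' Attribute value
ordered = order_by_id_list(records, record_ids)
```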

Django get count of each age

I have this model:
class User_Data(AbstractUser):
    date_of_birth = models.DateField(null=True,blank=True)
    city = models.CharField(max_length=255,default='',null=True,blank=True)
    address = models.TextField(default='',null=True,blank=True)
    gender = models.TextField(default='',null=True,blank=True)
And I need to run a django query to get the count of each age. Something like this:
Age || Count
10 || 100
11 || 50
and so on.....
Here is what I did with lambda:
usersAge = list(map(lambda x: calculate_age(x[0]), User_Data.objects.values_list('date_of_birth')))
users_age_data_source = [[x, usersAge.count(x)] for x in set(usersAge)]
users_age_data_source = sorted(users_age_data_source, key=itemgetter(0))
There's a few ways of doing this. I've had to do something very similar recently. This example works in Postgres.
Note: I've written the following code the way I have so that syntactically it works, and so that I can write between each step. But you can chain these together if you desire.
First we need to annotate the queryset to obtain the 'age' parameter. Since it's not stored as an integer, and can change daily, we can calculate it from the date of birth field by using the database's 'current_date' function:
ud = User_Data.objects.annotate(
    age=RawSQL("""(DATE_PART('year', current_date) - DATE_PART('year', "app_userdata"."date_of_birth"))::integer""", []),
)
Note: you'll need to change the "app_userdata" part to match up with the table of your model. You can pick this out of the model's _meta, but this just depends if you want to make this portable or not. If you do, use a string .format() to replace it with what the model's _meta provides. If you don't care about that, just put the table name in there.
Now we pick the 'age' value out so that we get a ValuesQuerySet with just this field
ud = ud.values('age')
And then annotate THAT queryset with a count of age
ud = ud.annotate(
    count=Count('age'),
)
At this point we have a ValuesQuerySet that has both 'age' and 'count' as fields. Order it so it comes out in a sensible way.
ud = ud.order_by('age')
And there you have it.
You must build up the queryset in this order, otherwise you'll get some interesting results; i.e., you can't group all the annotations into one call, because the second one (the count) depends on the first, and since a kwargs dict has no notion of the order in which the kwargs were defined, the queryset's field/dependency checking will fail.
Hope this helps.
If you aren't using Postgres, the only thing you'll need to change is the RawSQL annotation, to match whatever database engine you're using. However that engine extracts the year of a date, either from a field or from its built-in "current date" function, as long as you can get the result out as an integer it will work exactly the same way.
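Note that subtracting the two years, as the RawSQL expression above does, ignores whether this year's birthday has already passed, so it can be one year higher than the actual age. If exact ages matter and the table is small enough to process in Python, the counting can be sketched as below. The question does not show `calculate_age`, so this version is an assumption about what it does, and `age_counts` is a hypothetical helper.

```python
from collections import Counter
from datetime import date


def calculate_age(born, today):
    # Subtract one year if this year's birthday has not happened yet.
    before_birthday = (today.month, today.day) < (born.month, born.day)
    return today.year - born.year - before_birthday


def age_counts(birth_dates, today):
    """Return sorted (age, count) pairs, skipping missing dates of birth."""
    ages = [calculate_age(born, today) for born in birth_dates if born is not None]
    return sorted(Counter(ages).items())


today = date(2020, 6, 1)
birth_dates = [date(2010, 1, 1), date(2010, 12, 31), date(2009, 7, 1), None]
result = age_counts(birth_dates, today)
```

`Counter` tallies in a single pass, which avoids calling `list.count` once per distinct age as in the question's version.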

Django compare values of two objects

I have a Django model that looks something like this:
class Response(models.Model):
    transcript = models.TextField(null=True)
class Coding(models.Model):
    qid = models.CharField(max_length = 30)
    value = models.CharField(max_length = 200)
    response = models.ForeignKey(Response)
    coder = models.ForeignKey(User)
For each Response object, there are two coding objects with qid = "risk", one for coder 3 and one for coder 4. What I would like to be able to do is get a list of all Response objects for which the difference in value between coder 3 and coder 4 is greater than 1. The value field stores numbers 1-7.
I realize in hindsight that setting up value as a CharField may have been a mistake, but hopefully I can get around that.
I believe something like the following SQL would do what I'm looking for, but I'd rather do this with the ORM
SELECT DISTINCT c1.response_id FROM coding c1, coding c2
WHERE c1.coder_id = 3 AND
    c2.coder_id = 4 AND
    c1.qid = 'risk' AND
    c2.qid = 'risk' AND
    c1.response_id = c2.response_id AND
    c1.value - c2.value > 1
from django.db.models import F
qset = Coding.objects.filter(response__coding__value__gt=F('value') + 1,
                             qid='risk', coder=4
                             ).extra(where=['T3.qid = %s', 'T3.coder_id = %s'],
                                     params=['risk', 3])
responses = [c.response for c in qset.select_related('response')]
When you join to a table already in the query, the ORM will assign the second one an alias, in this case T3, which you can use in parameters to extra(). To find out what the alias is, you can drop into the shell and print qset.query.
See Django documentation on F objects and extra
Update: It seems you actually don't have to use extra(), or figure out what alias django uses, because every time you refer to response__coding in your lookups, django will use the alias created initially. Here's one way to look for differences in either direction:
from django.db.models import Q, F
gt = Q(response__coding__value__gt=F('value') + 1)
lt = Q(response__coding__value__lt=F('value') - 1)
match = Q(response__coding__qid='risk', response__coding__coder=4)
qset = Coding.objects.filter(match & (gt | lt), qid='risk', coder=3)
responses = [c.response for c in qset.select_related('response')]
See Django documentation on Q objects
BTW, If you are going to want both Coding instances, you have an N + 1 queries problem here, because django's select_related() won't get reverse FK relationships. But since you have the data in the query already, you could retrieve the required information using the T3 alias as described above and extra(select={'other_value':'T3.value'}). The value data from the corresponding Coding record would be accessible as an attribute on the retrieved Coding instance, i.e. as c.other_value.
Incidentally, your question is general enough, but it looks like you have an entity-attribute-value schema, which in an RDB scenario is generally considered an anti-pattern. You might be better off long-term (and this query would be simpler) with a risk field:
class Coding(models.Model):
    response = models.ForeignKey(Response)
    coder = models.ForeignKey(User)
    risk = models.IntegerField()
    # other fields for other qid 'attribute' names...
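As a cross-check for the ORM queries above, the same comparison can be sketched in plain Python over rows shaped like the original Coding table. `responses_with_disagreement` is a hypothetical helper, the rows are modelled as dicts, and `value` is converted with `int()` because the question stores it in a CharField.

```python
def responses_with_disagreement(codings, qid="risk", coders=(3, 4), threshold=1):
    """Return ids of responses where the two coders differ by more than threshold."""
    by_response = {}
    for coding in codings:
        if coding["qid"] == qid and coding["coder_id"] in coders:
            # value is stored as text, so convert before doing arithmetic.
            scores = by_response.setdefault(coding["response_id"], {})
            scores[coding["coder_id"]] = int(coding["value"])
    first, second = coders
    return sorted(
        response_id
        for response_id, scores in by_response.items()
        if first in scores
        and second in scores
        and abs(scores[first] - scores[second]) > threshold
    )


codings = [
    {"response_id": 1, "coder_id": 3, "qid": "risk", "value": "2"},
    {"response_id": 1, "coder_id": 4, "qid": "risk", "value": "5"},
    {"response_id": 2, "coder_id": 3, "qid": "risk", "value": "4"},
    {"response_id": 2, "coder_id": 4, "qid": "risk", "value": "5"},
    {"response_id": 3, "coder_id": 3, "qid": "other", "value": "1"},
    {"response_id": 3, "coder_id": 4, "qid": "other", "value": "7"},
]
flagged = responses_with_disagreement(codings)
```

Like the second ORM query, this checks the difference in either direction.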

how to store a field in the database after querying

views.py:
q3=KEBReading.objects.filter(datetime_reading__month=a).filter(datetime_reading__year=selected_year).values("signed")
for item in q3:
    item["signed"]="signed"
    print item["signed"]
q3.save()
How do I save a field into the database? I'm trying to save the field called "signed" with a value. If I do q3.save() it gives a error as it is a queryset. I'm doing a query from the database and then, based on the result, want to set a value to a field and save it.
prevdate=KEBReading.objects.filter(datetime_reading__lt=date)
I am getting all the rows from the database with a date earlier than the current date, but I want only the latest of those records. For example, if I enter 2012-06-03, I want the record whose date comes just before that date. Can somebody help?
q3 = KEBReading.objects.filter(datetime_reading__month=a,
                               datetime_reading__year=selected_year)
for item in q3:
    item.signed = True
    item.save()
q3=KEBReading.objects.filter(...)
will return a QuerySet of model instances. Any instance of a Django Model is an object, and all fields of the instance are attributes of that object. That means you must access them using dot (.) notation,
like:
item.signed = "signed"
If your object is a dictionary or a class derived from dictionary, then you can use key indexing like:
item["signed"] = "signed"
In your situation that usage does not help: with .values("signed") each item is a plain dictionary, so the assignment succeeds but nothing is written to the database, and without .values() each item is a model instance, which does not support key indexing.
You can either call an update query:
KEBReading.objects.filter(...).update(signed="signed")
or set the new value in a loop and save each object:
for item in q3:
    item.signed="signed"
    item.save()
but in your situation, the update query is a better approach since it executes fewer database calls.
Try using update query:
If signed is a booleanfield:
q3 = KEBReading.objects.filter(datetime_reading__month = a).filter(datetime_reading__year = selected_year).update(signed = True)
If it is a charfield:
q3 = KEBReading.objects.filter(datetime_reading__month = a).filter(datetime_reading__year = selected_year).update(signed = "True")
Update for comments:
If you want to fetch records based on the datetime_reading month, you can do it by providing the month as a number. For example, 2 for February:
q3 = KEBReading.objects.filter(datetime_reading__month = 2).order_by('datetime_reading')
And if you want to fetch records with signed = True, you can do it by:
q3 = KEBReading.objects.filter(signed = True)
If you want to fetch only the records from the day before a given date, you can use:
prevdate = KEBReading.objects.filter(datetime_reading = (date - datetime.timedelta(days = 1)))
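The query above only matches readings dated exactly one day earlier. If the goal is the single latest reading before the given date, whatever its distance, the usual ORM idiom would be to filter with datetime_reading__lt, order by '-datetime_reading' and take .first(). The selection logic itself can be sketched in plain Python; `latest_before` is a hypothetical helper and the readings are modelled as bare datetime values.

```python
from datetime import datetime


def latest_before(readings, cutoff):
    """Return the most recent reading strictly before cutoff, or None."""
    earlier = [reading for reading in readings if reading < cutoff]
    return max(earlier) if earlier else None


readings = [
    datetime(2012, 5, 30),
    datetime(2012, 6, 1),
    datetime(2012, 6, 3),
    datetime(2012, 6, 5),
]
previous = latest_before(readings, datetime(2012, 6, 3))
```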