Django database caching - django

The object user has a foreign key relationship to address. Is there a difference between samples 1 and 2? Does sample 1 run the query multiple times? Or is the address object cached?
# Sample 1
country = user.address.country
city = user.address.city
state = user.address.state
# Sample 2
address = user.address
country = address.country
city = address.city
state = address.state

The address object is indeed cached. You can see this if you print the contents of user.__dict__ before and after accessing user.address. For example:
>>> user.__dict__
{'date_joined': datetime.datetime(2010, 4, 1, 12, 31, 59),
 'email': u'user@test.com',
 'first_name': u'myfirstname',
 'id': 1L,
 'is_active': 1,
 'is_staff': 1,
 'is_superuser': 1,
 'last_login': datetime.datetime(2010, 4, 1, 12, 31, 59),
 'last_name': u'mylastname',
 'password': u'sha1$...$...',
 'username': u'myusername'}
>>> country = user.address.country
>>> user.__dict__
{'_address': <myapp.models.address object at 0xwherever>,
 'email': u'user@test.com',
 ...etc}
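So the user object gains an _address attribute, which is used for subsequent lookups on the related object (the exact name of this cache attribute varies between Django versions). Sample 1 therefore queries the database only once, on the first access to user.address, and both samples cost the same.
You can use select_related() when you first get the user to pre-populate this cache even before accessing address, so you only hit the database once.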
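The caching described above can be imitated in plain Python. This is only a sketch of the idea, not Django's actual implementation: CachedRelation, load_address and query_log are hypothetical names, and the "query" is simulated by appending to a list.

```python
class CachedRelation:
    """Toy stand-in for a foreign-key descriptor; not Django's real code."""

    def __init__(self, name, loader):
        self.cache_name = "_" + name   # e.g. "_address", as in the output above
        self.loader = loader           # callable that simulates the database query

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        # Run the loader only if nothing is cached on the instance yet.
        if self.cache_name not in instance.__dict__:
            instance.__dict__[self.cache_name] = self.loader(instance)
        return instance.__dict__[self.cache_name]


class Address:
    def __init__(self, country, city, state):
        self.country, self.city, self.state = country, city, state


query_log = []  # records every simulated database hit


def load_address(user):
    query_log.append("SELECT ... FROM address WHERE id = %d" % user.address_id)
    return Address("US", "Springfield", "IL")  # made-up values


class User:
    address = CachedRelation("address", load_address)

    def __init__(self, address_id):
        self.address_id = address_id


user = User(address_id=1)
country = user.address.country   # first access runs the simulated query
city = user.address.city         # served from user.__dict__['_address']
state = user.address.state       # served from the cache as well
print(len(query_log))            # 1
```

Three attribute accesses in the style of Sample 1 trigger a single simulated query, which is the behaviour the answer describes for the real foreign key.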

Related

Get respectives values from Django annotate method

I have the following query:
result = data.values('collaborator').annotate(amount=Count('cc'))
top = result.order_by('-amount')[:3]
This takes the collaborator field from data, which is a Django QuerySet. I am trying to build something like a GROUP BY query, and that part works. But when I call the .values() method on the top variable, it returns all the model instances as dicts in a queryset, whereas I need the result of annotate as a list of dicts.
This is the content of the top variable in the shell:
<QuerySet [{'collaborator': '1092788966', 'amount': 20}, {'collaborator': '1083692812', 'amount': 20}, {'collaborator': '1083572767', 'amount': 20}]>
But when I call list(top.values()) I get the following result:
[{'name': 'Alyse Caffin', 'cc': '1043346592', 'location': 'Wu’an', 'gender': 'MASCULINO', 'voting_place': 'Corporación Educativa American School Barranquilla', 'table_number': '6', 'status': 'ESPERADO', 'amount': 1}, {'name': 'Barthel Hanlin', 'cc': '1043238706', 'location': 'General Santos', 'gender': 'MASCULINO', 'voting_place': 'Colegio San José – Compañía de Jesús Barranquilla', 'table_number': '10', 'status': 'PENDIENTE', 'amount': 1}, {'name': 'Harv Gertz', 'cc': '1043550513', 'location': 'Makueni', 'gender': 'FEMENINO', 'voting_place': 'Corporación Educativa American School Barranquilla', 'table_number': '7', 'status': 'ESPERADO', 'amount': 1}]
I just want the result to be like:
[{'collaborator': '1092788966', 'amount': 20}, {'collaborator': '1083692812', 'amount': 20}, {'collaborator': '1083572767', 'amount': 20}]
Something is wrong here, maybe a typo. (You also do not seem to show the full query: something like data = yourmodel.objects.filter... is missing before it.)
The output of list(top.values()) contains completely different fields than the top QuerySet you posted. Are you sure you really ran:
result = data.values('collaborator').annotate(amount=Count('cc'))
top = result.order_by('-amount')[:3]
list(top.values())
because that should deliver what you expect (provided that data is a QuerySet).

Django - Query count of each distinct status

I have a model Model that has Model.status field. The status field can be of value draft, active or cancelled.
Is it possible to get a count of all objects based on their status? I would prefer to do that in one query instead of this:
Model.objects.filter(status='draft').count()
Model.objects.filter(status='active').count()
Model.objects.filter(status='cancelled').count()
I think that aggregate could help.
Yes, you can work with:
from django.db.models import Count
Model.objects.values('status').annotate(
    count=Count('pk')
).order_by('count')
This will return a QuerySet of dictionaries, ordered by count:
<QuerySet [
    {'status': 'draft', 'count': 13 },
    {'status': 'cancelled', 'count': 14 },
    {'status': 'active', 'count': 25 }
]>
This will however not list statuses for which no Model is present in the database.
Or you can make use of an aggregate with filter=:
from django.db.models import Count, Q
Model.objects.aggregate(
    nactive=Count('pk', filter=Q(status='active')),
    ncancelled=Count('pk', filter=Q(status='cancelled')),
    ndraft=Count('pk', filter=Q(status='draft'))
)
This will return a dictionary:
{
    'nactive': 25,
    'ncancelled': 14,
    'ndraft': 13
}
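Here every status gets an entry even if no Model has it. With Count such a status should be reported as 0, since SQL COUNT never yields NULL; aggregates like Sum are the ones that return None when nothing matches.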
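The difference between the two approaches can be seen at the SQL level with the standard-library sqlite3 module. This is a sketch under assumptions: the items table and its rows are made up, and the SQL is only an approximation of what the ORM would generate, which depends on the Django version and database backend.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, status TEXT)")
# Made-up rows: three active, two draft, and deliberately no cancelled ones.
conn.executemany(
    "INSERT INTO items (status) VALUES (?)",
    [("active",)] * 3 + [("draft",)] * 2,
)

# Grouped count, similar in spirit to values('status').annotate(count=Count('pk')).
grouped = conn.execute(
    "SELECT status, COUNT(id) AS n FROM items GROUP BY status ORDER BY n"
).fetchall()

# Conditional counts, similar in spirit to aggregate(...=Count('pk', filter=Q(...))).
row = conn.execute("""
    SELECT COUNT(CASE WHEN status = 'active' THEN id END),
           COUNT(CASE WHEN status = 'cancelled' THEN id END),
           COUNT(CASE WHEN status = 'draft' THEN id END)
    FROM items
""").fetchone()
filtered = dict(zip(("nactive", "ncancelled", "ndraft"), row))

print(grouped)   # [('draft', 2), ('active', 3)]
print(filtered)  # {'nactive': 3, 'ncancelled': 0, 'ndraft': 2}
```

The grouped query has no row at all for the missing status, while the conditional counts always produce one value per status.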

Wrongly big numbers when use multiple of Sum, Count aggregations in annotate

I have these models:
User:
    email = EmailField()
Payment:
    user = ForeignKey(User)
    sum = DecimalField()
GuestAccount:
    user = ForeignKey(User)
    guest = ForeignKey(User)
I want to get each user's email, the total amount of money that came from that user, and the number of their guest accounts.
My query:
User.objects.annotate(
    money=Sum('payment__sum'),
    guests_number=Count('guestaccount')
).values('email', 'money', 'guests_number')
But money and guests_number in the result of the query are bigger than they really are:
{'guests_number': 0, 'email': 'a@b.cd', 'money': None}
{'guests_number': 20, 'email': 'user1@mail.com', 'money': Decimal('6600.00')}
{'guests_number': 4, 'email': 'user1000@test.com', 'money': Decimal('2500.00')}
{'guests_number': 0, 'email': 'zzzz@bbbbb.com', 'money': None}
I noticed that I get correct data if I split the query into 2 separate queries:
User.objects.annotate(money=Sum('payment__sum')).values('email', 'money')
User.objects.annotate(guests_number=Count('guestaccount')).values('email', 'guests_number')
Correct result of 1st half:
{'email': 'a@b.cd', 'money': None}
{'email': 'user1@mail.com', 'money': Decimal('1650.00')}
{'email': 'user1000@test.com', 'money': Decimal('1250.00')}
{'email': 'zzzz@bbbbb.com', 'money': None}
Correct result of 2nd half:
{'email': 'a@b.cd', 'guests_number': 0}
{'email': 'user1@mail.com', 'guests_number': 4}
{'email': 'user1000@test.com', 'guests_number': 2}
{'email': 'zzzz@bbbbb.com', 'guests_number': 0}
Also I noticed that I can add distinct=True in Count aggregation:
User.objects.annotate(
    money=Sum('payment__sum'),
    guests_number=Count('guestaccount', distinct=True)
).values('email', 'money', 'guests_number')
It fixes guests_number:
{'guests_number': 0, 'email': 'a@b.cd', 'money': None}
{'guests_number': 4, 'email': 'user1@mail.com', 'money': Decimal('6600.00')}
{'guests_number': 2, 'email': 'user1000@test.com', 'money': Decimal('2500.00')}
{'guests_number': 0, 'email': 'zzzz@bbbbb.com', 'money': None}
Unfortunately, there is no distinct parameter in the Sum aggregation.
What is wrong with my query? How can I stop these numbers from growing with every aggregation added to annotate?
Raw SQL query investigation showed that the problem comes from multiple LEFT OUTER JOINs. So I ended up with raw SQL:
User.objects.extra(select={
    "money": """
        SELECT SUM("website_payment"."sum")
        FROM "website_payment"
        WHERE "website_user"."id" = "website_payment"."user_id"
        """,
    "guests_number": """
        SELECT COUNT("guests_guestaccount"."id")
        FROM "guests_guestaccount"
        WHERE "website_user"."id" = "guests_guestaccount"."user_id"
        """,
    }
).values('email', 'money', 'guests_number')
But I need to annotate these fields onto the queried objects, and extra doesn't do that.
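The effect of the two LEFT OUTER JOINs can be reproduced with the standard-library sqlite3 module. This is a sketch with made-up data: the table and column names (users, payments, guest_accounts, amount) are simplified stand-ins, and the rows are chosen so that one user has 5 payments of 330 and 4 guest accounts, which reproduces the 6600 / 20 versus 1650 / 4 figures from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE payments (id INTEGER PRIMARY KEY, user_id INTEGER, amount INTEGER);
    CREATE TABLE guest_accounts (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users (id) VALUES (1);
""")
# Made-up rows: 5 payments of 330 (total 1650) and 4 guest accounts for user 1.
conn.executemany("INSERT INTO payments (user_id, amount) VALUES (1, ?)", [(330,)] * 5)
conn.executemany("INSERT INTO guest_accounts (user_id) VALUES (?)", [(1,)] * 4)

# Both joins at once: every payment row is paired with every guest row (5 * 4 = 20).
joined = conn.execute("""
    SELECT SUM(payments.amount), COUNT(guest_accounts.id)
    FROM users
    LEFT OUTER JOIN payments ON payments.user_id = users.id
    LEFT OUTER JOIN guest_accounts ON guest_accounts.user_id = users.id
    GROUP BY users.id
""").fetchone()

# One correlated subquery per aggregate: the row sets never multiply each other.
subqueries = conn.execute("""
    SELECT (SELECT SUM(amount) FROM payments WHERE payments.user_id = users.id),
           (SELECT COUNT(id) FROM guest_accounts WHERE guest_accounts.user_id = users.id)
    FROM users
""").fetchone()

print(joined)      # (6600, 20)
print(subqueries)  # (1650, 4)
```

In the ORM, the Subquery and OuterRef expressions from django.db.models (available in newer Django versions) are one possible way to express the second shape inside annotate; that variant is not shown or tested here.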

Is this an error in the documentation?

Today I started reading the documentation for django.forms. The API seems easy to use and I started experimenting with it. Then I started experimenting with django.forms.ModelForm but I cannot really see where I went wrong.
My problem starts here: the save method when creating a form with an instance.
My model is
class Process(models.Model):
    key = models.CharField(max_length=32, default="")
    name = models.CharField(max_length=30)
    path = models.CharField(max_length=215)
    author = models.CharField(max_length=100)
    canparse = models.NullBooleanField(default=False)
    last_exec = models.DateTimeField(null = True)
    last_stop = models.DateTimeField(null = True)
    last_change = models.DateTimeField(null = True, auto_now=True)
and my form is
class ProcessForm(ModelForm):
    class Meta:
        model = Process
        fields = ('name', 'path', 'author')
I only wanted the name, path and author fields since the other ones are automatically set when saving the model. Anyway, in my test database I already have entries and I've chosen one whose fields are all set and valid.
In the documentation you can read:
# Create a form to edit an existing Article.
>>> a = Article.objects.get(pk=1)
>>> f = ArticleForm(instance=a)
>>> f.save()
Very well, I wanted to do the same with my own code:
>>> from remusdb.models import Process
>>> from monitor.forms import ProcessForm
>>>
>>> proc = Process.objects.get(name="christ")
>>> pf = ProcessForm(instance=proc)
>>> pf.save()
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/home/shaoran/devpython/lib/python2.6/site-packages/django/forms/models.py", line 364, in save
    fail_message, commit, construct=False)
  File "/home/shaoran/devpython/lib/python2.6/site-packages/django/forms/models.py", line 87, in save_instance
    save_m2m()
  File "/home/shaoran/devpython/lib/python2.6/site-packages/django/forms/models.py", line 78, in save_m2m
    cleaned_data = form.cleaned_data
AttributeError: 'ProcessForm' object has no attribute 'cleaned_data'
>>> pf.is_bound
False
>>> pf.is_valid()
False
Even though proc is a valid Process object, the form object doesn't seem to agree with me. If I follow the next example instead:
>>> post = { "name": "blabla", "path": "/somewhere", "author": "me" }
>>> pf = ProcessForm(post, instance=proc)
>>> pf.is_bound
True
>>> pf.is_valid()
True
>>> pf.cleaned_data
{'path': u'/somewhere', 'name': u'blabla', 'author': u'me'}
then it works as in the third example of the documentation.
Am I missing something or is there an error in the documentation? Or is my Model code somewhat wrong?
This is the content of proc:
>>> proc.__dict__
{'name': u'christ', 'last_stop': datetime.datetime(2012, 10, 5, 16, 49, 13, 630040, tzinfo=), 'author': u'unkown', '_state': , 'canparse': False, 'last_exec': datetime.datetime(2012, 10, 5, 16, 49, 8, 545626, tzinfo=), 'key': u'aed72c9d46d2318b99ffba930a110610', 'path': u'/home/shaoran/projects/cascade/remusdb/test/samples/christ.cnf', 'last_change': datetime.datetime(2012, 10, 5, 16, 49, 13, 631764, tzinfo=), 'id': 5}
The first argument to the form class is a dictionary that contains values that you want the form to validate.
Since you never pass these values in, the form is unbound and has no input to validate, which is why cleaned_data does not exist. .save() relies on the validated data, so it fails.
You'll notice the form actually has no data:
af.data will be {} (empty dict)
af.is_bound will be False (as you haven't bound the form to any data)
Since there is no data, validation fails. The error is a bit misleading. If you pass in an empty dict:
af = ArticleForm({},instance=a)
af.save()
You'll get a more appropriate error:
ValueError: The Article could not be changed because the data didn't validate.
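The bound/unbound distinction can be illustrated with a toy class. This is a deliberately simplified imitation, not Django's Form or ModelForm: ToyForm, its required tuple and the dict used as the instance are all hypothetical, and real forms do far more during validation.

```python
class ToyForm:
    """Tiny imitation of bound/unbound behaviour; not Django's Form class."""

    required = ("name", "path", "author")

    def __init__(self, data=None, instance=None):
        self.is_bound = data is not None   # bound only when data was passed in
        self.data = data or {}
        self.instance = instance

    def is_valid(self):
        if not self.is_bound:
            return False                   # an unbound form has nothing to validate
        missing = [f for f in self.required if not self.data.get(f)]
        if not missing:
            # cleaned_data only comes into existence after successful validation
            self.cleaned_data = {f: self.data[f] for f in self.required}
        return not missing

    def save(self):
        if not self.is_valid():
            raise ValueError("The data didn't validate.")
        self.instance.update(self.cleaned_data)
        return self.instance


proc = {"name": "christ", "path": "/old/path", "author": "unkown"}
unbound = ToyForm(instance=proc)
bound = ToyForm({"name": "blabla", "path": "/somewhere", "author": "me"}, instance=proc)

print(unbound.is_bound, unbound.is_valid())  # False False
print(bound.is_bound, bound.is_valid())      # True True
print(bound.save()["name"])                  # blabla
```

Passing only instance gives the form initial values to display but no submitted data, so it stays unbound; only the form that receives a data dictionary can validate and save.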

Django query — how to get list of dictionaries with M2M relation?

Let's say, I have this simple application with two models — Tag and SomeModel
class Tag(models.Model):
    text = ...
class SomeModel(models.Model):
    tags = models.ManyToManyField(Tag, related_name='tags')
And I want to get something like this from database:
[{'id': 1, 'tags': [1, 4, 8, 10]}, {'id': 6, 'tags': []}, {'id': 8, 'tags': [1, 2]}]
It is a list of dictionaries, one per SomeModel, each holding the SomeModel's id and the ids of its tags.
What should the Django query look like? I tried this:
>>> SomeModel.objects.values('id', 'tags').filter(pk__in=[1,6,8])
[{'id': 1, 'tags': 1}, {'id': 1, 'tags': 4}, {'id': 1, 'tags': 8}, ...]
This is not what I want, so I tried something like this:
>>> SomeModel.objects.values_list('id', 'tags').filter(pk__in=[1,6,8])
[(1, 1), (1, 4), (1, 8), ...]
And my last try was:
>>> SomeModel.objects.values_list('id', 'tags', flat=True).filter(pk__in=[1,6,8])
...
TypeError: 'flat' is not valid when values_list is called with more than one field.
—
Maybe Django cannot do this, so the most similar result to what I want is:
[{'id': 1, 'tags': 1}, {'id': 1, 'tags': 4}, {'id': 1, 'tags': 8}, ...]
Is there a Python built-in that transforms it into this?
[{'id': 1, 'tags': [1, 4, 8, 10]}, {'id': 6, 'tags': []}, {'id': 8, 'tags': [1, 2]}]
— EDIT:
If I write a method on SomeModel:
class SomeModel(models.Model):
    tags = models.ManyToManyField(Tag, related_name='tags')
    def get_tag_ids(self):
        aid = []
        for a in self.tags.all():
            aid.append(a.id)
        return aid
And then call:
>>> sm = SomeModel.objects.only('id', 'tags').filter(pk__in=[1,6,8])
# Hit database
>>> for s in sm:
...     s.get_tag_ids()
...
>>> # Hit database 3 times.
This does not work for me, because it accesses the database 4 times. I need just one access.
As ArgsKwargs mentioned in the comments, I wrote my own code, which packs the list:
>>> sm = SomeModel.objects.values('id', 'tags').filter(pk__in=[1,6,8])
>>> a = {}
>>> for s in sm:
...     if s['id'] not in a:
...         a[s['id']] = [s['tags'],]
...     else:
...         a[s['id']].append(s['tags'])
...
The output of this code is exactly what I need, and it hits the database only once. But it is not very elegant; I don't like this code :)
By the way, is it better to use pk or id in queries? .values('id', 'tags') or .values('pk', 'tags')?
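The packing loop from the question can be written a little more compactly with collections.defaultdict. This is a sketch on hard-coded rows standing in for the queryset result; the assumption that a SomeModel without tags comes back as a single row with 'tags': None is mine and should be checked against the real query.

```python
from collections import defaultdict

# Stand-in for list(SomeModel.objects.values('id', 'tags').filter(pk__in=[1, 6, 8])).
rows = [
    {'id': 1, 'tags': 1}, {'id': 1, 'tags': 4},
    {'id': 1, 'tags': 8}, {'id': 1, 'tags': 10},
    {'id': 6, 'tags': None},   # assumption: an object without tags yields None
    {'id': 8, 'tags': 1}, {'id': 8, 'tags': 2},
]

grouped = defaultdict(list)
for row in rows:
    tags = grouped[row['id']]          # creates the empty list on first sight of an id
    if row['tags'] is not None:
        tags.append(row['tags'])

result = [{'id': pk, 'tags': tags} for pk, tags in grouped.items()]
print(result)
# [{'id': 1, 'tags': [1, 4, 8, 10]}, {'id': 6, 'tags': []}, {'id': 8, 'tags': [1, 2]}]
```

The grouping happens entirely in Python after a single query, so the number of database hits stays the same as in the hand-written loop.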
What about a custom method on the model that returns a list of all tags?
class Tag(models.Model):
    text = ...
class SomeModel(models.Model):
    tags = models.ManyToManyField(Tag, related_name='tags')
    def all_tags(self):
        return self.tags.values_list('pk',flat=True)
and then
SomeModel.objects.values('id', 'all_tags').filter(pk__in=[1,6,8])