Consider a simple setup with User and Content models. I would like to get the distribution of content per user, including 0 for users without content:
per_user | count
----------+-------
0 | 89
1 | 15
2 | 14
For the sake of this question the barebone models are:
import uuid

from django.db import models

class User(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)

class Content(models.Model):
    user = models.ForeignKey(User, on_delete=models.PROTECT)
One way to do this in pure SQL is:
SELECT
    per_user,
    COUNT(per_user) AS count
FROM (
    SELECT COUNT(c.id) AS per_user
    FROM app_user u
    LEFT JOIN app_content c ON (c.user_id = u.id)
    GROUP BY u.id
) AS sub
GROUP BY
    per_user
ORDER BY
    per_user DESC;
I can do this to get the per_user count:
User.objects.annotate(per_user=Count("content")).values("per_user")
Unfortunately I cannot stick another .annotate(c=Count("per_user")) at the end of this:
FieldError: Cannot compute Count('per_user'): 'per_user' is an aggregate
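One possible workaround, offered as a sketch rather than a confirmed answer, is to let the database compute the per-user counts (the part the annotate() call above already handles) and tally those counts in Python with collections.Counter. The example below imitates this with the standard-library sqlite3 module instead of Django; the sample rows are made up, and in Django the list of counts would presumably come from adding values_list("per_user", flat=True) to the queryset above.

```python
import sqlite3
from collections import Counter

# In-memory stand-ins for the app_user and app_content tables (sample data is invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (id INTEGER PRIMARY KEY);
    CREATE TABLE app_content (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES app_user(id)
    );
    INSERT INTO app_user (id) VALUES (1), (2), (3), (4);
    INSERT INTO app_content (user_id) VALUES (1), (1), (2);
""")

# Step 1: one count per user; the LEFT JOIN keeps users without content (count 0).
per_user_counts = [
    row[0]
    for row in conn.execute("""
        SELECT COUNT(c.id)
        FROM app_user u
        LEFT JOIN app_content c ON c.user_id = u.id
        GROUP BY u.id
    """)
]

# Step 2: count how many users share each per-user count.
distribution = sorted(Counter(per_user_counts).items())
print(distribution)
```

This moves the second aggregation out of the database, which is fine for a modest number of users but transfers one row per user to Python.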
Related
It's been a while since I've used SQL, so I'm not 100% sure whether LEFT or INNER JOIN is the correct term. When I googled this, in most cases people just wanted to concatenate the results, which is not what an SQL JOIN does. I have 3 models. Simplified, they are as follows:
class Stakeholders(models.Model):
    firstname = models.CharField(max_length=50)
    lastname = models.CharField(max_length=50)
    email = models.EmailField(max_length=254)

class Policy(models.Model):
    name = models.CharField(max_length=50)
    description = models.CharField(max_length=150)

class StakeholderPolicyResp(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    stakeholder = models.ForeignKey(Stakeholders, on_delete=models.CASCADE)
    policy = models.ForeignKey(Policy, on_delete=models.CASCADE)
    response = models.IntegerField(default=0)
I want to create a table that has a unique stakeholder with the response for up to 4 policies.
A simple example:
Stakeholder
-------------------
1 John Doe jd#email.com
Policy
-------------------
1 Policy 1
2 Policy 2
3 Policy 3
StakeholderPolicyResp
-------------------
UID1 1 1 0
UID2 1 2 4
UID3 1 3 3
What I want out is a table that can have a varying number of columns, but something like this:
MyTable
------------------
Stakeholder data - Policy x Resp - Policy y Resp - Policy z Resp
==================
John | Doe | 0 | 4 | 3
Here we have the stakeholder with the responses for policy 1, policy 2 and policy 3. From here I'm going to render the data, and doing that is fairly easy. It need not be in queryset format. A possible format might be a pandas DataFrame, but is there an easier, more Django-like way of doing it?
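As a rough sketch of one way to build such a table without pandas, the response rows can be pivoted in plain Python. The helper name pivot_responses and the sample rows are hypothetical; in Django the rows could plausibly be fetched with StakeholderPolicyResp.objects.values_list("stakeholder_id", "policy_id", "response").

```python
from collections import defaultdict


def pivot_responses(rows, policy_ids):
    """Turn (stakeholder_id, policy_id, response) rows into one list per stakeholder.

    Each list has one slot per entry in policy_ids; a missing response becomes None.
    """
    by_stakeholder = defaultdict(dict)
    for stakeholder_id, policy_id, response in rows:
        by_stakeholder[stakeholder_id][policy_id] = response
    return {
        stakeholder_id: [responses.get(policy_id) for policy_id in policy_ids]
        for stakeholder_id, responses in by_stakeholder.items()
    }


# Rows mirroring the StakeholderPolicyResp example above (UUIDs omitted).
rows = [(1, 1, 0), (1, 2, 4), (1, 3, 3)]
table = pivot_responses(rows, policy_ids=[1, 2, 3])
print(table)
```

The column order is controlled by policy_ids, so the same helper works for any subset of up to 4 policies.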
The output could look like this, for example:
 id | secondary_id | fk
----+--------------+----
  1 |            1 |  1
  2 |            2 |  1
  3 |            3 |  1
  4 |            1 |  2
  5 |            2 |  2
For context:
(see models below)
I have a commission structure which will have brackets depending on how much a user is earning in a month.
Ideally, I need to know in my Commission Bracket model, the bracket index for a given structure.
Here are my models.
class CommissionStructure(APIBaseModel):
    advisor = models.ManyToManyField(AdviserDetail)
    name = models.CharField(max_length=20, blank=True, null=True, default='default')
    company = models.ForeignKey(Company, on_delete=models.CASCADE)
    start_dt = models.DateTimeField(auto_now_add=True)
    end_dt = models.DateTimeField(default=timezone.datetime.max)
    objects = CommissionStructureManager()

class CommissionBracket(APIBaseModel):
    # <secondary_id ???>
    commission_structure = models.ForeignKey(CommissionStructure, on_delete=models.CASCADE, related_name="brackets")
    lower_bound = models.DecimalField(decimal_places=2, default=0.00, max_digits=20, null=True, blank=True)
    upper_bound = models.DecimalField(decimal_places=2, default=0.00, max_digits=20, null=True, blank=True)
Please note, I may not have to store it on my model if I can add an annotation to an aggregate set, but my preference is to follow DRY.
Thank you
My suggestion would be to execute custom SQL directly. You can add the secondary id as an integer field in CommissionBracket. Then, you can implement this:
from django.db import connection
from django.shortcuts import render

def sample_view(request):
    ...
    with connection.cursor() as cursor:
        cursor.execute('''
            INSERT INTO appname_commissionbracket (
                secondary_id,
                commission_structure_id
            )
            SELECT CASE
                    -- explicit NULL check; a bare integer condition is not portable
                    WHEN MAX(secondary_id) IS NOT NULL
                    THEN MAX(secondary_id) + 1
                    ELSE 1
                END AS new_secid, %s
            FROM appname_commissionbracket
            WHERE commission_structure_id = %s''',
            [1, 1]  # Sample foreign key value
        )
    return render(...)
Here we're using INSERT INTO ... SELECT since we're basing the new record's secondary_id on values from the same table. We're also adding a CASE so that we have a fallback value of 1 if no record with commission_structure_id equal to 1 exists yet.
In case you need to populate other columns during create, you can simply include them like so:
INSERT INTO appname_commissionbracket (secondary_id, commission_structure_id, lower_bound, upper_bound)
SELECT CASE ... END AS new_secid, <fk_value>, <lower_bound_value>, <upper_bound_value>
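To illustrate how the INSERT INTO ... SELECT behaves, here is a small sketch that runs the same idea against an in-memory SQLite database using the standard-library sqlite3 module. It swaps the CASE expression for COALESCE(MAX(secondary_id), 0) + 1, which should be equivalent; the trimmed table definition and the add_bracket helper are assumptions made for the example, not part of the original models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed stand-in for the CommissionBracket table (only the columns used here).
conn.execute("""
    CREATE TABLE appname_commissionbracket (
        id INTEGER PRIMARY KEY,
        secondary_id INTEGER NOT NULL,
        commission_structure_id INTEGER NOT NULL
    )
""")


def add_bracket(conn, structure_id):
    """Insert a bracket whose secondary_id is the next index within its structure."""
    conn.execute(
        """
        INSERT INTO appname_commissionbracket (secondary_id, commission_structure_id)
        SELECT COALESCE(MAX(secondary_id), 0) + 1, ?
        FROM appname_commissionbracket
        WHERE commission_structure_id = ?
        """,
        (structure_id, structure_id),
    )


for structure_id in (1, 1, 1, 2, 2):
    add_bracket(conn, structure_id)

rows = conn.execute(
    "SELECT id, secondary_id, commission_structure_id "
    "FROM appname_commissionbracket ORDER BY id"
).fetchall()
print(rows)
```

Note that two concurrent inserts for the same structure could compute the same secondary_id, so a unique constraint on (commission_structure_id, secondary_id) may be worth adding.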
I've found a way to annotate the queryset, but out of interest, my original question still remains: how do I add another field partitioned by the foreign key?
from django.db.models import Count, F, Window

brackets = (
    CommissionBracket.objects.select_related("commission_structure")
    .prefetch_related(
        # Only relations can be prefetched; the start_dt/end_dt lookups
        # belong in filter() below, so they are not repeated here.
        "commission_structure__advisor",
        "commission_structure__company",
        "bracket_values",
    )
    .filter(
        commission_structure__advisor=advisor,
        commission_structure__start_dt__lte=date,
        commission_structure__end_dt__gte=date,
        commission_structure__company=advisor.user.company,
    )
    .annotate(
        index=Window(
            expression=Count("id"),
            partition_by=[F("commission_structure")],
            order_by=F("lower_bound").asc(),
        )
    )
)
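For a per-foreign-key index, the usual SQL tool is ROW_NUMBER() with a PARTITION BY clause; Django exposes it as RowNumber in django.db.models.functions, which can be used as the Window expression in place of Count. The sketch below shows the underlying SQL with the standard-library sqlite3 module (window functions need SQLite 3.25 or newer). The table name bracket and the sample bounds are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the CommissionBracket table with invented bounds.
conn.executescript("""
    CREATE TABLE bracket (
        id INTEGER PRIMARY KEY,
        commission_structure_id INTEGER NOT NULL,
        lower_bound NUMERIC NOT NULL
    );
    INSERT INTO bracket (commission_structure_id, lower_bound) VALUES
        (1, 0), (1, 1000), (1, 5000), (2, 0), (2, 2500);
""")

# Number the brackets 1, 2, 3, ... separately within each commission structure.
indexed = conn.execute("""
    SELECT id,
           ROW_NUMBER() OVER (
               PARTITION BY commission_structure_id
               ORDER BY lower_bound
           ) AS secondary_id,
           commission_structure_id
    FROM bracket
    ORDER BY id
""").fetchall()
print(indexed)
```

Unlike a running COUNT, ROW_NUMBER assigns distinct values even when two brackets share the same lower_bound.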
models.py:
class Address(models.Model):
    text = models.TextField(max_length=2060, null=True, blank=True, default=None, unique=True)

class Tag(models.Model):
    text = models.CharField(max_length=255, null=True, blank=True, default=None, unique=True)

class AddressTagJoin(models.Model):
    address = models.ForeignKey(Address, on_delete=models.CASCADE, null=True, blank=True, related_name='address_tag_join')
    tag = models.ForeignKey(Tag, on_delete=models.CASCADE, null=True, blank=True, related_name='address_tag_join')
In the above, Address and Tag objects are only used as foreign key targets of AddressTagJoin.
What I want is two kinds of queryset.
Given the address "https://www.google.com", I want a Tag queryset ordered by how often each tag is used for that Address (text = "www.google.com"):
Tag.objects.order_by(count_of_AddressTagJoin_and_It's_address_foreignkey_is_for_"www.google.com")
Conversely, given the tag "google", I want an Address queryset ordered by how often each address is used with that Tag (text="google"):
Address.objects.order_by(count_of_AddressTagJoin_and_It's_tag_foreignkey_is_for_"google")
How can I do that?
From what I understand, you require:
"For the address "google.com", the most used tags, in order"
I'll work up to the query using an example.
There is this table AddressTagJoin:
address__text | tag__id
"google.com" | 1
"google.com" | 2
"google.com" | 1
"yahoo.com" | 2
"google.com" | 3
"google.com" | 3
"google.com" | 3
If we filter AddressTagJoin on the address "google.com", group the result by tag__id to get a count per tag (the most used tags for that address), and order by that count, we get:
tag__id | tag_count
3 | 3
1 | 2
2 | 1
The desired result which you want is:
tags --> 3, 1, 2
Query for this will be:
from django.db.models import Count

tags_list = list(
    AddressTagJoin.objects.filter(address__text__icontains="www.google.com")
    .values('tag__id')
    .annotate(tag_count=Count('tag__id'))
    .order_by('-tag_count')
    .values_list('tag__id', flat=True)
)
# Note: id__in does not by itself preserve the order of tags_list.
tags = Tag.objects.filter(id__in=tags_list)
Note
Please check the query; small adjustments might be required. It should also give you an idea for the second query, since the two are almost the same.
Also, if you want to optimize this query, you can use select_related in the tags_list query. See the Django docs for details.
PS: I haven't implemented the models to check the query because of time constraints.
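The filter, group, count and order steps walked through above can be checked in plain Python. This is only an illustration of the logic on the example table, not the ORM query itself; the function name most_used_tags is made up for the sketch.

```python
from collections import Counter

# (address__text, tag__id) pairs copied from the example AddressTagJoin table.
join_rows = [
    ("google.com", 1),
    ("google.com", 2),
    ("google.com", 1),
    ("yahoo.com", 2),
    ("google.com", 3),
    ("google.com", 3),
    ("google.com", 3),
]


def most_used_tags(rows, address_text):
    """Return tag ids for one address, most frequently used first."""
    counts = Counter(tag_id for text, tag_id in rows if text == address_text)
    return [tag_id for tag_id, _count in counts.most_common()]


print(most_used_tags(join_rows, "google.com"))
```

Swapping the roles of the two columns gives the reverse query (addresses ordered by how often they use a given tag).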
I'm using Django 1.10 and have the following two models:
class Post(models.Model):
    title = models.CharField(max_length=500)
    text = models.TextField()

class UserPost(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    post = models.ForeignKey(Post, on_delete=models.CASCADE)
    approved = models.BooleanField(default=False)
How do I get a list of all the posts including the 'approved' property for the logged in user if exists? So instead of multiple queries, it would be one left join query, pseudo-code:
select * from posts as p
left join user_posts as up
on up.post_id = p.post_id
and up.user_id = 2
Output
post_id | title | text | user_id | approved
1 | 'abc' | 'abc' | 2 | true
2 | 'xyz' | 'xyz' | null | null
3 | 'foo' | 'bar' | 2 | true
I created the models this way because the 'approved' property belongs to the user. Every user can approve or reject a post, and the same post could be approved by some users and rejected by others. Should the models be set up differently?
Thanks
Update:
I'm trying to create a webpage that displays all available posts and highlights the ones the current user approved. I could just list all posts and then, for each post, check whether the 'UserPost' table has a row; if so, read its approved property, otherwise ignore it. But that means that with 100 posts I'm making 100 + 1 calls to the db. Is it possible to do this in 1 call using the ORM? If not, should the models be set up differently?
Then I think you need something like this:
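from django.db import models

Post.objects.all().annotate(
    approved=models.Case(
        # Reverse lookups use the lowercase model name ("userpost"),
        # not the "userpost_set" accessor name.
        models.When(userpost__user_id=2,
                    then=models.F('userpost__approved')),
        default=models.Value(False),
        output_field=models.BooleanField()
    )
)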
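To make the join semantics from the question concrete, the sketch below runs the pseudo-SQL against an in-memory SQLite database with the standard-library sqlite3 module and contrasts it with putting the user condition in WHERE. The table layout follows the pseudo-code; the row for user 7 is invented to show the difference. In newer Django versions (2.0 and later), FilteredRelation is the documented way to place such a condition in the ON clause, though it is not available in 1.10.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, title TEXT, text TEXT);
    CREATE TABLE user_posts (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        post_id INTEGER NOT NULL REFERENCES posts(post_id),
        approved INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO posts VALUES (1, 'abc', 'abc'), (2, 'xyz', 'xyz'), (3, 'foo', 'bar');
    INSERT INTO user_posts (user_id, post_id, approved) VALUES
        (2, 1, 1), (2, 3, 1), (7, 2, 0);
""")

# User condition inside ON: every post is kept, NULLs where user 2 has no row.
in_on = conn.execute("""
    SELECT p.post_id, up.user_id, up.approved
    FROM posts AS p
    LEFT JOIN user_posts AS up
        ON up.post_id = p.post_id AND up.user_id = 2
    ORDER BY p.post_id
""").fetchall()

# User condition in WHERE: posts without a row for user 2 are filtered out.
in_where = conn.execute("""
    SELECT p.post_id, up.user_id, up.approved
    FROM posts AS p
    LEFT JOIN user_posts AS up ON up.post_id = p.post_id
    WHERE up.user_id = 2
    ORDER BY p.post_id
""").fetchall()

print(in_on)
print(in_where)
```

The first query matches the desired output in the question, with one row per post; the second silently drops post 2, which is why the placement of the user condition matters.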
Model:
class Subjects(models.Model):
    name = models.CharField(max_length=100)
    places = models.CharField(max_length=100)

class Student(models.Model):
    name = models.CharField(max_length=40)
    lastname = models.CharField(max_length=80)
    subjects = models.ManyToManyField(Subjects, blank=True)
Django creates the appname_student_subjects table when I use the models above.
The appname_student_subjects table looks, for example, like this:
id | student_id | subjects_id
-----------------------------------------
1 | 1 | 10
2 | 4 | 11
3 | 4 | 19
4 | 5 | 10
...
~1000
How can I access the subjects_id field and count how many times a given subjects_id occurs in the table above (and then do something with it)? For example, if the subject with id 10 occurs twice, the template should display 2. I know that I should use len() on the result, but I don't know how to access the subjects_id field.
With foreign keys I'm doing it like this in a for loop:
results_all = Students.objects.filter(subject_id='10')
result = len(results_all)
and I pass result to the template and display it within a for loop but it's not a foreign key so it's not working.
You can access the through table directly.
num = (Student.subjects          # M2M descriptor on the Student model
       .through                  # auto-created appname_student_subjects through model
       .objects                  # through model's manager
       .filter(subjects_id=10)   # rows that reference the subject with id 10
       .count())