Python/Django recursion issue with saved querysets - django

I have user profiles that are each assigned a manager. I thought using recursion would be a good way to query every employee at every level under a particular manager. The goal is, if the CEO were to sign in, he should be able to query everyone at the company - but If I sign on I can only see people in my immediate team and the people below them, etc. until you get to the low level employees.
However when I run the following:
def team_training_list(request):
# pulls all training documents from training document model
user = request.user
manager_direct_team = Profile.objects.filter(manager=user)
query = Profile.objects.filter(first_name='fake')
trickle_team = manager_loop(manager_direct_team, query)
# manager_trickle_team = manager_direct_team | trickle_team
print(trickle_team)
def manager_loop(list, query):
for member in list:
user_instance = User.objects.get(username=member)
has_team = Profile.objects.filter(manager=user_instance)
if has_team:
query = query | has_team
manager_loop(has_team, query)
else:
continue
return query
It only returns the last query that was run instead of the compiled queryset that I am trying to grow. I've tried placing 'return' before 'manager_loop(has_team, query) in order save the values but it also kills the loop at the first non-manager employee instead of continuing to the next employee.
I'm new to django so if there is an better way than recursion to pull the information that I need, I'd appreciate suggestions on that too.
EDIT:
As requested, here is the profile model.
class Profile(models.Model):
user = models.OneToOneField(User, on_delete=models.CASCADE)
first_name = models.CharField(max_length=30, blank=False)
last_name = models.CharField(max_length=30, blank=False)
email = models.EmailField( blank=True, help_text='Optional',)
receive_email_notifications = models.BooleanField(default=False)
mobile_number = models.CharField(
max_length=15,
blank=True,
help_text='Optional'
)
carrier_options = (
(None, ''),
('#txt.att.net', 'AT&T'),
('#messaging.sprintpcs.com', 'Sprint'),
('#tmomail.net', 'T-Mobile'),
('#vtext.com', 'Verizon'),
)
mobile_carrier = models.CharField(max_length=25, choices=carrier_options, blank=True,
help_text='Optional')
receive_sms_notifications = models.BooleanField(default=False)
job_title = models.ForeignKey(JobTitle, unique=False, null=True)
manager = models.ForeignKey(User, unique=False, blank=True, related_name='+', null=True)

Ok, so it's a hierarchical model.
The problem with your current approach is this line:
query = query | has_team
This reassigns the local name query to a new queryset, but does not reassign the name in the caller. (Well, that's what I think it's trying to do - I am a little rusty but I don't think you can just | together querysets like that.) You'd also need something like:
query = manager_loop(has_team, query)
to propagate the changes via the returned object.
That said, while Django doesn't have built-in support for recursive queries, there are some third party packages that do. Old answers eg (Django self-recursive foreignkey filter query for all childs and Creating efficient database queries for hierarchical models (django)) recommend django-mptt. Your tag mentions postgres, so this post might be relevant:
https://two-wrongs.com/fast-sql-for-inheritance-in-a-django-hierarchy
If you don't use a third-party approach, it should be possible to clean up the evolution of the queryset - cast it to a set and use update or something, since you're accumulating profiles. But the key error is not using the returned modified object.

Related

how to build query with several manyTomany relationships - Django

I really don't understand all the ways to build the right query.
I have the following models in the code i'm working on. I can't change models.
models/FollowUp:
class FollowUp(BaseModel):
name = models.CharField(max_length=256)
questions = models.ManyToManyField(Question, blank=True, )
models/Survey:
class Survey(BaseModel):
name = models.CharField(max_length=256)
followup = models.ManyToManyField(
FollowUp, blank=True, help_text='questionnaires')
user = models.ManyToManyField(User, blank=True, through='SurveyStatus')
models/SurveyStatus:
class SurveyStatus(models.Model):
user = models.ForeignKey(User, on_delete=models.CASCADE)
survey = models.ForeignKey(Survey, on_delete=models.CASCADE)
survey_status = models.CharField(max_length=10,
blank=True,
null=True,
choices=STATUS_SURVEY_CHOICES,
)
models/UserSurvey:
class UserSurvey(BaseModel):
user = models.ForeignKey(User, null=True, blank=True,
on_delete=models.DO_NOTHING)
followups = models.ManyToManyField(FollowUp, blank=True)
surveys = models.ManyToManyField(Survey, blank=True)
questions = models.ManyToManyField(Question, blank=True)
#classmethod
def create(cls, user_id):
user = User.objects.filter(pk=user_id).first()
cu_quest = cls(user=user)
cu_quest.save()
cu_quest._get_all_active_surveys
cu_quest._get_all_followups()
cu_quest._get_all_questions()
return cu_quest
def _get_all_questions(self):
[[self.questions.add(ques) for ques in qstnr.questions.all()]
for qstnr in self.followups.all()]
return
def _get_all_followups(self):
queryset = FollowUp.objects.filter(survey__user=self.user).filter(survey__user__surveystatus_survey_status='active')
# queryset = self._get_all_active_surveys()
[self.followups.add(quest) for quest in queryset]
return
#property
def _get_all_active_surveys(self):
queryset = Survey.objects.filter(user=self.user,
surveystatus__survey_status='active')
[self.surveys.add(quest) for quest in queryset]
return
Now my questions:
my view sends to the create of the UserSurvey model in order to create a questionary.
I need to get all the questions of the followup of the surveys with a survey_status = 'active' for the user (the one who clicks on a button)...
I tried several things:
I wrote the _get_all_active_surveys() function and there I get all the surveys that are with a survey_status = 'active' and then the _get_all_followups() function needs to call it to use the result to build its own one. I have an issue telling me that
a list is not a callable object.
I tried to write directly the right query in _get_all_followups() with
queryset = FollowUp.objects.filter(survey__user=self.user).filter(survey__user__surveystatus_survey_status='active')
but I don't succeed to manage all the M2M relationships. I wrote the query above but issue also
Related Field got invalid lookup: surveystatus_survey_status
i read that a related_name can help to build reverse query but i don't understand why?
it's the first time i see return empty and what it needs to return above. Why this notation?
If you have clear explanations (more than the doc) I will very appreciate.
thanks
Quite a few things to answer here, I've put them into a list:
Your _get_all_active_surveys has the #property decorator but neither of the other two methods do? It isn't actually a property so I would remove it.
You are using a list comprehension to add your queryset objects to the m2m field, this is unnecessary as you don't actually want a list object and can be rewritten as e.g. self.surveys.add(*queryset)
You can comma-separate filter expressions as .filter(expression1, expression2) rather than .filter(expression1).filter(expression2).
You are missing an underscore in surveystatus_survey_status it should be surveystatus__survey_status.
Related name is just another way of reverse-accessing relationships, it doesn't actually change how the relationship exists - by default Django will do something like ModelA.modelb_set.all() - you can do reverse_name="my_model_bs" and then ModelA.my_model_bs.all()

What is the best way to handle DJANGO migration data with over 500k records for MYSQL

A migration handles creating two new fields action_duplicate and status_duplicate
The second migration copies the data from the action and status fields, to the two newly created fields
def remove_foreign_keys_from_user_request(apps, schema_editor):
UserRequests = apps.get_model("users", "UserRequest")
for request_initiated in UserRequest.objects.all().select_related("action", "status"):
request_initiated.action_duplicate = request_initiated.action.name
request_initiated.status_duplicate = request_initiated.status.name
request_initiated.save()
The third migration is suppose to remove/delete the old fields action and status
The fourth migration should rename the new duplicate fields to the old deleted fields
The solution here is to remove the dependency on the status and action, to avoid unnecessary data base query, since the status especially will only be pending and completed
My question is for the second migration. The number of records are between 300k to 600k records so I need to know a more efficient way to do this so it doesn't take up all the memory available.
Note: The Database is MySQL.
A trimmed-down version of the UserRequest model
class UserRequest(models.Model):
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
reference = models.CharField(max_length=50, null=True, blank=True)
requester = models.ForeignKey(User, blank=True, null=True, on_delete=models.CASCADE)
action = models.ForeignKey(Action, on_delete=models.CASCADE)
action_duplicate = models.CharField(
max_length=50, choices=((ACTION_A, ACTION_A), (ACTION_B, ACTION_B)), default=ACTION_A
)
status = models.ForeignKey(ProcessingStatus, on_delete=models.CASCADE)
status_duplicate = models.CharField(
max_length=50,
choices=((PENDING, PENDING), (PROCESSED, PROCESSED)),
default=PENDING,
)
You can work with a Subquery expression [Django-doc], and do the update in bulk:
def remove_foreign_keys_from_user_request(apps, schema_editor):
UserRequests = apps.get_model('users', 'UserRequests')
Action = apps.get_user('users', 'Action')
Status = apps.get_user('users', 'ProcessingStatus')
UserRequests.objects.update(
action_duplicate=Subquery(
Action.objects.filter(
pk=OuterRef('action_id')
).values('name')[:1]
),
status_duplicate=Subquery(
Status.objects.filter(
pk=OuterRef('status_id')
).values('name')[:1]
)
)
That being said, it looks that what you are doing is actually the opposite of database normalization [wiki]: usually if there is duplicated data, you make an extra model where you make one Action/Status per value, and thus prevent having the same value for action_duplicate/status_duplicate multiple times in the database: this will make the database larger, and harder to maintain.
Note: normally a Django model is given a singular name, so UserRequest instead of UserRequests.

Django Query: How to order posts by amount of upvotes?

I'm currently working on a website (with Django), where people can write a story, which can be upvoted by themselves or by other people. Here are the classes for Profile, Story and Upvote:
class Profile(AbstractBaseUser, PermissionsMixin):
email = models.EmailField(unique=True)
first_name = models.CharField(max_length=30, null=True)
last_name = models.CharField(max_length=30, null=True)
class Story(models.Model):
author = models.ForeignKey('accounts.Profile', on_delete=models.CASCADE, related_name="author")
title = models.CharField(max_length=50)
content = models.TextField(max_length=10000)
class Upvote(models.Model):
profile = models.ForeignKey('accounts.Profile', on_delete=models.CASCADE, related_name="upvoter")
story = models.ForeignKey('Story', on_delete=models.CASCADE, related_name="upvoted_story")
upvote_time = models.DateTimeField(auto_now=True)
As you can see, Upvote uses two foreign keys to store the upvoter and the related story. Now I want to make a query which gives me all the stories, sorted by the amount of upvotes they have. I've tried my best to come up with some queries myself, but it's not exactly what I'm searching for.
This one doesn't work at all, since it just gives me all the stories in the order they were created, for some reason. Also it contains duplicates, although I want them to be grouped by story.
hot_feed = Upvote.objects.annotate(upvote_count=Count('story')).order_by('-upvote_count')
This one kind of works. But if I'm trying to access a partical story in my template, it just gives me back the id. So I'm not able to fetch the title, author and content from that id, since it's just an integer, and not an object.
hot_feed = Upvote.objects.values('story').annotate(upvote_count=Count('story')).order_by('-upvote_count')
Could someone help me out with finding the query I'm searching for?
You are querying from the wrong model, you here basically fetch Upvotes ordered by the number of stories, or something similar.
But your probaby want to retrieve Storys by the number of upvotes, so you need to use Story as "queryset root", and annotate it with the number of upvotes:
Story.objects.annotate(
upvote_count=Count('upvoted_story')
).order_by('-upvote_count')
I think the related_name of your story is however a bit "misleading". The related_name is the name of the relation "in reverse", so probably a better name is upvotes:
class Upvote(models.Model):
profile = models.ForeignKey(
'accounts.Profile',
on_delete=models.CASCADE,
related_name='upvotes'
)
story = models.ForeignKey(
'Story',
on_delete=models.CASCADE,
related_name='upvotes'
)
upvote_time = models.DateTimeField(auto_now=True)
In that case the query is:
Story.objects.annotate(
upvote_count=Count('upvotes')
).order_by('-upvote_count')

Django user audit

I would like to create a view with a table that lists all changes (created/modified) that a user has made on/for any object.
The Django Admin site has similar functionality but this only works for objects created/altered in the admin.
All my models have, in addition to their specific fields, following general fields, that should be used for this purpose:
created_by = models.ForeignKey(User, verbose_name='Created by', related_name='%(class)s_created_items',)
modified_by = models.ForeignKey(User, verbose_name='Updated by', related_name='%(class)s_modified_items', null=True)
created = CreationDateTimeField(_('created'))
modified = ModificationDateTimeField(_('modified'))
I tried playing around with:
u = User.objects.get(pk=1)
u.myobject1_created_items.all()
u.myobject1_modified_items.all()
u.myobject2_created_items.all()
u.myobject2_modified_items.all()
... # repeat for >20 models
...and then grouping them together with itertool's chain(). But the result is not a QuerySet which makes it kind of non-Django and more difficult to handle.
I realize there are packages available that will do this for me, but is it possible to achieve what I want using the above models, without using external packages? The required fields (created_by/modified_by and their timefields) are in my database already anyway.
Any idea on the best way to handle this?
Django admin uses generic foreign keys to handle your case so you should probably do something like that. Let's take a look at how django admn does it (https://github.com/django/django/blob/master/django/contrib/admin/models.py):
class LogEntry(models.Model):
action_time = models.DateTimeField(_('action time'), auto_now=True)
user = models.ForeignKey(settings.AUTH_USER_MODEL)
content_type = models.ForeignKey(ContentType, blank=True, null=True)
object_id = models.TextField(_('object id'), blank=True, null=True)
object_repr = models.CharField(_('object repr'), max_length=200)
action_flag = models.PositiveSmallIntegerField(_('action flag'))
change_message = models.TextField(_('change message'), blank=True)
So, you can add an additional model (LogEntry) that will hold a ForeignKey to the user that changed (added / modified) the object and a GenericForeignKey (https://docs.djangoproject.com/en/1.7/ref/contrib/contenttypes/#generic-relations) to the object that was modified.
Then, you can modify your views to add LogEntry objects when objects are modified. When you want to display all changes by a User, just do something like:
user = User.objects.get(pk=1)
changes = LogEntry.objects.filter(user=user)
# Now you can use changes for your requirement!
I've written a nice blog post about that (auditing objects in django) which could be useful: http://spapas.github.io/2015/01/21/django-model-auditing/#adding-simple-auditing-functionality-ourselves

Efficient alternative at Django filter by property

I know that filtering by property is not possible with Django, as filtering is done at database level and properties live in Python code. However, I have the following scenario:
In one hand, I have the model RegisteredUser on the other hand Subscription. A user can have multiple subscriptions, a subscription is from one user and a user has one or none active subscriptions.
To implement this, I have a foreign key from Subscription to RegisteredUser and a property subscription at RegisteredUser that points to the active one (latest created subscription for that user) or none if he hasn't any subscriptions.
Which would be the most efficent way to filter users that have subscription "platinum", "gold", "silver"...? I could do a "fetch all subscriptions" and then iterate over them to check each one for a match. But it would be really expensive and if I have to do the same process for each kind of subscription type, then cost would be s * u (where s is the number of different subscriptions and u is the number of users).
Any help will be appreciated. Thanks in advance!
UPDATE:
When I first explained the problem, I didn't include all the models related to
simplify a litte. But as you are asking me for the models and some of you haven't understood me
(perhaps I wasn't clear enough) here you have the code.
I've simplified the models and stripped out code that is not important now.
What do I have here? A RegisteredUser can have many subscriptions (because he may change it
as many times as he wants), and a subscription is from just one user. The user has only
one current subscription, which is the latest one and is returned by the property
subscription. Subscription is attached with Membership and this is the model whose
slug can be: platinum, gold, silver, etc.
What do I need? I need to lookup Content whose author has a specific kind of membership.
If the property approach worked, I'd have done it like this:
Content.objects.filter(author__id__in=RegisteredUser.objects.filter(
subscription__membership__slug="gold"))
But I can't do this because properties can't be used when filtering!
I thought that I could solve the problem converting the "virtual" relation created by
the property into a real ForeignKey, but this may cause side effects, as I should update it manually each time a user changes its subscription and now it's automatic! Any better ideas?
Thanks so much!
class RegisteredUser(AbstractUser):
birthdate = models.DateField(_("Birthdate"), blank=True, null=True)
phone_number = models.CharField(_("Phone number"), max_length=9, blank=True, default="")
#property
def subscription(self):
try:
return self.subscriptions_set.filter(active=True).order_by("-date_joined",
"-created")[0]
except IndexError:
return None
class Subscription(models.Model):
date_joined = models.DateField(_("Date joined"), default=timezone.now)
date_canceled = models.DateField(_("Date canceled"), blank=True, null=True)
subscriber = models.ForeignKey(AUTH_USER_MODEL, verbose_name=_("Subscriber"),
related_name="subscriptions_set")
membership = models.ForeignKey(Membership, verbose_name=_("Membership"),
related_name="subscriptions_set")
created = models.DateTimeField(_("Created"), auto_now_add=True)
last_updated = models.DateTimeField(_("Last updated"), auto_now=True)
active = models.BooleanField(_("Active"), default=True)
class Membership(models.Model):
name = models.CharField(_("Name"), max_length=15)
slug = models.SlugField(_("Slug"), max_length=15, unique=True)
price = models.DecimalField(_("Price"), max_digits=6, decimal_places=2)
recurring = models.BooleanField(_("Recurring"))
duration = models.PositiveSmallIntegerField(_("Duration months"))
class Content(models.Model):
author = models.ForeignKey(AUTH_USER_MODEL, verbose_name=_("Author"),
related_name="contents_set")
title = models.CharField(_("Title"), max_length=50)
slug = models.SlugField(_("Slug"), max_length=70, unique=True)
content = RichTextField(_("Content"))
date = models.DateField(_("Date"), default=timezone.now)
published = models.BooleanField(_("Published"))
Finally, to solve the problem I replaced the subscription property by a real foreign key and added a signal to attach the RegisteredUser with the created subscription.
Foreign key:
subscription = models.ForeignKey(Subscription, verbose_name=_("Subscription"),
related_name='subscriber_set', blank=True, null=True)
Signal:
#receiver(post_save, sender=Subscription)
def signal_subscription_post_save(sender, instance, created, **kwargs):
if created:
instance.subscriber.subscription = instance
instance.subscriber.save()
I think you model are something like:
KIND = (("p", "platinum"), ("g","gold"), ("s","silver"),)
class RegisteredUser(models.Model):
# Fields....
class Subscription(models.Model):
kind = models.CharField(choices=KIND, max_len=2)
user = models.ForeignKey(RegisteredUser, related_name="subscriptions")
Now, you can do something like that:
gold_users = RegisteredUser.objects.filter(subscriptions_kind="g")
silver_users = RegisteredUser.objects.filter(subscriptions_kind="s")
platinum_users = RegisteredUser.objects.filter(subscriptions_kind="p")
Adapt it to your models
Hope helps
EDIT
Now, With your models, I think you want something like:
content_of_golden_users = Content.objects.filter(author__subscriptions_set__membership__slug="golden")
content_of_silver_users = Content.objects.filter(author__subscriptions_set__membership__slug="silver")
content_of_platinum_users = Content.objects.filter(author__subscriptions_set__membership__slug="platinum")