Accessing M2M field elements during the execution of a model's save method - django

I need to override a Django model's save method. I have used filter_horizontal for a many-to-many field in admin.py, and I need to access the contents of that many-to-many field in the save method. But the many-to-many field is always empty while the save method is executing. So I tried using a Timer thread to run the processing a little later, but that throws a thread-related error. Threads are not allowed in most server-side technologies, to avoid deadlock problems. Is there any way to run a block of code immediately after the save method has finished executing? I read something about the post_save signal, which is sent from the model's save_base method, but I don't know whether that will be useful.

Looks like you might want to use a custom model form, as described here: http://reinout.vanrees.org/weblog/2011/11/29/many-to-many-field-save-method.html
There are a lot of links to related SO questions and bugs there and in the comments.

Related

Django & Django ORM: performance improvements to bulk updating records

I'm hoping this will be a really simple question.
I just wanted some advice on the bulk updating of records.
An example bulk update may go something like this:
    for user in User.objects.all().iterator():
        user.is_active = False
        user.save()
Is there a more efficient way to do this on the database level using the Django ORM?
Would the following be more efficient:
User.objects.all().update(is_active=False)?
It will work, but be aware that update() is converted directly to an SQL statement, without running anything you customized in save() and without triggering the save signals. If that is not a problem in your case and you are more worried about performance, go for it.
From django docs:
Be aware that the update() method is converted directly to an SQL statement. It is a bulk operation for direct updates. It doesn’t run any save() methods on your models, or emit the pre_save or post_save signals (which are a consequence of calling save()), or honor the auto_now field option. If you want to save every item in a QuerySet and make sure that the save() method is called on each instance, you don’t need any special function to handle that.
https://docs.djangoproject.com/en/2.2/topics/db/queries/#updating-multiple-objects-at-once
Yes, using update would be more efficient. That will do a single database call, instead of one per object.
Yes. You can use User.objects.all().update(is_active=False).
This will result in a single query that looks like:
UPDATE `auth_user`
SET `is_active` = 0
It will thus reduce the number of roundtrips to the database to one.
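The difference in round trips can be sketched outside Django. The snippet below is only a stand-in: it uses the standard-library sqlite3 module rather than the ORM, the table is a two-column imitation of auth_user, and the helper names (make_db, deactivate_one_by_one, deactivate_in_bulk, active_count) are made up for illustration. It counts the statements each approach issues.

```python
import sqlite3


def make_db(n_users):
    """Create an in-memory table shaped loosely like auth_user (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE auth_user (id INTEGER PRIMARY KEY, is_active INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO auth_user (is_active) VALUES (?)", [(1,)] * n_users
    )
    return conn


def deactivate_one_by_one(conn):
    """Mimic the save() loop: one SELECT, then one UPDATE per row."""
    statements = 1  # the initial SELECT
    ids = [row[0] for row in conn.execute("SELECT id FROM auth_user")]
    for user_id in ids:
        conn.execute(
            "UPDATE auth_user SET is_active = 0 WHERE id = ?", (user_id,)
        )
        statements += 1
    return statements


def deactivate_in_bulk(conn):
    """Mimic QuerySet.update(): a single UPDATE covering every row."""
    conn.execute("UPDATE auth_user SET is_active = 0")
    return 1


def active_count(conn):
    """Number of rows still marked active."""
    row = conn.execute(
        "SELECT COUNT(*) FROM auth_user WHERE is_active = 1"
    ).fetchone()
    return row[0]
```

With N rows the loop issues N + 1 statements, while the bulk form issues one regardless of N. Both leave every row inactive.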

Why is an auto_now=True field designed to not update when using QuerySet.update()?

From the Django docs:
The field is only automatically updated when calling Model.save(). The field isn’t updated when making updates to other fields in other ways such as QuerySet.update(), though you can specify a custom value for the field in an update like that.
Both have to execute an UPDATE query, so what is the reason behind save() updating the auto_now=True field while QuerySet.update() does not?
An update query is meant to be faster than the regular pattern of changing fields and saving, so it does not call the save() method, which is what handles updating auto_now fields, sending signals and so on. If you're not sure what you're doing, then it's always a good idea to explicitly call save() on a model. Advanced and "less restricted" methods such as update or bulk_create are faster and meant for editing data at the DB level. From the Django docs:
Finally, realize that update() does an update at the SQL level and,
thus, does not call any save() methods on your models, nor does it
emit the pre_save or post_save signals (which are a consequence of
calling Model.save()).
If you were hoping for a more technical explanation, then the update query probably doesn't bother to check whether the table has an auto_now field. That would require some data gathering and make the process slower. If you do want the field updated, you can set it explicitly by passing a value for it in the same update() call.

How to save a related, inline django model when parent is saved in the admin?

In my models, I have an Event class, a Volunteer class, and a Session class. The Session class has a foreign key field for an Event and a Volunteer, and is a unique coupling of both, as well as a date and time. Taken together, Volunteer and Event I think technically have a ManyToMany relationship.
Using the pre-packaged Django admin, I edit Volunteers and Events with their own admin.ModelAdmin classes respectively. Sessions are edited inline in the Events ModelAdmin.
When I add a new Session to an event in the admin interface, with a Volunteer, I need the Volunteer's hours field to be automatically updated, to reflect however many hours the newly added session lasted (plus all past sessions). Currently, I just have a calculate_hours function in the Volunteer model, which iterates over all sessions each time it is called and finds the sum of the hours. I tried to call it with a custom save function in Session, but it appears never to be called after the Event save function. I would try it in Event, but I have no way to isolate which Volunteers need their hours recalculated. The hours field IS updated if I manually go over to the Volunteer admin page, edit, and then save the Volunteer, but this is pretty unacceptable.
I see that there are many questions on SO about Django problems when saving inline objects on the admin site, particularly with ManyToMany fields. I'm not sure, after reading many of these questions, if what they say applies in my case--maybe I need to receive a signal somewhere, or include a custom save in a special place, or call save_model in my admin.ModelAdmin class... I just don't know. What is the best way to go about this?
Code can be found here: Models.py, Admin.py
First of all, the relationship you're describing is what is called a ManyToMany "through" relationship (you can read about it in the documentation here).
Secondly, I don't understand why you need 'hours' to be a field at all. Isn't a function enough for this? Why save it in the database in the first place? You can just call it every time you need it.
Finally, it seems to me you're doing a lot of extra work that I don't understand. Why do you need the volunteer time boolean field? If you link a volunteer with an event, isn't that enough to know that he was there? And what's the purpose of "counts_towards_volunteer_time"? I'm probably missing some of the logic here, but a lot of that seems wasteful.
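The "compute it on demand" suggestion can be sketched in plain Python. The asker's models are not shown here, so everything below is assumed for illustration: the Session dataclass, its start and end fields, and the hours_for helper are all hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Session:
    # Hypothetical stand-in for the asker's Session model.
    volunteer: str
    event: str
    start: datetime
    end: datetime


def hours_for(volunteer, sessions):
    """Sum the durations of one volunteer's sessions, in hours."""
    total_seconds = sum(
        (s.end - s.start).total_seconds()
        for s in sessions
        if s.volunteer == volunteer
    )
    return total_seconds / 3600
```

Because the total is derived from the sessions each time it is requested, there is no stored value to go stale when a session is added inline from the Event admin page.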

To use signals or override model save method?

Simple use case:
After a user updates a record, I want to get the changed fields and save them in a history table. I'm using django-dirtyfields to grab this history. So my thought process was to use the pre_save signal to grab all the 'dirty' fields and then store them in my history table.
Problem there is that I can't get request.user while using signals. I need this to see which user has made the change to the record. My other thought was just to override the save method of my model but then I also can't get request.user from a model directly either. I would have to send a **kwarg['user'] with the user info from the view to get this info. This is fine but I am going to be making save calls from a bunch of different places around the code. I don't want to have to keep passing request.user every time I edit an object. This is why I'd love to have one spot, like a signal, to handle all of this. Perhaps some middleware I'm not familiar with?
Is there a better way to achieve such a thing?
You cannot access the user object from a signal.
You can consider using this third party package: django-requestprovider to access the request object in the signal.
The other way would be to override the model's save method.

Using QuerySet.update() versus ModelInstance.save() in Django

I am curious what others think about this problem...
I have been going back and forth in the past few days about using QuerySet.update() versus ModelInstance.save(). Obviously if there are lots of fields being changed, I'd use save(), but for updating a couple of fields, I think it's better to use QuerySet.update(). The benefit of using QuerySet.update() is that you can have multiple threads running update() at the same time, on different fields of the same object, and you won't have race issues. The default save() method saves all the fields, so parallel save() from two threads will be problematic.
So then the issue is what to do if you have overloaded, custom save() methods. The best I can think of is to abstract whatever is in the custom save() method into separate updater methods that actually use QuerySet.update() to set a couple of fields in the model. Has anyone used this pattern?
What's a bit irritating is that in Django Admin, even when editing in change list mode where you are editing just one field, the entire model is saved. This basically means that if someone has a change list open in their browser while somewhere else in the system a field gets updated, that updated value will be thrown away when this user saves changes from the change list. Is there a solution to this problem?
Thoughts?
Thanks.
The main reason for using QuerySet.update() is that you can update more than one object with just one database query, while every call to an object's save method will hit the database!
Another point worth mentioning is, that django's pre_save & post_save signals are only sent when you call an object's save-method, but not upon QuerySet.update().
Coming to the conflict issues you describe, I think it would also be irritating if you hit 'save' and then discover afterwards that some values are as you set them, but others that you left unchanged have changed. Of course, it's up to you to modify the admin's save_model or the object's save method to get the behaviour you suggest.
The problem you described about Django Admin is essentially about updating a model instance using an outdated copy. It is easy to fix by adding a version number to each model instance and incrementing the number on each update. Then, in the save method of the model, just make sure that what you are saving is not behind what is already in the database.
I want to make sure that when there are parallel writes to the same object, each updating different fields, they don't overwrite each other's values.
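The version-number idea can be sketched without Django. In the snippet below a dictionary stands in for the database table, and VersionedStore and StaleWriteError are hypothetical names. A real model would do the same comparison inside save() against the stored row.

```python
class StaleWriteError(Exception):
    """Raised when a save is based on an outdated copy of the row."""


class VersionedStore:
    """In-memory stand-in for a table whose rows carry a version number."""

    def __init__(self):
        self._rows = {}

    def insert(self, pk, **fields):
        self._rows[pk] = {"version": 1, **fields}

    def load(self, pk):
        # Return a copy, like loading a model instance from the database.
        return dict(self._rows[pk])

    def save(self, pk, row):
        current = self._rows[pk]
        if row["version"] != current["version"]:
            # Someone else saved since this copy was loaded.
            raise StaleWriteError(
                f"row {pk}: copy is at version {row['version']}, "
                f"database is at version {current['version']}"
            )
        # Accept the write and bump the version for the next writer.
        self._rows[pk] = {**row, "version": current["version"] + 1}
```

A copy loaded before another writer's save is rejected instead of silently overwriting the newer value, which is the change-list situation described in the question. This sketch ignores the gap between the check and the write, so a real implementation would need to make the two atomic.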
Depending on the application, this may or may not be a sensible thing. Saving the whole model even if only a single field is updated can often avoid breaking the integrity of the data. Consider the following example of a travel itinerary for a three-leg flight. Assume there is an instance with three fields representing the three legs, and the fields are SF->LA, LA->DC, DC->NY. Now suppose one update changes the first two legs to SF->SD, SD->DC, and another update changes the last two legs to LA->SJ, SJ->NY. If you allow both to happen with update instead of saving the full model instance, you end up with a broken itinerary of SF->SD, LA->SJ, SJ->NY.
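The itinerary example can be replayed in a few lines of plain Python. Dictionaries stand in for the model row, and the field names leg1 to leg3 and the helpers is_connected, legs_of and apply_field_updates are assumptions made for this sketch.

```python
def is_connected(legs):
    """True when each leg starts where the previous one ended."""
    return all(legs[i][1] == legs[i + 1][0] for i in range(len(legs) - 1))


def legs_of(row):
    return [row["leg1"], row["leg2"], row["leg3"]]


def apply_field_updates(row, *updates):
    """Mimic update(): each write touches only the fields it names."""
    result = dict(row)
    for changes in updates:
        result.update(changes)
    return result


original = {"leg1": ("SF", "LA"), "leg2": ("LA", "DC"), "leg3": ("DC", "NY")}
update_a = {"leg1": ("SF", "SD"), "leg2": ("SD", "DC")}  # first two legs
update_b = {"leg2": ("LA", "SJ"), "leg3": ("SJ", "NY")}  # last two legs

# Field-level writes, A then B: a mix of both updates survives.
merged = apply_field_updates(original, update_a, update_b)

# Whole-row saves: each writer saves a full, self-consistent itinerary,
# and whichever lands last replaces the other entirely.
full_a = {**original, **update_a}
full_b = {**original, **update_b}
```

Here merged holds SF->SD, LA->SJ, SJ->NY, which is_connected rejects, while full_a and full_b are each valid on their own. Saving whole rows loses one writer's change but never stores a mixture.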