Using QuerySet.update() versus ModelInstance.save() in Django

I am curious what others think about this problem...
I have been going back and forth in the past few days about using QuerySet.update() versus ModelInstance.save(). Obviously if there are lots of fields being changed, I'd use save(), but for updating a couple of fields, I think it's better to use QuerySet.update(). The benefit of using QuerySet.update() is that you can have multiple threads running update() at the same time, on different fields of the same object, and you won't have race issues. The default save() method saves all the fields, so parallel save() from two threads will be problematic.
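The lost-update scenario described above can be sketched without Django, using the standard-library sqlite3 module as a stand-in for the ORM. The table and the helper names (`save_all_fields`, `update_fields`) are made up for illustration. In Django itself, the column-level write corresponds to QuerySet.update(), or to calling save() with the `update_fields` argument.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, views INTEGER)")
conn.execute("INSERT INTO article VALUES (1, 'Draft', 0)")


def load(pk):
    """Read the whole row into a dict, much as a model instance holds it."""
    row = conn.execute(
        "SELECT id, title, views FROM article WHERE id = ?", (pk,)
    ).fetchone()
    return {"id": row[0], "title": row[1], "views": row[2]}


def save_all_fields(obj):
    """Write every column back, like a plain save() of the full instance."""
    conn.execute(
        "UPDATE article SET title = ?, views = ? WHERE id = ?",
        (obj["title"], obj["views"], obj["id"]),
    )


def update_fields(pk, **fields):
    """Write only the named columns (names are trusted here; sketch only)."""
    assignments = ", ".join(f"{name} = ?" for name in fields)
    conn.execute(
        f"UPDATE article SET {assignments} WHERE id = ?", (*fields.values(), pk)
    )


# Two writers load the same row, then each changes a different field.
a, b = load(1), load(1)
a["title"] = "Final"
b["views"] = 10
save_all_fields(a)
save_all_fields(b)  # b's stale title overwrites a's change
lost_update = load(1)

# Same scenario with column-level writes: both changes survive.
conn.execute("UPDATE article SET title = 'Draft', views = 0 WHERE id = 1")
update_fields(1, title="Final")
update_fields(1, views=10)
both_kept = load(1)
```

With full-row writes, the second writer silently restores the old title. With column-level writes, the two changes do not interfere.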
So then the issue is what to do if you have overridden, custom save() methods. The best I can think of is to move whatever is in the custom save() method into separate updater methods that actually use QuerySet.update() to set a couple of fields on the model. Has anyone used this pattern?
What's a bit irritating is that in the Django admin, even when editing in change list mode where you are editing just one field, the entire model is saved. This basically means that if someone has a change list open in their browser while somewhere else in the system a field gets updated, that updated value will be thrown away when this user saves changes from the change list. Is there a solution to this problem?
Thoughts?
Thanks.

The main reason for using QuerySet.update() is that you can update more than one object with just one database query, while every call to an object's save method will hit the database!
Another point worth mentioning is that Django's pre_save and post_save signals are only sent when you call an object's save() method, not on QuerySet.update().
As for the conflict issues you describe, I think it would also be irritating to hit 'save' and then discover that the values you changed were stored, but some that you left unchanged have changed anyway. Of course, it's up to you to modify the admin's save_model or the object's save method to get the behaviour you suggest.

The problem you described with the Django admin is essentially about updating a model instance from an outdated copy. It is easy to fix by adding a version number to each model instance and incrementing the number on each update. Then, in the save method of the model, just make sure that what you are saving is not behind what is already in the database.
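A minimal sketch of that version check, again using sqlite3 rather than the ORM. The `item` table, `StaleWriteError` and `save_if_current` are hypothetical names. The same idea in Django would be a filter on both the primary key and the loaded version followed by update(), which returns the number of rows matched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, version INTEGER)")
conn.execute("INSERT INTO item VALUES (1, 'old', 1)")


class StaleWriteError(Exception):
    """Raised when the row changed after the caller loaded it."""


def save_if_current(pk, name, loaded_version):
    # The WHERE clause matches only if nobody has bumped the version meanwhile.
    cur = conn.execute(
        "UPDATE item SET name = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (name, pk, loaded_version),
    )
    if cur.rowcount == 0:
        raise StaleWriteError(f"item {pk} was modified by someone else")


# Both editors loaded the row while it was at version 1.
save_if_current(1, "first edit", loaded_version=1)  # succeeds, version becomes 2
try:
    save_if_current(1, "second edit", loaded_version=1)  # stale copy
    rejected = False
except StaleWriteError:
    rejected = True

final_row = conn.execute("SELECT name, version FROM item WHERE id = 1").fetchone()
```

Doing the comparison and the increment in a single UPDATE statement avoids a separate read-then-write step that could itself race.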
I want to make sure that when there are parallel writes to the same object, each updating different fields, they don't overwrite each other's values.
Depending on the application, this may or may not be a sensible thing to do. Saving the whole model even if only a single field is updated can often avoid breaking the integrity of the data. Consider a travel itinerary for a three-leg flight. Assume a model instance with three fields representing the three legs: SF->LA, LA->DC, DC->NY. Now if one update changes the first two legs to SF->SD, SD->DC, and another changes the last two legs to LA->SJ, SJ->NY, and you allow both to happen with update() instead of saving the full model instance, you end up with a broken itinerary of SF->SD, LA->SJ, SJ->NY.
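The itinerary example can be checked with a few lines of plain Python. The `is_connected` helper and the dict-of-indexes representation of an update are illustrative assumptions, not part of any Django API.

```python
def is_connected(legs):
    """True if each leg departs from where the previous leg arrived."""
    return all(prev[1] == nxt[0] for prev, nxt in zip(legs, legs[1:]))


original = [("SF", "LA"), ("LA", "DC"), ("DC", "NY")]

# Update A rewrites legs 1 and 2; update B rewrites legs 2 and 3.
update_a = {0: ("SF", "SD"), 1: ("SD", "DC")}
update_b = {1: ("LA", "SJ"), 2: ("SJ", "NY")}

# Field-level writes: each update touches only its own legs, B lands last.
merged = list(original)
for update in (update_a, update_b):
    for index, leg in update.items():
        merged[index] = leg

# Whole-record write: B saves the full copy it loaded and edited.
after_b_only = list(original)
for index, leg in update_b.items():
    after_b_only[index] = leg
```

Here `merged` is the broken SF->SD, LA->SJ, SJ->NY itinerary, while the whole-record write loses update A but stays internally consistent.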

Related

How to implement memcached with Django & APIs while underlying Database objects may change

I am using Django's native Authorization/Authentication model to manage logins for my WebApp. This creates instances of the User model.
I would like to write a simple class-based APIView that can tell me whether a specific email is already in use (i.e., is there already a user with the given email in my database?). The first time this API is called, it should get the matching User object from the DB. On subsequent calls, it should return it from memcached (if and only if the underlying row in the database is unchanged). How can I do that?
Should I inherit from generic.APIView? Why or why not? What would the view look like? In particular I want to understand how to properly do the memcaching and cache-coherency checking. Furthermore, how would this memcaching scheme work if I had another API that modified the User object?
Thanks. I was unable to find a detailed, idiot-proof manual on using memcaching properly in Django.
Caching is perhaps the simplest part of Django, so I'll leave that discussion for last. The bigger problem is figuring out when your model changed.
You can decide what constitutes an update. For example, you might decide that the cache is refreshed only when a particular field is updated. Your cache update process should be limited to the code or view that writes or updates the model. If you go with this method, then I would recommend django-model-utils and its StatusField. You can add this logic to the save() method by overriding it, or implement it in the code that updates the models.
You can also take a simpler approach: no matter what is updated, as long as save() is called, expire the cache and repopulate it.
The rest of the code is very simple.
Attempt to fetch the item from the cache. If the item doesn't exist (called a cache miss), you populate the cache by fetching from the database. Otherwise, you get the item from the cache and save yourself a database hit.
The cache interface is very simple: you call set('somekey', 'somevalue'), and you can optionally tell it when to expire the item. Then you try get('somekey'). If this returns None, it's a cache miss (perhaps the item expired), and you have to fetch the item and populate the cache. Otherwise, you get the cached object.
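The get/miss/populate flow, together with the "expire on every write" approach, might look roughly like this. `DictCache` is a tiny in-memory stand-in written for this sketch so that it runs without a memcached server. It copies only the get/set/delete method names of Django's cache object. The `users_by_email` dict plays the role of the database, and the key format is an assumption.

```python
import time


class DictCache:
    """In-memory stand-in exposing get/set/delete like a cache backend."""

    def __init__(self):
        self._store = {}

    def set(self, key, value, timeout=None):
        expires = None if timeout is None else time.monotonic() + timeout
        self._store[key] = (value, expires)

    def get(self, key, default=None):
        value, expires = self._store.get(key, (default, None))
        if expires is not None and time.monotonic() >= expires:
            del self._store[key]
            return default
        return value

    def delete(self, key):
        self._store.pop(key, None)


cache = DictCache()
users_by_email = {"ann@example.com": {"id": 1}}  # hypothetical "database"
db_hits = 0


def email_in_use(email):
    """Try the cache first; on a miss, query the database and populate it."""
    global db_hits
    key = f"email-in-use:{email}"
    cached = cache.get(key)
    if cached is None:  # cache miss (absent or expired)
        db_hits += 1
        cached = email in users_by_email
        cache.set(key, cached, timeout=300)
    return cached


def change_email(old, new):
    """Every write path expires the keys it may have invalidated."""
    users_by_email[new] = users_by_email.pop(old)
    cache.delete(f"email-in-use:{old}")
    cache.delete(f"email-in-use:{new}")


first = email_in_use("ann@example.com")   # miss, goes to the database
second = email_in_use("ann@example.com")  # served from the cache
hits_after_two_reads = db_hits
change_email("ann@example.com", "ann@example.org")
after_change = email_in_use("ann@example.com")
```

Because a cached False is a legitimate answer, the miss check compares against None rather than relying on truthiness.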

How to save a related, inline django model when parent is saved in the admin?

In my models, I have an Event class, a Volunteer class, and a Session class. The Session class has a foreign key field for an Event and a Volunteer, and is a unique coupling of both, as well as a date and time. Taken together, Volunteer and Event I think technically have a ManyToMany relationship.
Using the pre-packaged Django admin, I edit Volunteers and Events with their own admin.ModelAdmin classes respectively. Sessions are edited inline in the Events ModelAdmin.
When I add a new Session to an event in the admin interface, with a Volunteer, I need the Volunteer's hours field to be automatically updated, to reflect however many hours the newly added session lasted (plus all past sessions). Currently, I just have a calculate_hours function in the Volunteer model, which iterates over all sessions each time it is called and finds the sum of the hours. I tried to call it with a custom save function in Session, but it appears never to be called after the Event save function. I would try it in Event, but I have no way to isolate which Volunteers need their hours recalculated. The hours field IS updated if I manually go over to the Volunteer admin page, edit, and then save the Volunteer, but this is pretty unacceptable.
I see that there are many questions on SO about Django problems when saving inline objects on the admin site, particularly with ManyToMany fields. I'm not sure, after reading many of these questions, if what they say applies in my case--maybe I need to receive a signal somewhere, or include a custom save in a special place, or call save_model in my admin.ModelAdmin class... I just don't know. What is the best way to go about this?
Code can be found here: Models.py, Admin.py
First of all, the relationship you're describing is what is called a ManyToMany "through" relationship (you can read about it in the documentation here).
Secondly, I don't understand why you need 'hours' to be a field at all. Isn't a function enough for this? Why save it in the database in the first place? You can just call it every time you need it.
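As a rough illustration of computing the hours on demand instead of storing them, here is a framework-free sketch using dataclasses. The attribute names (`start`, `end`, `sessions`) are guesses, since the original models are not shown. On a Django model the same idea would be a method or property that sums over the related sessions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Session:
    start: datetime
    end: datetime

    @property
    def duration(self) -> timedelta:
        return self.end - self.start


@dataclass
class Volunteer:
    name: str
    sessions: list = field(default_factory=list)

    @property
    def hours(self) -> float:
        """Derived from the sessions on each access, so it cannot go stale."""
        total = sum((s.duration for s in self.sessions), timedelta())
        return total.total_seconds() / 3600


ann = Volunteer("Ann")
ann.sessions.append(Session(datetime(2024, 1, 6, 9, 0), datetime(2024, 1, 6, 11, 30)))
ann.sessions.append(Session(datetime(2024, 1, 7, 13, 0), datetime(2024, 1, 7, 14, 0)))
total_hours = ann.hours
```

The trade-off is a little work on every read in exchange for never having to keep a stored total in sync when sessions are added inline.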
Finally, it seems to me you're doing a lot of extra work that I don't understand - why do you need the volunteer time boolean field? If you link a volunteer with an event isn't that enough to know that he was there? And what's the purpose of "counts_towards_volunteer_time"? I'm probably missing some of the logic here, but a lot of that seems wasteful.

Django example of a form that submits multiple instances into a database?

I need a Django form that submits multiple separate requests, and I can't find an example of how to do this without a lot of customization. For example, suppose there is a form that is used by a car repair shop. The form will list all the possible repairs that the shop is capable of doing, and the user will select which repairs they want to have done (e.g., using checkboxes).
Each repair can be assigned to a different mechanic. Each repair can also be cancelled or declared to be done, independent of the other repairs. That seems to require that each repair become a separate instance in a database.
Additionally, each repair job can only be performed by certain mechanics. So I need the ability to associate each repair job with its own unique list of mechanics to choose from.
Has anyone seen an example of a Django form that does something like this? Thanks.
This is what formsets (and model formsets) are for.
It's been a while since the question was asked, but I had the same problem.
I solved it with instance = form.save(commit=False), then setting the different attributes, then instance.save(force_insert=True), then deleting form.instance.id...
HOWEVER, this means that all fields that are eventually overwritten in the save method keep those values after the first call to save(). This bit me hard!
How did you end up doing it?

What belongs in django model clean method

I'm wondering what the appropriate things to put in my model's clean() method are.
Does it make sense to put all the verification and manipulation of a model's properties needed to ensure it is valid (i.e. business logic) there? There is a lot of that in my case, and I'm wondering if it makes sense to execute it all every time a model is saved.
For example, I'm doing things like:
- if a video is marked as private, remove all references to it in playlists
- ensure that the video's title is unique among the user's other videos
- etc.
Some of the things I'm doing only really need to be done on creation of a new video, so checking/setting them every time the model is saved also seems excessive.
Is this the correct use of the clean() method?
Clearing relationships is probably best handled by a signal. To validate your signals are working properly, you can write a unit test.
Validating that the title is unique is something that definitely belongs in a form/model validator. To me, that seems like a better separation of concerns.

Detect which fields change in a Django ModelForm

I have an app where user submitted data needs to go through a verification process before it shows up on the site. At the moment this means they cannot edit the item without removing it from the site (so our admins can check it's okay).
I'd like to write another model where I can store revisions. Basically three fields where I store the date submitted, a boolean saying if the user is ready for that revision to be considered and a third where I store all the changes (as a pickled/JSON dict).
The problem I have at the moment is I don't want to bombard the admins with a complete listing each time. I only want them to see the changed fields. This means I need a way of generating a list of which fields have changed when the user submits the edit ModelForm so I only save this data in the revision.
There are probably several ways of doing this but my post-pub-quiz brain is slightly numb and can't think of the best way. How would you do it?
In terms of where this would go, I'd probably write it as an abstract ModelForm-inheriting class that other forms use. I'd override save() to stop it writing the data directly back to database (I'd want to redirect it through this fancy new revisions model).
Come to think of it, is there an app that already does this generically?
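One possible starting point is a plain dictionary diff between the stored values and the submitted ones. The sketch below is framework-free, and its field names are invented. Django forms also expose a `changed_data` list of the field names whose submitted value differs from the initial value, which could drive the same logic inside a ModelForm.

```python
import json


def changed_fields(original, submitted):
    """Return {field: (old, new)} for every field whose value differs."""
    return {
        name: (original.get(name), new_value)
        for name, new_value in submitted.items()
        if original.get(name) != new_value
    }


stored = {"title": "My video", "description": "First cut", "private": False}
posted = {"title": "My video", "description": "Final cut", "private": True}

diff = changed_fields(stored, posted)

# Serialise only the changes for storage in a revision record.
revision_payload = json.dumps(diff, sort_keys=True)
```

Keeping both the old and the new value lets a reviewer see what changed without loading the full item.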