What is the recommended approach when extending save behavior in Django, such as saving calculated values?
I've seen people overriding the save method and I've seen people using signals.
What is the correct/most used/better approach for this?
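save() and delete() are not called on bulk actions. QuerySet.delete() still emits pre_delete and post_delete for each object, so signals are your only option there. QuerySet.update() and bulk_create() call neither save() nor the save signals.
I use a simple approach. If I need to update fields on the object itself, I override save(). If I need to work with other objects or querysets, I connect signals.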
I'm hoping this will be a really simple question.
I just wanted some advice on the bulk updating of records.
An example bulk update may go something like this:
for user in User.objects.all().iterator():
    user.is_active = False
    user.save()
Is there a more efficient way to do this on the database level using the Django ORM?
Would the following be more efficient:
User.objects.all().update(is_active=False)?
It will work, but be aware that update() is converted directly to a SQL statement, without running anything you customized in save() and without triggering the save signals. If that is not a problem in your case and you are more concerned about performance, go for it.
From the Django docs:
Be aware that the update() method is converted directly to an SQL statement. It is a bulk operation for direct updates. It doesn’t run any save() methods on your models, or emit the pre_save or post_save signals (which are a consequence of calling save()), or honor the auto_now field option. If you want to save every item in a QuerySet and make sure that the save() method is called on each instance, you don’t need any special function to handle that.
https://docs.djangoproject.com/en/2.2/topics/db/queries/#updating-multiple-objects-at-once
Yes, using update would be more efficient. That will do a single database call, instead of one per object.
Yes. You can use User.objects.all().update(is_active=False).
This will result in a single query that looks like:
UPDATE `auth_user`
SET `is_active` = 0
It will thus reduce the number of roundtrips to the database to one.
On Google App Engine, we have .put() and put_async(), which are called to save a model object.
Being new to GAE, it is not clear to me how I can ensure that some functionality gets executed every time I save an object.
In vanilla Django, I can use signals, or override the .save() method.
How would I achieve similar results on GAE, considering I can actually rely on .put() being called when an object is saved?
There are several ways you could accomplish this. You could override the put method with your own code. Just be sure to call the superclass's put().
However, the route I would choose would be to implement a post put hook (assuming you're using NDB). See the hook method documentation here: https://developers.google.com/appengine/docs/python/ndb/modelclass
Simple use case:
After a user updates a record, I want to get the changed fields and save them in a history table. I'm using django-dirtyfields to grab this history. So my thought process was to use the pre_save signal to grab all the 'dirty' fields and then store them in my history table.
Problem there is that I can't get request.user while using signals. I need this to see which user has made the change to the record. My other thought was just to override the save method of my model but then I also can't get request.user from a model directly either. I would have to send a **kwarg['user'] with the user info from the view to get this info. This is fine but I am going to be making save calls from a bunch of different places around the code. I don't want to have to keep passing request.user every time I edit an object. This is why I'd love to have one spot, like a signal, to handle all of this. Perhaps some middleware I'm not familiar with?
Is there a better way to achieve such a thing?
You cannot access the user object from a signal.
You can consider using this third party package: django-requestprovider to access the request object in the signal.
The other way would be to override the model's save method.
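Since the question mentions middleware, here is a rough sketch of that idea, not taken from any of the answers above: a Django-style middleware stores request.user in a contextvars.ContextVar, and the signal receiver reads it back. All names (CurrentUserMiddleware, get_current_user, record_history) are hypothetical, and the request and model instance are simulated with SimpleNamespace so the snippet runs without Django.

```python
import contextvars
from types import SimpleNamespace

# Holds the user for the request currently being handled (None outside one).
_current_user = contextvars.ContextVar("current_user", default=None)


def get_current_user():
    return _current_user.get()


class CurrentUserMiddleware:
    """Follows the Django middleware protocol: built from get_response, called per request."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        token = _current_user.set(getattr(request, "user", None))
        try:
            return self.get_response(request)
        finally:
            # Always clear the value so it cannot leak into the next request.
            _current_user.reset(token)


history = []


def record_history(sender, instance, **kwargs):
    """Shaped like a pre_save receiver; in a project it would be connected to pre_save."""
    history.append((instance.name, get_current_user()))


def fake_view(request):
    # Stands in for a view that saves a model and thereby fires the receiver.
    record_history(sender=None, instance=SimpleNamespace(name="record-1"))
    return "ok"


response = CurrentUserMiddleware(fake_view)(SimpleNamespace(user="alice"))
```

One caveat with this design: saves made outside a request (management commands, background tasks, the shell) will see None, so the history code has to cope with a missing user.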
I haven't seen anything on this topic in Django's online documentation.
I am trying to save a list of objects to the database, but all I can do is loop through the list and call save() on every object.
So does Django hit the database several times, or will it do one batch save instead?
As of Django 1.4, there exists a bulk_create() method on the QuerySet object, which allows for inserting a list of objects in a single query. For more info, see:
Django documentation for bulk_create
Django 1.4 release notes
The ticket that implemented this feature
Unfortunately, batch inserts are something that Django 1.3 and prior do not directly support. If you want to use the ORM, then you do have to call save() on each individual object. If it's a large list and performance is an issue, you can use django.db.connection.cursor() to INSERT the items manually inside a transaction to dramatically speed the process up. If you have a huge dataset, you need to start looking at database-engine-specific methods, like COPY FROM in Postgres.
Since Django 1.4 there is bulk_create(), but there is always a but.
You need to be careful: bulk_create() won't call each instance's save() method internally.
As the Django docs say:
The model’s save() method will not be called
So, if you are overriding the save method (as in my case), you can't use bulk_create.
This question is also addressed in How do I perform a batch insert in Django?, which provides some ways to make Django do this.
This might be a good starting point, but as the author of the code snippet says, it might not be production ready.
I am working on a Django project and I want to send a signal when something gets added to some model's related set. E.g. we have an owner who has a set of collectables, and each time the method owner.collectable_set.add(something) is getting called, I want a signal like collectable_added or something. Signals are clear to me, but I don't know which manager(?) contains the "add" method that I want to override.
Edit for Xavier's request to provide more details: you can easily override a model’s save method, by simply defining it and calling the "super-save" so it gets properly saved with some extra functionality. But I wonder where to override a related set's add method.
Gosh, I think I haven't brought in any further details, but I think it should be clear what I want to do even from the first paragraph.
Edit 2: This is the method I want to override. Is it recommended to do so, or do you suggest another way to place the sending of the signal?
This is the solution I found, the m2m_changed signal. Took me quite some searching and reading. Furthermore, I found out that it is not trivial to extend the ManyRelatedManager class, which would have been the other option. But with the m2m_changed signal I can rely on built-in functions which is the preferred way most of the time.
I think you're looking for the RelatedManager Class.
After much searching (thanks to Paul's hint), I came across this snippet that helped to explain how to use m2m_changed to intercept, rather than override, the add method on the ManyRelatedManager. It appears that the manager on a many-to-many relationship is created on the fly, so it's not trivial to override the method.