Django: Efficiently updating many to many relation - django

Let's say we have a Django User with a lot of Groups. We want to update the groups with a new list of groups.
A simple but not performant solution could be:
def update_users_groups(new_groups: List[Group]):
user.groups.set(new_groups)
A little bit more performant solution is something similar to:
def update_users_groups(new_groups: List[Group]):
new_groups = set([x.id for x in new_groups])
old_groups = set([x.id for x in user.groups.all()])
groups_to_add = new_groups - old_groups
if groups_to_add:
user.groups.add(*groups_to_add)
groups_to_remove = old_groups - new_groups
if groups_to_remove:
user.groups.remove(*groups_to_remove)
I could not find any hints in the documentation for a built-in method. Is there some best practice or any way I could improve my example from above? Maybe even s.o. has an idea for a more performant solution.
Thank you in advance!

Related

Django/DRF filtering

I filter my list of Product models by title field. For example, I want to find this title = 'Happy cake'. And if I type
Case 1. 'happy cake',
Case 2. 'hapy cake', happi kake'
it should return me 'Happy cake'.As I know icontains helps me with case 1. How can I get that?May be some sort of technologies should be added or django itself has appropriate solution?
You can try using the lookup __in
Model.objects.filter(title__in=['happy cake', 'happi kake'])
You can put as many test cases you want in the list.
You can do this other way.
If you are sure about the start ha here
Happy Cake
Hapy cake
happi kake
Product.objects.filter(title__startswith='ha')
This kind of question is hard to solve by just using Django inbuild search system. So this is one way to solve this question. ElasticSearch. It has fuzzy search and indexing. Cool thing to deal with tough tasks). I pushed to git some starting code. It doesn't solve this question fully, but with some workaround that goal can be achieved.

Django ORM Which technique is better?

I have a project.
project = Project.objects.get(id=1)
and now i want to select the data from related tables of project. It can be done it 2 ways, let me know which one is better. and why?
attachments = project.attachments_set.all()
samples = project.projectsamples_set.all()
OR
attachments = Attachments.objects.filter(project=ctx['project'])
samples = ProjectSamples.objects.filter(project=ctx['project'])
I would like to know the Technical prospective.
These queries are exactly equivalent, as you can see if you examine the generated SQL. I would say that the first is preferable as it is more compact and readable, but that is very much subjective so it is up to you which you use.
(Note that if you don't actually have the project object to start with, and don't need it, then it's more efficient to query Attachments and Samples via project_id than to get the product and use the related accessors. However that doesn't appear to be the case in your example.)

Django: can I add a list of new model instances to the database via a single method call?

If I am creating a list of new model objects based on some form input, e.g.,
new_items = []
for name, value in self.cleaned_data.items():
if name.startswith('content_item_'):
new_items.append(ContentItem(item=value))
# can I add the entire new_items list to the database in one swoop?
I'm having trouble finding whether this in the docs, which generally refer to creating objects one at a time via the .save() method. But one-at-a-time seems inefficient when you have a whole list of objects to add.
Thanks!
https://docs.djangoproject.com/en/dev/ref/models/querysets/#bulk-create
Edit: Unfortunately this is not on 1.3
Original Answer
Thank god for bulk_create!
You could then do something like this:
ContentItem.objects.bulk_create(new_items)
For those too lazy to click the link, here is the example from the docs:
>>> Entry.objects.bulk_create([
... Entry(headline="Django 1.0 Released"),
... Entry(headline="Django 1.1 Announced"),
... Entry(headline="Breaking: Django is awesome")
... ])
I believe Brandon Konkle's reply to a similar question is still valid: Question about batch save objects in Django
In summary: Sadly, no, you'll have to use django.db.cursor with a manual query to do so. If the dataset is small, or the performance is of less importance though, looping through isn't really THAT bad, and is the simplest solution.
Also, see this ticket: https://code.djangoproject.com/ticket/661

django: best practice way to get model from an instance of that model

Say my_instance is of model MyModel.
I'm looking for a good way to do:
my_model = get_model_for_instance(my_instance)
I have not found any really direct way to do this.
So far I have come up with this:
from django.db.models import get_model
my_model = get_model(my_instance._meta.app_label, my_instance.__class__.__name__)
Is this acceptable? Is it even a sure-fire, best practice way to do it?
There is also _meta.object_name which seems to deliver the same as __class__.__name__. Does it? Is better or worse? If so, why?
Also, how do I know I'm getting the correct model if the app label occurs multiple times within the scope of the project, e.g. 'auth' from 'django.contrib.auth' and let there also be 'myproject.auth'?
Would such a case make get_model unreliable?
Thanks for any hints/pointers and sharing of experience!
my_model = type(my_instance)
To prove it, you can create another instance:
my_new_instance = type(my_instance)()
This is why there's no direct way of doing it, because python objects already have this feature.
updated...
I liked marcinn's response that uses type(x). This is identical to what the original answer used (x.__class__), but I prefer using functions over accessing magic attribtues. In this manner, I prefer using vars(x) to x.__dict__, len(x) to x.__len__ and so on.
updated 2...
For deferred instances (mentioned by #Cerin in comments) you can access the original class via instance._meta.proxy_for_model.
my_new_instance = type(my_instance)()
At least for Django 1.11, this should work (also for deferred instances):
def get_model_for_instance(instance):
return instance._meta.model
Source here.

Multijoin queries in Django

What's the best and/or fastest method of doing multijoin queries in Django using the ORM and QuerySet API?
If you are trying to join across tables linked by ForeignKeys or ManyToManyField relationships then you can use the double underscore syntax. For example if you have the following models:
class Foo(models.Model):
name = models.CharField(max_length=255)
class FizzBuzz(models.Model):
bleh = models.CharField(max_length=255)
class Bar(models.Model):
foo = models.ForeignKey(Foo)
fizzbuzz = models.ForeignKey(FizzBuzz)
You can do something like:
Fizzbuzz.objects.filter(bar__foo__name = "Adrian")
Don't use the API ;-) Seriously, if your JOIN are complex, you should see significant performance increases by dropping down in to SQL rather than by using the API. And this doesn't mean you need to get dirty dirty SQL all over your beautiful Python code; just make a custom manager to handle the JOINs and then have the rest of your code use it rather than direct SQL.
Also, I was just at DjangoCon where they had a seminar on high-performance Django, and one of the key things I took away from it was that if performance is a real concern (and you plan to have significant traffic someday), you really shouldn't be doing JOINs in the first place, because they make scaling your app while maintaining decent performance virtually impossible.
Here's a video Google made of the talk:
http://www.youtube.com/watch?v=D-4UN4MkSyI&feature=PlayList&p=D415FAF806EC47A1&index=20
Of course, if you know that your application is never going to have to deal with that kind of scaling concern, JOIN away :-) And if you're also not worried about the performance hit of using the API, then you really don't need to worry about the (AFAIK) miniscule, if any, performance difference between using one API method over another.
Just use:
http://docs.djangoproject.com/en/dev/topics/db/queries/#lookups-that-span-relationships
Hope that helps (and if it doesn't, hopefully some true Django hacker can jump in and explain why method X actually does have some noticeable performance difference).
Use the queryset.query.join method, but only if the other method described here (using double underscores) isn't adequate.
Caktus blog has an answer to this: http://www.caktusgroup.com/blog/2009/09/28/custom-joins-with-djangos-queryjoin/
Basically there is a hidden QuerySet.query.join method that allows adding custom joins.