Django DB design for deleting crucial models?

I've run into an issue that I really haven't dealt with before. I have a task to upgrade from Django 1 to Django 2. (Django 1 doesn't require on_delete when dealing with relationships.)
I have a couple of crucial models that have relationships inside, but I definitely don't want to CASCADE those records. For example, if a user deletes their account, I don't want their expenses to be deleted. Maybe we need to keep those expense instances for tax records later, etc.
I have read that DO_NOTHING can also be dangerous.
With a model like this, what would be the best course of action when dealing with the ForeignKeys?
I appreciate all the help in advance.
class Expenses(models.Model):
    user = models.ForeignKey(Account, null=True, blank=True,
                             on_delete=models.?)
    event = models.ForeignKey(Event, null=True, blank=True,
                              on_delete=models.?)
    payee = models.CharField(max_length=128, null=True, blank=True)
    category = models.ForeignKey(Category, blank=True, null=True,
                                 related_name='expense_category',
                                 on_delete=models.?)

I have a task to upgrade from django 1 to 2. (django 1 doesn't require on_delete when dealing with relationships)
In django-1.x, if you did not specify on_delete, it used CASCADE [Django-doc], so in fact by specifying it explicitly, you can make the behaviour safer.
I have read that DO_NOTHING can also be dangerous.
Well, most databases will raise an integrity error in that case, since the row would then refer to a user that no longer exists. So DO_NOTHING is not in itself dangerous: on most databases it will simply not allow deleting the referenced record, but it does so by raising an IntegrityError.
With a model like this, what would be the best course of action when dealing with the ForeignKeys?
Perhaps PROTECT [Django-doc] is more appropriate here, since it simply prevents deleting the object as long as it is still referenced.
The best solution, however, depends on a large number of details. It might therefore be better to look at the possible on_delete=… strategies [Django-doc].
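For the models in the question, one possible sketch (a suggestion, not the single correct answer; the fields are already nullable, so SET_NULL is available wherever the record should outlive the referenced object):
from django.db import models

class Expenses(models.Model):
    # SET_NULL keeps the expense row and just detaches it, which matches
    # "keep expenses for tax records" when an account is deleted.
    user = models.ForeignKey(Account, null=True, blank=True,
                             on_delete=models.SET_NULL)
    event = models.ForeignKey(Event, null=True, blank=True,
                              on_delete=models.SET_NULL)
    payee = models.CharField(max_length=128, null=True, blank=True)
    # PROTECT refuses to delete a Category that is still in use.
    category = models.ForeignKey(Category, blank=True, null=True,
                                 related_name='expense_category',
                                 on_delete=models.PROTECT)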

Do null foreign keys slow things down?

I'm implementing file attachments for certain objects in the project I am working on. There are six or so object classes which might reasonably have attached files (which would be revealed in their Detail views and managed via a link from there). The model would be something like:
class JobFile(models.Model):
    job = models.ForeignKey('jobs.Job', models.SET_NULL,
                            null=True, blank=True, related_name='attachments')
    quote = models.ForeignKey('quotation.Quote', models.SET_NULL,
                              null=True, blank=True, related_name='attachments')
    # etc.
    document = models.FileField( ... )  # the attachment
One advantage of this over a GenericForeignKey is that an upload can be attached to more than one sort of object at once. Another is the simplicity of referring to obj.attachments.all() in the obj detail views. I'm not looking for a large set of object classes to which these files might be attached.
However, for any one file attachment most of its ForeignKeys will be null. I have seen various references to null ForeignKeys causing slow Django ORM queries. Is this anything I need to be concerned about?
If it makes any difference, these objects will be almost exclusively accessed via the attachments reverse ForeignKey manager on the related object. The only time I can see a need for explicit filtering like JobFile.objects.filter(field__isnull=True) is in a management context looking for "orphaned" files (which shouldn't normally happen).
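For reference, the orphan check I have in mind would be something like this, with one isnull filter per foreign key:
# A JobFile is "orphaned" when every one of its foreign keys is null.
orphans = JobFile.objects.filter(job__isnull=True,
                                 quote__isnull=True)  # ...and so on, one per FK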

Django objects uniqueness hell with M2M fields

class Badge(SafeDeleteModel):
    owner = models.ForeignKey(settings.AUTH_USER_MODEL,
                              blank=True, null=True,
                              on_delete=models.PROTECT)
    restaurants = models.ManyToManyField(Restaurant)
    identifier = models.CharField(max_length=2048)  # not unique at a DB level!
I want to ensure that for any badge, for a given restaurant, it must have a unique identifier. Here are the 4 ideas I have had:
idea #1: using unique_together -> does not work with M2M fields, as explained in the documentation (https://docs.djangoproject.com/en/2.1/ref/models/options/#unique-together)
idea #2: overriding the save() method. Does not fully work with M2M, because save() is not called when the add or remove methods are used.
idea #3: using an explicit through model, but since I'm live in production, I'd like to avoid taking risks when migrating important structures like these. EDIT: after thinking about it, I don't see how it would actually help.
idea #4: using an m2m_changed signal to check the uniqueness any time the add() method is called.
I ended up with idea #4 and thought everything was OK, with this signal...
@receiver(m2m_changed, sender=Badge.restaurants.through)
def check_uniqueness(sender, **kwargs):
    badge = kwargs.get('instance', None)
    action = kwargs.get('action', None)
    restaurant_pks = kwargs.get('pk_set', None)
    if action == 'pre_add':
        for restaurant_pk in restaurant_pks:
            if Badge.objects.filter(identifier=badge.identifier).filter(restaurants=restaurant_pk):
                raise BadgeNotUnique(MSG_BADGE_NOT_UNIQUE.format(
                    identifier=badge.identifier,
                    restaurant=Restaurant.objects.get(pk=restaurant_pk)
                ))
...until today, when I found lots of badges in my database with the same identifier but no restaurant (which should not happen at the business level).
I understood there is no atomicity between the save() and the signal.
That means that if the user gets a uniqueness error while trying to create a badge, the badge is still created, just without any restaurants linked to it.
So, the question is: how do you ensure at the model level that if the signal raises an error, the save() is not committed?
Thanks!
I see two separate issues here:
You want to enforce a particular constraint on your data.
If the constraint is violated, you want to revert previous operations. In particular, you want to revert the creation of the Badge instance if any Restaurants are added in the same request that violate the constraint.
Regarding 1, your constraint is complicated because it involves multiple tables. That rules out database constraints (well, you could probably do it with a trigger) or simple model-level validation.
Your code above is apparently effective at preventing adds that violate the constraint. Note, though, that this constraint could also be violated if the identifier of an existing Badge is changed. Presumably you want to prevent that as well? If so, you need to add similar validation to Badge (e.g. in Badge.clean()).
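A minimal sketch of that extra validation, using Django's clean() convention (ValidationError is the standard exception for clean(); the message text here is illustrative):
from django.core.exceptions import ValidationError

class Badge(SafeDeleteModel):
    ...  # fields as in the question

    def clean(self):
        # Catch renames: another badge with the same identifier that
        # shares at least one of our restaurants.
        if self.pk is not None:
            clash = (Badge.objects
                     .filter(identifier=self.identifier)
                     .filter(restaurants__in=self.restaurants.all())
                     .exclude(pk=self.pk))
            if clash.exists():
                raise ValidationError(
                    'This identifier is already used by another badge '
                    'for one of these restaurants.')
Keep in mind that clean() only runs when full_clean() is called (forms and the admin do this; a plain save() does not).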
Regarding 2, if you want the creation of the Badge instance to be reverted when the constraint is violated, you need to make sure the operations are wrapped in a database transaction. You haven't told us about the views where these objects are created (custom views? Django admin?), so it's hard to give specific advice. Essentially, you want to have this:
from django.db import transaction

with transaction.atomic():
    badge_instance.save()
    badge_instance.restaurants.add(...)
If you do, an exception thrown by your M2M pre_add signal will roll back the transaction, and you won't get the leftover Badge in your database. Note that admin views are run in a transaction by default, so this should already be happening if you're using the admin.
Another approach is to do the validation before the Badge object is created. See, for example, this answer about using ModelForm validation in the Django admin.
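A sketch of that form-level route (the form class and its wiring into the admin are assumptions, not code from the question):
from django import forms

class BadgeForm(forms.ModelForm):
    class Meta:
        model = Badge
        fields = ['owner', 'restaurants', 'identifier']

    def clean(self):
        cleaned = super().clean()
        identifier = cleaned.get('identifier')
        restaurants = cleaned.get('restaurants')
        if identifier and restaurants:
            # Reject the form before any Badge row is created.
            clash = (Badge.objects
                     .filter(identifier=identifier)
                     .filter(restaurants__in=restaurants))
            if self.instance.pk:
                clash = clash.exclude(pk=self.instance.pk)
            if clash.exists():
                raise forms.ValidationError(
                    'A badge with this identifier already exists for '
                    'one of these restaurants.')
        return cleaned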
I'm afraid the correct way to achieve this really is by adapting the "through" model. But remember that at the database level this "through" model already exists, so your migration would simply add a unique constraint. It's a rather simple operation that doesn't really involve migrating any data; we do it often in production environments.
Take a look at this example; it pretty much sums up everything you need.
You can specify your own connecting model for your M2M models, and then add a unique_together constraint in the Meta class of the membership model:
class Badge(SafeDeleteModel):
    ...
    restaurants = models.ManyToManyField(Restaurant, through='BadgeMembership')

class BadgeMembership(models.Model):
    restaurant = models.ForeignKey(Restaurant, null=False, blank=False, on_delete=models.CASCADE)
    badge = models.ForeignKey(Badge, null=False, blank=False, on_delete=models.CASCADE)

    class Meta:
        unique_together = (("restaurant", "badge"),)
This creates a model that sits between Badge and Restaurant and is unique for each badge/restaurant pair.
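One caveat: on the Django versions this question targets (the docs link above is 2.1), an explicit through model means you can no longer call add() or remove() on the relation; you create the membership rows yourself:
# With an explicit through model, create the link row directly:
BadgeMembership.objects.create(badge=badge, restaurant=restaurant)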
Optional: Save check
You can also add a custom save() method where you manually check for uniqueness, so you can raise an exception yourself.
from django.db.models import Q

class BadgeMembership(models.Model):
    restaurant = models.ForeignKey(Restaurant, null=False, blank=False, on_delete=models.CASCADE)
    badge = models.ForeignKey(Badge, null=False, blank=False, on_delete=models.CASCADE)

    def save(self, *args, **kwargs):
        # Only check when the object is new; updating won't change the pair
        if self.pk is None:
            membershipCount = BadgeMembership.objects.filter(
                Q(restaurant=self.restaurant) &
                Q(badge=self.badge)
            ).count()
            if membershipCount > 0:
                raise BadgeNotUnique(...)
        super(BadgeMembership, self).save(*args, **kwargs)

django remove least recent many to many entry

In my Django app, I want to allow users to see which profiles they view and which profiles view them. In my Profile model I have created 2 fields that accomplish this.
viewed = models.ManyToManyField('self', null=True, blank=True, related_name='viewed_profiles', symmetrical=False)
visitors = models.ManyToManyField('self', null=True, blank=True, related_name='visitors_profiles', symmetrical=False)
I also have the code set up in my views.py file to add profiles to these fields as necessary. However, I would like to only track and display the most recent 25 or so viewed and visitor profiles. Is there a way to query these fields ordered by date added and delete everything past the first 25 results? Is this possible without creating another field to track the order of the profiles viewed?
Take a look at the documentation on QuerySets for details of how to do this. You can use order_by to order your objects by date, and Python's slicing syntax to limit the number of results.
An example of showing the most recently added items in your view might look something like this:
viewed = Profile.objects.order_by("-date_added")[:25]
This doesn't delete everything after 25 - it just fetches the 25 most recent objects (assuming your Profile model has a field called date_added).
EDIT: Oops, I think I misread your question.
I think what you'd need to do is have an intermediate model - Django allows you to use a third model as an intermediate one between two different models in a many-to-many relationship. Then you could add the time viewed to that model and store it that way. There's a good example in the documentation.
I wouldn't really bother deleting the old ones unless database space was likely to be an issue, but if you need to for any reason, I guess you could set up a signal that was triggered by a new view being created and have that call a function that deletes all but the 25 most recent.
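A hedged sketch of that signal-based cleanup, assuming an intermediate model like the ProfileViewed example in the next answer (fields viewer, viewed, date_added):
from django.db.models.signals import post_save
from django.dispatch import receiver

@receiver(post_save, sender=ProfileViewed)
def trim_view_history(sender, instance, created, **kwargs):
    if not created:
        return
    # Keep the viewer's 25 most recent view records, delete the rest.
    keep = (ProfileViewed.objects
            .filter(viewer=instance.viewer)
            .order_by('-date_added')
            .values_list('pk', flat=True)[:25])
    (ProfileViewed.objects
     .filter(viewer=instance.viewer)
     .exclude(pk__in=list(keep))
     .delete())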
Django doesn't track the date added for a ManyToMany relationship, so it's not possible to do this reliably without adding a field. What you'll need to do is add a date field on your ManyToMany intermediary table, then order by that, for example:
class ProfileViewed(models.Model):
    # Both FKs point at Profile, so each needs its own related_name
    # (the names here are illustrative).
    viewed = models.ForeignKey('Profile', related_name='view_records')
    viewer = models.ForeignKey('Profile', related_name='viewing_records')
    date_added = models.DateField(auto_now_add=True)

class Profile(models.Model):
    ...
    viewed = models.ManyToManyField('self', blank=True,
                                    related_name='viewed_profiles',
                                    symmetrical=False,
                                    through=ProfileViewed,
                                    # needed because ProfileViewed has two FKs to Profile
                                    through_fields=('viewer', 'viewed'))
Then you can order your results like so:
profile = Profile.objects.get(...)
views = ProfileViewed.objects.filter(viewed=profile).order_by('date_added')

Django 1.4 Multiple Databases Foreignkey relationship (1146, "Table 'other.orders_iorder' doesn't exist")

I have a ForeignKey from one model to another model in a different database. (I know I shouldn't do it, but if I take proper care of referential integrity it shouldn't be a problem.)
The thing is that everything works fine (relationships in either direction; the router takes care of it), but when I try to delete the referenced model (the one which doesn't have the foreign key attribute), Django still wants to go through the relationship to check that it is empty. The related objects are in another database, though, so it doesn't find them in this database.
I tried setting on_delete=models.DO_NOTHING, with no success. I also tried clearing the relationship (but it turns out clear() doesn't take a using argument, so that doesn't work either), and emptying the relationship with delete(objects...), again with no success.
Now I am pretty sure the problem is in super(Object, self).delete(). I cannot do super(Object, self).delete(using=other_database), because the self object is not in the other database; only the RelatedManager is. So I don't know how to make Django understand that I don't even want to check that relationship, which, by the way, was already emptied before the super(Object, self).delete() call.
I was wondering whether there is some method I can override to make Django skip this check.
More concretely:
DB1: "default" database (orders app)
from django.db import models
from shop.models import Order
class IOrder(models.Model):
    name = models.CharField(max_length=20, unique=True, blank=False, null=False)
    order = models.ForeignKey(Order, related_name='iorders', blank=True, null=True)
DB2: "other" database
class Order(models.Model):
    description = models.CharField(max_length=20, blank=False, null=False)

    def delete(self):
        # Delete related IOrder rows, if any (they live in 'default')
        for iorder in self.iorders.using('default'):
            iorder.delete()
        # Remove myself
        super(Order, self).delete()
The problem happens when super(Order, self).delete() is called: it cannot find the table (orders_iorder) in this database, because it lives in 'default'.
Any ideas? Thanks in advance.
I already resolved my issue by replacing super(Order, self).delete() with a raw SQL delete command. Anyway, I would love to know if there is a more proper way of doing this.
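For completeness, the raw-SQL replacement could look something like this sketch (the table name shop_order is an assumption based on the app layout above, and the cursor usage is Django 1.4 style):
from django.db import connections

def delete(self):
    # Delete related IOrder rows in the 'default' database first
    for iorder in self.iorders.using('default'):
        iorder.delete()
    # Bypass Django's deletion collector for the Order row itself
    cursor = connections['other'].cursor()
    cursor.execute("DELETE FROM shop_order WHERE id = %s", [self.pk])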

Django - delete an object without deleting its related objects

I have two models:
class Client(models.Model):
    some_field = models.CharField()

class Ticket(models.Model):
    client = models.ForeignKey(Client)
Tickets are FOREVER in my system, but I want users to be able to delete clients they don't want anymore. Currently it'll delete all the tickets created by the Client.
Is this a bad idea (architecturally speaking), and should I just mark them as not_needed or something instead?
If it's not a bad idea, what's the best way to do it while remaining DRY? I don't want to have to override delete() for each model that does this, but I will if I have to (and if that's the only way, what's the best way to do that?).
So this question is very old, but in case someone runs across it (like I did): starting with Django 1.3, you can use the on_delete parameter on ForeignKey fields, as described here.
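On a current Django, that means the question's models can express "tickets are forever" directly; a sketch (SET_NULL assumes the client field is made nullable):
class Ticket(models.Model):
    # SET_NULL keeps the ticket and just clears the link when its
    # client is deleted; PROTECT would refuse the delete instead.
    client = models.ForeignKey(Client, null=True, blank=True,
                               on_delete=models.SET_NULL)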
The django.contrib.auth module has to deal with this same problem in the User model. Their solution is to have:
class User(models.Model):
    # ...
    is_active = models.BooleanField(default=True)
    # ...
So "deleting" a User is just setting is_active to False. Everything that works with Users needs to check is_active.
For the record, I think deleting Clients in this case is a Bad Idea.
But for the sake of argument, if you delete a Client, then its related Tickets need to first become clientless. Change the model for Ticket to:
class Ticket(models.Model):
    client = models.ForeignKey(Client, null=True, blank=True,
                               related_name='tickets')
Then, to delete a Client, do:
for ticket in clientToDelete.tickets.all():
    ticket.client = None
    ticket.save()
clientToDelete.delete()
You can put this code into Client's delete method, but it will get skipped if you do a mass (i.e. QuerySet-based) delete of Clients.
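If you also want bulk deletes covered, a hedged alternative is to detach the tickets with a single UPDATE first, using the queryset update() method:
# One UPDATE statement instead of one save() per ticket.
clientToDelete.tickets.update(client=None)
clientToDelete.delete()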
Personally, I think it is a bad idea (you'd have orphan ticket records). I would just mark those clients as 'not_needed' or 'deleted'. You also get the added benefit of the ability to 'un-delete' those clients later.