Django slots + foreign key save silently fails - django

I have found that when a Django 2.0 model has a ForeignKey and a mixin with __slots__, its .save() method doesn't work. While it's quite a specific case, it is still surprising, because no exception is raised; the data is just not saved. Here is an example:
from django.db import models
class FooSlots:
    __slots__ = ["bar", "value"]
class Bar(models.Model):
    pass
class FooSloted(models.Model, FooSlots):
    value = models.FloatField(default=0.42)
    bar = models.ForeignKey(
        Bar,
        on_delete=models.CASCADE,
        related_name="foo_sloted",
    )
def check_sanity(source, bar, value=0.5):
    instance = source.objects.create(bar=bar)
    instance.value = value
    instance.save()
    instance = source.objects.get(pk=instance.pk)
    assert instance.value == value  # Must be true!
So calling
check_sanity(FooSloted, Bar.objects.first())
raises an AssertionError because the data is not saved, yet Django itself raises no exception. Even more confusing, if bar is not a ForeignKey but, e.g., a CharField, everything works. The problem also disappears when __slots__ is not specified.
Is there any explanation for such behavior?
PS. To make this example less fictional: I have several "Foo"-like models with keys to different "Bar" models, which are populated in a similar way elsewhere. FooSlots is used to enforce the same interface and to treat the given data for different Foo models in the same way.

Related

Force a cascading delete for a Django model

I have a Django model with foreign-key relations that are marked as deletion.PROTECT, and I am OK with that behavior, since it's how the model should behave in most scenarios.
However, there is one use case for those models where I need to do a "hard delete" (i.e., the user wants to delete their account). In that case, I'd really like everything to behave like a CASCADE, instead of having to delete each of the foreign-key relationships manually. Is there a way to do this cleanly? In an ideal world, the model.delete() call would take a parameter like force_cascade=True.
Since Django also creates the database tables with these protected relations, you need to do the cascading deletion yourself. Otherwise the database itself will forbid the deletion.
Django's ORM can help you with that. The only thing you need to do is recursively find all references to the user and delete them in reverse order.
Doing this manually is also an advantage, as you might want to replace some occurrences of the user with a substitute (i.e. a virtual "deleted user"). Think of comments on a message board that should be kept even if the user deletes their account.
To find the relations pointing to the current user and replace them with a ghost user, you can use the following snippet.
from typing import List
from django.contrib.auth import get_user_model
from django.db.models import Model
from django.db.models.fields.reverse_related import (
    ManyToOneRel,
    ForeignObjectRel,
)
User = get_user_model()
def get_all_relations(model: Model) -> List[ForeignObjectRel]:
    """
    Return all many-to-one relations that point to the given model
    """
    result: List[ForeignObjectRel] = []
    for field in model._meta.get_fields(include_hidden=True):
        if isinstance(field, ManyToOneRel):
            result.append(field)
    return result
def print_updated(name, number):
    """
    Simple debug function
    """
    if number > 0:
        print(f" Update {number} {name}")
def delete_user_and_replace_with_substitute(user_to_delete: User):
    """
    Replace all relations to user with fake replacement user
    :param user_to_delete: the user to delete
    """
    replacement_user: User = User.objects.get(pk=0)  # define your replacement user
    # replacement_user: User = User.objects.get(email='email@a.com')
    for field in get_all_relations(user_to_delete):
        field: ManyToOneRel
        target_model: Model = field.related_model
        target_field: str = field.remote_field.name
        # Point every row that references the user at the replacement instead
        updated: int = target_model.objects.filter(
            **{target_field: user_to_delete}
        ).update(**{target_field: replacement_user})
        print_updated(target_model._meta.verbose_name, updated)
    user_to_delete.delete()
For a real deletion, simply replace the .update(...) call with a .delete() call (don't forget to recursively look for protected relations first, if needed).
There might also be a PostgreSQL-specific solution that I am not aware of. The given solution is database independent.
In general it is a good idea to keep every relation PROTECTED to prevent accidentally deleting important database entries, and to delete manually with care.
models.PROTECT is enforced by Django's ORM when it collects related objects for deletion, and the foreign-key constraint in the database backs it up. You would have to issue raw SQL instructions to get around it, and that would be database-specific (and I don't have a clue how to do that).
The alternative is to navigate the "tree" of objects that you want to remove, and then delete objects working from the protected "leaves" inwards to the "trunk". So if you had
class Bar(models.Model):
    user = models.ForeignKey(User, models.PROTECT, ...)
    ...
class Foo(models.Model):
    bar = models.ForeignKey(Bar, models.PROTECT, ...)
    ...
Then to delete a user object user you would need
def delete_user(user):
    for bar in user.bar_set.all():
        bar.foo_set.all().delete()
        bar.delete()
    user.delete()
I'd wrap it in a transaction so that it deletes either everything or nothing.
It will hit the DB multiple times. I'm assuming that the number of related (Bar, Foo) objects is fairly small and that you won't be deleting users very often.
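The "leaves inwards, inside a transaction" idea can be tried without Django. The following is a minimal sketch using only sqlite3 from the standard library, not the ORM: the table names (users, bars, foos) and the helpers make_db, delete_user and count_rows are made up for illustration, and ON DELETE RESTRICT stands in for PROTECT.

```python
import sqlite3


def make_db():
    """Build an in-memory database where foos -> bars -> users are all protected."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY);
        CREATE TABLE bars (
            id INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE RESTRICT
        );
        CREATE TABLE foos (
            id INTEGER PRIMARY KEY,
            bar_id INTEGER NOT NULL REFERENCES bars(id) ON DELETE RESTRICT
        );
        INSERT INTO users VALUES (1);
        INSERT INTO bars VALUES (10, 1);
        INSERT INTO foos VALUES (100, 10);
        """
    )
    return conn


def delete_user(conn, user_id):
    """Delete the leaves first, then work inwards; all or nothing."""
    # `with conn` commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "DELETE FROM foos WHERE bar_id IN "
            "(SELECT id FROM bars WHERE user_id = ?)",
            (user_id,),
        )
        conn.execute("DELETE FROM bars WHERE user_id = ?", (user_id,))
        conn.execute("DELETE FROM users WHERE id = ?", (user_id,))


def count_rows(conn):
    """Return the number of rows per table, for checking the outcome."""
    return {
        table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        for table in ("users", "bars", "foos")
    }
```

In the Django version above, the equivalent wrapper would be the `transaction.atomic()` context manager from `django.db`.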
I have always wondered what one does if instance a has a protected foreign key relation to instance b and vice versa (maybe via intermediates). At face value this means you can create objects that are un-deleteable.

Concise way of getting or creating an object with given field values

Suppose I have:
from django.db import models
class MyContentClass(models.Model):
    content = models.TextField()
    another_field = models.TextField()
x = MyContentClass(content="Hello, world!", another_field="More Info")
Is there a more concise way to perform the following logic?
existing = MyContentClass.objects.filter(content=x.content, another_field=x.another_field)
if existing:
    x = existing[0]
else:
    x.save()
# x now points to an object which is saved to the DB,
# either one we've just saved there or one that already existed
# with the same field values we're interested in.
Specifically:
Is there a way to query for both (all) fields without specifying each one separately?
Is there a better idiom for either getting the old object or saving the new one? Something like get_or_create, but which accepts an object as a parameter?
Assume the code which does the saving is separate from the code which generates the initial MyContentClass instance which we need to compare to. This is typical of a case where you have a function which returns a model object without also saving it.
You could convert x to a dictionary with
x_data = dict(x.__dict__)  # copy, so the instance's own attributes stay untouched
Then that could be passed into the model's get_or_create method.
MyContentClass.objects.get_or_create(**x_data)
The problem with this is that there are a few fields that will cause this to error out (e.g. the unique ID, or Django's internal _state attribute). However, if you pop() those out of the dictionary beforehand, then you'd probably be good to go :)
cleaned_dict = remove_unneeded_fields(x_data)
MyContentClass.objects.get_or_create(**cleaned_dict)
def remove_unneeded_fields(x_data):
    unneeded_fields = [
        '_state',
        'id',
        # Whatever other fields you don't want the new obj to have
        # eg any field marked as 'unique'
    ]
    for field in unneeded_fields:
        x_data.pop(field, None)  # ignore names that are not present
    return x_data
EDIT
To avoid the issues associated with maintaining a whitelist/blacklist of fields, you could do something like this:
def remove_unneeded_fields(x_data, MyObjModel):
    cleaned_data = {}
    for field in MyObjModel._meta.fields:
        if not field.unique:
            cleaned_data[field.name] = x_data[field.name]
    return cleaned_data
There would probably have to be more validation than simply checking that the field is not unique, but this might offer some flexibility when it comes to minor model field changes.
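The filtering step can be tried without a database. This is only a sketch: FakeContent is a hypothetical stand-in for an unsaved MyContentClass instance, and lookup_kwargs is an illustrative helper name, not a Django API. It copies the public attributes instead of editing the instance's own __dict__.

```python
def lookup_kwargs(instance, exclude=("id", "pk")):
    """Return a copy of the instance's public attributes, minus excluded names."""
    return {
        name: value
        for name, value in vars(instance).items()
        if not name.startswith("_") and name not in exclude
    }


class FakeContent:
    """Hypothetical stand-in for an unsaved MyContentClass instance."""

    def __init__(self, content, another_field):
        self._state = object()  # mimics Django's private bookkeeping attribute
        self.id = None          # unsaved, so no primary key yet
        self.content = content
        self.another_field = another_field


x = FakeContent(content="Hello, world!", another_field="More Info")
kwargs = lookup_kwargs(x)  # only content and another_field survive
```

With a real model the result would then be passed on as MyContentClass.objects.get_or_create(**kwargs), which returns an (object, created) tuple.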
I would suggest creating a custom manager for those models and adding the functions you need to it (like a custom get_or_create function).
https://docs.djangoproject.com/en/1.10/topics/db/managers/#custom-managers
This would be the cleanest way and involves no hacking. :)
You can create specific managers for specific models or create a superclass with functions you want for all models.
If you just want to add a second manager with a different name, beware that it will become the default manager if you don't set the objects manager first (https://docs.djangoproject.com/en/1.10/topics/db/managers/#default-managers)

Django delete foreign object?

If we set up a profile how Django recommends:
class Profile(models.Model):
    user = models.ForeignKey(User, unique=True)
Then when you delete the User object from Django admin, it deletes their profile too. This is because the profile has a foreign key to the user and Django wants to protect referential integrity. However, I want this functionality even if the pointer is going the other way. For example, on my Profile class I have:
shipper = models.ForeignKey(Shipper, unique=True, blank=True, null=True)
carrier = models.ForeignKey(Carrier, unique=True, blank=True, null=True)
affiliat = models.ForeignKey(Affiliate, unique=True, blank=True, null=True, verbose_name='Affiliate')
And I want it so that if you delete the Profile it'll delete the associated shipper/carrier/affiliate objects (don't ask me why Django made "affiliate" some weird keyword). Because shippers, carriers and affiliates are types of users, and it doesn't make sense for them to exist without the rest of the data (no one would be able to log in as one).
The reason I didn't put the keys on the other objects, is because then Django would have to internally join all those tables every time I wanted to check which type the user was...
While using a post_delete signal as described by bernardo above is an OK approach that will work well, I try to use signals as little as humanly possible, as I feel they convolute your code unnecessarily by adding behavior to standard functionality in places where one might not be expecting it.
I prefer the overriding method above, however, the example given by Felix does have one fatal flaw; the delete() function it is overriding looks like this:
def delete(self, using=None):
    using = using or router.db_for_write(self.__class__, instance=self)
    assert self._get_pk_val() is not None, "%s object can't be deleted because its %s attribute is set to None." % (self._meta.object_name, self._meta.pk.attname)
    collector = Collector(using=using)
    collector.collect([self])
    collector.delete()
Notice the parameter 'using'. In most cases we call delete() with no arguments, so we may not even have known it was there. In the above example this parameter is buried by our override, which ignores the superclass signature; if someone were to pass the 'using' parameter when deleting a Profile, it would cause unexpected behavior. To avoid that, we make sure to preserve the argument along with its default, like so:
class Profile(models.Model):
    # ...
    def delete(self, using=None):
        if self.shipper:
            self.shipper.delete()
        if self.carrier:
            self.carrier.delete()
        if self.affiliat:
            self.affiliat.delete()
        super(Profile, self).delete(using)
One pitfall of the overriding approach, however, is that delete() is not called per record on bulk deletes. If you want to delete multiple Profiles at once and keep the overriding behavior (calling .delete() on a Django queryset, for example), you will need to either leverage the delete signal (as described by bernardo) or iterate through the records and delete them individually (expensive and ugly).
A better way to do this, one that works with both an object's delete method and a queryset's delete method, is to use the post_delete signal, as described in the documentation.
In your case, your code would be quite similar to this:
from django.db import models
from django.dispatch import receiver
@receiver(models.signals.post_delete, sender=Profile)
def handle_deleted_profile(sender, instance, **kwargs):
    if instance.shipper:
        instance.shipper.delete()
    if instance.carrier:
        instance.carrier.delete()
    if instance.affiliat:
        instance.affiliat.delete()
This works only for Django 1.3 or greater because the post_delete signal was added in this Django version.
You can override the delete() method of the Profile class and delete the other objects in this method before you delete the actual profile.
Something like:
class Profile(models.Model):
    # ...
    def delete(self):
        if self.shipper:
            self.shipper.delete()
        if self.carrier:
            self.carrier.delete()
        if self.affiliat:
            self.affiliat.delete()
        super(Profile, self).delete()

What is the canonical way to find out if a Django model is saved to db?

I generally check if obj.pk to know whether the object is saved. This won't work, however, if you have primary_key=True set on some field. E.g., I set user = models.OneToOneField(User, primary_key=True) on my UserProfile.
What is the canonical way to find out if a Django model is saved to db?
Nowadays you can check for:
self._state.adding
This value is set by the QuerySet.iterator() for objects which are not added yet in the database. You can't use this value in the __init__() method yet, as it's set after the object is constructed.
Important Note (as of 6 May '19): If your models use UUID fields (or another method of internal ID generation), use self._state.adding as mentioned in the comments.
Actually, obj.pk is the most canonical way. Django itself often doesn't "know" if the object is saved or not. According to the Django model instance reference, if there is a primary key set already, it checks on save() calls by selecting for the id in the database before any insert.
Even if you set user = models.OneToOneField(..., primary_key=True) the .pk attribute will still point to the correct primary key (most likely user_id) and you can use it and set it as if it was the same property.
If you want to know after an object has been saved, you can catch the post_save signal. This signal is fired on model saves, and if you want you can add your own application-specific attribute to the model, for example obj.was_saved = True. I think django avoids this to keep their instances clean, but there's no real reason why you couldn't do this for yourself. Here is a minimal example:
from django.db.models.signals import post_save
from myapp.models import MyModel
def save_handler(sender, instance, **kwargs):
    instance.was_saved = True
post_save.connect(save_handler, sender=MyModel)
You can alternately have this function work for all models in your app by simply connecting the signal without specifying the sender= argument. Beware though, you can create undefined behaviours if you override a property on someone else's model instance that you are importing.
Let's say obj is an instance of MyModel. Then we could use the following block of code to check if there already is an instance with that primary key in the database:
if obj.pk is None:
    # Definitely doesn't exist, since there's no `pk`.
    exists = False
else:
    # The `pk` is set, but that doesn't guarantee it exists in the db.
    try:
        obj_from_db = MyModel.objects.get(pk=obj.pk)
        exists = True
    except MyModel.DoesNotExist:
        exists = False
This is better than checking whether obj.pk is None, because you could do
obj = MyModel()
obj.pk = 123
then
obj.pk is None # False
This is even very likely when you don't use the autoincrement id field as the primary key but a natural one instead.
Or, as Matthew pointed out in the comments, you could do
obj.delete()
after which you still have
obj.pk is None # False
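The same point can be shown without Django. The sketch below uses sqlite3 from the standard library rather than the ORM; Record, the my_model table and exists_in_db are hypothetical names chosen for illustration. It shows that an in-memory object keeps its pk after the row is gone, so only a query settles the question.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """Hypothetical stand-in for a model instance: it only remembers its pk."""
    pk: Optional[int] = None


def exists_in_db(conn: sqlite3.Connection, record: Record) -> bool:
    """Return True only if a row with the record's pk is currently stored."""
    if record.pk is None:
        return False  # never saved, so there is nothing to look up
    row = conn.execute(
        "SELECT 1 FROM my_model WHERE id = ?", (record.pk,)
    ).fetchone()
    return row is not None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_model (id INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO my_model (id) VALUES (123)")

obj = Record(pk=123)
saved_before = exists_in_db(conn, obj)  # the row is present at this point
conn.execute("DELETE FROM my_model WHERE id = 123")
saved_after = exists_in_db(conn, obj)   # the row is gone, but obj.pk is still 123
```

In Django itself, a shorter form of the try/except above is MyModel.objects.filter(pk=obj.pk).exists().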
@Crast's answer was good, but I think incomplete. The code I use in my unit tests for determining if an object is in the database is as follows. Below it, I will explain why I think it is superior to checking if obj.pk is None.
My solution
from django.test import TestCase
class TestCase(TestCase):
    def assertInDB(self, obj, msg=None):
        """Test for obj's presence in the database."""
        fullmsg = "Object %r unexpectedly not found in the database" % obj
        fullmsg += ": " + msg if msg else ""
        try:
            type(obj).objects.get(pk=obj.pk)
        except obj.DoesNotExist:
            self.fail(fullmsg)
    def assertNotInDB(self, obj, msg=None):
        """Test for obj's absence from the database."""
        fullmsg = "Object %r unexpectedly found in the database" % obj
        fullmsg += ": " + msg if msg else ""
        try:
            type(obj).objects.get(pk=obj.pk)
        except obj.DoesNotExist:
            return
        else:
            self.fail(fullmsg)
Notes: Use the above code with care if you use custom managers on your models named something other than objects. (I'm sure there's a way to get Django to tell you what the default manager is.) Further, I know that /assert(Not)?InDB/ are not PEP 8 method names, but I used the style the rest of the unittest package uses.
Justification
The reason I think assertInDB(obj) is better than assertIsNotNone(obj.pk) is because of the following case. Suppose you have the following model.
from django.db import models
class Node(models.Model):
    next = models.OneToOneField('self', null=True, related_name='prev')
Node models a doubly linked list: you can attach arbitrary data to each node using foreign keys, and the tail is the Node obj such that obj.next is None. By default, Django applies ON DELETE CASCADE behavior to relations pointing at a Node. Now, suppose you have a list nodes of length n such that nodes[i].next == nodes[i + 1] for i in [0, n - 1). Suppose you call nodes[0].delete(). In my tests on Django 1.5.1 on Python 3.3, I found that nodes[i].pk is not None for i in [1, n) and only nodes[0].pk is None. However, my /assert(Not)?InDB/ methods above correctly detected that nodes[i] for i in [1, n) had indeed been deleted.

Django: Querying read-only view with no primary key

class dbview(models.Model):
    # field definitions omitted for brevity
    class Meta:
        db_table = 'read_only_view'
def main(request):
    result = dbview.objects.all()
Caught an exception while rendering: (1054, "Unknown column 'read_only_view.id' in 'field list'")
There is no primary key I can see in the view. Is there a workaround?
Comment:
I have no control over the view I am accessing with Django. MySQL browser shows columns there but no primary key.
When you say 'I have no control over the view I am accessing with Django. MySQL browser shows columns there but no primary key.'
I assume you mean that this is a legacy table and you are not allowed to add or change columns?
If so and there really isn't a primary key (even a string or non-int column*) then the table hasn't been set up very well and performance might well stink.
It doesn't matter to you though. All you need is a column that is guaranteed to be unique for every row. Set primary_key=True on that field in your model and Django will be happy.
There is one other possibility that would be problematic. If there is no column that is guaranteed to be unique, then the table might be using a composite primary key. That is, it specifies that two columns taken together provide a unique primary key. This is perfectly valid relational modelling but unfortunately unsupported by Django at the time of writing. In that case you can't do much besides raw SQL unless you can get another column added.
I have this issue all the time. I have a view that I can't or don't want to change, but I want to have a page to display composite information (maybe in the admin section). I just override the save and raise a NotImplementedError:
def save(self, **kwargs):
    raise NotImplementedError()
(although this is probably not needed in most cases, but it makes me feel a bit better)
I also set managed to False in the Meta class.
class Meta:
    managed = False
Then I just pick any field and tag it as the primary key. It doesn't matter if it's really unique if you are just doing filters for displaying information on a page, etc.
Seems to work fine for me. Please comment if there are any problems with this technique that I'm overlooking.
If there really is no primary key in the view, then there is no workaround.
Django requires each model to have exactly one field with primary_key=True.
There should have been an auto-generated id field when you ran syncdb (if there is no primary key defined in your model, then Django will insert an AutoField for you).
This error means that Django is asking your database for the id field, but none exists. Can you run python manage.py dbshell and then DESCRIBE read_only_view; and post the result? This will show all of the columns that are in the database.
Alternatively, can you include the model definition you excluded? (and confirm that you haven't altered the model definition since you ran syncdb?)
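To see why the query fails, it helps to reproduce the situation outside the ORM. The sketch below uses sqlite3 from the standard library instead of MySQL, and the orders table with its order_no and customer columns is invented for illustration. Asking the view for an id column it does not have fails, just as Django's default SELECT does, while naming a column that exists works.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_no INTEGER, customer TEXT);
    INSERT INTO orders VALUES (1, 'alice');
    INSERT INTO orders VALUES (2, 'bob');
    -- The view exposes no column named id.
    CREATE VIEW read_only_view AS SELECT order_no, customer FROM orders;
    """
)


def select_from_view(conn, columns):
    """Select the given columns from the view, much as an ORM would."""
    sql = "SELECT {} FROM read_only_view".format(", ".join(columns))
    return conn.execute(sql).fetchall()


try:
    # What an implicit auto-generated "id" primary key amounts to.
    select_from_view(conn, ["id", "customer"])
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)  # the view has no such column

# Treating an existing column as the key avoids the missing column.
rows = select_from_view(conn, ["order_no", "customer"])
```

Declaring an existing column with primary_key=True in the model has the same effect: Django stops adding the implicit id field to its queries.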
I know this post is over a decade old, but I ran into this recently and came to SO looking for a good answer. I had to come up with a solution that addresses the OP's original question, and, additionally, allows for us to add new objects to the model for unit testing purposes, which is a problem I still had with all of the provided solutions.
main.py
from django.db import models
def in_unit_test_mode():
    """Some code to detect if you're running unit tests with a temp SQLite DB, like..."""
    # You wouldn't want to actually implement it with the import inside here.
    # We have a setting in our django.conf.settings that tests to see if we're
    # running unit tests when the project starts.
    import sys
    return "test" in sys.argv
class AbstractReadOnlyModel(models.Model):
    class Meta(object):
        abstract = True
        managed = in_unit_test_mode()
    # This is just to help you fail fast in case a new developer, or future you,
    # doesn't realize this is a database view and not an actual table and tries
    # to update it.
    def save(self, *args, **kwargs):
        if not in_unit_test_mode():
            raise NotImplementedError(
                "This is a read only model. We shouldn't be writing "
                "to the {0} table.".format(self.__class__.__name__)
            )
        else:
            super(AbstractReadOnlyModel, self).save(*args, **kwargs)
class DbViewBaseModel(AbstractReadOnlyModel):
    not_actually_unique_field = models.IntegerField(primary_key=True)
    # the rest of your field definitions
    class Meta:
        db_table = 'read_only_view'
if in_unit_test_mode():
    class DbView(DbViewBaseModel):
        # This line removes the primary key property from the
        # 'not_actually_unique_field' when running unit tests, so Django will
        # create an AutoField named 'id' on the table it creates in the temp DB
        # that it creates for running unit tests.
        not_actually_unique_field = models.IntegerField()
else:
    class DbView(DbViewBaseModel):
        pass
class MainClass(object):
    @staticmethod
    def main_method(request):
        return DbView.objects.all()
test.py
from django.test import TestCase
from main import DbView
from main import MainClass
class TestMain(TestCase):
    @classmethod
    def setUpTestData(cls):
        cls.object_in_view = DbView.objects.create(
            # Enter fields here to create test data you expect to be
            # returned from your method.
        )
    def testMain(self):
        # main_method takes a request argument; None is enough for this test
        objects_from_view = MainClass.main_method(request=None)
        returned_ids = [obj.id for obj in objects_from_view]
        self.assertIn(self.object_in_view.id, returned_ids)