Is there any way around saving models that reference each other twice? - django

My issue is when saving new models that need to reference each other, not just using a related_name lookup, such as this:
class Many:
    owner = models.ForeignKey('One')
class One:
    current = models.OneToOneField('Many')
By default, these have null=False and, please correct me if I'm wrong, using them is impossible until I change one of the relationships:
current = models.OneToOneField('Many', null=True)
The reason is that you can't assign a model instance to a relationship unless it's already saved; otherwise you get ValueError: 'Cannot assign "<...>": "..." instance isn't saved in the database.'.
But now when I create a pair of these objects I need to save twice:
many = Many()
one = One()
one.save()
many.owner = one
many.save()
one.current = many
one.save()
Is this the right way to do it, or is there another way around saving twice?

There is no way around it: you need to save one of the objects twice anyway.
This is because, at the database level, you need to save an object to get its ID. There is no way to tell a SQL database "save those 2 objects and assign the ids to those fields on the other object". So if you were to do it manually, you would INSERT the first object with NULL for the FK, get its ID back, INSERT the second object with the ID of the first one, get its ID back, then UPDATE the first object to set the FK.
You would encapsulate the whole thing in a transaction.
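To make the manual sequence concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than the Django ORM. The table and column names (one, many, current_id, owner_id) are assumptions chosen to mirror the models in the question, not what Django would generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one.current_id must be nullable, many.owner_id is NOT NULL.
conn.executescript("""
    CREATE TABLE one  (id INTEGER PRIMARY KEY, current_id INTEGER NULL REFERENCES many(id));
    CREATE TABLE many (id INTEGER PRIMARY KEY, owner_id INTEGER NOT NULL REFERENCES one(id));
""")

with conn:  # commits on success, rolls back if any statement raises
    # 1. INSERT the first row with NULL for the foreign key and read back its id.
    one_id = conn.execute("INSERT INTO one (current_id) VALUES (NULL)").lastrowid
    # 2. INSERT the second row pointing at the first, and read back its id.
    many_id = conn.execute("INSERT INTO many (owner_id) VALUES (?)", (one_id,)).lastrowid
    # 3. UPDATE the first row now that the second id is known.
    conn.execute("UPDATE one SET current_id = ? WHERE id = ?", (many_id, one_id))

row = conn.execute("SELECT id, current_id FROM one").fetchone()
```

The first table is written twice (one INSERT, one UPDATE), which is the same pattern as the two save() calls on one in the question.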
So what you're doing with the ORM is the closest you can get. You may want to add the following on top of that:
1) Use a transaction for the changes, like this:
from django.db import transaction
with transaction.atomic():
    many, one = Many(), One()
    one.save()
    many.owner = one
    many.save()
    one.current = many
    one.save(update_fields=['current'])  # slight optimization here
2) Now that this is encapsulated in a transaction, you would want to remove the null=True. But you cannot, because NOT NULL constraints are, unfortunately, checked immediately rather than at commit.
[edit: it appears Oracle might support deferring the NOT NULL check, so if you're using Oracle you can try dropping the null=True and it should work.]
You'll probably want to check how your code reacts if, at a later point when reading the db, one.current.owner != one for some reason (manual editing, a buggy insert somewhere, ...).

Related

Django foreign key: auto-lookup related object when the update record has the key value

I have legacy code which had no foreign keys defined in the schema.
The raw data for the row includes the key value of the parent, naturally.
My first porting attempt to postgresql just updated the field with the raw value: I did not add foreign keys to Django's models.
Now I am trying to add foreign keys to make the schema more informative.
When I add a foreign key, django's update requires me to provide an instance of the parent object: I can no longer update by simply providing the key value. But this is onerous because now I need to include in my code knowledge of all the relations to go and fetch related objects, and have specific update calls per model. This seems crazy to me, at least starting from where I am, so I feel like I am really missing something.
Currently, the update code just pushes rows in blissful ignorance. The update code is generic for tables, which is easy when there are no relations.
Django's model data means that I can find the related object dynamically for any given model, and doing this means I can still keep very abstracted table update logic. So this is what I am thinking of doing. Or just doing raw SQL updates.
Does a solution to this already exist, even if I can't find it? I am expecting to be embarrassed.
The ValueError comes from Django ORM code which knows exactly which model it expects and what the related field is: the missing step is to find the instance of the related object.
db.models.fields.related_descriptors.py:
In this code, which throws the exception, value is supposed to be an instance of the parent model. Instead, value is the key value. I think this basically tells me how I can inspect the model to deal with this in advance, but I wonder if I am re-inventing the wheel.
if value is not None and not isinstance(value, self.field.remote_field.model._meta.concrete_model):
    raise ValueError(
        'Cannot assign "%r": "%s.%s" must be a "%s" instance.' % (
            value,
            instance._meta.object_name,
            self.field.name,
            self.field.remote_field.model._meta.object_name,
        )
    )
You can use the _id suffix to set the key value directly.
For a given model
class Album(models.Model):
    artist = models.ForeignKey(Musician, on_delete=models.CASCADE)
you can set the artist by id in the following manner:
Album.objects.create(artist_id=2)
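To keep the update code generic across tables, the _id rename can be derived from the model instead of being hard-coded. The helper below is a hypothetical sketch (to_attname_kwargs is not a Django function); it assumes only that each forward field exposes name and attname, and that model._meta.get_fields() lists the fields, as Django models do.

```python
def to_attname_kwargs(model, row):
    """Rename raw-row keys to the attribute names that accept plain key values.

    For a ForeignKey named "artist" the attname is "artist_id"; for ordinary
    fields name and attname are identical, so those keys pass through unchanged.
    """
    attnames = {
        field.name: field.attname
        for field in model._meta.get_fields()
        if getattr(field, "attname", None)  # reverse relations have no attname
    }
    # Keys that are not field names are left untouched.
    return {attnames.get(key, key): value for key, value in row.items()}
```

With the Album model above, to_attname_kwargs(Album, {"artist": 2}) would give {"artist_id": 2}, which can then be passed on, for example as Album.objects.create(**kwargs).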

Detect duplicate inserts when adding many-to-many relation

Let's assume there are two models, A and B:
class A(models.Model):
    name = models.CharField(max_length=100)
class B(models.Model):
    children = models.ManyToManyField(A)
I'm using the b.children.add() method to add an instance of A to b:
a = A.objects.get(pk=SOMETHING)
b.children.add(a)
As far as I know, Django by default doesn't allow duplicate many-to-many relationships, so I cannot add the same instance of A more than once.
But here is the problem: I fetch instances of A with another query, then loop over them and add them one by one. How can I detect a duplicate relation? Does the add() method return something useful?
A look at the source code reveals that Django first checks to see if there are any entries that already exist in the database, and then only adds the new ones. It doesn't return any information to the caller, though.
It's not clear if you actually need to detect duplicates, or if you just want to make sure that they're not being added to the database? If it's the latter then everything's fine. If it's the former, there's no way around hitting the database. If you're really concerned about performance you could always perform the check and update the through table yourself (i.e. re-implement add()).
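As a rough illustration of "perform the check and update the through table yourself", here is a sketch against a plain SQLite table using Python's sqlite3 module. The table name b_children, its columns, and the add_children helper are all assumptions for the example; this is not Django's add() implementation.

```python
import sqlite3

def add_children(conn, b_id, a_ids):
    """Link b_id to each id in a_ids; return (added, duplicates)."""
    wanted = list(dict.fromkeys(a_ids))  # drop repeats in the input, keep order
    if not wanted:
        return [], []
    placeholders = ",".join("?" * len(wanted))
    # One query finds every relation that already exists.
    existing = {
        row[0]
        for row in conn.execute(
            f"SELECT a_id FROM b_children WHERE b_id = ? AND a_id IN ({placeholders})",
            [b_id, *wanted],
        )
    }
    added = [a_id for a_id in wanted if a_id not in existing]
    duplicates = [a_id for a_id in wanted if a_id in existing]
    # Insert only the relations that are new.
    conn.executemany(
        "INSERT INTO b_children (b_id, a_id) VALUES (?, ?)",
        [(b_id, a_id) for a_id in added],
    )
    return added, duplicates

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE b_children (b_id INTEGER, a_id INTEGER, UNIQUE (b_id, a_id))")
first = add_children(conn, 1, [10, 11])
second = add_children(conn, 1, [11, 12])
```

The second call reports 11 as a duplicate and inserts only 12. In Django the same idea would go through the relation's through model (B.children.through) instead of raw SQL.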

Is it possible to improve the process of instance creation/deletion in Django using querysets?

So I have a list of unique pupils (pupil is the primary_key in an LDAP database), each with an associated teacher, which can be the same for several pupils.
There is a box in an edit form for each teacher's pupils, where a user can add/remove a pupil, and then the database is updated accordingly using the function below. My current function is as follows (teacher is the teacher associated with the edit page form, and updated_list is a list of the pupils' names that has been submitted and passed to this function):
def update_pupils(teacher, updated_list):
    old_pupils = Pupil.objects.filter(teacher=teacher)
    for pupils in old_pupils:
        if pupil.name not in updated_list:
            pupil.delete()
        else:
            updated_list.remove(pupil.name)
    for pupil in updated_list:
        if not Pupil.objects.filter(name=name):
            new_pupil = pupil(name=name, teacher=teacher)
            new_pupil.save()
As you can see the function basically finds what was the old pupil list for the teacher, looks at those and if an instance is not in our new updated_list, deletes it from the database. We then remove those deleted from the updated_list (or at least their names)...meaning the ones left are the newly created ones, which we then iterate over and save.
Now ideally, I would like to access the database as infrequently as possible if that makes sense. So can I do any of the following?
In the initial iteration, can I simply mark those pupils up for deletion and potentially do the deleting and saving together, at a later date? I know I can bulk delete items but can I somehow mark those which I want to delete, without having to access the database which I know can be expensive if the number of deletions is going to be high...and then delete a lot at once?
In the second iteration, is it possible to create the various instances and then save them all in one go? Again, I see in Django 1.4 that you can use bulk_create but then how do you save these? Plus, I'm actually using Django 1.3 :(...
I am kinda assuming that the above steps would actually help with the performance of the function?...But please let me know if that's not the case.
I have of course been reading this: https://docs.djangoproject.com/en/1.3/ref/models/querysets/
First, in this line
if not Pupil.objects.filter(name=name):
It looks like the name variable is undefined, no?
Then here is a shortcut for your code I think:
def update_pupils(teacher, updated_list):
    # Step 1: delete every pupil of this teacher that is not in the updated list
    Pupil.objects.filter(teacher=teacher).exclude(name__in=updated_list).delete()
    # Step 2: update (update_or_create requires Django 1.7 or later)
    # either
    for name in updated_list:
        # if a pupil with this name exists, update its teacher; otherwise create it
        Pupil.objects.update_or_create(name=name, defaults={'teacher': teacher})
    # or (but I'm not sure this one will work)
    Pupil.objects.update_or_create(name__in=updated_list, defaults={'teacher': teacher})
Another solution, if your Pupil object only has those 2 attributes and isn't referenced by a foreign key in another relation, is to delete all the Pupil instances of this teacher and then use bulk_create. It needs only 2 database accesses, but it's ugly.
EDIT: in the first loop, pupil is also undefined (the loop variable is named pupils).
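If the goal is to decide what to delete and what to create before touching the database, the comparison can be done on plain names with set arithmetic. The sketch below is pure Python; plan_pupil_changes is a hypothetical helper, and it only compares against the names already fetched for this teacher.

```python
def plan_pupil_changes(existing_names, updated_names):
    """Return (names_to_delete, names_to_create) without touching the database."""
    existing = set(existing_names)
    updated = set(updated_names)
    # Names that disappeared from the form are deleted; names that are new are created.
    return sorted(existing - updated), sorted(updated - existing)

to_delete, to_create = plan_pupil_changes(
    existing_names=["ann", "bob", "cy"],
    updated_names=["bob", "cy", "dee"],
)
```

Here to_delete is ["ann"] and to_create is ["dee"]. The first list can feed a single filter(name__in=...).delete() call, and the second is the set of rows that still has to be saved.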

django save instance between parent and child model class

I came across this problem: on form save, the data needs to be persisted somewhere, then go through a payment process, and on success be retrieved and saved to the proper model.
I have seen this done using the session, but with a hacky way to persist file uploads when commit=False, and it doesn't seem very Pythonic.
I am thinking of having a model class A and a child class extending A, such as A_Temp:
class A(models.Model):
    name = models.CharField(max_length=25)
    image = models.ImageField()
class A_Temp(A):
    pass
class AForm(forms.ModelForm):
    class Meta:
        model = A_Temp
On model form (A_Temp) save, the data is stored in A_Temp, and when the payment is successful, the instance is moved to the parent model class A.
Here are the questions:
Has anyone done this before?
How to properly move an instance of a child model class to the parent model class?
Edit:
There are other ways to do it, such as adding extra fields to the table. Yes, I would have done that if I were using PHP without an ORM framework, but since the ORM in Django is pretty decent, I thought I might try something different.
Since I am asking here, I am clearly not convinced about this approach myself. What are your thoughts?
As suggested in the question comments, adding an extra field to your model containing the payment state may be the easiest approach. Conceptually it's the same object; it's just that the state changes once payment has been made. As you've indicated, you will need logic to purge items from your database which never proceed through the required states such as payment. This may involve adding both a payment_state field and a state_change_time field, which indicates when the state last changed, to your model. If the state is PAYMENT_PENDING for too long, that record could be purged.
If you take the approach that unpaid items are stored in a different table as you've suggested, you still have to manage that table to determine when it's safe to delete items. For example, if a payment is never processed, when will you delete the record from the A_temp table? Also, having a separate table means that you really only have two possible states, paid and unpaid, as determined by the table in which the record occurs. Having a single table with a payment_state may be more flexible in that it allows you to extend the state as required. E.g., let's say you decide you need the payment states ITEM_SUBMITTED, AWAITING_PAYMENT, PAYMENT_ACCEPTED, PAYMENT_REJECTED. This could all be implemented with a single state field. If this was implemented as you've described, you'd need a separate table for each state.
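A single state field plus a timestamp could look roughly like the following. This is a plain-Python sketch rather than a Django model; the Item class, the is_stale helper, and the 24-hour cutoff are assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum

class PaymentState(Enum):
    ITEM_SUBMITTED = "item_submitted"
    AWAITING_PAYMENT = "awaiting_payment"
    PAYMENT_ACCEPTED = "payment_accepted"
    PAYMENT_REJECTED = "payment_rejected"

@dataclass
class Item:
    name: str
    payment_state: PaymentState
    state_change_time: datetime

# Hypothetical cutoff: how long an unpaid item may linger before it is purged.
MAX_PENDING_AGE = timedelta(hours=24)

def is_stale(item, now):
    """True if the item never reached PAYMENT_ACCEPTED and its state is too old."""
    unpaid = item.payment_state is not PaymentState.PAYMENT_ACCEPTED
    return unpaid and now - item.state_change_time > MAX_PENDING_AGE

now = datetime(2024, 1, 2, 12, 0)
items = [
    Item("old unpaid", PaymentState.AWAITING_PAYMENT, now - timedelta(days=2)),
    Item("fresh unpaid", PaymentState.AWAITING_PAYMENT, now - timedelta(hours=1)),
    Item("old paid", PaymentState.PAYMENT_ACCEPTED, now - timedelta(days=30)),
]
to_purge = [item.name for item in items if is_stale(item, now)]
```

Only "old unpaid" ends up in to_purge. Adding a new state later means adding one enum member, not a new table.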
Having said all that, if you're still set on having a separate table structure, you can create a function which will copy the values from an instance of A_temp to A. Something like the following may work, but any relationship type fields such as ForeignKey are likely to require special attention.
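def copy_A_temp_to_A(a, a_temp):
    # _meta.fields yields field objects, so read each field's name.
    for field in a._meta.fields:
        if field.primary_key:
            continue  # keep the primary key of the target instance
        # copy from the temporary instance onto the A instance
        setattr(a, field.name, getattr(a_temp, field.name))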
When you need to do the move from A_temp to A, you'd have to instantiate an A instance, then call the copy function, save the instance and delete the A_temp instance from the database.

How to modify a queryset and save it as new objects?

I need to query for a set of objects for a particular Model, change a single attribute/column ("account"), and then save the entire queryset's objects as new objects/rows. In other words, I want to duplicate the objects, with a single attribute ("account") changed on the duplicates. I'm basically creating a new account and then going through each model and copying a previous account's objects to the new account, so I'll be doing this repeatedly, with different models, probably using django shell. How should I approach this? Can it be done at the queryset level or do I need to loop through all the objects?
i.e.,
MyModel.objects.filter(account="acct_1")
# Now I need to set account = "acct_2" for the entire queryset,
# and save as new rows in the database
From the docs:
If the object’s primary key attribute is not set, or if it’s set but a
record doesn’t exist, Django executes an INSERT.
So if you set the id or pk to None it should work, but I've seen conflicting responses to this solution on SO: Duplicating model instances and their related objects in Django / Algorithm for recusrively duplicating an object
This solution should work (thanks @JoshSmeaton for the fix):
models = MyModel.objects.filter(account="acct_1")
for model in models:
    model.id = None
    model.account = "acct_2"
    model.save()
In my case, the model I'm testing on has a OneToOneField, so it makes sense that my test wouldn't work with this basic solution: the copy would point at the same related row and violate the field's uniqueness. But I believe it should work as long as you take care of any OneToOneFields.
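On the "can it be done without looping" part of the question: at the database level, rows can be copied in one statement with INSERT ... SELECT. The sketch below uses Python's sqlite3 module and a made-up mymodel table with a label column; it is a different technique from the save() loop above and bypasses Django's save() entirely, so signals and custom save logic would not run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for MyModel.
conn.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, account TEXT, label TEXT)")
conn.executemany(
    "INSERT INTO mymodel (account, label) VALUES (?, ?)",
    [("acct_1", "a"), ("acct_1", "b"), ("other", "c")],
)

# Copy every acct_1 row; id is omitted so the database assigns new primary keys.
conn.execute(
    "INSERT INTO mymodel (account, label) SELECT ?, label FROM mymodel WHERE account = ?",
    ("acct_2", "acct_1"),
)
copied = conn.execute(
    "SELECT label FROM mymodel WHERE account = 'acct_2' ORDER BY label"
).fetchall()
```

The original acct_1 rows are left in place and two new rows are created for acct_2.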