Detect duplicate inserts when adding many-to-many relation - django

Let's assume there are two models, A and B:
class A(models.Model):
name = models.CharField(max_length=100)
class B(models.Model):
children = models.ManyToManyField(A)
I'm using b.children.add() method to add instance of A to b:
a = A.objects.get(pk=SOMETHING)
b.children.add(a)
As far as I know, Django by default doesn't allow duplicate many-to-many relationship. So I cannot add same instance of A more than once.
But the problem is here, I fetch instances of A with another query, then loop around them and add them one by one. How can I detect a duplicate relation? Does add() method return something useful?

A look at the source code reveals that Django first checks to see if there are any entries that already exist in the database, and then only adds the new ones. It doesn't return any information to the caller, though.
It's not clear if you actually need to detect duplicates, or if you just want to make sure that they're not being added to the database? If it's the latter then everything's fine. If it's the former, there's no way around hitting the database. If you're really concerned about performance you could always perform the check and update the through table yourself (i.e. re-implement add()).

Related

Is there any way around saving models that reference each other twice?

My issue is when saving new models that need to reference each other, not just using a related_name lookup, such as this:
class Many:
owner = models.ForeignKey('One')
class One:
current = models.OneToOneField('Many')
By default, these have null=False and, please correct me if I'm wrong, using these are impossible until I change one of the relationships:
current = models.OneToOneField('Many', null=True)
The reason is because you can't assign a model to a relationship unless its already saved. Otherwise resulting in ValueError: 'Cannot assign "<...>": "..." instance isn't saved in the database.'.
But now when I create a pair of these objects I need to save twice:
many = Many()
one = One()
one.save()
many.owner = one
many.save()
one.current = many
one.save()
Is this the right way to do it, or is there another way around saving twice?
There is no way around it, you need to save one of the objects twice anyway.
This is because, at the database level, you need to save an object to get its ID. There is no way to tell a sql database "save those 2 objects and assign the ids to those fields on the other object". So if you were to do it manually, you would INSERT the first object with NULL for the FK, get its ID back, INSERT the second object with the ID of the first one, get its ID back, then UPDATE the first object to set the FK.
You would encapsulate the whole thing in a transaction.
So what you're doing with the ORM is the closest you can get. You may want to add the following on top of that:
1) Use a transaction for the changes, like this:
from django.db import transaction
with transaction.atomic():
many, one = Many(), One()
one.save()
many.owner = one
many.save()
one.current = many
one.save(update_fields=['current']) # slight optimization here
2) Now this is encapsulated in a transaction, you would want to remove the null=True. But you cannot, as those are, unfortunately, checked immediately.
[edit: it appears Oracle might support deferring the NOT NULL check, so if you're using Oracle you can try dropping the null=True and it should work.]
You'll probably want to check how your code reacts if at a later point, when reading the db, if for some reason (manual editing, bugged insert somewhere, ...) one.current.owner != one.

Django default caching of foreign key models

For the life of me I can't find the answer to this question online, although it's a basic one.
I have two models, one referencing the other:
class A(models.Model):
name = models.CharField(...)
...
class B(models.Model):
a = models.ForeignKey(A)
Now, I'm keeping an instance of B in memory, and ever so often I access b.a.name . Does accessing b.a.name cause a database query every time, so that changes in a.name (changes done by another process) are seen in my process? Or do I have to query a explicitly each time?
I'm surprised you haven't been able to find any information on this. It's fairly well documented that forward relationships are cached on first usage - what happens is that a cache attribute is created, and subsequent lookups will check this first.
So yes, this means that changes to that object in another process will not be seen, and you will need to re-query each time. (Note also that depends on your database's transaction isolation setting, you might not even see the new value on re-querying - you may need to commit the current transaction first.)

django save instance between parent and child model class

I came across this problem on form save the data needs to be persisted somewhere then go through a payment process then on success retrieve the data and save to the proper model.
I have seen this done using session, but with some hacky way to persist file uploads when commit=False and it doesn't seem very pythonic
I am thinking if I have a model class A, and have a child class extending A, such as A_Temp
class A(models.Model):
name = models.CharField(max_lenght=25)
image = models.ImageField()
class A_Temp(A):
pass
class AForm(forms.ModelForm):
class Meta:
model = A_Temp
On model form (A_Temp) save, it stores to A_Temp, and when payment successful, it move the instance to the parent model class A.
Here are the questions:
Has anyone done this before?
How to properly move an instance of a child model class to the parent model class?
Edit:
There are other different ways to do it, such as adding extra fields to the table, yes I would've done that if I am using PHP without a ORM framework, but since the ORM is pretty decent in django, I thought that I might trial something different.
Since I am asking here, means I am not convinced myself about this approach as well. What are your thoughts?
As suggested in the question comments, adding an extra field to your model containing payment state may be the easiest approach. Conceptually it's the same object, it's just that the state changes once payment has been made. As you've indicated, you will need logic to purge out items from your database which never proceed through the required states such as payment. This may involve adding both a payment_state and state_change_time field to your model which indicates when the state last changed. If the state is PAYMENT_PENDING for for too long, that record could be purged.
If you take the approach that unpaid items are stored in a different table as you've suggested, you still have to manage that table to determine when it's safe to delete items. For example, if a payment is never processed, when will you delete record from the A_temp table? Also, having a separate table means that you really only have two states possible, paid and unpaid as determine by the table in which the record occurs. Having a single table with a payment_state may be more flexible in that it allows you to extend the state as required. eg. Let's say you decide you need the payment states ITEM_SUBMITTED, AWAITING_PAYMENT, PAYMENT_ACCEPTED, PAYMENT_REJECTED. This could all be implemented with a single state field. If this was implemented as you've described, you'd need a separate table for each state.
Having said all that, if you're still set on having a separate table structure, you can create a function which will copy the values from an instance of A_temp to A. Something like the following may work, but any relationship type fields such as ForeignKey are likely to require special attention.
def copy_A_temp_to_A(a, a_temp):
for field_name in a._meta.fields:
value = getattr(a, field_name)
setattr(a_temp, field_name, value)
When you need to do the move from A_temp to A, you'd have to instantiate an A instance, then call the copy function, save the instance and delete the A_temp instance from the database.

How to modify a queryset and save it as new objects?

I need to query for a set of objects for a particular Model, change a single attribute/column ("account"), and then save the entire queryset's objects as new objects/rows. In other words, I want to duplicate the objects, with a single attribute ("account") changed on the duplicates. I'm basically creating a new account and then going through each model and copying a previous account's objects to the new account, so I'll be doing this repeatedly, with different models, probably using django shell. How should I approach this? Can it be done at the queryset level or do I need to loop through all the objects?
i.e.,
MyModel.objects.filter(account="acct_1")
# Now I need to set account = "acct_2" for the entire queryset,
# and save as new rows in the database
From the docs:
If the object’s primary key attribute is not set, or if it’s set but a
record doesn’t exist, Django executes an INSERT.
So if you set the id or pk to None it should work, but I've seen conflicting responses to this solution on SO: Duplicating model instances and their related objects in Django / Algorithm for recusrively duplicating an object
This solution should work (thanks #JoshSmeaton for the fix):
models = MyModel.objects.filter(account="acct_1")
for model in models:
model.id = None
model.account = "acct_2"
model.save()
I think in my case, I have a OneToOneField on the model that I'm testing on, so it makes sense that my test wouldn't work with this basic solution. But, I believe it should work, so long as you take care of OneToOneField's.

Circular dependency in Django ForeignKey?

I have two models in Django:
A:
b = ForeignKey("B")
B:
a = ForeignKey(A)
I want these ForeignKeys to be non-NULL.
However, I cannot create the objects because they don't have a PrimaryKey until I save(). But I cannot save without having the other objects PrimaryKey.
How can I create an A and B object that refer to each other?
I don't want to permit NULL if possible.
If this is really a bootstrapping problem and not something that will reoccur during normal usage, you could just create a fixture that will prepopulate your database with some initial data. The fixture-handling code includes workarounds at the database layer to resolve the forward-reference issue.
If it's not a bootstrapping problem, and you're going to want to regularly create these circular relations among new objects, you should probably either reconsider your schema--one of the foreign keys is probably unnecessary.
It sounds like you're talking about a one-to-one relationship, in which case it is unnecessary to store the foreign key on both tables. In fact, Django provides nice helpers in the ORM to reference the corresponding object.
Using Django's OneToOneField:
class A(models.Model):
<snip>
class B(models.Model):
a = OneToOneField(A)
Then you can simply reference them like so:
a = A()
a.save()
b = B(a=a)
b.save()
print a.b
print b.a
In addition, you may look into django-annoying's AutoOneToOneField, which will auto-create the associated object on save if it doesn't exist on the instance.
If your problem is not a one-to-one relationship, you should clarify because there is almost certainly a better way to model the data than mutual foreign keys. Otherwise, there is not a way to avoid setting a required field on save.