Django default caching of foreign key models

For the life of me I can't find the answer to this question online, although it's a basic one.
I have two models, one referencing the other:
class A(models.Model):
    name = models.CharField(...)
    ...

class B(models.Model):
    a = models.ForeignKey(A, on_delete=models.CASCADE)
Now, I'm keeping an instance of B in memory, and every so often I access b.a.name. Does accessing b.a.name cause a database query every time, so that changes to a.name (changes made by another process) are seen in my process? Or do I have to query a explicitly each time?

I'm surprised you haven't been able to find any information on this. It's fairly well documented that forward relationships are cached on first usage - what happens is that a cache attribute is created, and subsequent lookups will check this first.
So yes, this means that changes to that object in another process will not be seen, and you will need to re-query each time. (Note also that, depending on your database's transaction isolation level, you might not even see the new value on re-querying - you may need to commit the current transaction first.)
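For illustration, a minimal sketch of the caching behaviour using the models above (refresh_from_db exists from Django 1.8 onwards):

b = B.objects.get(pk=1)
name = b.a.name        # first access: one query, the A instance is cached on b
name = b.a.name        # no query: served from the cached instance
# To see changes made by another process, re-query explicitly:
b.a.refresh_from_db()  # reloads the cached A instance from the database
name = b.a.name        # now reflects the current database value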

Related

Synchronized Model instances in Django

I'm building a model for a Django project (my first Django project) and noticed
that instances of a Django model are not synchronized.
a_note = Notes.objects.create(message="Hello") # pk=1
same_note = Notes.objects.get(pk=1)
same_note.message = "Good day"
same_note.save()
a_note.message # Still is "Hello"
a_note is same_note # False
Is there a built-in way to make model instances with the same primary key to be
the same object? If yes, (how) does this maintain a globally consistent state of all
model objects, even in the case of bulk updates or changing foreign keys
and thus making items enter/exit related sets?
I can imagine some sort of registry in the model class, which could at least handle simple cases (i.e. it would fail in cases of bulk updates or a change in foreign keys). However, the static registry makes testing more difficult.
I intend to build the (domain) model with high-level functions to do complex
operations which go beyond the simple CRUD
actions of Django's Model class. (Some classes of my model have an instance
of a Django Model subclass, as opposed to being an instance of a subclass. This
is by design, to prevent direct access to the database, which might break consistency, and to separate the business logic from the purely data-access-related Django Model.) A complex operation might touch and modify several components. As a developer
using the model API, it's impossible to know which components are out of date after
calling a complex operation. Automatically synchronized instances would mitigate this issue. Are there other ways to overcome this?
TL;DR "Is there a built-in way to make model instances with the same primary key to be the same object?" No.
A Python object in memory isn't the same thing as a row in your database. So when you create a_note and then fetch same_note from the db, those are two different objects in memory, even though they both represent the same underlying row in your database. When you fetch same_note, you in fact instantiate a new Notes object and initialise it with the values fetched from the database.
Then you change and save same_note, but the a_note object in memory isn't changed. If you did a_note.refresh_from_db() you would see that a_note.message was changed.
Now a_note is same_note will always be False, because the location in memory of these two objects will always be different. Two variables are the same (is evaluates to True) only if they point to the same object in memory.
But a_note == same_note will return True at any time, since Django defines two model instances to be equal if their pk is the same.
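A minimal sketch of the distinction, using the Notes model from the question:

a_note = Notes.objects.create(message="Hello")
same_note = Notes.objects.get(pk=a_note.pk)

a_note is same_note   # False: two distinct objects in memory
a_note == same_note   # True: same model class, same pk

same_note.message = "Good day"
same_note.save()
a_note.message        # still "Hello"
a_note.refresh_from_db()
a_note.message        # now "Good day"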
Note that if the complexity you're talking about is that, with multiple requests, one request might change underlying values that are being used by another request, then use F() expressions to avoid race conditions.
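For example, a sketch of an F() update, assuming a hypothetical integer field views on Notes - the arithmetic runs inside the database, so concurrent requests don't overwrite each other:

from django.db.models import F

# The increment happens in SQL, so two requests doing this at the
# same time both take effect (no lost update).
Notes.objects.filter(pk=a_note.pk).update(views=F('views') + 1)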
Within one request, since everything is sequential and single-threaded, there's no risk of variables going out of sync: you know the order in which things are done and can therefore always call refresh_from_db() when you know a previous method call might have changed the database value.
Note also: having two variables holding the same row means you'll have performed two queries to your db, which is the one thing you want to avoid at all costs. So you should ask yourself why you have this situation in the first place.

Detect duplicate inserts when adding many-to-many relation

Let's assume there are two models, A and B:
class A(models.Model):
    name = models.CharField(max_length=100)

class B(models.Model):
    children = models.ManyToManyField(A)
I'm using the b.children.add() method to add an instance of A to b:
a = A.objects.get(pk=SOMETHING)
b.children.add(a)
As far as I know, Django by default doesn't allow duplicate many-to-many relationships, so I cannot add the same instance of A more than once.
But here's the problem: I fetch instances of A with another query, then loop over them and add them one by one. How can I detect a duplicate relation? Does the add() method return anything useful?
A look at the source code reveals that Django first checks to see if there are any entries that already exist in the database, and then only adds the new ones. It doesn't return any information to the caller, though.
It's not clear if you actually need to detect duplicates, or if you just want to make sure that they're not added to the database. If it's the latter, then everything's fine. If it's the former, there's no way around hitting the database. If you're really concerned about performance, you could always perform the check and update the through table yourself (i.e. re-implement add()).
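If you do need to know which relations already existed, a sketch with one extra query (new_children stands in for the instances fetched by your other query):

new_children = list(A.objects.filter(...))  # your other query

# Find which of these A instances are already related to b:
existing_ids = set(
    b.children.filter(pk__in=[a.pk for a in new_children])
              .values_list('pk', flat=True)
)
duplicates = [a for a in new_children if a.pk in existing_ids]

b.children.add(*new_children)  # add() itself silently skips the duplicates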

Is there any way around saving models that reference each other twice?

My issue is when saving new models that need to reference each other, not just using a related_name lookup, such as this:
class Many(models.Model):
    owner = models.ForeignKey('One', on_delete=models.CASCADE)

class One(models.Model):
    current = models.OneToOneField('Many', on_delete=models.CASCADE)
By default, these have null=False and, please correct me if I'm wrong, using them is impossible until I change one of the relationships:
current = models.OneToOneField('Many', on_delete=models.CASCADE, null=True)
The reason is that you can't assign a model instance to a relationship unless it's already saved, otherwise you get ValueError: 'Cannot assign "<...>": "..." instance isn't saved in the database.'.
But now when I create a pair of these objects I need to save twice:
many = Many()
one = One()
one.save()
many.owner = one
many.save()
one.current = many
one.save()
Is this the right way to do it, or is there another way around saving twice?
There is no way around it, you need to save one of the objects twice anyway.
This is because, at the database level, you need to save an object to get its ID. There is no way to tell an SQL database "save those 2 objects and assign the IDs to those fields on the other object". So if you were to do it manually, you would INSERT the first object with NULL for the FK, get its ID back, INSERT the second object with the ID of the first one, get its ID back, then UPDATE the first object to set the FK.
You would encapsulate the whole thing in a transaction.
So what you're doing with the ORM is the closest you can get. You may want to add the following on top of that:
1) Use a transaction for the changes, like this:
from django.db import transaction

with transaction.atomic():
    many, one = Many(), One()
    one.save()
    many.owner = one
    many.save()
    one.current = many
    one.save(update_fields=['current'])  # slight optimization here
2) Now that this is encapsulated in a transaction, you would want to remove the null=True. But you cannot, as NOT NULL constraints are, unfortunately, checked immediately rather than deferred to the end of the transaction.
[edit: it appears Oracle might support deferring the NOT NULL check, so if you're using Oracle you can try dropping the null=True and it should work.]
You'll probably also want to check how your code reacts if, at a later point when reading the db, you find that for some reason (manual editing, a bugged insert somewhere, ...) one.current.owner != one.

django save instance between parent and child model class

I came across this problem: on form save, the data needs to be persisted somewhere, then go through a payment process, and on success the data is retrieved and saved to the proper model.
I have seen this done using sessions, but that needs some hacky way to persist file uploads when commit=False, and it doesn't seem very Pythonic.
I am thinking: what if I have a model class A, and a child class extending A, such as A_Temp?
class A(models.Model):
    name = models.CharField(max_length=25)
    image = models.ImageField()

class A_Temp(A):
    pass

class AForm(forms.ModelForm):
    class Meta:
        model = A_Temp
On model form (A_Temp) save, the data is stored in A_Temp, and when payment is successful, the instance is moved to the parent model class A.
Here are the questions:
Has anyone done this before?
How to properly move an instance of a child model class to the parent model class?
Edit:
There are other ways to do it, such as adding extra fields to the table. Yes, I would have done that if I were using PHP without an ORM framework, but since the ORM in Django is pretty decent, I thought I might try something different.
Since I am asking here, means I am not convinced myself about this approach as well. What are your thoughts?
As suggested in the question comments, adding an extra field to your model containing the payment state may be the easiest approach. Conceptually it's the same object; it's just that the state changes once payment has been made. As you've indicated, you will need logic to purge items from your database which never proceed through the required states such as payment. This may involve adding both a payment_state field and a state_change_time field to your model, the latter indicating when the state last changed. If the state has been PAYMENT_PENDING for too long, that record could be purged.
If you take the approach that unpaid items are stored in a different table, as you've suggested, you still have to manage that table to determine when it's safe to delete items. For example, if a payment is never processed, when will you delete the record from the A_temp table? Also, having a separate table means that you really only have two possible states, paid and unpaid, as determined by the table in which the record occurs. Having a single table with a payment_state field may be more flexible in that it allows you to extend the states as required. E.g. let's say you decide you need the payment states ITEM_SUBMITTED, AWAITING_PAYMENT, PAYMENT_ACCEPTED, PAYMENT_REJECTED. This could all be implemented with a single state field, whereas, implemented as you've described, you'd need a separate table for each state.
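For illustration, a sketch of the single-table approach - the state names come from the paragraph above, everything else is assumption:

class A(models.Model):
    ITEM_SUBMITTED = 'ITEM_SUBMITTED'
    AWAITING_PAYMENT = 'AWAITING_PAYMENT'
    PAYMENT_ACCEPTED = 'PAYMENT_ACCEPTED'
    PAYMENT_REJECTED = 'PAYMENT_REJECTED'
    PAYMENT_STATES = [
        (ITEM_SUBMITTED, 'Item submitted'),
        (AWAITING_PAYMENT, 'Awaiting payment'),
        (PAYMENT_ACCEPTED, 'Payment accepted'),
        (PAYMENT_REJECTED, 'Payment rejected'),
    ]
    name = models.CharField(max_length=25)
    image = models.ImageField()
    payment_state = models.CharField(max_length=20, choices=PAYMENT_STATES,
                                     default=ITEM_SUBMITTED)
    # Simplistic: auto_now updates on every save, not only on state changes.
    state_change_time = models.DateTimeField(auto_now=True)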
Having said all that, if you're still set on having a separate table structure, you can create a function which will copy the values from an instance of A_temp to A. Something like the following may work, but any relationship type fields such as ForeignKey are likely to require special attention.
def copy_A_temp_to_A(a, a_temp):
    # _meta.fields yields Field objects, so use field.name with
    # getattr/setattr. Skip primary keys (including the parent link)
    # so that `a` gets its own fresh row when saved.
    for field in a_temp._meta.fields:
        if field.primary_key:
            continue
        setattr(a, field.name, getattr(a_temp, field.name))
When you need to do the move from A_temp to A, you'd have to instantiate an A instance, then call the copy function, save the instance and delete the A_temp instance from the database.
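Roughly, the promotion step could then look like this (a sketch under the assumptions above):

a = A()
copy_A_temp_to_A(a, a_temp)
a.save()         # inserts a brand-new A row
a_temp.delete()  # removes the temporary record (and, with multi-table inheritance, its parent row)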

Concurrency in django admin change view

My model:
class Order(models.Model):
    property_a = models.CharField(max_length=100)
    property_b = models.CharField(max_length=100)
    property_c = models.CharField(max_length=100)
Many users will access a given record in a short time frame via admin change page, so I am having concurrency issues:
User 1 and user 2 open the change page at the same time. Assume all values are blank when they load the page. User 1 sets property_a to "a" and property_b to "b", then saves. A second later, user 2 changes property_b and property_c and then saves, quietly overwriting all the values from user 1: property_a goes back to being blank, and property_b and property_c become whatever user 2 put in.
I need recommendations on how to handle this. If I have to have a version field in the model, how do i pass it to the admin, where do I do the check so I can elegantly notify the user their changes can't be saved because another user has modified the record? Is there a more seamless way than just returning an error to the user?
The standard solution is to prevent your users from sharing a single record. It's not at all clear why so many users are messing with the exact same Order instance.
Consider that Order is probably a composite object and you've put too much into a single model. That's the first -- and best -- solution.
If (for inexplicable reasons) you won't decompose this, then you have to create a two-part update transaction.
1) Requery the data. Compare it with the original query as done for this user's session.
2) If the data doesn't match the original query, then someone else changed it. The user's changes are invalidated, rolled back, wiped out, and the user sees a new query.
3) If the data does match, you can try to commit the change.
The above algorithm has a race condition, which is usually resolved via low-level SQL. Note that it invalidates a user's work, making it maximally irritating.
That's why your first choice is to decompose your models to eliminate the concurrency.
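For reference, a sketch of how the version field mentioned in the question could implement that check-and-commit atomically - the version integer field is an assumption, not part of the original model, and the conditional UPDATE closes the race window:

from django.db import transaction
from django.db.models import F

def save_order(order_id, loaded_version, **changes):
    # The UPDATE matches only if nobody changed the row since this
    # user loaded it; 0 rows updated means a concurrent edit happened.
    with transaction.atomic():
        updated = Order.objects.filter(
            pk=order_id, version=loaded_version,
        ).update(version=F('version') + 1, **changes)
    if not updated:
        raise RuntimeError('Order was modified by another user; reload and retry.')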
my model has a miscellaneous notes field
This is a bad design. (a) Concurrency is ruined by collisions on this field. (b) There's no log or history of comments.
Item (b) means that a badly-behaved user can maliciously corrupt this data. If you keep notes and comments as a log, you can -- in principle -- limit users to changing only their own comments.
[In most databases with "miscellaneous notes", the field has become a costly, hard-to-maintain liability full of important but impossible-to-parse data. Miscellaneous notes is where users invent their own processes outside the application software.]
"miscellaneous notes" must be treated like a log, with an unlimited number of notes -- date-stamped -- identified by user -- appended to the Order.
If you simply partition the design to put notes in a separate table, you solve your concurrency issues.
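A minimal sketch of that partitioned design (model and field names are illustrative, not from the original):

from django.conf import settings

class OrderNote(models.Model):
    # Append-only log: each user adds new rows instead of editing a
    # shared text field, so concurrent edits no longer collide.
    order = models.ForeignKey(Order, on_delete=models.CASCADE,
                              related_name='notes')
    author = models.ForeignKey(settings.AUTH_USER_MODEL,
                               on_delete=models.PROTECT)
    created_at = models.DateTimeField(auto_now_add=True)
    text = models.TextField()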