Django foreign key: auto-lookup related object when the update record has the key value - django

I have legacy code which had no foreign keys defined in the schema.
The raw data for the row includes the key value of the parent, naturally.
My first porting attempt to postgresql just updated the field with the raw value: I did not add foreign keys to Django's models.
Now I am trying to add foreign keys to make the schema more informative.
When I add a foreign key, django's update requires me to provide an instance of the parent object: I can no longer update by simply providing the key value. But this is onerous because now I need to include in my code knowledge of all the relations to go and fetch related objects, and have specific update calls per model. This seems crazy to me, at least starting from where I am, so I feel like I am really missing something.
Currently, the update code just pushes rows in blissful ignorance. The update code is generic for tables, which is easy when there are no relations.
Django's model data means that I can find the related object dynamically for any given model, and doing this means I can still keep very abstracted table update logic. So this is what I am thinking of doing. Or just doing raw SQL updates.
Does a solution to this already exist, even if I can't find it? I am expecting to be embarrassed.
The ValueError comes in django ORM code which knows exactly which model it expects and what the related field is: the missing step if to find the instance of related object.
db.models.fields.related_descriptors.py:
in this code, which throws the exception, value is supposed to be an instance of the parent model. Instead, value is the key value. This basically I think tells me how I can inspect the model to deal with this in advance, but I wonder if I am re-inventing the wheel.
if value is not None and not isinstance(value, self.field.remote_field.model._meta.concrete_model):
raise ValueError(
'Cannot assign "%r": "%s.%s" must be a "%s" instance.' % (
value,
instance._meta.object_name,
self.field.name,
self.field.remote_field.model._meta.object_name,
)
)

You could use _id suffix to set id value directly
For given model
class Album(models.Model):
artist = models.ForeignKey(Musician, on_delete=models.CASCADE)
You can set artist by id in following manner
Album.objects.create(artist_id=2)

Related

Is there any way around saving models that reference each other twice?

My issue is when saving new models that need to reference each other, not just using a related_name lookup, such as this:
class Many:
owner = models.ForeignKey('One')
class One:
current = models.OneToOneField('Many')
By default, these have null=False and, please correct me if I'm wrong, using these are impossible until I change one of the relationships:
current = models.OneToOneField('Many', null=True)
The reason is because you can't assign a model to a relationship unless its already saved. Otherwise resulting in ValueError: 'Cannot assign "<...>": "..." instance isn't saved in the database.'.
But now when I create a pair of these objects I need to save twice:
many = Many()
one = One()
one.save()
many.owner = one
many.save()
one.current = many
one.save()
Is this the right way to do it, or is there another way around saving twice?
There is no way around it, you need to save one of the objects twice anyway.
This is because, at the database level, you need to save an object to get its ID. There is no way to tell a sql database "save those 2 objects and assign the ids to those fields on the other object". So if you were to do it manually, you would INSERT the first object with NULL for the FK, get its ID back, INSERT the second object with the ID of the first one, get its ID back, then UPDATE the first object to set the FK.
You would encapsulate the whole thing in a transaction.
So what you're doing with the ORM is the closest you can get. You may want to add the following on top of that:
1) Use a transaction for the changes, like this:
from django.db import transaction
with transaction.atomic():
many, one = Many(), One()
one.save()
many.owner = one
many.save()
one.current = many
one.save(update_fields=['current']) # slight optimization here
2) Now this is encapsulated in a transaction, you would want to remove the null=True. But you cannot, as those are, unfortunately, checked immediately.
[edit: it appears Oracle might support deferring the NOT NULL check, so if you're using Oracle you can try dropping the null=True and it should work.]
You'll probably want to check how your code reacts if at a later point, when reading the db, if for some reason (manual editing, bugged insert somewhere, ...) one.current.owner != one.

Do Django models really need a single unique key field

Some of my models are only unique in a combination of keys. I don't want to use an auto-numbering id as the identifier as subsets of the data will be exported to other systems (such as spreadsheets), modified and then used to update the master database.
Here's an example:
class Statement(models.Model):
supplier = models.ForeignKey(Supplier)
total = models.DecimalField("statement total", max_digits=10, decimal_places=2)
statement_date = models.DateField("statement date")
....
class Invoice(models.Model):
supplier = models.ForeignKey(Supplier)
amount = models.DecimalField("invoice total", max_digits=10, decimal_places=2)
invoice_date = models.DateField("date of invoice")
statement = models.ForeignKey(Statement, blank=True, null=True)
....
Invoice records are only unique for a combination of supplier, amount and invoice_date
I'm wondering if I should create a slug for Invoice based on supplier, amount and invoice_date so that it is easy to identify the correct record.
An example of the problem of having multiple related fields to identify the right record is django-csvimport which assumes there is only one related field and will not discriminate on two when building the foreign key links.
Yet the slug seems a clumsy option and needs some kind of management to rebuild the slugs after adding records in bulk.
I'm thinking this must be a common problem and maybe there's a best practice design pattern out there somewhere.
I am using PostgreSQL in case anyone has a database solution. Although I'd prefer to avoid that if possible, I can see that it might be the way to build my slug if that's the way to go, perhaps with trigger functions. That just feels a bit like hidden functionality though, and may cause a headache for setting up on a different server.
UPDATE - after reading initial replies
My application requires that data may be exported, modified remotely, and merged back into the master database after review and approval. Hidden autonumber keys don't easily survive that consistently. The relation invoices[2417] is part of statements[265] is not persistent if the statement table was emptied and reloaded from a CSV.
If I use the numeric autonumber pk then any process that is updating the database would need to refresh the related key numbers or by using the multiple WITH clause.
If I create a slug that is based on my 3 keys but easy to reproduce then I can use it as the key - albeit clumsily. I'm thinking of a slug along the lines:
u'%s %s %s' % (self.supplier,
self.statement_date.strftime("%Y-%m-%d"),
self.total)
This seems quite clumsy and not very DRY as I expect I may have to recreate the slug elsewhere duplicating the algorithm (maybe in an Excel formula, or an Access query)
I thought there must be a better way I'm missing but it looks like yuvi's reply means there should be, and there will be, but not yet :-(
What you're talking about it a multi-column primary key, otherwise known as "composite" or "compound" keys. Support in django for composite keys today is still in the works, you can read about it here:
Currently Django models only support a single column in this set,
denying many designs where the natural primary key of a table is
multiple columns [...] Current state is that the issue is
accepted/assigned and being worked on [...]
The link also mentions a partial implementation which is django-compositekeys. It's only partial and will cause you trouble with navigating between relationships:
support for composite keys is missing in ForeignKey and
RelatedManager. As a consequence, it isn't possible to navigate
relationships from models that have a composite primary key.
So currently it isn't entirely supported, but will be in the future. Regarding your own project, you can make of that what you will, though my own suggestion is to stick with the fully supported default of a hidden auto-incremented field that you don't even need to think about (and use unique_together to enforce the uniqness of the described fields instead of making them your primary keys).
I hope this helps!
No.
Model needs to have one field that is primary_key = True. By default this is the (hidden) autofield which stores object Id. But you can set primary_key to True at any other field.
I've done this in cases, Where i'm creating django project upon tables which were previously created manually or through some other frameworks/systems.
In reality - you can use whatever means you can think of, for joining objects together in queries. As long as query returns bunch of data that can be associated with models you have - it does not really matter which field you are using for joins. Just keep in mind, that the solution you use should be as effective as possible.
Alan

How to modify a queryset and save it as new objects?

I need to query for a set of objects for a particular Model, change a single attribute/column ("account"), and then save the entire queryset's objects as new objects/rows. In other words, I want to duplicate the objects, with a single attribute ("account") changed on the duplicates. I'm basically creating a new account and then going through each model and copying a previous account's objects to the new account, so I'll be doing this repeatedly, with different models, probably using django shell. How should I approach this? Can it be done at the queryset level or do I need to loop through all the objects?
i.e.,
MyModel.objects.filter(account="acct_1")
# Now I need to set account = "acct_2" for the entire queryset,
# and save as new rows in the database
From the docs:
If the object’s primary key attribute is not set, or if it’s set but a
record doesn’t exist, Django executes an INSERT.
So if you set the id or pk to None it should work, but I've seen conflicting responses to this solution on SO: Duplicating model instances and their related objects in Django / Algorithm for recusrively duplicating an object
This solution should work (thanks #JoshSmeaton for the fix):
models = MyModel.objects.filter(account="acct_1")
for model in models:
model.id = None
model.account = "acct_2"
model.save()
I think in my case, I have a OneToOneField on the model that I'm testing on, so it makes sense that my test wouldn't work with this basic solution. But, I believe it should work, so long as you take care of OneToOneField's.

Primary key and unique key in django

I had a custom primary key that need to be set up on a particular data in a model.
This was not enough, as an attempt to insert a duplicate number succeeded. So now when i replace primary_key=True to unique=True it works properly and rejects duplicate numbers!!. But according this document (which uses fields).
primary_key=True implies null=False
and unique=True.
Which makes me confused as in why does
it accept the value in the first place
with having an inbuilt unique=True ?
Thank you.
Updated statement:
personName = models.CharField(primary_key=True,max_length=20)
Use an AutoField with primary_key instead.
Edit:
If you don't use an AutoField, you'll have to manually calculate/set the value for the primary key field. This is rather cumbersome. Is there a reason you need ReportNumber to the primary key? You could still have a unique report number on which you can query for reports, as well as an auto-incrementing integer primary key.
Edit 2:
When you say duplicate primary key values are allowed, you indicate that what's happening is that an existing record with the same primary key is updated -- there aren't actually two objects with the same primary key in the database (which can't happen). The issue is in the way Django's ORM layer chooses to do an UPDATE (modify an existing DB record) vs. an INSERT INTO (create a new DB record). Check out this line from django.db.models.base.Model.save_base():
if (force_update or (not force_insert and
manager.using(using).filter(pk=pk_val).exists())):
# It does already exist, so do an UPDATE.
Particularly, this snippet of code:
manager.using(using).filter(pk=pk_val).exists()
This says: "If a record with the same primary key as this Model exists in the database, then do an update." So if you re-use a primary key, Django assumes you are doing an update, and thus doesn't raise an exception or error.
I think the best idea is to let Django generate a primary key for you, and then have a separate field (CharField or whatever) that has the unique constraint.

Django - access m2m objects (or raw pks) from ``clean`` before model is saved

Of course you can't just use self.related_field.objects.all(), or you'll get a ...needs to have a primary key... error, but if I want to run custom validation inside of Model.clean, there appears to be no way to access this data. Of course you can use Form.clean to do this, but I'm not always using forms.
What you are asking for is impossible - the M2M records cannot exist until the main object has a primary key value. There is no way to access the data because it does not exist.