How to update django model through data in foreignkey fields? - django

I am trying to migrate some data from one table to another, and I have the following script:
def forward_func(apps, schema_editor):
    targetModel = apps.get_model("target", "some_target")
    targetModel.objects.all().update(a='other__table__a')
Both fields are boolean, and the data in the original table is valid and has the correct type.
For some reason, I'm getting this message:
ValidationError: [u"'other__table__a' value must be either True or False."]
In the Django docs (v1.11) I can't find a topic that addresses this directly, but I did find this:
However, unlike F() objects in filter and exclude clauses, you can’t introduce joins when you use F() objects in an update – you can only reference fields local to the model being updated. If you attempt to introduce a join with an F() object, a FieldError will be raised:
# This will raise a FieldError
>>> Entry.objects.update(headline=F('blog__name'))
Does that mean I simply cannot call .update() with values taken across foreign keys? Do I have to loop through every object and update it that way?
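A note on the error itself: update(a='other__table__a') passes the literal string 'other__table__a' as the new value, which is why the boolean field's validation complains; a plain string is never treated as a field reference. Per the quoted docs, wrapping it in F() would not help either, since update() cannot introduce a join. What the migration needs at the SQL level is a correlated subquery, which Django 1.11 can express with Subquery and OuterRef from django.db.models instead of a Python loop. Below is a minimal sketch of the SQL idea only, using the standard-library sqlite3 module; the table names target and other and the column other_id are assumptions, not the asker's real schema.

```python
import sqlite3

# Hypothetical schema: `target.other_id` is a foreign key to `other.id`,
# and both tables carry a boolean column named `a`.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE other (id INTEGER PRIMARY KEY, a BOOLEAN NOT NULL);
    CREATE TABLE target (
        id INTEGER PRIMARY KEY,
        a BOOLEAN NOT NULL DEFAULT 0,
        other_id INTEGER NOT NULL REFERENCES other(id)
    );
    INSERT INTO other (id, a) VALUES (1, 1), (2, 0);
    INSERT INTO target (id, other_id) VALUES (10, 1), (11, 2), (12, 1);
""")

# One statement: each target row pulls `a` from the row its foreign key points at.
conn.execute(
    "UPDATE target "
    "SET a = (SELECT other.a FROM other WHERE other.id = target.other_id)"
)

result = conn.execute("SELECT id, a FROM target ORDER BY id").fetchall()
print(result)
```

With the ORM, the same update is usually written along the lines of update(a=Subquery(Other.objects.filter(pk=OuterRef('other_id')).values('a')[:1])), where Other and other_id are placeholder names for the related model and the foreign key column.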

Related

Why does get and filter work differently for annotated fields?

I have a get_queryset in a custom Manager for a Model that renames fields:
class Manager:
    def get_queryset(self):
        return super(Manager, self).get_queryset().values(renamed_field=F('original_field'))
Why is it that I can do a .filter on the renamed field but when I do a .get I need to use the original field name?
This works:
Model.objects.filter(renamed_field='Test')
But this errors out with matching query does not exist:
Model.objects.get(renamed_field='Test')
From the docs about Querysets:
Internally, a QuerySet can be constructed, filtered, sliced, and
generally passed around without actually hitting the database. No
database activity actually occurs until you do something to evaluate
the queryset.
When you call the get method, you hit the database. This explains why you get the error about no matching query.

Django Get and Filter

I have a design question on django.
Can someone explain to me why the Django ORM's get() doesn't return a queryset?
To the best of my understanding, a queryset is the result of a database query. By that logic, isn't the result of a get() query a queryset too?
Also, while researching this question, I found out that the get query calls the model manager while the queryset doesn't.
Is there a reasoning behind all of this?
From the QuerySet API reference in the Django documentation:
get() raises MultipleObjectsReturned if more than one object was found. The MultipleObjectsReturned exception is an attribute of the model class.
filter(), by contrast, returns a QuerySet containing the instances that match the lookup kwargs.
For example:
from django.db.models import QuerySet

A = Model.objects.get(id=1)
B = Model.objects.filter(id=1)
print(isinstance(A, Model))  # True
print(isinstance(B, QuerySet))  # True
Get sql >>
select * from public.app_model where id=1;
Filter sql >>
select * from public.app_model where id=1;
The difference is in the implementation: get() must return a single instance, so it raises MultipleObjectsReturned if the query returns more than one row (and DoesNotExist if it returns none). filter() raises neither; it returns all matching rows as Model instances.
BTW, both use the cache property.
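To make the difference concrete, here is a toy model of the two methods in plain Python. It is not Django's implementation; ToyQuerySet and the two exception classes are hypothetical stand-ins, and rows are plain dicts. It only shows the contract described above: filter() stays chainable and returns another queryset-like object, while get() narrows the same result down to exactly one row or raises.

```python
class DoesNotExist(Exception):
    """Raised by get() when no row matches."""


class MultipleObjectsReturned(Exception):
    """Raised by get() when more than one row matches."""


class ToyQuerySet:
    """Hypothetical stand-in for a QuerySet over a list of dict rows."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __len__(self):
        return len(self._rows)

    def filter(self, **lookups):
        # Always returns another ToyQuerySet, even if it is empty.
        matches = [
            row for row in self._rows
            if all(row.get(key) == value for key, value in lookups.items())
        ]
        return ToyQuerySet(matches)

    def get(self, **lookups):
        # Same lookup as filter(), but the result must be exactly one row.
        matches = self.filter(**lookups)._rows
        if not matches:
            raise DoesNotExist("no row matches %r" % (lookups,))
        if len(matches) > 1:
            raise MultipleObjectsReturned("%d rows match %r" % (len(matches), lookups))
        return matches[0]


toy_qs = ToyQuerySet([
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 3, "name": "b"},
])
single = toy_qs.get(id=1)         # one row, returned as-is
several = toy_qs.filter(name="b")  # still a ToyQuerySet
```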

Django - copy and insert queryset clone using bulk_create

My goal is to create a clone of a queryset and then insert it into the database.
Following the suggestions of this post, I have the following code:
qs_new = copy.copy(qs)
MyModel.objects.bulk_create(qs_new)
However, with this code I run into a duplicate primary key error. For now, the only workaround I can come up with is the following:
qs_new = copy.copy(qs)
for x in qs_new:
    x.id = None
MyModel.objects.bulk_create(qs_new)
Question: Can I implement this snippet without a loop?
I can't think of a way without a loop, but here is a suggestion:
# add all fields here except 'id'
qs = qs.values('field1', 'field2', 'field3')
new_qs = [MyModel(**i) for i in qs]
MyModel.objects.bulk_create(new_qs)
Note that bulk_create behaves differently depending on the underlying database. With Postgres you get the new primary keys set:
Support for setting primary keys on objects created using
bulk_create() when using PostgreSQL was added.
https://docs.djangoproject.com/en/1.10/ref/models/querysets/#django.db.models.query.QuerySet.bulk_create
You should, however, make sure that the objects you are creating either have no primary keys or only have keys that are not taken yet. In the latter case you should run the code that sets the PKs, as well as the bulk_create call, inside transaction.atomic().
Fetching the values explicitly as suggested by Shang Wang might be faster because only the given values are retrieved from the DB instead of fetching everything. If you have foreign key relations or m2m relations you might want to avoid simply throwing the complex instances into bulk_create but instead explicitly naming all attributes that are required when constructing a new MyModel instance.
Here is an example:
class MyModel(Model):
    name = TextField(...)
    related = ForeignKey(...)
    my_m2m = ManyToManyField(...)
In the case of MyModel above, you would want to preserve the ForeignKey relation by passing related_id (the PK of the related object) to the MyModel constructor instead of passing related.
With m2m relations, you might end up skipping bulk_create altogether because you need each specific new PK, the corresponding original PK (from the instance that was copied) and the m2m relations of that original instance. Then you would have to create new m2m relations with the new PK and these mappings.
# add all fields here except 'id'
qs = qs.values('name', 'related_id')
MyModel.objects.bulk_create([MyModel(**i) for i in qs])
Note for completeness:
If you have overridden save() on your model (or if you are inheriting from a 3rd party model with a custom save method), it won't be executed, and neither will any post_save handlers (yours or 3rd party).
I tried this, and you do need a loop to set the id to None; then it works. So in the end it may look like this:
qs_new = copy.copy(qs)
for q in qs_new:
    q.id = None
    # also, you can set other fields if you need
MyModel.objects.bulk_create(qs_new)
This works for me.
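If the Python loop itself is the concern, one alternative outside the ORM is to let the database copy the rows in a single INSERT ... SELECT statement. This is a different technique from bulk_create, and like bulk_create it skips save() and signals. A minimal sketch with the standard-library sqlite3 module; the table name mymodel and the columns name and related_id are assumptions that mirror the example model above, and clone_rows is a hypothetical helper.

```python
import sqlite3


def clone_rows(conn, min_id):
    """Copy every row with id >= min_id; the database assigns the new ids."""
    conn.execute(
        "INSERT INTO mymodel (name, related_id) "
        "SELECT name, related_id FROM mymodel WHERE id >= ? ORDER BY id",
        (min_id,),
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mymodel (id INTEGER PRIMARY KEY, name TEXT, related_id INTEGER)"
)
conn.executemany(
    "INSERT INTO mymodel (name, related_id) VALUES (?, ?)",
    [("a", 1), ("b", 2)],
)

# The id column is left out of the INSERT, so no primary key can collide.
clone_rows(conn, 1)
rows = conn.execute("SELECT id, name, related_id FROM mymodel ORDER BY id").fetchall()
print(rows)
```

As with the values() approach, many-to-many rows are not copied by this statement and would still need the old-PK to new-PK mapping described earlier.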

How to track changes when using update() in Django models

I'm trying to keep track of the changes whenever a field is changed.
I can see the changes in Django Admin History whenever I use the .save() method, but whenever I use the .update() method it does not record whatever I changed in my object.
I want to use update() because it can change multiple fields at the same time. It makes the code cleaner and more efficient (one query, one line...)
Right now I'm using this:
u = Userlist.objects.filter(username=user['username']).update(**user)
I can see all the changes when I do
u = Userlist.objects.get(username=user['username'])
u.lastname=lastname
u.save()
I'm also using django-simple-history to see the changes.
From the docs:
Finally, realize that update() does an update at the SQL level and,
thus, does not call any save() methods on your models, nor does it
emit the pre_save or post_save signals (which are a consequence of
calling Model.save())
update() works at the DB level, so Django admin cannot track changes when updates are applied via .update(...).
If you still want to track the changes on updates, you can use:
for user in Userlist.objects.filter(age__gt=40):
    user.lastname = 'new name'
    user.save()
This is, however, more expensive and is not advisable if the only benefit is tracking changes via the admin history.
Here's how I've handled this, and it has worked well so far:
from django.forms.models import model_to_dict

# get the current model instance to update
instance = Userlist.objects.get(username=username)
# convert the instance to a dict of its field values
obj_dict = model_to_dict(instance)
# create an unsaved instance of the model holding the old data
old_instance = Userlist(**obj_dict)
# update the row (there are multiple ways to do this)
Userlist.objects.filter(username=username).update(**user)
# fetch the updated object
updated_object = Userlist.objects.get(pk=instance.pk)
# get the names of the fields in the model class
my_model_fields = [field.name for field in Userlist._meta.get_fields()]
# keep only the fields whose values differ
differences = [
    field for field in my_model_fields
    if getattr(updated_object, field, None) != getattr(old_instance, field, None)
]
The differences variable will give you the list of fields that differ between the two instances. I also found it helpful to list the model fields I don't want to check for differences (e.g. we know updated_date will always change, so we don't need to keep track of it).
skip_diff_fields = ['updated_date']
my_model_fields = []
for field in Userlist._meta.get_fields():
    if field.name not in skip_diff_fields:
        my_model_fields.append(field.name)
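Since model_to_dict already gives plain dicts, the comparison can also be done on two dicts taken before and after the update(), without building a second model instance. A minimal sketch of that idea in plain Python; changed_fields is a hypothetical helper, and the sample dicts stand in for what model_to_dict would return.

```python
def changed_fields(old, new, skip=()):
    """Return the sorted names of keys whose values differ between two dicts.

    Keys listed in `skip` are ignored; a key missing on one side counts as None.
    """
    names = (set(old) | set(new)) - set(skip)
    return sorted(name for name in names if old.get(name) != new.get(name))


# Stand-ins for model_to_dict(instance) before and after update().
before = {"username": "jdoe", "lastname": "Doe", "updated_date": "2017-01-01"}
after = {"username": "jdoe", "lastname": "Smith", "updated_date": "2017-06-01"}

diff = changed_fields(before, after, skip=["updated_date"])
print(diff)
```

The resulting list of names can then be written to whatever history or audit record you keep, which update() will not do on its own.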

How to retrieve the SQL type of the primary key of a related field mapped by Django's ORM

I'm introspecting the types of some fields in a model. In particular, I'm interested in retrieving the RDBMS-dependent type, e.g. "VARCHAR(20)", and not the Django field class (django.db.models.CharField in this case).
I have problems with relationships, however, since the database mixes tables with varchar primary keys and tables with integer primary keys (so I cannot make any assumption).
So far I've tried to retrieve the field type with the following code:
# model is a django.db.models.Model class
for field in model._meta.get_fields(include_parents=False):
    try:
        # this code works for anything but relations
        ft = field.db_type(connection=connection)
    except:
        # I'm introspecting a relation -> I would like to retrieve the field type of the related object's pk
        ft = field.related_model.pk.db_type(connection=connection)
When dealing with a relationship, this fails with the following error:
'property' object has no attribute 'db_type'
When it fails, field.__class__ appears to be ManyToOneRel, if that is of any help. It's also worth noting that the code has to be compatible with the new _meta API in Django 1.8.
Try this for related fields:
ft = field.related_model._meta.pk.db_type(connection=connection)
model.pk is indeed a property that gets you the value of the pk for a model instance. model._meta.pk is the actual Field instance that is the primary key for that model.
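The error message follows directly from how Python properties behave: read on an instance, a property returns its value, but read on the class it returns the property object itself, which has no db_type. A toy illustration in plain Python; ToyField, ToyOptions, and ToyModel are hypothetical stand-ins that only mimic the shape of a Django model and its _meta.

```python
class ToyField:
    """Stand-in for a Field: knows its database column type."""

    def db_type(self, connection=None):
        return "varchar(20)"


class ToyOptions:
    """Stand-in for Model._meta: holds the primary key Field object."""

    def __init__(self, pk_field):
        self.pk = pk_field


class ToyModel:
    _meta = ToyOptions(ToyField())

    def __init__(self, code):
        self.code = code

    @property
    def pk(self):
        # On an instance this returns the primary key *value*.
        return self.code


class_level = ToyModel.pk             # the property object, not a field
instance_level = ToyModel("abc").pk   # the pk value of one instance
field_type = ToyModel._meta.pk.db_type()  # the field lives on _meta

print(type(class_level).__name__, instance_level, field_type)
```

This is why field.related_model.pk fails while field.related_model._meta.pk works: related_model is a class, so .pk on it never reaches a Field.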