how to update unique field for multiple entries in django - django

I have a simple Django model similar to this:
class TestModel(models.Model):
    test_field = LowerCaseCharField(max_length=20, null=False,
                                    verbose_name='Test Field')
    other_test_field = LowerCaseCharField(max_length=20, null=False, unique=True,
                                          verbose_name='Other Test Field')
Notice that other_test_field is a unique field. Now I also have some data stored that looks like this:
[
{
test_field: "object1",
other_test_field: "test1"
},
{
test_field: "object2",
other_test_field: "test2"
}
]
All I'm trying to do now is switch the other_test_field fields in these two objects, so that the first object has "test2" and the second object has "test1" for other_test_field. How do I accomplish that while preserving the uniqueness? Ultimately I'm trying to update data in bulk, not just swapping two fields.
Anything that updates data in serial is going to hit an IntegrityError due to unique constraint violation, and I don't know a good way to remove the unique constraint temporarily, for this one operation, before adding it back. Any suggestions?
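One common way around this is a two-phase update inside a single transaction: first move every affected row to a placeholder value that no real row uses, then write the final values. The sketch below shows the idea with the standard-library sqlite3 module rather than the Django ORM, so it runs on its own. The table layout mirrors the model above, while the function name and the `__tmp_<id>` placeholder scheme are assumptions made for illustration. In Django the same two passes could be wrapped in `transaction.atomic()`.

```python
import sqlite3


def reassign_unique_values(conn, new_values):
    """Assign new_values ({row id: new value}) without tripping the unique index."""
    with conn:  # one transaction: commits on success, rolls back on error
        # Phase 1: park every affected row on a placeholder no real row uses
        # (assumption: no stored value starts with "__tmp_").
        for row_id in new_values:
            conn.execute(
                "UPDATE test_model SET other_test_field = ? WHERE id = ?",
                (f"__tmp_{row_id}", row_id),
            )
        # Phase 2: the old values are free now, so write the final ones.
        for row_id, value in new_values.items():
            conn.execute(
                "UPDATE test_model SET other_test_field = ? WHERE id = ?",
                (value, row_id),
            )


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test_model ("
        "id INTEGER PRIMARY KEY, test_field TEXT NOT NULL, "
        "other_test_field TEXT NOT NULL UNIQUE)"
    )
    conn.executemany(
        "INSERT INTO test_model VALUES (?, ?, ?)",
        [(1, "object1", "test1"), (2, "object2", "test2")],
    )
    # Swap the two unique values.
    reassign_unique_values(conn, {1: "test2", 2: "test1"})
    return conn.execute(
        "SELECT test_field, other_test_field FROM test_model ORDER BY id"
    ).fetchall()
```

The placeholder has to fit the column (here max_length=20) and must never collide with real data, so a prefix that is reserved for this purpose is the safer choice.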

Related

Django, Create GIN index for child element in JSON Array field

I have a model that uses PostgreSQL and has field like this:
class MyModel(models.Model):
    json_field = models.JSONField(default=list)
This field contains data like this:
[
{"name": "AAAAA", "product": "11111"},
{"name": "BBBBB", "product": "22222"},
]
Now I want to index on the product key inside json_field, because it is used as an identifier. I tried to create a GinIndex like this:
class Meta:
    indexes = [
        GinIndex(name='product_json_idx', fields=['json_field->product'], opclasses=['jsonb_path_ops'])
    ]
When I try to create the migration, I get this error:
'indexes' refers to the nonexistent field 'json_field->product'.
How can I create a GinIndex that will be used for a child attribute in a JSON array?
Please don't use a JSONField [Django-doc] for well-structured data: if the structure is clear, like here where we have a list of objects where each object has a name and a product, it makes more sense to work with extra models, like:
class MyModel(models.Model):
    # …
    pass

class Product(models.Model):
    # …
    pass

class Entry(models.Model):
    my_model = models.ForeignKey(MyModel, on_delete=models.CASCADE)
    name = models.CharField(max_length=255)
    product = models.ForeignKey(Product, on_delete=models.CASCADE)
This will automatically add indexes on the ForeignKeys, and will also make querying simpler and usually more efficient.
While databases like PostgreSQL have indeed put effort into making JSON columns easier to query, aggregate, etc., it is usually still better to perform database normalization [wiki], especially since it offers more means for referential integrity, and a lot of aggregates are simpler on linear data.
If for example later a product is removed, it will require a lot of work to inspect the JSON blobs to remove that product. This is however a scenario that both Django and PostgreSQL databases cover with ON DELETE triggers and which will likely be more effective and safe when using the Django toolchain for this.

Soft delete with unique constraint in Django

I have models with this layout:
class SafeDeleteModel(models.Model):
    .....
    deleted = models.DateTimeField(editable=False, null=True)
    ......

class MyModel(SafeDeleteModel):
    safedelete_policy = SOFT_DELETE
    field1 = models.CharField(max_length=200)
    field2 = models.CharField(max_length=200)
    field3 = models.ForeignKey(MyModel3)
    field4 = models.ForeignKey(MyModel4)
    field5 = models.ForeignKey(MyModel5)

    class Meta:
        unique_together = [['field2', 'field3', 'field4', 'deleted'],]
The scenario here is that I never want users to delete data. Instead a delete will just hide records. However, I still want all non-soft-deleted records to respect unique key constraints. Basically, I want to have as many duplicated deleted records, but only a single unique un-deleted record can exist. So I was thinking to include "deleted" field (provided by django-safedelete library), but the issue becomes that Django's unique checks fail with "psycopg2.IntegrityError: duplicate key value violates unique constraint" for ['field2', 'field3', 'field4', 'deleted'] because NULL is not "equal to" NULL and it yields false in PostgreSQL.
Is there a way to enforce a unique_together constraint with the Django model layout as mine? Or is there a better idea to physically delete the record, then move it to an archive database, and if the user wants the record back, then software will look for the record in the archive and recreate it?
Yes, as of Django version 2.2 it is possible to use a UniqueConstraint with a condition.
Have a look at the documentation in this link: https://docs.djangoproject.com/en/2.2/ref/models/constraints/#uniqueconstraint
So your model would be something like this:
# requires: from django.db.models import Q
class MyModel(SafeDeleteModel):
    safedelete_policy = SOFT_DELETE
    field1 = models.CharField(max_length=200)
    field2 = models.CharField(max_length=200)
    field3 = models.ForeignKey(MyModel3)
    field4 = models.ForeignKey(MyModel4)
    field5 = models.ForeignKey(MyModel5)

    class Meta:
        constraints = [
            models.UniqueConstraint(
                fields=['field2', 'field3', 'field4'],
                # deleted is a nullable DateTimeField: NULL means "not deleted"
                condition=Q(deleted__isnull=True),
                name='unique_if_not_deleted')
        ]
If you are using an older version of Django that doesn't have this feature available, you can create a migration with a partial unique index (have a look at this question here: Postgresql: Conditionally unique constraint).
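To see what such a partial unique index does, here is a minimal sketch using the standard-library sqlite3 module (SQLite also supports partial indexes, so no PostgreSQL server is needed to try it). The table and column names follow the model above; storing `deleted` as text and the `insert` helper are simplifications assumed for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_model ("
    "id INTEGER PRIMARY KEY, field2 TEXT, field3 INTEGER, "
    "field4 INTEGER, deleted TEXT)"  # deleted stays NULL while the row is live
)
# Uniqueness applies only to rows that have not been soft-deleted.
conn.execute(
    "CREATE UNIQUE INDEX unique_if_not_deleted "
    "ON my_model (field2, field3, field4) WHERE deleted IS NULL"
)


def insert(field2, field3, field4, deleted=None):
    """Hypothetical helper: insert one row, live unless deleted is given."""
    conn.execute(
        "INSERT INTO my_model (field2, field3, field4, deleted) "
        "VALUES (?, ?, ?, ?)",
        (field2, field3, field4, deleted),
    )


# Any number of soft-deleted duplicates is allowed ...
insert("a", 1, 1, deleted="2020-01-01 00:00:00")
insert("a", 1, 1, deleted="2020-02-01 00:00:00")
# ... plus exactly one live row with the same key.
insert("a", 1, 1)

try:
    insert("a", 1, 1)  # a second live duplicate
    second_live_row_rejected = False
except sqlite3.IntegrityError:
    second_live_row_rejected = True
```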
As for your second question (would it be better to physically delete the record and move it elsewhere), it really depends on the characteristics of your application. If these soft-deletes don't happen very often and your table is still on the small side, I would keep the records in the same table for simplicity's sake, but if the number of records in the table starts growing fast and they affect the performance of the queries on this table then you should move the records elsewhere. You have to evaluate the trade-off between complexity and performance.

Prefetch of ForeignKey in Serializer.is_valid() in Django

I am attempting to create many model instances in one POST using a Mixin to support POST of arrays.
My use case will involve creating 1000s of model instances in each call. This very quickly becomes slow with DRF due to each model being created one at a time.
In an attempt to optimise the creation, I have changed to use bulk_create(). While this does result in a significant improvement, I noticed that for each model instance being created, a SELECT statement was being run to get the ForeignKey, which I traced to the call to serializer.is_valid().
As such, adding n instances would result in n SELECT queries to get the ForeignKey and 1 INSERT query.
As an example:
Models (using automatic ID fields):
class Customer(models.Model):
    name = models.CharField(max_length=100, blank=False)
    joined = models.DateTimeField(auto_now_add=True)

class Order(models.Model):
    customer = models.ForeignKey(Customer, on_delete=models.CASCADE)
    timestamp = models.DateTimeField()
    price = models.FloatField()
POST data to api/orders/:
[
{
"customer": 13,
...
},
{
"customer": 14,
...
},
{
"customer": 14,
...
}
]
This would result in 3 SELECT statements to get the Customer for each of the Orders, followed by 1 INSERT statement to push the data in.
Similar to prefetch_related() for queries when fetching data in GET requests, is there any way to avoid performing so many queries when deserializing and validating (such as setting the serializer to prefetch foreign keys)?
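The general idea behind avoiding the per-item lookups is to collect all referenced ids from the payload first and check them with a single `IN` query. The sketch below shows only that idea, using the standard-library sqlite3 module so that it runs on its own; the `unknown_customer_ids` function and the `customer` table name are hypothetical, and how to wire such a check into a DRF serializer is not shown here. In Django the equivalent single query could be something like `Customer.objects.filter(id__in=ids)`.

```python
import sqlite3


def unknown_customer_ids(conn, payload):
    """Return the customer ids in payload that have no row, using one SELECT."""
    wanted = {item["customer"] for item in payload}
    if not wanted:
        return set()
    placeholders = ", ".join("?" for _ in wanted)
    rows = conn.execute(
        f"SELECT id FROM customer WHERE id IN ({placeholders})", tuple(wanted)
    )
    return wanted - {row[0] for row in rows}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(13, "a"), (14, "b")])

# Three orders, two distinct customers: still only one SELECT.
payload = [{"customer": 13}, {"customer": 14}, {"customer": 14}]
missing = unknown_customer_ids(conn, payload)
```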

How to write this class for Django's data model (converting from Propel's YML format)

I am converting a web project that currently uses the Propel ORM, to a django project.
My first task is to 'port' the model schema to django's.
I have read the django docs, but they do not appear to be in enough detail. Case in point, how may I 'port' a (contrived) table defined in the Propel YML schema as follows:
demo_ref_country:
  code: { type: varchar(4), required: true, index: unique }
  name: { type: varchar(64), required: true, index: unique }
  geog_region_id: { type: integer, foreignTable: demo_ref_geographic_region, foreignReference: id, required: true, onUpdate: cascade, onDelete: restrict }
  ccy_id: { type: integer, foreignTable: demo_ref_currency_def, foreignReference: id, required: true, onUpdate: cascade, onDelete: restrict }
  flag_image_path: { type: varchar(64), required: true, default: '' }
  created_at: ~
  _indexes:
    idx_f1: [geog_region_id, ccy_id, created_at]
  _uniques:
    idxu_f1_key: [code, geog_region_id, ccy_id]
Here is my (feeble) attempt so far:
class Country(models.Model):
    code = models.CharField(max_length=4) # Erm, no index on this column .....
    name = models.CharField(max_length=64) # Erm, no index on this column .....
    geog_region_id = models.ForeignKey(GeogRegion) # Is this correct ? (how about ref integrity constraints ?
    ccy_id = models.ForeignKey(Currency) # Is this correct?
    flag_image_path = models.CharField(max_length=64) # How to set default on this col?
    created_at = models.DateTimeField() # Will this default to now() ?
    # Don't know how to specify indexes and unique indexes ....
[Edit]
To all those suggesting that I RTFM, I understand your frustration. It's just that the documentation is not very clear to me. It is probably a Pythonic way of documenting things, but coming from a C++ background, I feel the documentation could be improved to make it more accessible for people coming from different languages.
Case in point: the documentation merely states the class name and an **options parameter in the ctor, but doesn't tell you what the possible options are.
For example class CharField(max_length=None,[**options])
There is a line further up in the documentation that gives a list of permissible options, which are applicable to all field types.
However, the options are provided in the form:
Field.optionname
The (apparently implicit) link between a class property and a constructor argument was not clear to me. It appears that if a class has a property foo, then it means that you can pass an argument named foo to its constructor. Does that observation hold true for all Python classes?
The indexes are automatically generated for your references to other models (i.e. your foreign keys). In other words: your geog_region_id is correct (but it would be better style to call it geog_region).
You can set default values using the default field option.
from datetime import datetime

class Country(models.Model):
    code = models.CharField(max_length=4, unique=True)
    name = models.CharField(max_length=64)
    geog_region = models.ForeignKey(GeogRegion)
    ccy = models.ForeignKey(Currency, unique=True)
    flag_image_path = models.CharField(max_length=64, default='')
    # Pass the callable itself, not its result, so the default is computed per row
    created_at = models.DateTimeField(default=datetime.now)
(I'm no expert on propel's orm)
Django always tries to imitate the "cascade on delete" behaviour, so no need to specify that somewhere. By default all fields are required, unless specified differently.
For the datetime field see some more options here. All general field options here.
code = models.CharField(max_length=4) # Erm, no index on this column .....
name = models.CharField(max_length=64) # Erm, no index on this column .....
You can pass the unique = True keyword argument and value for both of the above.
geog_region_id = models.ForeignKey(GeogRegion) # Is this correct ? (how about ref integrity constraints ?
ccy_id = models.ForeignKey(Currency) # Is this correct?
The above lines are correct if GeogRegion and Currency are defined before this model. Otherwise put quotes around the model names. For e.g. models.ForeignKey("GeogRegion"). See documentation.
flag_image_path = models.CharField(max_length=64) # How to set default on this col?
Easy. Use the default = "/foo/bar" keyword argument and value.
created_at = models.DateTimeField() # Will this default to now() ?
Not automatically. You can do default = datetime.now (remember to first from datetime import datetime). Alternately you can specify auto_now_add = True.
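The difference between passing datetime.now and datetime.now() as a default can be shown without Django. The sketch below uses a hypothetical stand-in class, FieldSketch, which is not Django's implementation; it only mimics the behaviour of calling a default when it is callable.

```python
import time
from datetime import datetime


class FieldSketch:
    """Hypothetical stand-in for a model field; not Django's implementation."""

    def __init__(self, default=None):
        self.default = default

    def get_default(self):
        # A callable default is invoked each time; a plain value is reused.
        return self.default() if callable(self.default) else self.default


frozen = FieldSketch(default=datetime.now())  # evaluated once, right here
fresh = FieldSketch(default=datetime.now)     # evaluated on every get_default()

first_frozen, first_fresh = frozen.get_default(), fresh.get_default()
time.sleep(0.05)
second_frozen, second_fresh = frozen.get_default(), fresh.get_default()
```

With the called form, every row would get the timestamp from the moment the class was defined, which is why the uncalled form (or auto_now_add=True) is the one suggested above.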
# Don't know how to specify indexes and unique indexes ....
Take a look at unique_together.
You'll see that the document I have linked to is the same pointed out by others. I strongly urge you to read the docs and work through the tutorial.
I'm sorry, you haven't read the docs. A simple search for index, unique or default on the field reference page reveals exactly how to set those options.
Edit after comment I don't understand what you mean about multiple lines. Python doesn't care how many lines you use within brackets - so this:
name = models.CharField(unique=True, db_index=True)
is exactly the same as this:
name = models.CharField(
    unique=True,
    db_index=True
)
Django doesn't support multi-column primary keys, but if you just want a multi-column unique constraint, see unique_together.
class demo_ref_country(models.Model):
    code = models.CharField(max_length=4, db_index=True, null=False)
    name = models.CharField(max_length=64, db_index=True, null=False)
    geog_region = models.ForeignKey(geographic_region, null=False)
    ccy = models.ForeignKey(Currency_def, null=False)
    flag = models.ImageField(upload_to='path to directory', null=False, default="home")
    created_at = models.DateTimeField(auto_now_add=True, db_index=True)

    class Meta:
        # unique_together takes the field names as strings
        unique_together = ('code', 'geog_region', 'ccy')
You can set default values; the db_index parameter creates an index on the field it is set on. You can use unique=True for individual fields, while unique_together checks that the listed columns are unique in combination.
UPDATE: First of all, I advise you to read the documentation carefully, since Django gives you a lot of opportunities, and some of them have restrictions. Note that unique_together is enforced at the database level and is also checked by model validation (for example when you create or edit a record via the admin interface), so it covers the combination of columns however the data is inserted. If a single column must be unique on its own, use unique=True in the field definition, like:
code= models.CharField(max_length=4, db_index=True, null=False, unique=True)
ForeignKey fields are indexed by default but are not unique, so add unique=True (or use a OneToOneField) if you need that.
Django supports method overriding, so you can override the model's save and delete methods as you like.
Check it here. Django also allows you to write raw SQL queries; you can check it here.
As I explained, unique_together only covers the listed columns in combination. So don't forget to add unique=True to fields that must be unique individually.
unique_together also allows you to define several unique groups, such as:
unique_together = (('id','code'),('code','ccy','geog_region'))
That means, id and code must be unique together and code, ccy and geog_region must be unique together
UPDATE 2: Regarding your question update...
It is better to start from the tutorials. They explain the basics with good examples.
As for the doc style, let me give you an example, but if you start from the tutorials, it will be easier for you...
These are from the model documentation... Doc here
BooleanField
class BooleanField(**options)
That defines the basic structure of a database field: () is used, and it takes some parameters as options. That is the part:
models.BooleanField()
Since this is a field structure, the available options are defined as:
unique
Field.unique
So,
models.BooleanField(unique=True)
That is the general usage. Since unique is a basic option available to all field types, it is classified as Field.unique. Options available to a single field type, like symmetrical, which is a ManyToManyField option, are classified as ManyToManyField.symmetrical.
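The link between a documented option such as Field.unique and a constructor keyword is a convention of the class, not a rule of the Python language: it only works because the constructor accepts that keyword and stores it. A minimal sketch with hypothetical classes (BaseFieldSketch and CharFieldSketch are illustrations, not Django's actual code):

```python
class BaseFieldSketch:
    """Hypothetical base field: each accepted option becomes an attribute."""

    def __init__(self, unique=False, db_index=False, default=None):
        self.unique = unique
        self.db_index = db_index
        self.default = default


class CharFieldSketch(BaseFieldSketch):
    """Adds its own option and forwards the common ones via **options."""

    def __init__(self, max_length=None, **options):
        super().__init__(**options)
        self.max_length = max_length


code = CharFieldSketch(max_length=4, unique=True)

# An option the constructor chain does not know about is rejected.
try:
    CharFieldSketch(max_length=4, no_such_option=True)
    unknown_option_rejected = False
except TypeError:
    unknown_option_rejected = True
```

A class whose constructor does not accept a given name will not take it as an argument, even if instances happen to have an attribute of that name.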
For the queryset
class QuerySet([model=None])
That is used like a function, but you use it to filter a model, in other words, to write a filter query to execute... It has some methods, like filter...
filter(**kwargs)
Since this takes some kwargs, and, as I said before, it is used to filter your query results, the kwargs must be your model fields (database table fields), like:
MyModel.objects.filter(id=15)
What objects is, is defined in the docs, but it is a manager that helps you get related objects.
The docs contain good examples, but you have to start from the tutorials; that is what I can advise you...

How can i get a list of objects from a postgresql view table to display

This is the model for the view:
class QryDescChar(models.Model):
    iid_id = models.IntegerField()
    cid_id = models.IntegerField()
    cs = models.CharField(max_length=10)
    cid = models.IntegerField()
    charname = models.CharField(max_length=50)

    class Meta:
        db_table = u'qry_desc_char'
This is the SQL I use to create the view:
CREATE VIEW qry_desc_char AS
SELECT
    tbl_desc.iid_id,
    tbl_desc.cid_id,
    tbl_desc.cs,
    tbl_char.cid,
    tbl_char.charname
FROM tbl_desc, tbl_char
WHERE tbl_desc.cid_id = tbl_char.cid;
I don't know if I need a function in models or views or both. I want to get a list of objects from that database view to display it. This might be easy, but I'm new to Django and Python, so I'm having some problems.
Django 1.1 brought in a new feature that you might find useful. You should be able to do something like:
class QryDescChar(models.Model):
    iid_id = models.IntegerField()
    cid_id = models.IntegerField()
    cs = models.CharField(max_length=10)
    cid = models.IntegerField()
    charname = models.CharField(max_length=50)

    class Meta:
        db_table = u'qry_desc_char'
        managed = False
The documentation for the managed Meta class option is here. A relevant quote:
If False, no database table creation
or deletion operations will be
performed for this model. This is
useful if the model represents an
existing table or a database view that
has been created by some other means.
This is the only difference when
managed is False. All other aspects of
model handling are exactly the same as
normal.
Once that is done, you should be able to use your model normally. To get a list of objects you'd do something like:
qry_desc_char_list = QryDescChar.objects.all()
To actually get the list into your template you might want to look at generic views, specifically the object_list view.
If your RDBMS lets you create writable views and the view you create has exactly the same structure as the table Django would create, I guess that should work directly.
(This is an old question, but is an area that still trips people up and is still highly relevant to anyone using Django with a pre-existing, normalized schema.)
In your SELECT statement you will need to add a numeric "id" because Django expects one, even on an unmanaged model. You can use the row_number() window function to accomplish this if there isn't a guaranteed unique integer value on the row somewhere (and with views this is often the case).
In this case I'm using an ORDER BY clause with the window function, but you can do anything that's valid, and while you're at it you may as well use a clause that's useful to you in some way. Just make sure you do not try to use Django ORM dot references to relations because they look for the "id" column by default, and yours are fake.
Additionally I would consider renaming my output columns to something more meaningful if you're going to use it within an object. With those changes in place the query would look more like (of course, substitute your own terms for the "AS" clauses):
CREATE VIEW qry_desc_char AS
SELECT
    row_number() OVER (ORDER BY tbl_char.cid) AS id,
    tbl_desc.iid_id AS iid_id,
    tbl_desc.cid_id AS cid_id,
    tbl_desc.cs AS a_better_name,
    tbl_char.cid AS something_descriptive,
    tbl_char.charname AS name
FROM tbl_desc, tbl_char
WHERE tbl_desc.cid_id = tbl_char.cid;
Once that is done, in Django your model could look like this:
class QryDescChar(models.Model):
    iid_id = models.ForeignKey('WhateverIidIs', related_name='+',
                               db_column='iid_id', on_delete=models.DO_NOTHING)
    cid_id = models.ForeignKey('WhateverCidIs', related_name='+',
                               db_column='cid_id', on_delete=models.DO_NOTHING)
    a_better_name = models.CharField(max_length=10)
    something_descriptive = models.IntegerField()
    name = models.CharField(max_length=50)

    class Meta:
        managed = False
        db_table = 'qry_desc_char'
You don't need the "_id" part on the end of the id column names, because you can declare the column name on the Django model with something more descriptive using the "db_column" argument as I did above (but here I only use it to prevent Django from adding another "_id" to the end of cid_id and iid_id -- which added zero semantic value to your code). Also, note the "on_delete" argument. Django does its own thing when it comes to cascading deletes, and on an interesting data model you don't want this -- and when it comes to views you'll just get an error and an aborted transaction. Prior to Django 1.5 you have to patch it to make DO_NOTHING actually mean "do nothing" -- otherwise it will still try to (needlessly) query and collect all related objects before going through its delete cycle, and the query will fail, halting the entire operation.
Incidentally, I wrote an in-depth explanation of how to do this just the other day.
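The effect of adding row_number() as a synthetic id can be tried out without PostgreSQL. The sketch below uses the standard-library sqlite3 module, which supports window functions from SQLite 3.25 onwards (an assumption about the local build). The sample rows are made up for the demo, the original column names are kept, and iid_id is added to the ORDER BY so the numbering is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_char (cid INTEGER PRIMARY KEY, charname TEXT);
    CREATE TABLE tbl_desc (iid_id INTEGER, cid_id INTEGER, cs TEXT);

    -- Made-up sample data for the demo.
    INSERT INTO tbl_char VALUES (1, 'colour'), (2, 'shape');
    INSERT INTO tbl_desc VALUES (10, 1, 'red'), (11, 2, 'round'), (12, 1, 'blue');

    CREATE VIEW qry_desc_char AS
    SELECT
        row_number() OVER (ORDER BY tbl_char.cid, tbl_desc.iid_id) AS id,
        tbl_desc.iid_id AS iid_id,
        tbl_desc.cid_id AS cid_id,
        tbl_desc.cs AS cs,
        tbl_char.charname AS charname
    FROM tbl_desc
    JOIN tbl_char ON tbl_desc.cid_id = tbl_char.cid;
""")

# Every row of the view now carries a unique integer "id".
rows = conn.execute(
    "SELECT id, iid_id, charname FROM qry_desc_char ORDER BY id"
).fetchall()
```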
You are trying to fetch records from a view. This is not correct as a view does not map to a model, a table maps to a model.
You should use Django ORM to fetch QryDescChar objects. Please note that Django ORM will fetch them directly from the table. You can consult Django docs for extra() and select_related() methods which will allow you to fetch related data (data you want to get from the other table) in different ways.