Soft delete with unique constraint in Django - django

I have models with this layout:
class SafeDeleteModel(models.Model):
.....
deleted = models.DateTimeField(editable=False, null=True)
......
class MyModel(SafeDeleteModel):
safedelete_policy = SOFT_DELETE
field1 = models.CharField(max_length=200)
field2 = models.CharField(max_length=200)
field3 = models.ForeignKey(MyModel3)
field4 = models.ForeignKey(MyModel4)
field5 = models.ForeignKey(MyModel5)
class Meta:
unique_together = [['field2', 'field3', 'field4', 'deleted'],]
The scenario here is that I never want users to delete data. Instead a delete will just hide records. However, I still want all non-soft-deleted records to respect unique key constraints. Basically, I want to have as many duplicated deleted records, but only a single unique un-deleted record can exist. So I was thinking to include "deleted" field (provided by django-safedelete library), but the issue becomes that Django's unique checks fail with "psycopg2.IntegrityError: duplicate key value violates unique constraint" for ['field2', 'field3', 'field4', 'deleted'] because NULL is not "equal to" NULL and it yields false in PostgreSQL.
Is there a way to enforce a unique_together constraint with the Django model layout as mine? Or is there a better idea to physically delete the record, then move it to an archive database, and if the user wants the record back, then software will look for the record in the archive and recreate it?

Yes, as of Django version 2.2 it is possible to use a UniqueConstraint with a condition.
Have a look at the documentation in this link: https://docs.djangoproject.com/en/2.2/ref/models/constraints/#uniqueconstraint
So your model would be something like this:
class MyModel(SafeDeleteModel):
safedelete_policy = SOFT_DELETE
field1 = models.CharField(max_length=200)
field2 = models.CharField(max_length=200)
field3 = models.ForeignKey(MyModel3)
field4 = models.ForeignKey(MyModel4)
field5 = models.ForeignKey(MyModel5)
class Meta:
constraints = [
models.UniqueConstraint(
fields=['field2', 'field3', 'field4'],
condition=Q(deleted=False),
name='unique_if_not_deleted')
]
If you are using an older version of Django that doesn't have this feature available, you can create a migration with a partial unique index (have a look at this question here: Postgresql: Conditionally unique constraint).
As for your second question (would it be better to physically delete the record and move it elsewhere), it really depends on the characteristics of your application. If these soft-deletes don't happen very often and your table is still on the small side, I would keep the records in the same table for simplicity's sake, but if the number of records in the table starts growing fast and they affect the performance of the queries on this table then you should move the records elsewhere. You have to evaluate the trade-off between complexity and performance.

Related

Django - prefetch_related GenericForeignKey results and sort them

I have the below structure, where content modules, which are subclassed from a common model, are attached to pages via a 'page module' model that references them via a GenericForeignKey:
class SitePage(models.Model):
title = models.CharField()
# [etc..]
class PageModule(models.Model):
page = models.ForeignKey(SitePage, db_index=True, on_delete=models.CASCADE)
module_type = models.ForeignKey(ContentType, on_delete=models.CASCADE)
module_id = models.PositiveIntegerField()
module_object = GenericForeignKey("module_type", "module_id")
class CommonModule(models.Model):
published_time = models.DateTimeField()
class SingleImage(CommonModule):
title = models.CharField()
# [etc..]
class Article(CommonModule):
title = models.CharField()
# [etc..]
At the moment, populating pages from this results in a LOT of SQL queries. I want to fetch all the module contents (i.e. all the SingleImage and Article instances) for a given page in the most database-efficient manner.
I can't just do a straight prefetch_related because it "must be restricted to a homogeneous set of results", and I'm fetching multiple content types.
I can get each module type individually:
image_modules = PageModule.objects.filter(page=whatever_page, module_type=ContentType.objects.get_for_model(SingleImage)).prefetch_related('module_object_')
article_modules = PageModule.objects.filter(page=whatever_page, module_type=ContentType.objects.get_for_model(Article)).prefetch_related('module_object')
all_modules = image_modules | article_modules
But I need to sort them:
all_modules.order_by('module_object__published_time')
and I can't because:
"Field 'module_object' does not generate an automatic reverse relation
and therefore cannot be used for reverse querying"
... and I don't think I can add the recommended GenericRelation field to all the content models because there's already content in there.
So... can I do this at all? Or am I stuck?
Following the advice in the comments above I eventually arrived at this code (from 2012!) that has roughly halved the number of queries:
https://gist.github.com/justinfx/3095246
However, as I noted above, it's done that at the expense of creating some fairly inefficient WHERE pk IN() queries, so I've not actually saved much time in total.

unique_together in Django doesn't work

unique_together doesn't work, it only set the unique constraints on the first field and ignore the second field. Is there any way to enforce unique constraints?
class BaseModel(models.Model):
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
deleted = models.DateTimeField(db_index=True, null=True, blank=True)
last_modified_at = models.DateTimeField(auto_now=True)
class Meta:
abstract = True
class Book(BaseModel):
first_form_number = models.CharField(max_length=8)
class Meta:
unique_together = (("first_form_number", "deleted"),)
Your models work correctly in that extent that the right unique index is created:
$ python manage.py sqlmigrate app 0001_initial
...
CREATE UNIQUE INDEX "app_base_slug_version_a455c5b7_uniq" ON "app_base" ("slug", "version");
...
(expected like the name of your application is "app")
I must roughly agree with user3541631's answer. It depends on the database in general, but all four db engines supported directly by Django are similar. They expect that "nulls are distinct in a UNIQUE column" (see NULL Handling in SQLite Versus Other Database Engines)
I verified your problem with and without null:
class Test(TestCase):
def test_without_null(self):
timestamp = datetime.datetime(2017, 8, 25, tzinfo=pytz.UTC)
book_1 = Book.objects.create(deleted=timestamp, first_form_number='a')
with self.assertRaises(django.db.utils.IntegrityError):
Book.objects.create(deleted=timestamp, first_form_number='a')
def test_with_null(self):
# this test fails !!! (and a duplicate is created)
book_1 = Book.objects.create(first_form_number='a')
with self.assertRaises(django.db.utils.IntegrityError):
Book.objects.create(first_form_number='a')
A solution is possible for PostgreSQL if you are willing to manually write a migration to create two special partial unique indexes:
CREATE UNIQUE INDEX book_2col_uni_idx ON app_book (first_form_number, deleted)
WHERE deleted IS NOT NULL;
CREATE UNIQUE INDEX book_1col_uni_idx ON app_book (first_form_number)
WHERE deleted IS NULL;
See:
Answer for Create unique constraint with null columns
Django docs Writing database migrations
Django docs migrations.RunSQL(sql)
depending on your database, it is possible that NULL isn't equal to any other NULL.
Therefore the rows you create are not the same, if one of the values is NULL, will be unique only by the non null field, in your case 'first_form_number'.
Also take in consideration that is case sensitive so "char" and "Char" are not the same.
I had a similar situation and I did my own check by overriding the save method on the model.
You check if exist in the database, but also exclude the current instance, in case of updating, not to compare with itself..
if not deleted:
exists = model.objects.exclude(pk=instance.pk).filter(first_form_number__iexact=first_form_number).exists()
Make sure you actually extend the inherited Meta class, rather than defining your own Meta class (which is ignored by Django):
class Meta(BaseModel.Meta):
unique_together = (("first_form_number", "deleted"),)

Django JOIN query without foreign key

Is there a way in Django to write a query using the ORM, not raw SQL that allows you to JOIN on another table without there being a foreign key? Looking through the documentation it appears in order for the One to One relationship to work there must be a foreign key present?
In the models below I want to run a query with a JOIN on UserActivity.request_url to UserActivityLink.url.
class UserActivity(models.Model):
id = models.IntegerField(primary_key=True)
last_activity_ip = models.CharField(max_length=45L, blank=True)
last_activity_browser = models.CharField(max_length=255L, blank=True)
last_activity_date = models.DateTimeField(auto_now_add=True)
request_url = models.CharField(max_length=255L, blank=True)
session_id = models.CharField(max_length=255L)
users_id = models.IntegerField()
class Meta:
db_table = 'user_activity'
class UserActivityLink(models.Model):
id = models.IntegerField(primary_key=True)
url = models.CharField(max_length=255L, blank=True)
url_description = models.CharField(max_length=255L, blank=True)
type = models.CharField(max_length=45L, blank=True)
class Meta:
db_table = 'user_activity_link'
The link table has a more descriptive translation of given URLs in the system, this is needed for some reporting the system will generate.
I've tried creating the foreign key from UserActivity.request_url to UserActivityLink.url but it fails with the following error: ERROR 1452: Cannot add or update a child row: a foreign key constraint fails
No, there isn't an effective way unfortunately.
The .raw() is there for this exact thing. Even if it could it probably would be a lot slower than raw SQL.
There is a blogpost here detailing how to do it with query.join() but as they themselves point out. It's not best practice.
Just reposting some related answer, so everyone could see it.
Taken from here: Most efficient way to use the django ORM when comparing elements from two lists
First problem: joining unrelated models
I'm assuming that your Model1 and Model2 are not related,
otherwise you'd be able to use Django's related objects
interface. Here are two approaches you could take:
Use extra and a SQL subquery:
Model1.objects.extra(where = ['field in (SELECT field from myapp_model2 WHERE ...)'])
Subqueries are not handled very efficiently in some databases
(notably MySQL) so this is probably not as good as #2 below.
Use a raw SQL query:
Model1.objects.raw('''SELECT * from myapp_model1
INNER JOIN myapp_model2
ON myapp_model1.field = myapp_model2.field
AND ...''')
Second problem: enumerating the result
Two approaches:
You can enumerate a query set in Python using the built-in enumerate function:
enumerate(Model1.objects.all())
You can use the technique described in this answer to do the enumeration in MySQL. Something like this:
Model1.objects.raw('''SELECT *, #row := #row + 1 AS row
FROM myapp_model1
JOIN (SELECT #row := 0) rowtable
INNER JOIN myapp_model2
ON myapp_model1.field = myapp_model2.field
AND ...''')
The Django ForeignKey is different from SQL ForeignKey. Django ForeignKey just represent a relation, it can specify whether to use database constraints.
Try this:
request_url = models.ForeignKey(UserActivityLink, to_field='url_description', null=True, on_delete=models.SET_NULL, db_constraint=False)
Note that the db_constraint=False is required, without it Django will build a SQL like:
ALTER TABLE `user_activity` ADD CONSTRAINT `xxx` FOREIGN KEY (`request_url`) REFERENCES `user_activity_link` (`url_description`);"
I met the same problem, after a lot of research, I found the above method.
Hope it helps.

Django ManyToManyField not present in Sqlite3

I'm new to Django and I have some issues with a ManyToMany relationship.
I work on a blastn automatisation and here are my classes:
class Annotation(models.Model):
sequence = models.IntegerField()
annotation = models.TextField()
start = models.IntegerField()
end = models.IntegerField()
class Blast(models.Model):
sequence = models.ManyToManyField(Annotation, through="AnnotBlast")
expectValue = models.IntegerField()
class AnnotBlast(models.Model):
id_blast = models.ForeignKey(Blast, to_field="id")
id_annot = models.ForeignKey(Annotation, to_field="id")
class Hit(models.Model):
id_hit = models.ForeignKey(Blast, to_field="id")
length = models.IntegerField()
evalue = models.IntegerField()
start_seq = models.IntegerField()
end_seq = models.IntegerField()
In a view, I want to access to Annotation's data from the rest of the model via this many to many field and then apply filters based on a form. But when I do a syncdb , the "sequence" field of the Blast class disappear :
In Sqlite3 :
.schema myApp_blast
CREATE TABLE "myApp_blast" (
"id" integer not null primary key,
"expectValue" integer not null
);
So I can't load data in this table as I want. I don't understand why this field disappear during the syncdb. How can I do to link the first class to the others (and then be able to merge data in a template) ?
A ManyToManyField isn't itself a column in the database. It's represented only by an element in the joining table, which you have here defined explicitly as AnnotBlast (note that since you're not defining any extra fields on the relationship, you didn't actually need to define a through table - Django would have done it automatically if you hadn't).
So to add data to your models, you add data to the AnnotBlast table pointing at the relevant Blast and Annotation rows.
For a many-to-many relationship an intermediate join table is created, see documentation here: https://docs.djangoproject.com/en/1.2/ref/models/fields/#id1

How can i get a list of objects from a postgresql view table to display

this is a model of the view table.
class QryDescChar(models.Model):
iid_id = models.IntegerField()
cid_id = models.IntegerField()
cs = models.CharField(max_length=10)
cid = models.IntegerField()
charname = models.CharField(max_length=50)
class Meta:
db_table = u'qry_desc_char'
this is the SQL i use to create the table
CREATE VIEW qry_desc_char as
SELECT
tbl_desc.iid_id,
tbl_desc.cid_id,
tbl_desc.cs,
tbl_char.cid,
tbl_char.charname
FROM tbl_desC,tbl_char
WHERE tbl_desc.cid_id = tbl_char.cid;
i dont know if i need a function in models or views or both. i want to get a list of objects from that database to display it. This might be easy but im new at Django and python so i having some problems
Django 1.1 brought in a new feature that you might find useful. You should be able to do something like:
class QryDescChar(models.Model):
iid_id = models.IntegerField()
cid_id = models.IntegerField()
cs = models.CharField(max_length=10)
cid = models.IntegerField()
charname = models.CharField(max_length=50)
class Meta:
db_table = u'qry_desc_char'
managed = False
The documentation for the managed Meta class option is here. A relevant quote:
If False, no database table creation
or deletion operations will be
performed for this model. This is
useful if the model represents an
existing table or a database view that
has been created by some other means.
This is the only difference when
managed is False. All other aspects of
model handling are exactly the same as
normal.
Once that is done, you should be able to use your model normally. To get a list of objects you'd do something like:
qry_desc_char_list = QryDescChar.objects.all()
To actually get the list into your template you might want to look at generic views, specifically the object_list view.
If your RDBMS lets you create writable views and the view you create has the exact structure than the table Django would create I guess that should work directly.
(This is an old question, but is an area that still trips people up and is still highly relevant to anyone using Django with a pre-existing, normalized schema.)
In your SELECT statement you will need to add a numeric "id" because Django expects one, even on an unmanaged model. You can use the row_number() window function to accomplish this if there isn't a guaranteed unique integer value on the row somewhere (and with views this is often the case).
In this case I'm using an ORDER BY clause with the window function, but you can do anything that's valid, and while you're at it you may as well use a clause that's useful to you in some way. Just make sure you do not try to use Django ORM dot references to relations because they look for the "id" column by default, and yours are fake.
Additionally I would consider renaming my output columns to something more meaningful if you're going to use it within an object. With those changes in place the query would look more like (of course, substitute your own terms for the "AS" clauses):
CREATE VIEW qry_desc_char as
SELECT
row_number() OVER (ORDER BY tbl_char.cid) AS id,
tbl_desc.iid_id AS iid_id,
tbl_desc.cid_id AS cid_id,
tbl_desc.cs AS a_better_name,
tbl_char.cid AS something_descriptive,
tbl_char.charname AS name
FROM tbl_desc,tbl_char
WHERE tbl_desc.cid_id = tbl_char.cid;
Once that is done, in Django your model could look like this:
class QryDescChar(models.Model):
iid_id = models.ForeignKey('WhateverIidIs', related_name='+',
db_column='iid_id', on_delete=models.DO_NOTHING)
cid_id = models.ForeignKey('WhateverCidIs', related_name='+',
db_column='cid_id', on_delete=models.DO_NOTHING)
a_better_name = models.CharField(max_length=10)
something_descriptive = models.IntegerField()
name = models.CharField(max_length=50)
class Meta:
managed = False
db_table = 'qry_desc_char'
You don't need the "_id" part on the end of the id column names, because you can declare the column name on the Django model with something more descriptive using the "db_column" argument as I did above (but here I only it to prevent Django from adding another "_id" to the end of cid_id and iid_id -- which added zero semantic value to your code). Also, note the "on_delete" argument. Django does its own thing when it comes to cascading deletes, and on an interesting data model you don't want this -- and when it comes to views you'll just get an error and an aborted transaction. Prior to Django 1.5 you have to patch it to make DO_NOTHING actually mean "do nothing" -- otherwise it will still try to (needlessly) query and collect all related objects before going through its delete cycle, and the query will fail, halting the entire operation.
Incidentally, I wrote an in-depth explanation of how to do this just the other day.
You are trying to fetch records from a view. This is not correct as a view does not map to a model, a table maps to a model.
You should use Django ORM to fetch QryDescChar objects. Please note that Django ORM will fetch them directly from the table. You can consult Django docs for extra() and select_related() methods which will allow you to fetch related data (data you want to get from the other table) in different ways.