Django unique_together on postgres: enforced by ORM or DB?

As I look at the sqlall output for a models.py that contains unique_together declarations, I don't notice anything that looks like enforcement.
In my mind, I can imagine that this knowledge might help the database optimize a query, like so:
"I have already found a row with spam 42 and eggs 91, so in my search for eggs 91, I no longer need to check rows with spam 42."
Am I right that this knowledge can be helpful to the DB?
Am I right that it is not enforced this way (i.e., that it is only enforced by the ORM)?
If yes to both, is this a flaw?

Here's an example of how this should look. Assume that you have this model:
class UserConnectionRequest(models.Model):
    sender = models.ForeignKey(UserProfile, related_name='sent_requests')
    recipient = models.ForeignKey(UserProfile, related_name='received_requests')
    connection_type = models.PositiveIntegerField(verbose_name=_(u'Connection type'),
                                                  choices=UserConnectionType.choices())

    class Meta:
        unique_together = (("sender", "recipient", "connection_type"),)
Running sqlall returns:
CREATE TABLE "users_userconnectionrequest" (
"id" serial NOT NULL PRIMARY KEY,
"sender_id" integer NOT NULL REFERENCES "users_userprofile" ("id") DEFERRABLE INITIALLY DEFERRED,
"recipient_id" integer NOT NULL REFERENCES "users_userprofile" ("id") DEFERRABLE INITIALLY DEFERRED,
"connection_type" integer,
UNIQUE ("sender_id", "recipient_id", "connection_type")
)
When this model is properly synced to the DB, it has a unique constraint (Postgres):
CONSTRAINT users_userconnectionrequest_sender_id_2eec26867fa22bfa_uniq
UNIQUE (sender_id, recipient_id, connection_type),
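Because the constraint lives in the database, a duplicate insert fails at the DB level even if it never goes through model validation. A minimal sketch, assuming the model above and two existing UserProfile rows:

from django.db import IntegrityError, transaction

alice = UserProfile.objects.get(pk=1)   # assumed existing profiles
bob = UserProfile.objects.get(pk=2)

UserConnectionRequest.objects.create(sender=alice, recipient=bob, connection_type=1)
try:
    with transaction.atomic():
        # Same (sender, recipient, connection_type): Postgres rejects it,
        # even though full_clean()/validate_unique() was never called.
        UserConnectionRequest.objects.create(sender=alice, recipient=bob, connection_type=1)
except IntegrityError as exc:
    print(exc)  # duplicate key value violates unique constraint ...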

Find and delete similar records in Postgres

TL;DR version: find and delete rows in Postgres with the same date but different times, leaving one record per date.
Long read:
At some point we migrated our app's backend to a newer version - this is a Django application - from Python 2 / Django 1.8 to Python 3 / Django 4, and with this update we also changed the backend timezone from UTC+2 to UTC+3. Now strange things happen: records which were previously read successfully from the DB with the queryset StatChildVisit.objects.filter(date=day, garden_group=garden_group) (day is a Python date, not a datetime) now return an empty queryset after the update, although records for that day are still in the DB. What's more, newly created records have a different time in them: records created with the old timezone look like 2022-12-28 22:00:00.000000 +00:00, while new records look like 2022-12-28 21:00:00.000000 +00:00.
It seems the bug happened because the date field in the Django model was declared as a DateTimeField:
class StatChildVisit(models.Model):
    child = models.ForeignKey(Child, on_delete=models.CASCADE)
    date = models.DateTimeField(default=timezone.now)
    visit = models.BooleanField(_('Atended'), default=True)
    disease = models.BooleanField(_('Sick'), default=False)
    other_approved = models.BooleanField(_('Other approved'), default=False)
    garden_group = models.ForeignKey(GardenGroup, verbose_name=_('Garden group'), editable=False, blank=True, null=True, on_delete=models.CASCADE)
    rossecure_visit = models.ForeignKey('rossecure.Visits', editable=False, null=True, blank=True, on_delete=models.CASCADE)

    class Meta:
        verbose_name = _('Attendence')
        verbose_name_plural = _('Attendence')
        index_together = (
            ('date', 'garden_group'),
        )
        unique_together = (
            ('date', 'child'),
        )
All records are always created with a date only (not a datetime) passed to the constructor.
So we decided to migrate this field to a DateField, but after the migration the field type in the DB is still 'timestamp with time zone'. On top of that, because this is a production database, users who found that some data appeared to be lost partially recreated the records.
So now we have multiple records for the same day but with different times, which need to be deleted, and because of the constraints the table column cannot be altered with ALTER TABLE reports_statchildvisit ALTER COLUMN date TYPE date;
Because the table has a rather large record count (about 4 million rows), I think the problem should be solved on the SQL side, not the Django side. My plan is to delete the duplicates and then change the column type to date; a rough sketch of what I have in mind follows the DDL below.
I've tried to alter records with
update reports_statchildvisit
set date = date(date) + '21:00:00'::time
but because I tried that after users had already created similar records, the script failed with ERROR: duplicate key value violates unique constraint.
UPD: DDL on the SQL side looks like this:
create table public.reports_statchildvisit
(
    id serial
        primary key,
    date timestamp with time zone not null,
    visit boolean not null,
    disease boolean not null,
    child_id integer not null
        constraint reports_statchil_child_id_30fbdf92a34d3fea_fk_children_child_id
            references public.children_child
            deferrable initially deferred,
    garden_group_id integer
        constraint repor_garden_group_id_ef61dd52421b5d2_fk_project_gardengroup_id
            references public.project_gardengroup
            deferrable initially deferred,
    other_approved boolean not null,
    rossecure_visit_id integer
        constraint repo_rossecure_visit_id_488614f59207663f_fk_rossecure_visits_id
            references public.rossecure_visits
            deferrable initially deferred,
    constraint reports_statchildvisit_date_3d6916481fe1e727_uniq
        unique (date, child_id)
);

alter table public.reports_statchildvisit
    owner to django;

create index reports_statchildvisit_10e12719
    on public.reports_statchildvisit (garden_group_id);

create index reports_statchildvisit_42d2af72
    on public.reports_statchildvisit (rossecure_visit_id);

create index reports_statchildvisit_date_66064e65c46d4137_idx
    on public.reports_statchildvisit (date, garden_group_id);

create index reports_statchildvisit_f36263a3
    on public.reports_statchildvisit (child_id);
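The dedup statement I have in mind would be roughly the following - a sketch only, assuming the lowest id per (child_id, calendar day) should survive and that truncating the timestamps in UTC groups the 21:00/22:00 variants of the same day together (I'd verify that assumption and test on a copy of the table first):

BEGIN;

-- Keep the lowest id per (child_id, day), delete the rest.
DELETE FROM reports_statchildvisit
WHERE id NOT IN (
    SELECT min(id)
    FROM reports_statchildvisit
    GROUP BY child_id, (date AT TIME ZONE 'UTC')::date
);

-- With the duplicates gone, the type change should go through.
ALTER TABLE reports_statchildvisit
    ALTER COLUMN date TYPE date
    USING (date AT TIME ZONE 'UTC')::date;

COMMIT;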

django orm: select_related, fooling reverse foreign key with a fake foreign key addition to the model, what can go wrong?

I am trying to learn how to use Django's ORM for more advanced queries, instead of using raw SQL.
select_related makes joins to reduce database hits; in principle it could make the joins that I would otherwise write by hand.
But there is a problem: it doesn't use reverse foreign key relationships to build the SQL. For the schema I have this is a nuisance. I have found what looks like a surprisingly easy workaround, and I'm worried that it will be incorrect in ways I don't understand.
I'm using a legacy DB, and it is read-only and therefore unmanaged by the ORM, so there are no consequences for cascade settings. I made models for its tables via manage.py inspectdb, which did a very good job; I have a 6000-line model file and only needed to spend ten minutes fixing things.
One tweak I made was a hack to fool select_related by creating a foreign key relationship which is actually a reverse FK relationship. This means I have two fields in my model pointing to the same database column: the primary key (correct) and now also a foreign key. I exploit the "new" field in select_related.
The main table for my query is Service. It has foreign key relationships to a number of other tables, and that works well with select_related.
The column servrecid_1 is my artificial addition. The other two columns are two of the genuine FKs; there are more.
class Service(models.Model):
    servrecid = models.AutoField(db_column='ServRecID', primary_key=True)
    servrecid_1 = models.ForeignKey('RecServ', models.DO_NOTHING, to_field='servrecid',
                                    db_column='ServRecID')  # Field name made lowercase.
    visitrecordid = models.ForeignKey('Visit', models.DO_NOTHING, db_column='VisitRecordID', blank=True,
                                      null=True)  # Field name made lowercase.
    itemno = models.ForeignKey(Fees, models.DO_NOTHING, db_column='ItemNo', to_field='itemno')
    ...

    class Meta:
        managed = False  # same for all the models


class RecServ(models.Model):
    allocationid = models.AutoField(db_column='AllocationID', primary_key=True)  # Field name made lowercase.
    servrecid = models.ForeignKey('Service', models.DO_NOTHING, db_column='ServRecID')  # Field name made lowercase.
    receiptno = models.ForeignKey(Receipt, models.DO_NOTHING, db_column='ReceiptNo')  # Field name made lowercase.
(There are more relations than I show in the snippets above)
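(For comparison, without the fake FK the reverse side can only be reached through the default reverse accessor with prefetch_related, which issues a second query instead of a join - roughly like this, where recserv_set is the accessor Django generates for RecServ.servrecid:)

# Sketch: reverse traversal without the hack - a separate query for the
# REC_SERV/RECEIPT rows, not an extra join in the main query.
q = (models.Service.objects
     .select_related('itemno')
     .prefetch_related('recserv_set__receiptno')[:5])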
With this, I can now do queries like this:
q = (models.Service.objects
     .select_related('visitrecordid__servdoctor')
     .select_related('visitrecordid__invoiceto')
     .select_related('visitrecordid__patientno')
     .select_related('itemno')
     .select_related('servrecid_1__receiptno')
     .all()[:5])
which creates a query with these joins:
... FROM [SERVICE] INNER JOIN [REC_SERV] ON ([SERVICE].[ServRecID] = [REC_SERV].[ServRecID])
INNER JOIN [RECEIPT] ON ([REC_SERV].[ReceiptNo] = [RECEIPT].[ReceiptNo])
LEFT OUTER JOIN [VISIT] ON ([SERVICE].[VisitRecordID] = [VISIT].[VisitRecordID])
LEFT OUTER JOIN [CM_PATIENT] ON ([VISIT].[PatientNo] = [CM_PATIENT].[PATIENT_ID])
LEFT OUTER JOIN [DOCTOR] ON ([VISIT].[ServDoctor] = [DOCTOR].[DoctorCode])
LEFT OUTER JOIN [INVOICETO] ON ([VISIT].[InvoiceTo] = [INVOICETO].[InvoiceTo])
INNER JOIN [FEES] ON ([SERVICE].[ItemNo] = [FEES].[ItemNo])
The first join only appears because of my false FK. The SQL looks fine to me, so I think I have solved my problem.
Should this actually work? What will happen now with this mutual foreign key relationship between both tables?

unique_together in Django doesn't work

unique_together doesn't work; it only sets the unique constraint on the first field and ignores the second field. Is there any way to enforce the unique constraint?
class BaseModel(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    deleted = models.DateTimeField(db_index=True, null=True, blank=True)
    last_modified_at = models.DateTimeField(auto_now=True)

    class Meta:
        abstract = True


class Book(BaseModel):
    first_form_number = models.CharField(max_length=8)

    class Meta:
        unique_together = (("first_form_number", "deleted"),)
Your models work correctly to the extent that the right unique index is created:
$ python manage.py sqlmigrate app 0001_initial
...
CREATE UNIQUE INDEX "app_base_slug_version_a455c5b7_uniq" ON "app_base" ("slug", "version");
...
(expected, assuming the name of your application is "app")
I must roughly agree with user3541631's answer. It depends on the database in general, but all four db engines supported directly by Django are similar. They expect that "nulls are distinct in a UNIQUE column" (see NULL Handling in SQLite Versus Other Database Engines)
I verified your problem with and without null:
class Test(TestCase):
    def test_without_null(self):
        timestamp = datetime.datetime(2017, 8, 25, tzinfo=pytz.UTC)
        book_1 = Book.objects.create(deleted=timestamp, first_form_number='a')
        with self.assertRaises(django.db.utils.IntegrityError):
            Book.objects.create(deleted=timestamp, first_form_number='a')

    def test_with_null(self):
        # this test fails !!! (and a duplicate is created)
        book_1 = Book.objects.create(first_form_number='a')
        with self.assertRaises(django.db.utils.IntegrityError):
            Book.objects.create(first_form_number='a')
A solution is possible for PostgreSQL if you are willing to manually write a migration to create two special partial unique indexes:
CREATE UNIQUE INDEX book_2col_uni_idx ON app_book (first_form_number, deleted)
WHERE deleted IS NOT NULL;
CREATE UNIQUE INDEX book_1col_uni_idx ON app_book (first_form_number)
WHERE deleted IS NULL;
See:
Answer for Create unique constraint with null columns
Django docs Writing database migrations
Django docs migrations.RunSQL(sql)
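Put together, the migration could look roughly like this (a sketch; the app label, previous migration name and index names are assumptions):

from django.db import migrations


class Migration(migrations.Migration):
    dependencies = [
        ('app', '0001_initial'),  # assumed previous migration
    ]

    operations = [
        migrations.RunSQL(
            sql="""
                CREATE UNIQUE INDEX book_2col_uni_idx ON app_book (first_form_number, deleted)
                    WHERE deleted IS NOT NULL;
                CREATE UNIQUE INDEX book_1col_uni_idx ON app_book (first_form_number)
                    WHERE deleted IS NULL;
            """,
            reverse_sql="""
                DROP INDEX book_2col_uni_idx;
                DROP INDEX book_1col_uni_idx;
            """,
        ),
    ]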
Depending on your database, it is possible that NULL isn't considered equal to any other NULL.
Therefore the rows you create are not treated as duplicates: if one of the values is NULL, uniqueness is effectively enforced only on the non-null field, in your case 'first_form_number'.
Also take into consideration that the comparison is case sensitive, so "char" and "Char" are not the same.
I had a similar situation and I did my own check by overriding the save method on the model.
You check whether a matching row exists in the database, but also exclude the current instance, so that on update the row is not compared with itself:
if not self.deleted:
    exists = type(self).objects.exclude(pk=self.pk).filter(
        first_form_number__iexact=self.first_form_number).exists()
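A fuller sketch of that save override (assuming the Book model from the question, and that a duplicate among non-deleted rows should raise a ValidationError):

from django.core.exceptions import ValidationError

class Book(BaseModel):
    first_form_number = models.CharField(max_length=8)

    def save(self, *args, **kwargs):
        # Only enforce uniqueness among non-deleted rows; exclude self so
        # updating an existing row doesn't collide with itself.
        if not self.deleted:
            duplicate = (Book.objects.exclude(pk=self.pk)
                         .filter(first_form_number__iexact=self.first_form_number,
                                 deleted__isnull=True)
                         .exists())
            if duplicate:
                raise ValidationError('first_form_number must be unique among non-deleted books.')
        super().save(*args, **kwargs)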
Make sure you actually extend the inherited Meta class, rather than defining a brand-new Meta class (which does not pick up anything from BaseModel.Meta):
class Meta(BaseModel.Meta):
    unique_together = (("first_form_number", "deleted"),)

How can I set a table constraint "deferrable initially deferred" in django model?

I am trying to set a constraint on a table model in Django with a PostgreSQL database.
I can do it via PostgreSQL with this statement:
ALTER TABLE public.mytable ADD CONSTRAINT "myconstraint" UNIQUE(field1, field2) DEFERRABLE INITIALLY DEFERRED;
But I want to do it via django model.
Reading the django official documentation I have not found anything related.
I need something like this:
class Meta:
    unique_together = (('field1', 'field2',), DEFERRABLE INITIALLY DEFERRED)
Is it possible to do something like this?
I would do this via a single migration. First programmatically get the unique constraint name, then drop and re-add it (since altering it seems to only work for FK constraints, not unique constraints). Add a reverse migration that undoes this too.
from django.db import migrations, connection


def _make_deferrable(apps, schema_editor):
    """
    Change the unique constraint to be deferrable
    """
    # Get the db name of the constraint
    MyModel = apps.get_model('myapp', 'MyModel')
    CONSTRAINT_NAME = schema_editor._constraint_names(MyModel,
                                                      ['col1', 'col2'],
                                                      unique=True)[0]
    TABLE_NAME = MyModel._meta.db_table

    # Drop then re-add with deferrable, as ALTER doesn't seem to work for unique constraints in psql
    with schema_editor.connection.create_cursor() as curs:
        curs.execute(
            f'ALTER TABLE {TABLE_NAME} DROP CONSTRAINT "{CONSTRAINT_NAME}";'
        )
        curs.execute(
            f'ALTER TABLE {TABLE_NAME} ADD CONSTRAINT'
            f' {CONSTRAINT_NAME}'
            f' UNIQUE (col1, col2) DEFERRABLE INITIALLY DEFERRED;'
        )


def _unmake_deferrable(apps, schema_editor):
    """
    Reverse the unique constraint to be not deferrable
    """
    # Get the db name of the unique constraint
    MyModel = apps.get_model('myapp', 'MyModel')
    CONSTRAINT_NAME = schema_editor._constraint_names(MyModel,
                                                      ['col1', 'col2'],
                                                      unique=True)[0]
    TABLE_NAME = MyModel._meta.db_table

    with schema_editor.connection.create_cursor() as curs:
        curs.execute(
            f'ALTER TABLE {TABLE_NAME} DROP CONSTRAINT "{CONSTRAINT_NAME}";'
        )
        curs.execute(
            f'ALTER TABLE {TABLE_NAME} ADD CONSTRAINT'
            f' {CONSTRAINT_NAME}'
            f' UNIQUE (col1, col2) NOT DEFERRABLE;'
        )


class Migration(migrations.Migration):
    dependencies = [
        ('myapp', '<previous_mig>'),
    ]

    operations = [
        migrations.RunPython(code=_make_deferrable, reverse_code=_unmake_deferrable)
    ]
Very recently, Django has added support for this feature (See ticket). Starting from Django 3.1 you can write:
class UniqueConstraintDeferrable(models.Model):
    name = models.CharField(max_length=255)
    shelf = models.CharField(max_length=31)

    class Meta:
        required_db_features = {
            'supports_deferrable_unique_constraints',
        }
        constraints = [
            models.UniqueConstraint(
                fields=['name'],
                name='name_init_deferred_uniq',
                deferrable=models.Deferrable.DEFERRED,
            ),
            models.UniqueConstraint(
                fields=['shelf'],
                name='sheld_init_immediate_uniq',
                deferrable=models.Deferrable.IMMEDIATE,
            ),
        ]
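For context, the practical difference: with Deferrable.DEFERRED the uniqueness check runs at commit time, so rows may transiently collide inside a transaction. A sketch, assuming the model above:

from django.db import transaction

a = UniqueConstraintDeferrable.objects.create(name='x', shelf='1')
b = UniqueConstraintDeferrable.objects.create(name='y', shelf='2')

with transaction.atomic():
    # Swap the two names: 'name' is duplicated mid-transaction, but the
    # deferred constraint is only checked when the transaction commits.
    a.name, b.name = b.name, a.name
    a.save()
    b.save()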
Django doesn't support that.
You can do it with custom SQL. In your models.py, add this:
from django.db import connection
from django.db.models.signals import post_migrate


def after_migrate(sender, **kwargs):
    cursor = connection.cursor()
    cursor.execute('ALTER TABLE public.mytable ALTER CONSTRAINT '
                   'myconstraint DEFERRABLE INITIALLY DEFERRED')


post_migrate.connect(after_migrate)
Although I've done such things in the past, I've found that over the years I prefer to keep my work simpler and independent from any specific RDBMS. For example, you really want to support SQLite, because it makes development so much easier. With a little change in design you can often get rid of such stuff.
Update: I think #fpghost's answer is better. I don't know what I was thinking :-)

Adding values in my database via a ManyToMany relationship represented in admin.py

I've got a tiny little problem that, unfortunately, is taking all my time.
It is really simple: I already have my database, and I created and then modified models.py and admin.py. Some staff users, who will need to enter values into my database, need the simplest possible form to do so.
Here is my database:
-- Table NGSdb.line
CREATE TABLE IF NOT EXISTS `NGSdb`.`line` (
    `id` INT NOT NULL AUTO_INCREMENT ,
    `value` INT NOT NULL ,
    PRIMARY KEY (`id`) )
ENGINE = InnoDB;
CREATE UNIQUE INDEX `value_UNIQUE` ON `NGSdb`.`line` (`value` ASC) ;

-- Table NGSdb.run_has_sample_lines
CREATE TABLE IF NOT EXISTS `NGSdb`.`run_has_sample_lines` (
    `line_id` INT NOT NULL ,
    `runhassample_id` INT NOT NULL ,
    PRIMARY KEY (`line_id`, `runhassample_id`) ,
    CONSTRAINT `fk_sample_has_line_line1`
        FOREIGN KEY (`line_id` )
        REFERENCES `NGSdb`.`line` (`id` )
        ON DELETE NO ACTION
        ON UPDATE NO ACTION,
    CONSTRAINT `fk_sample_has_line_run_has_sample1`
        FOREIGN KEY (`runhassample_id` )
        REFERENCES `NGSdb`.`run_has_sample` (`id` )
        ON DELETE NO ACTION
        ON UPDATE NO ACTION)

-- Table NGSdb.run_has_sample
CREATE TABLE IF NOT EXISTS `NGSdb`.`run_has_sample` (
    `id` INT NOT NULL AUTO_INCREMENT ,
    `run_id` INT NOT NULL ,
    `sample_id` INT NOT NULL ,
    `dna_quantification_ng_per_ul` FLOAT NULL ,
    PRIMARY KEY (`id`, `run_id`, `sample_id`) ,
    CONSTRAINT `fk_run_has_sample_run1`
        FOREIGN KEY (`run_id` )
        REFERENCES `NGSdb`.`run` (`id` )
        ON DELETE NO ACTION
        ON UPDATE NO ACTION,
    CONSTRAINT `fk_run_has_sample_sample1`
        FOREIGN KEY (`sample_id` )
        REFERENCES `NGSdb`.`sample` (`id` )
        ON DELETE NO ACTION
        ON UPDATE NO ACTION)
Here is my models.py:
class Run(models.Model):
    id = models.AutoField(primary_key=True)
    start_date = models.DateField(null=True, blank=True, verbose_name='start date')
    end_date = models.DateField(null=True, blank=True, verbose_name='end date')
    project = models.ForeignKey(Project)
    sequencing_type = models.ForeignKey(SequencingType)

    def __unicode__(self):
        return u"run started %s from the project %s" % (self.start_date, self.project)


class Line(models.Model):
    id = models.AutoField(primary_key=True)
    value = models.IntegerField()

    def __unicode__(self):
        return u"%s" % str(self.value)


class RunHasSample(models.Model):
    id = models.AutoField(primary_key=True)
    run = models.ForeignKey(Run)
    sample = models.ForeignKey(Sample)
    dna_quantification_ng_per_ul = models.FloatField(null=True, blank=True)
    lines = models.ManyToManyField(Line)

    def __unicode__(self):
        return u"Sample %s from run %s" % (self.sample, self.run)
And here is my admin.py:
class RunHasSamplesInLine(admin.TabularInline):
    model = RunHasSample
    fields = ['sample', 'dna_quantification_ng_per_ul', 'lines']
    extra = 6


class RunAdmin(admin.ModelAdmin):
    fields = ['project', 'start_date', 'end_date', 'sequencing_type']
    inlines = [RunHasSamplesInLine]
    list_display = ('project', 'start_date', 'end_date', 'sequencing_type')
As you can see, my samples are displayed as inline rows in the run form so that the staff can easily fill in the database.
When I try to fill the database I have this error :
(1054, "Unknown column 'run_has_sample_lines.id' in 'field list'")
Of course, there is no field "lines" in my database! It is a many-to-many field, so I already created my intermediate table!
Okay, okay! So I tried to create the model for the intermediate table (run_has_sample_lines) and add a "through" to the ManyToManyField in the RunHasSample model. But as soon as I add the "through" manually, I cannot use the ManyToMany field widget anymore. The only way to add lines in the admin view is then to stack them as inlines... and as you can see the samples are already inlines, so it is impossible to nest a new inline inside the already-inlined samples...
Finally, I just looked at what Django would create, using manage.py sqlall.
I see this:
CREATE TABLE `run_has_sample_lines` (
    `id` integer AUTO_INCREMENT NOT NULL PRIMARY KEY,
    `runhassample_id` integer NOT NULL,
    `line_id` integer NOT NULL,
    UNIQUE (`runhassample_id`, `line_id`)
)
;
ALTER TABLE `run_has_sample_lines` ADD CONSTRAINT `line_id_refs_id_4f0766aa` FOREIGN KEY (`line_id`) REFERENCES `line` (`id`);
It seems that there is no foreign key to the run_has_sample table, whereas I created one in the database in the first place. I guess the problem is coming from here, but I cannot resolve it and I really hope that you can...
Thank you very much!
You may wish to try a 'through' attribute on the many-to-many relationship and declare your intermediate table in Django.
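Roughly, that could look like this (a sketch; the field names and db_table/db_column values are guessed from the SQL above and would need to match the existing schema):

class RunHasSampleLines(models.Model):
    line = models.ForeignKey(Line, db_column='line_id')
    runhassample = models.ForeignKey('RunHasSample', db_column='runhassample_id')

    class Meta:
        db_table = 'run_has_sample_lines'


class RunHasSample(models.Model):
    # ... other fields as before ...
    lines = models.ManyToManyField(Line, through='RunHasSampleLines')

(Note that, as the follow-up below points out, Django will still expect the intermediate table to have its own single-column id primary key.)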
I found where the problem is...
It is not a problem with the ManyToManyField but with the intermediate table: Django refuses to work with an intermediate table that doesn't have a unique id!
So, in the SQL that Django generated, it automatically created a unique id column named "id", but in my database I didn't create one (because the pair of foreign keys is usually enough as a composite key).
Next time, I'll be more careful.