Django CheckConstraint not enforcing check on model field - django

I am using Django 3.2.
Backend database used for testing: sqlite3
I have a model Foo:
class Foo(models.Model):
    # some fields ...
    some_count = models.IntegerField()
    class Meta:
        models.constraints = [
            models.CheckConstraint(
                check=~models.Q(some_count=0),
                name='check_some_count',
            ),
        ]
I also have a unit test like this:
def test_Foo_some_count_field_zero_value(self):
    # Wrap the db error in a transaction so it doesn't percolate up the stack
    with transaction.atomic():
        with self.assertRaises(IntegrityError) as context:
            baker.make(Foo, some_count=0)
When my unit tests are run, it fails at the test above, with the error message:
baker.make(Foo, some_count=0)
AssertionError: IntegrityError not raised
I then changed the CheckConstraint attribute above to:
    class Meta:
        models.constraints = [
            models.CheckConstraint(
                check=models.Q(some_count__lt=0) | models.Q(some_count__gt=0),
                name='check_some_count',
            ),
        ]
The test still failed, with the same error message. I then tried this to check if constraints were being enforced at all:
def test_Foo_some_count_field_zero_value(self):
    foo = baker.make(Foo, some_count=0)
    self.assertEqual(foo.some_count, 0)
To my utter dismay, the test passed - clearly showing that the constraint check was being ignored. I've done a quick lookup online to see if this is a known issue with sqlite3, but I haven't picked up anything on my radar yet - so why is the constraint check being ignored - and how do I fix it (without overriding models.Model.clean() ) ?
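Two things can be checked without Django at all. The sketch below uses only the standard library and makes two assumptions: the table is a hand-written stand-in for what a migration would create, and the `models` object is a hypothetical placeholder for `django.db.models`. It suggests that SQLite itself does enforce CHECK constraints, and that writing `models.constraints = [...]` inside a class body sets an attribute on `models` rather than defining `Meta.constraints`, which may be why no constraint ever reaches the database here.

```python
import sqlite3
import types

# Part 1: SQLite enforces a CHECK constraint once it exists in the schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE foo ("
    " id INTEGER PRIMARY KEY,"
    " some_count INTEGER NOT NULL,"
    " CONSTRAINT check_some_count CHECK (some_count <> 0))"
)
conn.execute("INSERT INTO foo (some_count) VALUES (5)")  # accepted

try:
    conn.execute("INSERT INTO foo (some_count) VALUES (0)")
    zero_rejected = False
except sqlite3.IntegrityError:
    zero_rejected = True

# Part 2: inside a class body, "models.constraints = [...]" assigns to an
# attribute of the object named "models"; it does not create a class
# attribute called "constraints".
models = types.SimpleNamespace()  # hypothetical stand-in for django.db.models


class Meta:
    models.constraints = ["check_some_count"]


meta_has_constraints = hasattr(Meta, "constraints")

print(zero_rejected)         # True
print(meta_has_constraints)  # False
```

If that is what is happening in the model above, declaring `constraints = [...]` directly in `Meta` and then running makemigrations and migrate should put the constraint into the schema.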

Related

DB constraints vs. clean() method in Django

After creating a clean() method to avoid overlapping date ranges in an admin form, I added an ExclusionConstraint to ensure integrity at the DB level, too:
import datetime
from django.contrib.postgres.constraints import ExclusionConstraint
from django.contrib.postgres.fields import DateRangeField, RangeBoundary, RangeOperators
from django.db import models

class DateRangeFunc(models.Func):
    function = 'daterange'
    output_field = DateRangeField()

class Occupancy(models.Model):
    unit = models.ForeignKey(Unit, on_delete=models.CASCADE)
    number_of = models.IntegerField()
    begin = models.DateField()
    end = models.DateField(default=datetime.date(9999, 12, 31))
    class Meta:
        constraints = [
            ExclusionConstraint(
                name="exclude_overlapping_occupancies",
                expressions=(
                    (
                        DateRangeFunc(
                            "begin", "end", RangeBoundary(inclusive_lower=True, inclusive_upper=True)
                        ),
                        RangeOperators.OVERLAPS,
                    ),
                    ("unit", RangeOperators.EQUAL),
                ),
            ),
        ]
This constraint works as expected, but it seems to precede clean(), because any overlap raises an IntegrityError for the admin form. I would have expected that clean() is called first.
I have two questions (related, but not identical to this question):
Is there any way to change the order of evaluation (clean() → ExclusionConstraint)?
Which method (save()?) would I need to override to catch the IntegrityError raised by the constraint?
[Django 4.1.5/Python 3.11.1/PostgreSQL 14.6]
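For reference, the comparison a clean() method would need in order to agree with the inclusive bounds used in the constraint can be sketched in plain Python. The helper name `ranges_overlap` and the sample dates are made up for illustration; the question does not show the actual clean() implementation.

```python
import datetime


def ranges_overlap(begin_a, end_a, begin_b, end_b):
    """Return True if two inclusive date ranges share at least one day."""
    return begin_a <= end_b and begin_b <= end_a


existing = (datetime.date(2023, 1, 1), datetime.date(2023, 1, 31))
touching = (datetime.date(2023, 1, 31), datetime.date(2023, 2, 15))
later = (datetime.date(2023, 2, 1), datetime.date(2023, 2, 15))

# With inclusive upper bounds, sharing a single day counts as an overlap.
print(ranges_overlap(*existing, *touching))  # True
print(ranges_overlap(*existing, *later))     # False
```

If clean() treated the end date as exclusive while the constraint treats it as inclusive, the form would pass validation and the database would still reject the row.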

Django: Model with managed = False automatically includes id in return query

I have a queryset based on a model that has managed = False in its Meta, and Django automatically adds an id field to the query even when id is not included in the .values() list.
class GenerateTimeSeriesOuterModel(models.Model):
    class Meta:
        db_table = 'GenerateTimeSeriesOuterModel'
        app_label = 'GenerateTimeSeriesOuterModel'
        managed = False
    # Used by TableSubquery(BaseTable): class in Subquery.py to generate 'from (subquery)'
    subquery = None
    objects = SubqueryManager()
The model has the managed flag set to False, and uses a custom manager. When I try to do something like this:
GenerateTimeSeriesOuterModel.objects.all().set_subquery(some_subquery).annotate(
    foo_bar=F('foo_bar')
).values('foo_bar')
This returns the following query:
select
    subquery.id,
    subquery.foo_bar
from (
    select something as foo_bar from some_table
) as subquery
As you can see, id is added automatically, but it does not exist in the subquery, so the query throws an error. Any thoughts on removing the id?

Using Django's CheckConstraint with annotations

I have a Django model where each instance requires a unique identifier that is derived from three fields:
class Example(Model):
    type = CharField(blank=False, null=False)  # either 'A' or 'B'
    timestamp = DateTimeField(default=timezone.now)
    number = models.IntegerField(null=True)  # a sequential number
This produces a label of the form [type][timestamp YEAR][number], which must be unique unless number is null.
I thought I might be able to use a couple of annotations:
uid_expr = Case(
    When(
        number=None,
        then=Value(None),
    ),
    default=Concat(
        'type', ExtractYear('timestamp'), 'number',
        output_field=models.CharField()
    ),
    output_field=models.CharField()
)
uid_count_expr = Count('uid', distinct=True)
I overrode the model's manager's get_queryset to apply the annotations by default and then tried to use CheckConstraint:
class Example(Model):
    ...
    class Meta:
        constraints = [
            models.CheckConstraint(check=Q(uid_cnt=1), name='unique_uid')
        ]
This fails because it's unable to find a field on the instance called uid_cnt, however I thought annotations were accessible to Q objects. It looks like CheckConstraint queries against the model directly rather than using the queryset returned by the manager:
class CheckConstraint(BaseConstraint):
    ...
    def _get_check_sql(self, model, schema_editor):
        query = Query(model=model)
        ...
Is there a way to apply a constraint to an annotation? Or is there a better approach?
I'd really like to enforce this at the db layer.
Thanks.
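This is an untested sketch, but try a conditional UniqueConstraint. Lookups such as timestamp__year are not valid in fields, so this version assumes Django 4.0+, where UniqueConstraint also accepts expressions:
class Example(Model):
    ...
    class Meta:
        constraints = [
            models.UniqueConstraint(
                'type',
                ExtractYear('timestamp'),
                'number',
                condition=Q(number__isnull=False),
                name='unique_uid',
            )
        ]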
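To get a feel for what a database-level rule of this shape does, here is a small sketch using SQLite from the standard library. Everything in it is an assumption made for illustration: the table layout, the index name, and the use of `substr` on ISO-formatted text as a stand-in for extracting the year. It is not the SQL that Django would generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE example ("
    " id INTEGER PRIMARY KEY,"
    " type TEXT NOT NULL,"
    " timestamp TEXT NOT NULL,"  # ISO 8601 text, e.g. '2023-05-01 12:00:00'
    " number INTEGER)"
)
# Unique on (type, year, number), but only for rows that have a number.
conn.execute(
    "CREATE UNIQUE INDEX unique_uid"
    " ON example (type, substr(timestamp, 1, 4), number)"
    " WHERE number IS NOT NULL"
)

rows = [
    ("A", "2023-05-01 12:00:00", 1),
    ("A", "2024-01-15 09:30:00", 1),     # same type and number, different year
    ("B", "2023-06-01 08:00:00", 1),     # different type
    ("A", "2023-07-01 10:00:00", None),
    ("A", "2023-08-01 10:00:00", None),  # rows without a number never collide
]
conn.executemany(
    "INSERT INTO example (type, timestamp, number) VALUES (?, ?, ?)", rows
)

try:
    # Same type, same year and same number as the first row.
    conn.execute(
        "INSERT INTO example (type, timestamp, number) VALUES (?, ?, ?)",
        ("A", "2023-12-31 23:59:59", 1),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM example").fetchone()[0]
print(duplicate_rejected, row_count)  # True 5
```

The point of the sketch is that uniqueness over derived values is an index concern rather than a per-row CHECK, which is why a CheckConstraint on an aggregate annotation is not the right tool for it.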

Django Tests: setUpTestData on Postgres throws: "Duplicate key value violates unique constraint"

I am running into a database issue in my unit tests. I think it has something to do with the way I am using TestCase and setUpTestData.
When I try to set up my test data with certain values, the tests throw the following error:
django.db.utils.IntegrityError: duplicate key value violates unique constraint
...
psycopg2.IntegrityError: duplicate key value violates unique constraint "InventoryLogs_productgroup_product_name_48ec6f8d_uniq"
DETAIL: Key (product_name)=(Almonds) already exists.
I changed all of my primary keys and it seems to be running fine. It doesn't seem to affect any of the tests.
However, I'm concerned that I am doing something wrong. When it first happened, I reversed about an hour's worth of work on my app (not that much code for a noob), which corrected the problem.
Then when I wrote the changes back in, the same issue presented itself again. TestCase is pasted below. The issue seems to occur after I add the sortrecord items, but corresponds with the items above it.
I don't want to keep going through and changing primary keys and urls in my tests, so if anyone sees something wrong with the way I am using this, please help me out. Thanks!
TestCase
class DetailsPageTest(TestCase):
    @classmethod
    def setUpTestData(cls):
        cls.product1 = ProductGroup.objects.create(
            product_name="Almonds"
        )
        cls.variety1 = Variety.objects.create(
            product_group=cls.product1,
            variety_name="non pareil",
            husked=False,
            finished=False,
        )
        cls.supplier1 = Supplier.objects.create(
            company_name="Acme",
            company_location="Acme Acres",
            contact_info="Call me!"
        )
        cls.shipment1 = Purchase.objects.create(
            tag=9,
            shipment_id=9999,
            supplier_id=cls.supplier1,
            purchase_date='2015-01-09',
            purchase_price=9.99,
            product_name=cls.variety1,
            pieces=99,
            kgs=999,
            crackout_estimate=99.9
        )
        cls.shipment2 = Purchase.objects.create(
            tag=8,
            shipment_id=8888,
            supplier_id=cls.supplier1,
            purchase_date='2015-01-08',
            purchase_price=8.88,
            product_name=cls.variety1,
            pieces=88,
            kgs=888,
            crackout_estimate=88.8
        )
        cls.shipment3 = Purchase.objects.create(
            tag=7,
            shipment_id=7777,
            supplier_id=cls.supplier1,
            purchase_date='2014-01-07',
            purchase_price=7.77,
            product_name=cls.variety1,
            pieces=77,
            kgs=777,
            crackout_estimate=77.7
        )
        cls.sortrecord1 = SortingRecords.objects.create(
            tag=cls.shipment1,
            date="2015-02-05",
            bags_sorted=20,
            turnout=199,
        )
        cls.sortrecord2 = SortingRecords.objects.create(
            tag=cls.shipment1,
            date="2015-02-07",
            bags_sorted=40,
            turnout=399,
        )
        cls.sortrecord3 = SortingRecords.objects.create(
            tag=cls.shipment1,
            date='2015-02-09',
            bags_sorted=30,
            turnout=299,
        )
Models
from datetime import datetime
from django.db import models
from django.db.models import Q

class ProductGroup(models.Model):
    product_name = models.CharField(max_length=140, primary_key=True)
    def __str__(self):
        return self.product_name
    class Meta:
        verbose_name = "Product"

class Supplier(models.Model):
    company_name = models.CharField(max_length=45)
    company_location = models.CharField(max_length=45)
    contact_info = models.CharField(max_length=256)
    class Meta:
        ordering = ["company_name"]
    def __str__(self):
        return self.company_name

class Variety(models.Model):
    product_group = models.ForeignKey(ProductGroup)
    variety_name = models.CharField(max_length=140)
    husked = models.BooleanField()
    finished = models.BooleanField()
    description = models.CharField(max_length=500, blank=True)
    class Meta:
        ordering = ["product_group_id"]
        verbose_name_plural = "Varieties"
    def __str__(self):
        return self.variety_name

class PurchaseYears(models.Manager):
    def purchase_years_list(self):
        unique_years = Purchase.objects.dates('purchase_date', 'year')
        results_list = []
        for p in unique_years:
            results_list.append(p.year)
        return results_list

class Purchase(models.Model):
    tag = models.IntegerField(primary_key=True)
    product_name = models.ForeignKey(Variety, related_name='purchases')
    shipment_id = models.CharField(max_length=24)
    supplier_id = models.ForeignKey(Supplier)
    purchase_date = models.DateField()
    estimated_delivery = models.DateField(null=True, blank=True)
    purchase_price = models.DecimalField(max_digits=6, decimal_places=3)
    pieces = models.IntegerField()
    kgs = models.IntegerField()
    crackout_estimate = models.DecimalField(max_digits=6, decimal_places=3, null=True)
    crackout_actual = models.DecimalField(max_digits=6, decimal_places=3, null=True)
    objects = models.Manager()
    purchase_years = PurchaseYears()
    # Keep manager as "objects" in case admin, etc. needs it. Filter can be called like so:
    # Purchase.objects.purchase_years_list()
    # Managers in docs: https://docs.djangoproject.com/en/1.8/intro/tutorial01/
    class Meta:
        ordering = ["purchase_date"]
    def __str__(self):
        return self.shipment_id
    def _weight_conversion(self):
        return round(self.kgs * 2.20462)
    lbs = property(_weight_conversion)

class SortingModelsBagsCalulator(models.Manager):
    def total_sorted(self, record_date, current_set):
        sorted = [record['bags_sorted'] for record in current_set if
                  record['date'] <= record_date]
        return sum(sorted)

class SortingRecords(models.Model):
    tag = models.ForeignKey(Purchase, related_name='sorting_record')
    date = models.DateField()
    bags_sorted = models.IntegerField()
    turnout = models.IntegerField()
    objects = models.Manager()
    def __str__(self):
        return "%s [%s]" % (self.date, self.tag.tag)
    class Meta:
        ordering = ["date"]
        verbose_name_plural = "Sorting Records"
    def _calculate_kgs_sorted(self):
        kg_per_bag = self.tag.kgs / self.tag.pieces
        kgs_sorted = kg_per_bag * self.bags_sorted
        return (round(kgs_sorted, 2))
    kgs_sorted = property(_calculate_kgs_sorted)
    def _byproduct(self):
        waste = self.kgs_sorted - self.turnout
        return (round(waste, 2))
    byproduct = property(_byproduct)
    def _bags_remaining(self):
        current_set = SortingRecords.objects.values().filter(~Q(id=self.id), tag=self.tag)
        sorted = [record['bags_sorted'] for record in current_set if
                  record['date'] <= self.date]
        remaining = self.tag.pieces - sum(sorted) - self.bags_sorted
        return remaining
    bags_remaining = property(_bags_remaining)
EDIT
It also fails with integers, like so.
django.db.utils.IntegrityError: duplicate key value violates unique constraint "InventoryLogs_purchase_pkey"
DETAIL: Key (tag)=(9) already exists.
UPDATE
So I should have mentioned this earlier, but I completely forgot. I have two unit test files that use the same data. Just for kicks, I matched a primary key in both instances of setUpTestData() to a different value and sure enough, I got the same error.
These two setups were working fine side-by-side before I added more data to one of them. Now, it appears that they need different values. I guess you can only get away with using repeat data for so long.
I continued to get this error without having any duplicate data, but I was able to resolve the issue by initializing the object and calling the save() method rather than creating the object via Model.objects.create().
In other words, I did this:
@classmethod
def setUpTestData(cls):
    cls.person = Person(first_name="Jane", last_name="Doe")
    cls.person.save()
Instead of this:
@classmethod
def setUpTestData(cls):
    cls.person = Person.objects.create(first_name="Jane", last_name="Doe")
I've been running into this issue sporadically for months now. I believe I just figured out the root cause and a couple solutions.
Summary
For whatever reason, it seems the Django test case base classes aren't removing the database records created by, let's call it, TestCase1 before running TestCase2. So when TestCase2 tries to create records using the same IDs as TestCase1, the database raises a duplicate key error because those IDs already exist. And even saying the magic word "please" won't help with database duplicate key errors.
Good news is, there are multiple ways to solve this problem! Here are a couple...
Solution 1
Make sure that if you override the class method tearDownClass, you call super().tearDownClass(). If you override tearDownClass() without calling its super, it will in turn never call TransactionTestCase._post_teardown() nor TransactionTestCase._fixture_teardown(). Quoting from the docstring of TransactionTestCase._post_teardown():
def _post_teardown(self):
    """
    Perform post-test things:
    * Flush the contents of the database to leave a clean slate. If the
      class has an 'available_apps' attribute, don't fire post_migrate.
    * Force-close the connection so the next test gets a clean cursor.
    """
If TestCase.tearDownClass() is not called via super() then the database is not reset in between test cases and you will get the dreaded duplicate key exception.
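The effect of a missing super() call can be reproduced with plain unittest, no Django required. This is only an analogy under stated assumptions: `shared_rows` stands in for the test database, and `CleaningBase` is a hypothetical base class that does its cleanup in tearDownClass, roughly the way a framework base class might.

```python
import unittest

shared_rows = []  # stand-in for state that outlives a single test class


class CleaningBase(unittest.TestCase):
    """Hypothetical base class whose class-level teardown removes shared state."""

    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        shared_rows.append(cls.__name__)

    @classmethod
    def tearDownClass(cls):
        shared_rows.clear()
        super().tearDownClass()


class ForgetsSuper(CleaningBase):
    @classmethod
    def tearDownClass(cls):
        pass  # overrides the base teardown without calling super()

    def test_nothing(self):
        self.assertIn("ForgetsSuper", shared_rows)


class CallsSuper(CleaningBase):
    @classmethod
    def tearDownClass(cls):
        super().tearDownClass()

    def test_nothing(self):
        self.assertIn("CallsSuper", shared_rows)


def leftovers_after(case):
    """Run one test class and report what is left in shared_rows afterwards."""
    shared_rows.clear()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    unittest.TextTestRunner(verbosity=0).run(suite)
    return list(shared_rows)


print(leftovers_after(ForgetsSuper))  # ['ForgetsSuper']
print(leftovers_after(CallsSuper))    # []
```

Whatever the parent class cleans up in tearDownClass is silently skipped by the first subclass, so the next test class inherits its leftovers.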
Solution 2
Subclass TransactionTestCase and set the class variable serialized_rollback = True, like this:
class MyTestCase(TransactionTestCase):
    fixtures = ['test-data.json']
    serialized_rollback = True
    def test_name_goes_here(self):
        pass
Quoting from the source:
class TransactionTestCase(SimpleTestCase):
    ...
    # If transactions aren't available, Django will serialize the database
    # contents into a fixture during setup and flush and reload them
    # during teardown (as flush does not restore data from migrations).
    # This can be slow; this flag allows enabling on a per-case basis.
    serialized_rollback = False
When serialized_rollback is set to True, Django serializes the database contents during setup and reloads them after the flush in teardown, so every test case starts from the same data. And batta bing, batta bang... no more duplicate key errors!
Conclusion
There are probably many more ways to implement a solution for the OP's issue, but these two should work nicely. I would definitely love to have more solutions added by others, for clarity's sake and for a deeper understanding of the underlying Django test case base classes. Phew, say that last line real fast three times and you could win a pony!
The log you provided states DETAIL: Key (product_name)=(Almonds) already exists. Did you verify in your db?
To prevent such errors in the future, you should prefix all your test data strings with test_
I discovered the issue, as noted at the bottom of the question.
From what I can tell, the database didn't like me using duplicate data in the setUpTestData() methods of two different tests. Changing the primary key values in the second test corrected the problem.
I think the problem here is that you had a tearDownClass method in your TestCase without a call to the super method.
That way the Django TestCase loses the transactional behavior behind setUpTestData, so it doesn't clean your test db after a TestCase has finished.
Check warning in django docs here:
https://docs.djangoproject.com/en/1.10/topics/testing/tools/#django.test.SimpleTestCase.allow_database_queries
I had a similar problem that was caused by explicitly providing the primary key value in a test case.
As discussed in the Django documentation, manually assigning a value to an auto-incrementing field doesn’t update the field’s sequence, which might later cause a conflict.
I have solved it by altering the sequence manually:
from django.db import connection
class MyTestCase(TestCase):
    @classmethod
    def setUpTestData(cls):
        Model.objects.create(id=1)
        with connection.cursor() as c:
            c.execute(
                """
                ALTER SEQUENCE "app_model_id_seq" RESTART WITH 2;
                """
            )

get_or_create failure with Django and Postgres (duplicate key value violates unique constraint)

Thanks for taking time to read my question.
I have a django app with the following model:
class UserProfile(models.Model):
    user = models.OneToOneField(User)
    ...
class Visit(models.Model):
    profile = models.ForeignKey(UserProfile)
    date = models.DateField(auto_now_add=True, db_index=True)
    ip = models.IPAddressField()
    class Meta:
        unique_together = ('profile', 'date', 'ip')
In a view:
profile = get_object_or_404(Profile, pk = ...)
get, create = Visit.objects.get_or_create(profile=profile, date=now.date(), ip=request.META['REMOTE_ADDR'])
if create: DO SOMETHING
Everything works fine, except that the Postgres logs are full of duplicate key errors:
2012-02-15 14:13:44 CET ERROR: duplicate key value violates unique constraint "table_visit_profile_id_key"
2012-02-15 14:13:44 CET STATEMENT: INSERT INTO "table_visit" ("profile_id", "date", "ip") VALUES (1111, E'2012-02-15', E'xx.xx.xxx.xxx') RETURNING "table_visit"."id"
I tried different solutions, e.g.:
from django.db import transaction
from django.db import IntegrityError
@transaction.commit_on_success
def my_get_or_create(prof, ip):
    try:
        object = Visit.objects.create(profile=prof, date=datetime.now().date(), ip=ip)
    except IntegrityError:
        transaction.commit()
        object = Visit.objects.get(profile=prof, date=datetime.now().date(), ip=ip)
    return object
....
created = my_get_or_create(prof, request.META['REMOTE_ADDR'])
if created: DO SOMETHING
Does this only help for MySQL? Does anyone know how to avoid the duplicate key value errors for Postgres?
Another possible reason for these errors in get_or_create() is data type mismatch in one of the search fields - for example passing False instead of None into a nullable field. The .get() inside .get_or_create() will not find it and Django will continue with new row creation - which will fail due to PostgreSQL constraints.
I had issues with get_or_create when using postgres. In the end I abandoned the boilerplate code for traditional:
try:
    jobInvite = Invite.objects.get(sender=employer.user, job=job)
except Invite.DoesNotExist:
    jobInvite = Invite(sender=employer.user, job=job)
    jobInvite.save()
# end try
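A get-then-create sequence like this can still lose a race when two requests arrive at the same moment. One common variation is to insert first and fall back to a lookup when the unique constraint fires. The sketch below shows the idea with SQLite from the standard library as a stand-in for Postgres; the table layout and the helper `get_or_create_visit` are hypothetical and are not Django's get_or_create.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visit ("
    " id INTEGER PRIMARY KEY,"
    " profile_id INTEGER NOT NULL,"
    " date TEXT NOT NULL,"
    " ip TEXT NOT NULL,"
    " UNIQUE (profile_id, date, ip))"
)


def get_or_create_visit(conn, profile_id, date, ip):
    """Return (row_id, created), trying the insert before the lookup."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO visit (profile_id, date, ip) VALUES (?, ?, ?)",
                (profile_id, date, ip),
            )
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        # The row already exists (possibly inserted concurrently), so fetch it.
        row = conn.execute(
            "SELECT id FROM visit WHERE profile_id = ? AND date = ? AND ip = ?",
            (profile_id, date, ip),
        ).fetchone()
        return row[0], False


first = get_or_create_visit(conn, 1111, "2012-02-15", "203.0.113.7")
second = get_or_create_visit(conn, 1111, "2012-02-15", "203.0.113.7")
print(first, second)  # (1, True) (1, False)
```

Note that this handles the conflict in application code only. The failed INSERT still reaches the server, so on Postgres it would presumably still show up in the log.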
Have you at some point had unique=True set on Visit's profile field?
It looks like a unique constraint was generated for Postgres at some point and is still in effect. "table_visit_profile_id_key" is what its auto-generated name would be, and naturally it would cause those errors if you're recording multiple visits for a user.
If this is the case, are you using South to manage your database changes? If you aren't, grab it!
PostgreSQL behaves somewhat differently in some subtle queries, which results in IntegrityError errors, especially after you switch to Django 1.6. Here's the solution - you need to add select_on_save option to each failing model:
class MyModel(models.Model):
    ...
    class Meta:
        select_on_save = True
It's documented here: Options.select_on_save