Any way to handle IntegrityError in Django? - django

In Django, the exception IntegrityError can be raised for many reasons. Sometimes it means a unique-constraint conflict; other times it is caused by a foreign key check.
Currently we can only discover the root cause by converting the exception into text:
(1062, "Duplicate entry '79d3dd88917a11e98d42f000ac192cee-not_created' for key 'fs_cluster_id_dir_e8164dce_uniq'")
But this is very unfriendly for a program to parse. Is there any way for code to identify the root cause of the exception?
For example, if I know it was caused by a unique conflict, I can tell the client that some resource already exists. If I know it was caused by a missing foreign key, I can tell the client that some parent resource has not been created.
So, is there any good way to identify the cause in code?

Don't know about good.
Do you need to identify the exact cause in your code? If you have an alternative way of proceeding that you want to avoid using every time because of efficiency considerations, you might just code:
try:
    ...  # the simple way
except IntegrityError as simplefail:
    try:
        ...  # the complicated way
    except IntegrityError as complexfail:
        # log this utter failure and re-raise one or other of the caught errors
        raise
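To actually identify the root cause programmatically: with MySQL (which the error text in the question suggests), the wrapped driver exception carries a numeric error code you can inspect. A minimal sketch, assuming the MySQL backend and its standard error codes (1062 for a duplicate entry, 1452 for a failing foreign-key check); obj stands in for whatever model instance you are saving:

from django.db import IntegrityError, transaction

MYSQL_DUPLICATE_ENTRY = 1062   # ER_DUP_ENTRY
MYSQL_FOREIGN_KEY_FAIL = 1452  # ER_NO_REFERENCED_ROW_2

try:
    with transaction.atomic():
        obj.save()
except IntegrityError as e:
    # Django re-raises the driver error with the same args tuple,
    # so args[0] holds the MySQL error code shown in the question.
    code = e.args[0] if e.args else None
    if code == MYSQL_DUPLICATE_ENTRY:
        ...  # tell the client the resource already exists
    elif code == MYSQL_FOREIGN_KEY_FAIL:
        ...  # tell the client a parent resource has not been created
    else:
        raise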

Related

Is it ever OK to catch Django ORM errors inside atomic blocks?

Example code:
with transaction.atomic():
    # Create and save some models here
    try:
        model_instance.save()
    except IntegrityError:
        raise SomeCustomError()
Two questions:
1) Will this work as intended and roll back any previously saved models, given that nothing is done in the exception handler except for re-raising a custom error?
2) From a code style perspective, does it make sense to not use a nested transaction inside the try block in cases like this? (That is, only one line of code within the try block, no intention of persisting anything else within the transaction, no writes to the database inside the exception handler, etc.)
Will this work as intended and roll back any previously saved models, given that nothing is done in the exception handler except for re-raising a custom error?
Yes. The rollback will be triggered by the exception (any exception), and as long as you don't touch the database after the database error you won't risk the TransactionManagementError mentioned in the documentation.
From a code style perspective, does it make sense to not use a nested transaction inside the try block in cases like this?
Style is a matter of opinion, but I don't see any point in using a nested transaction here. It makes the code more complicated (not to mention the unnecessary savepoints in the transaction) for no discernible benefit.
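For reference, the nested variant being argued against would look roughly like this; the inner atomic() only adds a savepoint around the single save():

with transaction.atomic():
    # Create and save some models here
    try:
        with transaction.atomic():  # extra savepoint, no real benefit here
            model_instance.save()
    except IntegrityError:
        raise SomeCustomError()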
You should use an atomic transaction as in the code below: handle the exception outside the atomic block, and avoid catching exceptions inside atomic.
try:
    with transaction.atomic():
        SomeModel.objects.get(id=NON_EXISTENT_ID)
except SomeModel.DoesNotExist:
    raise SomeCustomError()
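The same shape applies to the IntegrityError case from the question (a sketch):

try:
    with transaction.atomic():
        model_instance.save()
except IntegrityError:
    raise SomeCustomError()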

Can't execute queries until end of atomic block in my data migration on django 1.7

I have a pretty long data migration that I'm doing to correct an earlier bad migration where some rows were created incorrectly. I'm trying to assign values to a new column based upon old ones; however, sometimes this leads to integrity errors. When this happens, I want to throw away the row that's causing the integrity error.
Here is a code snippet:
def load_data(apps, schema_editor):
    MyClass = apps.get_model('my_app', 'MyClass')
    new_col_mapping = {old_val1: new_val1, ....}
    for c in new_col_mapping:  # iterate over the old values
        for inst in MyClass.objects.filter(old_col=c):
            try:
                inst.new_col = new_col_mapping[c]
                inst.save()
            except IntegrityError:
                inst.delete()
Then in the operations of my Migration class I do
operations = [
    migrations.RunPython(load_data),
]
I get the following error when running the migration
django.db.transaction.TransactionManagementError: An error occurred in the current transaction. You can't execute queries until the end of the 'atomic' block
I get the feeling that adding
with transaction.atomic():
somewhere is my solution, but I'm not exactly sure where the right place is. More importantly, I'd like to understand WHY this is necessary.
This is similar to the example in the docs.
First, add the required import if you don't have it already.
from django.db import transaction
Then wrap the code that might raise an integrity error in an atomic block.
try:
    with transaction.atomic():
        inst.new_col = new_col_mapping[c]
        inst.save()
except IntegrityError:
    inst.delete()
The reason for the error is explained in the warning block 'Avoid catching exceptions inside atomic!' in the docs. Once Django encounters a database error, it will roll back the atomic block. Attempting any more database queries will cause a TransactionManagementError, which you are seeing. By wrapping the code in an atomic block, only that code will be rolled back, and you can execute queries outside of the block.
Each migration is wrapped in one transaction, so when something fails during a migration, all of its operations are rolled back. Because of that, a transaction in which something has failed can't take new queries (they would be rolled back anyway).
Wrapping individual operations in with transaction.atomic(): is not a good solution, because you won't be able to roll those operations back when something later fails. Instead, avoid the integrity errors by doing some extra checks before saving the data, as sketched below.
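A minimal sketch of that check-first approach, assuming new_col carries the unique constraint (names taken from the question's snippet, with new_col_mapping as defined there):

def load_data(apps, schema_editor):
    MyClass = apps.get_model('my_app', 'MyClass')
    for c in new_col_mapping:
        for inst in MyClass.objects.filter(old_col=c):
            new_val = new_col_mapping[c]
            # Check for a conflicting row up front instead of catching IntegrityError.
            conflict = MyClass.objects.filter(new_col=new_val).exclude(pk=inst.pk).exists()
            if conflict:
                inst.delete()  # saving would violate the constraint; discard it
            else:
                inst.new_col = new_val
                inst.save()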
It seems that the same exception can have a variety of causes. In my case it was caused by an invalid model field name: I had used a Greek letter delta (𐤃) in the field name.
Everything seemed to work fine, and the whole app worked well (perhaps I just didn't try any more complex use cases). The tests, however, raised TransactionManagementError.
I solved the problem by removing 𐤃 from the field name and from all the migration files.
I faced the same issue, but resolved it by using django.test.TransactionTestCase instead of django.test.TestCase.

IntegrityError "django_session_pkey"

I'm contributing to a Django application which from time to time crashes, responding with:
IntegrityError at /signin/
duplicate key value violates unique constraint "django_session_pkey"
DETAIL: Key (session_key)=(ahwvmdwpz5hg1rnu8q7ctudvfl9nfnga) already exists.
To me it makes absolutely no sense, as the session key is always randomly generated, so it should be almost impossible to see this kind of error. When it happens, it blocks the whole site: users get this for every request, no matter what browser or system. The only way to get rid of it is to reboot the machine.
Has anyone ever seen this error, or does anyone have a permanent solution?
I tried to google it, but found no sensible results.

Django get_or_create vs catching IntegrityError

I want to insert several User rows in the DB. I don't really care if an insert doesn't succeed, as long as I'm notified about it, which I'm able to do in both cases. So which one is better in terms of performance (mostly speed)?
1) Always insert the row (by calling the model's save method) and catch potential IntegrityError exceptions
2) Call the get_or_create method from the QuerySet class
Think about what you are doing: you want to insert rows into the database; you don't need to get an object back out if it already exists. Ask for forgiveness, and catch the IntegrityError exception:
try:
    user = User(username=username)
    user.save()
except IntegrityError:
    print("Duplicate user")
This way you avoid the overhead of the additional get() lookup made by get_or_create().
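For comparison, the get_or_create() variant would look like this (a sketch):

user, created = User.objects.get_or_create(username=username)
if not created:
    print("Duplicate user")

Note that get_or_create() first tries a get() and only then creates the row, which is the extra lookup mentioned above.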
FYI, EAFP is a common practice/coding style in Python:
Easier to ask for forgiveness than permission. This common Python coding style assumes the existence of valid keys or attributes and catches exceptions if the assumption proves false. This clean and fast style is characterized by the presence of many try and except statements.
Also see: https://stackoverflow.com/questions/6092992/why-is-it-easier-to-ask-forgiveness-than-permission-in-python-but-not-in-java

Duplicate request threads create duplicate database entries in Django model

The problem: a signal receiver checks to see if a model entry exists for certain conditions, and if not, it creates a new entry. In some rare circumstances, the entry is being duplicated.
Within the receiver function:
try:
    my_instance = MyModel.objects.get(field1=value1, field2=sender)
except:
    my_instance = MyModel(field1=value1, field2=sender)
    my_instance.save()
It's an obvious candidate for get_or_create, but aside from cleaning up that code, would using get_or_create help prevent this problem?
The signal is sent after a user action, but I don't believe that the originating request is being duplicated, because that would have triggered other actions.
The duplication has occurred a few times in thousands of instances. Is this necessarily caused by multiple requests or is there some way a duplicate thread could be created? And is there a way - perhaps with granular transaction management - to prevent the duplication?
Using Django 1.1, Python 2.4, PostgreSQL 8.1, and mod_wsgi on Apache2.
To prevent duplicate signal receivers, add a dispatch_uid parameter to the signal connection code, as described in the docs (see the sketch below).
Also make sure that you have a transaction open; otherwise the state of the table may change between the check (objects.get()) and the creation (save()).
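A minimal sketch of connecting a receiver with dispatch_uid (the receiver and sender names are placeholders):

from django.db.models.signals import post_save

# dispatch_uid uniquely identifies this receiver, so connecting it a
# second time (e.g. via a double module import) is a no-op.
post_save.connect(my_receiver, sender=MyModel, dispatch_uid='my_app.my_receiver')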
Perhaps this answer may help. Apparently, a transaction is properly used with get_or_create, but I've not confirmed this. mod_wsgi is multi-process and multi-threaded (both configurable), which means that race conditions can definitely occur. My guess is that your application launches two separate requests that generate the same value for field1, and they execute with just the right timing to add 'duplicate' entries.
If the combination of MyModel(field1=value1, field2=sender) must be unique, then define a unique_together constraint on your model to further aid integrity.
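A sketch of that constraint (field names from the question; the field types are placeholders):

from django.db import models

class MyModel(models.Model):
    field1 = models.CharField(max_length=100)  # placeholder type
    field2 = models.ForeignKey('auth.User')    # Django 1.1-era syntax, per the question

    class Meta:
        # The database will now reject a second row with the same pair,
        # raising IntegrityError instead of silently allowing a duplicate.
        unique_together = ('field1', 'field2')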