Commit SQL even inside atomic transaction (django) - django

How can I always commit a insert even inside an atomic transaction? In this case I need just one point to be committed and everything else rolled back.
For example, my view, decorator contains with transaction.atomic() and other stuffs:
#my_custom_decorator_with_transaction_atomic
def my_view(request):
my_core_function()
return ...
def my_core_function():
# many sql operations that need to rollback in case of error
try:
another_operation()
except MyException:
insert_activity_register_on_db() # This needs to be in DB, and not rolled back
raise MyException()
I would not like to make another decorator for my view without transaction atomic and do it manually on core. Is there a way?

Related

Is #transaction.atomic cheap?

This is mostly curiosity, but is the DB penalty for wrapping an entire view with #transaction.atomic a negligible one?
I'm thinking of views where the GET of a form or its re-display after a validation fail involves processing querysets. (ModelChoiceFields, for example, or fetching an object that the template displays.)
It seems to me to be far more natural to use with transaction.atomic() around the block of code which actually alters a bunch of related DB objects only after the user's inputs have validated.
Am I missing something?
From the source code:
def atomic(using=None, savepoint=True, durable=False):
# Bare decorator: #atomic -- although the first argument is called
# `using`, it's actually the function being decorated.
if callable(using):
return Atomic(DEFAULT_DB_ALIAS, savepoint, durable)(using)
# Decorator: #atomic(...) or context manager: with atomic(...): ...
else:
return Atomic(using, savepoint, durable)
It's the same. In both cases the function is returning an Atomic object which handles whether the transaction should commit or not.

Using django select_for_update without rolling back on error

I'm trying to utilize django's row-level-locking by using the select_for_update utility. As per the documentation, this can only be used when inside of a transaction.atomic block. The side-effect of using a transaction.atomic block is that if my code throws an exception, all the database changes get rolled-back. My use case is such that I'd actually like to keep the database changes, and allow the exception to propagate. This leaves me with code looking like this:
with transaction.atomic():
user = User.objects.select_for_update.get(id=1234)
try:
user.do_something()
except Exception as e:
exception = e
else:
exception = None
if exception is not None:
raise exception
This feels like a total anti-pattern and I'm sure I must be missing something. I'm aware I could probably roll-my-own solution by manually using transaction.set_autocommit to manage the transaction, but I'd have thought that there would be a simpler way to get this functionality. Is there a built in way to achieve what I want?
I ended up going with something that looks like this:
from django.db import transaction
class ErrorTolerantTransaction(transaction.Atomic):
def __exit__(self, exc_type, exc_value, traceback):
return super().__exit__(None, None, None)
def error_tolerant_transaction(using=None, savepoint=True):
"""
Wraps a code block in an 'error tolerant' transaction block to allow the use of
select_for_update but without the effect of automatic rollback on exception.
Can be invoked as either a decorator or context manager.
"""
if callable(using):
return ErrorTolerantTransaction('default', savepoint)(using)
return ErrorTolerantTransaction(using, savepoint)
I can now put an error_tolerant_transaction in place of transaction.atomic and exceptions can be raised without a forced rollback. Of course database-related exceptions (i.e. IntegrityError) will still cause a rollback, but that's expected behavior given that we're using a transaction. As a bonus, this solution is compatible with transaction.atomic, meaning it can be nested inside an atomic block and vice-versa.

Multi-DB Transactions

Django Version 1.10.5 with Postgres 9.6.1
For the last year I've been working in a multi-schema default database environment. However things are beginning to grow to the point I've decided to split the single database into 3 databases.
I've got things working with a master/slave router for all 3 databases.
I am not using the 'default' database key. Instead I have 'db1', 'db2', and 'db3'
The part I am confused about is with transactions in this multi-database environment.
In this example it fails as expected. Caused of course by not using #transaction.atomic(using='db1') which is clear to me.
#transaction.atomic()
def edit(self, context):
"""Edit
:param dict context: Context
:return: None
"""
# Check if employee exists
try:
result = Passport.objects.get(pk=self.user.employee_id)
except Passport.DoesNotExist:
return False
result.name = context.get('name')
result.save()
However I have this strange example, simply because I'm trying to understand... I would have expected this to fail but it does not:
#transaction.atomic(using='db1')
def edit(self, context):
"""Edit
:param dict context: Context
:return: None
"""
# Check if employee exists
try:
result = Passport.objects.get(pk=self.user.employee_id)
except Passport.DoesNotExist:
return False
result.name = context.get('name')
with transaction.atomic(using='db2'):
result.save()
The model Passport does not exist in DB2 models at all.
My router is setup so that all writes go to each respected DB.
So what is the purpose of setting the using='db1' in the atomic transaction? I've looked at the source and I see it defaults to default when not "using".
In the above example I even made another transaction inside of the initial transaction but this time using='db2' where the model doesn't even exist. I figured that would have failed, but it didn't and the data was written to the proper database.
I bring this up because there will be situations where I need to interact with all 3 databases and if a single problem occurs when writing to all 3 databases, all 3 need to be rolled back or if on success of everything, then committed of course.
Perhaps someone can help break this down for me so I can understand?
You're interpreting transaction.atomic(using='X') to mean: run the following database commands on X, inside a transaction.
In fact, it just means: open a transaction on database X, and then either commit it or roll it back at the end of the block.
Or, as the documentation puts it:
Under the hood, Django’s transaction management code:
opens a transaction when entering the outermost atomic block;
commits or rolls back the transaction when exiting the outermost block.
The question of which database to use for a given command is determined by your router, not the using clause. So your transaction.atomic(using='db2') block is pointless (it will simply open a transaction on db2 and then close it), but not an error.

TransactionManagementError While Executing Unittest in Django Rest Framework

I have written a test which checks whether IntegrityError was raised in case of duplicate records in the database. To create that scenario I am issue a REST API twice. The code looks like this:
class TestPost(APITestCase):
#classmethod
def setUpClass(cls):
super().setUpClass()
common.add_users()
def tearDown(self):
super().tearDown()
self.client.logout()
def test_duplicate_record(self):
# first time
response = self.client.post('/api/v1/trees/', dict(alias="some name", path="some path"))
# same request second time
response = self.client.post('/api/v1/trees/', dict(alias="some name", path="some path"))
self.assertEqual(response.status_code, status.HTTP_400_BAD_RREQUEST)
But I get an error stack like this
"An error occurred in the current transaction. You can't "
django.db.transaction.TransactionManagementError: An error occurred in the current transaction. You can't execute queries until the end of the 'atomic' block.
How can I avoid this error this is certainly undesirable.
I've come across this issue today, and it took me a while to see the bigger picture, and fix it properly. It's true that removing self.client.logout() fixes the issue, and is probably not needed there, but the problem lies in your view.
Tests which are subclasses of TestCase wrap your test cases and tests in db transactions (atomic blocks), and your view somehow breaks that transaction. For me it was swallowing an IntegrityError exception. At that point, the transaction was already broken, but Django didn't know about it, so it couldn't correctly perform a rollback. Any query that would then be executed would cause a TransactionManagementError.
The fix for me was to correctly wrap the view code in yet another atomic block:
try:
with transaction.atomic(): # savepoint which will be rolled back
counter.save() # the operation which is expected to throw an IntegrityError
except IntegrityError:
pass # swallow the exception without breaking the transaction
This may not be a problem for you in production, if you're not using ATOMIC_REQUESTS, but I still think it's the correct solution.
Try removing self.client.logout() from the tearDown method. Django rolls back the transaction at the end of each test. You shouldn't have to log out manually.

Commit manually in Django data migration

I'd like to write a data migration where I modify all rows in a big table in smaller batches in order to avoid locking issues. However, I can't figure out how to commit manually in a Django migration. Everytime I try to run commit I get:
TransactionManagementError: This is forbidden when an 'atomic' block is active.
AFAICT, the database schema editor always wraps Postgres migrations in an atomic block.
Is there a sane way to break out of the transaction from within the migration?
My migration looks like this:
def modify_data(apps, schema_editor):
counter = 0
BigData = apps.get_model("app", "BigData")
for row in BigData.objects.iterator():
# Modify row [...]
row.save()
# Commit every 1000 rows
counter += 1
if counter % 1000 == 0:
transaction.commit()
transaction.commit()
class Migration(migrations.Migration):
operations = [
migrations.RunPython(modify_data),
]
I'm using Django 1.7 and Postgres 9.3. This used to work with South and older versions of Django.
The best workaround I found is manually exiting the atomic scope before running the data migration:
def modify_data(apps, schema_editor):
schema_editor.atomic.__exit__(None, None, None)
# [...]
In contrast to resetting connection.in_atomic_block manually this allows using atomic context manager inside the migration. There doesn't seem to be a much saner way.
One can contain the (admittedly messy) transaction break out logic in a decorator to be used with the RunPython operation:
def non_atomic_migration(func):
"""
Close a transaction from within code that is marked atomic. This is
required to break out of a transaction scope that is automatically wrapped
around each migration by the schema editor. This should only be used when
committing manually inside a data migration. Note that it doesn't re-enter
the atomic block afterwards.
"""
#wraps(func)
def wrapper(apps, schema_editor):
if schema_editor.connection.in_atomic_block:
schema_editor.atomic.__exit__(None, None, None)
return func(apps, schema_editor)
return wrapper
Update
Django 1.10 will support non-atomic migrations.
From the documentation about RunPython:
By default, RunPython will run its contents inside a transaction on databases that do not support DDL transactions (for example, MySQL and Oracle). This should be safe, but may cause a crash if you attempt to use the schema_editor provided on these backends; in this case, pass atomic=False to the RunPython operation.
So, instead of what you've got:
class Migration(migrations.Migration):
operations = [
migrations.RunPython(modify_data, atomic=False),
]
For others coming across this. You can have both data (RunPython), in the same migration. Just make sure all the alter tables goes first. You cannot do the RunPython before any ALTER TABLE.
First you need to set Migration.atomic = False
class Migration(migrations.Migration):
atomic = False
Then in your function you can wrap certain block of code inside of transaction.atomic() to make only that block atomic
from django.db import transaction
for row in rows:
with transaction.atomic():
do_something(row)
# Changes made by `do_something` will be committed by this point
Here's the relevant documentation: https://docs.djangoproject.com/en/4.1/howto/writing-migrations/#non-atomic-migrations
Gotcha: migrations.RunPython(forwards_func, atomic=False) does NOT do what you want. It prevents django from manually putting your migration code inside a transaction, which it doesn't do for Postgresql anyway. This atomic=False option is meant for DBs that don't support DDL transaction, as stated in their documentation: https://docs.djangoproject.com/en/4.1/ref/migration-operations/#runpython
By default, RunPython will run its contents inside a transaction on databases that do not support DDL transactions (for example, MySQL and Oracle). This should be safe, but may cause a crash if you attempt to use the schema_editor provided on these backends; in this case, pass atomic=False to the RunPython operation.
On databases that do support DDL transactions (SQLite and PostgreSQL), RunPython operations do not have any transactions automatically added besides the transactions created for each migration.