Does a durable Django atomic transaction imply we don't need savepoints?

Django's transaction.atomic has durable and savepoint arguments (see the docs).
durable=True ensures the atomic block is the outermost atomic block. Per the docs:
It is sometimes useful to ensure an atomic block is always the outermost atomic block, ensuring that any database changes are committed when the block is exited without errors. This is known as durability and can be achieved by setting durable=True.
A PostgreSQL SAVEPOINT establishes a new savepoint within the current transaction, which sounds like it is only needed when the atomic block is nested.
I have the following questions:
If durable=True, then savepoint should always be False, right? There is no point in creating a savepoint if the atomic block is the outermost one.
If durable=True, should Django set savepoint=False for us? Reading through the source code, it doesn't appear to do so, but I feel like it should.

If transaction.atomic(durable=True) is nested inside another transaction.atomic, a RuntimeError is raised. So there is no real situation where savepoint=False and durable=True could both take effect: a durable block must be the outermost one, and the outermost block never uses a savepoint in the first place.
This blog post might be helpful.

Related

When to use savepoint = False in django transactions?

I can use inner atomic blocks as savepoints and catch errors raised inside them. This way we can proceed within the outer atomic scope, rolling back only the inner atomic block, as explained in this question. But there is this savepoint=False argument that I don't see any use case for.
In the docs:
You can disable the creation of savepoints for inner blocks by setting the savepoint argument to False. If an exception occurs, Django will perform the rollback when exiting the first parent block with a savepoint if there is one, and the outermost block otherwise. Atomicity is still guaranteed by the outer transaction. This option should only be used if the overhead of savepoints is noticeable. It has the drawback of breaking the error handling described above.
If I understand correctly, it just changes the behavior so that even if you catch an error from the inner block, the outer scope will still roll back. Is that correct?
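The savepoint mechanics Django relies on for inner atomic blocks can be seen directly with raw SQLite (a sketch of the underlying SQL, not Django's actual code):

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # manual transaction control
conn.execute("CREATE TABLE t (x INTEGER)")

conn.execute("BEGIN")                        # outer atomic block
conn.execute("INSERT INTO t VALUES (1)")

conn.execute("SAVEPOINT inner")              # inner atomic block (savepoint=True)
conn.execute("INSERT INTO t VALUES (2)")
conn.execute("ROLLBACK TO SAVEPOINT inner")  # inner block failed; roll back to savepoint
conn.execute("COMMIT")                       # outer block exits cleanly

rows = conn.execute("SELECT x FROM t").fetchall()  # only the outer write survives
```

With savepoint=False no SAVEPOINT statement is ever issued, so there is nothing to roll back to: the exception propagates, and the whole transaction rolls back at the first enclosing block that does have a savepoint, or at the outermost block otherwise.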

How does django decide which transaction to choose on transaction.on_commit()

I had to use transaction.on_commit() for synchronous behaviour in one of the signals of my project. Though it works fine, I couldn't understand how transaction.on_commit() decides which transaction to use. There can be multiple transactions at the same time, so how does Django know which transaction on_commit() is attached to?
According to the docs
You can also wrap your function in a lambda:
transaction.on_commit(lambda: some_celery_task.delay('arg1'))
The function you pass in will be called immediately after a hypothetical database write made where on_commit() is called would be successfully committed.
If you call on_commit() while there isn’t an active transaction, the callback will be executed immediately.
If that hypothetical database write is instead rolled back (typically when an unhandled exception is raised in an atomic() block), your function will be discarded and never called.
If you are using it in a post_save handler with sender=SomeModel, then on_commit() is probably registered each time a SomeModel object is saved. Without seeing the actual code, we cannot tell the exact case.
If I understand the question correctly, I think the docs on Savepoints explain this.
Essentially, you can nest any number of transactions, but on_commit() is only called after the topmost one commits. An on_commit() registered inside a savepoint will only be called if that savepoint and every enclosing block commit. So it is tied to whichever atomic block is open at the point where it is called.
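A toy model of that bookkeeping (this is an illustration of the idea, not Django's actual implementation): each callback is tagged with the set of atomic blocks open when it was registered, and a rolled-back block discards the callbacks tagged with it.

```python
class FakeConnection:
    """Toy model: on_commit callbacks are tied to whichever atomic
    blocks are open at the moment they are registered."""

    def __init__(self):
        self.open_blocks = []    # stack of currently open atomic blocks
        self.run_on_commit = []  # list of (open-block-ids, callback)

    def atomic_enter(self):
        self.open_blocks.append(object())

    def atomic_exit(self, success):
        block = self.open_blocks.pop()
        if not success:
            # discard callbacks registered inside the rolled-back block
            self.run_on_commit = [(ids, f) for ids, f in self.run_on_commit
                                  if block not in ids]
        elif not self.open_blocks:
            # outermost block committed: fire everything that survived
            for _, f in self.run_on_commit:
                f()
            self.run_on_commit = []

    def on_commit(self, func):
        if not self.open_blocks:
            func()  # no active transaction: run immediately
        else:
            self.run_on_commit.append((set(self.open_blocks), func))


conn = FakeConnection()
calls = []
conn.atomic_enter()                            # outer atomic block
conn.atomic_enter()                            # inner atomic block (savepoint)
conn.on_commit(lambda: calls.append("inner"))
conn.atomic_exit(success=False)                # inner block rolled back
conn.on_commit(lambda: calls.append("outer"))
conn.atomic_exit(success=True)                 # outermost commit fires callbacks
```

Here only "outer" ends up in calls: the callback registered inside the rolled-back inner block is discarded, which matches the documented behaviour.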

sqlite3_step(stmt) inside a transaction fails with error 5 without my busy-handler being called

[sqlite version 3.28.0 (2019-04-16)]
I am using sqlite in a multi-threaded application.
Sqlite threading-mode is configured for multi-threading (compilation flag: SQLITE_THREADSAFE=2).
Accordingly, each thread uses its own database connection.
In addition, since multiple connections can access the database simultaneously, in order to avoid 'database is locked' (error 5) I install a busy-handler callback on each connection with sqlite3_busy_handler() right after the connection is created.
Generally speaking, all works well. However, I found that in the following scenario my busy-handler is not called:
The code begins a transaction (BEGIN TRANSACTION),
the code works with an sqlite statement (sqlite3_prepare_v2(), sqlite3_bind_xxx()),
and eventually, when calling sqlite3_step(), I receive 'database is locked' (error 5) without my busy-handler ever being called.
I know that in sqlite documentation it says:
The presence of a busy handler does not guarantee that it will be invoked when there is lock contention. If SQLite determines that invoking the busy handler could result in a deadlock, it will go ahead and return SQLITE_BUSY.
However, mine is a simple stand-alone application that does not share the database with any other application. For this reason, I cannot see why my busy-handler callback would not get called.
My question:
Is there a way to configure sqlite to always call my busy-handler callback?
...after consulting with SQLite Forum...
The solution:
Replace the command BEGIN TRANSACTION with BEGIN IMMEDIATE TRANSACTION.
More
In sqlite there are three types of transactions:
Deferred (the default)
Immediate,
Exclusive
If the transaction-type is not explicitly specified in the BEGIN TRANSACTION statement then the default DEFERRED transaction-type is selected.
In a DEFERRED transaction, sqlite will not try to acquire any locks until it encounters the first command that requires one.
So if the transaction contained one READ command and one WRITE command, then sqlite will do:
BEGIN TRANSACTION // No locks acquired
SELECT... // Try to acquire READ lock
INSERT... // Try to acquire WRITE lock
In a multi-threaded environment, if the above transaction is executed simultaneously by two (or more) threads, a deadlock can occur:
Thread-A acquires READ lock,
Thread-B acquires READ lock,
Thread-A tries to acquire a WRITE lock, which can be acquired only once all READ and WRITE locks are released,
Thread-B tries to acquire a WRITE lock...
This condition is a deadlock, and as the sqlite documentation explains, if a potential deadlock is identified, the busy-handler will not be called, since it cannot resolve this condition.
The above describes how DEFERRED transactions work.
In the other two cases, IMMEDIATE and EXCLUSIVE, the WRITE lock is acquired right at the beginning of the transaction, so there is no risk of this deadlock:
BEGIN IMMEDIATE TRANSACTION // Try to acquire the WRITE lock
SELECT... // ...no locks activity...
INSERT... // ...no locks activity...
Since my code uses sqlite in Multi-Thread threading mode, I had to replace my BEGIN TRANSACTION command with BEGIN IMMEDIATE TRANSACTION.
This solved the problem.
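The difference is easy to reproduce from Python's sqlite3 module (illustrative only; the question uses the C API, but the locking semantics are the same). With BEGIN IMMEDIATE, the second connection is refused the write lock up front, where the busy handler (here, the module's timeout) can safely retry, instead of deadlocking mid-transaction:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")
a = sqlite3.connect(path, isolation_level=None, timeout=0.1)  # manual BEGIN/COMMIT
b = sqlite3.connect(path, isolation_level=None, timeout=0.1)
a.execute("CREATE TABLE t (x INTEGER)")

a.execute("BEGIN IMMEDIATE")          # connection A takes the write lock up front
locked = False
try:
    b.execute("BEGIN IMMEDIATE")      # connection B is refused instead of deadlocking
except sqlite3.OperationalError:
    locked = True                     # "database is locked" (SQLITE_BUSY)
a.execute("COMMIT")

b.execute("BEGIN IMMEDIATE")          # the lock is free again
b.execute("COMMIT")
```

Connection B's retry loop (the 0.1-second timeout) runs and then gives up cleanly; with DEFERRED transactions that have already taken read locks, SQLite would instead return SQLITE_BUSY immediately without consulting the busy handler at all.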

Transactions in Graph Engine

Is it possible to implement transactions in Graph Engine?
I'd like to do multiple updates on different cells and then commit or roll back these changes.
Even with one cell it is difficult. When I use the following code, the modification is not written to disk, but the in-memory copy is changed!
using (Character_Accessor characterAccessor = Global.LocalStorage.UseCharacter(cellId, CellAccessOptions.StrongLogAhead))
{
characterAccessor.Name = "Modified";
throw new Exception("Test exception");
}
My understanding is: regardless of whether you throw this exception, the changes live only in memory until you explicitly call Global.LocalStorage.SaveStorage().
You could implement your transaction by saving the storage before starting the transaction, then making the changes, and, if you want to roll back, calling Global.LocalStorage.ResetStorage().
All this, of course, only works if you do not need high-performance throughput and you access the database from a single thread.
The write-ahead log is only flushed to the disk at the end of the "using" scope -- when the accessor is being disposed and the lock in the memory storage is about to be released.
This is like a mini-transaction on a single cell. Others cannot access the cell while you hold the lock. You can make multiple changes to the cell and "commit" them at the end, or make a shadow copy at the beginning of the using scope and roll back to that copy if anything goes wrong (this is still a manual process, though).
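The shadow-copy idea is a generic pattern, not a Graph Engine API; a minimal sketch, assuming the cell's state can be represented as a plain dict:

```python
import copy

def update_with_rollback(cell, mutate):
    """Shadow-copy pattern: snapshot the cell's state before mutating,
    and restore it if anything goes wrong. (`cell` here is a plain dict
    standing in for a cell accessor; this is not Graph Engine code.)"""
    shadow = copy.deepcopy(cell)
    try:
        mutate(cell)
    except Exception:
        cell.clear()
        cell.update(shadow)   # manual rollback to the shadow copy
        raise

cell = {"Name": "Original"}

def modify(c):
    c["Name"] = "Modified"
    raise RuntimeError("Test exception")

try:
    update_with_rollback(cell, modify)
except RuntimeError:
    pass
# cell["Name"] is back to "Original" after the rollback
```

This recovers the question's scenario manually: the mutation happens in memory, but the exception restores the shadow copy, so the "modified" state never survives.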
Also, please check this out: https://github.com/Microsoft/GraphEngine/tree/multi_cell_lock
We are working on enabling one thread to hold multiple locks. This will make multi-entity transactions much easier to implement.

Is "transaction.atomic" same as "transaction.commit_on_success"?

Django 1.6 introduces @transaction.atomic as part of the overhaul of transaction management from 1.5.
I have a function which is called by a Django management command which is in turn called by cron, i.e. no HTTP request triggering transactions in this case. Snippet:
from django.db import transaction
@transaction.commit_on_success
def my_function():
# code here
In the above code block, commit_on_success uses a single transaction for all the work done in my_function.
Does replacing @transaction.commit_on_success with @transaction.atomic result in identical behaviour? The @transaction.atomic docs state:
Atomicity is the defining property of database transactions. atomic allows us to create a block of code within which the atomicity on the database is guaranteed. If the block of code is successfully completed, the changes are committed to the database. If there is an exception, the changes are rolled back.
I take it that they result in the same behaviour; correct?
Based on the documentation I have read on the subject, there is a significant difference when these decorators are nested.
Nesting two atomic blocks does not work the same as nesting two commit_on_success blocks.
The problem is that there are two guarantees that you would like to have from these blocks.
You would like the content of the block to be atomic, either everything inside the block is committed, or nothing is committed.
You would like durability, once you have left the block without an exception you are guaranteed, that everything you wrote inside the block is persistent.
It is impossible to provide both guarantees when blocks are nested. If an exception is raised after leaving the innermost block but before leaving the outermost block, you will have to fail in one of two ways:
Fail to provide durability for the innermost block.
Fail to provide atomicity for the outermost block.
Here is where you find the difference. Using commit_on_success would give durability for the innermost block, but no atomicity for the outermost block. Using atomic would give atomicity for the outermost block, but no durability for the innermost block.
Simply raising an exception whenever blocks are nested would prevent you from running into the problem: the innermost block would always raise, so it never promises any durability. But this loses some flexibility.
A better solution would be to have more granularity about what you are asking for. If you can separately ask for atomicity and durability, then you can perform nesting. You just have to ensure that every block requesting durability is outside those requesting atomicity. Requesting durability inside a block requesting atomicity would have to raise an exception.
atomic is supposed to provide the atomicity part. As far as I can tell, Django 1.6.1 does not have a decorator that can request durability. I tried to write one and posted it on codereview.
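The trade-off can be made concrete with a toy transaction manager (an illustration of the semantics, not Django code): every block buffers its writes, and nothing reaches the database until the outermost block exits successfully, so an inner block gets atomicity but no durability.

```python
class ToyAtomic:
    """Toy model of nested `atomic` semantics. Each block buffers its
    writes; an inner block's writes are merged into the enclosing block
    on success and only committed when the outermost block exits."""

    def __init__(self, db):
        self.db = db       # committed state (a plain dict)
        self.stack = []    # one pending-writes dict per open block

    def __enter__(self):
        self.stack.append({})
        return self

    def set(self, key, value):
        self.stack[-1][key] = value

    def __exit__(self, exc_type, exc, tb):
        pending = self.stack.pop()
        if exc_type is not None:
            return False                    # discard this block's writes
        if self.stack:
            self.stack[-1].update(pending)  # merge into the enclosing block
        else:
            self.db.update(pending)         # outermost block: commit for real
        return False


db = {}
tx = ToyAtomic(db)
try:
    with tx:
        tx.set("a", 1)
        with tx:
            tx.set("b", 2)   # inner block exits "successfully"...
        raise RuntimeError   # ...but the outer block then fails
except RuntimeError:
    pass
# db is still empty: the inner block's writes were never durable
```

A commit_on_success-style manager would instead write the inner block's pending changes to db as soon as the inner block exits, giving the inner block durability at the cost of the outer block's atomicity; the two guarantees genuinely cannot coexist under nesting.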
Yes. You should use atomic in the places where you previously used commit_on_success.
Since the new transaction system is designed to be more robust and consistent, though, it's possible that you could see different behavior. For example, if you catch database errors and try to continue on, you will see a TransactionManagementError, whereas the previous behavior was undefined and probably case-dependent.
But, if you're doing things properly, everything should continue to work the same way.