Django 1.6 introduces @transaction.atomic as part of the overhaul of transaction management from 1.5.
I have a function that is called by a Django management command, which is in turn called by cron, i.e. no HTTP request triggers the transaction in this case. Snippet:
from django.db import transaction

@transaction.commit_on_success
def my_function():
    ...  # code here
In the above code block, commit_on_success uses a single transaction for all the work done in my_function.
Does replacing @transaction.commit_on_success with @transaction.atomic result in identical behaviour? The @transaction.atomic docs state:
Atomicity is the defining property of database transactions. atomic allows us to create a block of code within which the atomicity on the database is guaranteed. If the block of code is successfully completed, the changes are committed to the database. If there is an exception, the changes are rolled back.
I take it that they result in the same behaviour; correct?
Based on the documentation I have read on the subject, there is a significant difference when these decorators are nested.
Nesting two atomic blocks does not work the same as nesting two commit_on_success blocks.
The problem is that there are two guarantees that you would like to have from these blocks.
You would like the content of the block to be atomic: either everything inside the block is committed, or nothing is.
You would like durability: once you have left the block without an exception, you are guaranteed that everything you wrote inside the block is persistent.
It is impossible to provide both guarantees when blocks are nested. If an exception is raised after leaving the innermost block but before leaving the outermost block, you will have to fail in one of two ways:
Fail to provide durability for the innermost block.
Fail to provide atomicity for the outermost block.
Here is where you find the difference. Using commit_on_success would give durability for the innermost block, but no atomicity for the outermost block. Using atomic would give atomicity for the outermost block, but no durability for the innermost block.
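To make the difference concrete, here is a minimal sketch of the Django 1.6 behaviour, assuming a placeholder Order model:

from django.db import transaction

@transaction.atomic                 # outer block: opens a real transaction
def outer():
    Order.objects.create(ref="A")
    inner()                         # the inner block exits successfully...
    raise RuntimeError("boom")      # ...but this rolls back everything,
                                    # including what inner() wrote

@transaction.atomic                 # inner block: becomes a savepoint here
def inner():
    Order.objects.create(ref="B")

With the old commit_on_success, the inner block would have committed when it exited (durability for the inner block), so the later failure could no longer undo that work and the outer block would lose its atomicity, which is exactly the trade-off described above.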
Simply raising an exception in case of nesting would prevent you from running into the problem: the innermost block would always raise an exception, so it never promises any durability. But this loses some flexibility.
A better solution would be to have more granularity about what you are asking for. If you can separately ask for atomicity and durability, then you can perform nesting. You just have to ensure that every block requesting durability is outside those requesting atomicity. Requesting durability inside a block requesting atomicity would have to raise an exception.
atomic is supposed to provide the atomicity part. As far as I can tell, Django 1.6.1 does not have a decorator that can ask for durability. I tried to write one and posted it on Code Review.
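For illustration, here is a rough sketch of what such a durability decorator could look like; this is only an assumption about one possible approach (built on the connection's in_atomic_block flag), not the implementation posted on Code Review:

from functools import wraps
from django.db import transaction

def durable(func):
    """Hypothetical sketch: refuse to run inside an existing atomic block,
    so that returning without an exception really means the work is committed."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        if transaction.get_connection().in_atomic_block:
            raise RuntimeError("durable block must not be nested in an atomic block")
        with transaction.atomic():
            return func(*args, **kwargs)
    return wrapper

Later Django versions (3.2+) added a built-in version of this check as atomic(durable=True), which a later answer below covers.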
Yes. You should use atomic in the places where you previously used commit_on_success.
Since the new transaction system is designed to be more robust and consistent, though, it's possible that you could see different behavior. For example, if you catch database errors and try to continue, you will see a TransactionManagementError, whereas the previous behavior was undefined and probably case-dependent.
But, if you're doing things properly, everything should continue to work the same way.
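A minimal before/after sketch of that swap (the function body is just a placeholder):

from django.db import transaction

# Django 1.5 style, as in the question:
#
#     @transaction.commit_on_success
#     def my_function():
#         ...
#
# Django 1.6 and later:

@transaction.atomic
def my_function():
    ...  # committed on success, rolled back on an exception

If a query fails inside this block and you catch the database error but keep issuing queries in the same block, 1.6 raises TransactionManagementError, which is the changed behavior mentioned above.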
Related
I can use inner atomic blocks as savepoints and catch errors from them. This way, we proceed within the outer atomic scope, rolling back only the inner atomic block, as explained in this question. But there is this savepoint=False argument that I don't see any use case for.
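The pattern I mean looks roughly like this (MyModel is just a placeholder):

from django.db import IntegrityError, transaction

def create_objects():
    with transaction.atomic():                    # outer transaction
        MyModel.objects.create(slug="a")
        try:
            with transaction.atomic():            # inner block: a savepoint by default
                MyModel.objects.create(slug="a")  # violates a unique constraint
        except IntegrityError:
            pass                                  # only the inner savepoint is rolled back
        MyModel.objects.create(slug="b")          # the outer work still commits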
In the docs:
You can disable the creation of savepoints for inner blocks by setting the savepoint argument to False. If an exception occurs, Django will perform the rollback when exiting the first parent block with a savepoint if there is one, and the outermost block otherwise. Atomicity is still guaranteed by the outer transaction. This option should only be used if the overhead of savepoints is noticeable. It has the drawback of breaking the error handling described above.
If I understand correctly, it just changes things so that even if I catch an error from the inner block, the outer scope will still roll back. Is that correct?
A Django atomic transaction has the durable and savepoint arguments. See docs.
durable=True ensures the atomic block is the outermost atomic block. Per the docs:
It is sometimes useful to ensure an atomic block is always the outermost atomic block, ensuring that any database changes are committed when the block is exited without errors. This is known as durability and can be achieved by setting durable=True
A PostgreSQL SAVEPOINT establishes a new savepoint within the current transaction. Sounds like this is only needed if the atomic block is nested.
I have the following questions:
If durable=True then savepoint should ALWAYS be False, right? Because there's no point in using a savepoint if the atomic block is the outermost atomic block.
If durable=True, should Django set savepoint=False for us? Reading through the source code, it doesn't appear to do it for us but I feel like it should.
If transaction.atomic(durable=True) is nested in another transaction.atomic, a RuntimeError will be raised. I don't think you can really have a situation where you could set both savepoint=False and durable=True (other than the outermost transaction.atomic, in which case it won't be a savepoint in the first place).
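A minimal sketch of how the two arguments interact (sync_accounts is a made-up name):

from django.db import transaction

def sync_accounts():
    with transaction.atomic(durable=True):         # must be the outermost block
        with transaction.atomic():                 # nested: uses a savepoint
            ...
        with transaction.atomic(savepoint=False):  # nested: no savepoint created
            ...

# Calling sync_accounts() from inside another atomic block raises RuntimeError,
# because durable=True refuses to run as a nested block:
#
#     with transaction.atomic():
#         sync_accounts()   # RuntimeError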
This blog post might be helpful.
I had to use transaction.on_commit() for synchronous behaviour in one of the signals of my project. Though it works fine, I couldn't understand how transaction.on_commit() decides which transaction to use. I mean, there can be multiple transactions at the same time, so how does Django know which transaction to use with transaction.on_commit()?
According to the docs
You can also wrap your function in a lambda:
transaction.on_commit(lambda: some_celery_task.delay('arg1'))
The function you pass in will be called immediately after a hypothetical database write made where on_commit() is called would be successfully committed.
If you call on_commit() while there isn’t an active transaction, the callback will be executed immediately.
If that hypothetical database write is instead rolled back (typically when an unhandled exception is raised in an atomic() block), your function will be discarded and never called.
If you are using it in a post_save signal with sender=SomeModel, then on_commit is probably executed each time a SomeModel object is saved. Without the actual code we would not be able to tell the exact case.
If I understand the question correctly, I think the docs on Savepoints explains this.
Essentially, you can nest any number of transactions, but on_commit() is only called after the topmost one commits. However, an on_commit() that's nested within a savepoint will only be called if that savepoint was committed and all the ones above it are committed. So, it's tied to whichever one is currently open at the point it's called.
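A small sketch of that ordering, using print callbacks as stand-ins:

from django.db import transaction

with transaction.atomic():                                     # outer transaction
    transaction.on_commit(lambda: print("outer committed"))

    with transaction.atomic():                                 # inner savepoint
        transaction.on_commit(lambda: print("inner committed"))

    # Nothing has printed yet: both callbacks wait for the outer commit.

# Once the outer block commits, both callbacks run. If the inner savepoint
# had been rolled back (e.g. an exception caught around the inner block),
# only the "outer committed" callback would run.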
Is it possible to implement transactions in Graph Engine?
I would like to do multiple updates on different cells and then commit or roll back these changes.
Even with one cell it is difficult. When I use the following code, the modification is not written to disk, but the in-memory copy is changed!
using (Character_Accessor characterAccessor = Global.LocalStorage.UseCharacter(cellId, CellAccessOptions.StrongLogAhead))
{
    characterAccessor.Name = "Modified";
    throw new Exception("Test exception");
}
My understanding is: regardless of whether you throw this exception or not, the changes live only in memory until you explicitly call Global.LocalStorage.SaveStorage().
You could implement your transaction by saving the storage before you start the transaction, then making the changes, and, in case you want to roll back, just calling Global.LocalStorage.ResetStorage().
All this, of course, only works if you do not need high throughput and you access the database from a single thread.
The write-ahead log is only flushed to the disk at the end of the "using" scope -- when the accessor is being disposed and the lock in the memory storage is about to be released.
This is like a mini-transaction on a single cell. Others cannot access the cell while you hold the lock. You can make multiple changes to the cell and "commit" them at the end, or make a shadow copy at the beginning of the using scope and roll back to that copy later if anything goes wrong (this is still a manual process, though).
Also, please check this out: https://github.com/Microsoft/GraphEngine/tree/multi_cell_lock
We are working on enabling one thread to hold multiple locks. This will make multi-entity transactions much easier to implement.
So imagine you've got an exception you're catching, and in the catch you write to a log file that some exception occurred. Then you want your program to continue, so you have to make sure that certain invariants are still in a good state. However, what actually happens in the system after the exception has been "handled" by a catch?
The stack has been unwound at that point, so how does it restore its state?
"Stack unwinding" means that all scopes between throw and the matching catch clause are left, calling destructors for all automatic objects in those scopes, pretty much in the same way function scopes are left when you return from a function.
Nothing else "special" is done, the scope of a catch clause is a normal scope, and leaving it is no different from leaving the scope of an else clause.
If you need to make sure certain invariants still hold, you need to program the code changing them in an exception-safe manner. Dave Abrahams wrote a classic on the different levels of exception safety; you might want to read that. Basically, you will have to consistently employ RAII in order to be on the safe side when exceptions are thrown.
Only objects created inside the try will have been destroyed during unwinding. It's up to you to write the program in such a way that, if an exception occurs, the program state stays consistent; that's called exception safety.
C++ doesn't care: it unwinds the stack, passes control to the appropriate catch, and then control flow continues normally.
It is up to you to ensure that the application is recovered into a stable state after catching the exception. Usually it is achieved by "forgetting" whatever operation or change(s) produced the exception, and starting afresh on a higher level.
This includes ensuring that any resources allocated during the chain of events leading to the exception gets properly released. In C++, the standard idiom to ensure this is RAII.
Update
For example, if an error occurs while processing a request in a web server, it generates an exception in some lower-level function, which gets caught in a higher-level class (possibly right in the top-level request handler). Usually the best thing to do is to roll back any changes made and free any resources allocated so far for the actual request, and return an appropriate error message to the client. Changes may include DB transactions, file writes, etc.; one must implement all these in an exception-safe manner. Databases typically have built-in transactions to deal with this; with other resources it may be trickier.
This is up to the application. There are several levels of exception-safety. The level you describe is hard to achieve for the whole application.
Certain pieces of code, however, can be made 'failure transparent' by using techniques like RAII and by smartly ordering the sequence of actions. Imagine a piece of code querying several URLs for data, for instance: when one URL 'throws', the rest of the URLs can still be handled, or the failing one can be retried...
If you have exception handling in every function, you can resume at the next higher level, but it's rather complicated. In fact, I use exceptions mainly to detect errors as close to the source as possible, but I don't use them for resuming execution.
If, on the other hand, there are errors that are predictable, one can devise schemes to handle them. But for me, exceptions are meant to be exceptional, so I tend to exit gracefully instead, with a good hint in the log file about where it happened. JM2CW
It can't. Exceptions aren't resumable in C++. Nor in most modern languages; some of the first languages to support exceptions did support resumable exceptions, and found that it wasn't a good idea.
If you want to be able to resume from some specific point, you have to put your try/catch block there. If you just want to log and continue, don't throw the exception in the first place.