We work with DB2 for z/OS, effectively version 8 (more or less, so no CUR_COMMIT).
In our application (Java-based, though this should not be relevant) there is a method that runs in a transaction and deletes multiple records from a table, call it MY_TABLE, based on the value of a certain column, call it SPECIAL_COLUMN, by executing the statement
DELETE FROM MY_TABLE WHERE SPECIAL_COLUMN=?
Apart from this statement, some other SQL statements are executed in the same transaction; I omit them because for the moment I think they are not relevant to the problem described here.
When the method is run concurrently, we sometimes see the exception
nested exception is com.ibm.db2.jcc.am.SqlException:
UNSUCCESSFUL EXECUTION CAUSED BY DEADLOCK OR TIMEOUT. REASON CODE 00C90088, TYPE OF RESOURCE 00000302, AND RESOURCE NAME ... SQLCODE=-913, SQLSTATE=57033, DRIVER=3.63.131
thrown during the execution of the DELETE FROM MY_TABLE WHERE SPECIAL_COLUMN=? statement. According to http://www.idug.org/p/fo/et/thread=20542 this seems to be related to locks placed on "pages".
My questions are the following:
Can two DELETE statements executed concurrently for the same value of SPECIAL_COLUMN (a value to which multiple rows correspond) in fact cause such a deadlock? The scenario I have in mind is something like this: the first statement "puts a lock" on the 1st page, the second statement "puts a lock" on the 2nd page, and then the first statement waits for the lock on the 2nd page while the second waits for the lock on the 1st page (a classic lock-ordering scenario; see the small sketch after these questions).
Or is the placing of such locks "atomic", meaning that once the first statement has started to place locks, the second will wait?
The same question for two DELETEs with different values of SPECIAL_COLUMN (which seems more likely).
If such scenarios are possible and might be the reason for the observed deadlock (otherwise we will have to examine the so far "unsuspicious" SQL), what would be a reasonable solution? I have thought about synchronizing the Java code, but I don't think that is a good idea; I have also thought about issuing a SELECT ... FOR UPDATE on the rows to be deleted before doing the delete, but since additional locks would be involved, I am quite in doubt about that as well.
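The first scenario above is just the classic lock-ordering deadlock. As a pure analogy (plain C++ mutexes standing in for DB2 page locks; nothing DB2-specific is modelled here), this is the shape of it:

#include <mutex>
#include <thread>

// Two "pages", each protected by its own lock.
std::mutex page1, page2;

// Transaction A locks page 1 first, then page 2.
void txn_a() {
    std::lock_guard<std::mutex> l1(page1);
    std::lock_guard<std::mutex> l2(page2);   // blocks if txn_b already holds page2
}

// Transaction B locks the same pages in the opposite order.
void txn_b() {
    std::lock_guard<std::mutex> l1(page2);
    std::lock_guard<std::mutex> l2(page1);   // blocks if txn_a already holds page1
}

int main() {
    std::thread a(txn_a), b(txn_b);
    a.join();                                // may never return: each thread can
    b.join();                                // end up waiting for the other's lock
}

Whether DB2 actually acquires page locks in different orders for two such DELETEs is exactly what the question asks; the sketch only illustrates the suspected pattern.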
EDIT:
Link to a report on a similar problem: http://www.dbforums.com/showthread.php?575408-db2-OS390-TABLE-LOCK-DURING-DELETE
I had to use transaction.on_commit() for synchronous behaviour in one of the signals of my project. Though it works fine, I couldn't understand how transaction.on_commit() decides which transaction to use. I mean, there can be multiple transactions open at the same time; how does Django know which transaction to take when using transaction.on_commit()?
According to the docs
You can also wrap your function in a lambda:
transaction.on_commit(lambda: some_celery_task.delay('arg1'))
The function you pass in will be called immediately after a hypothetical database write made where on_commit() is called would be successfully committed.
If you call on_commit() while there isn’t an active transaction, the callback will be executed immediately.
If that hypothetical database write is instead rolled back (typically when an unhandled exception is raised in an atomic() block), your function will be discarded and never called.
If you are using it in a post_save signal handler with sender=SomeModel, then the on_commit callback is probably registered each time a SomeModel object is saved. Without the actual code we cannot tell the exact case.
If I understand the question correctly, I think the docs on Savepoints explain this.
Essentially, you can nest any number of transactions, but on_commit() is only called after the topmost one commits. However, an on_commit() that is nested within a savepoint will only be called if that savepoint was committed and all the ones above it are committed. So it's tied to whichever transaction is currently open at the point it's called.
I have a function which aims to delete a specific row from an SQLite database by a UID identifier.
The sequence is the following:
1. Create select query to check if the row exists
2. Prepare the query
3. Bind the row UID
4. Step
5. Finalize
If the row exists
{
6. Create delete query
7. Prepare it
8. Bind the UID
9. Step
10. Finalize
11. Finalize
}
As you can see, it first checks whether the row exists, in order to notify the caller if the given UID is wrong, and then it creates a new delete query.
The program works as expected in ~14/15 test cases. In the cases where the program crashes, it crashes on the last finalize invocation (step 11). I've checked all the data and everything seems to be valid.
The question is: what is the expected behaviour of consecutive invocations of the finalize function? I tried placing 5 invocations of finalize one after another, but the behaviour is the same.
Though the documentation doesn't feel the need to state this explicitly, it's fairly obvious that what you're doing is "undefined behaviour" (within the scope of the library).
Much like deleting dynamically allocated memory, you are supposed to finalize once. Not twice, not five times, but once. After you've finalized a prepared statement, it has been "deleted" and no longer exists. Any further operation on that prepared statement constitutes what the documentation calls "a grievous error" (if we presume that a superfluous call to finalize constitutes "use"; and why would we not?).
Fortunately there is no reason ever to want to do this. So, quite simply, don't! If your design is such that you've lost control of your code flow and, at the point of finalize, for some reason have insufficient information about your program's context to know whether the prepared statement has already been finalized, that's fine: much like we do with pointers, you can set it to nullptr so that subsequent calls are no-ops. But if you need to do this, you really should also revisit your design.
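For reference, here is a minimal sketch of the flow from the question against the sqlite3 C API (table and column names are placeholders, not taken from the original code): each prepared statement is finalized exactly once, and the handle is nulled afterwards, so a stray extra sqlite3_finalize(nullptr) is a documented harmless no-op.

#include <sqlite3.h>

// Returns true if a row with the given UID existed and was deleted.
bool delete_by_uid(sqlite3* db, sqlite3_int64 uid)
{
    sqlite3_stmt* stmt = nullptr;

    // Steps 1-5: check whether the row exists.
    if (sqlite3_prepare_v2(db, "SELECT 1 FROM my_table WHERE uid = ?1;", -1,
                           &stmt, nullptr) != SQLITE_OK)
        return false;
    sqlite3_bind_int64(stmt, 1, uid);
    const bool exists = (sqlite3_step(stmt) == SQLITE_ROW);
    sqlite3_finalize(stmt);              // finalize once...
    stmt = nullptr;                      // ...then null the handle

    if (!exists)
        return false;                    // notify the caller: wrong UID

    // Steps 6-10: delete the row with a freshly prepared statement.
    if (sqlite3_prepare_v2(db, "DELETE FROM my_table WHERE uid = ?1;", -1,
                           &stmt, nullptr) != SQLITE_OK)
        return false;
    sqlite3_bind_int64(stmt, 1, uid);
    const bool deleted = (sqlite3_step(stmt) == SQLITE_DONE);
    sqlite3_finalize(stmt);              // again: exactly one finalize per statement
    stmt = nullptr;
    return deleted;
}

There is simply no step 11: once a statement has been finalized, the pointer is dead and must not be passed to the library again.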
Why did it appear to work for you? Pure chance, much like with any other undefined behaviours:
Any use of a prepared statement after it has been finalized can result in undefined and undesirable behavior such as segfaults and heap corruption.
See also: "Why can't I close my car door twice without opening it?" and "Why can't I shave my imaginary beard?"
I would like to backup a running rocksdb-instance to a location on the same disk in a way that is safe, and without interrupting processing during the backup.
I have read:
Rocksdb Backup Instructions
Checkpoints Documentation
Documentation in rocksdb/utilities/{checkpoint.h,backupable_db.{h,cc}}
My question is whether the call to CreateNewBackupWithMetadata is marked as NOT threadsafe to express that two concurrent calls to this function will behave unsafely, or to indicate that ANY concurrent call on the database will be unsafe. I have checked the implementation, which appears to create a checkpoint - which the second article claims is used for online backups of MyRocks - but I am still unsure which part of the call is not threadsafe.
My current interpretation is that it is unsafe because CreateBackup... calls DisableFileDeletions and later EnableFileDeletions, which, if two overlapping calls are made, may of course cause trouble. Since the SST files are immutable, I am not worried about them, but I am unsure whether modifying the WAL through insertions can corrupt the backup. I would assume that triggering a flush on backup prevents this, but I would like to be sure.
Any pointers or help are appreciated.
I ended up looking into the implementation way deeper, and here is what I found:
Recall a rocksdb database consists of Memtables, SSTs and a single WAL, which protects data in the Memtables against crashes.
When you call rocksdb::BackupEngine::CreateNewBackupWithMetadata, no lock is taken internally, so this call can race if two calls are active at the same time. Most notably, it calls Disable/EnableFileDeletions, which, if executed by one call while another is still active, spells doom for the other call.
The process of copying the files from the database to the backup is protected from modifications while the call is active by creating a rocksdb::Checkpoint, which, if flush_before_backup was set to true, will first flush the Memtables, thus clearing the active WAL.
Internally, the call to CreateCustomCheckpoint calls DB::GetLiveFiles in db_filecheckpoint.cc. GetLiveFiles takes the global database lock (_mutex), optionally flushes the Memtables, and retrieves the list of SSTs. If the flush in GetLiveFiles happens while the global database lock is held, the WAL must be empty at that point, which means the list should always contain the SST files representing a complete and consistent database state from the time of the checkpoint. Since the SSTs are immutable, and since file deletion through compaction is turned off by the backup call, you should always get a complete backup without holding up writes on the database. However, this of course means it is not possible to determine the exact last write/sequence number in the backup when concurrent updates happen - at least not without inspecting the backup after it has been created.
For the non-flushing version, there may be WAL files, which are retrieved in a different call than GetLiveFiles, with no lock held in between, i.e. they are not necessarily consistent; but I did not investigate further, since the non-flushing case was not applicable to my use.
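Based on the above, this is a minimal sketch of how one might drive the backup (assuming the BackupEngine interface from rocksdb/utilities/backupable_db.h; the backup path, metadata string and the external mutex are my own placeholders): since CreateNewBackupWithMetadata takes no internal lock, the calls are serialized with a mutex so two backups can never overlap, and flush_before_backup is set to true so the backup does not depend on the WAL.

#include <mutex>
#include <string>
#include <rocksdb/db.h>
#include <rocksdb/utilities/backupable_db.h>

// One engine, opened once; the backup directory is a placeholder.
rocksdb::BackupEngine* OpenEngine() {
    rocksdb::BackupEngine* engine = nullptr;
    rocksdb::Status s = rocksdb::BackupEngine::Open(
        rocksdb::Env::Default(),
        rocksdb::BackupableDBOptions("/path/to/backups"),
        &engine);
    return s.ok() ? engine : nullptr;
}

// Serializes backup calls so Disable/EnableFileDeletions can never interleave.
std::mutex backup_mutex;

rocksdb::Status BackupNow(rocksdb::DB* db,
                          rocksdb::BackupEngine* engine,
                          const std::string& metadata) {
    std::lock_guard<std::mutex> guard(backup_mutex);
    // Flush the Memtables first, so the backup consists of SST files only.
    return engine->CreateNewBackupWithMetadata(db, metadata,
                                               /*flush_before_backup=*/true);
}

Concurrent reads and writes on the database itself can keep running while BackupNow is active; only overlapping backup calls are excluded.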
Say I have two C++ functions foo1() and foo2(), and I want to minimize the likelihood that foo1() starts execution but foo2() is not called due to some external event. I don't mind if neither is called, but foo2() must execute if foo1() was called. Both functions can be called consecutively and do not throw exceptions.
Is there any benefit / drawback to wrapping the functions in an object and calling both in the destructor? Would things change if the application was multi-threaded (say the parent thread crashes)? Are there any other options for ensuring foo2() is called so long as foo1() is called?
I thought having them in a destructor might help with e.g. SIGINT, though I learned SIGINT will stop execution immediately, even in the middle of the destructor.
Edit:
To clarify: both foo1() and foo2() will be abstracted away, so I'm not concerned about someone else calling them in the wrong order. My concern is solely related to crashes, exceptions, or other interruptions during the execution of the application (e.g. someone pressing SIGINT, another thread crashing, etc.).
If another thread crashes (without a relevant signal handler -> the whole application exits), there is not much you can do to guarantee that your application does anything - it's up to what the OS does. And there are ALWAYS cases where the system will kill your app without your knowledge (e.g. a bug that causes "all" memory to be used by your app, and the OS "out of memory killer" kills your process).
The only time your destructor is guaranteed to be executed is if the object is constructed and a C++ exception is thrown. Signals and the like make no such guarantees, and continuing to execute [in the same thread] after, for example, SIGSEGV or SIGBUS is well into the "undefined" parts of the world - there is not much you can do about that, since SEGV typically means "you tried to do something to memory that doesn't exist [or that you can't access in the way you tried, e.g. write to code-memory]", and the processor has aborted the current instruction. Attempting to continue where you were will either lead to the same instruction being executed again, or to the instruction being skipped [if you continue at the next instruction - and I'm ignoring the trouble of determining where that is for now]. And of course, there are situations where it's IMPOSSIBLE to continue even if you wanted to - say, for example, when the stack pointer has been corrupted [restored from memory that was overwritten, etc.].
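To make the "C++ exception" case concrete, here is a minimal sketch of the wrapper idea from the question (foo1/foo2 are just stand-ins): the destructor runs on normal scope exit and during stack unwinding from an exception, but not after _exit, std::abort, std::terminate or a fatal signal.

#include <cstdio>

// Stand-ins for the real foo1()/foo2().
void foo1() { std::puts("foo1"); }
void foo2() { std::puts("foo2"); }

// RAII guard: foo1() runs in the constructor, foo2() in the destructor.
struct FooGuard {
    FooGuard()  { foo1(); }
    ~FooGuard() { foo2(); }
    FooGuard(const FooGuard&) = delete;             // non-copyable: exactly one
    FooGuard& operator=(const FooGuard&) = delete;  // foo2() per foo1()
};

void do_work() {
    FooGuard guard;      // foo1() called here
    // ... work that may throw ...
}                        // foo2() called here, even if an exception propagates out

This narrows the window compared with calling the two functions back to back, but as described above it gives no protection against the process being killed in between.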
In short, don't spend much time trying to come up with something that avoids these sorts of scenarios, because it's unlikely to work. Spend your time on schemes where you don't need to know whether you completed something or not - for example transaction-based or "commit-based" programming (not sure if that's the right term, but basically you do some steps, then "commit" the work done so far, then do some further steps, etc.; only work that has been "committed" is sure to be complete, and uncommitted work is discarded the next time around) - where something is either completely done or completely discarded, depending on whether it completed.
Separating "sensitive" and "not sensitive" parts of your application into separate processes can be another way to achieve some more safety.
Is it possible to implement transactions in Graph Engine?
I would like to do multiple updates on different cells and then commit or roll back these changes.
Even with one cell it is difficult. When I use the following code, the modification is not written to disk, but the memory is changed!
using (Character_Accessor characterAccessor = Global.LocalStorage.UseCharacter(cellId, CellAccessOptions.StrongLogAhead))
{
    characterAccessor.Name = "Modified";
    throw new Exception("Test exception");
}
My understanding is: regardless of whether you throw this exception or not, the changes are only ever in memory - until you explicitly call Global.LocalStorage.SaveStorage().
You could implement your transaction by saving the storage before you start the transaction, then making the changes, and, in case you want to roll back, simply calling Global.LocalStorage.ResetStorage().
All this, of course, only works if you do not need high-performance throughput and you access the database from a single thread.
The write-ahead log is only flushed to the disk at the end of the "using" scope -- when the accessor is being disposed and the lock in the memory storage is about to be released.
This is like a mini-transaction on a single cell. Others cannot access the cell while you hold the lock. You can make multiple changes to the cell and "commit" them at the end -- or make a shadow copy at the beginning of the using scope and then roll back to that copy later if anything goes wrong (this is still a manual process, though).
Also, please check this out: https://github.com/Microsoft/GraphEngine/tree/multi_cell_lock
We are working on enabling one thread to hold multiple locks. This will make multi-entity transactions much easier to implement.