Lately I've been thinking about something I haven't fully figured out. The gist is in the question itself; stated more broadly, it goes as follows:
(Assume these operations are performed concurrently, by scripts on different machines or by different tasks on the same machine.)
Assume we have a bucket called "bucket-one" and an object in it with the key "foo/bar.ext".
One task tries to move "foo/bar.ext" to "foo2/bar.ext" and the other tries to move it to "foo3/bar.ext". Say we use the boto3 S3 client/resource, for example (the choice of SDK probably doesn't affect the outcome).
What happens when you concurrently try to move an object, at the exact same time, from one folder to another within the same bucket?
The outcomes I have in mind are:
Both requests succeed, each moving the same file into a different folder, so that we end up with both "foo2/bar.ext" and "foo3/bar.ext".
Only one of them succeeds, so the object is moved to either "foo2/bar.ext" or "foo3/bar.ext".
Both requests fail; the object is not moved and remains at "foo/bar.ext".
Any of the above may happen, with no way to know the outcome beforehand.
The second question is the same, with one change in timing: not at the exact same time, but very close to it (nearly the same time).
I know the odds are slim, but I am curious what the result would be.
Thanks
The only possible outcome is that you get both destination objects.
S3 doesn't support moving an object to a new key, it only supports making a copy of the object at a new key (whether in the same bucket or a different bucket) and then deleting the original object with a second request.
Deleting an object that is already in the process of being copied or downloaded has no impact on operations that are already in progress on that object.
Additionally, authorized delete operations on recently-deleted objects never fail (this may in fact always be true of delete requests, but this detail isn't important, here) so neither process will be aware that the other process has also just deleted the object when they try, because that operation will succeed.
You don't even need things to occur at the exact same time, in order to end up with two objects.
If the events occur in the order Copy 1, Copy 2, Delete 1, Delete 2, this is still the outcome, no matter how close in time Copy 1 and Copy 2 occur, as long as Delete 1 hasn't prevented Copy 2 from starting... but in fact, delete operations on objects are not themselves instantaneous, so Copy 2 could potentially still work even if it starts a brief time after Delete 1 has already finished.

This is caused by the eventual-consistency behavior that S3 provides for delete and overwrite operations, an optimization that trades consistency for higher-performance PUT and GET (including copy). The amount of time until full consistency is not a fixed value and is often close to zero. There is no exposed interface for determining whether a bucket's index replicas are fully consistent.
See Amazon S3 Data Consistency Model in the Amazon Simple Storage Service Developer Guide.
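The failure mode described above can be illustrated with a toy in-memory model of a bucket (plain dicts, not real S3 calls; with boto3 the two steps would be copy_object followed by delete_object):

```python
# Toy in-memory model of an S3 bucket: key -> object bytes.
# Illustrative only -- it just shows the interleaving of the
# copy and delete requests that make up each "move".
bucket = {"foo/bar.ext": b"contents"}

def copy(src, dst):
    # S3 CopyObject: reads the source, writes the destination.
    bucket[dst] = bucket[src]

def delete(key):
    # S3 DeleteObject: succeeds even if the key is already gone,
    # so neither mover notices the other's delete.
    bucket.pop(key, None)

# Interleaving: Copy 1, Copy 2, Delete 1, Delete 2
copy("foo/bar.ext", "foo2/bar.ext")   # mover 1 copies
copy("foo/bar.ext", "foo3/bar.ext")   # mover 2 copies
delete("foo/bar.ext")                 # mover 1 deletes -- succeeds
delete("foo/bar.ext")                 # mover 2 deletes -- also "succeeds"

print(sorted(bucket))  # ['foo2/bar.ext', 'foo3/bar.ext']
```

Because deleting an already-deleted key still reports success, neither mover ever learns that the other got there first.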
So I was looking for caching solutions for my AWS Lambda functions and found something called 'Simple Caching'. It fits perfectly for what I want, since my data doesn't change frequently. However, one thing I was unable to find is the timeout for this cache. When is the data refreshed by the function, and is there any way I can control it?
An example of the code I am using for the function:
let cachedValue;

module.exports.handler = function(event, context, callback) {
  console.log('Starting Lambda.');
  if (!cachedValue) {
    console.log('Setting cachedValue now...');
    cachedValue = 'Foobar';
  } else {
    console.log('Cached value is already set: ', cachedValue);
  }
  callback(null, cachedValue); // return the (possibly cached) value
};
What you're doing here is taking advantage of a side effect of container reuse. There is no lower or upper bound for how long such values will persist, and no guarantee that they will persist at all. It's a valid optimization to use, but it's entirely outside your control.
Importantly, you need to be aware that this stores the value in one single container. It lives for as long as the Node process in the container is alive, and is accessible whenever a future invocation of the function reuses that process in that container.
If you have two or more invocations of the same function running concurrently, they will not be in the same container, and they will not see each other's global variables. This doesn't make it an invalid technique, but you need to be aware of that fact. The /tmp/ directory will exhibit very similar behavior, which is why you need to clean that up when you use it.
If you throw any exception, the process (and possibly the container) will be destroyed; either way, the cached values will be gone on the next invocation, since there's only one Node process per container.
If you don't invoke the function at all for an undefined/undocumented number of minutes, the container is released by the service, and the cached values go away with it.
Re-deploying the function will also clear this "cache," since a new function version won't reuse containers from older function versions.
It's a perfectly valid strategy as long as you recognize that it is a feature of a black box with no user-serviceable parts.
See also https://aws.amazon.com/blogs/compute/container-reuse-in-lambda/ -- a post that is several years old but still accurate.
When I upload to S3, I understand that I may have to wait a while before the object is downloadable. If I call "doesObjectExist" on an AmazonS3 client and it returns true, can I guarantee that it is downloadable everywhere, and not just from my own machine?
As long as the object never existed before, and as long as you don't do anything to try to check whether it exists prior to uploading it, it is guaranteed immediately available when the upload is complete. You do not have to wait at all after initial object creation as long as you have not tried in any way to access the nonexistent object.
In all other cases -- such as overwrites or cases where you try to read before write -- there is no way to verify with absolute certainty whether it will be subsequently accessible to all requesters, but a check like doesObjectExist gives you a reasonably good indication that the object is accessible. There is nothing special about your machine from one request to any subsequent request. You may or may not be talking to the same system components inside S3 across different requests, even if consecutive.
I would like to backup a running rocksdb-instance to a location on the same disk in a way that is safe, and without interrupting processing during the backup.
I have read:
Rocksdb Backup Instructions
Checkpoints Documentation
Documentation in rocksdb/utilities/{checkpoint.h,backupable_db.{h,cc}}
My question is whether the call to CreateNewBackupWithMetadata is marked as NOT thread-safe to express that two concurrent calls to this function are unsafe, or to indicate that ANY concurrent call on the database is unsafe. I have checked the implementation, which appears to create a checkpoint (which the second article claims is used for online backups of MyRocks), but I am still unsure which part of the call is not thread-safe.
I currently interpret it as unsafe because CreateBackup... calls DisableFileDeletions and later EnableFileDeletions, which, if two calls overlap, may of course cause trouble. Since the SST files are immutable, I am not worried about them, but I am unsure whether modifying the WAL through insertions can corrupt the backup. I would assume that triggering a flush on backup should prevent this, but I would like to be sure.
Any pointers or help are appreciated.
I ended up looking into the implementation way deeper, and here is what I found:
Recall a rocksdb database consists of Memtables, SSTs and a single WAL, which protects data in the Memtables against crashes.
When you call rocksdb::BackupEngine::CreateBackupWithMetadata, no lock is taken internally, so two active calls can race. Most notably, the call does Disable/EnableFileDeletions; if one call re-enables file deletions while another is still active, that spells doom for the other call.
The process of copying the files from the database to the backup is protected from modifications while the call is active by creating a rocksdb::Checkpoint, which, if flush_before_backup was set to true, will first flush the Memtables, thus clearing the active WAL.
Internally, the call to CreateCustomCheckpoint calls DB::GetLiveFiles in db_filecheckpoint.cc. GetLiveFiles takes the global database lock (_mutex), optionally flushes the Memtables, and retrieves the list of SSTs. If a flush in GetLiveFiles happens while holding the global database lock, the WAL must be empty at this time, which means the list should always contain the SST files representing a complete and consistent database state from the time of the checkpoint.

Since the SSTs are immutable, and since file deletion through compaction is turned off by the backup call, you should always get a complete backup without holding up writes on the database. However, this of course means it is not possible to determine the exact last write/sequence number in the backup when concurrent updates happen, at least not without inspecting the backup after it has been created.
For the non-flushing version, there may be WAL files, which are retrieved in a different call than GetLiveFiles, with no lock held in between; i.e., these are not necessarily consistent. I did not investigate further, since the non-flushing case was not applicable to my use.
Is it possible to implement transactions in Graph Engine?
I'd like to do multiple updates on different cells and then commit or roll back these changes.
Even with one cell it is difficult. When I use the following code, the modification is not written to disk, but the memory is changed!
using (Character_Accessor characterAccessor = Global.LocalStorage.UseCharacter(cellId, CellAccessOptions.StrongLogAhead))
{
    characterAccessor.Name = "Modified";
    throw new Exception("Test exception");
}
My understanding is: regardless of whether you throw this exception or not, the changes are only ever in memory, until you explicitly call Global.LocalStorage.SaveStorage().
You could implement your transaction by saving the storage before you start, then making the changes, and, in case you want to roll back, simply calling Global.LocalStorage.ResetStorage().
All this, of course, only works if you do not need high-performance throughput and access the database from a single thread.
The write-ahead log is only flushed to the disk at the end of the "using" scope -- when the accessor is being disposed and the lock in the memory storage is about to be released.
This is like a mini-transaction on a single cell. Others cannot access the cell while you hold the lock. You could make multiple changes to the cell and "commit" them at the end -- or make a shadow copy at the beginning of the using scope and roll back to that copy if anything goes wrong (this is still a manual process, though).
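The shadow-copy rollback described here is language-agnostic; a minimal sketch in Python, with the cell modeled as a plain dict (in Graph Engine the copy and restore would happen inside the using scope of the accessor):

```python
import copy

def update_cell(cell, mutator):
    # Take a shadow copy before mutating; restore it if anything fails.
    shadow = copy.deepcopy(cell)
    try:
        mutator(cell)
    except Exception:
        cell.clear()
        cell.update(shadow)  # roll back to the pre-transaction state
        raise

cell = {"Name": "Original"}
try:
    def bad_mutation(c):
        c["Name"] = "Modified"
        raise RuntimeError("Test exception")
    update_cell(cell, bad_mutation)
except RuntimeError:
    pass

print(cell["Name"])  # Original
```

The deep copy makes the rollback trivially correct but costs memory proportional to the cell size, which is the usual trade-off of shadow copies.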
Also, please check this out: https://github.com/Microsoft/GraphEngine/tree/multi_cell_lock
We are working on enabling one thread to hold multiple locks. This will make multi-entity transactions much easier to implement.
I have written a program (call it X) in C++ which creates a data structure and then uses it continuously.
Now I would like to modify that data structure without aborting the previous program.
I tried two ways to accomplish this task:
In the same program X, I first created the data structure and then created a child process, which starts accessing and using that data structure for some purpose. The parent process continues its execution, asks the user for modifications (insertion, deletion, etc.) from the console, and applies them. The problem: the modifications don't affect the copy of the data structure the child process is using. I later figured out why this can't work: the child process operates on its own copy of the data structure (fork gives the child a separate address space), so modifications made via the parent process are never reflected in it. Since that definitely wasn't what I wanted, I went for multithreading instead.
Instead of creating a child process, I created another thread which accesses and uses the data structure, and tried to take user input from the console in a different thread. Even this didn't work, because of very fast switching between threads.
So please help me solve this issue. I want modifications to be reflected in the original data structure. Also, I don't want the process that is continuously accessing and using it to have to wait, since it is time-critical.
First point: this is not a trivial problem. To handle it at all well, you need to design a system, not just a quick hack or two.
First of all, to support the dynamic changing, you'll almost certainly want to define the data structure in code in something like a DLL or .so, so you can load it dynamically.
Part of how to proceed will depend on whether you're talking about data that's stored strictly in memory, or whether it's more file oriented. In the latter case, some of the decisions will depend a bit on whether the new form of a data structure is larger than the old one (i.e., whether you can upgrade in place or not).
Let's start out simple, and assume you're only dealing with structures in memory. Each data item will be represented as an object. In addition to whatever's needed to access the data, each object will provide locking, and a way to build itself from an object of the previous version of the object (lazily -- i.e., on demand, not just in the ctor).
When you load the DLL/.so defining a new object type, you'll create a collection of those the same size as your current collection of existing objects. Each new object will be in the "lazy" state, where it's initialized, but hasn't really been created from the old object yet.
You'll then kick off a thread that makes the new collection known to the rest of the program, then walks through the collection of new objects: locking an old object, using it to create a new object, then destroying the old object and removing it from the old collection. It'll use a fairly short timeout when it tries to lock an old object (i.e., if an object is in use, it won't wait for it very long, just go on to the next). It'll iterate repeatedly until all the old objects have been updated and the collection of old objects is empty.
For data on disk, things can be just about the same, except your collections of objects provide access to the data on disk. You create two separate files, and copy data from one to the other, converting as needed.
Another possibility (especially if the data can be upgraded in place) is to use a single file, but embed a version number into each record. Read some raw data, check the version number, and use appropriate code to read/write it. If you're reading an old version number, read with the old code, convert to the new format, and write in the new format. If you don't have space to update in place, write the new record to the end of the file, and update the index to indicate the new position.
Your approach to concurrent access is similar to sharing a cake between a classroom full of blindfolded toddlers. It's no surprise that you end up with a sticky mess. Each toddler will either have to wait their turn to dig in or know exactly which part of the cake she alone can touch.
Translating to code, the former means having a lock or mutex that controls access to a data structure so that only one thread can modify it at any time.
The latter can be done by having a data structure that is modified in place by threads that each know exactly which parts of the data structure they can update, e.g. by passing a struct with details on which range to update, effectively splitting up the data beforehand. These should not overlap and iterators should not be invalidated (e.g. by resizing), which may not be possible for a given problem.
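The first approach (a single lock guarding the structure) can be sketched with Python's threading module; in C++ the equivalent would be a std::mutex with std::lock_guard around every access:

```python
import threading

# Shared structure plus the mutex that guards all access to it.
shared = {"items": []}
lock = threading.Lock()  # in C++: std::mutex + std::lock_guard

def reader_worker(results):
    # The long-running task: holds the lock only while reading,
    # so it never observes a half-updated structure.
    with lock:
        results.append(list(shared["items"]))

def modify(value):
    # The console/"parent" side: mutates under the same lock.
    with lock:
        shared["items"].append(value)

modify(1)
modify(2)
results = []
t = threading.Thread(target=reader_worker, args=(results,))
t.start()
t.join()
print(results[0])  # [1, 2]
```

Because threads share one address space, both sides see the same object; the lock only enforces that reads and writes don't interleave mid-update.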
There are many, many algorithms for handling resource contention, so this is grossly simplified. Distributed computing is a significant field of computer science dedicated to these kinds of problems; study the problem (you didn't give details) and don't expect magic.