Basically, if two processes attempt to append to the same key at the same time, is there any chance that one will ever overwrite the other?
e.g.:
Process 1 appends "a" to the key "k"
Process 2 appends "b" to the key "k"
Are we guaranteed to have two characters (either "ab" or "ba") as the value after we perform these actions?
Yes. memcached performs append as a single server-side operation rather than a client-side read-modify-write, so concurrent appends cannot overwrite each other.
Consider the following example:
These are the inputs:
ASED
BTY
ASED->CWD
CWD->DTT
EI->FHK
These are just strings; they have no special meaning, but "->" indicates "propagate as a clone". I want to get DTT's parent according to these entries. Is there a faster solution?
I am not asking about code, only about the method.
It all depends on whether the tokens (ASED, CWD, etc.) are unique or not. Also, do you want to use only the standard C++ library, or are you open to using additional libraries?
Assuming the tokens are unique and you want to use standard C++: there are no tree data structures in the C++ standard library, but in this case you don't need one to solve your problem.
Assuming also that a token can have only one parent, you can reverse each parent -> child expression into child -> parent (in your algorithm, not in the input list). Once done, you can store the child as the key of a map and the parent as the value. You will need a sentinel stop value (e.g. an empty string) to signify that a particular key has no parent.
To extract the parent, fetch the value from the map corresponding to the child; this is the immediate parent. Then fetch that parent's own entry to get its parent, and repeat until you reach the stop value.
As for the complexity of this approach: it depends on which map you select, std::map (O(log n) per lookup) or std::unordered_map (O(1) average per lookup).
This algorithm can be slightly modified to support multiple parents: during traversal you visit all values for the key you are fetching (which can be done with recursion or with an explicit stack), and you use std::multimap to store the data.
The problem is: I have an input of strings, where every line of text is numbered 1, 2, 3, 4, 5... I have to split this input into separate lines. For example, if the input is
"1.Hi john 2. How are you? 3. XXXX 4.TTTTT"
The output will be:
(1)Hi john
(2) How are you?
(3) XXXX
(4)TTTTT
I can't have input for row 7 if rows 5 and 6 aren't already filled.
The input also contains commands, for example:
print line 3
change line 2 with a given string
delete line 3 (and the following lines shift up, so 4 becomes 3, 5 becomes 4...)
undo
redo
Which is the best data structure to implement this? I started with a heap, because everything is ordered, but if I delete a node I need to shift every subsequent line up, and I ran into problems with heaps. I also thought about a persistent tree, because I need to remember previous states to be able to do the undo and redo.
The answer primarily depends on which operations you want to optimize on.
Array - access is quickest, so it optimizes print line and change line.
Doubly linked list - access anywhere other than the head or tail requires traversal. However, it easily handles your delete, undo and redo situations.
For your undo and redo functions, create a separate undo stack where you store the removed or changed nodes together with their old state. Undo pops the most recent entry and restores it; pushing each undone operation onto a second stack then makes redo a matter of popping elements off that stack to reapply them to the main list.
If you expect to be adding and deleting lines a lot, and you also expect to be referring to lines by their (current) index, what you want is an order statistic tree. This is a type of binary tree which, in addition to normal binary tree operations (including efficient insertion and deletion), allows you to efficiently access items by index. In this case, it's not a binary search tree because you don't have a sort key; all of your accesses will be by (current) index.
In order to efficiently support undo/redo, you would additionally make the tree into a persistent data structure, using "path copying" to partially modify the data while still allowing access to the old version of it. (Path copying would be ideal for order-statistic trees, since all your updates propagate to the root anyway.) Undo would simply be reverting to that old version.
But: Unless you are dealing with millions or billions of lines, these weird exotic data structures are not going to be worth your time. So while the literal answer is "persistent order-statistic tree" the practical answer is "probably just put stuff in an array, and have a stack of undo operations to support undo/redo".
You can use a combination of a map and a doubly linked list.
Doubly linked list: where the actual strings are stored; the position of a node represents the number of the corresponding string.
Ex -> print line 3: you print the 3rd node in this list.
Map: here the key is the line number, and the value is a reference to the corresponding doubly-linked-list node. This helps with insertion and deletion of a string in the list.
Ex -> delete line 3: find the reference to the line-3 node from the map, relink its neighbours in the list and remove the node. A similar process works for updating and inserting a line.
For undo and redo, as #John advised, you can use two different stacks: one recording the operations performed (which helps with undo), and one recording the undone operations (which helps with redo).
I have two threads. Thread "A" inserts a key X into the map, and that key X is then modified by thread "A" frequently.
At a particular point, thread "A" completes its modifications to key X, and then thread "B" reads key "X" and deletes it from the map.
While thread "B" reads from and deletes from the map, thread "A" concurrently inserts and writes other keys into the map (not the same key X).
In this case, does the map need to be synchronized? Thread "B" is sure that key "X" has been completely modified by thread "A" and that no more concurrent modifications will be made to key "X".
Yes, you need synchronization.
Insertion and deletion can change the internal state of the map (e.g. rebalancing its tree), and such operations can conflict with each other even when they involve different keys.
While thread A updates an already-inserted object, you don't need to lock the map: std::map guarantees that iterators and references remain stable under insertions and deletions of other elements, so your object won't be touched.
We have lists with fixed sizes that will get populated concurrently by different processes. Is there a way to perform this without using TRANSACTIONS?
For example, is there an atomic operation where you add an item to a list ONLY if the size of the list is smaller than X?
There is no single command, which will add an item to a list only if the list contains less than n items. You would need to wrap them into a transaction in order to make them "atomic".
The only way to implement an atomic call without transactions is via a Lua script, which Redis executes atomically. Something along these lines:
local len = redis.call("LLEN", KEYS[1])
if len >= tonumber(ARGV[1]) then
    return nil
end
redis.call("LPUSH", KEYS[1], ARGV[2])
return ARGV[2]
You would call this Lua script with the key name of the list (KEYS[1]), the maximum length of the list (ARGV[1], passed as a string and hence converted with tonumber) and the item to be pushed (ARGV[2]). Only if the length of the list is less than the maximum will the item be added and returned. If the list length is greater than or equal to the maximum, nil is returned.
I'll begin my answer with a question: why are you so reluctant to use transactions to ensure atomicity? Note that Redis' design is such that it provides you with flexible building blocks that you can put together to implement all kinds of patterns. If an atomic command doesn't exist, you're encouraged to use MULTI/EXEC blocks or Lua scripts to achieve the same effect using existing primitives.
So the answer is no, Redis doesn't have an atomic command that can add an item while keeping the list at a fixed size. This pattern is popularly implemented with LTRIM or LRANGE, depending on the exact behavior you're looking for (e.g. what happens when trying to add an item to a "full" list, whether an empty/smaller list is possible, and whether items are always pushed before popping). While in some cases transactionality can be eschewed, most of the time you'll want to ensure that there are no race conditions in the list's management.
Can I check if part of an element exists in an Array of Strings, or check if multiple elements exist in one query? So:
1) Does an element start with 'aaa:' in the array ['aaa:1', 'bab:0', 'aab:1']
2) Does the element 'aaa:1' OR 'aaa:0' exist in the array ['aaa:1', 'bab:0','aab:1']
If so, do not execute the API operation.
Is this possible? The documentation isn't clear on whether UPDATE_ITEM is this robust or not.
The short answer is: NO for both
The long answer is:
Point 1) will never be possible
Point 2) can be done using the CONTAINS filter of the scan operation but... only for a single match. No "OR" stuff. However, scan is both slow and expensive and thus heavily discouraged.
UPDATE_ITEM conditions will only allow exact matches.