Concurrent Hash Map get and put overlapping - concurrency

I have read that the get method is fully concurrent in ConcurrentHashMap (JDK 7), so it can overlap with all update operations. What will happen if two threads run put(key, value) and get(key) concurrently and the key is not already present?

According to JDK7 Javadoc:
Retrievals reflect the results of the most recently completed update operations holding upon their onset.
This means get() will return null if it was invoked before put() completed.
The short answer to your question is: get() will return null if the put() has not yet completed, and the new value once it has.
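A minimal sketch of the scenario (the class name and key are made up for illustration), where the reader prints either null or 42 depending on which thread gets there first:
import java.util.concurrent.ConcurrentHashMap;

public class GetPutRace {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();

        Thread writer = new Thread(() -> map.put("key", 42));
        Thread reader = new Thread(() -> {
            // Prints null if the read happens before the put completes,
            // otherwise 42; it never sees a partially constructed entry.
            System.out.println(map.get("key"));
        });

        writer.start();
        reader.start();
        writer.join();
        reader.join();
    }
}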

Related

How atomic is Compare and Swap against global variable in modern computer?

Assume we implement this in a modern programming language like C++. Say we have 5 threads t1 to t5, and an array of timestamps TS[5], one per thread. We also have a global timestamp GT which increases gradually as the process runs. Each thread tx makes a local copy of TS[x] as local_ts and then tries a compare-and-swap on its own timestamp: CAS(&TS[x], local_ts, GT). My question is: will the final timestamps in TS[5] reflect the order in which each thread's compare-and-swap actually took place? For example, if one thread performs its CAS before another thread does, must its stored timestamp be less than or equal to the other thread's timestamp?
Please refer to the following simple code example in C++:
#include <atomic>
#include <thread>
#include <vector>

std::atomic<long> timestamps[5];   // one slot per thread; TS is just a long value here
std::atomic<long> GT{0};           // periodically increased by another thread not shown here

void work_load(int id) {
    for (volatile int i = 0; i < 10000; i++);    // simulate the thread doing some work
    long local = timestamps[id].load();          // local copy of our own slot
    // try to publish the current global timestamp into our slot
    timestamps[id].compare_exchange_strong(local, GT.load(), std::memory_order_seq_cst);
    // each thread then reads the other threads' entries in the array and acts
    // according to how their values compare with its own
}

int main() {
    std::vector<std::thread> threads;
    for (int i = 0; i < 5; i++)
        threads.emplace_back(work_load, i);
    for (auto& t : threads)
        t.join();
}
The goal is to design a transactional system with no locks. Each transaction appends its updates to a record as deltas. In the validation phase, a transaction needs to check its deltas against other, possibly conflicting deltas on the same records. I'm trying to design a global data structure that atomically records when each transaction starts its commit phase, so transactions can decide whether they need to abort based on observing who made the conflicting deltas.
Many thanks and kudos to the people in the comment section; my question is answered. An atomic CAS does not include an atomic read of the value I want to install on success. That is, CAS(&object, expected, new_value) does not include fetching new_value from the global variable as part of the atomic operation; a read-modify-write operation only guarantees atomicity on the target object itself.
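To make that concrete, inside work_load above the read of GT and the CAS on the thread's own slot are two separate atomic operations (a sketch reusing the declarations from the example above):
long desired = GT.load();                     // step 1: read the global timestamp
long expected = timestamps[id].load();        // step 2: read our own slot
timestamps[id].compare_exchange_strong(expected, desired);   // step 3: CAS the slot only
// GT may advance between step 1 and step 3, so the values stored in timestamps[]
// need not reflect the real order in which the CAS operations took effect.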

Is ConcurrentHashMap::computeIfAbsent atomic per key or per ConcurrentHashMap?

In a call to ConcurrentHashMap::computeIfAbsent I use a slightly expensive mappingFunction. The mappingFunctions are safe to execute concurrently if and only if they are for different keys.
I'm wondering if the mappingFunctions are executed concurrently for different keys. If not, each mappingFunction would be executed one at a time, leading to unnecessary waiting. To avoid that I would need to write more complex code and use putIfAbsent.
Does anyone know if mappingFunctions are executed concurrently if they are for different keys?
The documentation states:
The entire method invocation is performed atomically, so the function is applied at most once per key.
This may or may not answer my question, depending on how you read it.
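For what it's worth, the more complex putIfAbsent-based fallback alluded to in the question could look roughly like the sketch below (class and method names are made up): one FutureTask per key, so different keys compute concurrently while a second caller for the same key simply waits on the existing task.
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.function.Function;

class PerKeyMemoizer<K, V> {
    private final ConcurrentHashMap<K, FutureTask<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> mappingFunction;

    PerKeyMemoizer(Function<K, V> mappingFunction) {
        this.mappingFunction = mappingFunction;
    }

    V get(K key) throws InterruptedException, ExecutionException {
        FutureTask<V> task = cache.get(key);
        if (task == null) {
            FutureTask<V> created = new FutureTask<>(() -> mappingFunction.apply(key));
            task = cache.putIfAbsent(key, created);   // atomic check-and-insert per key
            if (task == null) {                       // we won the race for this key
                task = created;
                created.run();                        // run the expensive mapping here
            }
        }
        return task.get();                            // waits only on this key's task
    }
}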

How do key/value stores manage concurrency?

I have read a number of articles stating key value stores only require two operations:
set(key, value)
get(key)
This is fine for a single process, but when you have multiple processes, how does the key-value store manage concurrency? I would have thought a version number (e.g., an unsigned integer) used for compare-and-swap style concurrency would be required, e.g., the two operations would be (a sketch follows below):
set(key, value, version), where version is the condition: a mismatch causes a concurrency error and a successful match causes an increment.
get(key) (returning both the value and the version).
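A minimal sketch (type and method names are illustrative) of those two version-checked operations, built on ConcurrentHashMap's atomic putIfAbsent()/replace() primitives:
import java.util.concurrent.ConcurrentHashMap;

class VersionedStore<K, V> {
    static final class Entry<T> {                     // a value together with its version
        final T value;
        final long version;
        Entry(T value, long version) { this.value = value; this.version = version; }
    }

    private final ConcurrentHashMap<K, Entry<V>> map = new ConcurrentHashMap<>();

    Entry<V> get(K key) {                             // returns both the value and the version
        return map.get(key);
    }

    boolean set(K key, V value, long expectedVersion) {
        Entry<V> current = map.get(key);
        if (current == null) {                        // absent key: only version 0 may create it
            return expectedVersion == 0
                && map.putIfAbsent(key, new Entry<>(value, 1)) == null;
        }
        if (current.version != expectedVersion) {
            return false;                             // version mismatch: concurrency error
        }
        // replace() succeeds only if the exact entry we read is still the current one
        return map.replace(key, current, new Entry<>(value, expectedVersion + 1));
    }
}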
There are two kinds of design: some use locks and others use multiversion concurrency control (MVCC).
MVCC achieves concurrency without locks. It can be summed up as:
For a single read, the database returns the latest version of the data.
For a single write, the database adds a new version of the data.
For concurrent read/write requests, the read gets the most recent version of the data, i.e. the one from before the current write.
For concurrent write/write requests, I think one of the writes is abandoned and retried later.
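As a rough illustration of the MVCC idea (names made up; a real engine also tracks transaction snapshots and garbage-collects old versions), writes append a new version and reads return the latest one:
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

class MvccStore<K, V> {
    private final ConcurrentHashMap<K, List<V>> versions = new ConcurrentHashMap<>();

    void set(K key, V value) {                 // a write adds a new version of the data
        versions.computeIfAbsent(key, k -> new CopyOnWriteArrayList<V>()).add(value);
    }

    V get(K key) {                             // a read returns the most recent version
        List<V> history = versions.get(key);
        return (history == null || history.isEmpty()) ? null : history.get(history.size() - 1);
    }
}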

Is it possible to update a list of list in real time as a program is running?

I have code that runs 24/7, and I am wondering if there is any methodology I could use to make changes to the variables in real time without causing any errors. I had been using raw_input(), but this 'stops' the program since it runs sequentially.
My idea is to use a while True loop:
while True:
    ...
    ...
For the first few loops it will use the default catch-all values that I have pre-programmed into the system. While it is running, I would like to make changes to some constant terms (which act as controls) in 'real time', so that in the next loop and beyond it uses the new values rather than the pre-programmed ones.
Some of your code or details of what you are trying to do would help.
But one way to do it is to have two processes: one process that reads from standard input with raw_input(), which we can call p1, and one that handles the data structure, in this case the list, which we call p2.
The two processes could communicate with message passing, using sockets or whatever you want.
You then need to guard against the race condition where new data has been read in p1 but not yet applied in p2, in which case p2 would carry on using out-of-date data. One way to handle this is with locks.
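A sketch of that two-worker idea, but using a background thread and a queue.Queue instead of two separate processes (the names and the name=value command format are made up); the queue's internal locking takes care of the race mentioned above:
import queue
import threading

updates = queue.Queue()              # message-passing channel from p1 to p2

def read_commands():                 # plays the role of p1: blocks waiting for user input
    while True:
        updates.put(input())         # Python 3's input() replaces raw_input()

controls = {"threshold": 10.0}       # pre-programmed default control values

threading.Thread(target=read_commands, daemon=True).start()

while True:                          # plays the role of p2: the 24/7 main loop
    try:
        command = updates.get_nowait()           # non-blocking check for new input
        key, value = command.split("=", 1)       # expects lines like "threshold=42"
        controls[key.strip()] = float(value)     # new value is used from the next loop onward
    except queue.Empty:
        pass                                     # nothing new; keep the current values
    # ... the long-running work goes here, reading its settings from controls ...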

Efficiently iterating through a map while inserting on other thread

I have a std::map<std::string, std::string> that has values added to it at irregular intervals from one thread (but frequently, and the inserts need to be very fast), and occasionally has groups of entries removed.
From a different thread, I need to dump a snapshot of the map as text to a debug log, on command from a user.
Clearly it's not thread safe to just iterate through the map outputting the debug information while it could be updated, so I'm currently taking a read lock (mutex) before dumping the data and a write lock for every insert or delete. This works fine, but I can't really lock the map for that long: it delays the processing of incoming updates too much.
I don't believe I can lock and unlock around each item from the debug-dump thread either, as I believe modifying the map from the other thread can invalidate the iterator.
Is there any way I can do this safely without having to take out a read lock on the whole data structure while I write it out, so that new values can still be inserted quickly? I realise I won't be able to get a guaranteed consistent view of the data if values can be added and removed while I'm iterating through it, but as long as it's safe, that's understood.
If there is no way to use a map for this, can anyone suggest any other data structure I could use?
edit: I'm hoping for a solution that means I don't need to take out an expensive lock when adding an item.
There are 2 solutions I can see at this moment:
(Easy, but might still take too long): copy the map (or assign it to another container) while holding the lock, then dump the local copy to the debug log with the lock released (a sketch follows after this list).
(Some more work): delegate the updates of the map to another thread via a queue. If the other thread is the one that dumps to the debug log, then you don't need the locks anymore. This way the fast threads are only locked while accessing the queue.
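A sketch of option 1 (names are illustrative): only the copy happens under the lock, so inserts stay fast, and the slow formatting to the log runs on an unlocked local snapshot.
#include <iostream>
#include <map>
#include <mutex>
#include <string>

std::map<std::string, std::string> data;
std::mutex data_mutex;

void insert(const std::string& key, const std::string& value) {
    std::lock_guard<std::mutex> lock(data_mutex);   // fast: lock held only for the insert
    data[key] = value;
}

void dump_snapshot(std::ostream& log) {
    std::map<std::string, std::string> snapshot;
    {
        std::lock_guard<std::mutex> lock(data_mutex);
        snapshot = data;                            // the only work done under the lock
    }
    for (const auto& kv : snapshot)                 // slow formatting runs unlocked
        log << kv.first << " = " << kv.second << '\n';
}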