Firstly, forgive me if this is a stupid question.
I would like to create a generic synchronised list (like in Java) for reuse in my Go projects.
I found the source of Go's linked list, and I was wondering: would it be sufficient to simply add mutex locks to the list manipulation functions?
If you're going to make a concurrent-safe container, you need to protect all access to the data, not just writes. Inspecting an element, or even calling Len(), without synchronizing the read could return invalid or corrupt data.
It's probably easier to protect the entire data structure with a mutex, rather than implement your own concurrent linked list.
How to organize parallel access to data (e.g. an ETS table) from a bunch of processes in Erlang or Elixir?
In a traditional model I would create an RWLock and keep the critical section as small as possible, so that I can at least access the hash table with parallel reads.
In Erlang, the first idea is to implement a gen_server that stores the table in its state, but then all access will be serialized. How do I handle this so it serves requests faster?
Use direct access to :ets and specify read_concurrency: true in the call to :ets.new/2.
A GenServer is a redundant link here that might become a bottleneck.
I have read that std::map is not thread safe. So if I am accessing (read/write) the std::map from different threads, should I simply wrap the relevant code in a critical section?
Note: I am using Visual C++ 2010.
Simple answer: yes. But how to do it properly can be tricky. The basic strategy would be to wrap calls to your map in critical sections, including wrapping the lifetimes of iterators.
But you also need to make sure that your app's assumptions about the map are handled carefully as well. For example if you need to delete many related items from the map, either make sure other threads are tolerant to only some of those items missing, or wrap the whole batch operation in the critsec. This can easily spiral out of control so you end up wrapping huge amounts of code in critical sections, which will end up causing deadlocks and degrading performance. Careful!
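To make the "wrap everything in a lock" strategy concrete, here is a minimal sketch of a guarded map wrapper. The class and method names are made up for illustration, and it uses std::mutex from C++11, so on Visual C++ 2010 you would substitute a Win32 CRITICAL_SECTION or boost::mutex. The important points are that every operation, reads included, takes the same lock, and that no iterator or reference ever escapes the critical section.

```cpp
#include <map>
#include <mutex>
#include <string>

// Minimal sketch: every operation takes the same lock, so the map is never
// observed in a half-updated state. Names (GuardedMap, get, put) are invented.
class GuardedMap {
public:
    void put(const std::string& key, int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        map_[key] = value;
    }

    // Returns false if the key is absent; copies the value out so no
    // iterator or reference escapes the critical section.
    bool get(const std::string& key, int& value) const {
        std::lock_guard<std::mutex> lock(mutex_);
        std::map<std::string, int>::const_iterator it = map_.find(key);
        if (it == map_.end()) return false;
        value = it->second;
        return true;
    }

    void erase(const std::string& key) {
        std::lock_guard<std::mutex> lock(mutex_);
        map_.erase(key);
    }

private:
    mutable std::mutex mutex_;
    std::map<std::string, int> map_;
};
```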
Just got a simultaneous write for the same question.
Bottom line: use a read/write lock.
I am fairly new to the concept of multi-threading and was exploring to see some interesting problems in order to get a better idea.
One of my friends suggested the following:
"It is fairly straight-forward to have a linked-list and do the regular insert, search and delete operations. But how would you do these operations if multiple threads need to work on the same list.
How many locks are required minimum. How many locks can we have to have optimized linked list functions?"
After giving it some thought, I feel that a single lock should be sufficient. We acquire the lock for every single read and write operation. By this I mean that when we access a node's data in the list, we acquire the lock, and when we insert/delete elements, we acquire the lock for the complete series of steps.
But I was not able to think of a way in which using more locks would give us better performance.
Any help/pointers?
The logical extension of "one lock per list" would be "one lock per item".
This would be useful, for example, if you're usually only modifying a single item of the list.
For deletion and insertion, acquiring the proper locks gets more complicated, though. You'd have to acquire the locks for the items before and after, and you'd have to make sure to always acquire them in the same order (to prevent deadlocks). There are of course also special cases to be considered if the root element has to be modified (and possibly also if it's a doubly-linked list or a circular linked list). The overhead resulting from the more complicated locking logic might make your implementation slower again, especially if you often have to insert into and delete from the list.
So I would only consider this if the majority of accesses are modifications of a single node.
If you're searching for peak performance for a specific use case, then in the end, it boils down to implementing both, and running performance comparisons for a typical scenario.
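For a concrete picture of the per-node idea, here is a rough sketch of "hand-over-hand" locking (lock coupling) on a sorted, singly linked list. Everything here (the names, the sentinel head node, the int payload) is invented for illustration, and destructors and copy control are omitted; the key property is that locks are always acquired in list order, predecessor before successor, which is what rules out deadlock.

```cpp
#include <mutex>

// Each node carries its own mutex; traversal holds at most two node locks
// at a time, always taken in list order.
struct Node {
    int value;
    Node* next;
    std::mutex lock;
    explicit Node(int v, Node* n = nullptr) : value(v), next(n) {}
};

class FineGrainedList {
public:
    FineGrainedList() : head_(new Node(0)) {}  // sentinel head node

    void insert(int value) {
        Node* prev = head_;
        prev->lock.lock();
        Node* curr = prev->next;
        if (curr) curr->lock.lock();
        // Walk forward, never holding more than two node locks at once.
        while (curr && curr->value < value) {
            prev->lock.unlock();
            prev = curr;
            curr = curr->next;
            if (curr) curr->lock.lock();
        }
        prev->next = new Node(value, curr);
        if (curr) curr->lock.unlock();
        prev->lock.unlock();
    }

    bool remove(int value) {
        Node* prev = head_;
        prev->lock.lock();
        Node* curr = prev->next;
        if (curr) curr->lock.lock();
        while (curr && curr->value < value) {
            prev->lock.unlock();
            prev = curr;
            curr = curr->next;
            if (curr) curr->lock.lock();
        }
        bool found = curr && curr->value == value;
        if (found) {
            prev->next = curr->next;  // unlink while holding both locks
            curr->lock.unlock();
            delete curr;
        } else if (curr) {
            curr->lock.unlock();
        }
        prev->lock.unlock();
        return found;
    }

private:
    Node* head_;
};
```

Because no more than two node locks are ever held at once, threads working on different parts of the list can proceed in parallel, which is exactly the gain the extra locks are supposed to buy.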
You definitely need at least one semaphore/lock to ensure list integrity.
But, presumably any operation on the list changes at most two nodes: the node being inserted/changed/deleted and the adjacent node which points to it. So you could implement locking on a per-node basis, locking at most two nodes for a given operation. This would allow for a degree of concurrency when different threads access the list, though I think you'd need to distinguish between read and write locks to get the full benefit of this approach.
If you're new to multi-threading, embrace the notion that premature optimization is a waste of time. A linked list is a very straightforward data structure, and you can make it thread-safe by putting a critical section around all reads and writes. These serialize the read/insert/delete operations for their duration and ensure thread safety, and they don't carry the overhead of a full mutex lock or a more complicated locking mechanism.
If you want to optimize after the fact, only do so with a valid profiling tool that gives you raw numbers. The linked-list operations will rarely end up being the biggest source of application slowdown, and it will probably never be worth your while to add the node-level locking being discussed.
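As a concrete example of the coarse-grained approach, here is a minimal sketch, assuming "critical section" refers to the Win32 CRITICAL_SECTION primitive (a user-mode lock that is cheaper than a kernel mutex when uncontended). All names are invented, and error handling, copy control and an RAII guard are omitted for brevity.

```cpp
#include <list>
#include <windows.h>

// One CRITICAL_SECTION guards every read and write of the list.
class CsLockedList {
public:
    CsLockedList()  { InitializeCriticalSection(&cs_); }
    ~CsLockedList() { DeleteCriticalSection(&cs_); }

    void push_back(int value) {
        EnterCriticalSection(&cs_);
        list_.push_back(value);
        LeaveCriticalSection(&cs_);
    }

    bool contains(int value) {
        EnterCriticalSection(&cs_);
        bool found = false;
        for (std::list<int>::const_iterator it = list_.begin(); it != list_.end(); ++it) {
            if (*it == value) { found = true; break; }
        }
        LeaveCriticalSection(&cs_);
        return found;
    }

    void remove(int value) {
        EnterCriticalSection(&cs_);
        list_.remove(value);
        LeaveCriticalSection(&cs_);
    }

private:
    CRITICAL_SECTION cs_;
    std::list<int> list_;
};
```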
Using one lock for the entire list would completely defeat most reasons for multithreading in the first place. By locking the entire list down, you guarantee that only one thread can use the list at a time.
This is certainly safe in the sense that you will have no deadlocks or races, but it is naive and inefficient because you serialize access to the entire list.
A better approach would be to have a lock for each item in the list, and another one for the list itself. The latter would be needed when appending to the list, depending on how the list is implemented (e.g., if it maintains a node count separate from the nodes themselves).
However, this might also be less than optimal depending on a number of factors. For instance, on some platforms mutexes might be expensive in terms of resources and time to instantiate. If space is at a premium, another approach might be to have a fixed-size pool of mutexes from which you draw whenever you need to access an item. These mutexes would have some kind of ownership flag indicating which node they are allocated to, so that no other mutex would be allocated to that node at the same time.
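A simplified variant of that fixed-pool idea is lock striping: instead of tracking ownership flags, each node is mapped to one of N pre-allocated mutexes by hashing its address. Two nodes may share a stripe, which costs a little concurrency but not correctness; if an operation ever needs two stripes, acquire them in a fixed order (e.g. by index) to avoid deadlock. The sketch below is illustrative only and all names are invented.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>

// Fixed array of mutexes; a node's address picks its stripe.
class LockStripes {
public:
    static const std::size_t kStripes = 16;

    std::mutex& lock_for(const void* node) {
        std::size_t h = std::hash<const void*>()(node);
        return stripes_[h % kStripes];
    }

private:
    std::mutex stripes_[kStripes];
};

// Usage sketch: guard a single node's data with its stripe.
//   LockStripes stripes;
//   std::lock_guard<std::mutex> guard(stripes.lock_for(node));
//   node->value = 42;
```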
Another technique is to use reader/writer locks, which allow read access to any number of threads but write access to only one, the two being mutually exclusive. However, it has been suggested in the literature that in many cases using a reader/writer lock is actually less efficient than simply using a plain mutex. This will depend on your actual usage pattern and how the lock is implemented.
You only need an exclusive lock when you're writing, and you say there's usually only one write, so try read/write locks.
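For what a reader/writer lock looks like in practice, here is a minimal sketch using std::shared_mutex from C++17 (older code bases would use boost::shared_mutex or pthread_rwlock_t instead). The wrapper and its names are invented; searches take the lock in shared mode so they can run in parallel, while modifications take it exclusively.

```cpp
#include <algorithm>
#include <list>
#include <mutex>
#include <shared_mutex>

class RwLockedList {
public:
    bool contains(int value) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);   // many readers at once
        return std::find(list_.begin(), list_.end(), value) != list_.end();
    }

    void push_back(int value) {
        std::unique_lock<std::shared_mutex> lock(mutex_);   // one writer, no readers
        list_.push_back(value);
    }

private:
    mutable std::shared_mutex mutex_;
    std::list<int> list_;
};
```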
I'm developing a multi-threaded plugin for a single-threaded application (which has a non-thread-safe API).
My current plugin has two threads: the main one, which is the application's thread, and another one which is used for processing data from the main thread. Long story short, the first one creates objects, gives them an ID, inserts them into a map, and sometimes even accesses and deletes them (if the application says so); the second one reads data from that map and alters the objects.
My question is: what techniques can I use in order to make my plugin thread-safe?
First, you have to identify where race conditions may exist. Then you will have to use some mechanism to ensure that the shared data is accessed in a safe way, hence achieving thread safety.
For your particular case, it seems the race condition will be on the shared map and possibly the objects (map's values) it contains as well (if it's possible that both threads attempt to alter the same object simultaneously).
My suggestion is that you use a well tested thread safe map implementation, and then if needed add the extra "protection" for the map's values themselves. This way you ensure the map is always in a consistent state for both threads, and if both threads attempt to modify the same object data (map's values), the data won't be corrupted or left inconsistent.
For the map itself, you can search for "Concurrent Hash Map" or "Atomic Hash Map" data structures for C++ and see if they are of good quality and are available for your compiler/platform. Good examples are Intel's TBB concurrent_hash_map or Facebook's folly AtomicHashMap. They both have advantages and disadvantages and you will have to analyze what's best for your situation.
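As a rough idea of what using such a container looks like, here is a small sketch based on my reading of Intel TBB's concurrent_hash_map interface (check the current TBB documentation for the exact API). Accessors act as per-entry locks: an accessor gives exclusive write access to one entry, a const_accessor gives shared read access.

```cpp
#include <tbb/concurrent_hash_map.h>

typedef tbb::concurrent_hash_map<int, int> Table;

void example(Table& table) {
    {
        Table::accessor a;            // exclusive access to this entry
        table.insert(a, 42);          // inserts the key if it is not present
        a->second = 7;                // safe to modify while the accessor is held
    }                                 // entry lock released here
    {
        Table::const_accessor ca;     // shared (read) access
        if (table.find(ca, 42)) {
            int value = ca->second;
            (void)value;
        }
    }
}
```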
As for the objects the map contains, you can use plain mutexes (simple: lock, modify the data, unlock), atomic operations (trickier, and only for simple data types), or some other method, again depending on your compiler/platform and speed requirements.
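If you roll the two levels by hand instead, the sketch below shows one way to combine them (all names invented): a single mutex guards the map structure, each stored object carries its own mutex for its data, and values are held by shared_ptr so an object stays alive even if the other thread erases it from the map while it is still being altered.

```cpp
#include <map>
#include <memory>
#include <mutex>

struct Item {
    std::mutex lock;   // protects `data`
    int data = 0;
};

class ItemRegistry {
public:
    std::shared_ptr<Item> insert(int id) {
        std::shared_ptr<Item> item = std::make_shared<Item>();
        std::lock_guard<std::mutex> guard(map_lock_);
        items_[id] = item;
        return item;
    }

    // Returns a shared_ptr so the object stays alive even if the other
    // thread erases it from the map while we are still using it.
    std::shared_ptr<Item> find(int id) {
        std::lock_guard<std::mutex> guard(map_lock_);
        auto it = items_.find(id);
        if (it == items_.end()) return nullptr;
        return it->second;
    }

    void erase(int id) {
        std::lock_guard<std::mutex> guard(map_lock_);
        items_.erase(id);
    }

private:
    std::mutex map_lock_;
    std::map<int, std::shared_ptr<Item>> items_;
};

// Usage sketch (second thread altering an object):
//   if (std::shared_ptr<Item> item = registry.find(42)) {
//       std::lock_guard<std::mutex> guard(item->lock);
//       item->data += 1;
//   }
```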
Hope this helps!
I am writing a multi-threaded program using OpenMP in C++. At one point my program forks into many threads, each of which needs to add "jobs" to some container that keeps track of all added jobs. Each job can just be a pointer to some object.
Basically, I just need to add pointers to some container from several threads at the same time.
Is there a simple solution that performs well? After some googling, I found that STL containers are not thread-safe. Some stackoverflow threads address this question, but none that forms a consensus on a simple solution.
There's no built-in way to do this. You can simply use a lock to guard one of the existing container types. It might be a better idea to have each thread use its own container, then combine the results together at the end.
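A minimal sketch of the per-thread-container idea with OpenMP, assuming an invented Job type and function name: each thread appends to its own vector with no locking at all, and the per-thread vectors are concatenated after the parallel region.

```cpp
#include <omp.h>
#include <cstddef>
#include <vector>

struct Job { /* placeholder for your job data */ };

std::vector<Job*> collect_jobs(int total) {
    // One private vector per thread; no synchronization needed while filling.
    std::vector<std::vector<Job*>> per_thread(omp_get_max_threads());

    #pragma omp parallel
    {
        std::vector<Job*>& mine = per_thread[omp_get_thread_num()];
        #pragma omp for
        for (int i = 0; i < total; ++i) {
            mine.push_back(new Job());   // only this thread touches `mine`
        }
    }

    // Merge the per-thread results into one container at the end.
    std::vector<Job*> all;
    for (std::size_t t = 0; t < per_thread.size(); ++t)
        all.insert(all.end(), per_thread[t].begin(), per_thread[t].end());
    return all;
}
```

Guarding a single shared vector with #pragma omp critical around the push_back also works and is even simpler, at the cost of contention on every insert.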
Using a mutex or similar synchronization primitive to control access to a linked list is not very difficult, so I'd recommend you try that first.
If it performs so poorly that you can't use it, try this instead: give each thread its own job queue, and have the job consumer check all the queues in turn. This way each queue has only one reader and one writer, so a lock-free implementation is relatively straightforward. By this I mean it may exist for your platform; you should not attempt to write it yourself.