I'm looking for a concurrent cache structure. I'm using the PPL from Microsoft, so I have the concurrent_unordered_map class, but it doesn't quite seem to be what I need. I've got a hash value that I need to associate with a pointer type, or the cache should return that pointer if the hash is already present. I'm not using an LRU or MRU cache strategy and values will never be removed, so it's more like a concurrent memoizer.
Would it be simpler to just lock an existing std::unordered_map?
I don't know about the Microsoft PPL, but I just looked at the header files for Intel Threading Building Blocks (TBB), and its concurrent_unordered_map's insert function returns false as the second part of the returned pair when the key is already in the map.
This seems to be exactly what you need. Do the insert and if it returns true then it was a new insert. If it returns false then it was already in the map.
Edit: There seems to be some confusion here. I didn't mean that you should always run the insert. I meant that you should look for the value and if it is missing then attempt the insert. Two or more threads may occasionally race on the insert and so work will be duplicated, but this should be a rare event.
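A minimal sketch of this find-then-insert pattern, assuming TBB's concurrent_unordered_map (the Widget type and the build/destroy helpers are illustrative, not part of any library):

    #include <tbb/concurrent_unordered_map.h>
    #include <cstdint>

    struct Widget;                                // the cached type; illustrative
    Widget* build_widget(std::uint64_t);          // hypothetical factory
    void    destroy_widget(Widget*);              // hypothetical cleanup

    tbb::concurrent_unordered_map<std::uint64_t, Widget*> g_cache;

    // Return the pointer associated with `hash`, creating it on a miss.
    // Two threads may occasionally race past the find and both build a
    // Widget; insert keeps exactly one, so the loser discards its copy.
    Widget* memoize(std::uint64_t hash)
    {
        auto it = g_cache.find(hash);             // fast path: already cached
        if (it != g_cache.end())
            return it->second;

        Widget* fresh = build_widget(hash);
        auto [pos, inserted] = g_cache.insert({hash, fresh});
        if (!inserted)
            destroy_widget(fresh);                // lost the race; drop our copy
        return pos->second;                       // the value actually in the map
    }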
In my application I basically have multiple threads performing inserts and mostly one thread iterating through the map and removing items that meet certain criteria. The reason I wanted a concurrent structure is that it would provide finer-grained locking in the code that removes items, which currently looks like the snippet below. That code is not ideal for various reasons, including that the thread could get pre-empted while holding the lock.
void Function_reap()
{
    while (timetaken != timeoutTime)
    {
        my_map_mutex.lock();                // one big lock around the whole sweep
        auto iter = my_unordered_map.begin();
        while (iter != my_unordered_map.end())
        {
            if (status_completed == iter->second.status)
                iter = my_unordered_map.erase(iter);  // erase returns the next iterator
            else
                ++iter;                     // advance past kept items, otherwise this loops forever
        }
        my_map_mutex.unlock();
    }
}
I was going through the documentation for Intel TBB (Threading Building Blocks), and more specifically the concurrent_unordered_map documentation (https://software.intel.com/en-us/node/506171), to see whether it is a good fit for my application, and came across this excerpt:
concurrent_unordered_map and concurrent_unordered_multimap support concurrent insertion and traversal, but not concurrent erasure. The interfaces have no visible locking. They may hold locks internally, but never while calling user-defined code. They have semantics similar to std::unordered_map and std::unordered_multimap respectively, except as follows:
The erase and extract methods are prefixed with unsafe_, to indicate that they are not concurrency safe.
Why does TBB not provide safe, synchronized deletion from the map? What is the technical reason for this?
What other options, if any, do I have here? Ideally something that definitely works on Linux and, if possible, is portable to Windows.
Well, it is difficult to design a solution that (efficiently) supports all operations. TBB has concurrent_unordered_map, which supports concurrent insert, find and iteration but not erase, and concurrent_hash_map, which supports concurrent insert, find and erase but not iteration.
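For comparison, here is a minimal sketch of the accessor interface of TBB's concurrent_hash_map, which is what makes its concurrency-safe erase possible (the int/string types are illustrative):

    #include <tbb/concurrent_hash_map.h>
    #include <string>

    using Map = tbb::concurrent_hash_map<int, std::string>;

    void demo(Map& map)
    {
        {   // insert while holding a write lock on the element
            Map::accessor a;
            if (map.insert(a, 42))                // true if the key was new
                a->second = "value";              // element stays locked while `a` lives
        }                                         // lock released when `a` goes out of scope

        {   // concurrency-safe read under a shared lock
            Map::const_accessor ca;
            if (map.find(ca, 42))
                std::string copy = ca->second;
        }

        map.erase(42);                            // concurrency-safe, unlike
                                                  // concurrent_unordered_map::unsafe_erase
    }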
There are several other libraries that provide concurrent hash maps like libcds, or my own one called xenium.
ATM xenium contains two concurrent hash map implementations:
harris_michael_hash_map - fully lock-free; supports concurrent insert, erase, find and iteration. However, the number of buckets has to be defined at construction time and cannot be adapted afterwards. Each bucket contains a linked list of items, which is not very cache friendly.
vyukov_hash_map - a very fast hash map that uses fine-grained locking for insert, erase and iteration; find operations are lock-free. However, if you are using iterators you have to be careful to avoid deadlocks (i.e., a thread should not try to insert or erase a key while holding an iterator). That said, there is an erase overload that takes an iterator, so you can safely remove the item the iterator points to.
I am currently working on making xenium fully Windows compatible.
When inserting elements into an std::unordered_set, is it worth calling std::unordered_set::find prior to std::unordered_set::insert? From my understanding, I should always just call insert, as it returns an std::pair containing a bool that tells whether the insertion succeeded.
Calling find before insert is essentially an anti-pattern, which is typically observed in poorly designed custom set implementations. Namely, it might be necessary in implementations that do not tell the caller whether the insertion actually occurred. std::set does provide you with this information, meaning that it is normally not necessary to perform this find-before-insert dance.
A typical implementation of insert contains the full equivalent of find, meaning that the find-before-insert approach performs the search twice for no meaningful reason.
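For instance, a small sketch of relying on the returned pair instead of a prior find:

    #include <cstdio>
    #include <string>
    #include <unordered_set>

    void add_name(std::unordered_set<std::string>& names, const std::string& n)
    {
        // A single hash lookup: the bool reports whether n was new,
        // so no separate find is needed.
        auto [it, inserted] = names.insert(n);
        std::printf("%s was %s\n", it->c_str(),
                    inserted ? "inserted" : "already present");
    }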
However, some other shortcomings of std::set's design do sometimes call for a find-before-insert sequence. For example, if your set elements contain some fields that need to be modified if (and only if) the actual insertion occurred. For instance, you might have to allocate "permanent" memory for some pointer fields instead of the "temporary" (local) memory those fields were pointing to before the insertion. Unfortunately, this is impossible to do after the insertion, since std::set only provides non-modifying access to its elements. One workaround is to do a find first, thus "predicting" whether an actual insertion will occur, and then setting up the new element accordingly (like allocating "permanent" memory for all fields) before doing the insert. This is ugly from a performance point of view, but it is acceptable in non-performance-critical code. That's just how things are with standard containers.
It's best to just attempt the insert, otherwise the effort of hashing and iterating over any elements that have collided in the hash bucket is unnecessarily repeated.
If your set is thread-safe and accessed concurrently, then calling find first achieves very little: the insert would be atomic, but a check-then-act sequence would be susceptible to a race condition.
So in general and especially in a multithreaded context, just insert.
This is about the thread safety of std::map. Simultaneous reads are thread-safe, but writes are not. My question is: if I add a unique element to the map every time, will that be thread-safe?
So, for example, if I have a map like std::map<int, std::string> myMap
and I always add new keys and never modify existing key-value pairs, will that be thread-safe?
More importantly, will that give me any random run-time behavior?
Is adding new keys also considered modification? If the keys are always different while adding, shouldn't it be thread-safe as it modifies an independent part of the memory?
Thanks
Shiv
1) Of course not
2) Yes, I hope you'll encounter it during testing, not later
3) Yes, it is. The new element is stored in a different location, but many existing pointers are modified in the process.
The map is implemented as some sort of tree in most, if not all, implementations. Inserting a new element rearranges nodes by resetting pointers to point to different nodes, so it is not thread-safe.
No, yes, and yes. You need to obtain an exclusive lock when modifying the container (including inserting new keys), though while no modification is going on you can, of course, safely read concurrently.
Edit: http://www.sgi.com/tech/stl/thread_safety.html might be of interest to you.
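A minimal sketch of the usual fix, assuming C++17's std::shared_mutex so that concurrent reads stay cheap:

    #include <map>
    #include <mutex>
    #include <shared_mutex>
    #include <string>

    std::map<int, std::string> g_map;
    std::shared_mutex          g_map_mutex;

    void add(int key, std::string value)
    {
        std::unique_lock lock(g_map_mutex);       // exclusive: insertion rebalances the tree
        g_map.emplace(key, std::move(value));
    }

    bool contains(int key)
    {
        std::shared_lock lock(g_map_mutex);       // shared: many readers may run at once
        return g_map.find(key) != g_map.end();
    }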
I am trying to implement an LRU cache using C++. I would like to know the best design for implementing one. I know an LRU cache should provide find(), add an element, and remove an element, where remove evicts the LRU element. What are the best ADTs to implement this?
For example: if I use a map with the element as the value and a time counter as the key, I can search in O(log n) time, inserting is O(n), and deleting is O(log n).
One major issue with LRU caches is that there are few "const" operations; most operations change the underlying representation (if only because they bump the accessed element).
This is of course very inconvenient, because it means it's not a traditional STL container, and therefore any idea of exposing iterators is quite complicated: when the iterator is dereferenced this is an access, which should modify the list we are iterating on... oh my.
And there are the performance considerations, both in terms of speed and memory consumption.
It is unfortunate, but you'll need some way to organize your data in a queue (LRU order) with the possibility of removing elements from the middle, and this means your elements will have to be independent from one another. A std::list fits, of course, but it's more than you need. A singly-linked list is sufficient here, since you don't need to iterate the list backward (you just want a queue, after all).
However, one major drawback of those is their poor locality of reference; if you need more speed you'll need to provide your own custom (pool?) allocator for the nodes, so that they are kept as close together as possible. This will also alleviate heap fragmentation somewhat.
Next, you obviously need an index structure (for the cache bit). The most natural choice is a hash map. std::tr1::unordered_map, std::unordered_map or boost::unordered_map are normally good-quality implementations, and some should be available to you. They also allocate extra nodes for hash collision handling; you might prefer other kinds of hash maps, so check out Wikipedia's article on the subject and read about the characteristics of the various implementation techniques.
Continuing, there is the (obvious) question of threading support. If you don't need thread support, it's fine; if you do, however, it's a bit more complicated:
As I said, there are few const operations on such a structure, thus you don't really need to differentiate read accesses from write accesses.
Internal locking is fine, but you might find that it doesn't play nicely with your uses. The issue with internal locking is that it doesn't support the concept of a "transaction", since it relinquishes the lock between each call. If this is your case, transform your object into a mutex and provide a std::unique_ptr<Lock> lock() method (in debug builds, you can assert that the lock is taken at the entry point of each method).
There is (in locking strategies) the issue of reentrance, i.e. the ability to "relock" the mutex from within the same thread; check Boost.Thread for more information about the various locks and mutexes available.
Finally, there is the issue of error reporting. Since it is expected that a cache may not be able to retrieve the data you put in, I would consider using an exception "poor taste". Consider either pointers (Value*) or Boost.Optional (boost::optional<Value&>). I would prefer Boost.Optional because its semantics are clear.
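To illustrate the error-reporting point, a sketch of a pointer-returning lookup (the cache shown is a plain single-threaded stand-in):

    #include <string>
    #include <unordered_map>

    class Cache {
    public:
        // nullptr signals a miss; no exception is thrown, and the
        // caller is forced to check before dereferencing.
        const std::string* find(int key) const
        {
            auto it = items_.find(key);
            return it == items_.end() ? nullptr : &it->second;
        }
    private:
        std::unordered_map<int, std::string> items_;
    };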
The best way to implement an LRU is to use the combination of a std::list and stdext::hash_map (if you want to use only the standard library, then std::map).
Store the data in the list so that the least recently used item is at the back, and use the map to point to the list items.
For "get": use the map to find the list node and retrieve the data, then move that node to the front (since it was just used) and update the map.
For "insert": remove the last element from the list, add the new data to the front, and update the map.
This is the fastest you can get. If you are using a hash_map, almost all of the operations are done in O(1); with std::map everything takes O(log n).
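A condensed sketch of that design, assuming std::unordered_map in place of stdext::hash_map (the class and member names are illustrative):

    #include <cstddef>
    #include <list>
    #include <optional>
    #include <unordered_map>

    template <class K, class V>
    class LruCache {
    public:
        explicit LruCache(std::size_t capacity) : capacity_(capacity) {}

        // O(1) average: hash lookup, then splice the node to the front.
        std::optional<V> get(const K& key)
        {
            auto it = index_.find(key);
            if (it == index_.end())
                return std::nullopt;
            items_.splice(items_.begin(), items_, it->second);  // mark as most recent
            return it->second->second;
        }

        // O(1) average: evict from the back, insert at the front.
        void put(const K& key, V value)
        {
            if (auto it = index_.find(key); it != index_.end()) {
                it->second->second = std::move(value);
                items_.splice(items_.begin(), items_, it->second);
                return;
            }
            if (items_.size() == capacity_) {                   // evict the LRU entry
                index_.erase(items_.back().first);
                items_.pop_back();
            }
            items_.emplace_front(key, std::move(value));
            index_[key] = items_.begin();
        }

    private:
        std::size_t capacity_;
        std::list<std::pair<K, V>> items_;                      // front = most recently used
        std::unordered_map<K, typename std::list<std::pair<K, V>>::iterator> index_;
    };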
A very good implementation is available here
This article describes a couple of C++ LRU cache implementations (one using STL, one using boost::bimap).
When you say priority, I think "heap", which naturally leads to increase-key and delete-min.
I would not make the cache visible to the outside world at all if I could avoid it. I'd just have a collection (of whatever) and handle the caching invisibly, adding and removing items as needed, but the external interface would be exactly that of the underlying collection.
As far as the implementation goes, a heap is probably the most obvious. It has complexities roughly similar to a map, but instead of building a tree from linked nodes, it arranges items in an array and the "links" are implicit based on array indices. This increases the storage density of your cache and improves locality in the "real" (physical) processor cache.
I suggest a heap and maybe a Fibonacci Heap
I'd go with a normal heap in C++.
With std::make_heap (guaranteed by the standard to be O(n)), std::pop_heap, and std::push_heap from <algorithm>, implementing it would be absolutely cake. You only have to worry about increase-key.
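A small sketch of those three calls in action (the int priorities are illustrative):

    #include <algorithm>
    #include <vector>

    int main()
    {
        std::vector<int> heap = {3, 1, 4, 1, 5};
        std::make_heap(heap.begin(), heap.end());   // O(n); builds a max-heap

        heap.push_back(9);                          // add a new priority
        std::push_heap(heap.begin(), heap.end());   // sift it into place

        std::pop_heap(heap.begin(), heap.end());    // max element moves to the back
        int top = heap.back();                      // == 9
        heap.pop_back();
        return top == 9 ? 0 : 1;
    }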
I have a collection of the form:
map<key, list<object> >
I only ever insert at the back of the list and sometimes I read from the entire map (but I never write to the map, except at initialization).
As I understand it, none of the STL containers are thread safe, but I can only really have a maximum of one thread per key. Am I missing anything in assuming I'll be pretty safe with this arrangement?
If the map is never modified at all during the multi-threaded scenario, then you're fine. If each thread looks at its own list, then that's thread-private data, so you're also fine.
Take care not to try to look up keys with [], because that will insert (i.e. modify) if the key doesn't exist in the map yet.
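A sketch of the safe lookup, assuming the map from the question (Object is a stand-in element type):

    #include <list>
    #include <map>
    #include <string>

    using Object = std::string;                   // illustrative element type
    using Store  = std::map<int, std::list<Object>>;

    const std::list<Object>* lookup(const Store& store, int key)
    {
        // find() never modifies the map; operator[] would insert an
        // empty list for a missing key, which counts as a write.
        auto it = store.find(key);
        return it == store.end() ? nullptr : &it->second;
    }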
However, I'm curious as to why you'd need this structure - why not keep a pointer/reference or the actual list object itself on the stack of each thread, given that it's private to each thread?
(If it's not, then you need proper synchronisation on the list.)
In fact you say you "read from the entire map" - presumably meaning that any random thread may try to iterate through any of the lists. So you definitely need to synchronise operations on the lists.
TBH, as long as you put a critical section around every read and write, it will work fine.