Is there a way to fix std::unordered_map's bucket count at construction time, so I'm not living with the danger of it rehashing?
My case is the following:
I have an object that can't be copied or moved. Once it's emplaced, it has to stay exactly in that memory spot until destruction, as this memory is being used continuously by another task.
I want to put this object into a container. The problem is that the key would be a string, and string comparisons make ordered maps slow, so I wanted to opt for std::unordered_map.
I didn't find anything on this topic on the internet. If I use reserve() or rehash(), std::unordered_map will still automatically rehash if it so pleases. How can I lock it in place?
Related
It isn't difficult to find information on the big-O time behavior of STL container operations. However, we operate in a hard real-time environment, and I'm having a lot more trouble finding information on their heap memory usage behavior.
In particular I had a developer come to me asking about std::unordered_map. We're allowed to be non-realtime at startup, so he was hoping to perform a .reserve() at startup time. However, he's finding he gets overruns at runtime. The operations he uses are lookups, insertions, and deletions with .erase().
I'm a little worried about whether that .reserve() actually prevents later runtime memory allocations (I don't really understand the explanation of what it does with respect to heap usage), but for .erase() in particular I don't see any guarantee whatsoever that it won't be asking the heap for a dynamic deallocation when called.
So the question is: what are the specified heap interactions (if any) for std::unordered_map::erase, and if it actually does deallocations, is there some kind of trick that can be used to avoid them?
The standard doesn't specify container allocation patterns per se. These are effectively derived from the iterator/reference invalidation rules. For example, vector::insert only invalidates all references if the number of elements inserted causes the size of the container to exceed its capacity, which means a reallocation happened.
By contrast, the only operations on unordered_map which invalidate references are those which actually remove that particular element. Even a rehash (which likely allocates memory) does not invalidate references (this is why reserve changes nothing).
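A minimal sketch demonstrating that stability (the keys and values here are made up):

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

int main() {
    std::unordered_map<std::string, int> m;
    m["answer"] = 42;
    int* p = &m["answer"];         // pointer to the mapped value

    m.rehash(1024);                // force a rehash: the bucket array is
                                   // rebuilt, but the nodes never move
    for (int i = 0; i < 1000; ++i)
        m[std::to_string(i)] = i;  // growth-triggered rehashes, same story

    assert(p == &m["answer"] && *p == 42);  // still valid
}
```

(Iterators, by contrast, are invalidated by a rehash; only pointers and references survive.)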
This means that each element must be stored separately from the hash table itself. They are individual nodes (which is why it has a node_type extraction interface), and must be able to be allocated and deallocated individually.
So it is reasonable to assume that each insertion or erasure represents at least one allocation/deallocation.
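Since C++17, the node-handle interface mentioned above also lets you unlink an element without handing its node back to the heap; a small sketch:

```cpp
#include <string>
#include <unordered_map>
#include <utility>

int main() {
    std::unordered_map<std::string, int> m{{"a", 1}, {"b", 2}};

    auto node = m.extract("a");  // unlinks the node; nothing is deallocated
                                 // while the node handle stays alive
    node.mapped() = 42;
    m.insert(std::move(node));   // splices it back in; no allocation either
}
```

Whether that helps depends on being able to park extracted nodes somewhere until you need them again.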
If you're all right with nodes continuing to consume memory even after they've been removed from the container, you could pretty easily write an Allocator class that basically makes deallocation a no-op.
Quite a few real-time systems basically allocate all the memory they're going to use up-front, then once they've finished initialization they neither allocate nor release memory. This would allow you to do pretty much the same thing with an unordered_map.
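A minimal sketch of that idea, assuming one buffer carved out during (non-real-time) initialization. ArenaAllocator is my name for it, and production code would need sturdier alignment and exhaustion handling:

```cpp
#include <cstddef>
#include <new>

template <class T>
struct ArenaAllocator {
    using value_type = T;

    ArenaAllocator(std::byte* buf, std::size_t size, std::size_t* used)
        : buf_(buf), size_(size), used_(used) {}

    template <class U>
    ArenaAllocator(const ArenaAllocator<U>& o)
        : buf_(o.buf_), size_(o.size_), used_(o.used_) {}

    T* allocate(std::size_t n) {
        // bump-pointer allocation out of the pre-sized buffer
        std::size_t off = (*used_ + alignof(T) - 1) & ~(alignof(T) - 1);
        if (off + n * sizeof(T) > size_) throw std::bad_alloc{};
        *used_ = off + n * sizeof(T);
        return reinterpret_cast<T*>(buf_ + off);
    }

    // deallocation is deliberately a no-op: "freed" nodes keep their memory
    void deallocate(T*, std::size_t) noexcept {}

    std::byte* buf_;
    std::size_t size_;
    std::size_t* used_;
};

template <class T, class U>
bool operator==(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) {
    return a.buf_ == b.buf_;
}
template <class T, class U>
bool operator!=(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) {
    return !(a == b);
}
```

Plugged in as the last template parameter of std::unordered_map (the converting constructor takes care of rebinding), erase() still "deallocates" each node, but that deallocation no longer touches the heap.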
That said, I'm somewhat skeptical about the benefit in this case. The main strength of unordered_map is supporting insertion and deletion that are usually fast. If you're not going to be doing insertion at runtime, chances are pretty good that it's not a particularly great choice.
If it's a collection that's mostly filled during initialization, then used mostly as-is, with a few items being "removed", but no more being inserted after you finish initialization, you're likely to be better off with a simple sorted array and an interpolating search (or, if the data is distributed extremely unpredictably, maybe a binary search--but an interpolating search is usually better). In this case, I'd handle removal by simply adding a boolean to each item saying whether that item is valid or not. Erase by setting that value to false. If you find such a value during a search, you basically just ignore it.
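A sketch of that shape, in the binary-search variant (the names are mine; an interpolating search would replace the lower_bound call):

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct Entry {
    std::string key;
    int value;
    bool valid = true;  // "erased" entries are just marked invalid
};

struct FlatMap {
    std::vector<Entry> items;  // sorted by key once initialization is done

    void finalize() {  // call once, at the end of initialization
        std::sort(items.begin(), items.end(),
                  [](const Entry& a, const Entry& b) { return a.key < b.key; });
    }

    Entry* find(const std::string& key) {
        auto it = std::lower_bound(
            items.begin(), items.end(), key,
            [](const Entry& e, const std::string& k) { return e.key < k; });
        if (it == items.end() || it->key != key || !it->valid)
            return nullptr;  // not present, or a tombstone: ignore it
        return &*it;
    }

    void erase(const std::string& key) {
        if (Entry* e = find(key))
            e->valid = false;  // no deallocation, and no element moves
    }
};
```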
We know that if more than one thread operates on an object and there is a modification involved, we need some kind of locking (atomic/mutex). In my case, only these operations are happening simultaneously on a std::vector:
1. Read
2. Append/Push
Will the vector need a lock in this case? And if yes, why? My program is written in C++.
I'm new to the lock concept. Any hint in the right direction will work for me.
Yes, you need locking in general, because push_back can cause reallocation.
You can check the reference:
https://en.cppreference.com/w/cpp/container/vector/push_back says
If the new size() is greater than capacity() then all iterators and
references (including the past-the-end iterator) are invalidated.
Otherwise only the past-the-end iterator is invalidated.
https://www.cplusplus.com/reference/vector/vector/push_back/ mentions:
The container is modified. If a reallocation happens, all contained
elements are modified. Otherwise, no existing element is accessed, and
concurrently accessing or modifying them is safe.
So, you should lock if you want to be careful, or if you care about clean, maintainable code.
If you need extra performance and know what you are doing, you can get away with locking only when you know that no push_back() will bring size() above capacity(). That is very tricky and error-prone: as soon as you allow one thread to start reading, you have to be sure no reallocation will occur in another thread, even later.
Edit: re-worded the above. tl;dr: use synchronization :-)
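The straightforward locked version, as a sketch (the names are mine):

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

std::vector<int> data;
std::mutex data_mutex;  // guards every access to data, reads included

void append(int v) {
    std::lock_guard<std::mutex> lock(data_mutex);
    data.push_back(v);  // may reallocate; only safe under the lock
}

bool read(std::size_t i, int& out) {
    std::lock_guard<std::mutex> lock(data_mutex);
    if (i >= data.size()) return false;
    out = data[i];
    return true;
}
```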
You will most likely need resource locking. Take this example: if you insert an element into the vector, it might resize. Now, while you're resizing the vector, what if another thread tries to access data from it? See the clash? That's why you need to lock resources. This applies if you're inserting or removing data (meaning you're altering the actual allocation of the container). If the size is fixed (meaning you have pre-allocated it), then there won't be an issue.
According to this, clear() will destroy all the objects but will not free the memory. Also, while I encountered the need to remove all elements of a set, I found out that the time complexity of set.clear() is linear in size (destructions) whereas set.swap() is constant. (The same issue arises when removing all elements of a vector.)
So shouldn't we always use swap() with an empty STL container whenever possible in such cases?
Thus, what is the need for and benefit of clear() over swapping with an empty container?
Sorry if this seems trivial, but I couldn't find anything regarding this.
Thanks!
when to use clear()
When you want to remove all elements of a container.
why use clear()
Because it is the clearest and most efficient way to achieve the above.
clear() will destroy all the objects, but will not free the memory.
This is a desirable feature. It allows clear to be fast, and allows addition of new elements into the container to be fast because memory has already been allocated.
If you want to free the memory - knowing that you will need to pay again for the allocation if you insert new elements, you can use shrink_to_fit.
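For a vector, for example (std::set has no shrink_to_fit), a minimal sketch:

```cpp
#include <vector>

int main() {
    std::vector<int> v(1'000'000);
    v.clear();          // destroys the elements; capacity() is unchanged,
                        // so refilling the vector needs no new allocation
    v.shrink_to_fit();  // non-binding request to actually release the memory
}
```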
I found out that the time complexity of set.clear() is linear in size (destructions) whereas set.swap() is constant.
You are forgetting that the previously empty, now full set has to be eventually destroyed. Guess what the complexity of the destructor of set is.
It is linear in size of the set.
So shouldn't we always use swap() with an empty STL container whenever possible in such cases?
Swap introduces extra unnecessary operations which clear does not do, and is therefore not as efficient. Furthermore, it does not express the intent as clearly.
There is no advantage in replacing clear with a swap in the general case.
Swapping with an empty container does have its uses in rare cases. For example, if you need to clear a container within a critical section, then swapping and leaving the destruction of elements later can allow reducing the size of that critical section.
Certainly calling swap() will be faster, but unless you've got an infinite supply of empty containers available, eventually you will have to do something with the container(s) you've swap()'d items into, in which case you're back to destroying items in linear time.
That doesn't mean the swap() trick isn't useful, though -- for example, if you have a program with a real-time thread and a non-real-time thread, you might swap() the container in the real-time thread, and leave it up to the non-real-time thread to do the actual clear()/items-destruction later on, so that the real-time thread doesn't ever have to spend its limited time destroying items.
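A sketch of that pattern (the names are mine):

```cpp
#include <mutex>
#include <string>
#include <vector>

std::vector<std::string> shared_items;
std::mutex items_mutex;

void clear_quickly() {
    std::vector<std::string> graveyard;
    {
        std::lock_guard<std::mutex> lock(items_mutex);
        shared_items.swap(graveyard);  // constant time inside the lock
    }
    // graveyard goes out of scope here, so the linear-time destruction of
    // the elements happens outside the critical section
}
```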
The time complexity of swap() is constant, yes.
Presumably you are thinking of swapping with a local variable that soon goes out of scope. If this is the case, the destructor of the local variable is called as it goes out of scope, and the time complexity of that destructor is linear in size, the same as clear() would have been. So there is no performance benefit, and you pay the price of less readable code.
A little bit of background first (skip ahead to the boldface if you're bored by this).
I'm trying to glue two pieces of code together. One is a JSON/YML library that makes heavy use of a custom string view object, the other is a piece of code from the early 2000s.
I've been seeing weird behavior for a long time, until I traced it down to a memory issue, namely that the string views I construct in the JSON/YML library take a const char* in their constructor and assume that the memory location of that char array stays constant over the lifetime of the string view. However, some of the std::string objects on which I construct these views are temporary, so that's just not true and the string view ends up pointing at garbage.
Now, I thought I was being smart and constructed a cache in the form of a std::vector that would hold all the temporary strings; I would construct the string views on these and only clear the cache at the end. Easy.
However, I was still seeing garbled strings every now and then, until I found the reason: sometimes, when pushing things to the vector beyond the preallocated size, the vector would be moved to a different memory location, invalidating all the string views. For now, I've settled on preallocating a cache size that is large enough to avoid any conceivable moving of the vector, but I can see this causing severe and untraceable problems in the future for very large runs. So here's my question:
How can I construct a std::vector<std::string> or any other string container that either avoids being moved in memory altogether, or at least throws an error message if that happens?
Of course, if you feel that I'm going about this whole issue in the wrong way fundamentally, please also let me know how I should deal with this issue instead.
If you're interested, the two pieces of code in question are RapidYAML and the CERN Statistics Library ROOT.
My answer from a similar question: Any way to update pointer/reference value when vector changes capacity?
If you store objects in your vector as std::unique_ptr or std::shared_ptr, you can get an observing pointer to the underlying object with std::unique_ptr::get() (or a reference if you dereference the smart pointer). This way, even though the memory location of the smart pointer changes upon resizing, the observing pointer points to the same object and thus the same memory location.
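A minimal sketch of why the observed addresses stay stable:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

int main() {
    std::vector<std::unique_ptr<std::string>> cache;
    cache.push_back(std::make_unique<std::string>("hello"));
    const std::string* stable = cache[0].get();  // observing pointer

    for (int i = 0; i < 10000; ++i)  // force several reallocations
        cache.push_back(std::make_unique<std::string>("filler"));

    // The reallocations moved the unique_ptr objects, but not the
    // heap-allocated strings they own.
    assert(stable == cache[0].get());
}
```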
[...] sometimes, when pushing things to the vector beyond the preallocated size, the vector would be moved to a different memory location, invalidating all the string views.
The reason is that std::vector is required to store its data contiguously in memory. So, if you exceed the capacity of the vector when adding an element, it will allocate a new block of memory (big enough this time) and move all the data there.
What you are subject to is called iterator invalidation.
How can I construct a std::vector or any other string container that either avoids being moved in memory altogether, or at least throws an error message if that happens?
You have at least 3 easy solutions:
If your cache size is supposed to be fixed and is known at compile-time, I would advise you to use std::array instead.
If your cache size is supposed to be fixed but not necessarily known at compile time, I would advise you to reserve() the required capacity of your std::vector, so that you have the guarantee that it will be big enough to never need reallocation.
If your cache size may change, I would advise you to use std::list instead. It is implemented as a (usually doubly) linked list. It will guarantee that the elements will not be relocated in memory.
But since they are not stored contiguously in memory, you'll lose the ability to have direct access to any element (i.e. you'll need to iterate over the list in order to find an element).
Of course there probably are other solutions (I do not claim this answer to be exhaustive), but these solutions will let you change almost nothing in your code (only the container) while protecting your string views from being invalidated.
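For instance, the reserve() option as a sketch (the capacity of 1024 is an arbitrary stand-in for your actual cache size):

```cpp
#include <string>
#include <string_view>
#include <vector>

int main() {
    std::vector<std::string> cache;
    cache.reserve(1024);  // hypothetical upper bound on cached strings

    cache.push_back("temporary contents");
    std::string_view view{cache.back()};

    // As long as size() never exceeds the reserved capacity, push_back
    // will not reallocate, so view keeps pointing at valid memory.
}
```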
Perhaps use a std::list. Accessing it is slower (at least when iterating), but memory locations stay constant. The reason for both is that it does not use contiguous memory.
Alternatively, create a wrapper that wraps a pointer to a string that has been created with new. That address will also be constant. EDIT: Somehow I managed to miss that what I've just described is pretty much a smart pointer minus the automated deletion ;)
Sadly, it is impossible to grow a vector while being sure its contents will stay in the same place, on classical OSes at least.
There is the C function realloc that tries to keep the data in the same place, but as you can read in the documentation, there is no guarantee of that; only the OS decides.
To solve your problem, you need the concept of a pool, a pool of strings here, that handles the lifetime of your strings.
You may get away with a simple std::list of strings, but it will lead to poor data locality and a lot of independent allocations, which is bad for your performance. These would also be the problems with smart pointers.
So if you care about performance, the implementation in your case may not be far from your current one, in my opinion. Because you cannot resize the vector, you should prefer a std::array of a fixed size that you decide at compile time. Then, whenever you need more room, you can create a new one to expand your memory capacity. This is easily implemented as a std::list<std::array>, typically.
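A minimal sketch of such a pool (StringPool and kChunkSize are my names; tune the chunk size to your workload):

```cpp
#include <array>
#include <cstddef>
#include <list>
#include <string>
#include <string_view>

class StringPool {
public:
    // Copies s into the pool and returns a view that stays valid for the
    // pool's lifetime: list nodes, and the arrays inside them, never move.
    std::string_view intern(std::string_view s) {
        if (chunks_.empty() || used_ == kChunkSize) {
            chunks_.emplace_back();  // new chunk; existing chunks stay put
            used_ = 0;
        }
        std::string& slot = chunks_.back()[used_++];
        slot.assign(s.data(), s.size());
        return slot;
    }

private:
    static constexpr std::size_t kChunkSize = 256;
    std::list<std::array<std::string, kChunkSize>> chunks_;
    std::size_t used_ = 0;
};
```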
I don't know if it applies here, but you must be careful if your application can create an unbounded number of strings during its execution, as that may produce an ever-growing memory pool and, eventually, memory problems. To fix that, you should ensure that the strings you don't use anymore can be reused or freed. Sadly, I cannot help you much here, as these rules will depend on your application.
I have an STL container (std::list) that I am constantly reusing. By this I mean I
push a number of elements into the container
remove the elements during processing
clear the container
rinse and repeat a large number of times
When profiling using callgrind I am seeing a large number of calls to new (malloc) and delete (free), which can be quite expensive. I am therefore looking for some way to preferably preallocate a reasonably large number of elements. I would also like my allocation pool to keep growing until a high-water mark is reached, and to hang onto the memory until the container itself is deleted.
Unfortunately the standard allocator continually resizes the memory pool so I am looking for some allocator that will do the above without me having to write my own.
Does such an allocator exist and where can I find such an allocator?
I am working on both Linux using GCC and Android using the STLPort.
Edit: Placement new is OK; what I want to minimize is heap walking, which is expensive. I would also like all my objects to be as close to each other as possible to minimize cache misses.
It sounds like you may just be using the wrong kind of container: with a list, each element occupies a separate chunk of memory to allow individual inserts/deletes, so every addition/deletion from the list will require a separate new()/delete().
If you can use a std::vector instead, then you can reserve the required size before adding the items.
Also, for deletion it's usually best not to remove the items individually. Just call clear() on the container to empty it.
Edit: You've now made it clear in the comments that your 'remove the elements during processing' step is removing elements from the middle of the list and must not invalidate iterators, so switching to a vector is not suitable. I'll leave this answer for now (for the sake of the comment thread!)
The allocator boost::fast_pool_allocator is designed for use with std::list.
The documentation claims that "If you are seriously concerned about performance, use boost::fast_pool_allocator when dealing with containers such as std::list, and use boost::pool_allocator when dealing with containers such as std::vector."
Note that boost::fast_pool_allocator is a singleton and by default it never frees allocated memory. However, it is implemented using boost::singleton_pool and you can make it free memory by calling the static functions boost::singleton_pool::release_memory() and boost::singleton_pool::purge_memory().
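Using it is just a matter of the allocator template argument; a sketch:

```cpp
#include <list>
#include <boost/pool/pool_alloc.hpp>

int main() {
    // List nodes are drawn from a singleton pool; erase()/clear() return
    // them to the pool rather than the heap, and later insertions reuse them.
    std::list<int, boost::fast_pool_allocator<int>> values;

    for (int i = 0; i < 1000; ++i)
        values.push_back(i);
    values.clear();  // the pool keeps the memory for reuse

    for (int i = 0; i < 1000; ++i)
        values.push_back(i);  // served from the pool, not the heap
}
```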
You can try benchmarking your app with http://goog-perftools.sourceforge.net/doc/tcmalloc.html; I've seen some good improvements in some of my projects (no numbers at hand though, sorry).
EDIT: Seems the code/download has moved here: http://code.google.com/p/gperftools/?redir=1
Comment was too short so I will post my thoughts as an answer.
IMO, new/delete can come only from two places in this context.
I believe std::list<T> is implemented with some kind of nodes, as lists normally are, for various reasons. Therefore, each insertion and removal of an element will have to result in new/delete of a node. Moreover, if an object of type T performs any allocations and deallocations in its c'tor/d'tor, those will be called as well.
You can avoid re-creating nodes by iterating over the existing ones and reusing them instead of deleting them. You can use std::vector with std::vector::reserve, or std::array, if you want to squeeze it down to C level.
Nonetheless, for every object created, a destructor must eventually be called. The only way I see to avoid the creations and destructions is to use T::operator= when iterating over the container, or maybe C++11 move semantics if that's suitable in your case.
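A sketch of the reuse-by-assignment idea (ReusableList is my name for it):

```cpp
#include <list>

// The list never shrinks: clear() just resets a logical end iterator, and
// refilling assigns over the existing nodes via T::operator=. Allocation
// happens only past the high-water mark.
template <class T>
class ReusableList {
public:
    void push(const T& v) {
        if (end_ == items_.end()) {
            items_.push_back(v);  // grows the high-water mark: allocates
            end_ = items_.end();
        } else {
            *end_ = v;            // reuses an existing node: no allocation
            ++end_;
        }
    }

    void clear() { end_ = items_.begin(); }  // keeps all nodes alive

    auto begin() { return items_.begin(); }
    auto end() { return end_; }

private:
    std::list<T> items_;  // declared first so end_ can be initialized from it
    typename std::list<T>::iterator end_ = items_.begin();
};
```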