Custom allocators as alternatives to vector of smart pointers? - c++

This question is about owning pointers, consuming pointers, smart pointers, vectors, and allocators.
I am a little bit lost in my thoughts about code architecture. Furthermore, if this question already has an answer somewhere, 1. sorry, but I haven't found a satisfying answer so far and 2. please point me to it.
My problem is the following:
I have several "things" stored in a vector and several "consumers" of those "things". So, my first try was as follows:
std::vector<thing> i_am_the_owner_of_things;
thing* get_thing_for_consumer() {
    // some thing-selection logic
    return &i_am_the_owner_of_things[5]; // 5 is just an example
}
...
// somewhere else in the code:
class consumer {
    consumer() {
        m_thing = get_thing_for_consumer();
    }
    thing* m_thing;
};
In my application, this would be safe because the "things" outlive the "consumers" in any case. However, more "things" can be added at runtime, and that can become a problem: if the std::vector<thing> i_am_the_owner_of_things gets reallocated, all the thing* m_thing pointers become invalid.
A fix to this scenario would be to store unique pointers to "things" instead of "things" directly, i.e. as follows:
std::vector<std::unique_ptr<thing>> i_am_the_owner_of_things;
thing* get_thing_for_consumer() {
    // some thing-selection logic
    return i_am_the_owner_of_things[5].get(); // 5 is just an example
}
...
// somewhere else in the code:
class consumer {
    consumer() {
        m_thing = get_thing_for_consumer();
    }
    thing* m_thing;
};
The downside here is that memory coherency between "things" is lost. Can this memory coherency be re-established by using custom allocators somehow? I am thinking of something like an allocator which would always allocate memory for, e.g., 10 elements at a time and, whenever required, add another 10-element-sized chunk of memory.
Example:
initially:
v = ☐☐☐☐☐☐☐☐☐☐
more elements:
v = ☐☐☐☐☐☐☐☐☐☐ 🡒 ☐☐☐☐☐☐☐☐☐☐
and again:
v = ☐☐☐☐☐☐☐☐☐☐ 🡒 ☐☐☐☐☐☐☐☐☐☐ 🡒 ☐☐☐☐☐☐☐☐☐☐
Using such an allocator, I wouldn't even have to use std::unique_ptrs of "things" because at std::vector's reallocation time, the memory addresses of the already existing elements would not change.
As an alternative, I can only think of referencing the "thing" in "consumer" via a std::shared_ptr<thing> m_thing, as opposed to the current thing* m_thing, but that seems like the worst approach to me, because a "consumer" shall not own a "thing", and with shared pointers I would create shared ownership.
So, is the allocator-approach a good one? And if so, how can it be done? Do I have to implement the allocator by myself or is there an existing one?

If you are able to treat thing as a value type, do so. It simplifies things; you don't need a smart pointer to circumvent the pointer/reference invalidation issue. The latter can be tackled differently:
If new thing instances are inserted via push_front and push_back during the program, use std::deque instead of std::vector. Then, no pointers or references to elements in this container are invalidated (iterators are invalidated, though - thanks to #odyss-jii for pointing that out). If you fear that you heavily rely on the performance benefit of the completely contiguous memory layout of std::vector: create a benchmark and profile.
If new thing instances are inserted in the middle of the container during the program, consider using std::list. No pointers/iterators/references are invalidated when inserting or removing container elements. Iteration over a std::list is much slower than over a std::vector, but make sure this is an actual issue in your scenario before worrying too much about that.
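For illustration, a minimal sketch of the std::deque option, assuming thing is a value type (the thing type and the selection logic are placeholders from the question):

#include <deque>

struct thing { int value; };                 // placeholder for the real type

std::deque<thing> i_am_the_owner_of_things;

thing* get_thing_for_consumer() {
    // some thing-selection logic
    // unlike with std::vector, this address stays valid even after later
    // push_back/push_front calls (erasing elements is another matter)
    return &i_am_the_owner_of_things[5];
}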

There is no single right answer to this question, since it depends a lot on the exact access patterns and desired performance characteristics.
Having said that, here is my recommendation:
Continue storing the data contiguously as you are, but do not store aliasing pointers to that data. Instead, consider a safer alternative (this is a proven method) where you fetch the pointer based on an ID right before using it -- as a side-note, in a multi-threaded application you can lock attempts to resize the underlying store whilst such a weak reference lives.
So your consumer will store an ID and will fetch a pointer to the data from the "store" on demand. This also gives you control over all "fetches", so that you can track them, implement safety measures, etc.
void consumer::foo() {
    thing *t = m_thing_store.get(m_thing_id);
    if (t) {
        // do something with t
    }
}
Or a more advanced alternative to help with synchronization in a multi-threaded scenario:
void consumer::foo() {
    reference<thing> t = m_thing_store.get(m_thing_id);
    if (!t.empty()) {
        // do something with t
    }
}
Where reference would be some thread-safe RAII "weak pointer".
There are multiple ways of implementing this. One option is an open-addressing hash table with the ID as the key; this will give you roughly O(1) access time if you balance it properly.
Another alternative (best-case O(1), worst-case O(N)) is to use a "reference" structure with a 32-bit ID and a 32-bit index (so the same size as a 64-bit pointer) -- the index serves as a sort of cache. When you fetch, you first try the index; if the element at that index has the expected ID, you are done. Otherwise, you get a "cache miss" and do a linear scan of the store to find the element by ID, and then you store the last-known index value in your reference.
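A rough single-threaded sketch of that second alternative -- the names thing_store and reference are just illustrative, not from any real library:

#include <cstdint>
#include <vector>

struct thing {
    std::uint32_t id;        // unique, never reused in this sketch
    // ... payload ...
};

class thing_store {
    std::vector<thing> things_;
    std::uint32_t next_id_ = 0;
public:
    std::uint32_t add(thing t) {
        t.id = next_id_++;
        things_.push_back(t);
        return things_.back().id;
    }

    // "Reference" combining the ID with a cached index; same size as a 64-bit pointer.
    struct reference { std::uint32_t id; std::uint32_t index; };

    thing* get(reference& r) {
        if (r.index < things_.size() && things_[r.index].id == r.id)
            return &things_[r.index];                       // cache hit
        for (std::uint32_t i = 0; i < things_.size(); ++i) { // cache miss: linear scan
            if (things_[i].id == r.id) {
                r.index = i;                                 // remember the last-known index
                return &things_[i];
            }
        }
        return nullptr;                                      // the thing no longer exists
    }
};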

IMO the best approach would be to create a new container which behaves in a safe way.
Pros:
the change will be done on a separate level of abstraction
changes to the old code will be minimal (just replace std::vector with the new container)
it is the "clean code" way to do it
Cons:
it may look like there is a bit more work to do
Another answer proposes the use of std::list, which will do the job, but with a larger number of allocations and slower random access. So IMO it is better to compose your own container from a couple of std::vectors.
So it may start to look more or less like this (minimal example):
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

template<typename T>
class cluster_vector
{
public:
    static constexpr size_t cluster_size = 16;

    cluster_vector() {
        clusters.reserve(1024);
        add_cluster();
    }

    ...

    size_t size() const {
        if (clusters.empty()) return 0;
        return (clusters.size() - 1) * cluster_size + clusters.back().size();
    }

    T& operator[](size_t index) {
        throw_if_index_too_big(index);
        return clusters[index / cluster_size][index % cluster_size];
    }

    void push_back(T&& x) {
        if_last_is_full_add_cluster();
        clusters.back().push_back(std::move(x));
    }

private:
    void throw_if_index_too_big(size_t index) const {
        if (index >= size()) {
            throw std::out_of_range("cluster_vector out of range");
        }
    }

    void add_cluster() {
        clusters.push_back({});
        clusters.back().reserve(cluster_size);
    }

    void if_last_is_full_add_cluster() {
        if (clusters.back().size() == cluster_size) {
            add_cluster();
        }
    }

private:
    std::vector<std::vector<T>> clusters;
};
This way you provide a container which will not reallocate its items. It doesn't matter what T is.
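A quick usage sketch (assuming thing is default-constructible) showing why handed-out pointers stay valid:

struct thing { int value = 0; };         // placeholder for the question's type

int main() {
    cluster_vector<thing> things;
    things.push_back(thing{});
    thing* stable = &things[0];          // pointer into the first cluster

    for (int i = 0; i < 1000; ++i)
        things.push_back(thing{});       // only whole new clusters get allocated

    // 'stable' still points at the same object: the inner vectors never
    // reallocate, and moving them (when 'clusters' itself grows) keeps their
    // heap buffers, so element addresses are preserved.
    stable->value = 42;
}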

[A shared pointer] seems like the worst approach to me, because a "consumer" shall not own a "thing" and with shared pointers I would create shared ownership.
So what? Maybe the code is a little less self-documenting, but it will solve all your problems.
(And by the way you are muddling things by using the word "consumer", which in a traditional producer/consumer paradigm would take ownership.)
Also, returning a raw pointer in your current code is already entirely ambiguous as to ownership. In general, I'd say it's good practice to avoid raw pointers if you can (e.g. when you don't need to call delete). I would return a reference if you go with unique_ptr:
std::vector<std::unique_ptr<thing>> i_am_the_owner_of_things;
thing& get_thing_for_consumer() {
    // some thing-selection logic
    return *i_am_the_owner_of_things[5]; // 5 is just an example
}

Related

C++ - Best way to create a non-local variable pointer

I have recently been relearning C++ as I develop a game in the Unreal engine. It's been about 3 years since I last touched C++, and I have been mostly using Java since then.
Due to the differences between Java and C++, I can already tell there are different best practices for similar concepts.
I have 2 methods like this.
void UMarchingSquares::Generate(std::map<Vector2, int> automata) {
    std::map<Vector2, ControlNode*> controlNodes = getControlNodes(automata);
}

std::map<Vector2, ControlNode*> UMarchingSquares::getControlNodes(std::map<Vector2, int> automata) {
    std::map<Vector2, ControlNode*> controlNodes = std::map<Vector2, ControlNode*>();
    for (pair<Vector2, int> pair : automata) {
        Vector2 pos = pair.first;
        ControlNode node = ControlNode(pos, pair.second);
        controlNodes[pos] = &node;
    }
    return controlNodes;
}
I am probably breaking a few different C++ best practices, but there is one specific area that I really want clarification on.
I am initializing the ControlNode object in the getControlNodes() method's for loop. I know now that doing it this way is bad, because I am storing a pointer to a local variable, which then goes out of scope every loop iteration. I would prefer to store pointers instead of the actual Control node (though I may be convinced otherwise, since a Control Node holds a position [2 floats], a material [1 integer], and two other objects that both have a position and material of their own.)
What is the best way to create a non-local variable pointer? I know I can just use "new ControlNode()", but from what I know, that ends up being a fairly expensive call, and requires cleanup (which may be expensive as well).
I am going to be calling this part of the code fairly frequently, so I would like it to be efficient.
Thank you!
C++ changed a lot in the last few years to make life easier for those using it.
Looking at your code, it raises a lot of questions:
Why is the value of your map a raw pointer to ControlNode instead of a ControlNode by value or a unique_ptr to it?
In your for-loop, why do you write the explicit type of pair (which is different from the type of the iterator)? auto could help you here to have fewer copies.
As your question is about the first, I'll ignore the second one.
Looking at this, you have 3 ways of fixing the code:
std::map<Vector2, ControlNode> getControlNodes(std::map<Vector2, int> automata) {
    auto controlNodes = std::map<Vector2, ControlNode>{};
    for (auto &&pair : automata) {
        auto &&pos = pair.first;
        auto node = ControlNode(pos, pair.second);
        controlNodes[pos] = std::move(node);
    }
    return controlNodes;
}
In this code, you can see that the * has been removed from the map. This implies that ownership of the ControlNode is moved into the map. (Also note the std::move.) This is similar to how the int is stored in the map that comes in as an argument.
If, however, you require a separate memory allocation because you will be moving this around and the address needs to be stable, std::unique_ptr is a good solution.
std::map<Vector2, std::unique_ptr<ControlNode>> getControlNodes(std::map<Vector2, int> automata) {
    auto controlNodes = std::map<Vector2, std::unique_ptr<ControlNode>>{};
    for (auto &&pair : automata) {
        auto &&pos = pair.first;
        auto node = std::make_unique<ControlNode>(pos, pair.second);
        controlNodes[pos] = std::move(node);
    }
    return controlNodes;
}
As you can see, this code is very similar to the previous one; I've replaced the type in the map and changed the construction of the ControlNode to std::make_unique. Hence, you have a unique_ptr owning the allocated memory (and as long as you have the unique_ptr, the ControlNode stays valid).
The third solution should only be used if you can't change the signature and is considered bad practice in C++ as it passes ownership via raw pointers. Now your caller is responsible for explicitly cleaning up the memory as C++ doesn't have garbage collection.
std::map<Vector2, ControlNode*> getControlNodes(std::map<Vector2, int> automata) {
    auto controlNodes = std::map<Vector2, ControlNode*>{};
    for (auto &&pair : automata) {
        auto &&pos = pair.first;
        auto node = new ControlNode(pos, pair.second);
        controlNodes[pos] = node;
    }
    return controlNodes;
}
PS: I've added some auto to the code to make the changes between the snippets minimal.
Use a vector of control nodes for storage. Whenever you need a new control node, append one to that vector. Instead of using a pointer, use an iterator into that vector. Make sure you have reserved enough slots in that vector up front, or else your iterators will get invalidated.
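A rough sketch of that idea; Vector2 and ControlNode here are simplified stand-ins for the question's types, and addNode is a made-up helper:

#include <vector>

struct Vector2 { float x, y; };
struct ControlNode {
    Vector2 pos;
    int material;
    ControlNode(Vector2 p, int m) : pos(p), material(m) {}
};

std::vector<ControlNode> storage;           // owns all nodes
using NodeHandle = std::vector<ControlNode>::iterator;

NodeHandle addNode(Vector2 pos, int material) {
    // storage must have been reserved large enough up front; a reallocation
    // would invalidate every previously handed-out iterator.
    storage.emplace_back(pos, material);
    return storage.end() - 1;               // iterator to the new node
}

int main() {
    storage.reserve(1024);                  // upper bound known in advance
    NodeHandle h = addNode({1.0f, 2.0f}, 3);
    h->material = 5;                        // use the handle like a pointer
}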

Is std::push_back relatively expensive to use?

I want to improve the performance of the following code. What aspect might affect the performance of the code when it's executed?
Also, considering that there is no limit to how many objects you can add to the container, what improvements could be made to “Object” or “addToContainer” to improve the performance of the program?
I was wondering if std::push_back in C++ affects performance of the code in any way? Especially if there is no limit to adding to list.
#include <string>
#include <vector>

using std::string;
using std::vector;

struct Object {
    string name;
    string description;
};

vector<Object> container;

void addToContainer(Object object) {
    container.push_back(object);
}

int main() {
    addToContainer({ "Fira", "+5 ATTACK" });
    addToContainer({ "Potion", "+10 HP" });
}
Before you do ANYTHING profile the code and get a benchmark. After you make a change profile the code and get a benchmark. Compare the benchmarks. If you do not do this, you're rolling dice. Is it faster? Who knows.
Profile profile profile.
With push_back you have two main concerns:
Resizing the vector when it fills up, and
Copying the object into the vector.
There are a number of improvements you can make to the resizing cost of push_back, depending on how items are being added.
Strategic use of reserve to minimize the amount of resizing, for example. If you know how many items are about to be added, you can check the capacity and size to see if it's worth your time to reserve to avoid multiple resizes. Note this requires knowledge of vector's expansion strategy and that is implementation-specific. An optimization for one vector implementation could be a terribly bad mistake on another.
You can use insert to add multiple items at a time. Of course this is close to useless if you need to add another container into the code in order to bulk-insert.
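As a small illustration of the reserve and insert points above (Object as in the question; the element counts are made up):

#include <string>
#include <vector>

struct Object {
    std::string name;
    std::string description;
};

int main() {
    std::vector<Object> container;
    std::vector<Object> incoming{ { "Fira", "+5 ATTACK" }, { "Potion", "+10 HP" } };

    // If the number of additions is known, one reserve avoids repeated resizing.
    container.reserve(container.size() + incoming.size());

    // One insert call appends the whole batch instead of push_back in a loop.
    container.insert(container.end(), incoming.begin(), incoming.end());
}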
If you have no idea how many items are incoming, you might as well let vector do its job and optimize HOW the items are added.
For example
void addToContainer(Object object) // pass by value. Possible copy
{
    container.push_back(object); // copy
}
Those copies can be expensive. Get rid of them.
void addToContainer(Object && object) // no copy and can still handle temporaries
{
    container.push_back(std::move(object)); // moves rather than copies
}
std::string is often very cheap to move.
This variant of addToContainer can be used with
addToContainer({ "Fira", "+5 ATTACK" });
addToContainer({ "Potion", "+10 HP" });
and might just migrate a pointer and a few book-keeping variables per string. They are temporaries, so no one cares if it rips their guts out and throws away the corpses.
As for existing Objects
Object o{"Pizza pop", "+5 food"};
addToContainer(std::move(o));
If they are expendable, they get moved as well. If they aren't expendable...
void addToContainer(const Object & object) // no copy
{
    container.push_back(object); // copy
}
You have an overload that does it the hard way.
Tossing this one out there
If you already have a number of items you know are going to be in the list, rather than appending them all one at a time, use an initialization list:
vector<Object> container{
    {"Vorpal Cheese Grater", "Many little pieces"},
    {"Holy Hand Grenade", "OMG Damage"}
};
push_back can be extremely expensive, but as with everything, it depends on the context. Take for example this terrible code:
std::vector<float> slow_func(const float* ptr)
{
    std::vector<float> v;
    for (size_t i = 0; i < 256; ++i)
        v.push_back(ptr[i]);
    return v;
}
each call to push_back has to do the following:
Check to see if there is enough space in the vector
If not, allocate new memory, and copy the old values into the new vector
copy the new item to the end of the vector
increment end
Now there are two big problems here wrt performance. Firstly each push_back operation depends upon the previous operation (since the previous operation modified end, and possibly the entire contents of the array if it had to be resized). This pretty much destroys any vectorisation possibilities in the code. Take a look here:
https://godbolt.org/z/RU2tM0
The func that uses push_back does not make for very pretty asm. It's effectively hamstrung into being forced to copy a single float at a time. Now if you compare that to an alternative approach where you resize first and then assign, the compiler just replaces the whole lot with a call to new and a call to memcpy. This will be a few orders of magnitude faster than the previous method.
std::vector<float> fast_func(const float* ptr)
{
    std::vector<float> v(256);
    for (size_t i = 0; i < 256; ++i)
        v[i] = ptr[i];
    return v;
}
BUT, and it's a big but, the relative performance of push_back very much depends on whether the items in the array can be trivially copied (or moved). If for example you do something silly like:
struct Vec3 {
    float x = 0;
    float y = 0;
    float z = 0;
};
Well now when we did this:
std::vector<Vec3> v(256);
The compiler will allocate memory, but also be forced to set all the values to zero (which is pointless if you are about to overwrite them again!). The obvious way around this is to use a different constructor:
std::vector<Vec3> v(ptr, ptr + 256);
So really, only use push_back (well, really you should prefer emplace_back in most cases) when either:
additional elements are added to your vector occasionally
or, the objects you are adding are complex to construct (in which case, use emplace_back! -- see the short sketch below)
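A short emplace_back sketch, assuming Object gains a matching constructor (with the plain aggregate from the question, emplace_back("Fira", "+5 ATTACK") only compiles from C++20 onwards):

#include <string>
#include <utility>
#include <vector>

struct Object {
    std::string name;
    std::string description;
    Object(std::string n, std::string d)
        : name(std::move(n)), description(std::move(d)) {}
};

int main() {
    std::vector<Object> container;
    container.reserve(2);                     // count known up front, so no regrowth
    // emplace_back constructs the Object in place from its arguments,
    // avoiding the temporary that push_back would copy or move in.
    container.emplace_back("Fira", "+5 ATTACK");
    container.emplace_back("Potion", "+10 HP");
}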
Without any other requirements, unfortunately this is the most efficient:
void addToContainer(Object) { }
To answer the rest of your question: in general push_back will just add to the end of the allocated vector in O(1), but it will need to grow the vector on occasion, which is O(N) but amortizes out.
Also, it would likely be more efficient not to use string but to keep a char*, although memory management might be tricky unless it is always a literal being added.

Fastest way to allocate temporary elements (knowing maximum number) in a vector?

In a function I need to store some integers in a vector. The function is called a lot of times. I know that there are fewer than 10 of them, but the number varies for each call of the function. Which choice gives better performance?
For example, I found that this:
std::vector<int> list(10);
std::vector<int>::iterator it = list.begin();
unsigned int num_of_elements_stored = 0;
for ( ... iterate on some structures ... ) {
    if (... a specific condition ...) {
        *it = integer from structures;
        it++;
        num_of_elements_stored++;
    }
}
is slower than:
std::vector<int> list;
unsigned int num_of_elements_stored(0);
for ( ... iterate on some structures ... ) {
    if (... a specific condition ...) {
        list.push_back( integer from structures );
    }
}
num_of_elements_stored = list.size();
I'm going to go down an extremely uncool route here. At the risk of being crucified, I would suggest that std::vector isn't so great here. An exception would be if you get lucky with the memory allocator and get that temporal locality through the allocator that creating and destroying a bunch of teeny vectors normally wouldn't provide.
Wait!
Before people kill me, I want to say that vector is awesome, generally speaking, as one of the most well-rounded data structures available. But when you're looking at a hotspot like this (hopefully with a profiler) as a result of creating a bunch of teeny vectors repeatedly in a tight loop, that's where this kind of straightforward usage of vector can bite you.
The trouble is that it's a heap-allocated structure (basically a dynamic array), and when we're dealing with a boatload of teeny arrays like this, we really want to use that often-cached memory at the top of the stack that's so cheap to allocate/free when we can.
One way to mitigate this is to reuse the same vector across repeated calls. Store it in the outside caller function's scope and pass it in by reference, clear it, do your push_backs, rinse and repeat. It's worth noting that clear doesn't free any memory in the vector, so it keeps that former capacity around (useful here when we want to reuse the same memory and play to temporal locality).
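A small sketch of that reuse pattern; the loop body is just pseudo-filled with a placeholder condition, and the names are illustrative:

#include <vector>

// The caller owns one vector and reuses it across calls.
void collect_ints(std::vector<int>& out /*, ... the structures ... */) {
    out.clear();                             // size goes to 0, capacity is kept
    for (int i = 0; i < 100; ++i) {          // placeholder for "iterate on some structures"
        if (i % 7 == 0) {                    // placeholder for "a specific condition"
            out.push_back(i);
        }
    }
}

int main() {
    std::vector<int> scratch;
    scratch.reserve(10);                     // warmed up once
    for (int call = 0; call < 1000; ++call) {
        collect_ints(scratch);
        // ... use scratch ...
    }                                        // no per-call heap allocation after warm-up
}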
But here we can play to that stack. As a simplified example (using C-style code that isn't very kosher in C++ or even bothers with exception-safety, but easier to illustrate):
int stack_mem[32];
int num = 0;
int cap = 32;
int* ptr = stack_mem;

for ( ... iterate on some structures ... )
{
    if (... a specific condition ...)
    {
        if (num == cap)
        {
            cap *= 2;
            int* new_ptr = static_cast<int*>(malloc(cap * sizeof(int)));
            memcpy(new_ptr, ptr, num * sizeof(int));
            if (ptr != stack_mem)
                free(ptr);
            ptr = new_ptr;
        }
        ptr[num++] = your_int;
    }
}

if (ptr != stack_mem)
    free(ptr);
Of course if you use something like this, you should properly wrap it in a reusable class template that does bounds-checking, doesn't use memcpy, has exception-safety, a formal push_back method, emplace_back, copy ctor, move ctor, swap, possibly a fill ctor, range ctor, erase, range erase, insert, range insert, size, empty, iterators/begin/end, uses placement new to avoid requiring copy assignment or default ctor, etc.
The solution uses the stack when N <= 32 (can use a different number suited for your common-case needs) and then switches to heap when exceeded. This allows it to handle your common case scenarios efficiently but also not just go kablooey in those rare case scenarios when N might be huge in some pathological case. That makes it somewhat comparable to variable-length arrays in C (something I actually wish we had in C++, at least until std::dynarray is available) but without the stack overflow tendencies VLAs could have since it switches to heap in rare case scenarios.
I applied all these standard-compliant formalities with a structure based on this idea, with a class template that accepts <T, FixedN>, and now use it almost as much as vector since I work with so many cases like this with teeny arrays being repeatedly created that should, in the vast majority of common cases, fit on the stack (but always with those ultra rare exceptional possibilities). It wiped many memory-related profiler hotspots I was getting off the map.
... but applying this basic idea might give you quite a boost. You can apply that kind of effort above of wrapping it into a safe container preserving C++ object semantics if it pays off in your measurements, and I think it should quite a bit in your case.
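To make the idea a bit more concrete, here is a bare-bones RAII sketch of the same structure, restricted to trivially copyable element types so the memcpy-based growth stays legal. The class name and interface are made up, and a real version would still need the formalities listed above:

#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <type_traits>

template <typename T, std::size_t FixedN = 32>
class small_buffer {
    static_assert(std::is_trivially_copyable<T>::value,
                  "this sketch memcpy's elements, so they must be trivially copyable");
    T stack_mem_[FixedN];                    // common case lives on the stack
    T* ptr_ = stack_mem_;
    std::size_t num_ = 0;
    std::size_t cap_ = FixedN;
public:
    small_buffer() = default;
    small_buffer(const small_buffer&) = delete;             // copying omitted for brevity
    small_buffer& operator=(const small_buffer&) = delete;
    ~small_buffer() { if (ptr_ != stack_mem_) std::free(ptr_); }

    void push_back(const T& value) {
        if (num_ == cap_) {                                  // rare case: spill to the heap
            cap_ *= 2;
            T* new_ptr = static_cast<T*>(std::malloc(cap_ * sizeof(T))); // error handling omitted
            std::memcpy(new_ptr, ptr_, num_ * sizeof(T));
            if (ptr_ != stack_mem_) std::free(ptr_);
            ptr_ = new_ptr;
        }
        ptr_[num_++] = value;
    }

    std::size_t size() const { return num_; }
    T* begin() { return ptr_; }
    T* end() { return ptr_ + num_; }
};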
I would probably go with sort of a middle ground:
std::vector<int> list;
list.reserve(10);
...and the rest could be pretty much like your second version. To be honest, however, it's open to question whether this will really make a lot of difference.
If you use a static vector, it will be allocated only once. The first example is slower because it allocates and destroys the vector on each call.

Is this the correct way to access objects inside a list?

EDIT: TLDR? Here's a summary:
The requirement is for an essentially infinitely (or arbitrarily) long container. So list sounds like a good idea, because it will fit the objects in whatever memory space is available.
However, vectors are much faster and more efficient at access, but might not fit in memory if we don't have a long sequential strip available.
A vector of pointers was suggested to reduce memory usage, but the problem remains: if there is a gigabyte of pointers and I have 4 GB of RAM, it might just not fit!
Solution: A list of vectors might be the way to go. Each item in the list could be a vector with 1000 pointers to items which we want to be able to access. A class could handle this functionality.
**Original Question:**
As a wise man once said: "With pointers, if it works once, that doesn't guarantee you are doing it correctly."
I have a class:
class A;
And class A is inside a std::list:
std::list<A> list_of_A;
To access items inside it I am using:
std::list<A>::iterator iter = list_of_A.begin();
std::advance(iter, <an_unsigned_int>);
return *iter;
This seems to be working, but is return *iter the correct thing to be doing? I should mention the last 3 lines are inside a function which returns a const A&.
I looked for an answer on stackoverflow, but couldn't find a duplicate of this question, which surprises me.
List > Vector because I will be swapping things in and out of the list.
Yes; you will return a reference to the element inside the list if your function returns A& or A const&, and a copy if your function returns A.
However, if you are doing this regularly, why not use a std::vector? It has random access iterators and is almost always more efficient than a std::list, unless the objects are large and you have a large number of them. std::list is very cache-inefficient.
This is good as long as you have not advanced to (or past) end().
const A& stuff(std::list<A>& list_of_A, int index)
{
    assert(index <= list_of_A.size()); // Protect against UB
                                       // of advancing past end.
    std::list<A>::iterator iter = list_of_A.begin();
    std::advance(iter, index);
    if (iter == list_of_A.end())
    {
        throw std::runtime_error("Failed"); // Not allowed to de-reference end()
    }
    return *iter;
}

How to add objects dynamically

This is the question:
How to do IT right?
IT = add objects dynamically (meaning: create the class structures to support that)
class Branch
{
    Leaves lv; //it should have many leaves!!
};

class Tree
{
    Branch br; //it should have many branches!!!
};
Now a non-working example (neither snippet is C++!!, but I try to draw the idea):
class Branch
{
    static lv_count;
    Leaves lv; //it should have many leaves!! (and should be some pointer)
public:
    add(Leave lv)
    {
        lv[lv_count] = lv;
        lv_count++;
    }
};

class Tree
{
    static br_count;
    Branch br; //it should have many branches!!! (and should be some pointer)
    Tree
public:
    add(Branch br)
    {
        br[br_count] = lv;
        br_count++;
    }
};
This, for example, resorts to a stupid approach:
class Branch
{
    static count;
    Leaves l[1000]; //mmm i don't like this
    //...
};

class Tree
{
    static count;
    Branch b[1000]; //mmm i don't like this
    //...
};
I would like to know the formal, normal way of doing this. Thanks!!!!!!
std::vector is the thing you are looking for, I guess...
class Tree
{
    std::vector<Branch> branches;
};
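A slightly fuller sketch along those lines, using Leaf/Branch/Tree for the question's classes; the add methods just forward to push_back:

#include <vector>

class Leaf {};

class Branch {
    std::vector<Leaf> leaves;          // grows as needed, no fixed 1000 limit
public:
    void add(Leaf leaf) { leaves.push_back(leaf); }
};

class Tree {
    std::vector<Branch> branches;
public:
    void add(Branch branch) { branches.push_back(branch); }
};

int main() {
    Tree tree;
    Branch branch;
    branch.add(Leaf{});
    tree.add(branch);                  // the vector handles the memory for you
}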
Vectors are the generic solution. However, you should look at memory allocation before you start using library code such as vectors -- e.g., C++ new, calloc, malloc, thread-local memory, etc. Each STL container has its own algorithmic complexities, and studying these will aid you in picking the right one.
Discussion:
If you want something to grow and you don't have the space for it... well, you have to reallocate: acquire a bigger memory buffer and copy the old buffer into it at the correct index offsets. This is what vector does behind the scenes; of course, vector just does this very well by growing geometrically (the factor is implementation-specific, commonly 2x). Growing in this way means it has more memory than it needs, which means that data added to the vector in the future won't cause an immediate reallocation.
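If you want to watch that happening, here is a tiny demo; the exact capacities printed (and the growth factor) are implementation-specific:

#include <iostream>
#include <vector>

int main() {
    std::vector<int> v;
    auto last_capacity = v.capacity();
    for (int i = 0; i < 1000; ++i) {
        v.push_back(i);
        if (v.capacity() != last_capacity) {      // a reallocation (and copy/move) happened
            last_capacity = v.capacity();
            std::cout << "size " << v.size()
                      << " grew capacity to " << last_capacity << '\n';
        }
    }
}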
However, I must add that this is very inefficient; you can almost always avoid the cost of copying things around. Vector's main use is for contiguous memory regions; you can almost always improve on vector by using linked data structures, perhaps a binary tree for searching on a key :D