push_back objects into vector memory issue C++ - c++

Compare the two ways of initializing vector of objects here.
1.
vector<Obj> someVector;
Obj new_obj;
someVector.push_back(new_obj);
2.
vector<Obj*> ptrVector;
Obj* objptr = new Obj();
ptrVector.push_back(objptr);
The first one push_back actual object instead of the pointer of the object. Is vector push_back copying the value being pushed? My problem is, I have huge object and very long vectors, so I need to find a best way to save memory.
Is the second way better?
Are there other ways to have a vector of objects/pointers that I can find each object later and use the least memory at the same time?

Of the two above options, this third not included one is the most efficient:
std::vector<Obj> someVector;
someVector.reserve(preCalculatedSize);
for (int i = 0; i < preCalculatedSize; ++i)
someVector.emplace_back();
emplace_back directly constructs the object into the memory that the vector arranges for it. If you reserve prior to use, you can avoid reallocation and moving.
However, if the objects truly are large, then the advantages of cache-coherency are less. So a vector of smart pointers makes sense. Thus the forth option:
std::vector< std::unique_ptr<Obj> > someVector;
std::unique_ptr<Obj> element( new Obj );
someVector.push_back( std::move(element) );
is probably best. Here, we represent the lifetime of the data and how it is accessed in the same structure with nearly zero overhead, preventing it from getting out of sync.
You have to explicitly std::move the std::unique_ptr around when you want to move it. If you need a raw pointer for whatever reason, .get() is how to access it. -> and * and explicit operator bool are all overridden, so you only really need to call .get() when you have an interface that expects a Obj*.
Both of these solutions require C++11. If you lack C++11, and the objects truly are large, then the "vector of pointers to data" is acceptable.
In any case, what you really should do is determine which matches your model best, check performance, and only if there is an actual performance problem do optimizations.

If your Obj class doesn't require polymorphic behavior, then it is better to simply store the Obj types directly in the vector<Obj>.
If you store objects in vector<Obj*>, then you are assuming the responsibility of manually deallocating those objects when they are no longer needed. Better, in this case, to use vector<std::unique_ptr<Obj>> if possible, but again, only if polymorphic behavior is required.
The vector will store the Obj objects on the heap (by default, unless you override the allocator in the vector template). These objects will be stored in contiguous memory, which can also give you better cache locality, depending upon your use case.
The drawback to using vector<Obj> is that frequent insertion/removal from the vector may cause reallocation and copying of your Obj objects. However, that usually will not be the bottleneck in your application, and you should profile it if you feel like it is.
With C++11 move semantics, the implications of copying can be much reduced.

Using a vector<Obj> will take less memory to store if you can reserve the size ahead of time. vector<Obj *> will necessarily use more memory than vector<Obj> if the vector doesn't have to be reallocated, since you have the overhead of the pointers and the overhead of dynamic memory allocation. This overhead may be relatively small though if you only have a few large objects.
However, if you are very close to running out of memory, using vector<Obj> may cause a problem if you can't reserve the correct size ahead of time because you'll temporarily need extra storage when reallocating the vector.
Having a large vector of large objects may also cause an issue with memory fragmentation. If you can create the vector early in the execution of your program and reserve the size, this may not be an issue, but if the vector is created later, you might run into a problem due to memory holes on the heap.

Under the circumstances, I'd consider a third possibility: use std::deque instead of std::vector.
This is kind of a halfway point between the two you've given. A vector<obj> allocates one huge block to hold all the instances of the objects in the vector. A vector<obj *> allocates one block of pointers, but each instance of the object in a block of its own. Therefore, you get N objects plus N pointers.
A deque will create a block of pointers and a number of blocks of objects -- but (at least normally) it'll put a number of objects (call it M) together into a single block, so you get a block of N/M pointers, and N/M of objects.
This avoids many of the shortcomings of either a vector of objects or a vector of pointers. Once you allocate a block of objects, you never have to reallocate or copy them. You do (or may) eventually have to reallocate the block of pointers, but it'll be smaller (by a factor of M) than the vector of pointers if you try to do it by hand.
One caveat: if you're using Microsoft's compiler/standard library, this may not work very well -- they have some strange logic (still present up through VS 2013 RC) that means if your object size is larger than 16, you'll get only one object per block -- i.e., equivalent to your vector<obj *> idea.

Related

store strings in stable memory in c++

A little bit of background first (skip ahead to the boldface if you're bored by this).
I'm trying to glue two pieces of code together. One is a JSON/YML library that makes heavy use of a custom string view object, the other is a piece of code from the early 2000s.
I've been seeing weird behavior for a long time, until I have traced it down to a memory issue, namely that the string views I construct in the JSON/YML library take a const char* as a constructor, and assume that the memory location of that char array stays constant over the lifetime of the string view. However, some of the std::string objects on which I construct these views are temporary, so that's just not true and the string view ends up pointing at garbage.
Now, I thought I was being smart and constructed a cache in the form of a std::vector that would hold all the temporary strings, I would construct the string views on these and only clear the cache in the end - easy.
However, I was still seeing garbled strings every now and then, until I found the reason: sometimes, when pushing things to the vector beyond the preallocated size, the vector would be moved to a different memory location, invalidating all the string views. For now, I've settled on preallocating a cache size that is large enough to avoid any conceivable moving of the vector, but I can see this causing severe and untracable problems in the future for very large runs. So here's my question:
How can I construct a std::vector<std::string> or any other string container that either avoids being moved in memory alltogether, or at least throws an error message if that happens?
Of course, if you feel that I'm going about this whole issue in the wrong way fundamentally, please also let me know how I should deal with this issue instead.
If you're interested, the two pieces of code in question are RapidYAML and the CERN Statistics Library ROOT.
My answer from a similar question: Any way to update pointer/reference value when vector changes capability?
If you store objects in your vector as std::unique_ptr or std::shared_ptr, you can get an observing pointer to the underlying object with std::unique_ptr::get() (or a reference if you dereference the smart pointer). This way, even though the memory location of the smart pointer changes upon resizing, the observing pointer points to the same object and thus the same memory location.
[...] sometimes, when pushing things to the vector beyond the preallocated size, the vector would be moved to a different memory location, invalidating all the string views.
The reason is that std::vector is required to store its data contiguously in memory. So, if you exceed the maximum capacity of the vector when adding an element, it will allocate a new space in memory (big enough this time) and move all the data here.
What you are subject to is called iterator invalidation.
How can I construct a std::vector or any other string container that either avoids being moved in memory alltogether, or at least throws an error message if that happens?
You have at least 3 easy solutions:
If your cache size is supposed to be fixed and is known at compile-time, I would advise you to use std::array instead.
If your cache size is supposed to be fixed but not necessarily known at compile-time, I would advise you to reserve() the required capacity of your std::vector so that you will have the guarantee that it will big enough to not need to be reallocated.
If your cache size may change, I would advise you to use std::list instead. It is implemented as a (usually doubly) linked list. It will guarantee that the elements will not be relocated in memory.
But since they are not stored contiguously in memory, you'll lose the ability to have direct access to any element (i.e. you'll need to iterate over the list in order to find an element).
Of course there probably are other solutions (I do not claim this answer to be exhaustive) but these solutions will allow you to almost not change your code (only the container) and protect your string views to be invalidated.
Perhaps use an std::list. Its accessing method is slower (at least when iterating) but memory location is constant. Reason for both is that it does not use contiguous memory.
Alternatively create a wrapper that wraps a pointer to a string that has been created with "new". That address will also be constant. EDIT: Somehow I managed to miss that what I've just described is pretty much a smartpointer minus automated deletion ;)
Well sadly it is impossible to be able to grow a vector while being sure the content will stay at the same place on classical OS at least.
There is the function realloc that tries to keep the same place, but as you can read on the documentation, there is no guarantee to that, only the os will decide.
To solution your problem, you need the concept of a pool, a pool of string here, that handle the life time of your strings.
You may get away with a simple std::list of string, but it will lead to bad data aliasing and a lot of independent allocations bad to your performances. These will also be the problems with smart pointers.
So if you care about performances, how you may implement it in your case may be not far from your current implementation in my opinion. Because you cannot resize the vector, you should prefer an std::array of a fixed size that you decide at compile time. Then, whenever you need it, you can create a new one to expand your memory capacity. This may be easily implemented by a std::list<std::array> typically.
I don't know if it applies here, but you must be careful if your application can create any number of string during its execution as it may induce an ever growing memory pool, and maybe finally memory problems. To fix that you may insure that the strings you don't use anymore can be reused or freed. Sadly I cannot help you too much here, as these rules will depend on your application.

Inserting items to vector by value vs unique_ptr

I wonder what's the difference in performance and memory consumption between these two declarations of vector.
class MyType;
std::vector<MyType> vectorA;
std::vector<std::unique_ptr<MyType>> vectorB;
I have a dilemma, since I have to create some observers into the vector and if I'm doing that while initializing, I have to use unique_ptrs because of the vector's variable size. I don't want to use shared_ptrs because they have big memory overhead.
I'm looking for the best trade-off between performance and memory consumption. I need to know if it's better to init vectorA first (after initialization it's strictly immutable) and then create observers into that vector (it won't be changing its content so I can do that). In this scenario I would lose some performance.
Or if it is better to create vectorB instead of vectorA. Then I would be able to create observers while initialization, since the location that the unique_ptr is pointing to won't be changed. I suppose that in this case I would lose at least 4B/8B per item.
What do the best practices say? I'm beginner with C++ and I'm solving such a problem for the first time.
If possible, I would go with std::vector<MyType>.
The advantages of the approach are:
A single memory block containing all elements, so looping over it will better as the prefetcher can work better
Less memory allocations as new/delete are expensive methods
Less memory overhead (however, does 1 extra pointer per element do that much? You might gain this by ordering your members to reduce padding)
Pointers to elements are stable until you realloc (so reserving in front with the correct size can give stable pointers if you never push_back more than reserved)
However, using std::vector<std::unique_ptr<MyType>> resolves a few of the negative elements:
Can be used with MyType being an abstract class (or interface)
Move/copy constructor of the class will not be called (which can be expensive or non-existent)
It will allow you to keep raw pointers to MyType

C++ how to manage iterators of contiguous dynamic arrays

I have been trying to figure out an efficient way of managing dynamic arrays which I may change occasionally but would like to randomly access and iterate over often.
I would like to be able to:
store the array in a continuous data block (reduce cache misses)
access each element individually and independently of the array handle (pointers > indices)
resize the array (dynamic)
So in order to achieve this I have been trying things out using std::vector<T>::iterator, and it worked very well, until recently, when I resized the vector (e.g. calling push_back()) that I was storing iterators of. All the iterators became invalid, because they were pointing to stale memory.
Is there any efficient (possibly STL-)way of keeping the iterator pointers up to date? Or do I have to update each Iterator manually?
Is this whole approach even worthwhile? Should I stick with indices?
EDIT:
I have used indices before and it was ok, but I have changed my approach because it still wasn´t good. I would always have to drag the entire array into scope and the indices could be easily used for any array. also there is no perfect way of defining a "NULL" index (none I know about).
What about the option to update all pointers along with a resize operation? All you would have to do is to store the original vector::begin, resize the vector and afterwards update all pointers to vector.begin() + (ptr - prevBegin) and resize operations is already something you should try to avoid.
Fully achieving all 3 of your goals is impossible. If you are fully contiguous, then you have one memory block with a finite size, and the only way to get more memory is to ask for more memory, which will not be contiguous with the memory you already have. So you have to sacrifice at least one requirement, to at least some degree:
If you are willing to partly sacrifice contiguity, you can use a std::deque. This is an array-of-arrays kind of structure. It doesn't invalidate references, for I think any operation that increases its size. It depends on the details of your data type but generally its performance is much closed to a contiguous array than a linked list. Well done but old (5 year) benchmarks: https://baptiste-wicht.com/posts/2012/12/cpp-benchmark-vector-list-deque.html. Another option is to write a chunking allocator, to use either with deque or another structure. This is quite a bit more work though.
If you can use indices, then you can just use a vector
If you don't need to resize, you can still just use a vector and never resize it.
Unless you have a good reason, I would stick with indices. If your main performance bottlenecks are iteration related over a large number of elements (as your contiguity requirement implies), then this whole indexing thing should really be a non-issue. If you do have a very good reason for avoiding indices (which you haven't stated), then I would profile the deque versus the vector on the main loop operation to see how much worse the deque really does. It might be barely worse, and if neither deque nor vector work well enough for you, the next alternatives are quite a bit more work (probably involving allocators or a custom data structure).
Depending on your needs, if you can use the following data structure:
std::vector<std::unique_ptr<Foo>>
then no matter how the vector is resized, if you access your data via a Foo*, the pointer to foo will not be invalidated.
As the number of Foos you need to store in your vector changes, the vector may need to resize it's internal contiguous block of memory, which means any iterators you have pointing inside the vector will be invalidated when the vector resizes.
(You can read more here on C++0x iterator invalidation rules)
However, since the object stored in the vector is a pointer to an object elsewhere in the heap, the pointed-to-object (Foo in this example), will not be invalidated.
Note that the vector owns Foo (as alluded to by std::unique_ptr<Foo>), whilst you can store a non-owning pointer to Foo by keeping a Foo* as the means of accessing your Foo data.
So long as the vector outlives your need to access Foo via your Foo*, then you will not have any lifetime issues.
So in terms of your requirements:
store the array in a continuous data block (reduce cache misses)
yes, std::vector achieves this
access each element individually and independently of the array handle (pointers > indices)
yes, store a Foo* as your means of accessing each element individually, and that remains independent of the array handle (vector::iterator)
resize the array (dynamic)
yes, std::vector achieves this, automatically, resizing for you when you need it to.
Bonus:
Using a smart pointer (in this example std::unique_ptr) in the vector means memory management is also handled automatically for you. (Just make sure you don't try to access a Foo* after the vector is destroyed.
Edit:
It has been pointed out in the comments that storing a std::unique_ptr<Foo> in the vector violates your requirement for the objects to be stored in contiguous memory (if indeed that is what you mean by store the array in contiguous memory, as the vector's underlying array will be contiguous, but accessing the Foo objects will incur an indirection).
However, if you use a suitable allocator (eg arena allocator) for both the vector and the Foo objects, then you will have a higher chance of suffering fewer cache misses, as your Foo objects will exist near to the memory used by your vector, thereby having a higher chance of being in cache when iterating over the vector.

Considerations when choosing between storing values vs pointers in std:: containers

When storing objects in standard collections, what considerations should one think of when deciding between storing values vs pointers? Where the owner of these collections (ObjectOwner) is allocated on the heap. For small structs/objects I've been storing values, while for large objects I've been storing pointers. My reasoning for this was that when standard containers are resized, their contents are copied (small copy ok, big copy bad). Any other things to keep in mind here?
class ObjectOwner
{
public:
SmallObject& getSmallObject(int smallObjectId);
HugeObject* getHugeObject(int hugeObjectId);
private:
std::map<int, SmallObject> mSmallObjectMap;
std::map<int, HugeObject *> mHugeObjectMap;
};
Edit:
an example of the above for more context:
Create/Delete items stored in std::map relatively infrequently ( a few times per second)
Get from std::map frequently (once per 10 milliseconds)
small object: < 32 bytes
huge object: > 1024 bytes
I would store object by value unless I need it through pointer. Possible reasons:
I need to store object hierarchy in a container (to avoid slicing)
I need shared ownership
There are possibly other reasons, but reasoning by size is only valid for some containers (std::vector for example) and even there you can make object moving cost minimal (reserve enough room in advance for example). You example for object size with std::map does not make any sense as std::map does not relocate objects when growing.
Note: return type of a method should not reflect the way you store it in a container, but rather it should be based on method semantics, ie what you would do if object is not found.
Only your profiler knows the answer for your SPECIFIC case; trying to use pointers rather than objects is a reliable way of minimising the amount you copy when a copy must be done (be it a resize of a vector or a copy of the whole container); but sometimes you WANT that copy because it's a snapshot for a thread inside a mutex, and it's faster to copy the container than to hold the mutex and deal with the data.
Some objects might not be possible to keep in any way other than pointer because they're not copyable.
Any performance gain by using container of pointer could be offset by costs of having to write more copy code, or repeated calls to new().
There's not a one answer fits all, and before you worry about the performance here you should establish where the performance problems really are. (Just repeating the point - use a profiler!)

C++ layout of an object containing containers

As someone with a lot of assembler language experience and old habits to lose, I recently did a project in C++ using a lot of the features that c++03 and c++11 have to offer (mostly the container classes, including some from Boost). It was surprisingly easy - and I tried wherever I could to favor simplicity over premature optimization. As we move into code review and performance testing I'm sure some of the old hands will have aneurisms at not seeing exactly how every byte is manipulated, so I want to have some advance ammunition.
I defined a class whose instance members contain several vectors and maps. Not "pointers to" vectors and maps. And I realized that I haven't got the slightest idea how much contiguous space my objects take up, or what the performance implications might be for frequently clearing and re-populating these containers.
What does such an object look like, once instantiated?
Formally, there aren't any constraints on the implementation
other than those specified in the standard, with regards to
interface and complexity. Practically, most, if not all
implementations derive from the same code base, and are fairly
similar.
The basic implementation of vector is three pointers. The
actual memory for the objects in the vector is dynamically
allocated. Depending on how the vector was "grown", the dynamic
area may contain extra memory; the three pointers point to the
start of the memory, the byte after the last byte currently
used, and the byte after the last byte allocated. Perhaps the
most significant aspect of the implementation is that it
separates allocation and initialization: the vector will, in
many cases, allocate more memory than is needed, without
constructing objects in it, and will only construct the objects
when needed. In addition, when you remove objects, or clear the
vector, it will not free the memory; it will only destruct the
objects, and will change the pointer to the end of the used
memory to reflect this. Later, when you insert objects, no
allocation will be needed.
When you add objects beyond the amount of allocated space,
vector will allocate a new, larger area; copy the objects into
it, then destruct the objects in the old space, and delete it.
Because of the complexity constrains, vector must grow the area
exponentially, by multiplying the size by some fixed constant
(1.5 and 2 are the most common factors), rather than by
incrementing it by some fixed amount. The result is that if you
grow the vector from empty using push_back, there will not be
too many reallocations and copies; another result is that if you
grow the vector from empty, it can end up using almost twice as
much memory as necessary. These issues can be avoided if you
preallocate using std::vector<>::reserve().
As for map, the complexity constraints and the fact that it must
be ordered mean that some sort of balanced tree must be used.
In all of the implementations I know, this is a classical
red-black tree: each entry is allocated separately, in a node
which contains two or three pointers, plus maybe a boolean, in
addition to the data.
I might add that the above applies to the optimized versions of
the containers. The usual implementations, when not optimized,
will add additional pointers to link all iterators to the
container, so that they can be marked when the container does
something which would invalidate them, and so that they can do
bounds checking.
Finally: these classes are templates, so in practice, you have
access to the sources, and can look at them. (Issues like
exception safety sometimes make the implementations less
straight forward than we might like, but the implementations
with g++ or VC++ aren't really that hard to understand.)
A map is a binary tree (of some variety, I believe it's customarily a Red-Black tree), so the map itself probably only contains a pointer and some housekeeping data (such as the number of elements).
As with any other binary tree, each node will then contain two or three pointers (two for "left & right" nodes, and perhaps one to the previous node above to avoid having to traverse the whole tree to find where the previous node(s) are).
In general, vector shouldn't be noticeably slower than a regular array, and certainly no worse than your own implementation of a variable size array using pointers.
A vector is a wrapper for an array. The vector class contains a pointer to a contiguous block of memory and knows its size somehow. When you clear a vector, it usually retains its old buffer (implementation-dependent) so that the next time you reuse it, there are less allocations. If you resize a vector above its current buffer size, it will have to allocate a new one. Reusing and clearing the same vectors to store objects is efficient. (std::string is similar). If you want to find out exactly how much a vector has allocated in its buffer, call the capacity function and multiply this by the size of the element type. You can call the reserve function to manually increase the buffer size, in expectation of the vector taking more elements shortly.
Maps are more complicated so I don't know. But if you need an associative container, you would have to use something complicated in C too, right?
Just wanted to add to the answers of others few things that I think are important.
Firstly, the default (in implementations I've seen) sizeof(std::vector<T>) is constant and made up of three pointers. Below is excerpt from GCC 4.7.2 STL header, the relevant parts:
template<typename _Tp, typename _Alloc>
struct _Vector_base
{
...
struct _Vector_impl : public _Tp_alloc_type
{
pointer _M_start;
pointer _M_finish;
pointer _M_end_of_storage;
...
};
...
_Vector_impl _M_impl;
...
};
template<typename _Tp, typename _Alloc = std::allocator<_Tp> >
class vector : protected _Vector_base<_Tp, _Alloc>
{
...
};
That's where the three pointers come from. Their names are self-explanatory, I think. But there is also a base class - the allocator. Which takes me to my second point.
Secondly, std::vector< T, Allocator = std::allocator<T>> takes second template parameter that is a class that handles memory operations. It's through functions of this class vector does memory management. There is a default STL allocator std::allocator<T>>. It has no data-members, only functions such as allocate, destroy etc. It bases its memory handling around new/delete. But you can write your own allocator and supply it to the std::vector as second template parameter. It has to conform to certain rules, such as functions it provides etc, but how the memory management is done internally - it's up to you, as long as it does not violate logic of std::vector relies on. It might introduce some data-members that will add to the sizeof(std::vector) through the inheritance above. It also gives you the "control over each bit".
Basically, a vector is just a pointer to an array, along with its capacity (total allocated memory) and size (actually used elements):
struct vector {
Item* elements;
size_t capacity;
size_t size;
};
Of course thanks to encapsulation all of this is well hidden and the users never get to handle the gory details (reallocation, calling constructors/destructors when needed, etc) directly.
As to your performance questions regarding clearing, it depends how you clear the vector:
Swapping it with a temporary empty vector (the usual idiom) will delete the old array: std::vector<int>().swap(myVector);
Using clear() or resize(0) will erase all the items and keep the allocated memory and capacity unchanged.
If you are concerned about efficiency, IMHO the main point to consider is to call reserve() in advance (if you can) in order to pre-allocate the array and avoid useless reallocations and copies (or moves with C++11). When adding a lot of items to a vector, this can make a big difference (as we all know, dynamic allocation is very costly so reducing it can give a big performance boost).
There is a lot more to say about this, but I believe I covered the essential details. Don't hesitate to ask if you need more information on a particular point.
Concerning maps, they are usually implemented using red-black trees. But the standard doesn't mandate this, it only gives functional and complexity requirements so any other data structure that fits the bill is good to go. I have to admit, I don't know how RB-trees are implemented but I guess that, again, a map contains at least a pointer and a size.
And of course, each and every container type is different (eg. unordered maps are usually hash tables).