c++ std::vector<> vs new[] performance - c++

Is there any big difference in allocation, deallocation and access time between std::vector<> and new[] when both are fixed and same length?

Depends on the types and how you call it. std::vector<int> v(1000000); has to zero a million ints, whereas new int[1000000]; doesn't, so I would expect a difference in speed. This is one place in std::vector where you might pay through the nose for something you don't use, if for some reason you don't care about the initial values of the elements.
If you compare std::vector<int> v(1000000); with new int[1000000](); then I doubt you'll see much difference. The significant question is whether one of them somehow has a more optimized loop setting the zeros, than the other one does. If so, then the implementation of the other one has missed a trick (or more specifically the optimizer has).

new is a bad thing, because it violates the idiom of single responsibility by assuming two respon­sibi­li­ties: Storage allocation and object construction. Complexity is the enemy of sanity, and you fight com­plexity by separating concerns and isolating responsibilities.
The standard library containers allow you to do just that, and only think about objects. Moreover, std::vector addionally allows you to think about storage, but separately, via the reserve/capacity interfaces.
So for the sake of keeping a clear mind about your program logic, you should always prefer a container such as std::vector:
std::vector<Foo> v;
// make some storage available
v.reserve(100);
// work with objects - no allocation is required
v.push_back(x);
v.push_back(f(1, 2));
v.emplace_back(true, 'a', 10);

Related

Containers vs Smart pointers in C++

How can I decide when choosing between std::containers (std::vector or std::array) and smart pointers pointing to arrays
I know containers are objects for memory managment. They are exception safe and there will not be any memory leak and they also provide veriuty of functions for memory managment(push.back etc) and smart pointers are pointer that also do not leak memory because they delete themsefs when they are not needed anymore(like unique_ptr when geting out of scope). Propably in containers there is an overhead every time they are created.
My question is how can i decide which method to use and why.
std::vector <unsigned char>myArray(3 * outputImageHight * outputImageWidth);
std::unique_ptr<unsigned char[]>myArray(new unsigned char[3 * outputImageHight * outputImageWidth]);
I would use the vector. Your pointer version offers basically no improvements over the vector, and you lose a lot of useful functionality. You're most likely going to need to measure the size and iterate your array at some point, with a vector you get this for free, whereas you'd need to implement it yourself for your pointer version; at which point you may as well have just used the vector to begin with.
There may be a performance cost instantiating the vector, but I doubt that it would be a bottleneck for most applications. If you're creating so many vectors that instantiating them is costing you time, you can probably be smarter about managing them (pooling your memory, custom vector allocators, etc). If in doubt, measure.
One example where you might need to use the unique_ptr<> version might be if you're working with a library written in C where you lose ownership of the array. For example:
std::unique_ptr<unsigned char[]>myArray(
new unsigned char[3 * outputImageHight * outputImageWidth]);
my_c_lib_data_t cLibData;
int result = my_c_lib_set_image(cLibData, myArray);
if (MYLIB_SUCCESS == result)
// mylib successfully took ownership of the char array, so release the pointer.
myArray.release();
If you have the choice though, prefer to use C++ style containers where you can.
std::vector, primarily because it better represents the "sequence of items in contiguous memory", it is the default representation for that, and enables a wide range of common operations.
vector has move semantics, so the benefit of std::unique_ptr is limited.
If you are lucky, your STL implementation `provides small vector optimization, skipping the memory allocation for small sizes.
-- edit: I wasn't aware SBO is apparently prohibited by the standard - sorry for getting your hopes up, thanks #KarlNicholl
If pointer semantics are required, a unique_ptr<vector<T>> or shared_ptr<vector<T>> is a valid choice with little overhead.
boost did introduce shared_array etc., that represent your second option better,
but I haven't seen them get much traction.
Always use STL containers except in situation where you have good reason to use pointers. Reasons are reliability and readability, IMO.

Why can std::vector be more performant than a native array?

In a comparison on performance between Java and C++ our professor claimed, that theres no benefit in choosing a native array (dynamic array like int * a = new int[NUM] ) over std::vector. She claimed too, that the optimized vector can be faster, but didn't tell us why. What is the magic behind the vector?
PLEASE just answers on performance, this is not a general poll!
Any super optimized code with lower level stuff like raw arrays can beat or tie the performance of an std::vector. But the benefits of having a vector far outweigh any small performance gains you get from low level code.
vector manages it's own memory, array you have to remember to manage the memory
vector allows you to make use of stdlib more easily (with dynamic arrays you have to keep track of the end ptr yourself) which is all very well written well tested code
sometimes vector can even be faster, like qsort v std::sort, read this article, keep in mind the reason this can happen is because writing higher level code can often allow you to reason better about the problem at hand.
Since the std containers are good to use for everything else, that dynamic arrays don't cover, keeping code consistent in style makes it more readable and less prone to errors.
One thing I would keep in mind is that you shouldn't compare a dynamic array to a std::vector they are two different things. It should be compared to something like std::dynarray which unfortunately probably won't make it into c++14 (boost prolly has one, and I'm sure there are reference implementations lying around). A std::dynarray implementation is unlikely to have any performance difference from a native array.
There is benefit of using vector instead of array when the number of elements needs to be changed.
Neither optimized vector can be faster than array.
The only performance optimization a std::vector can offer over a plain array is, that it can request more memory than currently needed. std::vector::reserve does just that. Adding elements up to its capacity() will not involve any more allocations.
This can be implemented with plain arrays as well, though.

Should I use manual alloc to allow move semantics?

I'm interested to learn when I should start considering using move semantics in favour over copying data depending on the size of that data and the usage of the class. For example for a Matrix4 class we have two options:
struct Matrix4{
float* data;
Matrix4(){ data = new float[16]; }
Matrix4(Matrix4&& other){
*this = std::move(other);
}
Matrix4& operator=(Matrix4&& other)
{
... removed for brevity ...
}
~Matrix4(){ delete [] data; }
... other operators and class methods ...
};
struct Matrix4{
float data[16]; // let the compiler do the magic
Matrix4(){}
Matrix4(const Matrix4& other){
std::copy(other.data, other.data+16, data);
}
Matrix4& operator=(const Matrix4& other)
{
std::copy(other.data, other.data+16, data);
}
... other operators and class methods ...
};
I believe there is some overhead having to alloc and dealloc memory "by hand", and given the chances of really hitting the move construct when using this class what is the preferred implementations of a class with such small in memory size? Is really always preferred move over copy?
In the first case, allocation and deallocation are expensive - because you are dynamically allocating memory from the heap, even if your matrix is constructed on the stack - and moves are cheap (just copying a pointer).
In the second case, allocation and deallocation are cheap, but moves are expensive - because they are actually copies.
So if you are writing an application and you just care about performance of that application, the answer to the question "Which one is better?" likely depends on how much you are creating/destroying matrices vs how much you are copying/moving them - and in any case, do your own measurements to support any conjectures.
By doing measurements you will also check whether your compiler is doing a lot of copy/move elisions in places where you expect moves to be going on - results may be against your expectations.
Also, cache locality may have an impact here: if you allocate storage for a matrix's data on the heap, having three matrices that you want to process element-by-element created on the stack will likely require quite a scattered memory access pattern - potentially result in more cache misses.
On the other hand, if you are using arrays for which memory is allocated on the stack, it is likely that the same cache line will be able to hold the data of all those matrices - thus increasing the cache hit rate. Not to mention the fact that in order to access elements on the heap you first need to read the value of the data pointer, which means accessing a different region of memory than the one holding the elements.
So once more, the moral of the story is: do your own measurements.
If you are writing a library on the other hand, and you cannot predict how many constructions/destructions vs moves/copies the client is going to perform, then you may offer two such matrix classes, and factor out the common behavior into a base class - possibly a base class template.
That will give the client flexibility and will give you a sufficiently high degree of reuse - no need to write the implementation of all common member functions twice.
This way, clients may choose the matrix class that best fits the creation/moving profile of the application in which they are using it.
UPDATE:
As DeadMG points out in the comments, one advantage of the array-based approach over the dynamic allocation approach is that the latter is doing manual resource management through raw pointers, new, and delete, which forces you to write user-defined destructor, copy constructor, move constructor, copy-assignment operator, and move-assignment operator.
You could avoid all of this if you were using std::vector, which would perform the memory management task for you and would save you from the burden of defining all those special member functions.
This said, the mere fact of suggesting to use std::vector instead of doing manual memory management - as much as it is a good advice in terms of design and programming practice - does not answer the question, while I believe the original answer does.
Like everything else in programming, specially when performance is concerned, it's a complicated trade-off.
Here, you have two designs: to keep the data inside your class (method 1) or to allocate the data on the heap and keep a pointer to it in the class (method 2).
As far as I can tell, these are the trade-offs you are making:
Construction/Destruction Speed: Naively implemented, method 2 will be slower here, because it requires dynamic memory allocation and deallocation. However, you can help the situation using custom memory allocators, specially if the size of your data is predictable and/or fixed.
Size: In your 4x4 matrix example, method 2 requires storing an additional pointer, plus memory allocation size overhead (typically can be anywhere from 4 to 32 bytes.) This might or might not be a factor, but it certainly must be considered, specially if your class instances are small.
Move Speed: Method 2 has very fast move operation, because it only requires setting two pointers. In method 1, you have no choice but to copy your data. However, while being able to rely on fast moving can make your code pretty and straightforward and readable and more efficient, compilers are quite good at copy elision, which means that you can write your pretty, straightforward and readable pass-by-value interfaces even if you implement method 1 and the compiler will not generate too many copies anyway. But you can't be sure of that, so relying on this compiler optimization, specially if your instances are larger, requires measurement and inspection of the generated code.
Member Access Speed: This is the most important differentiator for small classes, in my opinion. Each time you access an element in a matrix implemented using method 2 (or access a field in a class implemented that way, i.e., with external data) you access the memory twice: once to read the address of the external block of memory, and once to actually read the data you want. In method 1, you just directly access the field or element you want. This means that in method 2, every access could potentially generate an additional cache miss, which could affect your performance. This is specially important if your class instances are small (e.g. a 4x4 matrix) and you operate on many of them stored in arrays or vectors.
In fact, this is why you might want to actually copy bytes around when you are copying/moving an instance of your matrix, instead of just setting a pointer: to keep your data contiguous. This is why flat data structures (like arrays of values,) are much preferred in high-performance code, than pointer spaghetti data structures (like arrays of pointers, linked lists, etc.) So, while moving is cooler and faster than copying in isolation, you sometimes want to do copy your instances to make (or keep) a whole bunch of them contiguous and make iteration over and accessing them much much more efficient.
Flexibility of Length/Size: Method 2 is obviously more flexible in this regard because you can decide how much data you need at runtime, be it 16 or 16777216 bytes.
All in all, here's the algorithm I suggest you use for picking one implementation:
If you need variable amount of data, pick method 2.
If you have very large amounts of data in each instance of your class (e.g. several kilobytes,) pick method 2.
If you need to copy instances of your class around a lot (and I mean a lot!) pick method 2 (but try to measure the performance improvement and inspect the generated code, specially in hot areas.)
In all other cases, prefer method 1.
In short, method 1 should be your default, until proven otherwise. And the way to prove anything regarding performance is measurement! So don't optimize anything unless you have measured and have proof that one method is better than another, and also (as mentioned in other answers,) you might want to implement both methods if you are writing a library and let your users choose the implementation.
I would probably use a stdlib container (such as std::vector or std::array) that already implements move semantics, and then I would simply have the vectors or arrays move.
For example, you could use std::array< std::array, 4 > or std::vector< std::vector< float > > to represent your matrix type.
I don't think it will matter a lot for a 4x4 matrix, but it might for 10000x10000. So yes, a move constructor for a matrix type is definitely worth it, especially if you're planning to work with a lot of temporary matrices (which seems likely when you want to do calculations with them). It will also allow returning Matrix4 objects efficiently (whereas you'd have to use a by-ref call to get around copying otherwise).
Rather unrelated to the matter but probably worth mentioning: in case you decide to use std::array, please make a Matrix a template class (instead of embedding the size into the classname).

Overhead to using std::vector?

I know that manual dynamic memory allocation is a bad idea in general, but is it sometimes a better solution than using, say, std::vector?
To give a crude example, if I had to store an array of n integers, where n <= 16, say. I could implement it using
int* data = new int[n]; //assuming n is set beforehand
or using a vector:
std::vector<int> data;
Is it absolutely always a better idea to use a std::vector or could there be practical situations where manually allocating the dynamic memory would be a better idea, to increase efficiency?
It is always better to use std::vector/std::array, at least until you can conclusively prove (through profiling) that the T* a = new T[100]; solution is considerably faster in your specific situation. This is unlikely to happen: vector/array is an extremely thin layer around a plain old array. There is some overhead to bounds checking with vector::at, but you can circumvent that by using operator[].
I can't think of any case where dynamically allocating a C style
vector makes sense. (I've been working in C++ for over 25
years, and I've yet to use new[].) Usually, if I know the
size up front, I'll use something like:
std::vector<int> data( n );
to get an already sized vector, rather than using push_back.
Of course, if n is very small and is known at compile time,
I'll use std::array (if I have access to C++11), or even
a C style array, and just create the object on the stack, with
no dynamic allocation. (Such cases seem to be rare in the
code I work on; small fixed size arrays tend to be members of
classes. Where I do occasionally use a C style array.)
If you know the size in advance (especially at compile time), and don't need the dynamic re-sizing abilities of std::vector, then using something simpler is fine.
However, that something should preferably be std::array if you have C++11, or something like boost::scoped_array otherwise.
I doubt there'll be much efficiency gain unless it significantly reduces code size or something, but it's more expressive which is worthwhile anyway.
You should try to avoid C-style-arrays in C++ whenever possible. The STL provides containers which usually suffice for every need. Just imagine reallocation for an array or deleting elements out of its middle. The container shields you from handling this, while you would have to take care of it for yourself, and if you haven't done this a hundred times it is quite error-prone.
An exception is of course, if you are adressing low-level-issues which might not be able to cope with STL-containers.
There have already been some discussion about this topic. See here on SO.
Is it absolutely always a better idea to use a std::vector or could there be practical situations where manually allocating the dynamic memory would be a better idea, to increase efficiency?
Call me a simpleton, but 99.9999...% of the times I would just use a standard container. The default choice should be std::vector, but also std::deque<> could be a reasonable option sometimes. If the size is known at compile-time, opt for std::array<>, which is a lightweight, safe wrapper of C-style arrays which introduces zero overhead.
Standard containers expose member functions to specify the initial reserved amount of memory, so you won't have troubles with reallocations, and you won't have to remember delete[]ing your array. I honestly do not see why one should use manual memory management.
Efficiency shouldn't be an issue, since you have throwing and non-throwing member functions to access the contained elements, so you have a choice whether to favor safety or performance.
std::vector could be constructed with an size_type parameter that instantiate the vector with the specified number of elements and that does a single dynamic allocation (same as your array) and also you can use reserve to decrease the number of re-allocations over the usage time.
In n is known at compile-time, then you should choose std::array as:
std::array<int, n> data; //n is compile-time constant
and if n is not known at compile-time, OR the array might grow at runtime, then go for std::vector:
std::vector<int> data(n); //n may be known at runtime
Or in some cases, you may also prefer std::deque which is faster than std::vector in some scenario. See these:
C++ benchmark – std::vector VS std::list VS std::deque
Using Vector and Deque by Herb Sutter
Hope that helps.
From a perspective of someone who often works with low level code with C++, std vectors are really just helper methods with a safety net for a classic C style array. The only overheads you'd experience realistically are memory allocations and safety checks for boundaries. If you're writing a program which needs performance and are going to be using vectors as a regular array I'd recommend to just use C style arrays instead of vectors. You should realistically be vetting the data that comes into the application and check the boundaries yourself to avoid checks on every memory access to the array.
It's good to see that others are checking the differences of the C ways and the C++ ways. More often than not C++ standard methods have significantly worse performance and uglier syntax than their C counterparts and is generally the reason people call C++ bloated. I think C++ focuses more on safety and making the language more like JavaScript/C# even though the language fundamentally lacks the foundation to be one.

Why is 'unbounded_array' more efficient than 'vector'?

It says here that
The unbounded array is similar to a
std::vector in that in can grow in
size beyond any fixed bound. However
unbounded_array is aimed at optimal
performance. Therefore unbounded_array
does not model a Sequence like
std::vector does.
What does this mean?
As a Boost developer myself, I can tell you that it's perfectly fine to question the statements in the documentation ;-)
From reading those docs, and from reading the source code (see storage.hpp) I can say that it's somewhat correct given some assumptions about the implementation of std::vector at the time that code was written. That code dates to 2000 initially, and perhaps as late as 2002. Which means at the time many STD implementations did not do a good job of optimizing destruction and construction of objects in containers. The claim about the non-resizing is easily refuted by using an initially large capacity vector. The claim about speed, I think, comes entirely from the fact that the unbounded_array has special code for eliding dtors & ctors when the stored objects have trivial implementations of them. Hence it can avoid calling them when it has to rearrange things, or when it's copying elements. Compared to really recent STD implementations it's not going to be faster, as new STD implementation tend to take advantage of things like move semantics to do even more optimizations.
It appears to lack insert and erase methods. As these may be "slow," ie their performance depends on size() in the vector implementation, they were omitted to prevent the programmer from shooting himself in the foot.
insert and erase are required by the standard for a container to be called a Sequence, so unlike vector, unbounded_array is not a sequence.
No efficiency is gained by failing to be a sequence, per se.
However, it is more efficient in its memory allocation scheme, by avoiding a concept of vector::capacity and always having the allocated block exactly the size of the content. This makes the unbounded_array object smaller and makes the block on the heap exactly as big as it needs to be.
As I understood it from the linked documentation, it is all about allocation strategy. std::vector afaik postpones allocation until necessary and than might allocate some reasonable chunk of meory, unbounded_array seams to allocate more memory early and therefore it might allocate less often. But this is only a gues from the statement in documentation, that it allocates more memory than might be needed and that the allocation is more expensive.