Should I use std::vector instead of array [duplicate]

This question already has answers here:
When to use vectors and when to use arrays in C++?
(2 answers)
Closed 5 years ago.
As I see it, they both serve the same purpose, except that std::vector seems more flexible. So when would I need to use an array, and could I use std::vector only?
This is not a new question, but the original questions didn't have the answers I was looking for.

One interesting thing to note is that many std::vector operations invalidate iterators, whereas std::array operations do not. Note: after std::swap on two std::array objects, an iterator still points to the same spot in the same array (which now holds the swapped-in value).
See more:
http://en.cppreference.com/w/cpp/container/array
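A minimal sketch of the swap behaviour noted above, assuming C++11 or later. The function names are hypothetical and exist only for illustration. It contrasts std::array, where an iterator keeps pointing at the same slot of the same array after std::swap, with std::vector, where the iterator stays valid but follows the buffer into the other container.

```cpp
#include <array>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: true if an iterator taken from `a` before std::swap
// still designates a's first slot afterwards and sees the swapped-in value.
inline bool array_iterator_stays_in_place() {
    std::array<int, 3> a{{1, 2, 3}};
    std::array<int, 3> b{{7, 8, 9}};
    auto it = a.begin();
    std::swap(a, b);  // swaps element by element; no storage changes hands
    return it == a.begin() && *it == 7;
}

// Hypothetical helper: for std::vector the buffers change owners, so the
// iterator stays valid but now designates the first element of `b`.
inline bool vector_iterator_follows_buffer() {
    std::vector<int> a{1, 2, 3};
    std::vector<int> b{7, 8, 9};
    auto it = a.begin();
    std::swap(a, b);  // exchanges the internal buffers
    return it == b.begin() && *it == 1;
}

// Usage: both observations are expected to hold.
inline void check_swap_behaviour() {
    assert(array_iterator_stays_in_place());
    assert(vector_iterator_follows_buffer());
}
```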
Good summary of advantages of arrays:
https://stackoverflow.com/a/4004027/7537900
This point seemed most interesting:
fixed-size arrays can be embedded directly into a struct or object,
which can improve memory locality and reduce the number of heap
allocations needed
Not having tested that, I'm not sure it's actually true though.
Here is a discussion of 2D vectors vs. arrays in the context of competitive programming on CodeChef:
https://discuss.codechef.com/questions/49278/whether-to-use-arrays-or-vectors-in-c
Apparently, in a 2D vector (a vector of vectors) only each inner vector is contiguous in memory, because every inner vector is a separate allocation. A 2D array, by contrast, is contiguous across both dimensions.
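One common way to get contiguous 2D storage with a vector is to keep a single flat std::vector and compute the index by hand. The sketch below assumes row-major order; FlatGrid and raw_2d_array_is_one_block are hypothetical names, not something from the linked discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: a rows x cols grid stored in ONE contiguous vector,
// indexed in row-major order as row * cols + col.
class FlatGrid {
public:
    FlatGrid(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), cells_(rows * cols, 0) {}

    int& at(std::size_t row, std::size_t col) {
        assert(row < rows_ && col < cols_);
        return cells_[row * cols_ + col];
    }

    const int* data() const { return cells_.data(); }
    std::size_t size() const { return cells_.size(); }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<int> cells_;
};

// A raw 2D array occupies exactly rows * cols elements with no gaps.
inline bool raw_2d_array_is_one_block() {
    int grid[3][4] = {};
    return sizeof(grid) == 3 * 4 * sizeof(grid[0][0]);
}
```

With this layout, element (1, 2) of a 3 x 4 grid lives at flat index 1 * 4 + 2.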

As a rule of thumb, you should use:
a std::array if the size is fixed at compile time
a std::vector if the size is not fixed at compile time
a pointer to the first element if you need low-level access
a raw array if you are implementing a (non-standard) container
Standard containers know their size even when you pass them to another function, which raw arrays don't, and they offer enough conveniences that you should never use raw arrays in C++ code without a specific reason. One such reason could be a bottleneck that requires low-level optimization, but only after profiling has identified the bottleneck. And you should benchmark under real conditions whether the standard containers actually add any overhead.
The only good reason I can think of is implementing a special container. As standard containers are not meant to be derived from, you have only two choices: either have your class contain a standard container, and end up with a container containing a container with delegation everywhere, or mimic a standard container (by copying code from a well-known implementation) and specialize it. In that case, you will find yourself managing raw arrays directly.
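The rule of thumb above can be sketched as follows, assuming C++11. The names sum and sum_raw are hypothetical. The two container overloads carry their size with them, while the low-level function needs the length passed separately alongside the pointer to the first element.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Low-level interface: pointer to the first element plus an explicit length.
inline int sum_raw(const int* first, std::size_t count) {
    return std::accumulate(first, first + count, 0);
}

// Size fixed at compile time: the size is part of the type.
template <std::size_t N>
int sum(const std::array<int, N>& values) {
    return sum_raw(values.data(), values.size());
}

// Size known only at run time: the vector reports it via size().
inline int sum(const std::vector<int>& values) {
    return sum_raw(values.data(), values.size());
}

// Usage: neither caller has to pass a separate length.
inline void check_sums() {
    std::array<int, 3> fixed{{1, 2, 3}};
    std::vector<int> dynamic{4, 5};
    dynamic.push_back(6);
    assert(sum(fixed) == 6);
    assert(sum(dynamic) == 15);
}
```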

When using std::vector, the only performance hit occurs when the capacity is reached, as the elements must be relocated to a larger contiguous block of memory on the heap.
Thus here is a summary of both with regard to flexibility and performance:
std::array: Reallocation is not possible, so no performance hit will occur due to relocation of memory on the heap.
std::vector: Only affects performance if the capacity is exceeded and reallocation occurs. You can use reserve(size) to provide a rough estimate of the maximum number of objects you'll need. This allows greater flexibility than std::array, but the vector will of course have to reallocate memory if the reserved space is exceeded.
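To make the effect of reserve concrete, here is a small sketch that counts how often the capacity changes while elements are appended. count_reallocations is a hypothetical helper, and the exact growth pattern without reserve is implementation-specific, so only "zero" versus "more than zero" is checked.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: appends n elements and counts how many times
// push_back changed the capacity (each change is a reallocation).
inline std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<int> values;
    if (reserve_first) {
        values.reserve(n);  // one allocation up front
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = values.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        values.push_back(static_cast<int>(i));
        if (values.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = values.capacity();
        }
    }
    return reallocations;
}

// Usage: with reserve, appending up to n elements never reallocates.
inline void check_reserve() {
    assert(count_reallocations(1000, true) == 0);
    assert(count_reallocations(1000, false) > 0);
}
```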

Related

Raw pointer, smart pointer or std::vector for "low-level" container data in C++

Let's say I am making my own Matrix class/container which itself needs to manage some sort of array of doubles and a row/column dimension.
At the risk of sounding hand-wavy, what is considered "best practice" for storing this data if speed is important? The options I can see are:
Raw pointer to single dynamic C array
Unique pointer to a C array, which is similar/leads us to...
std::vector of doubles
What are the speed implications of these different options? Obviously it does depend on circumstance, but in a general case? Also, the size of a std::vector on MSVC and GCC for me is 24 bytes, indicating 3 pointers: the begin iterator, the end iterator and the end of the memory allocation. Since I need to store the size myself to be aware of the Matrix dimensions, the end iterator is somewhat useless to me, except for use with algorithms.
What are the thoughts on best practices of this? Is using a raw pointer acceptable since the container is somewhat "low-level"?
Thanks!

I would use std::vector because it solves memory allocation, deallocation, indexing, copying, etc. Unless you will be using "millions" of matrices at the same time, the extra member (capacity) is probably not relevant.
In any case, optimizing the library for speed is the last thing you want to do -- after you can test the actual speed of your initial implementation. Then you can decide if it is worth spending time to effectively duplicate std::vector functionality with your own implementation.
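As a rough sketch of what the std::vector option could look like, assuming row-major storage: the Matrix class below is hypothetical and deliberately minimal. Copying, destruction and bounds bookkeeping come from the vector, so no destructor or copy constructor has to be written by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal matrix: dimensions plus one contiguous vector.
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    double& operator()(std::size_t row, std::size_t col) {
        assert(row < rows_ && col < cols_);
        return data_[row * cols_ + col];  // row-major indexing
    }

    double operator()(std::size_t row, std::size_t col) const {
        assert(row < rows_ && col < cols_);
        return data_[row * cols_ + col];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

    // Raw access for code that wants a plain pointer to the elements.
    double* data() { return data_.data(); }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> data_;
};
```

Because the compiler-generated copy operations copy the vector, `Matrix copy = m;` produces an independent deep copy.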

What advantages do arrays hold over vectors?

Well, after a full year of programming and only knowing of arrays, I was made aware of the existence of vectors (by some members of StackOverflow on a previous post of mine). I did a load of researching and studying them on my own and rewrote an entire application I had written with arrays and linked lists, with vectors. At this point, I'm not sure if I'll still use arrays, because vectors seem to be more flexible and efficient. With their ability to grow and shrink in size automatically, I don't know if I'll be using arrays as much. At this point, the only advantage I personally see is that arrays are much easier to write and understand. The learning curve for arrays is nothing, whereas there is a small learning curve for vectors. Anyway, I'm sure there's probably a good reason for using arrays in some situations and vectors in others; I was just curious what the community thinks. I'm entirely a novice, so I assume that I'm just not well-informed enough on the strict usages of either.
And in case anyone is even remotely curious, this is the application I'm practicing using vectors with. It's really rough and needs a lot of work: https://github.com/JosephTLyons/Joseph-Lyons-Contact-Book-Application

A std::vector manages a dynamic array. If your program needs an array that changes its size dynamically at run-time then you would end up writing code to do all the things a std::vector does but probably much less efficiently.
What the std::vector does is wrap all that code up in a single class so that you don't need to keep writing the same code to do the same stuff over and over.
Accessing the data in a std::vector is no less efficient than accessing the data in a dynamic array because the std::vector functions are all trivial inline functions that the compiler optimizes away.
If, however, you need a fixed size then you can get slightly more efficient than a std::vector with a raw array. However, you won't lose anything by using a std::array in those cases.
The places I still use raw arrays are when I need a temporary fixed-size buffer that isn't going to be passed around to other functions:
// some code
{ // new scope for temporary buffer
char buffer[1024]; // buffer
file.read(buffer, sizeof(buffer)); // use buffer
} // buffer is destroyed here
But I find it hard to justify ever using a raw dynamic array over a std::vector.
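For comparison, the same temporary-buffer pattern can be written with std::array. This is only a sketch: read_prefix is a hypothetical function, and the 16-byte buffer size is chosen to keep the example small.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads up to 16 bytes into a stack buffer and
// returns whatever was actually read.
inline std::string read_prefix(std::istream& in) {
    std::array<char, 16> buffer{};  // fixed-size temporary, no heap allocation
    in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
    // gcount() reports how many characters the last read extracted.
    return std::string(buffer.data(), static_cast<std::size_t>(in.gcount()));
}   // buffer is destroyed here

// Usage with an in-memory stream standing in for a file.
inline void check_read_prefix() {
    std::istringstream input("hello");
    assert(read_prefix(input) == "hello");
}
```

Unlike the raw array, buffer.size() travels with the object, so there is no sizeof expression to get wrong if the buffer is later passed to another function.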

This is not a full answer, but one thing I can think of is that the "ability to grow and shrink" is not such a good thing if you know what you want. For example: assume you want to store 1000 objects, but they arrive at a rate that causes the vector to grow repeatedly. The overhead of growing will be costly when you could simply define a fixed-size array.
Generally speaking: if you use an array instead of a vector, you have more control, meaning no "background" function calls you don't actually need (resizing) and no extra memory spent on things you don't use (the size of the vector...).
Additionally, using memory on the stack (array) is faster than heap (vector*) as shown here
*as shown here it's not entirely precise to say vectors reside on the heap, but they sure hold more memory on the heap than the array (that holds none on the heap)

One reason is that if you have a lot of really small structures, small fixed length arrays can be memory efficient.
compare
struct point
{
float coords[4];
};
with
struct point
{
std::vector<float> coords;
};
Alternatives include std::array for cases like this. Also, std::vector implementations may over-allocate, meaning that if you resize to 4 slots, you might have memory allocated for 16 slots.
Furthermore, the memory locations will be scattered and hard to predict, killing performance. Using an exceptionally large number of std::vectors may also lead to memory fragmentation issues, where new starts failing.
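The difference between the two layouts can be checked at compile time. In this sketch the type names PointFixed, PointStdArray and PointDynamic are hypothetical, and the size check assumes a mainstream ABI where a struct of four floats has no padding.

```cpp
#include <array>
#include <cassert>
#include <type_traits>
#include <vector>

struct PointFixed    { float coords[4]; };              // data stored inline
struct PointStdArray { std::array<float, 4> coords; };  // data stored inline
struct PointDynamic  { std::vector<float> coords; };    // data lives on the heap

// Inline storage keeps the struct trivially copyable (memcpy-friendly).
static_assert(std::is_trivially_copyable<PointFixed>::value,
              "raw array member keeps the struct trivially copyable");
static_assert(std::is_trivially_copyable<PointStdArray>::value,
              "std::array member keeps the struct trivially copyable");
// A vector member owns heap memory, so copying needs its copy constructor.
static_assert(!std::is_trivially_copyable<PointDynamic>::value,
              "vector member makes the struct non-trivially copyable");

// Assumption: no padding is added, which holds on mainstream platforms.
inline bool fixed_point_is_compact() {
    return sizeof(PointFixed) == 4 * sizeof(float);
}

// Usage: the dynamic variant needs a heap allocation per point.
inline void check_points() {
    assert(fixed_point_is_compact());
    PointDynamic p;
    p.coords.assign(4, 0.0f);
    assert(p.coords.size() == 4);
}
```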

I think this question is best answered flipped around:
What advantages does std::vector have over raw arrays?
I think this list is more easily enumerable (not to say this list is comprehensive):
Automatic dynamic memory allocation
Proper stack, queue, and sort implementations attached
Integration with C++11-related syntactic features such as iterators
If you aren't using such features there's not any particular benefit to std::vector over a "raw array" (though, similarly, in most cases the downsides are negligible).
Despite me saying this, for typical user applications (i.e. running on windows/unix desktop platforms) std::vector or std::array is (probably) typically the preferred data structure because even if you don't need all these features everywhere, if you're already using std::vector anywhere else you may as well keep your data types consistent so your code is easier to maintain.
However, since at the core std::vector simply adds functionality on top of "raw arrays", I think it's important to understand how arrays work in order to take full advantage of std::vector or std::array (knowing when to use std::array being one example) so you can reduce the "carbon footprint" of std::vector.
Additionally, be aware that you are going to see raw arrays when working with
Embedded code
Kernel code
Signal processing code
Cache efficient matrix implementations
Code dealing with very large data sets
Any other code where performance really matters
The lesson shouldn't be to freak out and say "must std::vector all the things!" when you encounter this in the real world.
Also: THIS!!!!
One of the powerful features of C++ is that often you can write a class (or struct) that exactly models the memory layout required by a specific protocol, then aim a class-pointer at the memory you need to work with to conveniently interpret or assign values. For better or worse, many such protocols often embed small fixed sized arrays.
There's a decades-old hack for putting an array of 1 element (or even 0 if your compiler allows it as an extension) at the end of a struct/class, aiming a pointer to the struct type at some larger data area, and accessing array elements off the end of the struct based on prior knowledge of the memory availability and content (if reading before writing) - see What's the need of array with zero elements?
embedding arrays can localise memory access requirement, improving cache hits and therefore performance
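A small sketch of a struct that mirrors a fixed wire layout with an embedded array. The PacketHeader format here is made up for illustration, and the example copies bytes with std::memcpy instead of aiming a struct pointer at the buffer, which is a swapped-in technique that sidesteps alignment and aliasing concerns.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical wire format: a 4-byte tag followed by two one-byte fields.
struct PacketHeader {
    char tag[4];           // small fixed-size array embedded in the struct
    std::uint8_t version;
    std::uint8_t flags;
};

static_assert(std::is_trivially_copyable<PacketHeader>::value,
              "memcpy into the struct requires a trivially copyable type");
static_assert(sizeof(PacketHeader) == 6,
              "layout assumption: no padding between or after the members");

// Copies the first sizeof(PacketHeader) bytes of `bytes` into a header.
inline PacketHeader parse_header(const unsigned char* bytes) {
    PacketHeader header;
    std::memcpy(&header, bytes, sizeof header);
    return header;
}

// Usage with a hand-written byte buffer.
inline void check_parse_header() {
    const unsigned char raw[6] = {'D', 'E', 'M', 'O', 2, 1};
    const PacketHeader header = parse_header(raw);
    assert(header.tag[0] == 'D' && header.tag[3] == 'O');
    assert(header.version == 2);
    assert(header.flags == 1);
}
```

This only works because the array is stored inline; a std::vector member would hold a pointer, not the bytes themselves.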

A C++ container similar to dynamic array?

In this set of slides the author strongly recommends avoiding pointers in C++ programs. In particular, slide 6 suggests using vectors instead of dynamic arrays. While I believe vectors are much safer to use (for example, they avoid memory leaks when an exception is thrown), they have extra memory overhead. This post says that a vector can consume up to twice the memory of its existing elements. Unfortunately, this behaviour causes my program to abort on my system, which has a limited amount of available memory.
Is there a C++ container similar to a dynamic array, which takes a fixed (or rarely changing) number of elements at run time and provides the same safety as the containers? The closest thing I could find (here) was array, which requires the size to be specified at compile time as a template argument.

You can use the array specialization of std::unique_ptr:
std::unique_ptr<int[]> arr(new int[5]);
This will safely manage the memory for you.
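A slightly fuller sketch with a size chosen at run time, assuming C++11. make_sequence is a hypothetical helper. Note that std::unique_ptr<int[]> does not remember the element count, so the caller has to keep track of it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: allocates exactly n ints and fills them with 0..n-1.
// The array is freed with delete[] when the unique_ptr goes out of scope.
inline std::unique_ptr<int[]> make_sequence(std::size_t n) {
    std::unique_ptr<int[]> values(new int[n]);
    for (std::size_t i = 0; i < n; ++i) {
        values[i] = static_cast<int>(i);
    }
    return values;
}

// Usage: the size (5 here) must be tracked separately by the caller.
inline void check_make_sequence() {
    const std::size_t count = 5;
    std::unique_ptr<int[]> values = make_sequence(count);
    assert(values[0] == 0);
    assert(values[count - 1] == 4);
}
```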

If you know the fixed size, just calling reserve(n) on the vector should do the trick. While it isn't guaranteed not to use more space, I don't know of any implementations that don't just allocate space for exactly that number of elements (assuming it is greater than the current capacity(), of course).

What is the difference between std::array and std::vector? When do you use one over other? [duplicate]

This question already has answers here:
std::vector versus std::array in C++
(6 answers)
Closed 6 years ago.
What is the difference between std::array and std::vector? When do you use one over other?
I have always used and considered std::vector as the C++ way of using C arrays, so what is the difference?

std::array is just a class version of the classic C array. That means its size is fixed at compile time and it will be allocated as a single chunk (e.g. taking space on the stack). The advantage it has is slightly better performance because there is no indirection between the object and the arrayed data.
std::vector is a small class containing pointers into the heap. (So a std::vector that holds any elements has allocated heap memory, by default via new.) It is slightly slower to access because those pointers have to be chased to get to the arrayed data... But in exchange for that, it can be resized and it only takes a trivial amount of stack space no matter how large it is.
[edit]
As for when to use one over the other, honestly std::vector is almost always what you want. Creating large objects on the stack is generally frowned upon, and the extra level of indirection is usually irrelevant. (For example, if you iterate through all of the elements, the extra memory access only happens once at the start of the loop.)
The vector's elements are guaranteed to be contiguous, so you can pass &vec[0] to any function expecting a pointer to an array; e.g., C library routines. (As an aside, std::vector<char> buf(8192); is a great way to allocate a local buffer for calls to read/write or similar without directly invoking new.)
That said, the lack of that extra level of indirection, plus the compile-time constant size, can make std::array significantly faster for a very small array that gets created/destroyed/accessed a lot.
So my advice would be: Use std::vector unless (a) your profiler tells you that you have a problem and (b) the array is tiny.
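A sketch of the "vector as a buffer for a C routine" idea, using std::snprintf as the C library call. format_id and the 32-byte buffer size are assumptions made for the example; buf.data() is the C++11 spelling of &buf[0].

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: formats "id-<number>" through a C-style API that
// writes into a caller-supplied char buffer.
inline std::string format_id(int id) {
    std::vector<char> buf(32);  // contiguous storage, freed automatically
    const int written = std::snprintf(buf.data(), buf.size(), "id-%d", id);
    // snprintf returns the length it wanted to write; check that it fit.
    assert(written >= 0 && static_cast<std::size_t>(written) < buf.size());
    return std::string(buf.data(), static_cast<std::size_t>(written));
}

// Usage.
inline void check_format_id() {
    assert(format_id(42) == "id-42");
}
```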

I'm going to assume that you know that std::array is compile-time fixed in size, while std::vector is variable size. Also, I'll assume you know that std::array doesn't do dynamic allocation. So instead, I'll answer why you would use std::array instead of std::vector.
Have you ever found yourself doing this:
std::vector<SomeType> vecName(10);
And then you never actually increase the size of the std::vector? If so, then std::array is a good alternative.
But really, std::array (coupled with initializer lists) exists to make C-style arrays almost entirely worthless. They don't generally compete with std::vectors; they compete more with C-style arrays.
Think of it as the C++ committee doing their best to kill off almost all legitimate use of C-style arrays.
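As an illustration of that substitution, here is a sketch where a container that never changes size is declared as a std::array. The name sum_of_weights and the values are made up. The size is part of the type, so it can be checked at compile time.

```cpp
#include <array>
#include <cassert>
#include <numeric>

// Hypothetical example: four weights whose count never changes.
inline int sum_of_weights() {
    // Initialized like a C array, but with size() and iterators available.
    constexpr std::array<int, 4> weights{{1, 2, 3, 4}};
    static_assert(weights.size() == 4, "size is known at compile time");
    return std::accumulate(weights.begin(), weights.end(), 0);
}

// Usage.
inline void check_sum_of_weights() {
    assert(sum_of_weights() == 10);
}
```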

std::array
is an aggregate
is fixed-size
requires that its elements be default constructible (vs copy (C++03) or move (C++0x) constructible)
is linearly swappable (vs constant time)
is linearly movable (vs constant time)
potentially pays one less indirection than std::vector
A good use case is when doing things 'close-to-the-metal', while keeping the niceties of C++ and keeping all the bad things of raw arrays out of the way.

Same reasoning when using a C-style static array rather than a std::vector. And for that, I kindly refer you to here.

std::array has a fixed (compile time) size, while std::vector can grow.
As such, std::array is like using a C array, while std::vector is like dynamically allocating memory.

I use my own personal hand coded Array<> template class, which has a simpler API compared with std::array or std::vector. For example:
To use a dynamic Array:
Array<> myDynamicArray; // Note array size is not given at compile time
myDynamicArray.resize(N); // N is a run time value
...
To use a static Array, fixed size at compile time:
Array<100> myFixedArry;
I believe it has a better syntax than std::array, or std::vector. Also extremely efficient.

Why would I prefer using vector to deque

Since
they are both contiguous memory containers;
feature wise, deque has almost everything vector has but more, since it is more efficient to insert in the front.
Why would anyone prefer std::vector to std::deque?

Elements in a deque are not contiguous in memory; vector elements are guaranteed to be. So if you need to interact with a plain C library that needs contiguous arrays, or if you care (a lot) about spatial locality, then you might prefer vector. In addition, since there is some extra bookkeeping, other ops are probably (slightly) more expensive than their equivalent vector operations. On the other hand, using many/large instances of vector may lead to unnecessary heap fragmentation (slowing down calls to new).
Also, as pointed out elsewhere on StackOverflow, there is more good discussion here: http://www.gotw.ca/gotw/054.htm .

To know the difference one should know how deque is generally implemented. Memory is allocated in blocks of equal sizes, and they are chained together (as an array or possibly a vector).
So to find the nth element, you find the appropriate block then access the element within it. This is constant time, because it is always exactly 2 lookups, but that is still more than the vector.
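The two-step lookup can be sketched with a toy block-based container. This is only an illustration of the idea, not the layout of any real standard library deque; ChunkedInts and the block size of 4 are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy block size; real implementations choose their own.
constexpr std::size_t kBlockSize = 4;

// Hypothetical toy container: fixed-size blocks chained through a vector.
class ChunkedInts {
public:
    void push_back(int value) {
        if (size_ % kBlockSize == 0) {
            // Last block is full (or there is none yet): add a new block.
            blocks_.push_back(std::unique_ptr<int[]>(new int[kBlockSize]));
        }
        blocks_[size_ / kBlockSize][size_ % kBlockSize] = value;
        ++size_;
    }

    // Lookup 1: find the block. Lookup 2: find the element inside it.
    int& operator[](std::size_t index) {
        assert(index < size_);
        return blocks_[index / kBlockSize][index % kBlockSize];
    }

    std::size_t size() const { return size_; }

private:
    std::vector<std::unique_ptr<int[]>> blocks_;
    std::size_t size_ = 0;
};
```

In this sketch, growing at the back never moves existing elements, because only the small table of block pointers is reallocated.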
vector also works well with APIs that want a contiguous buffer because they are either C APIs or are more versatile in being able to take a pointer and a length. (Thus you can have a vector underneath or a regular array and call the API from your memory block).
Where deque has its biggest advantages are:
When growing or shrinking the collection from either end
When you are dealing with very large collection sizes.
When dealing with bools and you really want bools rather than a bitset.
The second of these is lesser known, but for very large collection sizes:
The cost of reallocation is large
The overhead of having to find a contiguous memory block is restrictive, so you can run out of memory faster.
When I was dealing with large collections in the past and moved from a contiguous model to a block model, we were able to store about 5 times as large a collection before we ran out of memory in a 32-bit system. This is partly because, when re-allocating, it actually needed to store the old block as well as the new one before it copied the elements over.
Having said all this, you can get into trouble with std::deque on systems that use "optimistic" memory allocation. Whilst an attempt to request a large buffer for a vector reallocation will probably get rejected at some point with a bad_alloc, the optimistic nature of the allocator is likely to always grant the request for the smaller buffer requested by a deque, and that is likely to cause the operating system to kill a process to try to acquire some memory. Whichever one it picks might not be too pleasant.
The workarounds in such a case are either setting system-level flags to override optimistic allocation (not always feasible) or managing the memory somewhat more manually, e.g. using your own allocator that checks for memory usage or similar. Obviously not ideal. (Which may answer your question as to why one would prefer vector...)

I've implemented both vector and deque multiple times. deque is hugely more complicated from an implementation point of view. This complication translates to more code and more complex code. So you'll typically see a code size hit when you choose deque over vector. You may also experience a small speed hit if your code uses only the things the vector excels at (i.e. push_back).
If you need a double ended queue, deque is the clear winner. But if you're doing most of your inserts and erases at the back, vector is going to be the clear winner. When you're unsure, declare your container with a typedef (so it is easy to switch back and forth), and measure.
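The typedef suggestion might look like the sketch below. IntSequence and make_squares are hypothetical names. The rest of the code only uses operations that both containers provide, so switching is a one-line change.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Switch containers by editing this one line, then re-run your measurements.
typedef std::vector<int> IntSequence;
// typedef std::deque<int> IntSequence;

// Uses only push_back, which both std::vector and std::deque support.
inline IntSequence make_squares(int count) {
    IntSequence result;
    for (int i = 0; i < count; ++i) {
        result.push_back(i * i);
    }
    return result;
}

// Usage: this code does not care which container is behind the typedef.
inline void check_make_squares() {
    const IntSequence squares = make_squares(4);
    assert(squares.size() == 4);
    assert(squares.back() == 9);
}
```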

std::deque doesn't have guaranteed contiguous memory, and it's often somewhat slower for indexed access. A deque is typically implemented as a "list of vectors".

According to http://www.cplusplus.com/reference/stl/deque/, "unlike vectors, deques are not guaranteed to have all its elements in contiguous storage locations, eliminating thus the possibility of safe access through pointer arithmetics."
Deques are a bit more complicated, in part because they don't necessarily have a contiguous memory layout. If you need that feature, you should not use a deque.
(Previously, my answer brought up a lack of standardization (from the same source as above, "deques may be implemented by specific libraries in different ways"), but that actually applies to just about any standard library data type.)

A deque is a sequence container which allows random access to its elements but it is not guaranteed to have contiguous storage.

I think it is a good idea to run a performance test for each case and base the decision on those tests.
I'd prefer std::deque to std::vector in most cases.

You wouldn't prefer vector to deque according to these test results (with source).
Of course, you should test in your app/environment, but in summary:
push_back is basically the same for all
insert, erase in deque are much faster than list and marginally faster than vector
Some more musings, and a note to consider circular_buffer.

On the one hand, vector is quite frequently just plain faster than deque. If you don't actually need all of the features of deque, use a vector.
On the other hand, sometimes you do need features which vector does not give you, in which case you must use a deque. For example, I challenge anyone to attempt to rewrite this code, without using a deque, and without enormously altering the algorithm.

Note that vector memory is re-allocated as the array grows. If you have pointers to vector elements, they will become invalid.
Also, if you erase an element, iterators become invalid (but not "for(auto...)").
Edit: changed 'deque' to 'vector'
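A sketch of the reallocation effect, assuming the default allocator. Both function names are hypothetical. The address is recorded as an integer so that no dangling pointer is ever read, and an index is shown as a handle that survives the growth.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: true if growing the vector moved its buffer.
inline bool growth_moves_the_buffer() {
    std::vector<int> values(4, 42);
    // Store the address as an integer; the old pointer must not be used later.
    const std::uintptr_t before =
        reinterpret_cast<std::uintptr_t>(values.data());
    // Asking for more than the current capacity forces a reallocation.
    values.reserve(values.capacity() + 1);
    const std::uintptr_t after =
        reinterpret_cast<std::uintptr_t>(values.data());
    return before != after;
}

// Hypothetical helper: an index, unlike a pointer, stays usable after growth.
inline int read_by_index_after_growth() {
    std::vector<int> values(4, 42);
    const std::size_t index = 2;
    values.reserve(values.capacity() + 1);
    return values[index];
}

// Usage.
inline void check_growth() {
    assert(growth_moves_the_buffer());
    assert(read_by_index_after_growth() == 42);
}
```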
