Vector of vectors, reserve - c++

Suppose I want to represent a two-dimensional matrix of int as a vector of vectors:
std::vector<std::vector<int> > myVec;
The inner dimension is constant, say 5, and the outer dimension is less than or equal to N. To minimize reallocations I would like to reserve space:
myVec.reserve(N);
What size is assumed for the inner vector? Is this purely implementation dependent? How does this affect the spatial locality of the data? Since the inner dimension is a constant, is there a way to tell the compiler to use this constant size? How do these answers change if the inner vector's size changes?

Since your inner dimension is constant, I think you want
std::vector< std::array<int, 5> > vecs;
vecs.reserve(N);
This will give you preallocated contiguous storage, which is optimal for performance.
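For instance, a minimal sketch of that idea (the value of N is just a placeholder here):
#include <array>
#include <cstddef>
#include <vector>

int main() {
    const std::size_t N = 1000;               // stand-in for the question's N
    std::vector<std::array<int, 5>> vecs;     // each row is a fixed block of 5 ints
    vecs.reserve(N);                          // one allocation covers all N rows
    vecs.push_back({1, 2, 3, 4, 5});          // rows are laid out back to back in one buffer
}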

The size of the inner vectors is completely irrelevant for resizing the outer vector. There are no locality guarantees for your bundle of vectors; the locality guarantee (i.e. a contiguous block in memory) exists for a single vector only.
Remember that a vector object itself has a constant sizeof; its actual data is dynamically allocated. The outer vector, to a first approximation, is a contiguous block of N 'pointers' to the inner vectors. Your reserve call does not reserve memory for possible elements of the inner vectors, only for the inner vector objects themselves (i.e. their bookkeeping data and their pointers to their dynamically allocated data blocks).
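A small sketch to make that concrete (the count of 100 is arbitrary, and the printed values are implementation-dependent):
#include <iostream>
#include <vector>

int main() {
    std::cout << sizeof(std::vector<int>) << '\n';   // constant, e.g. 24 bytes on a typical 64-bit platform
    std::vector<std::vector<int>> myVec;
    myVec.reserve(100);    // allocates room for 100 vector<int> objects, nothing for their ints
    std::cout << myVec.size() << ' ' << myVec.capacity() << '\n';   // "0 100" (capacity at least 100)
}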

The inner vectors, once they are actually created, are initialized with the default constructor. So if you write:
vector<vector<int> > vecs;
vecs.reserve(10);
then reserve only sets aside storage: the outer vector still has size 0 and no inner vectors have been constructed yet. Once elements are created (by resize, push_back, and so on), each one is default-constructed as vector<int>(), which means you'll have zero-sized vectors. But remember, you can't index into them unless you resize (not reserve) your vectors.
Remember, too, that it can sometimes be more efficient to construct the vector with the size you will need up front. So it's useful to do something like
vector<vector<int> > vecs(3,vector<int>(5));
This will create a vector with size 3, and each element will contain a vector of size 5.
Remember also that it can be more efficient to use a deque rather than a vector if you're going to resize often. It is as easy to use as a vector, and you don't need to reserve, since its elements are not stored in one contiguous block of memory.

Related

What is the difference between vector<vector<int> > v and vector <int>* v in memory allocation?

I use these two types of two-dimensional vector to load a graph.
In the first case I use vector<vector<int> > v to load the graph's adjacency matrix, and in the second case I use vector<int>* v to do so, initializing it with vector<int>* v = new vector<int>[n] (n is the number of vertices).
It appeared on the official judge platform that vector<vector<int> > v takes over 5000 kB of memory, while vector<int>* v takes only about 3600 kB. What's the difference between these two types of two-dimensional vector?
I'd appreciate it if you could help me with this. Thank you.
vector<vector<int> > v(n) is a single vector internally holding a pointer to an array of n vector<int> elements.
vector <int>* v = new vector <int>[n] is a pointer to an array of n vector<int> elements.
In that regard, they are virtually identical, except that the first one manages the array for you and will free it automatically when appropriate, whereas you have to manage the second one yourself.
However, in terms of memory usage, the technical difference between the two approaches is that a vector has to internally keep track of the metadata behind its size() and capacity() methods, so there is slightly more overhead in using vector<vector<int>> v than vector<int>* v, but that overhead is only a few bytes. More importantly, the fact that a vector has separate size() and capacity() means that the vector may allocate room for more elements than are actually valid in the array. So, even though you are adding n vector<int> elements to v's internal array, v may allocate room for m elements, where m > n.
Whenever you push elements into a vector, if its new size() would exceed its current capacity(), the internal array has to be reallocated to grow larger, and most implementations will grow that array by 1.5x-2x for efficiency, to avoid having to reallocate the array on every push.
That could easily account for the extra memory usage you are seeing, depending on how you are populating vector v.
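If you want to watch that happen, a rough sketch along these lines (the exact growth factor is implementation-dependent) prints the size at each point where the next push would force a reallocation:
#include <iostream>
#include <vector>

int main() {
    std::vector<std::vector<int>> v;
    for (int i = 0; i < 100; ++i) {
        v.push_back(std::vector<int>(10));
        if (v.size() == v.capacity())    // the next push_back would have to reallocate
            std::cout << "size " << v.size() << ", capacity " << v.capacity() << '\n';
    }
    // Typical implementations grow the capacity by roughly 1.5x-2x each time,
    // so the final capacity can end up well above the final size.
}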

Swapping vector to free it

I've read that the best way to free the memory of a vector is:
vector<int>().swap(my_vector);
And I don't really understand what is happening. The swap function takes two vectors and swaps their contents, so for instance:
vector<int>v1{1,2,3};
vector<int>v2{4,5,6};
v1.swap(v2);
v1 becomes {4,5,6} and v2 becomes {1,2,3}. This looks normal. But how does my first example free the memory? What happens inside the memory? If my_vector swaps elements with vector<int>() (an empty vector), then doesn't the empty vector get my_vector's elements, and my_vector become empty?
You're not only swapping with an empty vector, you're swapping with a temporary empty vector. So the memory of the two vectors is swapped, and then the destructor of the temporary vector frees the memory which originally belonged to my_vector.
Note that the standard effectively guarantees that in this case swap exchanges the allocated memory rather than copying elements: otherwise swap could throw an exception, which it is forbidden to do here. Also note that the default constructor of vector is exception-free as well, so it effectively cannot allocate memory. As Aziuth correctly notes in the comments, an implementation could in theory keep a small-capacity non-dynamic initial buffer, and that would be transferred in the swap; in practice this is negligible.
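A quick sketch of the idiom in action (the element count is arbitrary and the printed capacities are implementation-defined):
#include <iostream>
#include <vector>

int main() {
    std::vector<int> my_vector(1000000, 42);
    my_vector.clear();                           // size drops to 0, capacity stays around 1000000
    std::cout << my_vector.capacity() << '\n';
    std::vector<int>().swap(my_vector);          // temporary grabs the big buffer, then its destructor frees it
    std::cout << my_vector.capacity() << '\n';   // now 0 (or a tiny implementation-defined value)
}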

std::vector(n, std::vector(n)) or std::vector(n*n)?

Which way of storing an n*m matrix is more efficient: creating a vector of n vectors of size m, or creating one big vector of size n*m?
How is a vector of vectors stored in memory? Does it contain only references/pointers, or whole vectors?
It's more efficient to use a single vector of size N x M. This way all the memory is contiguous, and there is only a single pointer to all of it, rather than N pointers.
A vector of vectors is stored as a pointer to an array of pointers, each of which points to an array of values.
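A sketch of the single-vector approach with the usual row-major index arithmetic (the names n, m and the at lambda are purely illustrative):
#include <cstddef>
#include <vector>

int main() {
    const std::size_t n = 4, m = 5;
    std::vector<int> mat(n * m);     // one contiguous block of n*m ints, one allocation

    // element (i, j) lives at offset i*m + j
    auto at = [&](std::size_t i, std::size_t j) -> int& { return mat[i * m + j]; };

    at(2, 3) = 7;                    // same as mat[2 * m + 3] = 7;
}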

Vector of vectors, heap versus stack (C++)

I wanted to initialize a vector of vectors that contain pointers to Courses. I declared this:
std::vector<std::vector<Course*> > *CSPlan =
new std::vector<std::vector<Course*> >(smsNum);
What I wanted to achieve with this is a vector of vectors, where each inner vector contains pointers to Courses, and I wanted the MAIN vector to be of size smsNum. Furthermore, I wanted it on the heap.
My questions are:
Are both the main vector AND the inner vectors allocated on the heap? Or is only the MAIN vector on the heap, with its elements pointing to smaller vectors on the stack?
I declared it to be of size smsNum, so the MAIN vector is of size 10, but what about the smaller vectors? Are they also of that size, or are they still dynamic?
My goal in the end is to have a vector of vectors, both the Main vector and the child vectors on the heap, and ONLY the Main vector is of size smsNum, while the rest are dynamic.
Any structure that can grow as large as the user wants it to is going to be allocated on the heap. The stack, on the other hand, is used for variables whose size the program knows statically, at compile time.
Since you can have a loop like this:
for (int i = 0; i < your_value; i++) {
    vector.insert(...);
}
If your_value is an integer read from standard input, the compiler has no way of knowing how large your vector will become, i.e. it does not know the maximum number of insertions you may perform.
To solve this, the structure must be allocated on the heap, where it may grow as large as the OS allows (given primary memory and swap). In addition, if you use a pointer to the vector, you are simply dynamically allocating the vector object that refers to the data. This does not change the fact that the contents of the vector are, in any case, allocated on the heap.
You'll have, in your stack:
a variable "x" that stores the address of a variable "y";
And in your heap:
the variable "y" itself, i.e. your vector-of-vectors object;
the contents of your vector of vectors (accessed through "y", which is accessed through "x").
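To map that back onto the code from the question, a commented sketch of what ends up where (Course is only forward-declared here, since its definition isn't shown):
#include <vector>

class Course;   // from the question; the definition isn't needed to store pointers

int main() {
    int smsNum = 10;

    std::vector<std::vector<Course*>>* CSPlan =        // "x": a pointer living on the stack
        new std::vector<std::vector<Course*>>(smsNum); // "y": the outer vector object, on the heap,
                                                       // already holding smsNum empty inner vectors

    (*CSPlan)[0].push_back(nullptr);   // inner vectors stay dynamic; their element buffers are
                                       // also heap-allocated and grow as Course* values are added

    delete CSPlan;                     // raw new requires a matching delete
}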

Is the following std::vector code valid?

std::vector<Foo> vec;
Foo foo(...);
assert(vec.size() == 0);
vec.reserve(100); // I've reserved 100 elems
vec[50] = foo; // but I haven't initialized any of them
// so am I assigning into uninitialized memory?
Is the above code safe?
It's not valid. The vector has no elements, so you cannot access any of them. You have only reserved space for 100 elements (which guarantees that no reallocation happens until more than 100 elements have been inserted).
The fact is that you cannot resize the vector without also initializing the elements (even if only default-initializing them).
You should use vec.resize(100) if you want to index-in right away.
vec[50] is only safe if 50 < vec.size(). reserve() doesn't change the size of a vector, but resize() does and constructs the contained type.
It won't work. While the container has 100 elements reserved, it still has 0 elements.
You need to insert elements in order to access that portion of memory. Like Jon-Eric said, resize() is the way to go.
std::vector::reserve(100) requests a block of at least 100*sizeof(Foo) bytes of free memory, so that further insertions into the vector will not trigger a reallocation until more than 100 elements are present. But accessing an element in that reserved region is undefined: the memory has been allocated, yet no Foo objects have been constructed in it.
Before you can use operator[] to access element 50 you should either call resize, push_back() something 51 times, or use an algorithm such as std::fill_n with a std::back_inserter.
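To make the contrast concrete, a small sketch of the broken pattern next to the working alternatives (Foo here is just a stand-in for the question's type):
#include <vector>

struct Foo { int x = 0; };

int main() {
    Foo foo;

    std::vector<Foo> bad;
    bad.reserve(100);       // capacity >= 100, size still 0
    // bad[50] = foo;       // undefined behavior: element 50 does not exist

    std::vector<Foo> good;
    good.resize(100);       // 100 default-constructed elements
    good[50] = foo;         // fine: 50 < good.size()

    std::vector<Foo> also_good;
    also_good.reserve(100);
    for (int i = 0; i < 51; ++i)
        also_good.push_back(foo);   // after 51 pushes, element 50 is valid; no reallocation occurred
}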