3d stl vector memory size issue - c++

std::vector< std::vector< std::vector<int> > > sp(1, std::vector< std::vector<int> >(1,std::vector<int>(1)));
What should be the memory allocated for this 3d vector?
Massif shows 84 bytes, but shouldn't it be close to the size of an int (4 bytes)?

When you use the STL, you have to consider that your data structures are composed not only of the data itself but also of metadata. They are objects, not raw memory regions.
For each vector object, you have several attributes. Look at:
http://en.cppreference.com/w/cpp/container/vector

Normally a single std::vector is implemented using 3 pointers:
a pointer to the beginning of the allocated area
a pointer to the end of the valid data
a pointer to the end of the allocated area (reserved space)
Thus on a 64-bit platform it takes at least 3 x 8 = 24 bytes, in addition to the actual content of course.
A 3d vector with one integer would therefore occupy at least 24 x 3 + sizeof(int) = 76 bytes, supposing integers are 4 bytes. With 8-byte integers it would be 80 bytes, not counting any extra alignment or bookkeeping needed, for example, by the heap allocator.

By hand calculation it seems that each vector holds 7 elements at the start. Thus 7*sizeof(int)*3 = 84.

Why should it be close to 4 bytes? std::vector is a class with more than just the element attribute! If you are short on memory you should probably not use std::vector and simply use an array or your own implementation of an ArrayList that is closer to the standard array size!

Related

Why is std::stack using 18 times more memory than std::vector when used as an element of a vector?

The following code occupies 7000 kb:
vector<vector<int>> v(300005);
While this occupies more than 131 000 kb:
vector<stack<int>> st(300005);
I also tested for deque and queue, which took more than 131 000 kb as well.
The memory usage measurements come from submitting my programs to an online competitive programming judge, since this is where I ran into this issue. Is it a known fact that stack takes more memory than vector or is this a weird thing related to the online judge? I also checked for the memory usage when declaring individual stacks and vectors, and in that case it is the same.
Simple comparison of sizes of default-constructed containers: std::vector, std::deque, std::stack:
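#include <deque>
#include <iostream>
#include <stack>
#include <vector>

int main()
{
    std::vector<int> v1;
    std::deque<int> v2;
    std::stack<int> v3;
    // sizeof reports only the container object itself, not any heap memory it owns.
    std::cout << sizeof(v1) << std::endl;
    std::cout << sizeof(v2) << std::endl;
    std::cout << sizeof(v3) << std::endl;
}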
yields 24, 80, 80 with gcc 11 on a 64-bit platform. If you peek into the STL vector implementation, it is composed of 3 pointers, normally 8 bytes each on a 64-bit architecture, hence the 24-byte size of a vector. The implementation of deque (and stack is just a deque wrapped in an adapter) is slightly different. The class consists of a map pointer, a size_t, and 2 iterators. Each iterator is composed of 4 pointers, so in the end std::deque itself takes 8 + 8 + 2 x 32 = 80 bytes.
Considering that, your vector of stacks should in theory take a bit more than 3 times more memory than vector of vectors. I am not very familiar with memory layouts, but I guess that on some machines it might take significantly more due to fragmentation or single memory page size limits (vector must use a contiguous memory). Maybe that's why your online judge shows 18 times more.

How to reserve a multi-dimensional Vector without increasing the vector size?

I have data which is N by 4, which I push back as follows.
vector<vector<int>> a;
for(some loop){
...
a.push_back(vector<int>{val1,val2,val3,val4});
}
N would be less than 13000. In order to prevent unnecessary reallocation, I would like to reserve 13000 by 4 spaces in advance.
After reading multiple related posts on this topic (e.g. How to reserve a multi-dimensional Vector?), I know the following will do the work. But I would like to do it with reserve() or a similar function, if there is one, to be able to use push_back().
vector<vector<int>> a(13000,vector<int>(4));
or
vector<vector<int>> a;
a.resize(13000,vector<int>(4));
How can I just reserve memory without increasing the vector size?
If your data is guaranteed to be N x 4, you do not want to use a std::vector<std::vector<int>>, but rather something like std::vector<std::array<int, 4>>.
Why?
It's the more semantically-accurate type - std::array is designed for fixed-width contiguous sequences of data. (It also opens up the potential for more performance optimizations by the compiler, although that depends on exactly what it is that you're writing.)
Your data will be laid out contiguously in memory, rather than every one of the different vectors allocating potentially disparate heap locations.
Having said that, @pasbi's answer is correct: you can use std::vector::reserve() to allocate space for your outer vector before inserting any actual elements (both for vectors-of-vectors and for vectors-of-arrays). Also, later on, you can use the std::vector::shrink_to_fit() method if you ended up inserting a lot less than you had planned.
Finally, one other option is to use a gsl::multi_span over pre-allocated memory (GSL is the C++ Core Guidelines Support Library). Note that a span is only a view and does not own its storage.
You've already answered your own question.
There is a function vector::reserve which does exactly what you want.
vector<vector<int>> a;
a.reserve(N);
for(some loop){
...
a.push_back(vector<int>{val1,val2,val3,val4});
}
This will reserve memory to fit N times vector<int>. Note that the actual size of the inner vector<int> is irrelevant at this point, since the data of a vector is allocated somewhere else; only a pointer and some bookkeeping are stored in the actual std::vector object.
Note: this answer is only here for completeness in case you ever come to have a similar problem with an unknown size; keeping a std::vector<std::array<int, 4>> in your case will do perfectly fine.
To pick up on einpoklum's answer, and in case you didn't find this earlier, it is almost always a bad idea to have nested std::vectors, because of the memory layout he spoke of. Each inner vector will allocate its own chunk of data, which won't (necessarily) be contiguous with the others, which will produce cache misses.
Preferably, either:
As already said, use a std::array if you have a fixed and known number of elements per vector;
Or flatten your data structure by having a single std::vector<T> of size N x M.
// Assuming N = 13000, M = 4
std::vector<int> vec;
vec.reserve(13000 * 4);
Then, once the elements have been pushed back row by row, you can access them like so:
// Before (nested vectors):
int& element = vec[nIndex][mIndex];
// After (flat vector, row-major):
int& element = vec[nIndex * 4 + mIndex]; // Still assuming M = 4

size of 2 dimensional vector

I have been trying to figure out the size of a 2-dimensional vector and have not been able to figure it out entirely.
The test program that I have written is as below.
#include <iostream>
#include <vector>
using namespace std;
int main()
{
vector<int> one(1);
vector < vector<int> > two(1, vector <int>(1));
return 0;
}
Memory allocation when I check with the help of valgrind is confusing me. After executing the first statement in the main block, I get the below output.
==19882== still reachable: 4 (+4) bytes in 1 (+1) blocks
So far so good. But after running the next statement I get the below log.
==19882== still reachable: 32 (+28) bytes in 3 (+2) blocks
Now this is confusing, I don't know how to justify the 28 bytes allocated.
If I change the second line as below
vector < vector<int> > two(1, vector <int>(0));
I get the below log
==19882== still reachable: 32 (+24) bytes in 3 (+1) blocks
Kindly help in understanding how the memory is allocated.
tl;dr
The first case just shows the allocation for the (int) storage managed by the vector. The second shows both the inner vector's int storage, and the storage for the inner vector object itself.
So it's telling you this
vector<int> one(1);
allocates one block of 4 bytes.
It doesn't tell you about the automatic storage for the vector object itself, only the dynamic storage for the single integer: assuming sizeof(int)==4, this seems pretty reasonable.
Next it tells you this:
vector < vector<int> > two(1, vector <int>(1));
allocates two more blocks of 28 bytes in total.
Now, one of those blocks will contain the dynamic storage for the vector<int> - remember the previous instance was an automatic local - and the other block will contain the dynamic storage for the nested vector's single integer.
We can assume the second (single integer) allocation is a single block of 4 bytes, as it was last time. So, the dynamically-allocated vector<int> itself is taking 24 bytes in another single block.
Is 24 bytes a reasonable size for a std::vector instance? That could easily be
template <typename T> class vector {
T* begin;
size_t used;
size_t allocated;
};
on a platform with 64-bit pointers and size_t. This assumes a stateless allocator, which is probably right.

Which data structure is better for an array of std string

I need a structure as follow:
The structure must hold fixed-size std::strings, and the number of its elements is finite (100 - 10000000).
I would like to be able to access each element randomly as follows:
std::string Temp = MyStrcuture[i];
or
MyStrcuture[i] = Temp;
I have to use the fastest structure with (if possible) no memory leaks.
Which one is better for me?
std::string* MyStrcuture = new std::string[Nu_of_Elements];
std::queue< std::string> MyStrcuture(Nu_of_Elements);
std::vector< std::string> MyStrcuture(Nu_of_Elements);
boost::circular_buffer< std::string> MyStrcuture(Nu_of_Elements);
Your suggestion?
std::vector< std::string> MyStrcuture(Nu_of_Elements);
Vector is the best fit for your requirements. It supports index-based element access, as the elements are stored at contiguous memory addresses, and it is flexible in size.
std::string* MyStrcuture = new std::string[Nu_of_Elements]; No
See "C++ STL vector vs array in the real world".
std::queue< std::string> MyStrcuture(Nu_of_Elements); No
See "How do I get the nth item in a queue in java?". Index-based element access is not supported.
std::vector< std::string> MyStrcuture(Nu_of_Elements); Yes
Clean-up: the vector's destructor automatically invokes the destructor of each element in the vector.
boost::circular_buffer< std::string> MyStrcuture(Nu_of_Elements); No
Same reason as the second one.
Well, since your strings have a fixed size, if you have no special requirements when processing the strings and have enough free memory for a contiguous allocation, you can use std::array< char, 400 > or std::unique_ptr< char[] > instead of std::string.
Going through your options in order:
new std::string[]: you have to manage the memory the C way. Consider a smart pointer.
std::queue doesn't have random access; see "Access c++ queue elements like an array".
std::vector is suitable if the number of strings will change. However, clear() only calls the destructors of the elements; it does not free the memory the vector has allocated (you can check capacity() after clear()).
boost::circular_buffer: after reading the Boost documentation, the random-access circular buffer is suitable if your number of strings has an upper limit (you said 10 million). But it is a waste of memory if you actually have only a few strings, so I suggest using it with smart pointers.
If your number of strings is fixed and unchanged from the beginning, you can have a look at the C++11 array container.
If the number of elements and their length are fixed and memory is critical, you may consider using a plain char array, which provides minimal memory overhead and fast access. Your code will look like this:
char* MyStructure = new char[n * 401];
memset(MyStructure, 0, n * 401);
std::string Temp = &MyStructure[i * 401]; // Get value
strcpy(&MyStructure[i * 401], Temp.c_str()); // Put value (Temp must be at most 400 characters)
401 here is for the 400 bytes of your string plus 1 trailing zero.

basic question on std::vector in C++

C++ textbooks, and threads like these, say that vector elements are physically contiguous in memory.
But when we do operations like v.push_back(3.14) I would assume the STL is using the new operator to get more memory to store the new element 3.14 just introduced into the vector.
Now say the vector of size 4 is stored in computer memory cells labelled 0x7, 0x8, 0x9, 0xA. If cell 0xB contains some other unrelated data, how will 3.14 go into this cell? Does that mean cell 0xB will be copied somewhere else, erased to make room for 3.14?
The short answer is that the entire array holding the vector's data is moved around to a location where it has space to grow. The vector class reserves a larger array than is technically required to hold the number of elements in the vector. For example:
vector< int > vec;
for( int i = 0; i < 100; i++ )
vec.push_back( i );
cout << vec.size(); // prints "100"
cout << vec.capacity(); // prints some value greater than or equal to 100
The capacity() method returns the size of the array that the vector has reserved, while the size() method returns the number of elements in the array which are actually in use. capacity() will always return a number larger than or equal to size(). You can change the size of the backing array by using the reserve() method:
vec.reserve( 400 );
cout << vec.capacity(); // prints a value of at least 400 (typically exactly 400)
Note that size(), capacity(), reserve(), and all related methods count elements of the type that the vector is holding, not bytes. For example, if vec's type parameter T is a struct that takes 10 bytes of memory, then vec.capacity() returning 400 means that the vector actually has 4000 bytes of memory reserved (400 x 10 = 4000).
So what happens if more elements are added to the vector than it has capacity for? In that case, the vector allocates a new backing array (often twice the size of the old array), copies the old array to the new array, and then frees the old array. In pseudo-code:
if(capacity() < size() + items_added)
{
    // Start from at least 1 so that an empty vector does not loop forever.
    size_t sz = capacity() > 0 ? capacity() : 1;
    // Grow geometrically until everything fits.
    while(sz < size() + items_added)
        sz *= 2;
    T* new_data = new T[sz];
    // Copy the existing elements into the new backing array.
    for(size_t i = 0; i < size(); i++)
        new_data[i] = old_data[i];
    delete[] old_data;
    old_data = new_data;
}
So the entire data store is moved to a new location in memory that has enough space to store the current data plus a number of new elements. Note that std::vector never shrinks its backing array on its own; you can ask for that explicitly with shrink_to_fit(), which is a non-binding request.
std::vector first allocates a bigger buffer, then copies existing elements from the "old" buffer to the "new" buffer, then it deletes the "old buffer", and finally adds the new element into the "new" buffer.
Generally, std::vector implementations grow their internal buffer by a constant factor (commonly 2, or 1.5 in some implementations) each time it is necessary to allocate a bigger buffer.
As Chris mentioned, every time the buffer grows, all existing iterators are invalidated.
When std::vector allocates memory for the values, it allocates more than it needs; you can find out how much by calling capacity. When that capacity is used up, it allocates a bigger chunk, again larger than it needs, and copies everything from the old memory to the new; then it releases the old memory.
If there is not enough space to add the new element, more space will be allocated (as you correctly indicated), and the old data will be copied to the new location. So cell 0xB will still contain the old value (as it might have pointers to it in other places, it is impossible to move it without causing havoc), but the whole vector in question will be moved to the new location.
A vector is an array of memory. A typical implementation grabs more memory than is required. If that footprint needs to expand over any other memory, the whole lot is copied to a new location and the old storage is freed. Note that the vector object itself may live on the stack, but its element storage comes from the heap (via the allocator). It is also a good idea to reserve the maximum size up front if it is known.
In C++, which comes from C, memory is not 'managed' the way you describe - Cell 0x0B's contents will not be moved around. If you did that, any existing pointers would be made invalid! (The only way this could be possible is if the language had no pointers and used only references for similar functionality.)
std::vector allocates a new, larger buffer and stores the value of 3.14 to the "end" of the buffer.
Usually, though, to keep push_back() cheap, a std::vector allocates more memory than its size() requires (often about twice as much), trading a reasonable amount of memory for performance. So it is not guaranteed that pushing 3.14 will cause a reallocation: if size() < capacity(), the value is simply written into the existing buffer at index size(), and size() is incremented.