I know vectors are guaranteed to be contiguous memory, and so are arrays. So what happens when I do something like this:
std::vector<uint8_t> my_array[10];
my_array[2].push_back(11);
my_array[2].push_back(7);
What would the memory look like? If both need to be contiguous, would every element of the array after my_array[2] be pushed forward a byte every time I do a push_back() on my_array[2]?
Would this be the same situation as when I have an array of structs, where the structs have a member that has a variable size, such as a string or another vector?
Memory footprint of std::vector consists of two parts:
The memory for the std::vector object itself (very small, and independent of the size), and
The memory for the data of the vector (depends on the number of elements in the vector).
The first kind of data will be contiguous in an array; the second kind of data is allocated dynamically, so it would not be contiguous in an array.
This would not be the same as with a C struct that has a flexible data member, because the data portion of std::vector is not always allocated in the same kind of memory, let alone being adjacent to it. The vector itself may be allocated in static, dynamic, or automatic memory areas, while its data is always in the dynamic area. Moreover, when vector is resized, the memory for its data may be moved to a different region.
Each time you call push_back, std::vector checks if it has enough dynamic memory to accommodate the next data element. If there is not enough memory, then the vector allocates a bigger chunk of memory, and moves its current content there before pushing the new item.
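A quick sketch may make this concrete (the addresses printed are implementation- and run-dependent, so treat the output as illustrative only):
#include <cstdint>
#include <iostream>
#include <vector>

int main() {
    std::vector<std::uint8_t> my_array[10];
    my_array[2].push_back(11);
    my_array[2].push_back(7);

    // The array slots themselves are contiguous and never change size...
    std::cout << "sizeof(my_array[2]) = " << sizeof(my_array[2]) << '\n';
    std::cout << "&my_array[2] = " << static_cast<const void*>(&my_array[2]) << '\n';
    std::cout << "&my_array[3] = " << static_cast<const void*>(&my_array[3]) << '\n';

    // ...while the element storage is a separate, dynamically allocated block.
    std::cout << "my_array[2].data() = " << static_cast<const void*>(my_array[2].data()) << '\n';
}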
The vector objects themselves are contiguous in memory; however, every std::vector contains a pointer to dynamically allocated memory for the actual storage (which is very likely not adjacent to the vector object).
Knowing this, std::vector::push_back only checks whether the (external) dynamically allocated array has enough capacity to hold the new item; if not, it reallocates that array. A push_back that overflows the first vector will not cause the second vector in the array to reallocate memory; that isn't how it works.
Also, there is no such thing as a struct with a variable size; the size of structures and classes has to be known at compile time.
std::string also has a fixed size, although you may think it is variable, because it too (like vector) holds a pointer to the character data it manages.
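You can see the fixed footprint directly with sizeof; the exact numbers are implementation-defined, but they do not change as the containers grow:
#include <iostream>
#include <string>
#include <vector>

int main() {
    std::vector<int> v;
    std::string s;
    std::cout << sizeof(v) << ' ' << sizeof(s) << '\n';

    v.assign(1000, 42);
    s.assign(1000, 'x');
    std::cout << sizeof(v) << ' ' << sizeof(s) << '\n';  // same numbers as before
}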
Related
From this answer, I am led to believe that the members of a class are guaranteed to be contiguous in memory, in the order that they are declared. However, from the docs, "vectors use contiguous storage locations for their elements". So then how is it possible to have multiple vectors in a class, or a vector that is not the last member of a class, given that a vector may resize and spill over into memory that is already allocated?
std::vector is usually implemented as three pointers. One pointer to a dynamically allocated array that stores the vector's contents, one pointer to the end of the used memory and one pointer to the end of the allocated array. No matter how the vector is allocated, the vector's data is stored elsewhere in dynamic storage.
The memory used by each vector is dynamically allocated.
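Roughly speaking (this is a sketch of the commonly used layout, not the actual standard library source), the vector object boils down to something like this:
// Illustration only: the real std::vector is more involved, but the
// three-pointer layout described above looks roughly like this.
template <typename T>
struct vector_sketch {
    T* begin_;    // start of the dynamically allocated element array
    T* end_;      // one past the last constructed element (determines size())
    T* cap_end_;  // one past the end of the allocation (determines capacity())
};
Wherever the vector object itself lives (stack, static storage, inside another object), the elements it points at live in dynamic storage.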
A vector container holds its objects in contiguous memory. That is easy to understand for a case like vector<int>, but what about a vector of vectors, like vector<vector<int>>? Each vector in this vector of vectors can have a different length, so how does it manage the memory? Does it allocate a fixed-length vector every time we push in a new vector? If so, what happens if the first vector grows during a push_back? Would that trigger a reallocation and copy/move of the full vector of vectors?
A vector is, in effect, a pointer to a dynamic array. If you push_back and find you're out of space in the array you have, you allocate a new, bigger array, copy over everything from the old array, and then stick the new value in.
If you have a vector of vectors, the same holds true for each of the inner vectors.
What you need to understand here is that a vector of vectors (unlike a 2D array) is not contiguous in memory. Each of the inner vectors' arrays can be stored anywhere in memory. Or, in other words, "each vector in a vector of vectors is a completely different vector. Each with their own, completely separate and separately managed buffer." [1]
[1] Thanks to user4581301 for this!
A vector contains a pointer to a contiguous memory block. When it runs out of memory, it allocates a new memory block. A vector of vectors is just a vector of pointers to memory blocks. Although each memory block is a contiguous block, they are not necessarily contiguous to each other; that is, it is not necessarily the case that where one vector's block ends, the next one starts. There is almost always a gap.
Why the not necessarily and almost always semantics? Because it depends on the memory allocator you're using and on the operating system internals. Ultimately, it's (one of) the job(s) of the OS to allocate and serve memory blocks to user-space programs.
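A short sketch showing this (again, the printed addresses depend on the allocator and the run):
#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    std::vector<std::vector<int>> vv = {{1, 2, 3}, {4}, {5, 6, 7, 8, 9}};

    for (std::size_t i = 0; i < vv.size(); ++i) {
        // The inner vector objects sit contiguously inside the outer vector's
        // block, but each inner buffer is its own allocation.
        std::cout << "vv[" << i << "] object at " << static_cast<const void*>(&vv[i])
                  << ", its buffer at " << static_cast<const void*>(vv[i].data()) << '\n';
    }

    // Growing one inner vector may move *its* buffer, but it does not move
    // the outer vector or the other inner buffers.
    vv[0].resize(1000);
}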
Let's say I have declared a variable
vector<int>* interList = new vector<int>();
interList->push_back(1);
interList->push_back(2);
interList->push_back(3);
interList->push_back(4);
First question: when I push_back an int, is memory consumed each time?
Second question: if I delete interList, will the memory consumed by 1, 2, 3, 4 be released automatically?
EDIT: free --> delete
Yes, the vector class will probably automatically allocate a larger space than needed in case you want to store more data in it later, so it probably won't allocate new space each time you push_back().
Yes, but you should use delete interList; instead of free().
std::vector allocates a contiguous block of memory up front for some number of elements. So each time you insert a new element, it goes into the reserved block, the memory footprint stays the same, and no new allocation happens.
If you insert an element beyond the allocated block (the capacity of the vector), then it allocates a bigger block, copies all the previous elements into it, and destroys the old block. So the vector manages memory by itself; not every inserted element causes a reallocation of the internal buffer.
Second question: yes, the vector will clean up all its memory if you delete the vector itself.
delete interList;
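You can watch the capacity jumps yourself; the growth factor is implementation-defined, but the pattern of infrequent reallocations is what matters:
#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    std::vector<int> v;
    std::size_t last_cap = v.capacity();
    for (int i = 0; i < 100; ++i) {
        v.push_back(i);
        if (v.capacity() != last_cap) {   // a reallocation just happened
            last_cap = v.capacity();
            std::cout << "size " << v.size() << " -> capacity " << last_cap << '\n';
        }
    }
}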
push_back copies the element to the heap, where the vector allocates an array to store its elements. The capacity of a vector can be greater than the number of elements it currently holds. Every time a push_back happens, the vector checks whether there is enough space; if there isn't, it moves all the elements to a bigger block and then appends the new element. The vector always keeps its elements in one contiguous memory block, so if that block is not large enough to hold all the elements together, it relocates everything to a larger block. To avoid this frequent moving, the vector usually allocates a bigger block than it strictly needs.
delete interList would destroy the vector and the integers held by the vector. Here both the vector object and the integers are on the heap. It is usually better to create the vector on the stack, or as a member of another object, like vector<int> interList;. The vector, though on the stack, still stores its int elements on the heap as an array. And since the ints are stored by value, once the vector goes out of scope their memory is reclaimed.
That is because the vector holds value types: they are copied to the heap by the vector, stored and managed as an array, and their lifetime is tied to the vector's lifetime. If you have a vector of raw pointers, then you do have to worry, for example vector<T*> list; list.push_back(new T());. The list stores pointers to objects of type T, and when you destroy such a vector the T objects are not deleted. This is the same as a class holding a raw T* member. You either have to loop through all the elements and call delete on the pointers, or use a vector of smart pointers. A vector of shared or unique pointers is recommended, as sketched below.
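A minimal sketch of the recommended approach (Widget is a made-up type used purely for illustration):
#include <memory>
#include <vector>

struct Widget { int value = 0; };  // hypothetical element type

int main() {
    std::vector<std::unique_ptr<Widget>> list;
    list.push_back(std::make_unique<Widget>());
    list.push_back(std::make_unique<Widget>());
}   // the vector destroys its unique_ptrs, and each unique_ptr deletes its Widget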
You are better off not allocating the vector with new if you can help it. So your code would look like this:
vector<int> interList;
interList.push_back(1);
interList.push_back(2);
interList.push_back(3);
interList.push_back(4);
Now when interList goes out of scope, all memory is freed. In fact this is the basis of all resource management in C++, somewhat prosaically called RAII (resource acquisition is initialization).
Now if you felt that you absolutely had to dynamically allocate your vector, you should use one of the resource-managing smart pointers. In this case I'm using shared_ptr:
auto interList = std::make_shared<vector<int>>();
interList->push_back(1);
interList->push_back(2);
interList->push_back(3);
interList->push_back(4);
This will also free all memory without you ever calling delete. What's more, you can pass your interList around and it will be reference counted for you. When the last reference is lost, the vector is freed.
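For example (a sketch; takes_a_copy is a hypothetical function that wants shared ownership):
#include <memory>
#include <vector>

void takes_a_copy(std::shared_ptr<std::vector<int>> p) {
    p->push_back(5);   // the use count is at least 2 while we are in here
}                      // this copy is released on return

int main() {
    auto interList = std::make_shared<std::vector<int>>();
    interList->push_back(1);
    takes_a_copy(interList);
}                      // last reference gone: the vector is freed here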
The standard STL vector container has a "reserve" function to reserve uninitialized memory that can be used later to prevent reallocations.
How come the deque container doesn't have one?
Increasing the size of a std::vector can be costly. When a vector outgrows its reserved space, the entire contents of the vector must be copied (or moved) to a larger reserve.
It is specifically because std::vector resizing can be costly that vector::reserve() exists. reserve() can prepare a std::vector to anticipate reaching a certain size without exceeding its capacity.
Conversely, a deque can always add more memory without needing to relocate the existing elements. If a std::deque could reserve() memory, there would be little to no noticeable benefit.
For vector and string, reserved space prevents later insertions at the end (up to the capacity) from invalidating iterators and references to earlier elements, by ensuring that elements don't need to be copied/moved. This relocation may also be costly.
With deque and list, earlier references are never invalidated by insertions at the end, and elements aren't moved, so the need to reserve capacity does not arise.
You might think that with vector and string, reserving space also guarantees that later insertions will not throw an exception (unless a constructor throws), since there's no need to allocate memory. You might think that the same guarantee would be useful for other sequences, and hence deque::reserve would have a possible use. There is in fact no such guarantee for vector and string, although in most (all?) implementations it's true. So this is not the intended purpose of reserve.
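In practice the main observable effect of reserve is the one described above: pointers and references to existing elements survive subsequent push_backs up to the reserved capacity. A small sketch:
#include <cassert>
#include <vector>

int main() {
    std::vector<int> v;
    v.reserve(100);          // one allocation up front

    v.push_back(1);
    int* first = &v[0];

    for (int i = 2; i <= 100; ++i)
        v.push_back(i);      // no reallocation while size() <= capacity()

    assert(first == &v[0]);  // still points at the same element
}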
Quoting from C++ Reference
As opposed to std::vector, the elements of a deque are not stored contiguously: typical implementations use a sequence of individually allocated fixed-size arrays.
The storage of a deque is automatically expanded and contracted as needed. Expansion of a deque is cheaper than the expansion of a std::vector because it does not involve copying of the existing elements to a new memory location.
Deque can allocate new memory anywhere it wants and just point to it, unlike vectors which require a continuous block of memory to hold all their elements.
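That is also why references to deque elements stay valid across push_back, where a vector in the same situation might have moved everything. A sketch:
#include <cassert>
#include <deque>

int main() {
    std::deque<int> d;
    d.push_back(1);
    int& first = d.front();

    for (int i = 0; i < 100000; ++i)
        d.push_back(i);       // grows by adding new blocks; existing elements never move

    assert(&first == &d.front());   // the reference is still valid
}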
Only vector has one. There is no need for a reserve function on deque, since its elements are not stored contiguously and there is no need to reallocate and move elements when adding or removing elements.
reserve implies allocation of large blocks of contiguous data (like a vector). There is nothing in the deque that implies contiguous storage - it's generally implemented more like a list (which you will notice also doesn't have a 'reserve' function).
Thus, a 'reserve' function would make no sense.
There are two main kinds of layout: containers that allocate a single contiguous chunk, like arrays and vectors, and distributed containers whose members grab any empty location available. Queues and linked-list structures belong to the second kind, and they have some practical advantages, such as deleting a particular element not causing a mass memory movement, as opposed to arrays and vectors. Therefore they do not need to reserve any space beforehand; if they need more, they just take it by linking a new piece at the tip.
If you are aiming for memory-aligned containers, you could think about implementing something like this (a rough sketch, not drop-in code):
std::deque<std::vector<std::size_t>> dv;  // deque of dynamically sized, contiguously stored vectors
typedef std::array<std::size_t, N> Mem;   // fixed-size block; N must be a compile-time constant (a raw size_t[N] cannot be a deque element)
std::deque<Mem> dvf;  // deque of fixed-size, memory-aligned blocks. Here you can store the raw bytes, adding a header to loop through and cast using header information and typeid...
// templates and polymorphism can help with storing raw bytes, checking the type a pointer points to, for example, and creating an interface to access the partially aligned memory.
Alternatively you can use a map to access the vectors instead of a deque...
If I have an array of vectors, will the vector be limited in its resizing ability due to the contiguous storage nature of arrays?
Yes, but not in the way you're thinking.
Vectors have to find contiguous address space for their content. Memory fragmentation may cause the largest contiguous block to be smaller than total free memory. And having many vectors makes fragmentation more likely.
No; internally, vectors hold pointers to the memory blocks, not the blocks themselves.
resize won't affect the array's memory at all. Each vector holds a pointer to its actual storage, so resizing affects some other memory that has nothing to do with the array's. All that sits in the array is basically a set of vector objects whose pointers refer to memory blocks of possibly different lengths.
Furthermore, if you have something like this:
std::vector<int> arr [5];
The array's memory will be on the stack, and the vectors' memory will be on the heap. Totally different!
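To confirm the point about resize, you can check that growing one vector leaves the array slots and the other vectors untouched:
#include <cassert>
#include <vector>

int main() {
    std::vector<int> arr[5];
    arr[3].push_back(42);

    std::vector<int>* slot3 = &arr[3];
    const int* data3 = arr[3].data();

    arr[2].resize(10000);                 // only arr[2]'s own heap buffer grows (or moves)

    assert(slot3 == &arr[3]);             // the array on the stack never moves
    assert(data3 == arr[3].data());       // arr[3]'s buffer is untouched
}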