passing vectors by reference and dynamic allocation - c++

If I pass a vector by reference and and I push new elements to the vector and it has to dynamically resize the array while I push elements in the function, will the vector now be at a different location? How does the program ensure that it points to the new dynamically allocated location? Is it using a pointer to the pointer that points to the current array memory location which can be modified after resizing to point to a new contiguous memory location under the hood?
void func(vector<int> &vect)
{
vect.push_back(30);
}
int main()
{
vector<int> vect;
func(vect);
}

The answer is yes. You can confirm this to yourself by trying to figure out what's happening in your code sample. The key insight is that anything allocated non-dynamically must have a fixed size at allocation time.
You don't have a pointer here, so the core data structure of the vector is getting allocated on the stack, where it can't just grow into other things that are potentially on top of it in the stack. For instance, while you're inside func(), the function call to func is on the stack on top your vector variable. It's impossible for it grow there without overwriting the stack frame for your function call. If it did that, your program would crash when trying to return from func.
So, the only way it could possibly grow is if, internally, it dynamically allocated memory in a different location as needed. So, this must be what's happening under the hood.

Related

Why does object's memory profile need to be known at compile-time for stack placement?

It seems to be a well-accepted fact that if you instantiate an object in a language like C++, it is only eligible for stack placement (as opposed to heap allocation) if the size of the object is known at compile time. For example an array of fixed size can go on the stack but the elements of a vector will be placed on the heap since the final size is unknown during compilation.
Why is this the case? Why can't stack space be dynamically allocated during runtime by moving the stack pointer?
EDIT: For clarification, here's an example: in the middle of a function, you compute a non-const integer. If you want to create an array of this size, the allocation must be done on the heap because the size of the array is not known at compile-time. But why can't you just put the array at the top of the stack when it is created?
Consider this code:
void extend(std::vector<int>& v, int val) {
int w = abs(val);
v.push(w)
}
std::vector<int> v(10);
extend(v, 3);
If the content of v is on stack, it will not be continuous since there's other stuff between v[9] and the newly pushed v[10]. Worse, if the operation is forced, then v[10] will almost surely overwrite something.
You can do it with the function alloca() for gcc, and the functions _alloca() and _malloca() for Microsoft compilers.
This looks as follows:
void allocatesArray(int i)
{
char *ray = reinterpret_cast<char*>(alloca(i));
// 'i' bytes of memory allocated, returns void *
ray[0] = 'a';
// space freed, pointer invalidated
}
Mind that it is not considered as a very good practice because of undefined behavior in the case of a stack overflow.
What you can't do unless you're at the very top of the stack is resizing, since you can't move the entire stack around. If you have some data structure that may change size in any way, you cannot use it in any function call from your current scope with a non-const context.
If you're just thinking about this as a possible way to optimize things, clang already does so called "heap elision" to do these kinds of optimization when applicable.

Pointer to Vector Memory Loss

I have come against a problem with my pointer vectors...
And i got an idea what the problem might be:
When I create a pointer to a vector, the pointer reserves the size of the vector on the heap. So that basicly means, that the pointer now points to the memory of the vector without anything inside... When i now resize or pushback the vector, will the pointer now still point to the whole memory of the vector or just the memory which has been allocated at the beginning?
I also want to know, if there are some tricks you can do to fix that (if what i think is true). Is the "vector.reserve(n)" a method of accomplishing this? Or is there something i could do to overwrite the pointers memory adress, to a vector after it has been initialised?
"When I create a pointer to a vector, the pointer reserves the size of the vector on the heap.
No, it doesn't! When you create a pointer to a vector, you have a pointer. Nothing more. It doesn't point to anything yet, and it certainly hasn't reserved any "heap memory" for you.
You still need to actually create the vector that will be pointed-to.
std::vector<int>* ptr1; // just a pointer;
ptr1 = new std::vector<int>(); // there we go;
delete ptr1; // don't forget this;
auto ptr2 = std::make_shared<std::vector<int>>(); // alternatively...
Y'know, it's very rare that you need to dynamically-allocate a container. Usually you just want to construct it in the usual fashion:
std::vector<int> v;
That's it. No need for pointers.
When i now resize or pushback the vector, will the pointer now still point to the whole memory of the vector or just the memory which has been allocated at the beginning?
Regardless of how you constructed/allocated it, the vector itself never moves spontaneously (only the dynamic memory that it is managing for you internally), so you do not need to worry about this.
I also want to know, if there are some tricks you can do to fix that
No need, as there's nothing to fix.
Is the "vector.reserve(n)" a method of accomplishing this?
Theoretically, if this were a problem (which it isn't) then, yes, this could possibly form the basis of a solution.
Vector is class that has internal pointer to vector's elements using continuous block of memory in the heap. Long enough for all reserved vector's elements (capacity() method)
So, if you create vector (in local scope's stack OR in the heap - doesn't matter) it creates this layout
[ vector [ ptr-to-data ] ] --> HEAP: [ 1 ][ 2 ][ 3 ]
vector<int> v1(3); // whole vector instance in the stack
vector<int> *pv2 = new vector<int>(3); // pointer to vector in the heap
each of these 2 vector instances has its pointer to its elements in the heap as well
Vector manages its pointer to the data internally.
When you push_back() more elements than its current .capacty() it will re-allocate new continuous block of memory and copy-construct or move all old elements to this new block.

Pointers assigned inside an invoked function valid?

I want to understand whether in this piece of code, pointers stored in the vector by func() are valid when accessed from main_func(), and if so why?
void func(vector<double*>& avector) {
double a=0,b=0;
for (int i=0;i<10;i++) {
double *a = new double[2];
avector.push_back(a);
avector[avector.size()-1][0] = a;
avector[avector.size()-1][1] = b;
}
}
void main_func(){
vector<double*> v;
func(v);
for (int i=0;i<v.size();i++)
{
// access a
// References stored in vector valid???
}
}
Why don't you use a vector of pair vector<pair<double,double>>
http://www.cplusplus.com/reference/utility/pair/
Let's get to basics. A pointer is simply four-byte integer value denoting address in memory in which some data exists. Dynamic memory allocation is application-global, so if you allocate memory anywhere in your program by new, new[] or malloc (or something like this), you can use this memory anywhere you like and OS guarantees, that it won't be moved anywhere (in terms of address, that you got from the memory allocation routine).
So let's get back to your program. You allocate some memory inside the function and store the addresses of pointers to that memory in the vector. When you exit that function, these pointers are still completely valid and you can use them as you wish.
Don't forget to free them, when you don't need them anymore!
They are valid because you created them with new and that means they have dynamic storage duration. You do however lack the corresponding delete and therefore have a memory leak.
Your vector does not store references, but pointers to objects on the heap. That object is an array of two doubles, and you just copy the values from your local variable. So everything is alright (except that you don't release the memory).
If you somewhere (store and) return references or pointers to function locals, those won't have any meaning outside the function.

Can a vector allocated on the "stack" be passed from function to function?

I know that variables allocated on that stack of a function become inaccessible when the function finishes execution. However, vector types allocate their elements on the heap no matter how they are allocated. So for instance,
vector<int> A;
will allocate space for its elements on the heap instead of the stack.
My question is, assume I have the following code:
int main(int argc, char *argv[]) {
// initialize a vector
vector<int> A = initVector(100000);
// do something with the vector...
return 0;
}
// initialize the vector
vector<int> initVector(int size) {
vector<int> A (size); // initialize the vector "on the stack"
// fill the vector with a sequence of numbers...
int counter = 0;
for (vector<int>::iterator i = A.begin(); i != A.end(); i++) {
(*i) = counter++;
}
return A;
}
Will I have memory access problems when using the vector A in the main function? I tried this several times and they all worked normally, but I'm scared that this might just be luck.
The way I see it is, the vector A allocates its elements on the heap, but it has some "overhead" parameters (maybe the size of the vector) allocated on the stack itself. Therefore, using the vector in the main function might result in a memory access problem if these parameters are overwritten by another allocation. Any ideas?
When you do "return A;" you return by value, so you get a copy of the vector -- C++ creates a new instance and calls copy constructor or operator= on it. So in this case it doesn't matter where the memory was allocated as you have to copy it anyway and destroy the old copy (some possible optimizations notwithstanding).
The data in the vector (and all other STL containers) is moved around by value as well, so you store the copy of your integers, not pointers to them. That means your objects can be copied around several times in any container operation and they must implement a copy constructor and/or assignment operator correctly. C++ generates those for you by default (just calling copy ctor on all your member variables), but they don't always do the right thing.
If you want to store pointers in an STL container consider using shared pointer wrappers (std::shared_ptr or boost::shared_ptr). They will ensure the memory is handled correctly.
Yes, it will work normally because the memory for the elements are allocated and that is what will be used to build the vector<int> A = variable. However, performance wise, it is not the best idea.
I would suggest changing your function to be the following though
void initVector(vector<int>& a, int size)
For additional references on usage, please see Returning a STL vector from a function… and [C++] Returning Vector from Function.
For an additional reference on performance (using C++11), please see Proper way (move semantics) to return a std::vector from function calling in C++0x
C++ vector actually has two pieces of memory that are linked with a single pointer. The first one is in stack, and the 2nd one in heap. So you have features of both stack and heap in a single object.
std::vector<int> vec;
vec.push_back(10);
vec.push_back(20);
vec.push_back(30);
std::cout << sizeof(vec) << std::endl;
Once you run that code, you'll notice that the stack area does not contain the elements, but it still exists. So when you pass the vector from function to another, you'll need to manipulate the stack areas, and the vector will get copied like any other stack-based object.

What's the difference between these two classes?

Below, I'm not declaring my_ints as a pointer. I don't know where the memory will be allocated. Please educate me here!
#include <iostream>
#include <vector>
class FieldStorage
{
private:
std::vector<int> my_ints;
public:
FieldStorage()
{
my_ints.push_back(1);
my_ints.push_back(2);
}
void displayAll()
{
for (int i = 0; i < my_ints.size(); i++)
{
std::cout << my_ints[i] << std::endl;
}
}
};
And in here, I'm declaring the field my_ints as a pointer:
#include <iostream>
#include <vector>
class FieldStorage
{
private:
std::vector<int> *my_ints;
public:
FieldStorage()
{
my_ints = new std::vector<int>();
my_ints->push_back(1);
my_ints->push_back(2);
}
void displayAll()
{
for (int i = 0; i < my_ints->size(); i++)
{
std::cout << (*my_ints)[i] << std::endl;
}
}
~FieldStorage()
{
delete my_ints;
}
};
main() function to test:
int main()
{
FieldStorage obj;
obj.displayAll();
return 0;
}
Both of them produces the same result. What's the difference?
In terms of memory management, these two classes are virtually identical. Several other responders have suggested that there is a difference between the two in that one is allocating storage on the stack and other on the heap, but that's not necessarily true, and even in the cases where it is true, it's terribly misleading. In reality, all that's different is where the metadata for the vector is allocated; the actual underlying storage in the vector is allocated from the heap regardless.
It's a little bit tricky to see this because you're using std::vector, so the specific implementation details are hidden. But basically, std::vector is implemented like this:
template <class T>
class vector {
public:
vector() : mCapacity(0), mSize(0), mData(0) { }
~vector() { if (mData) delete[] mData; }
...
protected:
int mCapacity;
int mSize;
T *mData;
};
As you can see, the vector class itself only has a few members -- capacity, size and a pointer to a dynamically allocated block of memory that will store the actual contents of the vector.
In your example, the only difference is where the storage for those few fields comes from. In the first example, the storage is allocated from whatever storage you use for your containing class -- if it is heap allocated, so too will be those few bits of the vector. If your container is stack allocated, so too will be those few bits of the vector.
In the second example, those bits of the vector are always heap allocated.
In both examples, the actual meat of the vector -- the contents of it -- are allocated from the heap, and you cannot change that.
Everybody else has pointed out already that you have a memory leak in your second example, and that is also true. Make sure to delete the vector in the destructor of your container class.
You have to release ( to prevent memory leak ) memory allocated for vector in the second case in the FieldStorage destructor.
FieldStorage::~FieldStorage()
{
delete my_ints;
}
As Mykola Golubyev pointed out, you need to delete the vector in the second case.
The first will possibly build faster code, as the optimiser knows the full size of FieldStorage including the vector, and could allocate enough memory in one allocation for both.
Your second implementation requires two separate allocations to construct the object.
I think you are really looking for the difference between the Stack and the Heap.
The first one is allocated on the stack while the second one is allocated on the heap.
In the first example, the object is allocated on the stack.
The the second example, the object is allocated in the heap, and a pointer to that memory is stored on the stack.
the difference is that the second allocates the vector dynamically. there are few differences:
you have to release memory occupied by the vector object (the vector object itself, not the object kept in the vector because it is handled correctly by the vector). you should use some smart pointer to keep the vector or make (for example in the destructor):
delete my_ints;
the first one is probably more efficient because the object is allocated on the stack.
access to the vector's method have different syntax :)
The first version of FieldStorage contains a vector. The size of the FieldStorage class includes enough bytes to hold a vector. When you construct a FieldStorage, the vector is constructed right before the body of FieldStorage's constructor is executed. When a FieldStorage is destructed, so is the vector.
This does not necessarily allocate the vector on the stack; if you heap-allocate a FieldStorage object, then the space for the vector comes from that heap allocation, not the stack. If you define a global FieldStorage object, then the space for the vector comes from neither the stack nor the heap, but rather from space designated for global objects (e.g. the .data or .bss section on some platforms).
Note that the vector performs heap allocations to hold the actual data, so it is likely to only contain a few pointers, or a pointer and couple of lengths, but it may contain whatever your compiler's STL implementation needs it to.
The second version of FieldStorage contains a pointer to a vector. The size of the FieldStorage class includes room for a pointer to a vector, not an actual vector. You are allocating storage for the vector using new in the body of FieldStorage's constructor, and you leaking that storage when FieldStorage is destructed, because you didn't define a destructor that deletes the vector.
First way is the prefereable one to use, you don't need pointer on vector and have forgot to delete it.
In the first case, the std::vector is being put directly into your class (and it is handling any memory allocation and deallocation it needs to grow and shrink the vector and free the memory when your object is destroyed; it is abstracting the memory management from you so that you don't have to deal with it). In the second case, you are explicitly allocating the storage for the std::vector in the heap somewhere, and forgetting to ever delete it, so your program will leak memory.
The size of the object is going to be different. In the second case the Vector<>* only takes up the size of a pointer (4bytes on 32bit machines). In the first case, your object will be larger.
One practical difference is that in your second example, you never release either the memory for the vector, or its contents (because you don't delete it, and therefore call its destructor).
But the first example will automatically destroy the vector (and free its contents) upon destruction of your FieldStorage object.