Hello, I am coming from C to C++ and I've been wondering why std::vector can be passed by value.
I assume passing a dynamically allocated array by value is not possible, as that would only copy the pointer.
How is it then possible for a vector to be copied, if the vector class holds the same kind of pointer inside? It has to somehow know how to reconstruct the data in another object.
std::vector knows how many elements are stored in the dynamic memory. It is a simple matter to allocate a new buffer of that size and copy the contents into that new memory. All of this happens in the copy constructor.
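To make the copy constructor idea concrete, here is a minimal sketch. `IntBuffer` and `modify_copy` are hypothetical names invented for this illustration; the class is a heavily simplified stand-in and not how any real std::vector is written.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical, simplified stand-in for std::vector<int> (illustration only).
class IntBuffer {
public:
    explicit IntBuffer(std::size_t n, int value = 0)
        : size_(n), data_(new int[n]) {
        std::fill(data_, data_ + size_, value);
    }

    // Copy constructor: allocate a new buffer of the same size, then copy the contents.
    IntBuffer(const IntBuffer& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Copy assignment via copy-and-swap: the by-value parameter is already a deep copy.
    IntBuffer& operator=(IntBuffer other) {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
        return *this;
    }

    ~IntBuffer() { delete[] data_; }

    int& operator[](std::size_t i) { assert(i < size_); return data_[i]; }
    const int& operator[](std::size_t i) const { assert(i < size_); return data_[i]; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    int* data_;
};

// Takes the buffer by value: the copy constructor runs, so the caller's data is untouched.
int modify_copy(IntBuffer b) {
    b[0] = 42;
    return b[0];
}
```

Calling `modify_copy(a)` changes only the function's private copy; `a[0]` keeps its old value afterwards, which is the same behaviour you see when passing a std::vector by value.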
I was wondering how vector works in C++. When we add a new element and the vector runs out of space, it allocates new memory and copies all the previous elements to the new location.
Now, how is this behavior implemented?
A* a = new A(prev_a);
will copy-construct at a new location allocated by new. But for a vector, we have to allocate multiple objects, and we cannot do so because array new cannot take initialization arguments.
So I wonder, how does vector implement this? I assume that the vector allocates memory first and then calls a copy constructor at the specific location. How is this done? Thanks
I assume that the vector allocates memory first and then calls a copy constructor at the specific location.
That is right, these are two separate steps:
Memory allocation using allocator::allocate.
Initialization. It copy/move-constructs the elements using allocator::construct, which normally uses placement new.
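The two steps can be sketched as follows. This is only an illustration under assumptions: `clone_into_new_storage` and `destroy_and_free` are hypothetical helper names, and the sketch goes through std::allocator_traits, which is the interface the containers use to reach the allocator. A real vector does considerably more bookkeeping.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

using StringAlloc = std::allocator<std::string>;
using Traits = std::allocator_traits<StringAlloc>;

// Hypothetical helper: copies n strings into freshly allocated raw storage.
std::string* clone_into_new_storage(const std::string* src, std::size_t n,
                                    StringAlloc& alloc) {
    assert(n > 0);  // sketch-only precondition
    // Step 1: allocate raw memory. No std::string constructor has run yet.
    std::string* dst = Traits::allocate(alloc, n);
    std::size_t built = 0;
    try {
        // Step 2: copy-construct each element in place (placement new by default).
        for (; built < n; ++built)
            Traits::construct(alloc, dst + built, src[built]);
    } catch (...) {
        // Undo the elements that were already constructed, then release the memory.
        while (built > 0) Traits::destroy(alloc, dst + --built);
        Traits::deallocate(alloc, dst, n);
        throw;
    }
    return dst;
}

// Hypothetical helper: the reverse order, destroy the elements and then free the memory.
void destroy_and_free(std::string* p, std::size_t n, StringAlloc& alloc) {
    for (std::size_t i = n; i > 0; --i) Traits::destroy(alloc, p + i - 1);
    Traits::deallocate(alloc, p, n);
}
```

Because allocation and construction are separate, each element can be built with whatever constructor arguments are needed, which array new cannot do.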
I have a problem about pointer and standard library use.
Let's create a new class
class Graph
{
    std::vector<Edge> *edge_list;
    //another way is
    //std::vector<Edge> edge_list;
};
I have already thought of two reasons to use a pointer:
It's easy to manage the memory using new and delete.
It can be passed as a parameter easily.
However, we can pass by reference if we use a vector directly, so reason 2 doesn't count.
So, is it true that if I have no strict memory allocation requirements, I don't need to use a pointer to a vector or to other std containers?
A typical implementation of std::vector holds three pointers:
The beginning of the allocated array
One element past the last stored element
One element past the end of the allocated array
Essentially, when you declare an empty vector it typically has no space allocated on the heap, but this changes as you add elements.
Note that std::vector manages the memory it uses, so there is no need for you to worry about new and delete (unnecessary complexity). As soon as it goes out of scope, its destructor releases the heap buffer, and the vector object itself is destroyed along with it.
As you said, a vector can be passed very easily by reference, which usually compiles to the same machine code as passing a pointer, and it is clearer.
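As a rough sketch of the member-by-value approach: `Edge` is not defined in the question, so the two-field struct below is an assumption, and `add_edge`, `edges` and `count_outgoing` are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed layout; the question does not show what Edge contains.
struct Edge {
    int from;
    int to;
};

class Graph {
public:
    void add_edge(int from, int to) {
        assert(from >= 0 && to >= 0);  // sketch-only sanity check
        edge_list.push_back(Edge{from, to});
    }

    // Hands out a reference: the caller sees the same vector, nothing is copied.
    const std::vector<Edge>& edges() const { return edge_list; }

private:
    // Plain member, no new/delete: the vector frees its buffer when Graph is destroyed.
    std::vector<Edge> edge_list;
};

// Read-only access through a const reference; no copy of the elements is made.
std::size_t count_outgoing(const std::vector<Edge>& edges, int node) {
    std::size_t n = 0;
    for (const Edge& e : edges)
        if (e.from == node) ++n;
    return n;
}
```

A nice side effect of this design is that the compiler-generated copy constructor of `Graph` already does the right thing, because copying the vector member copies its elements.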
Let's say I have a struct like this:
struct typeA
{
    long first;
    string second;
    double third;
};
If I declare
typeA myArray[100];
Then myArray is stored in the stack, consuming sizeof(typeA)*100 bytes of garbage data (until I store some actual data, at least).
Whenever I pass this array as a parameter, I'll always be passing a pointer to the first element in the stack. So the pointer goes from stack to stack.
But if I declare
vector<int> myVector (4, 100);
Then the myVector object is actually stored in the stack, and it contains a pointer to the first element of an array of 4*sizeof(int) bytes stored in the heap, where the actual data is stored. So pointer goes from stack to heap.
Whenever I pass this vector as a parameter, if I add it to the parameter list like this:
vector<int> parameterVector
the function gets a copy of the myVector object and stores it in the stack.
But if I do it like this:
vector<int> &parameterVector
the function gets a reference to myVector stored in the stack, so I now have a variable stored in the stack, referencing a myVector object also stored in the stack, that contains a pointer to an array of actual elements stored in the heap.
Is this correct?
I have a few doubts here:
Do the actual elements get stored in a static array (the ones inherited from C, indicated with square brackets) in the heap?
Does the myVector object have just one pointer to the first element, or it has multiple pointers to each one of the elements?
So passing a vector by value doesn't pose much of a problem, since the only thing that gets copied is the vector object, but not the actual elements. Is that so?
If I got the whole thing wrong and the actual elements are copied as well when passing a vector parameter by value, then why does C++ allow this, considering it discourages it with static arrays? (as far as I know, static arrays always get passed as a reference to the first element).
Thanks!
Do the actual elements get stored in a static array (the ones inherited from C, indicated with square brackets) in the heap?
Typically the elements of the vector are stored in the free store using a dynamic array like
some_type* some_name = new some_type[some_size];
Does the myVector object have just one pointer to the first element, or it has multiple pointers to each one of the elements?
Typically a vector will have a pointer to the first element, a size variable and a capacity. It could have more but these are implementation details and are not defined by the standard.
So passing a vector by value doesn't pose much of a problem, since the only thing that gets copied is the vector object, but not the actual elements. Is that so?
No. Copying the vector is an O(N) operation, as it has to copy each element of the vector. If it did not, you would have two vectors using the same underlying array, and when one was destroyed it would delete the array out from under the other.
Do the actual elements get stored in a static array (the ones inherited from C, indicated with square brackets) in the heap?
std::vector<> will allocate memory on the heap for all your elements, provided that you use the standard allocator. It will manage that memory and reallocate when necessary. So no, there is no static array. It is more like how you would handle a dynamic array in C, but without all the traps.
If you are looking for a modern replacement for C arrays, have a look at std::array<>. Be aware that a std::array<> will copy all the elements as well when passed by value. Pass it by reference if that is what you mean.
Does the myVector object have just one pointer to the first element, or it has multiple pointers to each one of the elements?
std::vector usually is a pointer to the first element, a size and a few more bits for internal usage. But the details are actually implementation specific.
So passing a vector by value doesn't pose much of a problem, since the only thing that gets copied is the vector object, but not the actual elements. Is that so?
No. Whenever the vector object gets copied to another vector object, all the elements will be copied.
If I got the whole thing wrong and the actual elements are copied as well when passing a vector parameter by value, then why does C++ allow this, considering it discourages it with static arrays? (as far as I know, static arrays always get passed as a reference to the first element).
The "static arrays" are a C legacy. You should simply not use them any more in new code. If you want to pass a vector by reference, do so and nothing will be copied. If you want the vector to be moved, move it instead of copying it. Whenever you tell the compiler you want to copy an object, it will.
OK, why is it that way?
The C behavior is somewhat inconsistent with the rest of the language. When you pass an int, it is copied; when you pass a struct, it is copied; when you pass a pointer, it is copied; but when you pass an array, the array is not copied, and a pointer to its first element is passed instead.
So the C++ way is more consistent. Pass by value copies everything, pass by reference doesn't. With C++11 move constructors, objects can also be passed by moving them. That means that the original vector will be left empty, while the new one has taken over responsibility for the original memory block.
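The three ways of passing can be compared by looking at where the element buffer lives. This is a small sketch; the function names are hypothetical and exist only for this comparison.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// By value: the parameter is a separate vector with its own heap buffer.
bool same_buffer_by_value(std::vector<int> v, const int* caller_data) {
    return v.data() == caller_data;
}

// By reference: the parameter is the caller's vector, so the buffer is the same.
bool same_buffer_by_reference(const std::vector<int>& v, const int* caller_data) {
    return v.data() == caller_data;
}

// By move: the new vector takes over the existing buffer instead of copying it.
std::vector<int> take_ownership(std::vector<int>&& source) {
    std::vector<int> target(std::move(source));
    assert(source.empty());  // the moved-from vector no longer holds the elements
    return target;
}
```

For a non-empty vector `a` with `p = a.data()`, the by-value call reports a different buffer, the by-reference call reports the same one, and after `b = take_ownership(std::move(a))` the buffer at `p` belongs to `b`.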
First we have a vector of pointers as:
vehicleGroup<vehicle*> VG;
In c++, is there a difference between:
VG.push_back(new vehicle(1));
VG.push_back(new vehicle(2));
and
//tmp_vehicle is a public class member
tmp_vehicle = new vehicle(1);
VG.push_back(tmp_vehicle);
tmp_vehicle = new vehicle(2);
VG.push_back(tmp_vehicle);
Does the vector VG contain the address of the pointer itself or the address the pointer points to?
What about map?
VG contains exactly what you ask it for - pointers to vehicle objects.
When you call push_back(), it takes the object you provided (in your case the "object" is a vehicle*), makes a copy of it and puts the copy into the vector. The vector uses an internal memory chunk where it stores objects, which is why it needs to make copies.
The two versions do the same thing.
In your second version, tmp_vehicle first points to whatever new vehicle(1) returned. This pointer is then pushed into the vector, so the vector's first element now also points to that location.
Seen another way, you're not storing tmp_vehicle itself in the vector. You're storing a copy of that pointer.
Then you make tmp_vehicle point to something else. This doesn't change the fact that you stored a pointer to the first location in the vector. It changes what your variable points to, but doesn't change the vector in any way.
(And if you hadn't stored that pointer in the vector, you'd have a memory leak after the second assignment to tmp_vehicle, since you'd have lost all pointers to the first vehicle, leaving no way to delete it.)
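Here is a small sketch of the second version. It assumes a plain std::vector<vehicle*> and a trivial `vehicle` type with an `id` field, neither of which is shown in the question; `build_group` and `destroy_group` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Assumed minimal type; the real vehicle class is not shown in the question.
struct vehicle {
    int id;
    explicit vehicle(int i) : id(i) {}
};

std::vector<vehicle*> build_group() {
    std::vector<vehicle*> VG;
    vehicle* tmp_vehicle = new vehicle(1);
    VG.push_back(tmp_vehicle);     // stores a copy of the pointer value
    tmp_vehicle = new vehicle(2);  // re-pointing tmp_vehicle does not touch VG[0]
    VG.push_back(tmp_vehicle);
    assert(VG[0] != VG[1]);        // two distinct objects are referenced
    return VG;
}

// The vector owns only the pointers, so the objects must be deleted by hand.
void destroy_group(std::vector<vehicle*>& VG) {
    for (vehicle* v : VG) delete v;
    VG.clear();
}
```

After `build_group()`, element 0 still refers to the vehicle with id 1, even though `tmp_vehicle` was reassigned before the second push_back.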
I am writing an application using openFrameworks, but my question is not specific to oF; rather, it is a general question about C++ vectors.
I wanted to create a class that contains multiple instances of another class, but also provides an intuitive interface for interacting with those objects. Internally, my class used a vector of the class, but when I tried to manipulate an object using vector.at(), the program would compile but not work properly (in my case, it would not display a video).
// instantiate object dynamically, do something, then append to vector
vector<ofVideoPlayer> videos;
ofVideoPlayer *video = new ofVideoPlayer;
video->loadMovie(filename);
videos.push_back(*video);
// access object in vector and do something; compiles but does not work properly
// without going into specific openFrameworks details, the problem was that the video would
// not draw to screen
videos.at(0).draw();
Somewhere, it was suggested that I make a vector of pointers to objects of that class instead of a vector of those objects themselves. I implemented this and indeed it worked like a charm.
vector<ofVideoPlayer*> videos;
ofVideoPlayer * video = new ofVideoPlayer;
video->loadMovie(filename);
videos.push_back(video);
// now dereference pointer to object and call draw
videos.at(0)->draw();
I was allocating memory for the objects dynamically, i.e. ofVideoPlayer *video = new ofVideoPlayer;
My question is simple: why did using a vector of pointers work, and when would you create a vector of objects versus a vector of pointers to those objects?
What you have to know about vectors in C++ is that they use the copy constructor of your class to place objects into the vector. If your objects allocate memory that is automatically deallocated when the destructor is called, that could explain your problems: your object was copied into the vector and then destroyed.
If your class holds a pointer to an allocated buffer, a copy of the object will point to the same buffer (if you use the default copy constructor). If the destructor deallocates the buffer, then as soon as the destructor of either the copy or the original is called, the shared buffer is deallocated and your data is no longer available to the other object.
This problem doesn't happen if you use pointers, because you control the lifetime of your elements via new/delete, and the vector functions only copy the pointers to your elements.
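The shared-buffer situation can be shown safely with a sketch like the one below. `ShallowPlayer` and `DeepPlayer` are hypothetical stand-ins (nothing here reflects how ofVideoPlayer is actually written), and the shallow type deliberately has no destructor so that the example does not double-free.

```cpp
#include <cassert>
#include <vector>

// Hypothetical class with a raw pointer and the default (member-wise) copy.
// It has no destructor on purpose, so this sketch never frees the buffer twice.
struct ShallowPlayer {
    int* frame;
};

// Same idea, but the copy constructor duplicates the buffer (deep copy).
struct DeepPlayer {
    int* frame;
    explicit DeepPlayer(int value) : frame(new int(value)) {}
    DeepPlayer(const DeepPlayer& other) : frame(new int(*other.frame)) {}
    DeepPlayer& operator=(const DeepPlayer& other) {
        *frame = *other.frame;
        return *this;
    }
    ~DeepPlayer() { delete frame; }
};

bool shallow_copy_shares_buffer() {
    int* frame = new int(7);
    ShallowPlayer original{frame};
    std::vector<ShallowPlayer> players;
    players.push_back(original);  // member-wise copy: only the pointer is copied
    const bool shared = (players[0].frame == original.frame);
    delete frame;  // freed once; a destructor doing this would run for both copies
    return shared;
}

bool deep_copy_shares_buffer() {
    DeepPlayer original(7);
    std::vector<DeepPlayer> players;
    players.push_back(original);  // the copy constructor allocates a new buffer
    assert(*players[0].frame == 7);
    return players[0].frame == original.frame;
}
```

With the shallow type, the element in the vector and the original point at the same buffer, which is exactly the situation where a buffer-freeing destructor would leave the other object dangling. With the deep copy, each object owns its own buffer.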
My question is simple: why did using a vector of pointers work, and when would you create a vector of objects versus a vector of pointers to those objects?
std::vector is like a raw array allocated with new and reallocated when you try to push in more elements than its current capacity.
So, if it contains A pointers, it's as if you were manipulating an array of A*.
When it needs to grow (you push_back() an element while it's already filled to its current capacity), it will create another A* array and copy the pointers over from the previous array.
If it contains A objects, then it's as if you were manipulating an array of A, so A must be copy-constructible (or move-constructible) for automatic reallocations to work. In this case, the A objects themselves get copied (or moved) into the new array.
See the difference? The A objects in std::vector<A> can change address if you do something that requires resizing the internal array. That's where most problems with storing objects in std::vector come from.
A way to use std::vector without having such problems is to allocate a large enough array from the start. The keyword here is "capacity". The std::vector capacity is the real size of the memory buffer in which it will put the objects. So, to set up the capacity, you have two choices:
1) size your std::vector on construction to build all the objects from the start, with the maximum number of objects. That will call the constructor of each object.
2) once the std::vector is constructed (but has nothing in it), use its reserve() function: the vector will then allocate a large enough buffer (you provide the maximum size of the vector) and set its capacity accordingly. If you push_back() objects into this vector, or resize() it, while staying under the limit you provided in the reserve() call, it will never reallocate the internal buffer and your objects will not change location in memory, so pointers to those objects stay valid (adding some assertions to check that the capacity never changes is an excellent practice).
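The reserve() option can be sketched like this, including the kind of capacity assertion suggested above. The function names are hypothetical and the element type is just int for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if the first element kept its address while the vector was filled.
bool first_element_stays_put(std::size_t max_elements) {
    assert(max_elements > 0);  // sketch-only precondition
    std::vector<int> v;
    v.reserve(max_elements);   // one allocation, capacity >= max_elements
    const std::size_t capacity_before = v.capacity();

    v.push_back(0);
    const int* first = &v[0];
    for (std::size_t i = 1; i < max_elements; ++i)
        v.push_back(static_cast<int>(i));

    assert(v.capacity() == capacity_before);  // no reallocation happened
    return first == &v[0];
}

// Counts how often the capacity changes when nothing is reserved up front.
std::size_t count_capacity_changes(std::size_t n) {
    std::vector<int> v;
    std::size_t changes = 0;
    std::size_t last = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last) {
            ++changes;
            last = v.capacity();
        }
    }
    return changes;
}
```

How many capacity changes the second function reports depends on the growth policy of your standard library, so treat that number as informative only. Every one of those changes may move the elements and invalidate pointers to them.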
If you are allocating memory for the objects using new, you are allocating it on the heap. In this case, you should use pointers. However, in C++, the convention is generally to create all objects on the stack and pass copies of those objects around instead of passing pointers to objects on the heap.
Why is this better? It is because C++ does not have garbage collection, so memory for objects on the heap will not be reclaimed unless you specifically delete the object. However, objects on the stack are always destroyed when they leave scope. If you create objects on the stack instead of the heap, you minimize your risk of memory leaks.
If you do use the stack instead of the heap, you will need to write good copy constructors and destructors. Badly written copy constructors or destructors can lead to either memory leaks or double frees.
If your objects are too large to be efficiently copied, then it is acceptable to use pointers. However, you should use reference-counting smart pointers (either the C++11 std::shared_ptr or one of the Boost library pointers) to avoid memory leaks.
Adding elements to a vector, and its internal housekeeping, use copies of the original object. If taking a copy is very expensive or impossible, then using a pointer is preferable.
If you make the vector member a pointer, use a smart pointer to simplify your code and minimize the risk of leaks.
Maybe your class does not do proper (i.e. deep) copy construction/assignment? If so, pointers would work but object instances as the vector element type would not.
Usually I don't store classes directly in std::vector. The reason is simple: you would not know if the class is derived or not.
E.g.:
In headers:
class base
{
public:
virtual base * clone() { return new base(*this); }
virtual ~base(){};
};
class derived : public base
{
public:
virtual base * clone() { return new derived(*this); }
};
void some_code(void);
void work_on_some_class( base &_arg );
In source:
void some_code(void)
{
...
derived instance;
work_on_some_class(instance);
...
}
void work_on_some_class( base &_arg )
{
vector<base> store;
...
store.push_back(*_arg.clone());
// Issue!
// clone() returns a pointer to a derived object, but the vector stores only a base copy: the derived part is sliced off (and the clone itself is leaked)
}
So I prefer to use shared_ptr:
void work_on_some_class( base &_arg )
{
vector<shared_ptr<base> > store;
...
store.push_back(shared_ptr<base>(_arg.clone()));
// no issue :)
}
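To see the difference at run time, here is a self-contained sketch. The `name()` member and the two helper functions are additions made only for this illustration; the rest follows the base/derived/clone pattern above.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

class base {
public:
    virtual ~base() {}
    virtual std::string name() const { return "base"; }  // added to make slicing visible
    virtual base* clone() const { return new base(*this); }
};

class derived : public base {
public:
    std::string name() const override { return "derived"; }
    base* clone() const override { return new derived(*this); }
};

// Storing by value keeps only the base part of the argument: the object is sliced.
std::string name_after_store_by_value(const base& arg) {
    std::vector<base> store;
    store.push_back(arg);
    return store.back().name();
}

// Storing a smart pointer to a clone keeps the dynamic type intact.
std::string name_after_store_by_pointer(const base& arg) {
    std::vector<std::shared_ptr<base>> store;
    store.push_back(std::shared_ptr<base>(arg.clone()));
    assert(store.back() != nullptr);
    return store.back()->name();
}
```

Passing a `derived` object to the first helper yields "base", while the second yields "derived". The shared_ptr also deletes the clone automatically when the vector goes away.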
The main idea of using a vector is to store objects in contiguous memory; when the elements are pointers or smart pointers, the objects themselves are not stored that way.
You also need to keep in mind how efficiently the CPU can access that memory.
std::vector<Object> guarantees that its elements are stored in one contiguous memory block.
std::vector<std::unique_ptr<Object>> will keep the smart pointers in contiguous memory, but the actual memory blocks for the objects can be placed at different positions in RAM.
So my guess is that std::vector<Object> will be faster when the size of the vector is known and reserved in advance. However, std::vector<std::unique_ptr<Object>> may be faster if we don't know the planned size or we plan to change the order of the objects.
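The layout difference can be checked with a short sketch. `stored_contiguously`, `sum_values` and `sum_pointees` are hypothetical helpers, and int stands in for Object. Nothing here measures speed; it only shows what is and is not laid out contiguously.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Checks that each element sits directly after the previous one in memory.
template <typename T>
bool stored_contiguously(const std::vector<T>& v) {
    for (std::size_t i = 0; i + 1 < v.size(); ++i)
        if (&v[i] + 1 != &v[i + 1]) return false;
    return true;
}

// Walks one contiguous block of ints.
long sum_values(const std::vector<int>& v) {
    long total = 0;
    for (int x : v) total += x;
    return total;
}

// Walks a contiguous block of pointers, then follows each one to wherever
// the separately allocated int happens to live.
long sum_pointees(const std::vector<std::unique_ptr<int>>& v) {
    long total = 0;
    for (const std::unique_ptr<int>& p : v) {
        assert(p != nullptr);
        total += *p;
    }
    return total;
}
```

Both vectors pass `stored_contiguously`, but for the unique_ptr version that only says the pointers are adjacent. The pointed-to objects come from separate allocations, so traversing them involves an extra indirection per element.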