Not long ago I discovered that I have no real idea what std::vector actually is.
Let me explain:
Vector is growable, right? That means, inside it must somehow allocate/reallocate memory dynamically. Something like this:
class vector {
private:
int *data;
};
Okay. But such a definition implies that passing std::vector to a function by reference or by value makes no difference: in both cases the function would be able to modify the data (unless the vector is passed as const).
BUT! I tried the following and my idea failed:
void try_to_modify(vector<int> v) {
v[2] = 53;
}
int main() {
vector<int> v(3);
v[2] = 142;
try_to_modify(v);
cout << v[2] << '\n'; // output is: 142
return 0;
}
So what's the truth? What is std::vector, really?
Thank you.
std::vector is a container that manages its memory internally and provides a custom copy constructor. That copy constructor allocates new memory and copies the existing data over, which makes copying an expensive operation. If you want to pass a vector without copying the contained data, pass it by const reference, for example const std::vector<int>&.
Let's take a look at how to implement a basic container like std::vector.
template <typename T>
class MyVector
{
public:
MyVector (int size)
: data_ (new T[size])
, size_ (size)
{}
~MyVector ()
{
delete [] data_;
}
private:
T* data_ = nullptr;
int size_ = 0;
};
If we copy such an object, we'll have two problems. First, as you noticed, both objects will point to the same memory. Second, two destructors will try to free that same memory, resulting in a double free. So let's add a copy constructor, which is invoked whenever a copy is made, along with a matching copy assignment operator.
MyVector (const MyVector& other)
: size_ (other.size_)
{
data_ = new T[size_];
std::copy (other.data_, other.data_ + size_, data_);
}
MyVector& operator= (const MyVector& other)
{
// allocate and copy first, so self-assignment and a failed allocation leave *this intact
auto newData = new T[other.size_];
std::copy (other.data_, other.data_ + other.size_, newData);
delete [] data_;
size_ = other.size_;
data_ = newData;
return *this;
}
In simplified form, that is how std::vector handles copying internally.
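To see the effect of the deep copy in isolation, here is a minimal self-contained sketch. IntBuffer and original_after_copy_is_modified are hypothetical names made up for illustration, and the class stores int only to keep it short; it is not how any real standard library is written.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical minimal container; names are illustrative only.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t size)
        : data_(new int[size]()), size_(size) {}   // elements start at 0

    // Deep copy: allocate fresh storage, then copy the elements.
    IntBuffer(const IntBuffer& other)
        : data_(new int[other.size_]), size_(other.size_) {
        std::copy(other.data_, other.data_ + other.size_, data_);
    }

    // Build the new array first, then release the old one, so that
    // self-assignment and a throwing allocation leave *this intact.
    IntBuffer& operator=(const IntBuffer& other) {
        int* fresh = new int[other.size_];
        std::copy(other.data_, other.data_ + other.size_, fresh);
        delete[] data_;
        data_ = fresh;
        size_ = other.size_;
        return *this;
    }

    ~IntBuffer() { delete[] data_; }

    int& operator[](std::size_t i) {
        assert(i < size_);
        return data_[i];
    }

private:
    int* data_;
    std::size_t size_;
};

// Returns the original's element after its copy has been modified.
int original_after_copy_is_modified() {
    IntBuffer a(3);
    a[2] = 142;
    IntBuffer b = a;   // copy constructor: b gets its own storage
    b[2] = 53;         // changes only b
    return a[2];
}
```

Calling original_after_copy_is_modified() returns 142, which matches the output seen in the question.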
This isn't so much a problem with vector as it is with pass-by-value and pass-by-reference arguments.
void try_to_modify(vector<int> vec) sends a copy of the original vector to the function, and the function operates on that copy. The vector copies its data whenever a copy is made, i.e. the copy gets a new pointer to new data.
However, if you define your function as void try_to_modify(vector<int>& vec), the function receives the original vector and operates on it directly.
Passing by reference avoids that copy, so it is usually preferable for objects that are expensive to copy, unless you specifically need a copy.
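The three ways of passing a vector can be compared side by side. This is a small sketch; the function names (modify_copy, modify_original, read_only and the two helpers) are invented for the example.

```cpp
#include <cassert>
#include <vector>

// By value: the function receives its own copy of the elements.
void modify_copy(std::vector<int> v) {
    assert(v.size() > 2);
    v[2] = 53;
}

// By reference: the function works on the caller's vector.
void modify_original(std::vector<int>& v) {
    assert(v.size() > 2);
    v[2] = 53;
}

// By const reference: no copy is made and no modification is allowed.
int read_only(const std::vector<int>& v) {
    assert(v.size() > 2);
    return v[2];
}

int value_after_by_value_call() {
    std::vector<int> v(3);
    v[2] = 142;
    modify_copy(v);
    return read_only(v);
}

int value_after_by_reference_call() {
    std::vector<int> v(3);
    v[2] = 142;
    modify_original(v);
    return read_only(v);
}
```

value_after_by_value_call() returns 142 and value_after_by_reference_call() returns 53.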
I have a simple Box container with a naive implementation that takes Car
#include <cstddef>
#include <cstdio>
#include <iostream>
#include <vector>
struct Car {
Car() { puts("def"); }
Car(Car const& other) { puts("copy"); }
Car& operator=(Car const& other) {
puts("assign");
return *this;
}
};
struct Box {
size_t size;
Car* ptr;
Box(size_t size)
: size(size)
, ptr{new Car[size]}
{}
Box(Box const& other)
: size{other.size}
{
ptr = new Car[size];
for (size_t i = 0; i < size; i++)
ptr[i] = other.ptr[i]; // hits operator=
}
};
int main() {
Box b(2);
Box b3 = b;
std::cout << std::endl;
std::vector<Car> v(2);
std::vector<Car> v2 = v;
}
Output:
def
def
def
def
assign
assign
def
def
copy
copy
Copying a std::vector calls the copy constructor, but copying a Box does not. How is std::vector's copy constructor implemented, and what am I doing wrong?
How does std::vector handle memory allocation, and why is the default constructor called only twice for std::vector but four times for Box? Any explanation would help.
new[] combines allocating memory with starting the lifetime of the elements in the array. This can be problematic, as you've seen, because it calls the default constructor of each element.
What std::vector does is use std::allocator (or whatever allocator you provided as the second template argument) to allocate memory then uses placement new to start the lifetime of the array's elements one-by-one. Placement new is a new expression where the developer provides a pointer to where the object should be created, instead of asking new to allocate new storage.
Using this approach, here is a simplified example of your copy constructor :
Box::Box(const Box & other) : size{other.size}
{
// Create storage for an array of `size` instances of `Car`
ptr = std::allocator<Car>{}.allocate(size);
for(std::size_t i = 0; i < size; ++i)
{
// Create a `Car` at the address `ptr + i`
// using the constructor argument `other.ptr[i]`
new (ptr + i) Car (other.ptr[i]);
}
}
With this approach, you can't use delete[] or delete to clean up your Car elements. You need to explicitly perform the previous process in reverse. First, explicitly destroy all the Car objects by calling each of their destructors, then deallocate the storage using the allocator. A simplified destructor would look like :
Box::~Box()
{
for(std::size_t i = 0; i < size; ++i)
{
// Explicitly call the destructor of the `Car`
ptr[i].~Car();
}
// Free the storage that is now unused
std::allocator<Car>().deallocate(ptr, size);
}
The copy assignment operator involves both of these processes: first clean up the previous elements, then copy the new ones.
Here is a very rudimentary implementation for Box : https://godbolt.org/z/9P3sshEKa
It is still missing move semantics, and any kind of exception guarantee. Consider what happens if new (ptr + i) Car (other.ptr[i]); throws an exception. You're on the line for cleaning up all the previously created instances, as well as the storage. For example, if it throws at i == 5 you need to call the destructors of the Car objects 0 through 4, then deallocate the storage.
Overall, std::vector does a lot of heavy lifting for you. It is hard to replicate its functionalities correctly.
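The cleanup described above for a throwing copy can be sketched as follows. Fragile, copy_into_new_storage, destroy_storage and leaked_after_failed_copy are hypothetical names invented for this example; Fragile exists only to force a copy to throw. In real code, std::uninitialized_copy from <memory> performs the same rollback for you.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <stdexcept>

// Hypothetical element type: counts live instances and throws once the
// copy budget is used up.
struct Fragile {
    static int alive;
    static int budget;   // copies allowed before the next one throws
    Fragile() { ++alive; }
    Fragile(const Fragile&) {
        if (budget == 0) throw std::runtime_error("copy failed");
        --budget;
        ++alive;
    }
    ~Fragile() { --alive; }
};
int Fragile::alive = 0;
int Fragile::budget = 1000;

// Copies `count` elements into freshly allocated storage. If a copy throws,
// the elements built so far are destroyed and the storage is released.
template <typename T>
T* copy_into_new_storage(const T* src, std::size_t count) {
    assert(src != nullptr || count == 0);
    std::allocator<T> alloc;
    T* dst = alloc.allocate(count);
    std::size_t built = 0;
    try {
        for (; built < count; ++built)
            ::new (static_cast<void*>(dst + built)) T(src[built]);
    } catch (...) {
        while (built > 0) dst[--built].~T();   // undo in reverse order
        alloc.deallocate(dst, count);
        throw;
    }
    return dst;
}

// Destroys the elements, then frees the storage.
template <typename T>
void destroy_storage(T* p, std::size_t count) {
    for (std::size_t i = count; i > 0; --i) p[i - 1].~T();
    std::allocator<T>().deallocate(p, count);
}

// Returns how many extra Fragile objects are still alive after a copy
// that fails on its third element.
int leaked_after_failed_copy() {
    Fragile source[5];
    const int before = Fragile::alive;
    Fragile::budget = 2;   // two copies succeed, the third throws
    try {
        Fragile* p = copy_into_new_storage(source, 5);
        destroy_storage(p, 5);
    } catch (const std::runtime_error&) {
        // expected: the helper has already cleaned up after itself
    }
    Fragile::budget = 1000;
    return Fragile::alive - before;
}
```

leaked_after_failed_copy() returns 0, meaning the two elements built before the failure were destroyed again.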
std::vector uses an allocator instead of the new operator. The crucial difference is that the new operator constructs every single element in the array, whereas vector allocates raw memory and only constructs elements on demand. You could achieve the same by using operator new (as opposed to the new operator), malloc, some allocator, or other means. You then use placement new to call the constructors. In the destructor, you have to call the destructors of all elements individually and only then free the memory. See:
Difference between 'new operator' and 'operator new'?
malloc
std::allocator
operator new
new expression
Additionally, your Box class needs an operator= since the default one does something wrong. And a destructor. It's leaking memory.
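The difference between the two allocation styles can be made visible by counting constructor calls. This is a sketch only; Counted and the two functions are hypothetical names, and over-aligned element types are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical type that tracks how many instances currently exist.
struct Counted {
    static int constructed;
    Counted() { ++constructed; }
    ~Counted() { --constructed; }
};
int Counted::constructed = 0;

// Allocates room for `capacity` objects but constructs only `used` of them.
// Returns the number of live objects while the storage is in use.
int live_objects_with_raw_storage(std::size_t capacity, std::size_t used) {
    assert(used <= capacity);
    void* raw = ::operator new(capacity * sizeof(Counted));   // no constructors run
    Counted* items = static_cast<Counted*>(raw);
    for (std::size_t i = 0; i < used; ++i)
        ::new (static_cast<void*>(items + i)) Counted();      // construct on demand
    const int live = Counted::constructed;
    for (std::size_t i = used; i > 0; --i)
        items[i - 1].~Counted();                              // destroy explicitly
    ::operator delete(raw);                                   // then free the memory
    return live;
}

// The new[] expression constructs every element immediately.
int live_objects_with_new_array(std::size_t capacity) {
    Counted* items = new Counted[capacity];
    const int live = Counted::constructed;
    delete[] items;
    return live;
}
```

With a capacity of 4 and one element in use, the raw-storage version reports 1 live object, while new Counted[4] reports 4.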
Perhaps a reference counting mechanism is used in the lower layers of the allocation implementation.
I am trying to create my own vector and here is a minimal example to introduce the problem I have:
class DemoVector {
public:
DemoVector() : capacity_(1), size_(0) {
data_ = new int[1];
}
DemoVector(DemoVector&& rhs) {
data_ = std::move(rhs.data_);
size_ = rhs.size_;
capacity_ = rhs.capacity_;
}
~DemoVector() {
delete[] data_;
}
void PushBack(const int &v) {
// doesn't matter
}
private:
int *data_;
size_t capacity_;
size_t size_;
};
Test:
TEST_CASE("Test") {
DemoVector b;
b.PushBack(1);
DemoVector c(std::move(b));
}
I have a problem here, and I understand why: I have two objects that point to the same memory. The second destructor tries to free memory that has already been freed by the first destructor.
But I don't know how to fix it.
Thank you for your help.
std::move(rhs.data_) doesn't actually move anything. std::move is nothing more than a named cast. It produces an rvalue reference that allows move semantics to occur. But for primitive types, it's just a copy operation. The pointer is being copied, and so you end up with two pointers that contain the same address. Since you don't want the source object to still be pointing at the same buffer, simply modify it. That's why move semantics are built around non-const references.
Move constructors are commonplace now, so there's a standard utility (C++14) to help write them in a way that makes code behave more as you'd expect. It's std::exchange. You can simply write
DemoVector(DemoVector&& rhs)
: data_(std::exchange(rhs.data_, nullptr))
, size_(std::exchange(rhs.size_ , 0))
, capacity_(std::exchange(rhs.capacity_ , 0))
{}
And all the values get adjusted properly. std::exchange modifies its first argument to hold the value of the second argument, and then returns the old value of the first argument. It is very handy for shifting values around in one-liner initializations.
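Put together into a compilable whole, the idea looks roughly like this. The accessors owns_buffer() and capacity() and the helper source_is_empty_after_move() are additions made for this sketch so the result can be checked; the initializers are listed in member declaration order to avoid reorder warnings.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

class DemoVector {
public:
    DemoVector() : data_(new int[1]), capacity_(1), size_(0) {}

    // Take over the buffer and leave the source in an empty state.
    DemoVector(DemoVector&& rhs) noexcept
        : data_(std::exchange(rhs.data_, nullptr)),
          capacity_(std::exchange(rhs.capacity_, 0)),
          size_(std::exchange(rhs.size_, 0)) {}

    // Copying is left out of this sketch.
    DemoVector(const DemoVector&) = delete;
    DemoVector& operator=(const DemoVector&) = delete;

    ~DemoVector() { delete[] data_; }   // delete[] on nullptr is a no-op

    bool owns_buffer() const { return data_ != nullptr; }
    std::size_t capacity() const { return capacity_; }

private:
    int* data_;
    std::size_t capacity_;
    std::size_t size_;
};

// Returns true if the moved-from object no longer refers to the buffer.
bool source_is_empty_after_move() {
    DemoVector b;
    DemoVector c(std::move(b));
    assert(c.owns_buffer());
    return !b.owns_buffer() && b.capacity() == 0 && c.capacity() == 1;
}
```

Because the source's pointer is null after the move, only one destructor frees the buffer.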
Because std::move is basically just a cast that doesn't actually move anything! You need to update the values in the other object yourself:
DemoVector(DemoVector&& rhs) {
data_ = rhs.data_;
size_ = rhs.size_;
capacity_ = rhs.capacity_;
rhs.data_ = nullptr;
rhs.size_ = 0;
rhs.capacity_ = 0;
}
Or alternatively to make use of the existing constructor:
DemoVector(DemoVector&& rhs): DemoVector() {
// Or write your own swap function to reuse this elsewhere
std::swap(data_, rhs.data_);
std::swap(size_, rhs.size_);
std::swap(capacity_, rhs.capacity_);
}
It's up to you how you want users of your class to handle moved-from objects. In the second case, and possibly also the first depending on how the rest of your class works, rhs (b in your test case) will be an empty vector.
Recently I got a task in C++: implement a Set class with union, intersection, etc. as overloaded operators. I have a problem with overloading operator+(). I decided to use vectors to take advantage of some functions from the algorithm library. The catch is that I HAD TO pass an array pointer and an array size to the constructor, which complicated the task a bit. The code compiles, but during the "z=a+b" operation I encounter some kind of memory leak. Can anyone explain what I am doing wrong?
class Set {
int number; // array size (can't be changed)
int *elems; // array pointer (same)
public:
Set();
Set(int, int*); // (can't be changed)
~Set();
friend Set operator+(const Set& X,const Set& Y){
std::vector<int> v(X.number+Y.number);
std::vector<int>::iterator it;
it=std::set_union (X.elems, X.elems+X.number, Y.elems, Y.elems+Y.number, v.begin());
v.resize(it-v.begin());
Set Z;
Z.number=v.size();
Z.elems=&v[0];
return Z;
}
};
Set::Set(){};
Set::Set(int n, int* array){
number=n;
elems = array = new int[number];
for(int i=0; i<number; i++) // creating Set
std::cin >> elems[i];
std::sort(elems, elems + number);
}
Set::~Set(){
delete[] elems;
}
int main(){
int* pointer;
Set z;
Set a = Set(5, pointer);
Set b = Set(2, pointer);
z=a+b;
}
I added a copy constructor and copy assignment, changed operator+() as NathanOliver advised, and I now pass a static array to the constructor. I still have a memory leak, and the strange thing is that it occurs even when main contains only a class variable initialization, with or without parameters. Any suggestions? I think the constructor is valid.
Set::Set(int n, int* array){
number = n;
elems = array;
std::sort(elems, elems + number);
}
Set::Set(const Set& s){
number=s.number;
elems=s.elems;
}
Set& operator=(const Set& X){
if(this==&X)
return *this;
delete [] elems;
elems=X.elems;
number=X.number;
return *this;
}
I use gcc (tdm64-2) 4.8.1 compiler.
In
friend Set operator+(const Set& X,const Set& Y){
std::vector<int> v(X.number+Y.number);
std::vector<int>::iterator it;
it=std::set_union (X.elems, X.elems+X.number, Y.elems, Y.elems+Y.number, v.begin());
v.resize(it-v.begin());
Set Z;
Z.number=v.size();
Z.elems=&v[0];
return Z;
}
You create a vector, modify it, and then set elems to point to what the vector contains. The issue is that when the vector is destroyed at the end of the function, the memory it held is released. So you now have a pointer to memory you no longer own, and trying to do anything with it is undefined behavior. What you could do instead is create a new array, copy the elements of the vector into it, and then assign the new array to elems:
Set Z;
Z.number= v.size();
Z.elems= new int[Z.number];
for (int i = 0; i < Z.number; i++)
Z.elems[i] = v[i];
return Z;
Secondly, you need to define a copy constructor and an assignment operator for your class. For that, see: What is The Rule of Three?
The best solution (but I can't tell if you're allowed to do this specifically) is to use vector internally in your Set and assign it from the passed in pointer and length using the two-iterator constructor.
Now if that's not possible you need to properly manage the memory of your class:
You need a copy constructor and copy assignment operator.
In your operator+ you can't create a local vector and then take the address of its memory; that memory is released as soon as the operator returns.
Possibly other things I didn't catch.
When you have z=a+b, the assignment operator for the Set class is used. You didn't define a custom version of this operator, so the default compiler-generated one is used. This compiler-generated assignment operator=() simply does a member-wise copy.
Since you have raw owning pointers in your Set class, this doesn't work correctly: the default compiler-generated operator=() shallow-copies the pointers, whereas you need a deep copy of the data.
An option to fix this is to define your own version of operator=(), paying attention to do proper deep-copy of source data.
Note that in this case you also should define a copy constructor.
But a better option would be to get rid of the owning raw pointer data members, and instead use a RAII building block class, like std::vector.
So, for example, instead of these data members:
int number; // array size (can't be changed)
int *elems; // array pointer (same)
you could just have a single:
std::vector<int> elems;
If you do that, the default compiler-generated operator=() will work just fine, since it will copy the std::vector data member (not the raw owning pointers), and std::vector knows how to properly copy its content without leaking resources.
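Here is a rough sketch of that approach. It keeps the required (size, pointer) constructor signature but assumes the array already holds the values instead of reading them from std::cin; the elements() accessor and the duplicate removal via std::unique are additions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

class Set {
public:
    Set() = default;

    // Copies the caller's array into the vector through the two-iterator
    // constructor, so the Set owns its own storage.
    Set(int n, int* array) : elems_(array, array + n) {
        assert(n >= 0);
        std::sort(elems_.begin(), elems_.end());
        // Assumption for this sketch: a set should not hold duplicates.
        elems_.erase(std::unique(elems_.begin(), elems_.end()), elems_.end());
    }

    // Union of two sets; std::set_union requires both inputs to be sorted.
    friend Set operator+(const Set& x, const Set& y) {
        Set z;
        std::set_union(x.elems_.begin(), x.elems_.end(),
                       y.elems_.begin(), y.elems_.end(),
                       std::back_inserter(z.elems_));
        return z;
    }

    const std::vector<int>& elements() const { return elems_; }

private:
    std::vector<int> elems_;
};
```

No destructor, copy constructor or assignment operator is written here. The compiler-generated ones copy the vector member correctly, so z = a + b neither leaks nor frees memory twice.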
I have created a custom vector class that uses a dynamic array to store data. The overloaded constructor takes a pointer to existing array and size of the array as the arguments.
int a[3] = { 1, 2, 3 };
Vector<int> v(a, 3);
However, when I try to change this vector using the following code, it crashes because the pointer of vector object "v" points to 0xcccccccc instead of the address of the dynamic array
v = Vector<int>(a, 3);
Why does this happen and how could I improve the assignment above?
EDIT: here is the class code:
template <class T> class Vector
{
private:
T* mArray;
int Length;
public:
Vector(){
mArray = 0;
Length = 0;
};
Vector(const Vector& rVectorData){
Length = rVectorData.Length;
T* pArray = new T[Length];
for (int i = 0; i < Length; i++)
pArray[i] = rVectorData.mArray[i];
delete[] mArray;
mArray = pArray;
};
Vector(const T* aArray, int size){
Length = size;
T* pArray = new T[Length];
for (int i = 0; i < Length; i++)
pArray[i] = aArray[i];
delete[] mArray;
mArray = pArray;
};
~Vector(){
delete[] mArray;
mArray = 0;
Length = 0;
};
}
delete[] mArray;
mArray = pArray;
Since this is occurring in your constructor, and you have not initialized mArray to anything (e.g. nullptr), you are attempting to delete some random area of memory (that you did not allocate), which will likely cause your program to crash (it is UB).
You can fix that by removing the delete[] mArray line as the constructor will only be called during construction.
"v" points to 0xcccccccc instead of the address of the array "a"
Since you are allocating memory in the constructor, v will not point to the address of a, as you are copying the values from a into v, which has allocated its own memory.
Additionally, since you did not define a copy-constructor, anytime you attempt to copy your vector, it will do a shallow copy. This will result in a memory corruption problem as whichever variable goes out of scope first will free the memory, leaving a dangling pointer in the other. When the latter finally goes out of scope, it will also result in UB (and likely crash your program).
As Karoly noted, when you implement your own destructor, you should follow the rule of 3.
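A minimal sketch of the rule of three applied to this class follows. It keeps the member names from the question (mArray, Length) but uses the copy-and-swap idiom for assignment, which is one common choice and not the only one; size(), operator[] and sum_after_reassignment() are additions so the result can be checked.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

template <class T>
class Vector {
public:
    Vector() : mArray(nullptr), Length(0) {}

    // Members are initialized directly, so there is nothing to delete here.
    Vector(const T* aArray, int size) : mArray(new T[size]), Length(size) {
        std::copy(aArray, aArray + size, mArray);
    }

    // Copy constructor: deep copy into fresh storage.
    Vector(const Vector& other)
        : mArray(new T[other.Length]), Length(other.Length) {
        std::copy(other.mArray, other.mArray + other.Length, mArray);
    }

    // Copy-and-swap: `other` is already a separate object, so swapping
    // hands our old buffer to it, and its destructor frees that buffer.
    Vector& operator=(Vector other) {
        std::swap(mArray, other.mArray);
        std::swap(Length, other.Length);
        return *this;
    }

    ~Vector() { delete[] mArray; }

    int size() const { return Length; }
    const T& operator[](int i) const {
        assert(i >= 0 && i < Length);
        return mArray[i];
    }

private:
    T* mArray;
    int Length;
};

// Reassigns a vector from a temporary and sums the new contents.
int sum_after_reassignment() {
    int a[3] = {1, 2, 3};
    int b[2] = {10, 20};
    Vector<int> v(a, 3);
    v = Vector<int>(b, 2);   // the assignment that crashed in the question
    int sum = 0;
    for (int i = 0; i < v.size(); ++i) sum += v[i];
    return sum;
}
```

sum_after_reassignment() returns 30, and each buffer is freed exactly once.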
It seems to me that if you're going to write your own vector class (or your own semi-duplicate of almost anything that's already widely available), you should try not only to make it correct but also to add something new to the mix, so your code isn't just a mediocre imitation of what's already easily available (and preferably isn't a huge step backwards in any direction either).
For example, if I were going to support initializing a Vector from an array, I'd add a template member function that deduced the size of the array automatically:
template <size_t N>
Vector(T (&array)[N]) : data(new T[N]), size(N) {
std::copy_n(array, N, data);
}
This allows something like:
int a[]={1, 2, 3};
Vector<int> x(a);
...so you don't have to specify the size.
You've already heard about the rule of three. To avoid a step back from std::vector, you almost certainly want to update that to the rule of 5 (or else use a smarter pointer class that lets you follow the rule of zero).
The straightforward way to do that is to implement a move ctor and move assignment operator:
Vector &operator=(Vector &&src) {
if (this != &src) { // guard against self-move, which would delete the live buffer
delete[] data;
data=src.data;
size=src.size;
src.data=nullptr;
src.size = 0;
}
return *this;
}
Vector(Vector &&src): data(src.data), size(src.size) {
src.data=nullptr;
src.size=0;
}
For convenience, you also almost certainly also want to include a ctor that takes an initializer list:
Vector(std::initializer_list<T> const &i) : data(new T[i.size()]), size(i.size())
{
std::copy(i.begin(), i.end(), data);
}
Finally, you just about need (or at least really want) to support an iterator interface to the contained data:
class iterator {
T *pos;
friend class Vector;
iterator(T *init): pos(init) {}
public:
iterator &operator++() { ++pos; return *this; }
iterator &operator--() { --pos; return *this; }
iterator operator++(int) { iterator tmp(*this); ++pos; return tmp; } // postfix returns the old value by copy
iterator operator--(int) { iterator tmp(*this); --pos; return tmp; }
T &operator*() { return *pos; }
bool operator!=(iterator const &other) const { return pos!=other.pos; }
};
iterator begin() { return iterator(data); }
iterator end() { return iterator(data+size); }
...then you want to add const_iterator, reverse_iterator and const_reverse_iterator classes, and with them cbegin/cend, rbegin/rend and crbegin/crend to support constant and/or reversed iteration of the data.
Note, however, that most of this is just duplicating what std::vector already provides. The only new thing we've added here is the ctor that takes an array and deduces its size automatically. At the same time, that is enough to provide a fixed-size array wrapper that (other than dynamic sizing) has approximate parity with std::vector.
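For the rule-of-zero route mentioned earlier, a sketch along these lines may help. FixedArray is a hypothetical name, and std::unique_ptr<T[]> is used here as the "smarter pointer"; this is one possible design, not the only one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

template <typename T>
class FixedArray {
public:
    // Deduces the array size, as in the constructor shown above.
    template <std::size_t N>
    explicit FixedArray(const T (&array)[N]) : data_(new T[N]), size_(N) {
        std::copy_n(array, N, data_.get());
    }

    std::size_t size() const { return size_; }

    const T& operator[](std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

    // No destructor, copy or move operations are declared: unique_ptr
    // frees the buffer, makes the class movable, and disables copying.

private:
    std::unique_ptr<T[]> data_;
    std::size_t size_ = 0;
};
```

One caveat of this sketch: the compiler-generated move leaves the source's size_ unchanged while its pointer becomes null, so a moved-from FixedArray should only be destroyed or assigned to.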
How is push_back of std::vector implemented so that it can copy any data type, e.g. a pointer, a double pointer, and so on?
I'm implementing a template class with a push_back function very similar to vector's. Within this method, a copy of the argument should be inserted into internally allocated memory.
If the argument is a pointer or a chain of pointers (an object pointer), the copy should be made of the actual data pointed to. [updated as per comment]
Can you please tell me how to create a copy from a pointer, so that if I delete the pointer in the caller, the copy still exists in my template class?
Code base is as follows:
template<typename T>
class Vector
{
public:
void push_back(const T& val_in)
{
T a (val_in); // It copies pointer, NOT data.
m_pData[SIZE++] = a;
}
}
Caller:
// Initialize my custom Vector class.
Vector<MyClass*> v(3);
MyClass* a = new MyClass();
a->a = 0;
a->b = .5;
// push MyClass object pointer
// now push_back method should create a copy of data
// pointed by 'a' and insert it to internal allocated memory.
// 'a' can be a chain of pointers also.
// how to achieve this functionality?
v.push_back(a);
delete a;
I can simply use the STL vector to accomplish this, but for experimental purposes I'm writing a template class that does exactly the same.
Thanks.
If you have a polymorphic object (the pointed-to object may be more specialized than the variable's type), I suggest creating a virtual method called clone() that allocates a new object holding a copy of yours:
Base* A::clone() {
A* toReturn = new A();
//copy stuff
return toReturn;
}
If you can't modify your Base class, you can use RTTI, but I won't cover that approach in this answer. (If you want more details on it, please ask a question about polymorphic cloning with RTTI.)
If your object is not polymorphic, you can allocate a new object by calling the copy constructor.
void YourVector::push_back(Base* obj) {
Base* copy = new Base(*obj); // dereference: the copy constructor takes an object, not a pointer
}
But it smells like what you really need is shared_ptr, available in <tr1/memory> (or <memory> if you use C++0x).
Update based on comments
You may also have a two template parameters list:
template <typename T>
struct CopyConstructorCloner {
T* operator()(const T& t) {
return new T(t);
}
};
template <typename T, typename CLONER=CopyConstructorCloner<T> >
class MyList {
CLONER cloneObj;
public:
// ...
void push_back(const T& t) {
T* newElement = cloneObj(t);
// save newElement somewhere, don't forget to delete it later
}
};
With this approach it is possible to define new cloning policies for things like pointers.
Still, I recommend using shared_ptr.
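The clone() idea from the start of this answer could look like the following. Shape, Circle, ShapeList and name_after_caller_deletes are hypothetical names, and the sketch returns std::unique_ptr instead of a raw pointer so that the container cannot forget to delete its copies.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

struct Shape {
    virtual ~Shape() = default;
    // Each derived class returns a copy of its own dynamic type.
    virtual std::unique_ptr<Shape> clone() const = 0;
    virtual std::string name() const = 0;
};

struct Circle : Shape {
    std::unique_ptr<Shape> clone() const override {
        return std::unique_ptr<Shape>(new Circle(*this));
    }
    std::string name() const override { return "circle"; }
};

class ShapeList {
public:
    // Stores a clone, so the caller keeps ownership of its own object.
    void push_back(const Shape* s) {
        assert(s != nullptr);
        items_.push_back(s->clone());
    }
    const Shape& at(std::size_t i) const { return *items_.at(i); }

private:
    std::vector<std::unique_ptr<Shape>> items_;
};

// The stored copy survives deletion of the caller's object.
std::string name_after_caller_deletes() {
    ShapeList list;
    Shape* c = new Circle();
    list.push_back(c);
    delete c;
    return list.at(0).name();
}
```

name_after_caller_deletes() returns "circle" even though the caller's pointer has already been deleted.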
I think for this kind of problem it is better to use smart pointers, e.g. boost::shared_ptr or any equivalent implementation.
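As a small illustration, here is the same scenario with std::shared_ptr from <memory> (the standard counterpart of boost::shared_ptr since C++11). MyClass mirrors the fields used in the question; value_after_caller_releases is a made-up name. Note that this shares one object between caller and container rather than copying it.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct MyClass {
    int a = 0;
    double b = 0.0;
};

// The object stays alive as long as the vector holds a reference to it.
double value_after_caller_releases() {
    std::vector<std::shared_ptr<MyClass>> v;
    auto p = std::make_shared<MyClass>();
    p->a = 0;
    p->b = 0.5;
    v.push_back(p);   // the vector now shares ownership
    p.reset();        // the caller gives up its reference
    assert(v.front().use_count() == 1);
    return v.front()->b;
}
```

value_after_caller_releases() returns 0.5. The object is destroyed only when the vector's element goes away.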
There is no need to call new for the given datatype T. The push_back implementation should (must) call the copy constructor or the assignment operator. The memory to hold the pushed elements should already have been allocated. The initial memory allocation should not call the constructor of type T. Something like:
T* pArray;
pArray = (T*) new BYTE[sizeof(T) * INITIAL_SIZE];
And then construct each new object in pArray with placement new and the copy constructor. Plain assignment is only valid for a slot that already holds a constructed object.
One solution is to copy-construct the object yourself:
MyClass *p = new MyClass();
MyVector<MyClass*> v;
v.push_back(new MyClass(*p));
Update: Based on your updated question, you can definitely overload push_back
template<typename T>
class MyVector {
public:
void push_back (T obj); // general push_back
template<typename TYPE> // T can already be a pointer, so declare TYPE again
void push_back (TYPE *pFrom)
{
TYPE *pNew = new TYPE(*pFrom);
// use pNew in your logic...
}
};
Something like this:
template<typename T>
class MyVector
{
T* data; // Pointer to internal memory
size_t count; // Number of items of T stored in data
size_t allocated; // Total space that is available in data
// (available space is => allocated - count)
void push_back(std::auto_ptr<T> item) // Use auto pointer to indicate transfer of ownership
/*void push_back(T* item) The dangerous version of the interface */
{
if ((allocated - count) == 0)
{ reallocateSomeMemory();
}
T* dest = &data[count]; // location to store item
new (dest) T(*item); // Use placement new and copy constructor.
++count;
}
// All the other stuff you will need.
};
Edit based on comments:
To call it you need to do this:
MyVector<Plop> data;
std::auto_ptr<Plop> item(new Plop()); // ALWAYS put dynamically allocated objects
// into a smart pointer. Not doing this is bad
// practice.
data.push_back(item);
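I use auto_ptr because RAW pointers are bad (i.e. in real C++ code, unlike C, you rarely see pointers; they are hidden inside smart pointers). Note that auto_ptr was deprecated in C++11 and removed in C++17; std::unique_ptr is its replacement.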
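The reallocateSomeMemory() step is left open above, so here is one way the whole push_back path could be filled in. This is a sketch under stated assumptions: GrowableArray, grow() and destroy_all() are hypothetical names, the capacity simply doubles, push_back takes a const reference instead of a smart pointer, and there is no exception guarantee and no support for over-aligned types.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>

template <typename T>
class GrowableArray {
public:
    GrowableArray() = default;
    GrowableArray(const GrowableArray&) = delete;              // copying omitted
    GrowableArray& operator=(const GrowableArray&) = delete;

    ~GrowableArray() {
        destroy_all();
        ::operator delete(data_);
    }

    void push_back(const T& item) {
        if (count_ == allocated_) {
            T saved(item);   // item may refer to an element of this container
            grow();
            ::new (static_cast<void*>(data_ + count_)) T(std::move(saved));
        } else {
            ::new (static_cast<void*>(data_ + count_)) T(item);
        }
        ++count_;
    }

    std::size_t size() const { return count_; }

    const T& operator[](std::size_t i) const {
        assert(i < count_);
        return data_[i];
    }

private:
    // Destroys every element and leaves count_ at zero.
    void destroy_all() {
        while (count_ > 0) data_[--count_].~T();
    }

    // Moves the elements into a raw block twice as large.
    void grow() {
        const std::size_t new_cap = allocated_ == 0 ? 1 : allocated_ * 2;
        T* fresh = static_cast<T*>(::operator new(new_cap * sizeof(T)));
        const std::size_t n = count_;
        for (std::size_t i = 0; i < n; ++i)
            ::new (static_cast<void*>(fresh + i)) T(std::move(data_[i]));
        destroy_all();                 // run destructors of the moved-from elements
        ::operator delete(data_);      // release the old raw block
        data_ = fresh;
        count_ = n;
        allocated_ = new_cap;
    }

    T* data_ = nullptr;
    std::size_t count_ = 0;
    std::size_t allocated_ = 0;
};
```

The temporary saved in push_back covers calls such as v.push_back(v[0]), where growing the buffer would otherwise invalidate the reference before it is copied.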