Pointers arguments in copy constructor - c++

I have a problem where I want to clone the object behind a pointer when doing a deep copy.
For example, I have T* t1 and I want to create a new object pointer T* t2 such that t1->x == t2->x.
Is it a good idea to write a copy constructor that works like this:
T(const T* cpy)
{
m_var = (*cpy).m_var;
}
T* t1 = new T;
T* t2(t1);
What should I take care of if I use the above approach?
Thanks
Ruchi

To do this you should write a normal copy-constructor and use it like this:
T(const T& cpy)
: m_var(cpy.m_var) // prefer initialization-list, thanks to @Loki Astari
{}
T* t1 = new T;
T* t2 = new T(*t1);
In the code you show, T* t2(t1); would never call the constructor you have declared (which, by the way, is not a copy-constructor), because it simply initializes the pointer t2 to the value of the pointer t1, making both point to the same object.
As @Nawaz notes, this copy-constructor is equivalent to the one generated by the compiler, so you don't actually need to write it. In fact, unless you have any manually managed resources (which, usually, you shouldn't) you will always be fine with the compiler-generated copy-constructor.
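To make the answer above concrete, here is a minimal sketch of cloning through a pointer with the compiler-generated copy constructor. The int type of m_var and the helper name clone are assumptions for illustration; the question does not say what T contains.

```cpp
#include <cassert>

// Stand-in for the question's T; m_var is assumed to be an int here.
struct T {
    int m_var = 0;
    // No user-declared copy constructor: the compiler-generated one copies m_var.
};

// Hypothetical helper: returns a new heap object copied from *src.
// The caller owns the result and must delete it.
T* clone(const T* src) {
    assert(src != nullptr);   // dereferencing a null pointer would be undefined
    return new T(*src);       // calls the implicit copy constructor T(const T&)
}
```

Calling clone(t1) yields a second, independent object, whereas T* t2(t1); only copies the pointer value.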

The definition of a copy constructor requires a reference and is thus:
T(T const& copy) // This defines a copy constructor.
: m_var(copy.m_var) // Prefer to use the initializer list.
{}
So you need to pass a reference.
If you want to copy a pointer the usage is then:
T* t2 = new T(*t1);

This does not do what you think:
T* t2(t1);
since you are only declaring a pointer, not an object. The pointer is initialised to the value of the other pointer. It should be:
T* t2 = new T (t1);
to create a new object.
As for the copy, you're currently doing a shallow copy as you are only copying the pointer value, not the data the pointer points at. Doing a shallow copy causes problems when the original or the copy is destroyed - if the m_var is deleted, the other object then has a pointer to deleted memory, invoking Undefined Behaviour™ if it is dereferenced. A deep copy fixes this:
T(const T* cpy)
{
m_var = new VarType (cpy->m_var); // VarType being whatever m_var is
}
This now requires a copy constructor for the type of m_var, which must also be deep to prevent the deletion problem above.
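As a rough sketch of what a deep-copying class could look like when written with a conventional reference-taking copy constructor: the class name Holder and the choice of std::string for VarType are assumptions for illustration only. The copy assignment operator and destructor are included because a class that deep-copies in its constructor generally needs all three.

```cpp
#include <cassert>
#include <string>

using VarType = std::string;  // assumption: any copyable type would do

class Holder {                // hypothetical name, not from the question
public:
    explicit Holder(const VarType& v) : m_var(new VarType(v)) {}

    // Deep copy: allocate a fresh VarType instead of sharing the pointer.
    Holder(const Holder& other) : m_var(new VarType(*other.m_var)) {}

    // Assignment must be deep too, otherwise two objects share one pointee.
    Holder& operator=(const Holder& other) {
        if (this != &other) {
            VarType* fresh = new VarType(*other.m_var);  // allocate first
            delete m_var;                                // then release the old one
            m_var = fresh;
        }
        return *this;
    }

    ~Holder() { delete m_var; }

    VarType& value() {
        assert(m_var != nullptr);
        return *m_var;
    }

private:
    VarType* m_var;
};
```

Allocating the new value before deleting the old one means the object is left unchanged if the allocation throws.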
The downside to deep copying the data is that it increases the memory required and takes time to allocate memory and copy the data. This can be mitigated using reference-counted objects. These come in a few flavours, smart pointers being the most common. Here, the same underlying object is referenced by all copies of the parent object. When a parent is destroyed, its smart pointer's destructor only destroys the underlying object once all references to it are gone.
The downside to shared smart pointers is that changing the data through one owning object modifies the data that all owning objects see. To get the best of both worlds you'd want a 'copy on modify' (copy-on-write) system. This only increases memory use when the underlying data is modified by an owning object.
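The copy-on-modify idea can be sketched with std::shared_ptr. The class name CowBuffer and its interface are made up for this example, and the use_count() check assumes single-threaded use; it is not a thread-safe design.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

class CowBuffer {   // hypothetical name
public:
    explicit CowBuffer(std::size_t n)
        : data_(std::make_shared<std::vector<int>>(n)) {}

    // Reads never copy; all copies of the buffer share one vector.
    int read(std::size_t i) const {
        assert(i < data_->size());
        return (*data_)[i];
    }

    // Writes detach first if the vector is shared with another CowBuffer.
    void write(std::size_t i, int v) {
        assert(i < data_->size());
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::vector<int>>(*data_);  // private copy
        }
        (*data_)[i] = v;
    }

    bool shares_with(const CowBuffer& other) const { return data_ == other.data_; }

private:
    std::shared_ptr<std::vector<int>> data_;
};
```

Copying a CowBuffer is then just a reference-count increment, and the cost of the deep copy is paid only by the first writer.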

Related

Why does the impl of shared_ptr ref count hold a pointer to the actual pointed type?

This was motivated by an interview question:
shared_ptr<void> p(new Foo());
Will the destructor of Foo get called once p goes out of scope?
It turns out it does. I had to look at the implementation of shared_ptr in GCC [1], and found out that apparently the control block holds a pointer to the actual type (Foo) and a pointer to the destructor that gets invoked when the ref count reaches 0.
[1]: Sorry, I am on my phone and cannot copy the link to the impl.
But I am still wondering: why? Why is it needed? Is there anything I am missing from the standard?
On the other hand, the line above doesn't compile with unique_ptr because obviously there's no ref count in that case.
A std::shared_ptr<T> instance itself must keep track of the pointer to return when .get() is called. This is always of type T*, except when T is an array, in which case it is of type std::remove_extent_t<T>* (for example, std::shared_ptr<int[]>::get() returns int*).
Also, when a std::shared_ptr<T> is destroyed, it has to check whether it is the last std::shared_ptr instance referring to its control block. If so, it must execute the deleter. In order for this to work, the control block must keep track of the pointer to pass to the deleter. It is not necessarily of the type T* or std::remove_extent_t<T>*.
The reason why these are not the same is that, for example, code like the following should work:
struct S {
int member;
int other_member;
~S();
};
void foo(std::shared_ptr<int>);
int main() {
std::shared_ptr<S> sp = std::make_shared<S>();
std::shared_ptr<int> ip(sp, &sp->member);
foo(std::move(ip));
}
Here, sp owns an object of type S and also points to the same object. The function foo takes a std::shared_ptr<int> because it is part of some API that needs an int object that will remain alive for as long as the API isn't done with it (but the caller can also keep it alive for longer, if they want). The foo API doesn't care whether the int that you give it is part of some larger object; it just cares that the int will not be destroyed while it is holding on to it. So, we can create a std::shared_ptr<int> named ip, which points to sp->member, and pass that to foo. Now, this int object can only survive as long as the enclosing S object is alive. It follows that ip must, as long as it is alive, keep the entire S object alive. We could now call sp.reset() but the S object must remain alive, since there is still a shared_ptr referring to it. Finally, when ip is destroyed, it must destroy the entire S object, not just the int that it, itself, points to. Thus, it is not enough for the std::shared_ptr<int> instance ip to store an int* (which will be returned when .get() is called); the control block that it points to also has to store the S* to pass to the deleter.
For the same reason, your code will call the Foo destructor even though it is a std::shared_ptr<void> that is carrying out the destruction.
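The lifetime behaviour described above can be observed directly. In this sketch, Tracked is a hypothetical stand-in for S whose destructor sets a flag, so the function can report whether the enclosing object survived sp.reset() and died only when the aliasing pointer was released.

```cpp
#include <cassert>
#include <memory>

struct Tracked {               // hypothetical type standing in for S
    int member = 0;
    bool* destroyed;           // set to true by the destructor
    explicit Tracked(bool* flag) : destroyed(flag) {}
    ~Tracked() { *destroyed = true; }
};

// Returns true if the Tracked object outlived the reset of its original
// owner and was destroyed only when the aliasing shared_ptr<int> let go.
bool alias_keeps_owner_alive() {
    bool destroyed = false;
    auto sp = std::make_shared<Tracked>(&destroyed);
    std::shared_ptr<int> ip(sp, &sp->member);   // aliasing constructor
    assert(ip.get() == &sp->member);            // stored pointer is the int

    sp.reset();                                 // ip still shares ownership
    bool alive_after_reset = !destroyed;

    ip.reset();                                 // last owner gone: ~Tracked runs
    return alive_after_reset && destroyed;
}
```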
You asked: "Is there anything I am missing from the standard?" By this I assume you are asking whether the standard requires this behaviour and if so, where in the standard is it specified? The answer is yes. The standard specifies that a std::shared_ptr<T> stores a pointer and may also own a pointer; these two pointers need not be the same. In particular, [util.smartptr.shared.const]/14 describes constructors that "[construct] a shared_ptr instance that stores p and shares ownership with the initial value of r" (emphasis mine). The shared_ptr instance thus created may own a pointer that is different from the one it stores. However, when it is destroyed, [util.smartptr.shared.dest]/1 applies: if this is the last instance, the owned pointer is deleted (not the stored one).
I assume that for this code the answer is trivial:
shared_ptr<Foo> p(new Foo());
Every call to new must be balanced by a call to delete. Every constructed object must be destructed too. So if
shared_ptr<void> p(new Foo());
would not call ~Foo() that would be rather surprising and would cause resource leaks, dangling pointers or any number of UB caused by the destructor not being called.
For me the bigger question is: Why does that compile at all? The shared_ptr has the wrong type so it shouldn't be able to call the right destructor and that should not compile (like unique_ptr fails).
The reason for that is this little bit of genius I believe:
template< class Y > shared_ptr( const shared_ptr<Y>& r, element_type* ptr ) noexcept;
template< class Y > shared_ptr( shared_ptr<Y>&& r, element_type* ptr ) noexcept;
You can create a shared pointer pointing at a member of a larger object, which will keep the larger object alive as long as the pointer to the member exists.
For this feature to work the shared_ptr and the control block of the shared_ptr both have a pointer and they can have different types. The control block always points to the object while the shared_ptr points to the member. When you normally create a shared_ptr they happen to be the same type and point to the same address. But apparently that isn't always the case.
This also allows making a shared_ptr<void> with the control block pointing at a Foo. Here both point to the same address but have different types. The control block knows the type of the original object and what destructor to call in the end.
How does that work? The shared_ptr and control block can have different types of pointers and that allows for these copy constructors:
template< class Y > shared_ptr( const shared_ptr<Y>& r ) noexcept;
template< class Y > shared_ptr( shared_ptr<Y>&& r ) noexcept;
As long as Y* is convertible / compatible with T* you can change the type of the shared_ptr during copy construction. The given code actually turns into this:
shared_ptr<void> p(shared_ptr<Foo>(new Foo()));
It creates a temporary shared_ptr<Foo> with the control block having a Foo* and then p reuses the same control block.
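A small check of the claim that ~Foo() runs even though the pointer is held as shared_ptr<void>. The static counter is an addition for the sake of the demonstration; the Foo in the question has no members shown.

```cpp
#include <cassert>
#include <memory>

struct Foo {
    static int destroyed;          // hypothetical counter, for observation only
    ~Foo() { ++destroyed; }
};
int Foo::destroyed = 0;

// Returns how many Foo destructors ran while a shared_ptr<void> owned a Foo.
int destroy_through_void() {
    int before = Foo::destroyed;
    {
        std::shared_ptr<void> p(new Foo());   // control block remembers Foo*
        assert(p.get() != nullptr);
    }                                         // p goes out of scope here
    return Foo::destroyed - before;
}
```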
std::shared_ptr::shared_ptr - cppreference.com
constexpr shared_ptr() noexcept; (1)
constexpr shared_ptr( std::nullptr_t ) noexcept; (2)
template< class Y > explicit shared_ptr( Y* ptr ); (3)
...
3-7) Constructs a shared_ptr with ptr as the pointer to the managed object.
For (3-4,6), Y* must be convertible to T*. (until C++17)
If T is an array type U[N], (3-4,6) do not participate in overload resolution if Y(*)[N] is not convertible to T*. If T is an array type U[], (3-4,6) do not participate in overload resolution if Y(*)[] is not convertible to T*. Otherwise, (3-4,6) do not participate in overload resolution if Y* is not convertible to T*. (since C++17)
Additionally:
Uses the delete-expression delete ptr if T is not an array type; delete[] ptr if T is an array type (since C++17) as the deleter. Y must be a complete type. The delete expression must be well-formed, have well-defined behavior and not throw any exceptions. This constructor additionally does not participate in overload resolution if the delete expression is not well-formed. (since C++17)
So basically the third form is used.
The data block holding the reference counters (strong and weak) also holds information about how to destroy the object. Form (3) of the constructor captures this information.
Note that std::unique_ptr by default does not hold such information, so it fails in this scenario (it does not compile).
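If you do want something similar with std::unique_ptr, one option is to supply the missing information yourself through a custom deleter. This is only a sketch: the alias VoidPtr, the helper make_erased and the type Bar are names invented here, not part of the standard library.

```cpp
#include <cassert>
#include <memory>

struct Bar {                        // hypothetical type with an observable destructor
    static int destroyed;
    ~Bar() { ++destroyed; }
};
int Bar::destroyed = 0;

// unique_ptr<void> whose deleter is a plain function pointer.
using VoidPtr = std::unique_ptr<void, void (*)(void*)>;

// Creates a T and remembers, via the deleter, how to destroy it later.
template <class T>
VoidPtr make_erased() {
    return VoidPtr(new T(), [](void* p) { delete static_cast<T*>(p); });
}

// Returns how many Bar destructors ran while a VoidPtr owned a Bar.
int destroy_erased() {
    int before = Bar::destroyed;
    {
        VoidPtr p = make_erased<Bar>();
        assert(p != nullptr);
    }
    return Bar::destroyed - before;
}
```

The price is that the deleter is now part of the pointer's type and size, which is roughly what shared_ptr hides inside its control block.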

"moving" from the stack to the heap?

I need to write a C wrapper around a C++ lib and I need objects allocated on the heap.
Functions and methods of the C++ lib use and return objects allocated on the stack.
I know, I can "transfer" an object from the stack to the heap via copy i.e. auto heapObj = new Foo(stackObj); but would like to avoid copy and try to move instead if I can.
This seems to "work" (to my surprise). Is there a copy happening behind the scenes? If not, is this pattern safe to use?
main.h
#include <cstddef> // size_t
#include <vector>
class Foo {
public:
std::vector<int> big;
explicit Foo(size_t len);
Foo(Foo&& other) noexcept;
// remove copy constructor
Foo(const Foo &) = delete;
// delete assignment operator
Foo &operator=(const Foo &) = delete;
size_t size();
};
main.cpp
#include <iostream>
#include <utility> // std::move
#include "main.h"
Foo::Foo(size_t len) : big(len) {}
Foo::Foo(Foo&& other) noexcept : big(std::move(other.big)) {}
size_t Foo::size() { return this->big.size(); }
int main() {
Foo ms(1000); // on the stack
ms.big[0] = 42;
auto mh = new Foo(std::move(ms)); // on the heap (no copy?)
std::cout << mh->size() << ", " << mh->big[0] << std::endl;
delete mh;
}
First of all, moving an int or a pointer is equivalent to a copy. That is, if you had a
struct X {
int a, b;
int* data;
};
then moving it is not going to be cheaper than copying it (ignoring ownership of data for now). Coincidentally, the above is basically what std::vector looks like from far away: A size and capacity member plus some pointer to a chunk of memory.
The important thing about moving vs copying is what happens in regards to ownership of resources. std::vector has ownership of some heap memory (data). If you copy a std::vector, that heap memory must be copied, so that both the original and the copy can have ownership of their own data. But if you move it, then only the moved-to vector needs to retain ownership, so the data pointer can be handed from one to the other (instead of all the data), because the ownership can be "stolen" from the moved-from object.
This is why there is no conflict in "moving" your object from the stack to the heap: The object itself is still basically copied from one place to the other, but the resources it (or its subobjects, like big) owns are not copied but moved ("stolen").
Any time a move "actually happens", it's because there is some indirect resource within the moved thing. Some resource that's referred to by a handle that can be cheaply swapped (a pointer copied over, for example; the pointee remains where it was). This is generally accomplished via pointers to dynamically-allocated things, such as the data stored within a vector.
The stuff you're trying to move is a vector. As such, it is already dynamically-allocated and the move is easy. It doesn't really matter where the actual std::vector object lives, nor where the Foo lives — if there's an indirect resource, a move is probably possible.
In other cases, a move constructor or move assignment will actually just trigger a copy of whatever data is inside. When everything (recursively) in the "thing" has automatic storage duration, you can pretty much guarantee that a copy will be required. But that's not the case here.
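One way to convince yourself that only the handle moves is to compare the vector's data() pointer before and after. A minimal sketch, assuming a type shaped like the question's Foo (renamed Payload here to keep the example self-contained) and C++14 for std::make_unique:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Payload {    // hypothetical stand-in for the question's Foo
    std::vector<int> big;
    explicit Payload(std::size_t len) : big(len) {}
    Payload(Payload&&) noexcept = default;
    Payload(const Payload&) = delete;   // copying is not allowed at all
};

// Returns true if the heap object took over the stack object's element buffer.
bool move_reuses_buffer() {
    Payload on_stack(1000);
    on_stack.big[0] = 42;
    const int* before = on_stack.big.data();

    // The Payload object itself is newly created on the heap, but its
    // vector steals the element buffer instead of copying 1000 ints.
    auto on_heap = std::make_unique<Payload>(std::move(on_stack));
    assert(on_heap->big[0] == 42);
    return on_heap->big.data() == before;
}
```

Using std::unique_ptr instead of a raw new also removes the need for the manual delete in the question's main.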

Do constructor object assignments leak memory

Say I have a simple class like this:
class MyObj
{
char* myPtr;
public:
MyObj()
{
myPtr = static_cast<char*>(malloc(30)); // C++ needs an explicit cast from void*
}
~MyObj()
{
free(myPtr);
}
};
class TestObject
{
MyObj _myObj;
public:
TestObject(MyObj myObj)
{
_myObj = myObj;
}
};
Does this leak memory? My reasoning is that there is already an instance of MyObj contained in the TestObject by the time the constructor runs, so doesn't that blow away the myPtr before the memory can be freed? Does assigning to a local object call the destructor of the object being replaced? Does the compiler optimize away the assignment of an object instance variable if it is directly assigned in the constructor? I'm coming from C# where an object doesn't get automatically initialized just by declaring a reference type variable, so this is kind of confusing me.
Thanks!
Does this leak memory?
Yes. The assignment of myObj will invoke the default copy-assignment operator, as no override was provided by you. As a result, a member-by-member copy is performed, and the myPtr instance of the assignment target is overwritten with the myPtr instance from the assignment source. This introduces two problems, frequently encountered when violating one or more parts of the Rule of Three/Five/Zero:
You lose the original myPtr content from the target of the assignment. Thus, the original memory uniquely referred to by that pointer is leaked.
You now share the same pointer value in two myPtr members: both the source and the target of the assignment operation.
The latter is especially troubling, as myObj is leaving scope immediately after the assignment is complete within the TestObject constructor. In doing so, myObj will be destroyed, and with that, its myPtr freed. Further, myObj was passed in to that constructor by value, not reference, so an implicit copy is already likely to have happened (short of elided copy due to rvalue move semantics). Therefore, three MyObj objects may well be holding a myPtr that all reference the same memory, and as soon as one releases it, the rest are unknowingly holding dangling pointers. Any dereference or free-ing of those pointers will invoke undefined behavior.
Does assigning to a local object call the destructor of the object being replaced?
Destructors are only invoked to live up to their namesake, i.e., they're only invoked when an object is being destroyed (manual invocation of destructors for placement-new semantics notwithstanding). Copy-assignment doesn't do that unless temporaries are introduced, and that isn't the case in your code.
Does the compiler optimize away the assignment of an object instance variable if it is directly assigned in the constructor?
No, but a member initialization list can assist in that regard.
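To see the difference between assigning in the constructor body and using a member initialization list, you can instrument the member type. Counter, ByAssignment and ByInitList are names invented for this sketch.

```cpp
#include <cassert>

struct Counter {           // hypothetical instrumented member type
    static int defaults, copies, assigns;
    Counter() { ++defaults; }
    Counter(const Counter&) { ++copies; }
    Counter& operator=(const Counter&) { ++assigns; return *this; }
};
int Counter::defaults = 0;
int Counter::copies = 0;
int Counter::assigns = 0;

struct ByAssignment {
    Counter c;
    // c is default-constructed before the body runs, then assigned.
    explicit ByAssignment(const Counter& src) { c = src; }
};

struct ByInitList {
    Counter c;
    // c is copy-constructed directly; no default construction, no assignment.
    explicit ByInitList(const Counter& src) : c(src) {}
};

void reset_counts() { Counter::defaults = Counter::copies = Counter::assigns = 0; }

// Runs both variants and checks the counts described in the comments above.
void compare_constructors() {
    Counter src;
    reset_counts();
    ByAssignment a(src);
    assert(Counter::defaults == 1 && Counter::assigns == 1 && Counter::copies == 0);
    reset_counts();
    ByInitList b(src);
    assert(Counter::defaults == 0 && Counter::copies == 1 && Counter::assigns == 0);
    (void)a;
    (void)b;
}
```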
Modern C++ programming techniques frequently use RAII to accomplish what you seem to be trying in a number of ways, depending on the goal you're really trying to achieve.
Unique Data Per Instance
If the goal is unique dynamic data per instance, you can accomplish this easily with either std::vector<char>, or simply std::string, depending on the underlying need. Both are RAII data types and are usually ample for dynamic memory management needs.
class MyObj
{
std::vector<char> myData;
public:
MyObj() : myData(30)
{
}
};
class TestObject
{
MyObj _myObj;
public:
TestObject(MyObj myObj)
: _myObj(std::move(myObj))
{
}
};
This eliminates the need for a destructor in MyObj, and utilizes move semantics as well as the aforementioned member initialization list in the TestObject constructor. All instances of MyObj will hold a distinct vector of char. All assignment operations for MyObj and TestObject work with default implementations.
Assignments Share Memory
Unlikely you desire this, but it is none-the-less feasible:
class MyObj
{
std::shared_ptr<char> myPtr;
public:
MyObj() : myPtr(new char[30], std::default_delete<char[]>()) // array deleter, so delete[] is used
{
}
};
class TestObject
{
MyObj _myObj;
public:
TestObject(MyObj myObj)
: _myObj(std::move(myObj))
{
}
};
Similar code, but different member type. Now myPtr is a shared pointer to an array of char. Any assignment to a different myPtr joins the share list. In short, assignment means both objects reference the same data, and reference-counting ensures the last-man-standing sweeps up the mess.
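Note: Constructing a shared pointer from a raw new expression allocates the control block separately. If that second allocation throws, the shared_ptr constructor runs the deleter on the pointer it was given, so this particular line does not leak, but std::make_shared avoids the separate allocation altogether. Array support arrived in stages: std::shared_ptr<T[]> in C++17 and std::make_shared for arrays in C++20.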
These are just a few ways of doing what you may be trying to accomplish. I encourage you to read about the Rule of Three/Five/Zero and about RAII concepts both at the links provided and on this site. There are plenty of examples that will likely answer further questions you may have.

Calling an object's destructor in its assignment operator method

In my assignment operator method I first destroy any resources that the object manages, and then assign, so:
struct Animal
{
int aNumber;
int * buffer;
Animal() { buffer = new int[128]; }
Animal& operator= (Animal& other)
{
if (this != &other){
delete [] buffer;
//this->~Animal(); // I'm wondering if I can call this instead of deleting buffer here.
aNumber = other.aNumber;
}
return *this;
}
~Animal() { delete[] buffer;}
};
The reason I'm asking this is so that instead of rewriting the deleting code, I can just have it in one place. Also, I don't think that calling the destructor deallocates the memory, so when I assign aNumber after calling the destructor, I think it's OK. When I say the memory isn't deallocated I mean for example if I had a vector<Animal>, and the vector called the copy assignment operator for vector[0], vector[0] Animal would call its own destructor and then assign aNumber, but the memory is managed by vector (it's not deallocated). Am I right that the memory isn't deallocated?
After a destructor call the region of memory that held the object is just raw memory.
You can't use the result of just assigning to apparent members there.
It needs a constructor call to re-establish an object there.
But don't do this.
It's fraught with dangers, absolutely hostile deadly territory, and besides it's smelly and unclean.
Instead of
int* buffer;
Animal() { buffer = new int[128]; }
do
vector<int> buffer;
and expand that buffer as necessary as you add items to it, e.g. via push_back.
A std::vector automates the memory management for you, and is guaranteed to do it correctly. No bugs. Much easier.
In other news, the signature
Animal& operator= (Animal& other)
only lets you assign from non-const Animal objects specified with lvalue expressions (i.e. not temporaries), because only those can be bound to the formal argument's reference to non-const.
One way to fix that is to add a const:
Animal& operator= (Animal const& other)
which communicates the intention to not modify the actual argument.
I'm wondering if I can call [the destructor] instead of deleting buffer here.
You may not.
so when I assign aNumber after calling the destructor, I think it's OK
It is not OK. An explicit destructor call ends the lifetime of the object. You may not access members of an object after its lifetime has ended. The behaviour is undefined.
Am I right that the memory isn't deallocated?
You are right, but that doesn't matter.
The reason I'm asking this is so that instead of rewriting the deleting code, I can just have it in one place.
That can be achieved by writing a function that frees the resources, and calling that function from both the assignment operator and the destructor.
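A rough sketch of that suggestion, keeping the raw buffer from the question. The helper name release, the accessor at and the constant kSize are invented here; the copy constructor is added because a class that owns a raw buffer needs one as well.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

struct Animal {
    static constexpr std::size_t kSize = 128;   // buffer size from the question
    int aNumber;
    int* buffer;

    Animal() : aNumber(0), buffer(new int[kSize]()) {}   // () zero-initialises

    Animal(const Animal& other) : aNumber(other.aNumber), buffer(new int[kSize]) {
        std::copy(other.buffer, other.buffer + kSize, buffer);
    }

    Animal& operator=(const Animal& other) {
        if (this != &other) {
            int* fresh = new int[kSize];            // allocate before releasing
            std::copy(other.buffer, other.buffer + kSize, fresh);
            release();                              // same cleanup as the destructor
            buffer = fresh;
            aNumber = other.aNumber;
        }
        return *this;
    }

    ~Animal() { release(); }

    int& at(std::size_t i) {
        assert(i < kSize);
        return buffer[i];
    }

private:
    // The single place that knows how to free the resource.
    void release() {
        delete[] buffer;
        buffer = nullptr;
    }
};
```

The object stays alive throughout the assignment, so unlike the explicit destructor call there is no lifetime problem.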

How can this compile? (delete a member of const object)

I would expect an error inside the copy constructor, but this compiles just fine with MSVC10.
class Test
{
public:
Test()
{
p = new int(0);
}
Test(const Test& t)
{
delete t.p; // I would expect an error here
}
~Test()
{
delete p;
}
private:
int* p;
};
This is a common issue with pointers. There is no way of actually disabling code from calling delete on a pointer (other than controlling access to the destructors). The first thing you will usually hear is that the delete does not modify the pointer, but rather the pointed object. This can easily be checked by printing the pointer (std::cout << static_cast<void*>(p);) before and after a delete, so even if the pointer is constant the operation is not modifying it.
A little less intuitive is the fact that you can delete a pointer to a constant element --and the delete surely modifies the pointed element. But the language needed to be able to destruct constant objects when they fell out of scope (think { const mytype var(args); }) so const-ness cannot really affect the ability to destroy an object, and if that is allowed for auto variables, it does not make much sense to change the behavior for dynamically allocated objects. So at the end this is also allowed.
The issue that you are running into here is that you are not changing p per se (thus p stays immutable as you're not changing its value), but you're changing what p points to and thus are working at one additional level of indirection. This is possible because deleting the memory associated with a pointer doesn't change the pointer itself.
In a strict sense the const-ness of the object is preserved, even though its logical constness has been violated as you pulled the rug from underneath whatever p was pointing to.
As JonH mentioned in the comment, if you were not able to delete the object pointed to by a pointer held in a const object, you would end up with memory leaks because you wouldn't be able to clean up properly after the object.
Constants are immutable, but that doesn't guarantee that they cannot be deleted. How would you ever delete an object if delete wasn't allowed?
If you try to modify t.p, the compiler should report an error because t is const. But deleting what t.p points to is quite normal even if t is constant.
Having:
int* const p;
... does not disallow operator delete from being called on p. Having const int* also does not disallow operator delete from being called on it.
Typical operator delete implementations take void* and any pointer will be implicitly cast to it (actually this might be the standard behavior to take void* or the only reasonable way to implement one global operator delete which can delete anything). Also as an interesting tidbit, one can implement their own overloaded operator delete (either globally or per-class) which takes void* and only has to free the memory allocated by new. The destructor call is implicitly added before any call to operator delete by the compiler; operator delete does not call the dtor in its implementation.
It is also worth noting that having const Test& in this case basically modifies the member, int* p so that it's analogous to int* const p, not int const* p or const int* p.
Thus:
Test::Test(const Test& other)
{
*other.p = 123; // this is valid
other.p = NULL; // this is not valid
}
In other words, the pointer address is immutable, but the pointee is not. I've often encountered a lot of confusion here with respect to member function constness and the effect it has on data members which are pointers. Understanding this will give a little insight as to one of the reasons why we need the separation between iterator and const_iterator.
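The int* const reading can be checked at compile time. In this sketch Probe is a hypothetical struct with a public pointer member, used instead of Test so that the member is accessible from outside the class.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Probe {          // hypothetical type with a public pointer member
    int* p;
};

// Through a const Probe&, the member is seen as int* const, not const int*.
static_assert(
    std::is_same<decltype((std::declval<const Probe&>().p)), int* const&>::value,
    "the pointer is const, the pointee is not");

// Writes through the pointer held by a const object and returns the new value.
int write_through_const(const Probe& probe, int value) {
    assert(probe.p != nullptr);
    *probe.p = value;        // allowed: only the pointer itself is const
    // probe.p = nullptr;    // would not compile: p is int* const here
    return *probe.p;
}
```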