I need to write a C wrapper around a C++ lib and I need objects allocated on the heap.
Functions and methods of the C++ lib use and return objects allocated on the stack.
I know I can "transfer" an object from the stack to the heap via a copy, i.e. auto heapObj = new Foo(stackObj); but I would like to avoid the copy and move instead if I can.
This seems to "work" (to my surprise). Is there a copy happening behind the scenes? If not, is this pattern safe to use?
main.h
#include <cstddef>
#include <vector>
class Foo {
public:
std::vector<int> big;
explicit Foo(size_t len);
Foo(Foo&& other) noexcept;
// remove copy constructor
Foo(const Foo &) = delete;
// delete assignment operator
Foo &operator=(const Foo &) = delete;
size_t size();
};
main.cpp
#include <iostream>
#include <utility>
#include "main.h"
Foo::Foo(size_t len) : big(len) {}
Foo::Foo(Foo&& other) noexcept : big(std::move(other.big)) {}
size_t Foo::size() { return this->big.size(); }
int main() {
Foo ms(1000); // on the stack
ms.big[0] = 42;
auto mh = new Foo(std::move(ms)); // on the heap (no copy?)
std::cout << mh->size() << ", " << mh->big[0] << std::endl;
delete mh;
}
First of all, moving an int or a pointer is equivalent to a copy. That is, if you had a
struct X {
int a, b;
int* data;
};
then moving it is not going to be cheaper than copying it (ignoring ownership of data for now). Coincidentally, the above is basically what std::vector looks like from far away: A size and capacity member plus some pointer to a chunk of memory.
The important thing about moving vs copying is what happens in regards to ownership of resources. std::vector has ownership of some heap memory (data). If you copy a std::vector, that heap memory must be copied, so that both the original and the copy can have ownership of their own data. But if you move it, then only the moved-to vector needs to retain ownership, so the data pointer can be handed from one to the other (instead of all the data), because the ownership can be "stolen" from the moved-from object.
This is why there is no conflict in "moving" your object from the stack to the heap: The object itself is still basically copied from one place to the other, but the resources it (or its subobjects, like big) owns are not copied but moved ("stolen").
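One way to check this, as a rough sketch: record the address of the vector's element buffer before the move and compare it with the address seen through the heap object afterwards. Foo is reduced here to the members that matter, and the helper name buffer_survives_move is made up for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Reduced, move-only version of the Foo from the question.
class Foo {
public:
    std::vector<int> big;
    explicit Foo(std::size_t len) : big(len) {}
    Foo(Foo&& other) noexcept : big(std::move(other.big)) {}
    Foo(const Foo&) = delete;
    Foo& operator=(const Foo&) = delete;
};

// Hypothetical helper: returns true if the heap object ends up owning the
// very same element buffer that the stack object owned before the move.
bool buffer_survives_move(std::size_t len) {
    Foo on_stack(len);
    const int* before = on_stack.big.data();      // buffer owned by the stack object
    Foo* on_heap = new Foo(std::move(on_stack));  // only the vector's handle is transferred
    const bool same = (on_heap->big.data() == before);
    delete on_heap;                               // frees the Foo and the buffer
    return same;
}
```

For a non-empty vector this is expected to return true: the elements stay where they were, and only the small handle (pointer, size, capacity) is written into the new Foo.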
Any time a move "actually happens", it's because there is some indirect resource within the moved thing. Some resource that's referred to by a handle that can be cheaply swapped (a pointer copied over, for example; the pointee remains where it was). This is generally accomplished via pointers to dynamically-allocated things, such as the data stored within a vector.
The stuff you're trying to move is a vector. As such, it is already dynamically-allocated and the move is easy. It doesn't really matter where the actual std::vector object lives, nor where the Foo lives — if there's an indirect resource, a move is probably possible.
In other cases, a move constructor or move assignment will actually just trigger a copy of whatever data is inside. When everything (recursively) in the "thing" has automatic storage duration, you can pretty much guarantee that a copy will be required. But that's not the case here.
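For the C wrapper mentioned in the question, the pattern might be packaged roughly as follows. This is only a sketch under assumptions: make_foo stands in for a library function that returns a Foo by value, and the names foo_handle, foo_create, foo_size and foo_destroy are invented for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Reduced, move-only version of the Foo from the question.
class Foo {
public:
    std::vector<int> big;
    explicit Foo(std::size_t len) : big(len) {}
    Foo(Foo&& other) noexcept : big(std::move(other.big)) {}
    Foo(const Foo&) = delete;
    Foo& operator=(const Foo&) = delete;
    std::size_t size() const { return big.size(); }
};

// Hypothetical stand-in for a library function that returns by value.
Foo make_foo(std::size_t len) { return Foo(len); }

extern "C" {
    // Opaque handle type seen by C callers (name is an assumption).
    typedef struct foo_handle foo_handle;

    foo_handle* foo_create(std::size_t len) {
        try {
            // The returned Foo is moved (or elided) into heap storage; no copy.
            return reinterpret_cast<foo_handle*>(new Foo(make_foo(len)));
        } catch (...) {
            return nullptr;  // exceptions must not escape into C code
        }
    }

    std::size_t foo_size(const foo_handle* h) {
        return reinterpret_cast<const Foo*>(h)->size();
    }

    void foo_destroy(foo_handle* h) {
        delete reinterpret_cast<Foo*>(h);
    }
}
```

Because Foo's copy constructor is deleted, this would not compile if a copy were needed anywhere, which is a cheap way to confirm that only moves take place.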
Related
(Before anyone asks: no, I didn't forget the delete[] statement)
I was fiddling around with dynamically allocated memory and I ran into this issue. I think the best way to explain it is to show you these two pieces of code I wrote. They are very similar, but in one of them my class destructor doesn't get called.
// memleak.cpp
#include <vector>
using namespace std;
class Leak {
vector<int*> list;
public:
void init() {
for (int i = 0; i < 10; i++) {
list.push_back(new int[2] {i, i});
}
}
Leak() = default;
~Leak() {
for (auto &i : list) {
delete[] i;
}
}
};
int main() {
Leak leak;
while (true) {
// I tried explicitly calling the destructor as well,
// but this somehow causes the same memory to be deleted twice
// and segfaults
// leak.~Leak();
leak = Leak();
leak.init();
}
}
// noleak.cpp
#include <vector>
using namespace std;
class Leak {
vector<int*> list;
public:
Leak() {
for (int i = 0; i < 10; i++) {
list.push_back(new int[2] {i, i});
}
};
~Leak() {
for (auto &i : list) {
delete[] i;
}
}
};
int main() {
Leak leak;
while (true) {
leak = Leak();
}
}
I compiled them both with g++ filename.cpp --std=c++14 && ./a.out and used top to check memory usage.
As you can see, the only difference is that, in memleak.cpp, the class constructor doesn't do anything and there is an init() function that does its job. However, if you try this out, you will see that this somehow interferes with the destructor being called and causes a memory leak.
Am I missing something obvious? Thanks in advance.
Also, before anyone suggests not using dynamically allocated memory: I know that it isn't always good practice and so on, but the main thing I'm interested in right now is understanding why my code doesn't work as expected.
The statement leak = Leak(); constructs a temporary object and then assigns it to leak. Since you didn't write an assignment operator, you get a default one. The default one just copies the vector list.
In the first code you have:
Create a temporary Leak object. It has no pointers in its vector.
Assign the temporary object to the leak object. This copies the vector (overwriting the old one)
Delete the temporary object, this deletes 0 pointers since its vector is empty.
Allocate a bunch of memory and store the pointers in the vector.
Repeat.
In the second code you have:
Create a temporary Leak object. Allocate some memory and store the pointers in its vector.
Assign the temporary object to the leak object. This copies the vector (overwriting the old one)
Delete the temporary object, this deletes the 10 pointers in the temporary object's vector.
Repeat.
Note that after leak = Leak(); the same pointers that were in the temporary object's vector are also in leak's vector. Even if they were deleted.
To fix this, you should write an operator = for your class. The rule of 3 is a way to remember that if you have a destructor, you usually need to also write a copy constructor and copy assignment operator. (Since C++11 you can optionally also write a move constructor and move assignment operator, making it the rule of 5)
Your assignment operator would delete the pointers in the vector, clear the vector, allocate new memory to hold the int values from the object being assigned, put those pointers in the vector, and copy the int values. So that the old pointers are cleaned up, and the object being assigned to becomes a copy of the object being assigned from, without sharing the same pointers.
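The assignment operator described above could look roughly like this. It is a sketch, not the only way to do it: instead of deleting first and reallocating afterwards, it makes the deep copy first and then swaps, so a failed allocation leaves the target unchanged. The size() and at() accessors are not part of the original class and are added only so the result can be inspected.

```cpp
#include <cstddef>
#include <vector>

class Leak {
    std::vector<int*> list;

    // Frees every array this object owns and empties the vector.
    void release() noexcept {
        for (int* p : list) delete[] p;
        list.clear();
    }

public:
    Leak() = default;

    // Copy constructor: allocate fresh arrays and copy the two ints of each.
    Leak(const Leak& other) {
        list.reserve(other.list.size());
        try {
            for (const int* p : other.list)
                list.push_back(new int[2]{p[0], p[1]});
        } catch (...) {
            release();  // do not leak arrays allocated before the failure
            throw;
        }
    }

    // Copy assignment: deep-copy first, then swap. The arrays previously
    // owned by *this are freed when `copy` is destroyed on return.
    Leak& operator=(const Leak& other) {
        Leak copy(other);
        list.swap(copy.list);
        return *this;
    }

    ~Leak() { release(); }

    void init() {
        list.reserve(list.size() + 10);  // so push_back cannot throw below
        for (int i = 0; i < 10; i++) list.push_back(new int[2]{i, i});
    }

    // Accessors added for illustration only.
    std::size_t size() const { return list.size(); }
    const int* at(std::size_t i) const { return list[i]; }
};
```

With this in place, leak = Leak(); followed by leak.init(); frees the old arrays on every iteration, and no two objects ever hold the same pointer.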
Your class doesn't respect the rule of 3/5/0. The default-generated copy-assignment operator in leak = Leak(); makes leak reference the contents of the temporary Leak object, which the temporary promptly deletes at the end of its lifetime, leaving leak with dangling pointers which it will later try to delete again.
Note: an earlier version of this answer blamed the move-assignment operator, and added that the problem could have gone unnoticed if your implementation of std::vector systematically emptied the original vector upon moving, which is not guaranteed. As StoryTeller pointed out to me, your class does not generate a move-assignment operator because it has a user-declared destructor. A copy-assignment operator is generated and used instead.
Use smart pointers and containers to model your classes (std::vector<std::array<int, 2>>, std::vector<std::vector<int>> or std::vector<std::unique_ptr<int[]>>). Do not use new, delete and raw owning pointers. In the exceedingly rare case where you may need them, be sure to encapsulate them tightly and to carefully apply the aforementioned rule of 3/5/0 (including exception handling).
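As a minimal sketch of the first suggestion, here is the same data held in a std::vector<std::array<int, 2>>. The class name NoLeak and its accessors are made up for illustration.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Same data as the Leak class, but every pair is stored by value, so the
// compiler-generated copy, move and destructor are already correct.
class NoLeak {
    std::vector<std::array<int, 2>> list;

public:
    void init() {
        for (int i = 0; i < 10; i++)
            list.push_back(std::array<int, 2>{{i, i}});
    }

    // Accessors added for illustration only.
    std::size_t size() const { return list.size(); }
    const std::array<int, 2>& at(std::size_t i) const { return list[i]; }
};
```

There is no new, no delete and no user-written special member function, so there is nothing to get wrong when the object is assigned in a loop.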
My following question is on memory management. I have for example an int variable not allocated dynamically in a class, let's say invar1. And I'm passing the memory address of this int to another classes constructor. That class does this:
class ex1{
ex1(int* p_intvar1)
{
ptoint = p_intvar1;
}
int* ptoint;
};
Should I delete ptoint? Because it has the address of an undynamically allocated int, I thought I don't need to delete it.
And again I declare an object to a class with new operator:
objtoclass = new ex1();
And I pass this to another class:
class ex2{
ex2(ex1* p_obj)
{
obj = p_obj;
}
ex1* obj;
};
Should I delete obj when I'm already deleting objtoclass?
Thanks!
Because it has the address of an undynamically allocated int I thought I don't need to delete it.
Correct.
Should I delete obj when I'm already deleting objtoclass?
No.
Recall that you're not actually deleting pointers; you're using pointers to delete the thing they point to. As such, if you wrote both delete obj and delete objtoclass, because both pointers point to the same object, you'd be deleting that object twice.
I would caution you that this is a very easy mistake to make with your ex2 class, in which the ownership semantics of that pointed-to object are not entirely clear. You might consider using a smart pointer implementation to remove risk.
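One possible arrangement, sketched under the assumption that exactly one place should own the ex1 object: a std::unique_ptr owns it, and ex2 keeps a plain pointer purely as an observer. The value() members and the function read_through_observer are invented for illustration.

```cpp
#include <memory>

class ex1 {
public:
    explicit ex1(int* p_intvar1) : ptoint(p_intvar1) {}
    int value() const { return *ptoint; }
private:
    int* ptoint;  // non-owning: points at an automatic int, never deleted here
};

class ex2 {
public:
    explicit ex2(ex1* p_obj) : obj(p_obj) {}
    int value() const { return obj->value(); }
private:
    ex1* obj;     // non-owning observer: ex2 never deletes it
};

// Hypothetical usage: the unique_ptr is the only owner of the ex1 object.
int read_through_observer() {
    int intvar1 = 7;                                     // automatic storage
    std::unique_ptr<ex1> objtoclass(new ex1(&intvar1));  // sole owner
    ex2 user(objtoclass.get());                          // observes, does not own
    return user.value();
}   // objtoclass deletes the ex1 exactly once, before intvar1 goes away
```

Neither class contains a delete, so the double deletion described above cannot happen by accident.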
just an appendix to the other answers
You can get rid of raw pointers and forget about memory management with the help of smart pointers (shared_ptr, unique_ptr).
The smart pointer is responsible for releasing the memory when it goes out of scope.
Here is an example:
#include <iostream>
#include <memory>
class ex1{
public:
ex1(std::shared_ptr<int> p_intvar1)
{
ptoint = p_intvar1;
std::cout << __func__ << std::endl;
}
~ex1()
{
std::cout << __func__ << std::endl;
}
private:
std::shared_ptr<int> ptoint;
};
int main()
{
std::shared_ptr<int> pi(new int(42));
std::shared_ptr<ex1> objtoclass(new ex1(pi));
/*
* when the main function returns, these smart pointers will
* go out of scope and delete the dynamically allocated memory
*/
return 0;
}
Output:
ex1
~ex1
Should I delete obj when I'm already deleting objtoclass?
No: deleting the same object twice is undefined behaviour and must be avoided. This can happen, for example, when two pointers point at the same object and you delete the object through one of them; you must then not delete it again through the other. In your situation you may well end up with two pointers pointing to the same object.
In general, to build a class which manages memory internally (like you do seemingly), isn't trivial and you have to account for things like rule of three, etc.
Regarding that one should delete dynamically allocated memory you are right. You should not delete memory if it was not allocated dynamically.
PS. In order to avoid complications like above you can use smart pointers.
You don't currently delete this int, or show where it's allocated. If neither object is supposed to own its parameter, I'd write
struct ex1 {
ex1(int &i_) : i(i_) {}
int &i; // reference implies no ownership
};
struct ex2 {
ex2(ex1 &e_) : e(e_) {}
ex1 &e; // reference implies no ownership
};
int i = 42;
ex1 a(i);
ex2 b(a);
If either argument is supposed to be owned by the new object, pass it as a unique_ptr. If either argument is supposed to be shared, use shared_ptr. I'd generally prefer any of these (reference or smart pointer) to raw pointers, because they give more information about your intentions.
In general, to make these decisions,
Should I delete ptoint?
is the wrong question. First consider things at a slightly higher level:
what does this int represent in your program?
who, if anyone, owns it?
how long is it supposed to live, compared to these classes that use it?
and then see how the answer falls out naturally for these examples:
this int is an I/O mapped control register.
In this case it wasn't created with new (it exists outside your whole program), and therefore you certainly shouldn't delete it. It should probably also be marked volatile, but that doesn't affect lifetime.
Maybe something outside your class mapped the address and should also unmap it, which is loosely analogous to (de)allocating it, or maybe it's simply a well-known address.
this int is a global logging level.
In this case it presumably either has static lifetime, in which case no one owns it: it was not explicitly allocated and therefore should not be explicitly deallocated,
or, it's owned by a logger object/singleton/mock/whatever, and that object is responsible for deallocating it if necessary
this int is being explicitly given to your object to own
In this case, it's good practice to make that obvious, eg.
ex1::ex1(std::unique_ptr<int> &&p) : m_p(std::move(p)) {}
Note that making your local data member a unique_ptr or similar, also takes care of the lifetime automatically with no effort on your part.
this int is being given to your object to use, but other objects may also be using it, and it isn't obvious which order they will finish in.
Use a shared_ptr<int> instead of unique_ptr to describe this relationship. Again, the smart pointer will manage the lifetime for you.
In general, if you can encode the ownership and lifetime information in the type, you don't need to remember where to manually allocate and deallocate things. This is much clearer and safer.
If you can't encode that information in the type, you can at least be clear about your intentions: the fact that you ask about deallocation without mentioning lifetime or ownership, suggests you're working at the wrong level of abstraction.
Because it has the address of an undynamically allocated int, I
thought I don't need to delete it.
That is correct. Simply do not delete it.
The second part of your question was about dynamically allocated memory. Here you have to think a little more and make some decisions.
Lets say that your class called ex1 receives a raw pointer in its constructor for a memory that was dynamically allocated outside the class.
You, as the designer of the class, have to decide if this constructor "takes the ownership" of this pointer or not. If it does, then ex1 is responsible for deleting its memory and you should do it probably on the class destructor:
class ex1 {
public:
/**
* Warning: This constructor takes the ownership of p_intvar1,
* which means you must not delete it somewhere else.
*/
ex1(int* p_intvar1)
{
ptoint = p_intvar1;
}
~ex1()
{
delete ptoint;
}
int* ptoint;
};
However, this is generally a bad design decision. You have to hope that the user of this class reads the comment on the constructor and remembers not to delete the memory allocated somewhere outside class ex1.
A method (or a constructor) that receives a pointer and takes its ownership is called "sink".
Someone would use this class like:
int* myInteger = new int(1);
ex1 obj(myInteger); // sink: obj takes the ownership of myInteger
// never delete myInteger outside ex1
Another approach is to say your class ex1 does not take the ownership, and whoever allocates memory for that pointer is responsible for deleting it. Class ex1 must not delete anything in its destructor, and it should be used like this:
int* myInteger = new int(1);
ex1 obj(myInteger);
// use obj here
delete myInteger; // remember to delete myInteger
Again, the user of your class must read some documentation in order to know that they are responsible for deleting the stuff.
You have to choose between these two design decisions if you do not use modern C++.
In modern C++ (C++ 11 and 14) you can make things explicit in the code (i.e., do not have to rely only on code documentation).
First, in modern C++ you avoid using raw pointers. You have to choose between two kinds of "smart pointers": unique_ptr or shared_ptr. The difference between them is about ownership.
As their names say, a unique pointer has exactly one owner, while a shared pointer can be owned by one or more (the ownership is shared).
A unique pointer (std::unique_ptr) cannot be copied, only "moved" from one place to another. If a class has a unique pointer as a member, it is explicit that this class has the ownership of the pointed-to object. If a method receives a unique pointer by value, it is explicit that it is a "sink" method (it takes the ownership of the pointer).
Your class ex1 could be written like this:
class ex1 {
public:
ex1(std::unique_ptr<int> p_intvar1)
{
ptoint = std::move(p_intvar1);
}
std::unique_ptr<int> ptoint;
};
The user of this class should use it like:
auto myInteger = std::make_unique<int>(1);
ex1 obj(std::move(myInteger)); // sink
// here, myInteger is nullptr (it was moved to ex1 constructor)
If you forget to do "std::move" in the code above, the compiler will generate an error telling you that unique_ptr is not copyable.
Also note that you never have to delete memory explicitly. Smart pointers handle that for you.
Given the following example code:
#include <memory>
class Foo {
public:
Foo(std::shared_ptr<int> p);
private:
std::shared_ptr<int> ptr;
};
Foo::Foo(std::shared_ptr<int> p) : ptr(std::move(p)) {
}
class Bar {
public:
Bar(int &p);
private:
std::shared_ptr<int> ptr;
};
Bar::Bar(int &p) : ptr(std::make_shared<int>(p)) {
}
int main() {
Foo foo(std::make_shared<int>(int(256)));
Bar bar(*new int(512));
return 0;
}
Both Foo and Bar work correctly. However, are there any differences between creating the shared_ptr when calling the constructor and then transferring ownership with std::move and just passing a reference to the object and delegating the creation of the shared_ptr to the class constructor?
I assume the second way is better since I don't have to move the pointer. However, I've mostly seen the first way being used in the code I'm reading.
Which one should I use and why?
Foo is correct.
Bar is an abomination. It involves a memory leak, unsafe exception behaviour and an un-necessary copy.
EDIT: explanation of memory leak.
Deconstructing this line:
Bar bar(*new int(512));
results in these operations:
call new int(512) which results in a call to operator new, returning a pointer to an int on the heap (memory allocation).
dereference the pointer in order to provide a reference for the constructor of Bar (which takes int &, not a const reference)
Bar then constructs its shared_ptr with one returned by make_shared (this part is efficient). This shared_ptr's int is initialised with a copy of the int passed by reference.
then the function returns, but since no variable has recorded the pointer returned from new, nothing can free the memory. Every new must be mirrored by a delete in order to destroy the object and deallocate its memory.
therefore memory leak
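The steps above can be made visible with a small sketch. It assumes a made-up payload type, Counted, that counts live instances and is used in place of int; Bar has the same shape as in the question.

```cpp
#include <memory>

// Hypothetical payload type that counts how many instances are alive.
struct Counted {
    static int alive;
    int value;
    explicit Counted(int v) : value(v) { ++alive; }
    Counted(const Counted& other) : value(other.value) { ++alive; }
    ~Counted() { --alive; }
};
int Counted::alive = 0;

// Same shape as Bar in the question, with Counted in place of int.
class Bar {
public:
    explicit Bar(Counted& p) : ptr(std::make_shared<Counted>(p)) {}  // copies p
private:
    std::shared_ptr<Counted> ptr;
};

// Returns how many Counted objects are still alive after bar is destroyed.
int leaked_after_bar() {
    const int before = Counted::alive;
    {
        Bar bar(*new Counted(512));  // heap object is copied, its address is lost
    }                                // bar's own copy is destroyed here
    return Counted::alive - before;  // the object created by new is still alive
}
```

The function returns 1: the copy made by make_shared is cleaned up with bar, while the object created by new is never destroyed.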
It depends on what you want to achieve.
If you need a shared_ptr internally, because you want to share the object with some other objects your create later, the second way may be better (except for that horrible constructor call obviously).
If you want shared ownership of an existing object (this is the more common case, really), you don't have a choice and you need to use the first way.
If neither of these applies, you probably don't need a shared_ptr in the first place.
The second way is incorrect; it leaks memory. With
Bar::Bar(int &p) : ptr(std::make_shared<int>(p)) {
}
....
Bar bar(*new int(512));
the shared_ptr made by std::make_shared takes the value of p (which is 512) to build a new shared pointer that's responsible for a new piece of memory; it does not accept responsibility for the memory address where p lies. This piece -- the one you allocated in main -- is then lost. This particular piece of code would work with
// +---------------------- building a shared_ptr directly
// v v----- from p's address
Bar::Bar(int &p) : ptr(std::shared_ptr<int>(&p)) {
}
...but look at that. That is pure evil. Nobody expects that a reference parameter to a constructor requires that it references a heap object of which the new object will take responsibility.
You could more sanely write
Bar::Bar(int *p) : ptr(p) {
}
....
Bar bar(new int(512));
In fact you could give Foo a second constructor that does that, if you wanted. It's a bit of an argument how clear the function signature makes it that the pointer has to be to a heap-allocated object, but std::shared_ptr offers the same, so there's precedent. It depends on what your class does whether this is a good idea.
Assuming that you used int just as an example, and there is a real resource instead, then it depends on what you want to achieve.
The first is typical dependency injection, where the object is injected through the constructor. Its advantage is that it can simplify unit testing.
The second case just creates the object in the constructor, and initializes it using values passed through the constructor.
By the way, take care how you initialize. This :
Bar bar(*new int(512));
causes a memory leak. The memory is allocated, but never deallocated.
If your intention is to take sole ownership of a heap-allocated object I suggest you accept a std::unique_ptr. It documents the intention clearly and you can create a std::shared_ptr from the std::unique_ptr internally if you need to:
#include <memory>
class Foo {
public:
Foo(std::unique_ptr<int> p);
private:
std::shared_ptr<int> ptr;
};
Foo::Foo(std::unique_ptr<int> p) : ptr(std::move(p)) {
}
int main() {
Foo foo(std::make_unique<int>(512));
}
Don't do Bar, it is error prone and fails to describe your intent (if I understand your intent correctly).
Both can work correctly, but the way you use them in main is not consistent.
When I see a constructor taking a reference (like Bar(int& p)) I expect Bar to hold a reference. When I see Bar(const int& p) I expect it to hold a copy.
When I see an rvalue reference (not a universal one), like Bar(int&& p), I expect p not to "survive its content" after being passed (well... for an int that's not that meaningful).
In any case p holds an int, not a pointer, and what make_shared expects are the parameters to be forwarded to the int constructor (that is, an int, not an int*).
Your main therefore has to be
Foo foo(std::make_shared<int>(int(256)));
Bar bar(512);
This will make bar hold a dynamically allocated, shareable copy of the value 512.
If you do Bar bar(*new int(512)) you make bar hold a copy of your "new int", whose pointer gets discarded, and hence the int itself is leaked.
In general, expressions like *new something should sound as "no no no no no ..."
But your Bar constructor has a problem: to be able to take a constant or the value returned by an expression, it must take a const int&, not an int&.
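A minimal sketch of that variant, with the constructor taking const int& so that both literals and variables are accepted. The value() member and the two helper functions are added only for illustration.

```cpp
#include <memory>

class Bar {
public:
    // const int& binds to literals, temporaries and named variables alike.
    explicit Bar(const int& p) : ptr(std::make_shared<int>(p)) {}
    int value() const { return *ptr; }
private:
    std::shared_ptr<int> ptr;  // owns its own copy of the value
};

// Hypothetical helpers showing both kinds of argument.
int bar_from_literal() {
    Bar bar(512);
    return bar.value();
}

int bar_from_variable() {
    int n = 256;
    Bar bar(n);
    return bar.value();
}
```

Nothing is allocated with new at the call site, so there is nothing left over to leak.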
I have a problem where I want to clone an object through its pointer when doing a deep copy.
For example, I have T* t1 and I want to create a new object pointer T* t2 such that t1->x == t2->x.
Is it a good idea to write a copy constructor which will work like:
T(const T* cpy)
{
m_var = (*cpy).m_var;
}
T* t1 = new T;
T* t2(t1);
what things should I take care of if using the above approach?
Thanks
Ruchi
To do this you should write a normal copy-constructor and use it like this:
T(const T& cpy)
: m_var(cpy.m_var) // prefer initialization-list, thanks to @Loki Astari
{}
T* t1 = new T;
T* t2 = new T(*t1);
In the code you show, T* t2(t1); would never call the constructor you have declared (which, by the way, is not a copy-constructor), because it simply initializes the pointer t2 to the value of the pointer t1, making both point to the same object.
As @Nawaz notes, this copy-constructor is equivalent to the one generated by the compiler, so you don't actually need to write it. In fact, unless you have any manually managed resources (which, usually, you shouldn't) you will always be fine with the compiler generated copy-constructor.
The definition of a copy constructor requires a reference and is thus:
T(T const& copy) // This defines a copy constructor.
: m_var(copy.m_var) // Prefer to use the initializer list.
{}
So you need to pass a reference.
If you want to copy a pointer the usage is then:
T* t2 = new T(*t1);
This does not do what you think:
T* t2(t1);
since you are only declaring a pointer, not an object. The pointer is initialised to the value of the other pointer. It should be:
T* t2 = new T (t1);
to create a new object.
As for the copy, you're currently doing a shallow copy as you are only copying the pointer value, not the data the pointer points at. Doing a shallow copy causes problems when the original or the copy is destroyed - if the m_var is deleted, the other object then has a pointer to deleted memory, invoking Undefined Behaviour™ if it is dereferenced. A deep copy fixes this:
T(const T* cpy)
{
m_var = new VarType (*cpy->m_var); // VarType being whatever m_var points to
}
This now requires a copy constructor for the type of m_var, which must also be deep to prevent the deletion problem above.
The downside to deep copying the data is that it increases the memory required and takes significant time to allocate memory and copy the data. This can be solved using reference-counted objects. These come in a few flavours, smart pointers being the most common. Here, the same underlying object is referenced by all copies of the parent object. When a parent is deleted, its smart pointer's destructor destroys the underlying object only when all references to it are gone.
The downside to smart pointers is that changing the data from one owning object modifies the data that all owning objects will see. To get the best of both worlds you'd want to have a 'copy on modified' system. This will only increase memory use when the underlying data is modified by the owning object.
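A 'copy on modified' scheme can be sketched on top of std::shared_ptr as follows. The class CowInts and its members are invented for illustration, and the sketch assumes single-threaded use (checking use_count() is not a safe test when several threads share the data).

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical copy-on-write wrapper around a vector<int>.
class CowInts {
public:
    explicit CowInts(std::vector<int> values)
        : data_(std::make_shared<std::vector<int>>(std::move(values))) {}

    // Reading never copies: all copies of a CowInts share one buffer.
    int get(std::size_t i) const { return (*data_)[i]; }

    // Writing first detaches from the shared buffer if anyone else uses it.
    void set(std::size_t i, int v) {
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::vector<int>>(*data_);  // deep copy now
        (*data_)[i] = v;
    }

    // True while both objects still refer to the same underlying buffer.
    bool shares_with(const CowInts& other) const { return data_ == other.data_; }

private:
    std::shared_ptr<std::vector<int>> data_;
};
```

Copying a CowInts is cheap because only the shared_ptr is copied; the deep copy is paid for only by the object that actually modifies the data.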
I'm trying to follow a tutorial here: regarding overloading operators, and I've found something that's really confused me.
There was a previous question on this very website here where this tutorial was discussed, namely regarding how the variables in the class were preserved because the whole class was passed back by value.
Whilst experimenting with the class definition I toyed with making the integer variables pointers (perhaps not sensible - but just to experiment!) as follows:
class CVector {
int* x;
int* y;
public:
CVector () {};
CVector (int,int);
CVector operator + (CVector);
~CVector ();
};
In the class constructor I allocate memory for the two integers, and in the class destructor I delete the allocated memory.
I also tweak the overloaded operator function as follows:
CVector CVector::operator+ (CVector param) {
CVector temp;
*temp.x = *x + *param.x;
*temp.y = *y + *param.y;
return (temp);
}
For the original code, where the class has simple integer variables the return by value of the entire class completes successfully.
However, after I change the variables to int pointers, the return by value of the class does not complete successfully as the integer variables are no longer intact.
I assume the destructor is being called when the temporary CVector goes out of scope and deletes these member integer pointers, but the class itself is still returned by value.
I'd like to be able to return by value the CVector with the memory allocated to its member variables intact, whilst ensuring the temporary CVector is correctly deleted when it goes out of scope.
Is there any way this can be done?
Many thanks!
The problem is that you are not following the rule of three, which basically boils down to: if you manage resources, then you should provide a copy constructor, an assignment operator and a destructor for your class.
Assuming that on construction you are allocating memory for the pointers, the problem is that the implicit copy constructor is shallow, and will copy the pointers, but you probably want a deep copy. In the few cases where you do not want a deep copy, managing the shared resource becomes more complicated, and I would use a shared_ptr rather than trying to do it manually.
You need to provide a copy constructor for CVector to make copies of the allocated memory. Otherwise, when you return by value, the pointer values will simply be copied and then the temp object is destructed, deallocating the ints. The returned copy now points to invalid memory.
CVector( const CVector& other )
: x ( new int(*other.x) )
, y ( new int(*other.y) )
{}
Note that it is a bad idea to use raw pointers in your class, especially more than one. If the allocation of y fails above and new throws, you've got a memory leak (the int already allocated for x is never freed). You could've allocated within the constructor body instead of the initializer list, either within a try-catch, or using the std::nothrow version of new and then checking for nullptr. But it makes the code very verbose and error prone.
The best solution is to use some smart pointer class such as std::unique_ptr to hold pointers. If you were to use std::shared_ptr to hold those pointers, you can even share the ints between copies of the class.
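A rough sketch of the std::unique_ptr variant, assuming the members should be deep-copied rather than shared. The getX() and getY() accessors are added only so the result can be inspected.

```cpp
#include <memory>

class CVector {
    std::unique_ptr<int> x;
    std::unique_ptr<int> y;

public:
    CVector() : x(new int(0)), y(new int(0)) {}
    CVector(int a, int b) : x(new int(a)), y(new int(b)) {}

    // Deep copy: each object owns its own two ints. If allocating y throws,
    // the already-constructed x member releases its int automatically.
    CVector(const CVector& other)
        : x(new int(*other.x)), y(new int(*other.y)) {}

    // Both objects already own storage, so copying the values is enough.
    CVector& operator=(const CVector& other) {
        *x = *other.x;
        *y = *other.y;
        return *this;
    }

    // No destructor needed: the unique_ptr members free the ints.

    CVector operator+(const CVector& param) const {
        return CVector(*x + *param.x, *y + *param.y);
    }

    // Accessors added for illustration only.
    int getX() const { return *x; }
    int getY() const { return *y; }
};
```

Returning the sum by value is now safe, because whichever object ends up holding the result owns its own ints and the temporary's destructor cannot free them.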
The return-by-value causes the returned, temp object to be copied to another object, a temporary "return object". After temp is copied, it is destructed, deallocating your ints. The easiest way to handle this is to use a reference-counted pointer such as tr1::shared_ptr<>. It will keep the memory allocated until the last reference to it is dropped, then it will deallocate.
There are few problems in the given code.
(1) You should allocate proper memory to *x and *y inside the constructor; otherwise accessing them is undefined behavior.
CVector () : x(new int), y(new int) {}
Also make sure to provide a copy constructor, and an operator= that deletes x and y before reallocating them; otherwise it will lead to hazards.
(2) delete them in destructor
~CVector () { delete x; delete y; }
(3) Pass argument to operator + by const reference to avoid unnecessary copying.
CVector CVector::operator+ (const CVector &param)
{
// code
}
Since you are learning by playing around with pointers, I will not comment on the design perspective of your class, like if they should be pointer or variables or containers and so on.