What are the reasons to allocate a pointer on the heap? - C++

Probably this question was already asked but I couldn't find it. Please redirect me if you saw something.
Question:
What is the benefit of using:
myClass* pointer;
over
myClass* pointer = new(myClass);
From reading other topics, I understand that the first option allocates a space on the stack and makes the pointer point to it, while the second allocates a space on the heap and makes the pointer point to it.
But I also read that the second option is tedious because you have to deallocate the space with delete.
So why would one ever use the second option?
I am kind of a noob, so please explain in detail.
edit
#include <iostream>
using namespace std;

class Dog
{
public:
    void bark()
    {
        cout << "wouf!!!" << endl;
    }
};

int main()
{
    Dog* myDog = new(Dog);
    myDog->bark();
    delete myDog;
    return 0;
}
and
#include <iostream>
using namespace std;

class Dog
{
public:
    void bark()
    {
        cout << "wouf!!!" << endl;
    }
};

int main()
{
    Dog* myDog;
    myDog->bark();
    return 0;
}
both compile and give me "wouf!!!". So why should I use the "new" keyword?

I understand that the first option allocates a space on the stack and makes the pointer point to it, while the second allocates a space on the heap and makes the pointer point to it.
The above is incorrect: the first option allocates space for the pointer itself on the stack, but doesn't allocate space for any object for the pointer to point to. That is, the pointer isn't pointing to anything in particular, and thus isn't useful (unless/until you set the pointer to point to something).
In particular, it's only pure blind luck that this code appears to "work" at all:
Dog* myDog;
myDog->bark(); // ERROR, calls a method on an invalid pointer!
... the above code is invoking undefined behavior, and in an ideal world it would simply crash, since you are calling a method on an invalid pointer. But C++ compilers typically prefer maximizing efficiency over handling programmer errors gracefully, so they typically don't put in a check for invalid-pointers, and since your bark() method doesn't actually use any data from the Dog object, it is able to execute without any obvious crashing. Try making your bark() method virtual, OTOH, and you will probably see a crash from the above code.
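To see why the virtual case is different, here is a minimal sketch (still undefined behavior, shown only to illustrate the failure mode): a virtual call has to read a vtable pointer out of the object itself, so dispatching through a garbage pointer dereferences invalid memory right away.
#include <iostream>
using namespace std;

class Dog
{
public:
    virtual void bark() // now virtual: the call must read a vtable pointer from the object
    {
        cout << "wouf!!!" << endl;
    }
};

int main()
{
    Dog* myDog;    // uninitialized pointer: points nowhere valid
    myDog->bark(); // UB: virtual dispatch reads through the bad pointer, so this typically crashes
    return 0;
}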
the second allocates a space on the heap and makes the pointer point to it.
That is correct.
But I also read that the second option is tedious because you have to deallocate the space with delete.
Not only tedious, but error-prone -- it's very easy (in a non-trivial program) to end up with a code path where you forgot to call delete, and then you have a memory leak. Or, alternatively, you could end up calling delete twice on the same pointer, and then you have undefined behavior and likely crashing or data corruption. Neither mistake is much fun to debug.
So why would one ever use the second option?
Traditionally you'd use dynamic allocation when you need the object to remain valid for longer than the scope of the calling code -- for example, if you needed the object to stick around even after the function you created the object in has returned. Contrast that with a stack allocation:
myClass someStackObject;
... in which someStackObject is guaranteed to be destroyed when the calling function returns, which is usually a good thing -- but not if you need someStackObject to remain in existence even after your function has returned.
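For example, here is a minimal sketch of that situation (makeDog is a hypothetical factory function, reusing the Dog class from the question):
Dog* makeDog()
{
    // Dog local; return &local;  // WRONG: local dies when this function returns,
                                  // leaving the caller with a dangling pointer
    return new Dog();             // the heap object survives the return; the caller
                                  // now owns it and must eventually delete it
}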
These days, most people would avoid using raw/C-style pointers entirely, since they are so dangerously error-prone. The modern C++ way to allocate an object on the heap would look like this:
std::shared_ptr<myClass> pointer = std::make_shared<myClass>();
... and this is preferred because it gives you a heap-allocated myClass object that will continue to live for as long as there is at least one std::shared_ptr pointing to it (good), but will also automagically be deleted the moment there are no std::shared_ptrs pointing to it (even better, since that means no memory leak and no need to explicitly call delete, which means no potential double-deletes).
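As a minimal usage sketch of that idea (reusing the Dog class from the question):
#include <memory>

void f()
{
    std::shared_ptr<Dog> pointer = std::make_shared<Dog>();
    std::shared_ptr<Dog> other = pointer; // reference count is now 2
    other->bark();
}   // both shared_ptrs are destroyed here, the count drops to 0,
    // and the Dog is deleted automatically: no delete, no leak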

Related

Deconstructing a *Thing

I'm sure this is answered somewhere, but I'm lacking the vocabulary to formulate a search.
#include <iostream>

class Thing
{
public:
    int value;
    Thing(int newval); // declaration must match the constructor defined below
    virtual ~Thing() { std::cout << "Destroyed a thing with value " << value << std::endl; }
};

Thing::Thing(int newval)
{
    value = newval;
}

int main()
{
    Thing *myThing1 = new Thing(5);
    std::cout << "Value 1: " << myThing1->value << std::endl;
    Thing myThing2 = Thing(6);
    std::cout << "Value 2: " << myThing2.value << std::endl;
    return 0;
}
Output indicates myThing2 was destroyed, but myThing1 was not.
So... do I need to deconstruct it manually somehow? Is this a memory leak? Should I avoid using the * in this situation, and if so, when would it be appropriate?
The golden rule is: wherever you use a new, you must use a delete. You are creating a dynamic object for myThing1 to point to, but you never release it, hence the destructor for that object is never called.
The difference between this and myThing2 is that myThing2 is a scoped object. The operation:
Thing myThing2 = Thing(6);
is not similar at all to:
Thing *myThing1 = new Thing(5);
Read more about dynamic allocation here. But as some final advice, you should be using the new keyword sparingly; read more about that here:
Why should C++ programmers minimize use of 'new'?
myThing1 is a Thing*, not a Thing. When a pointer goes out of scope nothing happens, except that you leak the object it was pointing to, as there is no way to get it back. In order for the destructor to be called you need to delete myThing1; before it goes out of scope. delete frees the memory that was allocated and calls the destructor for class types.
The rule of thumb is for every new/new[] there should be a corresponding delete/delete[]
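A minimal sketch of that pairing:
int* single = new int(42);  // single object from new ...
int* many = new int[10];    // array from new[] ...
delete single;              // ... must be released with delete
delete[] many;              // ... must be released with delete[]; mixing the two up is undefined behavior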
You need to explicitly delete myThing1 or use shared_ptr / unique_ptr.
delete myThing1;
The problem is not related to using a pointer Thing *. A pointer can also point to an object with automatic storage duration.
The problem is that in this statement
Thing *myThing1 = new Thing(5);
an object is created with the new operator. This object must be deleted with the delete operator:
delete myThing1;
Otherwise the memory stays allocated until the program finishes.
Thing myThing2 = Thing(6);
This line creates a Thing in main's stack with automatic storage duration. When main() ends it will get cleaned up.
Thing *myThing1 = new Thing(5);
This, on the other hand, creates a pointer to a Thing. The pointer resides on the stack, but the actual object is in the heap. When the pointer goes out of scope nothing happens to the pointed-to thing, the only thing reclaimed is the couple of bytes used by the pointer itself.
In order to fix this you have two options, one good, one less good.
Less good:
Put a delete myThing1; towards the end of your function. This will free the allocated object. As noted in other answers, every allocation of memory must have a matching deallocation, else you will leak memory.
However, in modern C++, unless you have good reason not to, you should really be using shared_ptr / unique_ptr to manage your memory. If you had instead declared myThing1 thusly:
shared_ptr<Thing> myThing1(new Thing(5));
Then the code you have now would work the way you expect. Smart pointers are powerful and useful in that they greatly reduce the amount of work you have to do to manage memory (although they do have some gotchas, circular references take extra work, for example).
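For instance, a minimal sketch of the original example rewritten with a smart pointer (reusing the Thing class from the question; std::make_unique is C++14):
#include <memory>

int main()
{
    std::unique_ptr<Thing> myThing1 = std::make_unique<Thing>(5);
    std::cout << "Value 1: " << myThing1->value << std::endl;
    return 0;
}   // the unique_ptr deletes the Thing here, so the destructor runs with no manual delete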

When do we need deep copy in C

In this code:
#include <stdlib.h> /* for malloc */

struct test
{
    char *p;
};

struct test glob;

void someFunc(struct test);

int main()
{
    struct test X;
    X.p = malloc(10);
    someFunc(X);
    //free(X.p);
}

void someFunc(struct test y)
{
    glob.p = y.p;
}
In C++ I think this is tricky because when someFunc ends and y goes out of scope - if one has a proper destructor and this is a class object - the memory pointed to by y.p will get freed after someFunc ends, and thus glob.p will point to garbage, right?
Is this also the case in C, considering the same code above? Or will glob.p point to usable memory after someFunc ends?
In C, after someFunc() ends, y is destroyed, but not what y.p points to. You can keep using the memory you allocated wherever you want until you call free() - that's one of the advantages of dynamic allocation (and one of its drawbacks too - with great power comes great responsibility.)
In C++, the same thing will happen, unless there's a destructor that deletes the pointer. In that case, yes, both glob.p and X.p would be invalid pointers.
In C++ this same code will work the same as in C. Both should be fine as you pass the pointer around. You just need to be aware of who owns the memory and who is thus responsible for deleting it.
If you write a destructor, as you said, you should usually also provide a proper copy constructor and assignment operator (see the Rule of Three: http://en.wikipedia.org/wiki/Rule_of_three_(C%2B%2B_programming)) to prevent things like invalid pointers.
If you don't then, yes, such code will be dangerous and you need to be very much aware of what you are doing (and especially why this way).
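As a minimal sketch of what the Rule of Three looks like for a class like this (a hypothetical C++ buffer owner for illustration, not code from the question):
#include <cstring>

class Test
{
    char *p;
public:
    Test() : p(new char[10]) { std::memset(p, 0, 10); }
    Test(const Test& other) : p(new char[10])          // copy constructor: deep copy
    { std::memcpy(p, other.p, 10); }
    Test& operator=(const Test& other)                 // copy assignment: deep copy
    { if (this != &other) std::memcpy(p, other.p, 10); return *this; }
    ~Test() { delete[] p; }                            // destructor frees this object's own buffer
};
With these defined, every copy owns its own buffer, so no destructor can free memory that another object still points to.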
After someFunc ends, glob.p will point to the memory allocated with your call to malloc in main. This is true whether this is compiled as a C or a C++ program (with struct removed from the someFunc declaration in the C++ case). If test has a different definition as a C++ class, as you seem to be asking at the end, then what happens at the end of someFunc depends entirely on the particulars of this definition.

double free without any dynamic memory allocation

In general, what could cause a double free in a program that does not contain any dynamic memory allocation?
To be more precise, none of my code uses dynamic allocation. I'm using the STL, but it's much more likely to be something I did wrong than for it to be a broken implementation of G++/glibc/STL.
I've searched around trying to find an answer to this, but I wasn't able to find any example of this error being generated without any dynamic memory allocations.
I'd love to share the code that was generating this error, but I'm not permitted to release it and I don't know how to reduce the problem to something small enough to be given here. I'll do my best to describe the gist of what my code was doing.
The error was being thrown when leaving a function, and the stack trace showed that it was coming from the destructor of a std::vector<std::set<std::string>>. Some number of elements in the vector were being initialized by emplace_back(). In a last ditch attempt, I changed it to push_back({{}}) and the problem went away. The problem could also be avoided by setting the environment variable MALLOC_CHECK_=2. By my understanding, that environment variable should have caused glibc to abort with more information rather than cause the error to go away.
This question is only being asked to serve my curiosity, so I'll settle for a shot in the dark answer. The best I have been able to come up with is that it was a compiler bug, but it's always my fault.
In general, what could cause a double free in a program that does not contain any dynamic memory allocation?
Normally, when you make a copy of a type which dynamically allocates memory but doesn't follow the rule of three:
struct Type
{
    Type() : ptr(new int(3)) { }
    ~Type() { delete ptr; }
    // no copy constructor is defined
    // no copy assign operator is defined
private:
    int * ptr;
};
void func()
{
    {
        std::vector<Type> objs;
        Type t; // allocates ptr
        objs.push_back(t); // makes a copy of t; now t.ptr and objs[0].ptr point to the same memory
        // when this scope finishes, t will be destroyed, its destructor will be called, and it will delete ptr;
        // objs also goes out of scope, its elements will be destroyed, their destructors are called, and delete ptr; is executed again. That's a double free on the same pointer.
    }
}
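A minimal sketch of the fix is to add the missing copy operations, so each copy owns its own allocation:
struct Type
{
    Type() : ptr(new int(3)) { }
    Type(const Type& other) : ptr(new int(*other.ptr)) { }                  // deep copy
    Type& operator=(const Type& other) { *ptr = *other.ptr; return *this; } // deep copy
    ~Type() { delete ptr; }
private:
    int * ptr;
};
Now push_back copies the pointed-to int rather than the pointer itself, and each destructor deletes its own allocation exactly once.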
I extracted a presentable example showcasing the fault I made that led to the "double free or corruption" runtime error. Note that the struct doesn't explicitly use any dynamic memory allocation; however, internally std::vector does (as its contents can grow to accommodate more items). Therefore, this issue was a bit hard to diagnose, as it doesn't violate the 'rule of 3' principle.
#include <vector>
#include <string.h>

typedef struct message {
    std::vector<int> options;
    void push(int o) { this->options.push_back(o); }
} message;

int main( int argc, const char* argv[] )
{
    message m;
    m.push(1);
    m.push(2);
    message m_copy;
    memcpy(&m_copy, &m, sizeof(m));
    //m_copy = m; // This is the correct method for copying object instances; it calls the default assignment operator generated 'behind the scenes' by the compiler
}
When main() returns m_copy is destroyed, which calls the std::vector destructor. This tries to delete memory that was already freed when the m object was destroyed.
Ironically, I was actually using memcpy to try and achieve a 'deep copy'. This is where the fault lies in my case. My guess is that by using the assignment operator, all the members of message.options are actually copied to "newly allocated memory" whereas memcpy would only copy those members that were allocated at compile time (e.g. a uint32_t size member). See Will memcpy or memmove cause problems copying classes?. Obviously this also applies to structs with non-fundamental typed members (as is the case here).
Maybe you also copied a std::vector incorrectly and saw the same behavior, maybe you didn't. In the end, it was entirely my fault :).

How to destroy a vector of pointers in c++?

I have the following code in one of my methods:
vector<Base*> units;
Base *a = new A();
Base *b = new B();
units.push_back(a);
units.push_back(b);
Should I destroy the a and b pointers before I exit the method?
Or should I somehow just destroy the units vector of pointers?
Edit 1:
This is another interesting case:
vector<Base*> units;
A a;
B b;
units.push_back(&a);
units.push_back(&b);
What about this case? Now I don't have to use delete nor smart pointers.
Thanks
If you exit the method, units will be destroyed automatically. But not a and b. Those you need to destroy explicitly.
Alternatively, you could use std::shared_ptr to do it for you, if you have C++11.
std::vector<std::shared_ptr<Base>> units;
And you just use the vector almost as you did before, but without worrying about memory leaks when the function exits. I say almost, because you'll need to use std::make_shared (or construct a std::shared_ptr explicitly) to put elements into the vector.
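A minimal usage sketch (assuming the Base, A and B classes from the question):
std::vector<std::shared_ptr<Base>> units;
units.push_back(std::make_shared<A>());
units.push_back(std::make_shared<B>());
// no delete needed: when units goes out of scope the reference counts
// drop to zero and both objects are deleted automatically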
A rather old-fashioned solution, that works with all compilers:
for ( vector<Base*>::iterator i = units.begin(); i != units.end(); ++i )
    delete *i;
In C++11 this becomes as simple as:
for ( auto p : units )
    delete p;
Your second example doesn't require pointer deallocation; actually it would be a bad error to do it. However it does require care in ensuring that a and b remain valid at least as long as units does. For this reason I would advise against that approach.
You need to iterate over the vector and delete each pointer it contains. Deleting the vector will result in memory leaks, as the objects pointed to by its elements are not deleted.
TL;DR: The objects remain, the pointers are lost == memory leak.
Yes you should destroy those pointers (assuming you aren't returning the vector elsewhere).
You could easily do it with a std::for_each as follows:
std::for_each( units.begin(), units.end(), []( Base* p ) { delete p; } );
You should not delete if either of these two situations applies:
The vector is returned to the outside of the function.
The vector was created outside of the function and is supposed to be accessed from other functions.
In other situations you should delete the memory pointed to by the pointers in the vector; otherwise, once the vector of pointers is gone, there is no way to reach those memory locations any more, and that is a memory leak.
vector<Base*>::iterator it;
for ( it = units.begin(); it != units.end(); ++it ){
    delete *it;
}
I would suggest that you use smart pointers in the vector. Using smart pointers is a better practice than using raw pointers. You should use the std::unique_ptr, std::shared_ptr or std::weak_ptr smart pointers, or the Boost equivalents if you don't have C++11. Here is the Boost library documentation for these smart pointers.
In the context of this question, yes you have to delete the pointers that are added to the vector. Else it would cause a memory leak.
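A minimal sketch of the unique_ptr variant (again assuming Base, A and B from the question; std::make_unique is C++14):
#include <memory>
#include <vector>

std::vector<std::unique_ptr<Base>> units;
units.push_back(std::make_unique<A>());
units.push_back(std::make_unique<B>());
// the vector owns the objects; they are deleted when units is destroyed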
You have to delete them; otherwise you will have a memory leak. In the following code, if I comment out the two delete lines the destructors are never called. You also have to declare the destructor of the Base class as virtual. As others mentioned, it is better to use smart pointers.
#include <iostream>
#include <vector>

class Base
{
public:
    virtual ~Base(){std::cout << "Base destructor" << std::endl;};
};

class Derived : public Base
{
    ~Derived(){std::cout << "Derived destructor" << std::endl;};
};

int main()
{
    std::vector<Base*> v;
    Base *p=new Base();
    Base *p2=new Derived();
    v.push_back(p);
    v.push_back(p2);
    delete v.at(0);
    delete v.at(1);
};
Output:
Base destructor
Derived destructor
Base destructor
Output with non-virtual base destructor (memory leak):
Base destructor
Base destructor
Yes and no. You don't need to delete them inside the function, but for other reasons than you might think.
You are essentially giving ownership of the objects to the vector, but the vector is not aware of that and therefore won't call delete on the pointers automatically. So if you store owning raw pointers in a vector, you have to manually call delete on them at some point. But:
If you give the vector out of your function, you should not destroy the objects inside the function, or the vector full of pointers to freed memory would be pretty useless, so no. But in that case, you should make sure the objects are destroyed after the vector has been used outside the function.
If you don't give the vector out of the function, you should destroy the objects inside the function, but then there would be no need to allocate them on the free store, so don't use pointers and new. You just push/emplace the objects themselves into the vector; it takes care of the destruction then, and therefore you don't need delete.
And besides that: don't use plain new. Use smart pointers. Regardless of what you do with them, the smart pointers will take care of a proper destruction of the objects contained. No need to use new, no need to use delete. Ever. (Except when you are writing your own low-level data structures, e.g. smart pointers.) So if you want to have a vector full of owning pointers, these should be smart pointers. That way you won't have to worry about whether, when and how to destroy the objects and free the memory.
The best way to store pointers in a vector is to use smart pointers instead of raw pointers. As soon as the vector's destructor is called, all the smart pointers it holds are destroyed, their reference counts are decremented, and the owned objects are released as needed. With smart pointers you need never be bothered about memory leaks.
In the first example, you will eventually have to delete a and b, but not necessarily when units goes out of scope. Usually you will do that just before units goes out of scope, but that is not the only possible case. It depends on what is intended.
You might (later in the same function) alias a or b, or both, because you want them to outlive units or the function scope. You might put them into two unit objects at the same time. Or, many other possible things.
What's important is that destroying the vector (automatic at scope end in this case) destroys the elements held by the vector, nothing more. The elements are pointers, and destroying a pointer does nothing. If you also want to destroy what the pointer points to (as to not leak memory), you must do that manually (for_each with a lambda would do).
If you don't want to do this work explicitly, a smart pointer can automatize that for you.
The second example (under Edit1) does not require you to delete anything (in fact that's not even possible, you would likely see a crash attempting to do that) but the approach is possibly harmful.
That code will work perfectly well as long as you never reference anything in units any more after a and b left scope. Woe if you do.
Technically, such a thing might even happen invisibly, since units is destroyed after a, but luckily, ~vector does not dereference pointer elements. It merely destroys them, which for a pointer doesn't do anything (trivial destructor).
But imagine someone was so "smart" as to extend the vector class, or maybe you apply this pattern some day in the future (because it "works fine") to another object which does just that. Bang, you're dead. And you don't even know where it came from.
What I really don't like about the code, even though it is strictly "legal", is the fact that it may lead to a condition which will crash or exhibit broken, unreproducible behaviour. However, it does not crash immediately. Code that is "broken" should crash immediately, so you see that something is wrong and are forced to fix it. Unluckily that's not the case here.
This will appear to work, possibly for years, until one day it doesn't. Eventually you'll have forgotten that a and b live on the current stack frame and reference the non-existing objects in the vector from some other location. Maybe you dynamically allocate the vector in a future revision of your code, since you pass it to another function. And maybe it will continue to appear working.
And then, you'll spend hours of your time (and likely the time of others) trying to find why a section of code that cannot possibly fail produces wrong results or crashes.
Warning against your second example.
This simple extension leads to undefined behavior:
#include <iostream>
#include <vector>

class A {
public:
    int m;
    A(int _m): m(_m) {}
};

int main(){
    std::vector<A*> units;
    for (int i = 0; i < 3; ++i) {
        A a(i);
        units.push_back(&a);
    }
    for (auto i : units) std::cout << i->m << " "; // output: 2 2 2 !!!!
    return 0;
}
In each loop, the pointer to each a is saved in units, but the objects that they point to go out of scope. In the case of my compiler, the memory address of each a was re-used each time, resulting in units holding three identical memory addresses -- all pointing to the final a object.

In C++, is passing an on-the-fly dynamically allocated object to a function (always) a bad idea?

I know the title of the question looks a bit mind-damaging, but I really don't know how to ask this in one phrase. I'll just show you what I mean:
void f(T *obj)
{
    // bla bla
}

int main()
{
    f(new T());
}
As far as I know, (almost) every new requires a delete, which requires a pointer (returned by new). In this case, the pointer returned by new isn't stored anywhere. So would this be a memory leak?
Does C++ work some kind of magic (invisible to the programmer) that deletes the object after the function ends or is this practice simply always a bad idea?
There is no particular magic and delete will not be called automatically.
It is definitely not "always a bad idea": if a function takes ownership of an object in some form, this is a perfectly valid way of calling such a function:
container.AddAndTakeOwnership(new MyItem(42));
Yes, it's always a bad idea; functions and constructors should be explicit about their ownership semantics and if they expect to take or share ownership of a passed pointer they should receive std::unique_ptr or std::shared_ptr respectively.
There are lots of legacy APIs which take raw pointers with ownership semantics, even in the standard library (e.g. the locale constructor taking a Facet * with ownership), but any new code should avoid this.
Even when constructing a unique_ptr or shared_ptr you can avoid using the new keyword, in the latter case using std::make_shared and in the former case std::make_unique (standardized in C++14).
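A minimal sketch of what that looks like for the code in the question (assuming the T type from the question and a C++14 compiler for std::make_unique):
#include <memory>

void f(std::unique_ptr<T> obj) // the signature now states that f takes ownership
{
    // bla bla
}

int main()
{
    f(std::make_unique<T>()); // no raw new anywhere, and no delete needed
}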
In this case, the pointer returned by new isn't stored anywhere. So
would this be a memory leak?
No, it is not necessarily a memory leak. The pointer is stored as an argument to f:
void f(T *obj)
// ^^^ here the pointer is "stored"
{
    // bla bla
    delete obj; // no memory leak if delete is called on obj
}

int main()
{
    f(new T());
    // ^^^^^^^ this "value" will be stored as an argument to f
}
Does C++ work some kind of magic (invisible to the programmer) that
deletes the object after the function ends or is this practice simply
always a bad idea?
No magic in your example. As I showed, delete must be called explicitly.
Better is to use a smart pointer; then the C++ "magic" works, and delete is not needed.
void f(std::unique_ptr<T> obj)
{
    // bla bla
}

int main()
{
    f(std::unique_ptr<T>(new T())); // note: unique_ptr's constructor is explicit,
                                    // so the raw new T() must be wrapped like this
}
The code shown will result in a memory leak. C++ does not have garbage collection unless you explicitly use a specialized framework to provide it.
The reason for this has to do with the way memory is managed in C/C++. For a local variable, like in your example, memory for the object is requested from the allocator (malloc) and then the pointer to the object exists on the stack. Because C/C++ can do arbitrarily complex pointer arithmetic, the compiler has no way of knowing whether there exists some other pointer somewhere to the object, so it cannot reclaim the memory when function f() ends.
In order to prevent the leak automatically, the memory would have to be allocated out of a managed heap, and every reference into this heap would have to be carefully tracked to determine when a given object was no longer being used. You would have to give up C's ability to do pointer arithmetic in order to get this capability.
For example, let's say the compiler could magically figure out that all normal references to obj were defunct and deleted the object (released the memory). What if you had some insanely complicated RUNTIME DEPENDENT expression like void* ptr = (&&&&(&&&*obj)/2++ - currenttime() - 567 + 3^2 % 52) etc;? How would the compiler know whether this ptr pointed to obj or not? There is no way to know. This is why there is no garbage collection. You can either have garbage collection OR complex runtime pointer arithmetic, not both.
This is often used when your function f() is storing the object somewhere, like an array (or any other data container) or a simple class member; in which case, deletion will (and must) take place somewhere else.
Otherwise it is not a good idea, because you will have to delete it manually at the end of the function anyway. In that case, you can just declare an automatic (on the stack) object and pass it by pointer.
No magic. In your case, after f is called, main returns back to the CRT's startup code and eventually the OS will clean up the "leak". It's not necessarily a bad idea: it could be giving f the ownership, and it is up to f to do its work and eventually delete. Some may call it bad practice, but it's probably out there in the wild.
Edit:
Although I see that code as no more dangerous than:
void f(T *obj)
{
    // bla bla
}

int main()
{
    T* test = new T();
    f(test);
}
Principally it is the same. It is saying to f: here is a pointer to some memory; it's yours, you look after it now.
In C++ passing a raw pointer is discouraged, as there are no associated ownership semantics (i.e. you cannot know who owns the pointer and thus who is responsible for deleting it). Thus when you do have a function that takes a pointer, you need to document very clearly whether the function is responsible for cleaning up the pointer or not. Of course documentation like this is very error prone, as the user has to read it.
It is more normal in C++ program to pass objects that describe the ownership semantics of the thing.
Pass by reference
The function is not taking ownership of the object. It will just use the object.
Pass a std::auto_ptr (or std::unique_ptr)
The function is being passed the pointer and ownership of the pointer.
Pass a std::shared_ptr
The function is being passed shared ownership of the pointer.
By using these techniques you not only document the ownership semantics but the objects used will also automatically control the lifespan of the object (thus relieving your function from calling delete).
As a result it is actually very rare to see a manual call to delete in modern C++ code.
So I would have written it like this:
void f(std::unique_ptr<T> obj)
{
    // bla bla
}

int main()
{
    f(std::unique_ptr<T>(new T()));
}