I get a bad feeling about this code
widget* GetNewWidget()
{
widget* theWidget = (widget *) malloc(sizeof(widget));
return theWidget;
}
Firstly, one should never cast the result of malloc() (nor, I suspect, use it in C++ (?)).
Secondly, won't theWidget be allocated on the stack?
If so, won't the caller trying to access after this function returns be undefined behaviour?
Can someone point to an authoritative URL explaining this?
[Update] I am thinking of this question: Can a local variable's memory be accessed outside its scope?
In summary: this code is perfectly fine
Returning a pointer is like returning an int: the very act of returning creates a bitwise copy.
Step by step, the code works as follows:
malloc(sizeof(widget));
Allocates a block of memory on the heap[1], starting at some address (let's call it a), and sizeof(widget) bytes long.
widget* theWidget = (widget *) malloc(sizeof(widget));
Stores the address a on the stack[2] in the variable theWidget. If malloc allocated a block at address 0x00001248, then theWidget now contains the value 0x00001248, as if it were an integer.
return theWidget;
Now causes the value of a to be returned, i.e., the value 0x00001248 gets written to wherever the return value is expected.
At no point is the address of theWidget used. Hence, there is no risk of accessing a dangling pointer to theWidget. Note that if your code had returned &theWidget, there would have been an issue.
[1] Or it might fail, and return NULL
[2] Or it might keep it in a register
On the stack you just allocated a pointer; it's not related to the object itself. :)
I never use malloc (it's a C thing; you shouldn't use it in C++), so I am not sure, but I doubt it's undefined behaviour.
If you were to write widget* theWidget = new widget(); instead, it would work correctly.
Even better, use smart pointers if you have C++11 or later (note that std::make_unique arrived in C++14; in plain C++11 you can write std::unique_ptr<widget>(new widget()))
std::unique_ptr<widget> GetNewWidget()
{
std::unique_ptr<widget> theWidget(std::make_unique<widget>());
return theWidget;
}
Or in this case you can write even smaller code, like this:
std::unique_ptr<widget> GetNewWidget()
{
return std::make_unique<widget>();
}
The above version will free the memory as soon as the unique pointer goes out of scope (unless you move it to another unique_ptr). It's worth spending some time reading about memory management in C++11.
Related
Is it always wise to set a pointer to NULL after a delete in legacy code without any smart pointers, to prevent dangling pointers? (bad design/architecture of the legacy code excluded)
int* var = new int(100);
delete var;
var = NULL;
Does it also make sense in destructors?
In a getter, does it make sense to test for NULL in second step?
Or is it undefined behaviour anyway?
Foo* getPointer() {
if (m_var!=NULL) { // <-is this wise
return m_var;
}
else {
return nullptr;
}
}
What about this formalism as an alternative? In which cases will it crash?
Foo* getPointer() {
if (m_var) { // <-
return m_var;
}
else {
return nullptr;
}
}
(Edit) Will the code in examples 3 and 4 crash if (A) NULL is used after delete, or (B) NULL is not used after delete?
Is it always wise to set a pointer to NULL after a delete in legacy code without any smart pointers, to prevent dangling pointers? (bad design/architecture of the legacy code excluded)
int* var = new int(100);
// ...
delete var;
var = NULL;
Only useful if you test var afterward.
If the scope ends, or if you assign another value, setting it to NULL is unneeded.
Does it also make sense in destructors?
Nullifying members in the destructor is useless, as you cannot access them afterward without UB anyway (though it might help with a debugger).
In a getter, does it make sense to test for NULL in a second step? Or is it undefined behaviour anyway?
[..]
[..]
if (m_var != NULL) and if (m_var) are equivalent.
The test is unneeded: if the pointer is nullptr, you return nullptr;
if the pointer is not nullptr, you return that pointer. So your getter can simply be
return m_var;
Avoid writing code like this
int* var = new int(100);
// ... do work ...
delete var;
This is prone to memory leaks if "do work" throws, returns, or otherwise breaks out of the current scope (it may not be the case right now, but later when "do work" needs to be extended or changed). Always wrap heap-allocated objects in RAII so that the destructor always runs on scope exit, freeing the memory.
If you do have code like this, then setting var to NULL or even better a bad value like -1 in a Debug build can be helpful in catching use-after-free and double-delete errors.
In case of a destructor:
Setting the pointer to NULL in a destructor is not needed.
In production code it's a waste of CPU time (writing a value that will never be read again).
In debug code it makes catching double-deletes harder. Some compilers fill deleted objects with a marker like 0xDDDDDDDD such that a second delete or any other dereference of the pointer will cause a memory access exception. If the pointer is set to NULL, delete will silently ignore it, hiding the error.
This question is really opinion-based, so I'll offer some opinions ... but also a justification for those opinions, which will hopefully be more useful for learning than the opinions themselves.
Is it always wise to use NULL after a delete in legacy code without any smartpointers to prevent dangling pointers? (bad design architecture of the legacy code excluded)
Short answer: no.
It is generally recommended to avoid raw pointers whenever possible, regardless of which C++ standard your code claims compliance with.
Even if you somehow find yourself needing to use a raw pointer, it is safer to ensure the pointer ceases to exist when no longer needed, rather than setting it to NULL. That can be achieved with scope: for example, the pointer is local to a scope, and that scope ends immediately after delete pointer, which absolutely prevents any subsequent use of the pointer. A pointer that cannot be used when no longer needed cannot be accidentally used, and does not need to be set to NULL. This also works for a pointer that is a member of a class, since the pointer ceases to exist when the containing object does, i.e. after the destructor completes.
The idiom of "set a pointer to NULL when no longer needed, and check for NULL before using it" doesn't prevent stupid mistakes. As a rough rule, any idiom that requires a programmer to remember to do something - such as setting a pointer to NULL, or comparing a pointer to NULL - is vulnerable to programmer mistakes (forgetting to do what they are required to do).
Does it also make sense in destructors?
Generally speaking, no. Once the destructor completes, the pointer (assuming it is a member of the class) will cease to exist as well. Setting it to NULL immediately before it ceases to exist achieves nothing.
If you have a class with a destructor that, for some reason, shares the pointer with other objects (i.e. the pointer's value, and presumably the object it points at, remain valid after the destructor completes), then the answer may be different. But that is an exceedingly rare use case, and one usually better avoided, since it makes the lifetimes of the pointer and of the object it points at harder to manage, and therefore makes it easier to introduce obscure bugs. Setting a pointer to NULL when done is generally not a solution to such bugs.
In a getter, does it make sense to test for NULL in a second step? Or is it undefined behaviour anyway?
Obviously that depends on how the pointer was initialised. If the pointer is uninitialised, even comparing it with NULL gives undefined behaviour.
In general terms, I would not do it. There will presumably be some code that initialises the pointer. If that code cannot appropriately initialise the pointer, then that code should deal with the problem in a way that prevents your function being called; examples include throwing an exception or terminating program execution. That allows your function to safely ASSUME the pointer points at a valid object.
What about this formalism as an alternative? In which cases will it crash?
The "formalism" is identical to the previous one - practically the difference is stylistic. In both cases, if m_var is uninitialised, accessing its value gives undefined behaviour. Otherwise the behaviour of the function is well-defined.
A crash is not guaranteed in any circumstances. Undefined behaviour is not required to result in a crash.
If the caller exhibits undefined behaviour (e.g. if your function returns NULL the caller dereferences it anyway) there is nothing your function can do to prevent that.
The case you describe remains relatively simple, because the variable is declared in a local scope.
But look for example at this scenario:
struct MyObject
{
public :
MyObject (int i){ m_piVal = new int(i); };
~MyObject (){
delete m_piVal;
};
public:
static int *m_piVal;
};
int* MyObject::m_piVal = NULL;
You may have a double free problem by writing this:
MyObject *pObj1 = new MyObject(1);
MyObject *pObj2 = new MyObject(2);
//...........
delete pObj1;
delete pObj2; // double free on the static pointer (m_piVal)
Or here:
struct MyObject2
{
public :
MyObject2 (int i){ m_piVal = new int(i); };
~MyObject2 (){
delete m_piVal;
};
public:
int *m_piVal;
};
when you write this:
MyObject2 Obj3 (3);
MyObject2 Obj4 = Obj3;
At destruction, you will have a double free here because Obj3.m_piVal == Obj4.m_piVal.
So there are some cases that need special attention (implementing a smart pointer, a copy constructor, ...) to manage the pointer correctly.
Case-1:
int* func1() {
return new int;
}
Case-2:
int& func2() {
return *(new int);
}
Yes, the return value of one needs to be stored in a pointer variable and the other in an int reference, and both are equally likely to create a memory leak if we forget to deallocate the heap memory.
So, for all practical purposes, they are equivalent, right?
No, the two are not equivalent from the point of view of readability: when you return a pointer, readers of your code think about the ownership of that pointer; when you return a reference, they know for sure that you own it.
It is natural for users of your API to delete a pointer that you return. Your API documentation needs to say whether the pointer needs to be deleted with delete[] or delete, or not at all, in cases where your library retains ownership of the object that it returns by pointer.
It is entirely unnatural for users of your API to think that they may need to delete anything returned to them by reference. It does not mean that you cannot do it - you certainly can, but it would be completely unexpected, so one should avoid doing it.
If the compiler is smart enough, the two cases may compile to the same machine code (my compiler does, at least), for example:
int* func1() {
return new int;
/*
push $04
call $00402afe
pop ecx
*/
}
int& func2() {
return *(new int);
/*
push $04
call $00402afe
pop ecx
*/
}
Other than the possibility of slightly different machine code being generated if the second case's dereferencing/re-referencing is not optimized away, they are essentially equivalent: the caller ends up receiving a memory address to the allocated int in both cases. From a technical standpoint, pointers and references are treated identically in machine code; the main difference is that the compiler validates that a reference is never unassigned, while a pointer can be. But from a coding standpoint, pointers and references are different beasts, with different semantics and rules of use.
I know the title of the question looks a bit mind-damaging, but I really don't know how to ask this in one phrase. I'll just show you what I mean:
void f(T *obj)
{
// bla bla
}
int main()
{
f(new T());
}
As far as I know, (almost) every new requires a delete, which requires a pointer (returned by new). In this case, the pointer returned by new isn't stored anywhere. So would this be a memory leak?
Does C++ work some kind of magic (invisible to the programmer) that deletes the object after the function ends or is this practice simply always a bad idea?
There is no particular magic and delete will not be called automatically.
It is definitely not "always a bad idea": if the function takes ownership of the object in some form, it is a perfectly valid way of calling such a function:
container.AddAndTakeOwnership(new MyItem(42));
Yes, it's always a bad idea; functions and constructors should be explicit about their ownership semantics and if they expect to take or share ownership of a passed pointer they should receive std::unique_ptr or std::shared_ptr respectively.
There are lots of legacy APIs which take raw pointers with ownership semantics, even in the standard library (e.g. the locale constructor taking a Facet * with ownership), but any new code should avoid this.
Even when constructing a unique_ptr or shared_ptr you can avoid using the new keyword: in the latter case with make_shared, and in the former by writing a make_unique function template (which was subsequently standardized in C++14).
In this case, the pointer returned by new isn't stored anywhere. So
would this be a memory leak?
No, it is not necessarily a memory leak. The pointer is stored as an argument to f:
void f(T *obj)
// ^^^ here pointer is "stored"
{
// bla bla
delete obj; // no memory leak if delete will be called on obj
}
int main()
{
f(new T());
// ^^^^^^^ this "value" will be stored as an argument to f
}
Does C++ work some kind of magic (invisible to the programmer) that
deletes the object after the function ends or is this practice simply
always a bad idea?
No magic in your example. As I showed, delete must be called explicitly.
Better is to use a smart pointer; then the C++ "magic" works, and delete is not needed.
void f(std::unique_ptr<T> obj)
{
// bla bla
}
int main()
{
f(std::unique_ptr<T>(new T())); // note: f(new T()) would not compile here, since unique_ptr's converting constructor is explicit
}
The code shown will result in a memory leak. C++ does not have garbage collection unless you explicitly use a specialized framework to provide it.
The reason for this has to do with the way memory is managed in C/C++. For an allocation like your example, memory for the object is requested from the heap and the pointer to it lives on the stack. Because C/C++ can do arbitrarily complex pointer arithmetic, the compiler has no way of knowing whether some other pointer to the object exists somewhere, so it cannot reclaim the memory when function f() ends.
In order to prevent the leak automatically, the memory would have to be allocated out of a managed heap, and every reference into this heap would have to be carefully tracked to determine when a given object was no longer being used. You would have to give up C's ability to do pointer arithmetic in order to get this capability.
For example, let's say the compiler could magically figure out that all normal references to obj were defunct and deleted the object (released the memory). What if you had some insanely complicated RUNTIME DEPENDENT expression like void* ptr = (&&&&(&&&*obj)/2++ - currenttime() - 567 + 3^2 % 52) etc; How would the compiler know whether this ptr pointed to obj or not? There is no way to know. This is why there is no garbage collection. You can either have garbage collection OR complex runtime pointer arithmetic, not both.
This is often used when your function f() is storing the object somewhere, like an array (or any other data container) or a simple class member; in which case, deletion will (and must) take place somewhere else.
Otherwise it is not a good idea, because you will have to delete it manually at the end of the function anyway. In that case, you can just declare an automatic (stack) object and pass it by pointer.
No magic. In your case, after f is called, main returns back to the CRT's startup code and eventually the OS will clean up the "leak". It's not necessarily a bad idea: it could be giving f the ownership, and it is then up to f to do its work and eventually delete. Some may call it bad practice, but it's probably out there in the wild.
Edit:
Although I see that code no more dangerous than:
void f(T *obj)
{
// bla bla
}
int main()
{
T* test = new T ();
f(test);
}
In principle it is the same. It's saying to f: here is a pointer to some memory, it's yours, you look after it now.
In C++ passing a raw pointer is discouraged, as there are no associated ownership semantics (i.e. you cannot know who owns the pointer and thus who is responsible for deleting it). Thus when you do have a function that takes a pointer, you need to document very clearly whether the function is responsible for cleaning up the pointer. Of course, documentation like this is very error prone, as the user has to read it.
It is more normal in C++ program to pass objects that describe the ownership semantics of the thing.
Pass by reference
The function is not taking ownership of the object. It will just use the object.
Pass a std::unique_ptr (or, historically, std::auto_ptr)
The function is being passed the pointer and ownership of the pointer.
Pass a std::shared_ptr
The function is being passed shared ownership of the pointer.
By using these techniques you not only document the ownership semantics but the objects used will also automatically control the lifespan of the object (thus relieving your function from calling delete).
As a result it is actually very rare to see a manual call to delete in modern C++ code.
So I would have written it like this:
void f(std::unique_ptr<T> obj)
{
// bla bla
}
int main()
{
f(std::unique_ptr<T>(new T()));
}
I've got a couple of questions regarding pointers. First:
ObjectType *p;
p->writeSomething();
Why is it possible to call a method on an object when the pointer hasn't been initialized? If I run that code I get the output from "writeSomething()" in my console window. Second:
ObjectType *p;
if(p==NULL)
cout<<"Null pointer";//This is printed out
p = new ObjectType;
delete p;
if(p==NULL)
cout<<"Null pointer";
else
cout<<"Pointer is not null";//This is printed out
Why isn't the pointer null in the second if statement, and how do I check whether a pointer is pointing to any memory address? I'm also wondering if there is any way to check whether some memory hasn't been released when a program is done executing, for example if you forget to write one delete statement in the code.
The first code is undefined behavior, anything can happen, even appearing to work. It's probably working because the call is resolved statically, and you're not accessing any members of the class.
For the second snippet delete doesn't set the pointer to NULL, it just releases the memory. The pointer is now dangling, as it points to memory you no longer own.
Your code does of course exhibit undefined behaviour, but here's an example of why it may appear possible to call a member function even if there is no object: If the member function doesn't refer to any member objects of the class, then it will never need to access any part of the memory which you haven't initialized. That means, your member function is essentially static.
As you know, member functions can be considered as normal, free functions which have an implicit instance object reference argument. For example, a simple class Foo defined like this,
struct Foo
{
void bar() { std::cout << "Hello\n"; }
};
could be implemented as a single, free function:
void __Foo_bar(Foo * self)  // "self" stands in for the implicit this pointer ("this" itself is a reserved keyword)
{
std::cout << "Hello\n";
}
Now when you say Foo * p; p->bar();, this amounts to a free function call __Foo_bar(p);. You end up passing an invalid pointer to the function, but since the function never makes use of the pointer, no harm is done.
On the other hand, if your class had a member object, like int Foo::n;, and if the member function was trying to access it, your implementation would try and access this->n, which would very likely cause an immediate problem since you'd actually be dereferencing an invalid pointer.
delete p;
deallocates memory, but it does not change the value of the address stored in p.
There is no method in standard C++ to detect that a pointer is referring to invalid memory. It is your responsibility not to de-reference an invalid pointer.
Your first example is undefined behaviour. One of the possible outcomes of undefined behaviour is that the program works the way you intended it to. Again, it is your responsibility not to write programs with undefined behaviour.
In your code, writeSomething() is probably a non-virtual member function that does not de-reference this which is why it happens to work for you, on your compiler. Most likely if you tried to refer to some member data fields then you would encounter a runtime error.
delete calls the destructor of ObjectType and then deallocates the memory, but it does not explicitly set your pointer to NULL.
That is something you have to do yourself, as a matter of programming practice.
#include <cstdio>
class baseclass
{
};
class derclass : public baseclass
{
public:
derclass(char* str)
{
mystr = str;
}
char* mystr;
};
baseclass* basec;
static void dostuff()
{
basec = (baseclass*)&derclass("wtf");
}
int main()
{
dostuff();
__asm // added after the answer was found; it makes the program fail
{
push 1
push 1
push 1
push 1
push 1
push 1
push 1
push 1
push 1
push 1
}
printf("%s", ((derclass*)basec)->mystr);
}
Ugh. This is one of those "don't ever do this" examples. In dostuff, you create a temporary of type derclass, take its address, and manage to pass it outside of dostuff (by assigning it to basec). Once the line creating the temporary is finished, accessing it via that pointer yields undefined behavior. That it works (i.e. your program prints "wtf") is certainly platform dependent.
Why does it work in this specific instance? To explain this requires delving deeper than just C++. You create a temporary of type derclass. Where is it stored? Probably as a very short-lived temporary on the stack. You take its address (an address on your stack), and store that.
Later, when you go to access it, you still have a pointer to that portion of your stack. Since nobody has since come along and reused that portion of the stack, the object's remnants are still there. Since the object's destructor doesn't do anything to wipe out the contents (which is, after all, just a pointer to "wtf" stored somewhere in your static data), you can still read it.
Try interjecting something which uses up a lot of stack between the dostuff and printf calls. Like, say, a call to a function which calculates factorial(10) recursively. I'll bet that the printf no longer works.
basec = (baseclass*)&derclass("wtf");
Here a temporary object of derclass is created and destroyed immediately when the ; is encountered in the dostuff() function. Hence, your basec pointer points to an invalid object.
As aJ notes, the temporary object you create is immediately destroyed. This doesn't exactly 'work': you're into undefined behaviour which may legally cause your monitor to catch on fire the next time you run it!
Hint: undefined behaviour - just say no.
Note that basec = (baseclass*)&derclass("wtf"); causes undefined behavior to be invoked. The problem is that derclass("wtf") creates a temporary object (of type derclass) the & in front of it will take the temporary object's address, which will then be assigned to basec. Then, at the end of the full expression, the temporary object will be destroyed, leaving basec with a pointer to a no longer existing object. When you later access this piece of memory (in (derclass*)basec)->mystr) you are invoking undefined behavior.
Since it's the nature of undefined behavior to allow the program to do anything it pleases, your program might even work as if the object still existed. But it might as well crash, format your hard drive, or invoke nasty nasal demons on you.
What you would have to do is assign the address of an object to basec which isn't destroyed as long as you use it. One way to do this would be to dynamically create an object: basec = new derclass("wtf").
It creates the temporary variable on the stack because it is local to the dostuff() function. Once dostuff exits, the stack unwinds, possibly leaving the object's bytes in memory exactly as they were. Now your pointer points to a spot on the stack that hopefully won't get clobbered by the call to printf when it passes in a pointer to stack memory that is no longer in use.
Usually stack that isn't being used isn't overwritten if you don't call other functions.
You could actually do some damage by calling a few functions and then changing the value of mystr. The characters of text would then become part of the executable code. A hacker's dream.
Try something like this:
void breakStuff()
{
char dummy[3];
strcpy( dummy, "blahblahblahblahblah" );
int i = 7;
i = i + 8;
i = i + 22;
printf( "**%d**", i );
}
The strcpy will write PAST the local variable and overwrite the code. It'll die horribly. Good times, noodle salad.
The instance pointed to by basec is a derclass; the casts just tell the compiler what to think of the pointer at any given moment.
Edit: strange that you can access the temporary later on. Does this still work if you allocate some other data on the stack?
Do you get a compiler warning from the (baseclass*) cast?