Interesting issue when attempting to reset pointer - c++

Being that my C++ isn't that great, this may be a really simple/obvious answer, but it sure has me stumped. Keep in mind it's kinda late here and I'm a little tired. I got this code here:
void TestFunc(int *pVar)
{
cout << endl << *pVar << endl;
delete pVar;
pVar = nullptr;
}
int main(int argc, char *argv[])
{
int *z(new int);
*z = 5;
TestFunc(z);
if (z == nullptr)
cout << "z Successfully Deleted!" << endl;
else cout << "z NOT deleted!" << endl;
return 0;
}
The program compiles just fine with no errors or warnings. When I run it, it displays 5, just as I'd expect. However, it says z NOT deleted!. I am curious as to why pVar is not getting set to nullptr even though I explicitly set it in my TestFunc() function. Any help would be appreciated. If it matters, this is Visual Studio 2010 and just a regular unmanaged C++ application.

Because it's being passed by value (i.e. as a copy).
If you want the variable itself to be passed (rather than just its value, which is copied), use
void TestFunc(int *&pVar)
instead.
Note that delete only cares about the pointee, not the pointer. So "deleting" a copy of a pointer deletes the same thing as the original pointer, because in either case you're deleting their targets, which are the same.

TestFunc accepts the pointer by value, so setting it to null inside the function actually only affects the copy in the function and is not visible to the caller. So pVar is set to null, but z in main() is not because those are different variables.
To make the change visible to the caller pass the pointer by reference or via a double pointer.
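A minimal sketch of the two fixes mentioned above, passing the pointer by reference and passing it via a double pointer. The names DeleteByReference and DeleteByDoublePointer are made up for illustration; only the parameter type differs from the original TestFunc.

```cpp
// Takes the caller's pointer by reference, so assigning nullptr
// changes the caller's variable rather than a copy of it.
void DeleteByReference(int *&pVar)
{
    delete pVar;
    pVar = nullptr;
}

// Takes the address of the caller's pointer; the caller writes
// DeleteByDoublePointer(&z).
void DeleteByDoublePointer(int **ppVar)
{
    delete *ppVar;
    *ppVar = nullptr;
}
```

With either version, z compares equal to nullptr in main() after the call.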

It is late (sorry!!!).
z is passed by value. z is not pVar. You assigned the value of nullptr to pVar and not to z.

The "idiom" you present is generally advocated as Safe Delete.
It was already known in C (though it was a free back then): zero-out the pointer you just freed / deleted to avoid doing so twice.
The trouble is, you zero out the current pointer, but any copy of it still points to the same memory area, which now contains garbage.
Memory handling is a difficult topic; the fundamental concept is ownership. At any point in time, the owners of a particular memory zone should be well identified, and they should have the responsibility of returning it to the system when appropriate.
The first step in this direction is the use of smart pointers, for example std::unique_ptr or boost::scoped_ptr. For shared ownership (experts only), std::shared_ptr might come in handy, but you're not there yet.
If you write a delete in your code, it means you are exposing yourself to leaks. It is not bad in itself, but it calls for careful review, and makes the code brittle (i.e., likely to break on change). In your case:
int main() {
boost::scoped_ptr<int> i(new int(5)); // allocate an int holding 5
foo(*i);
} // memory returned to system by ~scoped_ptr()
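The same idea can be sketched with std::unique_ptr from the standard library instead of boost::scoped_ptr. The answer does not show foo, so the version below (which doubles its argument) is purely a placeholder, and MakeAndUse is a hypothetical name.

```cpp
#include <memory>

// Placeholder for the unspecified foo() in the answer: it only reads the value.
int foo(int value)
{
    return value * 2;
}

int MakeAndUse()
{
    std::unique_ptr<int> i(new int(5)); // i is the sole owner of the int
    return foo(*i);
} // the int is deleted here by ~unique_ptr(), no explicit delete needed
```

No delete appears anywhere, so there is nothing to forget and no pointer left to zero out.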

Related

Unique pointer still holds the object after moving

I'm going through some tutorials on how smart pointers work in C++, but I'm stuck on the first one I tried: the unique pointer. I'm following guidelines from wikipedia, cppreference and cplusplus. I've also looked at this answer already. A unique pointer is supposed to be the only pointer that has ownership over a certain memory cell/block if I understood this correctly. This means that only the unique pointer (should) point to that cell and no other pointer. From wikipedia they use the following code as an example:
std::unique_ptr<int> p1(new int(5));
std::unique_ptr<int> p2 = p1; //Compile error.
std::unique_ptr<int> p3 = std::move(p1); //Transfers ownership. p3 now owns the memory and p1 is rendered invalid.
p3.reset(); //Deletes the memory.
p1.reset(); //Does nothing.
Until the second line, that worked fine for me when I test it. However, after moving the first unique pointer to a second unique pointer, I find that both pointers have access to the same object. I thought the whole idea was for the first pointer to be rendered useless so to speak? I expected a null pointer or some undetermined result. The code I ran:
class Figure {
public:
Figure() {}
void three() {
cout << "three" << endl;
}
};
class SubFig : public Figure {
public:
void printA() {
cout << "printed a" << endl;
}
};
int main()
{
unique_ptr<SubFig> testing (new SubFig());
testing->three();
unique_ptr<SubFig> testing2 = move(testing);
cout << "ok" << endl;
int t;
cin >> t; // used to halt execution so I can verify everything works up til here
testing->three(); // why is this not throwing a runtime error?
}
Here, testing has been moved to testing2, so I'm surprised to find I can still call the method three() on testing.
Also, calling reset() doesn't seem to delete the memory like it said it would. When I modify the main method to become:
int main()
{
unique_ptr<SubFig> testing (new SubFig());
testing->three();
unique_ptr<SubFig> testing2 = move(testing);
cout << "ok" << endl;
int t;
cin >> t;
testing.reset(); // normally this should have no effect since the pointer should be invalid, but I added it anyway
testing2.reset();
testing2->three();
}
Here I expect three() not to work for testing2 since the example from wikipedia mentioned the memory should be deleted by resetting. I'm still printing out printed a as if everything is fine. That seems weird to me.
So can anyone explain to me why:
moving from one unique pointer to another unique pointer doesn't make the first one invalid?
resetting does not actually remove the memory? What's actually happening when reset() is called?
Essentially you invoke a member function through a null pointer:
int main()
{
SubFig* testing = nullptr;
testing->three();
}
... which is undefined behavior.
From 20.8.1 Class template unique_ptr (N4296)
4 Additionally, u can, upon request, transfer ownership to another
unique pointer u2. Upon completion of such a transfer, the following
postconditions hold:
u2.p is equal to the pre-transfer u.p,
u.p is equal to nullptr, and
if the pre-transfer u.d maintained state, such state has been transferred to u2.d.
(emphasis mine)
After the std::move() the original pointer testing is set to nullptr.
The likely reason std::unique_ptr doesn't check for null access to throw a runtime error is that it would slow down every time you used the std::unique_ptr. By not having a runtime check the compiler is able to optimize the std::unique_ptr call away entirely, making it just as efficient as using a raw pointer.
The reason you didn't get a crash when calling the nullptr is likely because the function you called doesn't access the (non-existent) object's memory. But it is undefined behavior so anything could happen.
After std::unique_ptr<int> p3 = std::move(p1); your original pointer p1 holds nullptr, so dereferencing it results in undefined behavior. Simply stated, never ever do it.
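A small sketch of what the quoted postconditions mean in practice: the moved-from unique_ptr can safely be tested, just not dereferenced. The helper names MovedFromIsNull and ValueOr are invented for this example.

```cpp
#include <memory>
#include <utility>

// Returns true if the moved-from pointer compares equal to nullptr
// and the destination now owns the value.
bool MovedFromIsNull()
{
    std::unique_ptr<int> p1(new int(5));
    std::unique_ptr<int> p3 = std::move(p1); // ownership moves to p3

    // p1 may still be tested or reassigned; it must not be dereferenced.
    return p1 == nullptr && p3 != nullptr && *p3 == 5;
}

// Dereferences only when the pointer actually owns something.
int ValueOr(const std::unique_ptr<int> &p, int fallback)
{
    return p ? *p : fallback;
}
```

Checking the pointer before use, as ValueOr does, is the caller's job; unique_ptr itself performs no such check on operator-> or operator*.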

Return newly allocated pointer or update the object through parameters?

I'm actually working on pointers to user-defined objects but for simplicity, I'll demonstrate the situation with integers.
int* f(){
return new int(5);
}
void f2(int* i){
*i = 10;
}
int main(){
int* a;
int* b = new int();
a = f();
f2(b);
std::cout << *a << std::endl; // 5
std::cout << *b << std::endl; // 10
delete b;
delete a;
return 0;
}
Consider that in functions f() and f2() there are some more complex calculations that determine the value of the pointer to be returned (f()) or updated through parameters (f2()).
Since both of them work, I wonder if there is a reason to choose one over the other?
From looking at the toy code, my first thought is just to put the f/f2 code into the actual object's constructor and do away with the free functions entirely. But assuming that isn't an option, it's mostly a matter of style/preference. Here are a few considerations:
The first is easier to read (in my opinion) because it's obvious that the pointer is an output value. When you pass a (nonconst) pointer as a parameter, it's hard to tell at a glance whether it's input, output or both.
Another reason to prefer the first is if you subscribe to the school of thought that says objects should be made immutable whenever possible in order to simplify reasoning about the code and to preclude thread safety problems. If you're attempting that, then the only real choice is for f() to create the object, configure it, and return const Foo* (again, assuming you can't just move the code to the constructor).
A reason to prefer the second is that it allows you to configure objects that were created elsewhere, and the objects can be either dynamic or automatic. Though this can actually be a point against this approach depending on context--sometimes it's best to know that objects of a certain type will always be created and initialized in one spot.
If the allocation function f() does the same thing as new, then just call new. You can do whatever initialisation in the object's construction.
As a general rule, try to avoid passing around raw pointers, if possible. However that may not be possible if the object must outlive the function that creates it.
For a simple case, like you have shown, you might do something like this.
void DoStuff(MyObj &obj)
{
// whatever
}
int Func()
{
MyObj o(someVal);
DoStuff(o);
// etc
}
f2 is better only because the ownership of the int is crystal clear, and because you can allocate it however you want. That's reason enough to pick it.
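One way to make ownership explicit in both variants, sketched under the assumption that C++11 smart pointers are available. MakeValue and UpdateValue are hypothetical stand-ins for f() and f2(); they are not from the question.

```cpp
#include <memory>

// Variant of f(): the return type says the caller becomes the owner.
std::unique_ptr<int> MakeValue()
{
    return std::unique_ptr<int>(new int(5));
}

// Variant of f2(): a reference says "this object already exists and
// I only modify it"; it works for automatic and dynamic objects alike.
void UpdateValue(int &i)
{
    i = 10;
}
```

The caller can then write UpdateValue(b) for a local int b, or UpdateValue(*a) for a value obtained from MakeValue(), and never needs a matching delete.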

Nullptr and checking if a pointer points to a valid object

In a couple of my older code projects when I had never heard of smart pointers, whenever I needed to check whether the pointer still pointed to a valid object, I would always do something like this...
object * meh = new object;
if(meh)
meh->member;
Or when I needed to delete the object safely, something like this
if(meh)
{
delete meh;
meh = 0;
}
Well, now I have learned, the hard way, about the problems that can arise from using objects and pointers in boolean expressions with literal numbers. And now I've also learned of the not so new but pretty cool feature of C++, the nullptr keyword. But now I'm curious.
I've already gone through and revised most of my code so that, for example, when deleting objects I now write
if(meh)
{
delete meh;
meh = nullptr;
}
Now I'm wondering about the boolean. When you pass just say an int into an if statement like this,
int meh;
if(meh)
Then it implicitly checks for zero without you needing to write it.
if(meh == 0) // does the exact same check
Now, will C++ do the same for pointers? If I pass in a char * like this to an if statement?
char * meh;
if(meh)
Then will it implicitly compare it with nullptr? Because of how long I have been writing these ifs like this, it is second nature at this point to check whether the pointer is valid before using it by typing if (object *) and then calling its members. If this is not the functionality, why not? Too difficult to implement? It would solve some problems by removing yet another tiny way you could mess up your code.
In C, anything that's not 0 is true. So, you certainly can use:
if (ptrToObject)
ptrToObject->doSomething();
to safely dereference pointers.
C++11 changes the game a bit, nullptr_t is a type of which nullptr is an instance; the representation of nullptr_t is implementation specific. So a compiler may define nullptr_t however it wants. It need only make sure it can enforce proper restriction on the casting of a nullptr_t to different types--of which boolean is allowed--and make sure it can distinguish between a nullptr_t and 0.
So nullptr will be properly and implicitly cast to the boolean false so long as the compiler follows the C++11 language specification. And the above snippet still works.
Deleting the pointed-to object does not change the pointer itself; it stays non-null until you reset it:
delete ptrToObject;
assert(ptrToObject);
ptrToObject = nullptr;
assert(!ptrToObject);
Because of how long I have been writing these ifs like this, it is second nature at this point to check whether the pointer is valid before using it by typing if (object *) and then calling its members.
No. Please maintain a proper graph of objects (preferably using unique/smart pointers). As pointed out, there's no way to determine if a pointer that is not nullptr points to a valid object or not. The onus is on you to maintain the lifecycle anyway.. this is why the pointer wrappers exist in the first place.
In fact, because the life-cycle of shared and weak pointers is well defined, they have syntactic sugar that lets you use them the way you want to use bare pointers, where valid pointers have a value and all others are nullptr:
Shared
#include <iostream>
#include <memory>
void report(std::shared_ptr<int> ptr)
{
if (ptr) {
std::cout << "*ptr=" << *ptr << "\n";
} else {
std::cout << "ptr is not a valid pointer.\n";
}
}
int main()
{
std::shared_ptr<int> ptr;
report(ptr);
ptr = std::make_shared<int>(7);
report(ptr);
}
Weak
#include <iostream>
#include <memory>
void observe(std::weak_ptr<int> weak)
{
if (auto observe = weak.lock()) {
std::cout << "\tobserve() able to lock weak_ptr<>, value=" << *observe << "\n";
} else {
std::cout << "\tobserve() unable to lock weak_ptr<>\n";
}
}
int main()
{
std::weak_ptr<int> weak;
std::cout << "weak_ptr<> not yet initialized\n";
observe(weak);
{
auto shared = std::make_shared<int>(42);
weak = shared;
std::cout << "weak_ptr<> initialized with shared_ptr.\n";
observe(weak);
}
std::cout << "shared_ptr<> has been destructed due to scope exit.\n";
observe(weak);
}
Now, will C++ do the same for pointers? If pass in a char * like this to an if statement?
So to answer the question: with bare pointers, no; if (ptr) only tells you whether the pointer is null, not whether it points to a valid object. With wrapped pointers, yes.
Wrap your pointers, folks.
It's not possible to test whether a pointer points to a valid object or not. If the pointer is not null but does not point to a valid object, then using the pointer causes undefined behaviour. To avoid this sort of error, the onus is on you to be careful with the lifetime of objects being pointed to; and the smart pointer classes help with this task.
If meh is a raw pointer then there is no difference whatsoever between if (meh) and if (meh != 0) and if (meh != nullptr). They all proceed iff the pointer is not null.
The literal 0 is a null pointer constant, so it converts implicitly to a null pointer value of any pointer type.
The pattern "always set a pointer to zero after invalidating it so that you know a pointer that's non-zero is valid" is an anti-pattern. What happens if you have two pointers to the same object? Setting one to zero does not affect the other.
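The two claims in this answer can be checked with a short sketch. All function names below are invented for illustration, and the second part deliberately uses an automatic object so that no deleted memory is touched.

```cpp
// Each function tests the same thing for a raw pointer: "is it non-null?"
bool ImplicitCheck(const char *p) { return p ? true : false; }
bool CompareWithZero(const char *p) { return p != 0; }
bool CompareWithNullptr(const char *p) { return p != nullptr; }

// Setting one pointer to null leaves a copy of it unchanged.
bool CopyUnaffectedByReset()
{
    int value = 7;
    int *first = &value;
    int *second = first; // second pointer to the same object
    first = nullptr;     // only first changes
    return first == nullptr && second == &value;
}
```

If value had been deleted rather than merely forgotten by first, second would still be non-null and would now dangle, which is why a non-null raw pointer says nothing about validity.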

Return by reference

Please see the following code snippets. In the second function I am returning a reference. I declare a local variable in the function and return a reference to it. As the variable is local, I believe its life ends when the function exits. My question is: why is it possible to access the value from the caller without any exceptions even though the original variable is deleted?
int& b=funcMulRef(20,3);
int* a= funcMul(20,3);
int* funcMul(int x,int y)
{
int* MulRes = new int;
*MulRes = (x*y);
return MulRes;
}
int& funcMulRef(int x,int y)
{
int MulRes ;
MulRes = (x*y);
return MulRes;
}
Regards,
JOhn
The behaviour of the second function is simply undefined; anything can happen, and in many circumstances, it will appear to work, simply because nothing has overwritten where the result used to be stored on the stack.
You are accessing data that is no longer in scope.
The memory probably still has the data in it though so it appears to work properly but is likely to be reused at any time and the value will be overwritten.
The next time you call any function or allocate a local stack variable it's very likely to reuse that memory for the new data and overwrite what you had there before. It's undefined behaviour.
The original value isn't deleted, simply because actually erasing it would cost extra computation.
The value is still there, but the memory space is no longer yours, and is actually undefined.
You are pointing to a space in memory that can be overrun by the program.
No, you shouldn't do this. The result of accessing residual data on the stack is undefined. Beside that, if your return value is of class type, its destructor will have already been called.
Are you trying to avoid temporary objects? If so, you might be interested in this:
http://en.wikipedia.org/wiki/Return_value_optimization
It most likely won't work in these cases :
funcMulRef(10,3) + funcMulRef(100,500)
alternatively, in a more nasty way :
std::cout << "10*3=" << funcMulRef(10,3) << " 100*500=" << funcMulRef(100,500) << std::endl;
gcc will warn about this kind of error if you use -Wall
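For a plain int there is nothing to gain from returning a reference, so the well-defined alternative is simply to return by value. A minimal sketch; funcMulValue is a hypothetical name, not part of the question's code.

```cpp
// Returns the product by value: the caller receives its own copy,
// so nothing refers to the local variable after the function returns.
int funcMulValue(int x, int y)
{
    int MulRes = x * y;
    return MulRes;
}
```

Expressions such as funcMulValue(10,3) + funcMulValue(100,500) are then safe, because each call yields an independent value.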

C++ Objects: When should I use pointer or reference

I can use an object as pointer to it, or its reference. I understand that the difference is that pointers have to be deleted manually, and references remain until they are out of scope.
When should I use each of them? What is the practical difference?
Neither of these questions answered my doubts:
Pointer vs. Reference
C++ difference between reference, objects and pointers
A reference is basically a pointer with restrictions (has to be bound on creation, can't be rebound/null). If it makes sense for your code to use these restrictions, then using a reference instead of a pointer allows the compiler to warn you about accidentally violating them.
It's a lot like the const qualifier: the language could exist without it, it's just there as a bonus feature of sorts that makes it easier to develop safe code.
"pointers I have to delete and reference they remain until their scope finish."
No, that's completely wrong.
Objects which are allocated with new must be deleted[*]. Objects which are not allocated with new must not be deleted. It is possible to have a pointer to an object that was not allocated with new, and it is possible to have a reference to an object that was allocated with new.
A pointer or a reference is a way of accessing an object, but is not the object itself, and has no bearing on how the object was created. The conceptual difference is that a reference is a name for an object, and a pointer is an object containing the address of another object. The practical differences, how you choose which one to use, include the syntax of each, and the fact that references can't be null and can't be reseated.
[*] with delete. An array allocated with new[] must be deleted with delete[]. There are tools available that can help keep track of allocated resources and make these calls for you, called smart pointers, so it should be quite rare to explicitly make the call yourself, as opposed to just arranging for it to be done, but nevertheless it must be done.
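To illustrate that the choice between pointer and reference is independent of how the object was created, here is a sketch covering all four combinations. The function names are made up for this example, and std::unique_ptr stands in for the smart pointers mentioned in the footnote.

```cpp
#include <memory>

// Adds one to whatever int the reference names.
void IncrementByReference(int &value) { ++value; }

// Adds one through a pointer; a null pointer is tolerated and ignored.
void IncrementByPointer(int *value)
{
    if (value) ++*value;
}

// Exercises all four combinations and returns the sum of both ints.
int AccessBothWays()
{
    int automatic = 1;                        // not allocated with new
    std::unique_ptr<int> dynamic(new int(1)); // allocated with new, deleted by unique_ptr

    IncrementByPointer(&automatic);    // pointer to an automatic object
    IncrementByReference(automatic);   // reference to an automatic object
    IncrementByPointer(dynamic.get()); // pointer to a dynamic object
    IncrementByReference(*dynamic);    // reference to a dynamic object

    return automatic + *dynamic; // each int was incremented twice
}
```

Only the object created with new is ever deleted, and that happens exactly once, regardless of how many pointers or references were used to reach it.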
suszterpatt already gave a good explanation. If you want a rule of thumb that is easy to remember, I would suggest the following:
If possible use references, use pointers only if you can not avoid them.
Even shorter: Prefer references over pointers.
Here's another answer (perhaps I should've edited the first one, but since it has a different focus, I thought it would be OK to have them separate).
When you create an object with new, the memory for it is reserved and it persists until you call delete on it - but the life span of the identifier holding the pointer is still limited to the code block's end. If you create objects in a function and append them to an external list, the objects may remain safely in memory after the function returns and you can still reference them without the identifier.
Here's a (simplified) example from Umbra, a C++ framework I'm developing. There's a list of modules (pointers to objects) stored in the engine. The engine can append an object to that list:
void UmbraEngine::addModule (UmbraModule * module) {
modules.push(module);
module->id = modules.size() - 1;
}
Retrieve one:
UmbraModule * UmbraEngine::getModule (int id) {
for (UmbraModule **it=modules.begin(); it != modules.end(); it++) {
if ((*it)->id == id) return *it;
}
return NULL; //no module with this id
}
Now, I can add and get modules without ever knowing their identifiers:
int main() {
UmbraEngine e;
for (int i = 0; i < 10; i++) {
e.addModule(new UmbraModule());
}
UmbraModule * m = e.getModule(5); //OK
cout << m << endl; //"0x127f10" or whatever
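for (int j = 0; j < 10; j++) {
UmbraModule mm; //not a pointer; destroyed at the end of each iteration
e.addModule(&mm); //the stored pointer dangles once mm is gone
}
m = e.getModule(15); //undefined behaviour: getModule reads through the dangling pointers
cout << m << endl; //no reliable result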
}
The modules list persists throughout the entire duration of the program, I don't need to care about the modules' life span if they're instantiated with new :). So that's basically it - with pointers, you can have long-lived objects that don't ever need an identifier (or a name, if you will) in order to reference them :).
Another nice, but very simple example is this:
void getVal (int * a) {
*a = 10;
}
int main() {
int b;
getVal(&b);
return b;
}
You have many situations wherein a parameter does not exist or is invalid and this can depend on runtime semantics of the code. In such situations you can use a pointer and set it to NULL (0) to signal this state. Apart from this,
A pointer can be re-assigned to a new state. A reference cannot. This is desirable in some situations.
A pointer helps transfer ownership semantics. This is especially useful in a multi-threaded environment if the parameter state is used to execute in a separate thread and you do not usually poll until the thread has exited. Now the thread can delete it.
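The "parameter may be absent" case described above might look like the following sketch. EffectiveTimeout and its default of 1000 are invented for illustration only.

```cpp
// Returns *timeoutMs when the caller supplied one, otherwise a default.
// A null pointer means "no value given".
int EffectiveTimeout(const int *timeoutMs)
{
    const int defaultMs = 1000; // arbitrary default chosen for the example
    return timeoutMs ? *timeoutMs : defaultMs;
}
```

A reference parameter could not express the "no value" state, since a reference must always be bound to an object.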
Erm... not exactly. It's the IDENTIFIER that has a scope. When you create an object using new, but its identifier's scope ends, you may end up with a memory leak (or not - depends on what you want to achieve) - the object is in the memory, but you have no means of referencing it anymore.
The difference is that a pointer is an address in memory, so if you have, say, this code:
int * a = new int;
a is a pointer. You can print it - and you'll get something like "0x0023F1" - it's just that: an address. It has no value (although some value is stored in the memory at that address).
int b = 10;
b is a variable with a value of 10. If you print it, you'll get 10.
Now, if you want a to point to b's address, you can do:
a = &b; //a points to b's address
or if you want the address pointed by a to have b's value:
*a = b; //value of b is assigned to the address pointed by a
Please compile this sample and comment/uncomment the two lines marked comment/uncomment to see the difference (note WHERE the identifiers point and to WHAT VALUE). I hope the output will be self-explanatory.
#include <iostream>
using namespace std;
int main()
{
int * a = new int(0); //initialised so that printing *a below is well defined
int b = 10;
cout << "address of a: " << a << endl;
cout << "address of b: " << &b << endl;
cout << "value of a: " << *a << endl;
cout << "value of b: " << b << endl;
a = &b; //comment/uncomment
//*a = b; //comment/uncomment
cout << "address of a: " << a << endl;
cout << "address of b: " << &b << endl;
cout << "value of a: " << *a << endl;
cout << "value of b: " << b << endl;
}
Let's answer the last question first. Then the first question will make more sense.
Q: "What is the practical difference[ between a pointer and a reference]?"
A: A reference is just a local pseudonym for another variable. If you pass a parameter by reference, then that parameter is exactly the same variable as the one that was listed in the calling statement. However, internally there usually is no difference between a pointer and a reference. References provide "syntax sugar" by allowing you to reduce the amount of typing you have to do when all you really wanted was access to a single instance of a given variable.
Q: "When should I use each of them?"
A: That's going to be a matter of personal preference. Here's the basic rule I follow. If I'm going to need to manipulate a variable in another scope, and that variable is either an intrinsic type, a class that should be used like an intrinsic type (i.e. std::string, etc...), or a const class instance, then I pass by reference. Otherwise, I'll pass by pointer.
The thing is you cannot rebind a reference to another object. References are bound when they are initialized and cannot be null or rebound. So pointers aren't redundant, if that was your doubt :)
As my C++ teacher used to put it, pointers point to the memory location while references are aliases. Hence the main advantage is that they can be used in the same way as the name of the object they refer to, but in a scope where the object is not otherwise available, by passing it there.
While pointers can be redirected to some other location, references, being like constant pointers, can't be redirected. So references can't be used for traversing arrays in a function, etc.
However, a pointer, being a separate entity, takes up some memory, whereas a reference may take no additional space at all (whether a reference needs storage is unspecified, and compilers often optimize it away). This is one of its advantages.
I have also read that the processing time for references is less, as
int & i = b ;
i++ ;
takes less time than
int * j = &b ;
(*j) ++ ;
but I am yet to confirm this. If anyone can throw light on this claim it would be great.
Comments are welcome :)
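A sketch of the two points in this answer: traversing an array requires something that can be redirected, and incrementing through a reference or a pointer has the same effect. The function names are hypothetical, and the code makes no attempt to measure the timing claim.

```cpp
#include <cstddef>

// Walks the array by redirecting one pointer; a reference could not be
// re-pointed at the next element like this.
int SumWithPointer(const int *first, std::size_t count)
{
    int sum = 0;
    for (const int *p = first; p != first + count; ++p) {
        sum += *p;
    }
    return sum;
}

// Incrementing through a reference and through a pointer has the same effect.
int IncrementBothWays(int b)
{
    int &i = b;
    i++;
    int *j = &b;
    (*j)++;
    return b; // incremented twice
}
```

Whether either form is faster depends on the compiler; in a simple case like this an optimizing compiler is free to generate the same code for both.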