Checking for a null reference? - c++

Let's say you have something like this:
int& refint;
int* foo =0;
refint = *foo;
How could you verify if the reference is NULL to avoid a crash?

You can't late-initialize a reference like that. It has to be initialized when it's declared.
On Visual C++ I get
error C2530: 'refint' : references must be initialized
with your code.
If you 'fix' the code, the crash (strictly, undefined behaviour) happens at reference usage time in VC++ v10.
int* foo = 0;
int& refint(*foo);
int i(refint); // access violation here
The way to make this safe is to check the pointer at reference initialization or assignment time.
int* foo =0;
if (foo)
{
int& refint(*foo);
int i(refint);
}
though that still does not guarantee foo points to usable memory, nor that it remains so while the reference is in scope.

You don't; by the time you have a "null" reference you already have undefined behaviour. You should always check whether a pointer is null before trying to form a reference by dereferencing the pointer.
(Your code is illegal; you can't create an uninitialized reference and try and bind it by assigning it; you can only bind it during initialization.)
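As a minimal sketch of "check the pointer before forming the reference": the helper names checked_deref and read_or below are made up for illustration and are not part of any library. The point is only that the null test happens on the pointer, before any reference exists.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper: turn a pointer into a reference only after
// verifying that the pointer is non-null.
template <typename T>
T& checked_deref(T* ptr)
{
    if (ptr == nullptr)
        throw std::invalid_argument("checked_deref: null pointer");
    return *ptr;  // safe to bind: ptr is known to be non-null here
}

// Hypothetical caller: reads through the pointer, or yields a fallback
// value when the pointer is null.
int read_or(const int* ptr, int fallback)
{
    try {
        return checked_deref(ptr);
    } catch (const std::invalid_argument&) {
        return fallback;
    }
}

void example()
{
    int value = 5;
    int& refint = checked_deref(&value);  // refers to value
    assert(&refint == &value);

    int* foo = nullptr;
    assert(read_or(foo, -1) == -1);       // null is caught before any dereference
}
```

Like the if (foo) check, this only rules out null; it cannot tell whether a non-null pointer still points at a live object.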

In general, you can't.
Whoever "creates a null reference" (or tries to, I should say) has already invoked undefined behavior, so the code might (or might not) crash before you get a chance to check anything.
Whoever created the reference should have done:
int *foo = 0;
if (foo) {
int &refint = *foo;
... use refint for something ...
}
Normally it's considered the caller's problem if they've written *foo when foo is null, and it's not one function's responsibility to check for that kind of error in the code of other functions. But you could litter things like assert(&refint); through your code. They might help catch errors made by your callers, since after all for any function you write there's a reasonable chance the caller is yourself.

All the answers above are correct, but if for some reason you want to do this I thought at least one person should provide an answer. I am currently trying to track down a bad reference in some source code and it would be useful to see if someone has deleted this reference and set it to null at some point. Hopefully this won't generate too many downvotes.
#include <iostream>
int main()
{
int* foo = nullptr;
int& refint = *foo;
if(&refint == nullptr)
std::cout << "Null" << std::endl;
else
std::cout << "Value " << refint << std::endl;
}
Output:
Null

To make the above code compile, you will have to switch the order:
int* foo =0;
int& refint = *foo; // on actual PCs, this code will crash here
(There may be older processor or runtime architectures where this worked.)

Having said all of the above, if you do want something like a null reference, use boost::optional<>; it works like a charm.
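For readers without Boost, here is a rough standard-library approximation of the same idea. It is a swapped-in technique, not boost::optional itself: in C++17 std::optional cannot hold a reference directly, so the sketch wraps a std::reference_wrapper. The names OptionalIntRef and maybe_ref are invented for this example.

```cpp
#include <cassert>
#include <functional>
#include <optional>

// Hypothetical alias: "maybe a reference to an int" (requires C++17).
using OptionalIntRef = std::optional<std::reference_wrapper<int>>;

// Hypothetical lookup: yields a reference only when the pointer is non-null.
OptionalIntRef maybe_ref(int* ptr)
{
    if (ptr == nullptr)
        return std::nullopt;   // the "null reference" case, represented safely
    return std::ref(*ptr);
}

void example()
{
    int value = 1;
    OptionalIntRef some = maybe_ref(&value);
    OptionalIntRef none = maybe_ref(nullptr);

    assert(some.has_value());
    assert(!none.has_value());

    some->get() = 5;           // writes through to value
    assert(value == 5);
}
```

The emptiness test (has_value()) is well defined, unlike comparing the address of a reference against nullptr.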

You don't need to, references cannot be null.
Read the manual.

Related

How can a C++ reference be changed (by any means necessary)

The C++ language doesn't let you change a reference after it is assigned. However, I had a debugging need/desire to change the reference to help debug something. Is there a hacky way to basically overwrite the reference implementation with a new pointer? Once you get an address to the object you want to change, you can cast it to whatever you want and overwrite it. I could not figure out how to get the memory address of the underlying reference instance; applying & to the reference doesn't give you the address of the reference itself, but the address of the object it refers to.
I realize this is obviously going to invoke undefined behavior, and this is just an experiment. A third party library has a bug with global reference that was not getting constructed before the code is exercised, and I want to see if I can fix it by setting the reference myself. At this point, it became a challenge to see if it is even possible. I know you can do this in assembly language, if you can reference the symbol table directly.
I imagine something like this. These are globally scoped variables.
Apple a;
Apple& ref = a;
Later I want ref to refer to a new object instance b and leave a alone.
Apple b;
ref = b; // that doesn't work. that just sets a=b.
&ref = &b; // that doesn't work. the compiler complains.
uint64_t addr = find_symbol_by_any_means_necessary(ref);
*(Apple**)addr = &b; // this should work if I could get addr
Please don't remind me this is a bad idea. I know it is a bad idea. Think of it as a challenge. This is for debug only, to test a hypothesis quickly. I want to learn something about the internals of C++ binary code. (Please tell me if it is impossible because of system page protection... I suppose you could get a seg fault if the references are placed in a holy place).
(The system is CentOS 7, compiler is Intel although I could use gcc for this experiment).
I don't think there is a way to re-direct the object that a standalone reference variable references.
If a reference is contained in a struct as a member variable, you can easily change the object the reference variable references. It's most likely UB but it works with my current version of g++, g++ 4.8.4.
Here's an example program that demonstrates a method.
#include <iostream>
#include <cstring>
struct Foo
{
int& ref;
};
int main()
{
int a = 10;
int b = 20;
Foo foo = {a}; // foo.ref is a reference to a
std::cout << foo.ref << std::endl;
// Use memcpy to change what foo.ref references
int* bPtr = &b;
std::memcpy(&foo, &bPtr, sizeof(bPtr));
// Now, foo.ref is a reference to b
std::cout << foo.ref << std::endl;
// Changing foo.ref changes b
foo.ref = 30;
std::cout << b << std::endl;
}
Output:
10
20
30
It's generally not possible, because the fact that a reference is not reseatable allows a lot of optimizations.
For instance, a reference is often implemented as a pointer, but the compiler may also notice that you often use a fixed offset. So besides storing the pointer, the compiler may decide to store the pointer plus offset. It may even decide to only store the pointer plus offset.
Another optimization is to store the reference as an address in a CPU register. Since it can't change, the compiler doesn't need to reload it.
So your statement that you can change it in assembly is rather misleading. You have no idea what the representation of the reference is after optimization, and this optimization will be situationally dependent.
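If the code is yours to change (which it is not in the third-party case above), the supported way to get something that behaves like a rebindable reference is std::reference_wrapper or a plain pointer. The sketch below contrasts the two behaviours; the Apple struct with an id member is an assumption made up for the example.

```cpp
#include <cassert>
#include <functional>

struct Apple { int id; };  // hypothetical stand-in for the real Apple type

void example()
{
    Apple a{1};
    Apple b{2};

    Apple& ref = a;
    ref = b;                      // copies b into a; ref still refers to a
    assert(&ref == &a);
    assert(a.id == 2);

    a.id = 1;                     // restore a for the second half
    std::reference_wrapper<Apple> rebindable = std::ref(a);
    rebindable = std::ref(b);     // rebinds the wrapper; a is left untouched
    assert(&rebindable.get() == &b);
    assert(a.id == 1);
}
```

This does nothing for an existing Apple& in someone else's binary; it only shows what the language offers when rebinding is a design requirement.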

Strange results with object creation and binding

Mistakenly I wrote something daft, which to my surprise worked.
#include <iostream>
class A
{
public:
void print()
{
std::cout << "You can't/won't see me !!!" << std::endl;
}
A* get_me_the_current_object()
{
return this;
}
};
int main()
{
A* abc = dynamic_cast<A*>(abc);
abc->print();
}
Here, A* abc = dynamic_cast<A*>(abc), I am doing dynamic_cast on a pointer, which isn't declared. But, it works, so I assumed that the above statement is broken as:
A* abc;
abc = dynamic_cast<A*>(abc);
and therefore, it works. However, on trying some more weird scenarios such as:
A* abc;
abc->print();
and, further
A* abc = abc->get_me_the_current_object();
abc->print();
I was flabbergasted, looking at how these examples worked and the mapping was done.
Can someone please elaborate on how these are working? Thanks in advance.
You've made the common mistake of thinking that undefined behaviour and C++ bugs mean you should expect to see a dramatic crash or your computer catching fire. Sometimes nothing happens. That doesn't mean the code "works", because it's still got a bug, it's just that the symptoms haven't shown up ... yet.
But, it works, so I assumed that the above statement is broken as:
Yes, all you're doing is converting an uninitialized pointer to the same type, i.e. no conversion needed, so the compiler does nothing. Your pointer is still the same type and is still uninitialized.
This is similar to this:
int i = i;
This is valid according to the grammar of C++, because i is in scope at that point, but is undefined because it copies an uninitialized object. It's unlikely to actually set your computer on fire though, it appears to "work".
Can someone please elaborate on how these are working?
Technically you're dereferencing an invalid pointer, which is undefined behaviour, but since your member functions don't actually use any members of the object, the invalid this pointer is not dereferenced, so the code "works" (or at least appears to.)
This is similar to:
void function(A* that)
{
std::cout << "Hello, world!\n";
}
A* p;
function(p);
Because the that pointer is not used (like the this pointer is not used in your member functions) this doesn't necessarily crash, although it might do on implementations where even copying an uninitialized pointer could cause a hardware fault. In your example it seems that your compiler doesn't need to dereference abc to call a non-static member function, and passing it as the hidden this parameter does not cause a hardware fault, but the behaviour is still undefined even though it doesn't fail in an obvious way such as a segfault.
abc is uninitialized and points to an undefined location in memory, but your methods don't read anything from *this so they won't crash.
The fact that they don't crash is just an accident of this particular implementation, though; the behavior is still undefined.

How to detect references to members of temporary objects

My colleague recently compiled our program in Windows, and discovered a bug of the sort:
std::string a = "hello ";
std::string b = "world";
const char *p = (a+b).c_str();
printf("%s\n", p);
which for some reason did not crash in our Linux executables.
None of our compilers give any kind of warning, so we are now worried that this error might exist in the code.
Although we can grep for c_str() occurrences and do a visual inspection, there is a possibility that one might have also done the following:
struct I {
int num;
I() { num=0; }
};
struct X {
I *m;
X() { m = new I; }
~X() { delete m; }
I get() { return *m; } // version 1, or
I& get() { return *m; } // version 2
};
and accessed it like:
I& a = X().get(); // will get a reference to a temporary, or a valid copy?
cout << a.num;
instead of :
cout << X().get().num;
which is safe (isn't it?)
Question: Is there a way I can catch such errors (perhaps using the compiler, or even an assertion) ?
I need to be sure that if author of struct X changes get() between version 1 and 2 that the program will warn for the error
Simple answer: In general you cannot catch those errors, and the reason is that there are similar constructs that might be perfectly fine, so the compiler would have to know the semantics of all the functions to be able to warn you.
In simpler cases, like obtaining the address of a temporary, many compilers already warn you, but in the general case, it is quite difficult if not impossible for the compiler to know.
For some similar example to the .c_str() consider:
std::vector< const char * > v;
v.push_back( "Hi" );
const char* p = *v.begin();
The call to begin returns a temporary, similar to the expression (a+b), and you are calling a member function of that temporary (operator*) that returns a const char*, which is quite similar to your original case (from the point of view of the types involved). The problem is that in this case the pointee is still valid after the call, while in yours (.c_str()) it isn't, but it is part of the semantics of the operation, not the syntax that the compiler can check for you. The same goes for the .get() example, the compiler does not know if the returned reference is to an object that will be valid after the expression or not.
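For classes you control, such as X from the question, one option not mentioned so far is C++11 ref-qualified member functions: delete the overload that would be chosen for temporaries, and the dangling case stops compiling. This is a sketch of that technique, not a general detector; it does nothing for std::string::c_str(), and it also rejects the harmless X().get().num form. The deleted copy operations are an addition to keep the example itself well behaved.

```cpp
#include <cassert>

struct I {
    int num;
    I() : num(0) {}
};

struct X {
    I* m;
    X() : m(new I) {}
    ~X() { delete m; }
    X(const X&) = delete;             // the raw owning pointer must not be copied
    X& operator=(const X&) = delete;

    I& get() & { return *m; }         // allowed on named (lvalue) objects
    I& get() && = delete;             // selected for temporaries, so X().get() is rejected
};

void example()
{
    X x;
    I& a = x.get();                   // fine: x outlives the reference
    a.num = 3;
    assert(x.get().num == 3);
    // I& bad = X().get();            // would not compile: deleted overload
}
```

With this in place, a change of get() from returning I to returning I& cannot silently turn existing temporary-based call sites into dangling references.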
All these fall under the category of Undefined Behavior.
Check out this question's solution, I think it does something similar to what you're looking for:
C++ catching dangling reference
There are runtime based solutions which instrument the code to check invalid pointer accesses. I've only used mudflap so far (which is integrated in GCC since version 4.0). mudflap tries to track each pointer (and reference) in the code and checks each access if the pointer/reference actually points to an alive object of its base type.
Here is an example: {...}

C++: References as return values

I noticed I don't get any compiler errors when I accidentally forget to return from a function that is supposed to return a reference. I wrote some small tests to see what actually happens and I got more confused than anything.
struct Foo
{
int x;
Foo() {
x = 3;
}
};
Foo* foo = new Foo;
Foo& test(bool flag) {
if (flag)
return *foo;
}
If test() doesn't (explicitly) return a value, I will still get something returned. However the Foo object that is returned is not initialized using the default constructor — that's because x is different from 3 in the non-explicitly returned value.
What is actually happening when you don't return a reference? If this is a feature, is it safe to use it as a means to return dummy objects in case errors occur, as opposed to returning a null pointer? (See example below.)
class FooFactory
{
// Return reference...
Foo& createFooRef() {
Foo* foo = new Foo;
bool success = foo->load();
if (success)
return *foo;
// Implicit (and safe?) return value on failure?
}
// ... as opposed to returning a pointer.
Foo* createFooPtr() {
Foo* foo = new Foo;
bool success = foo->load();
if (success)
return foo;
else
return 0;
}
// Yes, I am aware of the memory leaks,
// but that's not the point of the example.
};
Most compilers will give you a warning about this, but you may have to crank up the warning level of the compiler to see it.
No, this is not safe. It is bad. It may lead to stack corruption by just returning whatever happens to be on the stack at the time. As you've already seen, it does not use a constructor for you. If you want a default constructed object, you have to do that yourself (but be careful about returning a reference to a temporary object. That's also bad).
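To make the failure case explicit instead of falling off the end of the function, every path has to return or throw. Below is a minimal sketch of two ways to do that. The loaded parameter stands in for the result of the question's foo->load() call, and requireFoo is a hypothetical name; neither comes from the original code.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

struct Foo {
    int x;
    Foo() : x(3) {}
};

// Option 1: report failure through a null smart pointer.
// `loaded` is a stand-in for the outcome of the hypothetical load() step.
std::unique_ptr<Foo> createFooPtr(bool loaded)
{
    std::unique_ptr<Foo> foo(new Foo);
    if (loaded)
        return foo;
    return nullptr;               // every path returns a value
}

// Option 2: keep the reference return type, but throw instead of
// reaching the end of the function without a return.
Foo& requireFoo(Foo* foo)
{
    if (foo == nullptr)
        throw std::invalid_argument("requireFoo: null pointer");
    return *foo;
}

void example()
{
    assert(createFooPtr(true)->x == 3);
    assert(createFooPtr(false) == nullptr);

    Foo f;
    assert(&requireFoo(&f) == &f);
}
```

The smart pointer also takes care of the leak that the question sets aside.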
The usual way to lower references in compilers is to pointers. For a reference-returning function, it will mean you get an arbitrary address represented, whatever was in the register or stack slot used for the return value.
Formally in the language, the effects are undefined.
This is undefined behaviour, and infinite bad things may happen, or indeed, may not happen, or might happen sometimes, or might simultaneously happen and not happen if it doesn't like you, or send engineers from Microsoft to your house to beat you over the head with a baseball bat.
The described behaviour is not limited to functions that return references. The following code will also compile:
int func1( int i )
{
if( i )
return 3; // C4715 warning, nothing returned if i == 0
}
I'm not sure why they generate just a warning, not an error (there might be an option in settings to turn it into an error), but you will get undefined behaviour if you call such a function and it reaches the end without returning.
References are typically just syntactic sugar for pointers, so the return is going to grab a pointer's worth of bytes from the stack for the return value. If you aren't giving it that it will just grab garbage.
I had to use the function and then add -Wall to get g++ to complain:
g++ -Wall foo.cc
foo.cc: In member function 'Foo& FooFactory::createFooRef()':
foo.cc:19: warning: control reaches end of non-void function
Have you tried compiling with optimisations on (/O1 or higher) and treating warnings as errors? That might fail. I remember something along those lines happening in GCC 4.1: in debug mode you could forget to return the reference and a reference would still come back, but as soon as you turned any optimisations on the code would still compile yet no longer return the reference. When coding in a text editor (as I was in those days) it was a total pain and a huge surprise to me.

Why does this code only print 42?

Could somebody please explain to me why does this code only print "42" instead of "created\n42"?
#include <iostream>
#include <string>
#include <memory>
using namespace std;
class MyClass
{
public:
MyClass() {cout<<"created"<<endl;};
int solution() {return 42;}
virtual ~MyClass() {};
};
int main(int argc, char *argv[])
{
auto_ptr<MyClass> ptr;
cout<<ptr->solution()<<endl;
return 0;
}
BTW I tried this code with different values in solution and I always get the "right" value, so it doesn't seem to be a random lucky value.
Because it exhibits undefined behaviour - you dereference a null pointer.
When you say:
auto_ptr<MyClass> ptr;
you create an autopointer which doesn't point to anything. This is equivalent to saying:
MyClass * ptr = NULL;
Then when you say:
cout<<ptr->solution()<<endl;
you dereference this null pointer. Doing that is undefined in C++ - for your implementation, it appears to work.
std::auto_ptr will not automatically create an object for you. That is, ptr in main as it stands is initialized to null. Dereferencing this is undefined behavior, and you just happen to be getting lucky and getting 42 as a result.
If you actually create the object:
int main(int argc, char *argv[])
{
auto_ptr<MyClass> ptr(new MyClass);
cout << ptr->solution() << endl;
return 0;
}
You will get the output you expect.
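Since std::auto_ptr was deprecated in C++11 and removed in C++17, here is roughly the same fix written with std::unique_ptr, plus an explicit emptiness check. The helper solution_or is a made-up name for this sketch, and the constructor output is left out to keep it short.

```cpp
#include <cassert>
#include <memory>

class MyClass
{
public:
    int solution() { return 42; }
    virtual ~MyClass() {}
};

// Hypothetical helper: calls solution() only when the smart pointer
// actually owns an object, otherwise returns the fallback.
int solution_or(const std::unique_ptr<MyClass>& ptr, int fallback)
{
    return ptr ? ptr->solution() : fallback;
}

void example()
{
    std::unique_ptr<MyClass> empty;               // owns nothing, like "auto_ptr<MyClass> ptr;"
    std::unique_ptr<MyClass> owner(new MyClass);  // actually constructs a MyClass

    assert(solution_or(empty, -1) == -1);         // no dereference of the null pointer
    assert(solution_or(owner, -1) == 42);
}
```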
First, keep in mind that the -> operator of auto_ptr is essentially forwarded on to the contained pointer. So for this discussion, your code in main becomes equivalent to:
MyClass* ptr = NULL;
cout << ptr->solution() << endl;
Then note that compilers tend to implement member functions in ways that act very much as if they were non-member functions with the this pointer passed as another function argument. So from your current compiler's point of view, your code in main acts as if it was:
MyClass* ptr = NULL;
cout << solution(ptr) << endl;
with solution written as:
int solution(MyClass* this) { return 42; } // conceptual only: 'this' cannot really be used as a parameter name
In which case it becomes obvious why there wasn't a crash.
However as others have already mentioned, these are internal details of how compilers implement C++, which are not specified by the language standard. So in theory this code could work as described here on one compiler but crash or do something else entirely on another compiler.
But in practice, even if the standard doesn't guarantee this behavior, any particular compiler could guarantee it if they want to. For instance: since MFC relies on this behavior, it is very unlikely that Visual Studio will ever stop supporting it. Of course, you would have to research each particular compiler where your code might be used to make sure that they actually guarantee this behavior.
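The lowering described in this answer can be written out as legal C++ by using free functions with an explicit pointer parameter. This is only an illustration of the mental model; the names solution_lowered, value_lowered and the value member are assumptions, and nothing here says a given compiler generates exactly this. Passing a null pointer to a free function that never dereferences it is well defined, which is what makes the first call below legitimate, unlike ptr->solution() on a null ptr.

```cpp
#include <cassert>

struct MyClass {
    int value;  // hypothetical data member, used only by value_lowered
};

// Roughly what a member function that ignores its object looks like
// once the hidden object pointer is spelled out as a parameter.
int solution_lowered(MyClass* self)
{
    (void)self;        // never dereferenced, so its value does not matter
    return 42;
}

// A function that reads a member must dereference the pointer,
// so it needs a real object behind it.
int value_lowered(MyClass* self)
{
    assert(self != nullptr);
    return self->value;
}

void example()
{
    assert(solution_lowered(nullptr) == 42);  // fine for a free function

    MyClass obj{7};
    assert(value_lowered(&obj) == 7);
}
```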
Because you don't know the question to the answer xD
It seems you're not calling the constructor, right?
You are not creating an instance of the object.
You are only creating a smart pointer.
When you call the method you are de-referencing a NULL pointer, so as Neil mentioned you are now in undefined behavior. But since your code does not try and access any member variables it luckily does not crash.
Try this:
auto_ptr<MyClass> ptr(new MyClass);
Because ptr holds a null pointer (a default-constructed auto_ptr owns no object) and you're lucky. You should first create an object with new for it:
auto_ptr<MyClass> ptr( new MyClass );
You're not getting a crash because the "solution" method doesn't need to actually use the class members. If you were returning a member or something, you'd probably get a crash.