As, the title says:
Why is calling non virtual member function on deleted pointer an undefined behavior?
Note the Question does not ask if it is an Undefined Behavior, it asks Why it is undefined behavior.
Consider the following program:
#include<iostream>
class Myclass
{
//int i
public:
void doSomething()
{
std::cout<<"Inside doSomething";
//i = 10;
}
};
int main()
{
Myclass *ptr = new Myclass;
delete ptr;
ptr->doSomething();
return 0;
}
In the above code, the compiler does not actually dereference this while calling member function doSomething(). Note that the function is not an virtual function & the compilers convert the member function call to a usual function call by passing this as the first parameter to the function(As I understand this is implementation defined). They can do so because the compiler can exactly determine which function to call at compile time itself. So practically, calling the member function through deleted pointer does not dereference the this. The this is dereferenced only if any member is accessed inside the function body.(i.e: Uncommenting code in above example that accesses i)
If an member is not accessed within the function there is no purpose that the above code should actually invoke undefined behavior.
So why does the standard mandate that calling the non virtual member function through deleted pointer is an undefined behavior, when in fact it can reliably say that dereferencing the this should be the statement which should cause undefined behavior? Is it merely for sake of simplicity for users of the language that standard simply generalizes it or is there some deeper semantic involved in this mandate?
My feeling is that perhaps since it is implementation defined how compilers can invoke the member function may be that is the reason standard cannot enforce the actual point where UB occurs.
Can someone confirm?
Because the number of cases in which it might be reliable are so slim, and doing it is still an ineffably stupid idea. There's no benefit to defining the behaviour.
So why does the standard mandate that calling the non virtual member function through deleted pointer is an undefined behavior, when in fact it can reliably say that dereferencing the this should be the statement which should cause undefined behavior?
[expr.ref] paragraph 2 says that a member function call such as ptr->doSomething() is equivalent to (*ptr).doSomething() so calling a non-static member function is a dereference. If the pointer is invalid that's undefined behaviour.
Whether the generated code actually needs to dereference the pointer for specific cases is not relevant, the abstract machine that the compiler models does do a dereference in principle.
Complicating the language to define exactly which cases would be allowed as long as they don't access any members would have almost zero benefit. In the case where you can't see the function definition you have no idea if calling it would be safe, because you can't know if the function uses this or not.
Just don't do it, there's no good reason to, and it's a Good Thing that the language forbids it.
In C++ language (according to C++03) the very attempt to use the value of an invalid pointer is causing undefined behavior already. There's no need to dereference it for the UB to happen. Just reading the pointer value is enough. The concept of "invalid value" that causes UB when you merely attempt to read that value actually extends to almost all scalar types, not just to pointers.
After delete the pointer is generally invalid in that specific sense, i.e. reading a pointer that supposedly points to something that has just been "deleted" leads to undefined behavior.
int *p = new int();
delete p;
int *p1 = p; // <- undefined behavior
Calling a member function through an invalid pointer is just a specific case of the above. The pointer is used as an argument for the implicit parameter this. Passing a pointer is an non-reference argument is an act of reading it, which is why the behavior is undefined in your example.
So, your question really boils down to why reading invalid pointer values causes undefined behavior.
Well, there could be many platform-specific reasons for that. For example, on some platforms the act of reading a pointer might lead to the pointer value being loaded into some dedicated address-specific register. If the pointer is invalid, the hardware/OS might detect it immediately and trigger a program fault. In fact, this is how our popular x86 platform works with regard to segment registers. The only reason we don't hear much about it is that the popular OSes stick to flat memory model that simply does not actively use segment registers.
C++11 actually states that dereferencing invalid pointer values causes undefined behavior, while all other uses of invalid pointer value cause implementation-defined behavior. It also notes that implementation-defined behavior in case of "copying an invalid pointer" might lead to "a system-generated runtime fault". So it might actually be possible to carefully maneuver one's way through the labyrinth of C++11 specification and successfully arrive at the conclusion that calling a non-virtual method through an invalid pointer should result in implementation-defined behavior mentioned above. By in any case the possibility of "a system-generated runtime fault" will always be there.
Dereferencing of this in this case is effectively an implementation detail. I'm not saying that the this pointer is not defined by the standard, because it is, but from a semantically abstracted standpoint what is the purpose of allowing the use of objects that have been destroyed, just because there is a corner case in which in practice it will be "safe"? None. So it's not. No object exists, so you may not call a function on it.
Related
Is there a way, in purely standard C++, to somehow call a function generated within a buffer?
Let's assume:
We've used implementation-specific knowledge (and obv. architecture-specific) to generate the byte sequence implementing some function in an array of char.
This implementation correctly matches the implementation's calling convention, and other relevant parts of the ABI, so that it will actually function correctly if called as C++ function
The buffer is correctly marked as executable
I.e., all the architecture / ABI hurdles have been overcome, all we need to do is actually call this block of code.
The question is: is there a standards-compliant way to create a function pointer to this and call said function pointer without hitting undefined behavior?
As I understand it, it is implementation-defined if we can cast a pointer to an object type to a function pointer type. I also believe that it is implementation-defined if said pointer will point to the same logical address as the original pointer. The obstacle is the call: from the perspective of the implementation, all I've done is try and cast a pointer to an array of chars into a function pointer and then call it. If said ptr was one to an object type, I know this would be violating strict aliasing rules, and would be undefined behavior. But is it undefined behavior or merely implementation-defined what will occur if I attempt to call this function pointer?
As I understand your question, you ask about code such as:
char buffer[200] = "some valid operations";
void (*func)() = reinterpret_cast<void(*)()>(buffer);
func();
I see in the standard in C++ expr.call 7.6.1.3 Function call:
A function call is a postfix expression followed by parentheses containing a possibly empty, comma-separated list of initializer-clauses which constitute the arguments to the function.
[...]
The postfix expression shall have function type or function pointer type.
For a call to a non-member function or to a static member function, the postfix expression shall either be an lvalue that refers to a function [...], or have function pointer type.
The func in the above code is an lvalue that refers to a char array and not to a function. The following points in the standard describe other cases of function call expression, such as virtual function call or call to a destructor of some object. These points also do not apply here. To summarize, the standard does not define what will happen when the function call expression is applied to a lvalue that refers to a object that is a char array. Because it's not defined, hence it's undefined behavior.
is there a standards-compliant way to create a function pointer to this
Just cast it, via reinterpret_cast. The resulting value is implementation-defined.
and call said function pointer without hitting undefined behavior?
No.
Is calling a function pointer to generated code undefined behavior?
Yes.
Is there a way, in purely standard C++, to somehow call a function generated within a buffer?
No.
But is it undefined behavior or merely implementation-defined what will occur if I attempt to call this function pointer?
Undefined behavior.
It is implementation-defined what the conversion means, so the implementation can choose to interpret it as a pointer to some function “defined by the implementation” (as if its standard library contained all possible functions). I’m not sure that implementations actually document support for this technique (as required for implementation-defined behavior), but it’s only reasonable to interpret that as a bug in their documentation.
The following code (or its equivalent which uses explicit casts of null literal to get rid of temporary variable) is often used to calculate the offset of a specific member variable within a class or struct:
class Class {
public:
int first;
int second;
};
Class* ptr = 0;
size_t offset = reinterpret_cast<char*>(&ptr->second) -
reinterpret_cast<char*>(ptr);
&ptr->second looks like it is equivalent to the following:
&(ptr->second)
which in turn is equivalent to
&((*ptr).second)
which dereferences an object instance pointer and yields undefined behavior for null pointers.
So is the original fine or does it yield UB?
Despite the fact that it does nothing, char* foo = 0; *foo; is could be undefined behavior.
Dereferencing a null pointer is could be undefined behavior. And yes , ptr->foo is equivalent to (*ptr).foo, and *ptr dereferences a null pointer.
There is currently an open issue in the working groups about if *(char*)0 is undefined behavior if you don't read or write to it. Parts of the standard imply it is, other parts imply it is not. The current notes there seem to lean towards making it defined.
Now, this is in theory. How about in practice?
Under most compilers, this works because no checks are done at dereferencing time: memory around where null pointer point to is guarded against access, and the above expression simply takes an address of something around null, it does not read or write the value there.
This is why cpp reference offsetof lists pretty much that trick as a possible implementation. The fact that some (many? most? every one I've checked?) compilers implement offsetof in a similar or equivalent manner does not mean that the behavior is well defined under the C++ standard.
However, given the ambiguity, compilers are free to add checks at every instruction that dereferences a pointer, and execute arbitrary code (fail fast error reporting, for example) if null is indeed dereferenced. Such instrumentation might even be useful to find bugs where they occur, instead of where the symptom occurs. And on systems where there is writable memory near 0 such instrumentation could be key (pre-OSX MacOS had some writable memory that controlled system functions near 0).
Such compilers could still write offsetof that way, and introduce pragmas or the like to block the instrumentation in the generated code. Or they could switch to an intrinsic.
Going a step further, C++ leaves lots of latitude on how non-standard-layout data is arranged. In theory, classes could be implemented as rather complex data structures and not the nearly standard-layout structures we have grown to expect, and the code would still be valid C++. Accessing member variables to non-standard-layout types and taking their address could be problematic: I do not know if there is any guarantee that the offset of a member variable in a non-standard layout type does not change between instances!
Finally, some compilers have aggressive optimization settings that find code that executes undefined behavior (at least under certain branches or conditions), and uses that to mark that branch as unreachable. If it is decided that null dereference is undefined behavior, this could be a problem. A classic example is gcc's aggressive signed integer overflow branch eliminator. If the standard dictates something is undefined behavior, the compiler is free to consider that branch unreachable. If the null dereference is not behind a branch in a function, the compiler is free to declare all code that calls that function to be unreachable, and recurse.
And it would be free to do this in not the current, but the next version of your compiler.
Writing code that is standards-valid is not just about writing code that compiles today cleanly. While the degree to which dereferencing and not using a null pointer is defined is currently ambiguous, relying on something that is only ambiguously defined is risky.
What is the boolean value that is returned when a fucntion call is made on a null object in C++?
ClassDummy* dummy = NULL;
if (!dummy->dummy_function(1,2,3)) {
// Do Something
}
Shouldn't this return an error according to C++11 standards?
Unless dummy has been declared at namespace scope, it is uninitialized and its value is unspecified, i.e. it may or may not be null. Invoking a member function on a nullptr, or on a pointer that is pointing to invalid memory, is undefined behavior.
You might get away with the correct result if the member function you invoke doesn't access any other data members of the class; in other words, if it doesn't dereference the this pointer. However, regardless of whether you obtain the expected result or not, it is still undefined behavior.
Compilers are not required to detect such invalid function calls. If you do something obvious, like initialize an object pointer to nullptr and then invoke a member function on it, you may see a diagnostic from the compiler, but this is not guaranteed.
In the original question dummy had an unspecified value, maybe null. The effect after the edit isn't changed, though: The behavior is undefined. Its not an error any tool is required to detect: it is the mandate of the programmer to call member functions only on valid objects.
Calling a method through a null pointer has undefined behavior. The code is invalid, but the compiler is not required to tell you that it is.
The C++ standard defines many cases where, even though the code is invalid, the compiler is not required to give a "diagnostic". In many cases, the reason is just because it is very difficult for the compiler to determine whether or not the code is valid. In your particular code, this is pretty easy, and some compilers may in fact give a warning about it if you use the proper warning level. However, it wouldn't be too difficult to construct a more complicated example where the compiler wouldn't easily be able to tell whether the pointer was null or not. Standardizing exactly how complex the code has to be before the compiler doesn't need to give a diagnostic would be difficult, and probably pointless, so it is just left up to each implementation to make that decision.
Trying to dereference a null pointer like this will result in undefined behavior.
(Typically, a core dump or memory access exception on most systems.)
Calling non-static member functions and accessing non-static member variables should produce an error.
Some compilers and run time environments allow use of static member variables and member functions with a NULL pointer but that is still undefined behavior according to the standard.
See C++ static const access through a NULL pointer and the accepted answer for more details.
As, the title says:
Why is calling non virtual member function on deleted pointer an undefined behavior?
Note the Question does not ask if it is an Undefined Behavior, it asks Why it is undefined behavior.
Consider the following program:
#include<iostream>
class Myclass
{
//int i
public:
void doSomething()
{
std::cout<<"Inside doSomething";
//i = 10;
}
};
int main()
{
Myclass *ptr = new Myclass;
delete ptr;
ptr->doSomething();
return 0;
}
In the above code, the compiler does not actually dereference this while calling member function doSomething(). Note that the function is not an virtual function & the compilers convert the member function call to a usual function call by passing this as the first parameter to the function(As I understand this is implementation defined). They can do so because the compiler can exactly determine which function to call at compile time itself. So practically, calling the member function through deleted pointer does not dereference the this. The this is dereferenced only if any member is accessed inside the function body.(i.e: Uncommenting code in above example that accesses i)
If an member is not accessed within the function there is no purpose that the above code should actually invoke undefined behavior.
So why does the standard mandate that calling the non virtual member function through deleted pointer is an undefined behavior, when in fact it can reliably say that dereferencing the this should be the statement which should cause undefined behavior? Is it merely for sake of simplicity for users of the language that standard simply generalizes it or is there some deeper semantic involved in this mandate?
My feeling is that perhaps since it is implementation defined how compilers can invoke the member function may be that is the reason standard cannot enforce the actual point where UB occurs.
Can someone confirm?
Because the number of cases in which it might be reliable are so slim, and doing it is still an ineffably stupid idea. There's no benefit to defining the behaviour.
So why does the standard mandate that calling the non virtual member function through deleted pointer is an undefined behavior, when in fact it can reliably say that dereferencing the this should be the statement which should cause undefined behavior?
[expr.ref] paragraph 2 says that a member function call such as ptr->doSomething() is equivalent to (*ptr).doSomething() so calling a non-static member function is a dereference. If the pointer is invalid that's undefined behaviour.
Whether the generated code actually needs to dereference the pointer for specific cases is not relevant, the abstract machine that the compiler models does do a dereference in principle.
Complicating the language to define exactly which cases would be allowed as long as they don't access any members would have almost zero benefit. In the case where you can't see the function definition you have no idea if calling it would be safe, because you can't know if the function uses this or not.
Just don't do it, there's no good reason to, and it's a Good Thing that the language forbids it.
In C++ language (according to C++03) the very attempt to use the value of an invalid pointer is causing undefined behavior already. There's no need to dereference it for the UB to happen. Just reading the pointer value is enough. The concept of "invalid value" that causes UB when you merely attempt to read that value actually extends to almost all scalar types, not just to pointers.
After delete the pointer is generally invalid in that specific sense, i.e. reading a pointer that supposedly points to something that has just been "deleted" leads to undefined behavior.
int *p = new int();
delete p;
int *p1 = p; // <- undefined behavior
Calling a member function through an invalid pointer is just a specific case of the above. The pointer is used as an argument for the implicit parameter this. Passing a pointer is an non-reference argument is an act of reading it, which is why the behavior is undefined in your example.
So, your question really boils down to why reading invalid pointer values causes undefined behavior.
Well, there could be many platform-specific reasons for that. For example, on some platforms the act of reading a pointer might lead to the pointer value being loaded into some dedicated address-specific register. If the pointer is invalid, the hardware/OS might detect it immediately and trigger a program fault. In fact, this is how our popular x86 platform works with regard to segment registers. The only reason we don't hear much about it is that the popular OSes stick to flat memory model that simply does not actively use segment registers.
C++11 actually states that dereferencing invalid pointer values causes undefined behavior, while all other uses of invalid pointer value cause implementation-defined behavior. It also notes that implementation-defined behavior in case of "copying an invalid pointer" might lead to "a system-generated runtime fault". So it might actually be possible to carefully maneuver one's way through the labyrinth of C++11 specification and successfully arrive at the conclusion that calling a non-virtual method through an invalid pointer should result in implementation-defined behavior mentioned above. By in any case the possibility of "a system-generated runtime fault" will always be there.
Dereferencing of this in this case is effectively an implementation detail. I'm not saying that the this pointer is not defined by the standard, because it is, but from a semantically abstracted standpoint what is the purpose of allowing the use of objects that have been destroyed, just because there is a corner case in which in practice it will be "safe"? None. So it's not. No object exists, so you may not call a function on it.
int& fun()
{
int * temp = NULL;
return *temp;
}
In the above method, I am trying to do the dereferencing of a NULL pointer. When I call this function it does not give exception. I found when return type is by reference it does not give exception if it is by value then it does. Even when dereferencing of NULL pointer is assinged to reference (like the below line) then also it does not give.
int* temp = NULL:
int& temp1 = *temp;
Here my question is that does not compiler do the dereferencing in case of reference?
Dereferencing a null pointer is Undefined Behavior.
An Undefined Behavior means anything can happen, So it is not possible to define a behavior for this.
Admittedly, I am going to add this C++ standard quote for the nth time, but seems it needs to be.
Regarding Undefined Behavior,
C++ Standard section 1.3.24 states:
Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
NOTE:
Also, just to bring it to your notice:
Using a returned reference or pointer to a local variable inside a function is also an Undefined Behavior. You should be allocating the pointer on freestore(heap) using new and then returning a reference/pointer to it.
EDIT:
As #James McNellis, appropriately points out in the comments,
If the returned pointer or reference is not used, the behavior is well defined.
When you dereference a null pointer, you don't necessarily get an exception; all that is guaranteed is that the behavior is undefined (which really means that there is no guarantee at all as to what the behavior is).
Once the *temp expression is evaluated, it is impossible to reason about the behavior of the program.
You are not allowed to dereference a null pointer, so the compiler can generate code assuming that you don't do that. If you do it anyway, the compiler might be nice and tell you, but it doesn't have to. It's your part of the contract that says you must not do it.
In this case, I bet the compiler will be nice and tell you the problem already at compile time, if you just set the warning level properly.
Don't * a null pointer, it's UB. (undefined behavior, you can never assume it'll do anything short of lighting your dog on fire and forcing you to take shrooms which will lead to FABULOUS anecdotes)
Some history and information of null pointers in the Algol/C family: http://en.wikipedia.org/wiki/Pointer_(computing)#Null_pointer
Examples and implications of undefined behavior: http://en.wikipedia.org/wiki/Undefined_behavior#Examples_in_C
I don't sure I understand what you're trying todo. Dereferencing of ** NULL** pointer is not defined.
In case you want to indicate that you method not always returns value you can declare it as:
bool fun(int &val);
or stl way (similar to std::map insert):
std::pair<int, bool> fun();
or boost way:
boost::optional<int> fun();