Why does this code compile? - c++

I have the following code.
class A
{
public:
    int foo(int i)
    {
        return i;
    }
};
int foo(int i)
{
    return i;
}
int (A::*ptrFoo)(int) = NULL;
int (*_foo)(int) = NULL;
int main()
{
    ptrFoo = &A::foo;
    _foo = foo;
    (*_foo)++++++++++++++(10); //This doesn't compile...
    A a;
    (a.*ptrFoo)+++++++++++++++++(10); //This compiles ????
}
Please tell me what this is: undefined behavior, or something else? I compiled it in VS2008. Strangely, the last line of code compiles successfully.

Neither expression should compile: in C++, you cannot perform arithmetic on a function or a member function, nor on a pointer to a function or to a member function. The two expressions in your program attempt to perform arithmetic on a function and on a member function, respectively.
If your compiler accepts the second expression, it is due to a bug in the compiler.

First, note that pointers to functions are different from pointers to member functions.
Your first example is a pointer to an ordinary function. It contains the real memory address of the function. When you dereference it ((*_foo)) you get the function itself, and arithmetic operations, including ++, on a function (or function pointer) are meaningless.
The second one is another story: pointers to member functions of classes do not necessarily carry the address of the function in memory. How the compiler manages member functions is implementation-specific. A pointer to a member function may contain some address or some compiler-specific information. Arithmetic on this type is also meaningless.
Therefore we don't know what the value of (a.*ptrFoo) ever is, but in your case MSVC2008 managed to compile it, either because of a bug or by design.
By the way, GCC does not compile either of the two statements and reports errors on both.
The above is true whether you put an even or an odd number of +'s; we are doing arithmetic either way. (If there is an odd number of +'s then there is no function call: in your second example you are incrementing the function 8 times, and the last remaining + adds 10 to the result. Again, this doesn't matter: we are trying to change a function/member function pointer.)
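For contrast, here is a minimal sketch of the call syntax that is well-formed for both kinds of pointer, reusing A and foo from the question. The wrapper names call_through_function_pointer and call_through_member_pointer are hypothetical and exist only for illustration.

```cpp
#include <cassert>

class A
{
public:
    int foo(int i) { return i; }
};

int foo(int i) { return i; }

// Hypothetical wrapper: calls the free function through a plain function pointer.
int call_through_function_pointer(int value)
{
    int (*fp)(int) = foo;      // the function name converts to a pointer
    return (*fp)(value);       // fp(value) would be equivalent
}

// Hypothetical wrapper: calls the member function through a pointer to member.
int call_through_member_pointer(int value)
{
    int (A::*mfp)(int) = &A::foo;  // the & and the qualified name are required here
    A a;
    return (a.*mfp)(value);        // the only thing (a.*mfp) can be used for is a call
}
```

The result of a.*mfp can only be used as the operand of a function call, which is why applying ++ to it should be rejected.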

Related

null pointer dereference when used as an lvalue

Background
I have a class containing different members (custom structs constructed at run time). I also have a compile-time tuple containing pairs of pointer-to-member elements and strings. At compile time I need to check that every pointer-to-member and name is used only once in the list, and the custom structs check whether they have an entry in the tuple (they know their own pointer-to-member). Having a tuple for this purpose increases the compile time dramatically; it would be great to identify the members at compile time with a void* array rather than a heterogeneous data structure.
Attempt to solve problem
As I read in this thread, dereferencing a nullptr is not always undefined behavior.
I also read CWG issue #315, which states:
We agreed the example should be allowed. p->f() is rewritten as (*p).f() according to 5.2.5 [expr.ref]. *p is not an error when p is null unless the lvalue is converted to an rvalue (4.1 [conv.lval]), which it isn't here.
I wanted to leverage this to get a normal pointer from a pointer-to-member (I don't want to dereference them, I just want to compare pointers-to-members from the same class but with different types).
So I created the following code:
#include <iostream>
class Test
{
    int a;
public:
    static constexpr inline int Test::*memPtr = &Test::a;
    static constexpr inline int* intPtr = &(static_cast<Test*>(nullptr)->*Test::memPtr);
};
int main () {
    std::cout << Test::intPtr << std::endl;
}
In my opinion the &(static_cast<Test*>(nullptr)->*Test::memPtr); expression uses the same approach as the code that was discussed in CWG-issue #315.
The code above compiles with MSVC but not with clang or gcc.
I checked if similar code that was mentioned in #315 compiles or not:
struct Test {
    static constexpr int testFun () { return 10; }
};
int main ()
{
    static constexpr int res{static_cast<Test*>(nullptr)->testFun()};
    static_assert(res == 10, "error");
}
And yes, it does.
Should the construct I used in the first example be available in constexpr expressions (as undefined behavior is not allowed there)?
Fun fact: If I modify my original code and add a virtual destructor to the class then both MSVC and clang are happy with it, and gcc crashes. I mean literally, it segfaults.
Fun fact 2: If I remove the virtual destructor and make the class templated, gcc and MSVC compile it, but now clang complains.
From the standard on ->*'s behavior:
The expression E1->*E2 is converted into the equivalent form (*(E1)).*E2.
And for .*:
Abbreviating pm-expression.*cast-expression as E1.*E2, E1 is called the object expression. If the dynamic type of E1 does not contain the member to which E2 refers, the behavior is undefined.
The dynamic type of E1 (which dereferences a nullptr) does not exist, because it's a reference to no object. Therefore, the behavior of this expression is undefined.
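For contrast, here is a minimal sketch of the well-defined case. When the object expression of .* refers to a real object, taking the address of the selected member is fine, and pointers to members of the same class and type can be compared directly. Sample, address_of_member, and same_member are hypothetical names used only for illustration; the sketch does not cover comparing pointers to members of different types.

```cpp
#include <cassert>

// Hypothetical class standing in for the asker's Test.
struct Sample
{
    int a;
    int b;
};

// Well-defined: `object` refers to a live Sample, so its dynamic type
// contains the member that `member` designates.
int* address_of_member(Sample& object, int Sample::*member)
{
    return &(object.*member);
}

// Pointers to members of the same class and member type can be compared
// without applying them to any object at all.
bool same_member(int Sample::*lhs, int Sample::*rhs)
{
    return lhs == rhs;
}
```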

Difference between f and &f [duplicate]

It's interesting that using the function name as a function pointer is equivalent to applying the address-of operator to the function name!
Here's the example.
typedef bool (*FunType)(int);
bool f(int);
int main() {
    FunType a = f;
    FunType b = &a; // Sure, here's an error.
    FunType c = &f; // This is not an error, though.
                    // It's equivalent to the statement without "&".
                    // So we have c equals a.
    return 0;
}
Using the bare name this way is something we already know from arrays. But you can't write something like
int a[2];
int * b = &a; // Error!
It seems not consistent with other parts of the language. What's the rationale of this design?
This question explains the semantics of such behavior and why it works. But I'm interested in why the language was designed this way.
What's more interesting is that a function type is implicitly converted to a pointer to itself when used as a parameter, but is not converted to a pointer to itself when used as a return type!
Example:
typedef bool FunctionType(int);
void g(FunctionType); // Implicitly converted to void g(FunctionType *).
FunctionType h(); // Error!
FunctionType * j(); // Return a function pointer to a function
// that has the type of bool(int).
Since you specifically ask for the rationale of this behavior, here's the closest thing I can find (from the ANSI C90 Rationale document - http://www.lysator.liu.se/c/rat/c3.html#3-3-2-2):
3.3.2.2 Function calls
Pointers to functions may be used either as (*pf)() or as pf().
The latter construct, not sanctioned in the Base Document, appears in
some present versions of C, is unambiguous, invalidates no old code,
and can be an important shorthand. The shorthand is useful for
packages that present only one external name, which designates a
structure full of pointers to objects and functions: member
functions can be called as graphics.open(file) instead of
(*graphics.open)(file). The treatment of function designators can
lead to some curious, but valid, syntactic forms. Given the
declarations:
int f(), (*pf)();
then all of the following expressions are valid function calls:
(&f)(); f(); (*f)(); (**f)(); (***f)();
pf(); (*pf)(); (**pf)(); (***pf)();
The first expression on each line was discussed in the previous
paragraph. The second is conventional usage. All subsequent
expressions take advantage of the implicit conversion of a function
designator to a pointer value, in nearly all expression contexts.
The Committee saw no real harm in allowing these forms; outlawing
forms like (*f)(), while still permitting *a (for int a[]),
simply seemed more trouble than it was worth.
Basically, the equivalence between function designators and function pointers was added to make using function pointers a little more convenient.
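The call forms listed in the Rationale can be checked with a small sketch. The function f returning 7 and the helper names sum_of_call_forms and name_and_address_agree are assumptions made up for this illustration.

```cpp
#include <cassert>

int f() { return 7; }   // hypothetical function, just returns a known value
int (*pf)() = f;        // the name converts to a pointer without '&'

// Every form below is a valid call of the same function.
int sum_of_call_forms()
{
    return (&f)() + f() + (*f)() + (**f)()   // forms starting from the function name
         + pf() + (*pf)() + (**pf)();        // forms starting from the pointer
}

// f and &f yield the same pointer value.
bool name_and_address_agree()
{
    int (*from_name)() = f;
    int (*from_address)() = &f;
    return from_name == from_address;
}
```

Dereferencing a function pointer yields a function designator, which immediately converts back to a pointer, so any number of leading * operators ends up calling the same function.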
It's a feature inherited from C.
In C, it's allowed primarily because there's not much else the name of a function, by itself, could mean. All you can do with an actual function is call it. If you're not calling it, the only thing you can do is take the address. Since there's no ambiguity, any time a function name isn't followed by a ( to signify a call to the function, the name evaluates to the address of the function.
That actually is somewhat similar to one other part of the language -- the name of an array evaluates to the address of the first element of the array except in some fairly limited circumstances (being used as the operand of & or sizeof).
Since C allowed it, C++ does as well, mostly because the same remains true: the only things you can do with a function are call it or take its address, so if the name isn't followed by a ( to signify a function call, then the name evaluates to the address with no ambiguity.
For arrays, there is no pointer decay when the address-of operator is used:
int a[2];
int * p1 = a; // No address-of operator, so type is int*
int (*p2)[2] = &a; // Address-of operator used, so type is int (*)[2]
This makes sense because arrays and pointers are different types, and it is possible for example to return references to arrays or pass references to arrays in functions.
However, with functions, what other type could be possible?
void foo(){}
&foo; // #1
foo; // #2
Let's imagine that only #2 gives the type void(*)(), what would the type of &foo be? There is no other possibility.
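The type relationships described above can be stated as compile-time checks. This is a minimal sketch using std::is_same and std::decay from <type_traits>; the helper function_name_and_address_agree is a hypothetical name added for illustration.

```cpp
#include <cassert>
#include <type_traits>

int a[2];
void foo() {}

// Arrays: '&a' and the decayed 'a' have different pointer types.
static_assert(std::is_same<decltype(&a), int (*)[2]>::value,
              "&a is a pointer to the whole array");
static_assert(std::is_same<std::decay<decltype(a)>::type, int*>::value,
              "a decays to a pointer to its first element");

// Functions: '&foo' and the decayed 'foo' have the same pointer type.
static_assert(std::is_same<decltype(&foo), void (*)()>::value,
              "&foo is a pointer to function");
static_assert(std::is_same<std::decay<decltype(foo)>::type, void (*)()>::value,
              "foo decays to the same pointer-to-function type");

// The two spellings also produce the same pointer value at run time.
bool function_name_and_address_agree()
{
    void (*from_name)() = foo;
    void (*from_address)() = &foo;
    return from_name == from_address;
}
```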

GCC pure/const functions that accept a pointer argument

Can someone please clarify whether (and why) a function can be attributed pure or const if it has a pointer parameter?
According to the GCC documentation:
Some of common examples of pure functions are strlen or memcmp.
The whole point of a pure function is that it need only be called once for the same parameters, i.e. the result can be cached if the compiler thinks it fit to do so, however how does this work for memcmp?
for example:
char *x = calloc(1, 8);
char *y = calloc(1, 8);
if (memcmp(x, y, 8) > 0)
    printf("x > y\n");
x[1] = 'a';
if (memcmp(x, y, 8) > 0)
    printf("x > y\n");
The parameters to the second call to memcmp are identical to the first (the pointers point to the same address), how does the compiler know not to use the result from the first call, if memcmp is pure?
In my case I want to pass an array to a pure function, and calculate the result based on the array alone. Someone reassure me that this is okay, and that when values in the array change but the address does not, my function will be called correctly.
If I understood the documentation correctly, a pure function can depend on values of the memory, where the compiler knows whenever the memory changes. Moreover, a pure function can not change the state of the program, such as a global variable, it only produces a return value.
In your example code, memcmp can be a pure function. The compiler sees that the memory is changed between the calls to memcmp, and can not reuse the result of the first call for the second call.
On the other hand, memcmp can not be declared as a const function, since it depends on data in memory.
If it was const, the compiler could apply more aggressive optimizations.
For this reason, it seems safe to declare the function that you want to implement as pure (but not const).
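As a minimal sketch of the array case from the question: a function that only reads the memory behind its pointer argument is declared pure with the GCC/Clang-specific __attribute__((pure)). The names sum_array and sum_before_and_after_change are hypothetical, and whether a compiler actually merges repeated calls depends on its optimizer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical pure function: it reads memory through `values` but has no
// side effects. It must not be marked const, because its result depends on
// the pointed-to memory and not only on the argument values themselves.
__attribute__((pure)) int sum_array(const int* values, std::size_t count);

int sum_array(const int* values, std::size_t count)
{
    int total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += values[i];
    return total;
}

// The store between the two calls modifies memory the function may read,
// so the second call cannot be replaced by the first result.
int sum_before_and_after_change(int* values, std::size_t count)
{
    int first = sum_array(values, count);
    values[0] += 1;
    int second = sum_array(values, count);
    return second - first;
}
```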
With respect to pure, the article Implications of pure and constant functions explains that a pure function has no side effects and that its result depends only on its parameters and on memory it reads (global memory included).
So if the compiler can determine that the arguments are the same, and memory has not changed between subsequent calls it can eliminate the subsequent calls to the pure function since it knows the pure function does not have side effects.
Which means the compiler has to do analysis to be able to determine if the arguments to the pure function could have been modified before it can decide to eliminate subsequent calls to a pure function for the same arguments.
An example from the article is as follows:
int someimpurefunction(int a);
int somepurefunction(int a)
    __attribute__((pure));
int testfunction(int a, int b, int c, int d) {
    int res1 = someimpurefunction(a) ? someimpurefunction(a) : b;
    int res2 = somepurefunction(a) ? somepurefunction(a) : c;
    int res3 = a+b ? a+b : d;
    return res1+res2+res3;
}
The article shows the generated optimized assembly, in which somepurefunction is called only once, and then says:
As you can see, the pure function is called just once, because the two references inside the ternary operator are equivalent, while the other one is called twice. This is because there was no change to global memory known to the compiler between the two calls of the pure function (the function itself couldn't change it – note that the compiler will never take multi-threading into account, even when asking for it explicitly through the -pthread flag), while the non-pure function is allowed to change global memory or use I/O operations.
This logic also applies to a pointer: if the compiler can prove that the memory pointed to by the pointer has not been modified, it can eliminate the call to the pure function. So in your case, when the compiler sees:
x[1] = 'a';
it can not eliminate the second call to memcmp because memory pointed to by x has changed.

Function scope regarding pointers in C++ (or C)

I am attempting to write portable code that lets a function access a variable like an array, even if it is just a single value. The idea is that the code will not create an array of size 1, but I need to be able to loop over all the values if the variable is an array. I can't use sizeof(foo) alone to determine whether the memory is larger than a single instance; sizeof(foo)/sizeof(int) might work, but it is too cumbersome to include in the main code. Macros don't help either: I would expect a ternary operator such as #define ARRAY_OR_NOT(foo, type) (sizeof(foo)/sizeof(type) > 1) ? (foo) : (&foo) to return a pointer that I could index, but the compiler doesn't like the mixing of pointer and non-pointer types between the two branches.
So my second attempt was function overloading.
int * convert(int value)
{return &value;}
int * convert(int * value)
{return value;}
I know that this wouldn't work, because the first function would return the address of the temporary variable copy in that function scope. So my third attempt was
int * convert(int * value)
{return value;}
int * convert(int ** value)
{return *value;}
Every time I call convert, I would pass the address of the value: convert(&foo).
This should work, and (I think) it avoids returning the address of a function-scope temporary. The result of convert would be accessible with indexing. In a controlled for loop, the code would run smoothly. The program would know how many elements are in value, but it would be faster to run everything inside a for loop than not.
So why does my second block of code produce the "Warning returning temporary scope blahblahblah" warning?
UPDATE: Major XY problem here.
Basically I'm trying to wrap all my code in a loop and access each value in a variable, one value per loop iteration. The system would know how many values are in that variable, but the code base is so large that wrapping everything in an if/else would be slow. So the way to access some value in the for loop with an index would be int foo = convert(&maybeArray)[counter]; Then I would use foo several times in the for loop.
For some reason Visual Studio was throwing an error with the second block of code. I have added this to the original post.
Another solution would be to make 2 functions with overloaded operators that would basically execute the entire code, without converting each variable, but the code base is very large, and this needs to be as portable as possible. I believe referencing convert would be more future-proof.
You've tagged this as C++, so I'm assuming you are using a C++ compiler.
There's a lot going on in your question, so I'm going to simplify. You want a C++ function convert(x) that will:
if x is an array, return the address of the first element
if x is not an array, return &x.
(Generally, maybe you need to redesign this whole thing, convert seems like a pretty strange function to want).
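#include <cstddef> // for std::size_t

// Chosen when the argument is an array of N elements of type T.
template<typename T, std::size_t N>
auto convert( T (&t) [N] ) -> T* {
    return t; // just let pointer decay work for us here
}
// Chosen for any other (non-array) lvalue.
template<typename T>
auto convert( T &t) -> T* {
    return &t;
}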
And, in C++, I would never use sizeof with things that I think are arrays. This template technique is a safer way to count the number of elements in an array.
Also, do you expect to have arrays of pointers, and to want to treat a single pointer as a single-element array of pointers? If so, then tread carefully. Something that looks like an array might actually be a pointer, e.g. arrays in parameter lists foo(int is_really_a_pointer[5]) { ...}. See the comment by @MSalters for more. It might be good to use his assert to catch any surprises. If you're just using int, then don't use the typename T in my templates; just force it to be int for clarity.
Finally, maybe instead of turning arrays into pointers, you should ask for a function that turns a non-array into a reference to a single-element array?
Update Here is a more complete example showing how to use convert and convert_end to find the beginning and end of an array to iterate over all the elements in an array; where, of course, a non-array is treated as an array of one element.
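The linked example is not reproduced here. As a minimal sketch of what such a pair could look like, assume convert_end returns a one-past-the-end pointer. The name convert_end comes from the answer, but its implementation below and the helper sum_all are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Begin of the range: first element of an array, or the object itself.
template <typename T, std::size_t N>
T* convert(T (&t)[N]) { return t; }

template <typename T>
T* convert(T& t) { return &t; }

// Assumed implementation: one past the last element of an array,
// or one past the single object.
template <typename T, std::size_t N>
T* convert_end(T (&t)[N]) { return t + N; }

template <typename T>
T* convert_end(T& t) { return &t + 1; }

// Hypothetical helper: iterates uniformly over an int or an array of int.
template <typename X>
int sum_all(X& x)
{
    int total = 0;
    for (int* p = convert(x); p != convert_end(x); ++p)
        total += *p;
    return total;
}
```

For an array argument both overloads are viable, and the array overload is selected because it is the more specialized template; a plain int only matches the T& overload.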
In C, there is only pass by value. When you pass a pointer to a function, the address it holds is copied into the function parameter. This simply means that if p is a pointer in the calling function, then the function call
int x = 5;
int *p = &x;
int a = foo(p);
for the function definition
int foo(int *p1)
{
    return *p1*2;
}
implies that:
the address held in p is copied to the parameter p1, i.e. p and p1 point to the same location.
any change made in foo to the location pointed to by p1 is reflected in *p, because p and p1 point to the same location. But if p1 is later made to point to another location, that does not mean p will point to that location too: p and p1 are two different pointers.
When you pass a pointer to a pointer, as in the last snippet of your second block,
int * convert(int ** value)
{return *value;}
if *value is changed to point to a different location after the argument is passed, then the pointer whose address was passed is also updated to that location. In this case there is no need to return *value, but returning it does no harm.
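The three cases described above can be sketched as follows. All names here (other_value, write_through, reseat_copy, reseat_original) are hypothetical and chosen only for illustration.

```cpp
#include <cassert>

static int other_value = 99;   // hypothetical second location to point at

// Writes through the copied pointer: the caller sees the change in *p.
void write_through(int* p1)
{
    *p1 = 42;
}

// Reseats only the local copy: the caller's pointer is unaffected.
void reseat_copy(int* p1)
{
    p1 = &other_value;
    (void)p1;   // silence "set but not used" warnings
}

// Receives the address of the caller's pointer, so it can reseat it.
void reseat_original(int** pp)
{
    *pp = &other_value;
}
```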
