[expr.call]/6:
Calling a function through an expression whose function type is different from the function type of the called function's definition results in undefined behavior.
void f() noexcept {}; // function type is "noexcept function"
void (*pf)() = f; // variable type is "pointer to function"; initialized by result of [conv.fctptr]([conv.func](f))
int main()
{
(*pf)(); // `*pf`: lvalue expression's function type is "function" (without noexcept!)
}
Does the above call result in undefined behavior per the cited standardese?
C++14 had a weaker requirement, from [expr.call]/6:
[...] Calling a function through an expression whose function type has a language linkage that is different from the language linkage of the function type of the called function's definition is undefined ([dcl.link]). [...]
However [expr.reinterpret.cast]/6 contained a similar, but stronger requirement:
A function pointer can be explicitly converted to a function pointer of a different type. The effect of calling a function through a pointer to a function type ([dcl.fct]) that is not the same as the type used in the definition of the function is undefined.
P0012R1 made exception specifications part of the type system, and was adopted for C++17:
The exception specification of a function is now part of the function’s type: void f() noexcept(true); and void f() noexcept(false); are functions of two distinct types. Function pointers are convertible in the sensible direction. (But the two functions f may not form an overload set.) This change strengthens the type system, e.g. by allowing APIs to require non-throwing callbacks.
and moreover added [conv.fctptr]:
Add a new section after section 4.11 [conv.mem]:
4.12 [conv.fctptr] Function pointer conversions
A prvalue of type "pointer to noexcept function" can be converted to a prvalue of type
"pointer to function". [...]
but included no changes to [expr.reinterpret.cast]/6; arguably an unintentional omission.
CWG 2215 highlighted the duplicated information in [expr.call] compared to [expr.reinterpret.cast]/6, flagging the weaker requirement in the former as redundant. The following cplusplus / draft commit implemented CWG 2215, and removed the weaker (redundant) requirement, made [expr.reinterpret.cast]/6 into a non-normative note and moved its (stronger) normative requirement to [expr.call]; eventually this stronger requirement was broken out into its own paragraph.
This confusion arguably led to the unintentional (seemingly conflicting) rules that:
a prvalue of type “pointer to noexcept function” can be converted to a prvalue of type “pointer to function” ([conv.fctptr]/1), and
calling a function through an expression whose function type is different only by its exception specification is undefined behaviour.
As far as I can tell, there are no defect reports covering this issue, and a new one should arguably be submitted.
Microsoft lvalue definition:
An lvalue refers to an object that persists beyond a single
expression.
A second definition:
The expression E belongs to the lvalue category if and only if E
refers to an entity that ALREADY has had an identity (address, name or
alias) that makes it accessible outside of E.
I wrote the following code:
#include <iostream>
using namespace std;
class A{};
const A& f1()
{
return A();
}
const int& f2()
{
return 1;
}
int main() {
cout<<&f1()<<endl; // this prints "0" every time.
cout<<&f2()<<endl; // this prints "0" every time.
return 0;
}
Why are f1() and f2() lvalue expressions?
Why is the address of an lvalue reference to an rvalue zero?
Why are both definitions equivalent?
Why are f1() and f2() lvalue expressions?
Because each is a call to a function that returns an lvalue reference.
Standard draft: [expr.call]
11 A function call is an lvalue if the result type is an lvalue reference type or ...
Why does the & character after the type name make it an lvalue reference?
Standard draft: [dcl.ref]
1 In a declaration T D where D has either of the forms
& attribute-specifier-seqopt D1
&& attribute-specifier-seqopt D1
and the type of the identifier in the declaration T D1 is “derived-declarator-type-list T”, then the type of the identifier of D is “derived-declarator-type-list reference to T” ...
2 A reference type that is declared using & is called an lvalue reference ...
Why is the address of an lvalue reference to an rvalue zero?
The behaviour is undefined.
Standard draft: [expr.unary.op]
3 The result of the unary & operator is ... the result has type “pointer to T” and is a prvalue that is the address of the designated object
There is no designated object, and the standard doesn't define the behaviour of the addressof operator in that case.
Standard draft: [defns.undefined]
behavior for which this document imposes no requirements
[ Note: Undefined behavior may be expected when this document omits any explicit definition of behavior ...
Why are both definitions equivalent?
They aren't necessarily equivalent. One or both of them may be incorrect. Both appear to be descriptions of lvalue expressions, rather than definitions.
The normative definition is in the C++ standard document.
What is the definition of lvalue by the standard?
Standard draft: [basic.lval]
(1.1) A glvalue is an expression whose evaluation determines the identity of an object, bit-field, or function.
...
(1.3) An xvalue is a glvalue that denotes an object or bit-field whose resources can be reused (usually because it is near the end of its lifetime).
[ Example: Certain kinds of expressions involving rvalue references ([dcl.ref]) yield xvalues, such as a call to a function whose return type is an rvalue reference or a cast to an rvalue reference type.
— end example
]
(1.4) An lvalue is a glvalue that is not an xvalue.
The [expr] section defines each possible expression in the language, and if the expression is an lvalue, then that is stated. "is an lvalue" occurs 37 times, but this simple search is not necessarily exhaustive.
Declaring a function with lvalue reference return type means that that function call is an lvalue expression (nothing more and nothing less).
The pages you linked to are both wrong in equating lvalue expressions with objects that "already exist" or "persist" or whatever. In your code is an example of an lvalue expression that refers to an object that only existed during the function call.
Using the result of the function call causes undefined behaviour because the behaviour of lvalue expressions is only defined for when they actually refer to an object. (Plus a few cases for referring to a potential object under construction or destruction, but that doesn't apply here, since the object's associated storage is already released by the time the calling code uses the result of the expression).
Undefined behaviour means anything can happen, including (but not limited to) outputting a zero.
You end up returning a dangling reference: the returned reference has nothing to refer to, because the temporary A() is destroyed at the end of the function. What you have is undefined behavior. The 0 is a placeholder for the fact that it is not referencing any memory location. The f2() function likewise returns a reference to a temporary. To be absolutely clear, the memory location they return is 0 because the memory location they reference does not exist any longer.
Hope this helps.
I am trying to determine whether the following code invokes undefined behavior:
#include <iostream>
class A;
void f(A& f)
{
char* x = reinterpret_cast<char*>(&f);
for (int i = 0; i < 5; ++i)
std::cout << x[i];
}
int main(int argc, char** argue)
{
A* a = reinterpret_cast<A*>(new char[5]);
f(*a);
}
My understanding is that reinterpret_casts to and from char* are compliant because the standard permits aliasing with char and unsigned char pointers (emphasis mine):
If a program attempts to access the stored value of an object through an lvalue of other than one of the following types the behavior is undefined:
the dynamic type of the object,
a cv-qualified version of the dynamic type of the object,
a type that is the signed or unsigned type corresponding to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),
a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
a char or unsigned char type.
However, I am not sure whether f(*a) invokes undefined behavior by creating a A& reference to the invalid pointer. The deciding factor seems to be what "attempts to access" verbiage means in the context of the C++ standard.
My intuition is that this does not constitute an access, since an access would require A to be defined (it is declared, but not defined in this example). Unfortunately, I cannot find a concrete definition of "access" in the C++ standard.
Does f(*a) invoke undefined behavior? What constitutes "access" in the C++ standard?
I understand that, regardless of the answer, it is likely a bad idea to rely on this behavior in production code. I am asking this question primarily out of a desire to improve my understanding of the language.
[Edit] @SergeyA cited this section of the standard. I've included it here for easy reference (emphasis mine):
5.3.1/1 [expr.unary.op]
The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points. If the type of the expression is “pointer to T,” the type of the result is “T.” [Note: indirection through a pointer to an incomplete type (other than cv void) is valid. The lvalue thus obtained can be used in limited ways (to initialize a reference, for example); this lvalue must not be converted to a prvalue, see 4.1. — end note ]
Tracing the reference to 4.1, we find:
4.1/1 [conv.lval]
A glvalue (3.10) of a non-function, non-array type T can be converted to a prvalue. If T is an incomplete type, a program that necessitates this conversion is ill-formed. If T is a non-class type, the type of the prvalue is the cv-unqualified version of T. Otherwise, the type of the prvalue is T.
When an lvalue-to-rvalue conversion is applied to an expression e, and either:
e is not potentially evaluated, or
the evaluation of e results in the evaluation of a member ex of the set of potential results of e, and ex names a variable x that is not odr-used by ex (3.2)
the value contained in the referenced object is not accessed.
I think our answer lies in whether *a satisfies the second bullet point. I am having trouble parsing that condition, so I am not sure.
char* x = reinterpret_cast<char*>(&f); is valid. Or, more specifically, access through x is allowed - the cast itself is always valid.
A* a = reinterpret_cast<A*>(new char[5]) is not valid - or, to be precise, access through a will trigger undefined behaviour.
The reason is that while it's OK to access an object through a char*, it's not OK to access an array of chars through a pointer to some other object type. The Standard allows the first, but not the second.
Or, in layman's terms, you can alias a type* through a char*, but you can't alias a char* through a type*.
EDIT
I just noticed I didn't answer the direct question ("What constitutes "access" in the C++ standard"). Apparently, the Standard does not define access (at least, I was not able to find a formal definition), but dereferencing the pointer is commonly understood to qualify as access.
I've been experimenting with function types in C++. Note that I don't mean pointer-to-function types like:
typedef void (*voidFuncPtr)();
but the more exotic:
typedef void (voidFunc)();
I didn't expect the following code to compile, but surprisingly it did:
template<voidFunc func>
class funcClass
{
public:
void call() { func(); };
};
void func()
{ }
void Test()
{
funcClass<func> foobar;
foobar.call();
}
However, if I try adding the following to funcClass:
voidFuncPtr get() { return &func; }
I get the error "Address expression must be an lvalue or a function designator".
My first question here is: what kind of black magic is the compiler using to pretend that a func type is something it can actually pass around an instance of? Is it just treating it like a reference? Second question is: if it can even be called, why can't the address of it be taken? Also, what are these non-pointer-to function types called? I only discovered them because of boost::function, and have never been able to find any documentation about them.
§14.1.4 of the Standard says:
A non-type template-parameter shall have one of the following (optionally cv-qualified) types:
— integral or enumeration type,
— pointer to object or pointer to function, [this is what yours is]
— lvalue reference to object or lvalue reference to function,
— pointer to member,
— std::nullptr_t.
And §14.1.6 says
A non-type non-reference template-parameter is a prvalue. It shall not
be assigned to or in any other way have its value changed. A non-type
non-reference template-parameter cannot have its address taken. When a
non-type non-reference template-parameter is used as an initializer
for a reference, a temporary is always used.
So that explains the two behaviours you are seeing.
Note that func is the same as &func (§14.3.2.1):
[A non-type template parameter can be] a constant expression (5.19) that designates the address of an object with static storage duration and external or internal linkage or a
function with external or internal linkage, including function
templates and function template-ids but excluding non-static class
members, expressed (ignoring parentheses) as & id-expression, except
that the & may be omitted if the name refers to a function or array
and shall be omitted if the corresponding template-parameter is a
reference; or...
So it's just a function pointer.
Given that the code compiles without the address-of operator and pointers (including to functions and member functions) are valid template arguments, it seems the compiler considers voidFunc to be a function pointer type, i.e., the decayed version of the type. The rules for this didn't change between C++ 2003 and C++ 2011.
Consider the following code:
class Foo;
Foo& CreateFoo();
void Bar()
{
CreateFoo();
}
In Visual Studio this will result in an error C2027 that Foo is an undefined type. In most other compilers it compiles fine. It is only an issue if the return value of CreateFoo is not assigned. If I change the line to:
Foo& foo = CreateFoo();
it compiles fine in Visual Studio. Also if Foo is defined rather than just forward-declared, then it will compile fine with no assignment.
Which should be the correct behavior? Is there anything in the C++ standard that addresses this, or is this something that is left to the implementation? I looked and didn't see anything that talks about this.
Update:
A bug report has been filed.
This looks like the relevant part of the Standard (section 5.2.2):
A function call is an lvalue if the result type is an lvalue reference type or an rvalue reference to function
type, an xvalue if the result type is an rvalue reference to object type, and a prvalue otherwise.
If a function call is a prvalue of object type:
if the function call is either
the operand of a decltype-specifier or
the right operand of a comma operator that is the operand of a decltype-specifier,
a temporary object is not introduced for the prvalue. The type of the prvalue may be incomplete.
[ Note: as a result, storage is not allocated for the prvalue and it is not destroyed; thus, a class type is
not instantiated as a result of being the type of a function call in this context. This is true regardless of
whether the expression uses function call notation or operator notation (13.3.1.2). — end note ] [ Note:
unlike the rule for a decltype-specifier that considers whether an id-expression is parenthesized (7.1.6.2),
parentheses have no special meaning in this context. — end note ]
otherwise, the type of the prvalue shall be complete.
Since this function result type is an lvalue reference type, the function call evaluates to an lvalue, and the completeness requirement does not apply.
The code is legal, at least in C++11, which no released version of Visual C++ implements fully.
You can always use incomplete types in function declarations (since that only declares a signature of the function, not any real code), but not when you use it.
Calling CreateFoo(); is equivalent to (void) CreateFoo();, and my guess is that Visual Studio needs to inspect the definition of Foo to do ANY conversion (I'm not sure if you can actually write a void conversion), because conversions need a complete type.
As for Foo & foo = CreateFoo();, this does not do any conversions, so you can get away with having an incomplete type.
(I am aware of the fact that returning address/reference to a variable local to the function should be avoided and a program should never do this.)
Does returning a reference to a local variable/reference result in Undefined Behavior? Or does the Undefined Behavior only occur later, when the returned reference is used (or "dereferenced")?
i.e. at what exact statement (#1 or #2 or #3) does code sample below invoke Undefined Behavior? (I've written my theory alongside each one)
#include <iostream>
struct A
{
int m_i;
A():m_i(10)
{
}
};
A& foo()
{
A a;
a.m_i = 20;
return a;
}
int main()
{
foo(); // #1 - Not UB; return value was never used
A const &ref = foo(); // #2 - Not UB; return value still not yet used
std::cout<<ref.m_i; // #3 - UB: returned value is used
}
I am interested to know what the C++ standard specifies in this regard.
I would like a citation from the C++ standard which will basically tell me which exact statement makes this code ill-formed.
Discussions about how specific implementations handle this are welcome, but as I said, an ideal answer would cite a reference from the C++ Standard that clarifies this beyond doubt.
Of course, when the reference is first initialised it is done so validly, satisfying the following:
[C++11: 8.3.2/5]: There shall be no references to references, no arrays of references, and no pointers to references. The declaration of a reference shall contain an initializer (8.5.3) except when the declaration contains an explicit extern specifier (7.1.1), is a class member (9.2) declaration within a class definition, or is the declaration of a parameter or a return type (8.3.5); see 3.1. A reference shall be initialized to refer to a valid object or function. [ Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior. As described in 9.6, a reference cannot be bound directly to a bit-field. —end note ]
The reference being returned from the function is an xvalue:
[C++11: 3.10/1]: [..] An xvalue (an “eXpiring” value) also refers to an object, usually near the end of its lifetime (so that its resources may be moved, for example). An xvalue is the result of certain kinds of expressions involving rvalue references (8.3.2). [ Example: The result of calling a function whose return type is an rvalue reference is an xvalue. —end example ] [..]
That means the following does not apply:
[C++11: 12.2/1]: Temporaries of class type are created in various contexts: binding a reference to a prvalue (8.5.3), returning a prvalue (6.6.3), a conversion that creates a prvalue (4.1, 5.2.9, 5.2.11, 5.4), throwing an exception (15.1), entering a handler (15.3), and in some initializations (8.5).
[C++11: 6.6.3/2]: A return statement with neither an expression nor a braced-init-list can be used only in functions that do not return a value, that is, a function with the return type void, a constructor (12.1), or a destructor (12.4).
A return statement with an expression of non-void type can be used only in functions returning a value; the value of the expression is returned to the caller of the function. The value of the expression is implicitly converted to the return type of the function in which it appears. A return statement can involve the construction and copy or move of a temporary object (12.2). [ Note: A copy or move operation associated with a return statement may be elided or considered as an rvalue for the purpose of overload resolution in selecting a constructor (12.8). —end note ] A return statement with a braced-init-list initializes the object or reference to be returned from the function by copy-list-initialization (8.5.4) from the specified initializer list. [ Example:
std::pair<std::string,int> f(const char* p, int x) {
return {p,x};
}
—end example ]
Additionally, even if we interpret the following to mean that an initialisation of a new reference "object" is performed, the referee is probably still alive at the time:
[C++11: 8.5.3/2]: A reference cannot be changed to refer to another object after initialization. Note that initialization of a reference is treated very differently from assignment to it. Argument passing (5.2.2) and function value return (6.6.3) are initializations.
This makes #1 valid.
However, your initialisation of a new reference ref inside main quite clearly violates [C++11: 8.3.2/5]. I can't find wording for it, but it stands to reason that the function scope has been exited when the initialisation is performed.
This would make #2 (and consequently #3) invalid.
At the very least, there does not appear to be anything further stated about the matter in the standard, so if the above reasoning is not sufficient then we have to conclude that the standard is ambiguous in the matter. Fortunately, it's of little consequence in practice, at least in the mainstream.
Here's my incomplete and possibly insufficient view on the matter:
The only thing special about references is that at initialization time they must refer to a valid object. If the object later stops existing, using the reference is UB, and so is initializing another reference to the now-defunct reference.
The following much simpler example provides exactly the same dilemma as your question, I think:
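T placeholder; // std::reference_wrapper has no default constructor, so bind it to something first
std::reference_wrapper<T> r = std::ref(placeholder);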
{
T t;
r = std::ref(t);
}
// #1
At #1, the reference inside r is no longer valid, but the program is fine. Just don't read r.
In your example, line #1 is fine, and line #2 isn't -- that is because the original line #2 calls A::A(A const &) with argument foo(), and as discussed, this fails to initialize the function argument variable with a valid reference, and so would your edited version A const & a = foo();.
I would say #3. Alone, #2 doesn't actually do anything even though the referenced object is already out of scope. This isn't really a standards-related issue because it is the result of two mistakes made in succession:
Returning a reference to an out-of-scope object followed by
Use of a reference.
Either in isolation has defined behavior. Whether the standard has anything to say regarding use of references to objects beyond the end of their lifetime is another matter.