Can I take the address of a function defined in the standard library? - c++

Consider the following code:
#include <cctype>
#include <functional>
#include <iostream>

int main()
{
    std::invoke(std::boolalpha, std::cout); // #1
    using ctype_func = int(*)(int);
    char c = std::invoke(static_cast<ctype_func>(std::tolower), 'A'); // #2
    std::cout << c << "\n";
}
Here, the two calls to std::invoke are labeled for future reference.
The expected output is:
a
Is the expected output guaranteed in C++20?
(Note: there are two functions called tolower — one in <cctype> and the other in <locale>. The explicit cast is introduced to select the desired overload.)

Short answer
No.
Explanation
[namespace.std] says:
Let F denote a standard library function ([global.functions]), a standard library static member function, or an instantiation of a standard library function template.
Unless F is designated an addressable function, the behavior of a C++ program is unspecified (possibly ill-formed) if it explicitly or implicitly attempts to form a pointer to F.
[Note: Possible means of forming such pointers include application of the unary & operator ([expr.unary.op]), addressof ([specialized.addressof]), or a function-to-pointer standard conversion ([conv.func]).
— end note ]
Moreover, the behavior of a C++ program is unspecified (possibly ill-formed) if it attempts to form a reference to F or if it attempts to form a pointer-to-member designating either a standard library non-static member function ([member.functions]) or an instantiation of a standard library member function template.
With this in mind, let's check the two calls to std::invoke.
The first call
std::invoke(std::boolalpha, std::cout);
Here, we are attempting to form a pointer to std::boolalpha. Fortunately, [fmtflags.manip] saves the day:
Each function specified in this subclause is a designated addressable function ([namespace.std]).
And boolalpha is a function specified in this subclause.
Thus, this line is well-formed, and is equivalent to:
std::cout.setf(std::ios_base::boolalpha);
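But why is boolalpha designated addressable? Because inserting a manipulator into a stream passes it as a function pointer, so the designation is necessary for the following code: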
std::cout << std::boolalpha;
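To make the reason concrete, here is a minimal sketch of what that insertion relies on: std::basic_ostream has a member operator<< that takes a pointer to a function of type std::ios_base&(std::ios_base&), so writing out << std::boolalpha converts the function name to such a pointer. The helper names format_bool and check_format_bool are made up for illustration and do not appear in the standard.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: formats a bool as "true" or "false".
std::string format_bool(bool value)
{
    std::ostringstream out;
    // std::boolalpha is a designated addressable function, so forming this
    // pointer explicitly is allowed. `out << std::boolalpha` does the same
    // thing implicitly through the function-to-pointer conversion.
    std::ios_base& (*manip)(std::ios_base&) = std::boolalpha;
    out << manip << value;
    return out.str();
}

// Small self-check; call it from main() to try the sketch.
void check_format_bool()
{
    assert(format_bool(true) == "true");
    assert(format_bool(false) == "false");
}
```

Calling check_format_bool() from main() exercises both the explicit pointer and the stream insertion.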
The second call
char c = std::invoke(static_cast<ctype_func>(std::tolower), 'A');
Unfortunately, [cctype.syn] says:
The contents and meaning of the header <cctype> are the same as the C standard library header <ctype.h>.
Nowhere is tolower explicitly designated an addressable function.
Therefore, the behavior of this C++ program is unspecified (possibly ill-formed), because it attempts to form a pointer to tolower, which is not designated an addressable function.
Conclusion
The expected output is not guaranteed.
In fact, the code is not even guaranteed to compile.
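A common way to stay within the rules is to call the function by name inside a callable of our own, since only forming a pointer or reference to std::tolower is restricted, not calling it. The sketch below assumes C++17 and the default "C" locale; the names lower_via_lambda and check_lower_via_lambda are made up for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <functional>

// Hypothetical wrapper: the lambda is our own callable, so passing it to
// std::invoke never forms a pointer to a standard library function.
char lower_via_lambda(char ch)
{
    // Taking unsigned char avoids passing a negative value to std::tolower,
    // which would be undefined behavior (other than EOF).
    auto to_lower = [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    };
    return std::invoke(to_lower, ch);
}

// Small self-check; call it from main() to try the sketch.
void check_lower_via_lambda()
{
    assert(lower_via_lambda('A') == 'a');
    assert(lower_via_lambda('z') == 'z');
}
```

The overload ambiguity between <cctype> and <locale> also disappears, because ordinary overload resolution on the call std::tolower(c) picks the one-argument version.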
This also applies to member functions.
The last sentence of the [namespace.std] quotation above covers pointers to members explicitly. The same conclusion can also be seen from [member.functions]: the behavior of a C++ program is unspecified (possibly ill-formed) if it attempts to take the address of a member function declared in the C++ standard library. Per [member.functions]/2:
For a non-virtual member function described in the C++ standard library, an implementation may declare a different set of member function signatures, provided that any call to the member function that would select an overload from the set of declarations described in this document behaves as if that overload were selected. [ Note: For instance, an implementation may add parameters with default values, or replace a member function with default arguments with two or more member functions with equivalent behavior, or add additional signatures for a member function name. — end note ]
And [expr.unary.op]/6:
The address of an overloaded function can be taken only in a context that uniquely determines which version of the overloaded function is referred to (see [over.over]). [ Note: Since the context might determine whether the operand is a static or non-static member function, the context can also affect whether the expression has type “pointer to function” or “pointer to member function”. — end note ]
Therefore, the behavior of a program is unspecified (possibly ill-formed) if it explicitly or implicitly attempts to form a pointer to a member function in the C++ library.
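The same workaround applies to member functions. As a minimal sketch (the names word_lengths and check_word_lengths are made up for illustration), a lambda that calls the member by name can stand in for a pointer to member such as &std::string::size:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the length of each word.
std::vector<std::size_t> word_lengths(const std::vector<std::string>& words)
{
    std::vector<std::size_t> lengths(words.size());
    // Avoided here: std::mem_fn(&std::string::size). That forms a pointer to
    // a standard library member function, whose exact set of signatures an
    // implementation is allowed to vary.
    std::transform(words.begin(), words.end(), lengths.begin(),
                   [](const std::string& word) { return word.size(); });
    return lengths;
}

// Small self-check; call it from main() to try the sketch.
void check_word_lengths()
{
    const std::vector<std::size_t> expected{1, 2, 3};
    assert(word_lengths({"a", "bb", "ccc"}) == expected);
}
```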
(Thanks for the comment for pointing this out!)

Related

Is there no such thing as "implicit this parameter" in the Standard?

Recently, I asked this question where one of the answers says:
There's no such thing as "implicit this parameter" in the standard. The standard calls it an "implicit object parameter".
Then someone commented that:
"There's no such thing as "implicit this parameter" in the standard." seems wrong. From expr.call#4: "If the function is a non-static member function, the this parameter of the function shall be initialized with a pointer to the object of the call, converted as if by an explicit type conversion."
Seeing the above comment, I think the answer is technically incorrect, because it said that there's no such thing as "implicit this parameter" in the standard, while the standard clearly talks about the this parameter.
So how should this be interpreted further (pun intended)? It seems that the standard makes a distinction between non-static member functions and constructors in the context of the this parameter. For example, the standard says that for a non-static member function, the this parameter of the function shall be initialized with a pointer to the object of the call, converted as if by an explicit type conversion. But the standard doesn't say the same for constructors. So why does the standard make this distinction? That is, why doesn't the standard say that constructors also have a this parameter that gets initialized by the passed argument, just as for non-static member functions?
This leads to a deeper question: if, unlike a non-static member function, a constructor has no this parameter, how are we able to use this inside the constructor? For example, we know that we can write this->p = 0 inside a constructor as well as inside a non-static member function, where p is a data member. In the case of a non-static member function, this is a parameter of that particular member function, so this->p makes sense. But in the case of a constructor, this is not a parameter, so how are we able to use this->p inside the constructor?
Originally, by reading the answers here, I thought that the implicit this parameter is an implementation detail. But after reading expr.call#4 it seems that it is not an implementation detail.
If you think this is some sort of implicit parameter, type in this code:
#include <iostream>

struct SimpleThing {
    int xyzzy;
    SimpleThing(): xyzzy(42) {}
    void print(int plugh, const int twisty) {
        std::cout << xyzzy << '\n';
        std::cout << plugh << '\n';
        std::cout << twisty << '\n';
        xyzzy = 0;
        plugh = 0;
        twisty = 0;
        this = 0;
    }
};

int main() {
    SimpleThing thing;
    thing.print(7, 99);
}
Then examine the errors you get:
prog.cpp: In member function ‘void SimpleThing::print(int, int)’:
prog.cpp:12:16: error: assignment of read-only parameter ‘twisty’
12 | twisty = 0;
| ~~~~~~~^~~
prog.cpp:13:16: error: lvalue required as left operand of assignment
13 | this = 0;
| ^
Note that the first two assignments work because xyzzy and plugh are modifiable variables. The third fails because twisty is, of course, const (non-modifiable).
The attempted assignment to this doesn't look like any sort of "can't write to some sort of variable" diagnostic because it actually isn't.
The this keyword is a special marker inside non-static member functions (and constructors/destructors) that is translated into the address of the object being worked upon. While it may be passed as a hidden parameter, that is very much an implementation detail with which the standard does not concern itself.
The controlling section in the C++20 standard is in [class.this]:
In the body of a non-static member function, the keyword this is a prvalue whose value is the address of the object for which the function is called.
Nowhere in there (the entire section) does it mention that this is some sort of hidden parameter to the call.
And, regarding your question on why there is a distinction between non-static member functions and constructors: I don't believe this distinction involves the existence of this in either case. Instead, it has to do with the qualification of the type of this. Its existence in a constructor is undeniable, as [class.ctor] states:
During the construction of an object, if the value of the object or any of its subobjects is accessed through a glvalue that is not obtained, directly or indirectly, from the constructor’s this pointer, the value of the object or subobject thus obtained is unspecified.
In other words, I see your quote:
If the function is a non-static member function, the this parameter of the function is initialized with a pointer to the object of the call, converted as if by an explicit type conversion.
as specifying only the qualification of this, something that the constructor doesn't need.
There is no discussion of cv-qualified conversion for constructors as there is for other member functions because you can't actually create a cv-qualified constructor. It would be rather useless if your constructor were not allowed to set any member variables, for example :-)
While constructors can be used to create cv-qualified objects, the constructor itself is not cv-qualified. This is covered at the end of [class.this]:
Constructors and destructors shall not be declared const, volatile or const volatile. [Note: However, these functions can be invoked to create and destroy objects with cv-qualified types - end note]
And further in [class.ctor]:
A constructor can be invoked for a const, volatile or const volatile object. Const and volatile semantics are not applied on an object under construction. They come into effect when the constructor for the most derived object ends.
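The points about the type of this can be checked directly. The following is a minimal sketch, assuming C++17 (for std::is_same_v and single-argument static_assert); Probe, read_from_const_object and check_probe are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class used only to inspect the type of `this`.
struct Probe {
    int p;

    Probe()
    {
        // Inside a constructor, `this` is usable and has type Probe*,
        // even when the object being created is const.
        static_assert(std::is_same_v<decltype(this), Probe*>);
        this->p = 42;
    }

    int read() const
    {
        // In a const member function the type of `this` is const Probe*.
        static_assert(std::is_same_v<decltype(this), const Probe*>);
        return this->p;
    }
};

int read_from_const_object()
{
    const Probe probe;   // const semantics apply once construction ends
    return probe.read();
}

// Small self-check; call it from main() to try the sketch.
void check_probe() { assert(read_from_const_object() == 42); }
```

The constructor writes to p even though probe is declared const, which matches the quoted rule that const semantics come into effect only when the constructor ends.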
To be honest, I think WG21 would be better off going through the next iteration and replacing things like "the this parameter of the function" with a phrase that does not mention parameters at all (such as "the this property").
Here's a quotation from this Draft C++17 Standard (bolding for emphasis, and to answer the question, is mine):
10.3.3 The using declaration      [namespace.udecl]
…
16    
For the purpose of forming a set of candidates during overload
resolution, the functions that are introduced by a using-declaration
into a derived class are treated as though they were members of the
derived class. In particular, the implicit this parameter shall
be treated as if it were a pointer to the derived class rather than to
the base class. This has no effect on the type of the function, and in
all other respects the function remains a member of the base class.
Likewise, constructors that are introduced by a using-declaration
are treated as though they were constructors of the derived class when
looking up the constructors of the derived class …
However, I should add that the cited paragraph doesn't seem to be present in this later Draft Standard. In fact, that (later) Standard seems to use the phrase "implicit object parameter" in similar clauses.
So, maybe you should add a specific version tag to your question: c++17 or c++20, as there appears to be a divergence in the use (or not) of the term.
Note that the above citation is the only occurrence of the phrase, "implicit this parameter" in that Draft Standard.
Also, note that both documents I have linked are only Draft versions of the respective Standards, and both come with this cautionary escape-clause:
Note: this is an early draft. It’s known to be incomplet and
incorrekt, and it has lots of bad formatting.

Do GCC and clang show the same result as Visual Studio on this case, about language linkage?

[dcl.link]/4:
Linkage specifications nest. When linkage specifications nest, the
innermost one determines the language linkage. A linkage specification
does not establish a scope. A linkage-specification shall occur only
in namespace scope. In a linkage-specification, the specified
language linkage applies to the function types of all function
declarators, function names with external linkage, and variable
names with external linkage declared within the linkage-specification.
[ Example:
extern "C" // the name f1 and its function type have C language linkage;
void f1(void(*pf)(int)); // pf is a pointer to a C function
...
— end example ]
Observe that the pointer &foo passed to the function c_f() below is not a pointer to a C function. This code compiles and links normally in VS2017. But it shouldn't, according to [dcl.link]/4.
File main.cpp:
#include <stdio.h>
extern "C" // the name c_f and its function type have C language linkage;
void c_f(void(*pf)(int)); // pf is a pointer to a C function
void foo(int i) {
printf("%d\n", i);
}
extern "C" void c_foo(int);
int main() {
c_foo(1); // Calls c_foo(int) defined in other.c
c_f(&foo); // Calls c_f(void(*)(int)) defined in other.c, but &foo is not a pointer to a C function !!
}
File other.c:
#include <stdio.h>
void c_f(void(*pf)(int)){
pf(2);
}
void c_foo(int i) {
printf("%d\n", i);
}
I'm curious to know whether clang and GCC are compliant with the Standard, but I can't verify this in a web compiler.
Edit
It dawned on me that I really don't need two files to verify whether clang and GCC are compliant with the Standard on the issue mentioned above. If the Standard requires the address of a C function as the argument to c_f(), and the code in main.cpp supplies the address of a C++ function, the C++ compiler has to complain (1) when compiling this file. But that happens in neither clang nor GCC. So I might as well say that both clang and GCC are also buggy in this regard.
1) If we assume that a diagnostic is required
Your code shows undefined behavior according to [dcl.link]/1 and [expr.call]/1 (emphases are mine):
[dcl.link]/1:
All function types, function names with external linkage, and variable names with external linkage have a language linkage. [ Note: Some of the properties associated with an entity with language linkage are specific to each implementation and are not described here. For example, a particular language linkage may be associated with a particular form of representing names of objects and functions with external linkage, or with a particular calling convention, etc. — end note ] The default language linkage of all function types, function names, and variable names is C++ language linkage. Two function types with different language linkages are distinct types even if they are otherwise identical.
[expr.call]/1:
A function call is a postfix expression followed by parentheses containing a possibly empty, comma-separated list of initializer-clauses which constitute the arguments to the function. The postfix expression shall have function type or function pointer type. For a call to a non-member function or to a static member function, the postfix expression shall be either an lvalue that refers to a function (in which case the function-to-pointer standard conversion is suppressed on the postfix expression), or it shall have function pointer type. Calling a function through an expression whose function type is different from the function type of the called function's definition results in undefined behavior ([dcl.link]). For a call to a non-static member function, the postfix expression shall be an implicit ([class.mfct.non-static], [class.static]) or explicit class member access whose id-expression is a function member name, or a pointer-to-member expression selecting a function member; the call is as a member of the class object referred to by the object expression. In the case of an implicit class member access, the implied object is the one pointed to by this. [ Note: A member function call of the form f() is interpreted as (*this).f() (see [class.mfct.non-static]). — end note ] If a function or member function name is used, the name can be overloaded, in which case the appropriate function shall be selected according to the rules in [over.match]. If the selected function is non-virtual, or if the id-expression in the class member access expression is a qualified-id, that function is called. Otherwise, its final overrider in the dynamic type of the object expression is called; such a call is referred to as a virtual function call. [ Note: The dynamic type is the type of the object referred to by the current value of the object expression. 
[class.cdtor] describes the behavior of virtual function calls when the object expression refers to an object under construction or destruction. — end note ]
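For comparison, one way to sidestep the mismatch is to give the callback itself C language linkage. The sketch below keeps everything in one C++ file so that it is self-contained; in the question, c_f is defined in other.c. The names c_callback, record_value, last_value and check_c_callback are made up for illustration.

```cpp
#include <cassert>

extern "C" {
    // The function type named by this typedef has C language linkage.
    typedef void (*c_callback)(int);

    // Stand-in for the c_f that the question defines in other.c.
    void c_f(c_callback pf) { pf(2); }
}

static int last_value = 0;

// The callback's function type has C language linkage, so its address
// matches the parameter type of c_f exactly.
extern "C" void record_value(int i) { last_value = i; }

// Small self-check; call it from main() to try the sketch.
void check_c_callback()
{
    c_f(&record_value);
    assert(last_value == 2);
}
```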

C++11 initializer with ambiguous function id-expression?

In the following C++11 code:
void f(int) {}
void f(double) {}
void (*p)(int) = f;
There are two functions.
The third f identifier is an id-expression and the initializer of p.
In 5.1.1p8 [expr.prim.general]/8 it says:
The type of the [id-expression] is the type of the identifier.
The result is the entity denoted by the identifier. The result is an lvalue if the entity is a function, variable, or data member and a prvalue otherwise.
Given that f could be referring to two different entities with two different types, there is no "the entity" or "the type".
Is there some other text in the standard that addresses this situation?
Do implementations just disambiguate this as an extension or is it required somewhere? (Without some other text one could argue that an implementation could reject the f id-expression as ambiguous.)
The standard (at § 13.4) defines that:
A use of an overloaded function name without arguments is resolved in
certain contexts to a function, a pointer to function or a pointer to
member function for a specific function from the overload set. A
function template name is considered to name a set of overloaded
functions in such contexts. The function selected is the one whose
type is identical to the function type of the target type required in
the context.
Emphasis mine.
After the quote, there is an example (at § 13.4/5) that resembles yours:
int f(double);
int f(int);
int (*pfd)(double) = &f; // selects f(double)
int (*pfi)(int) = &f; // selects f(int)
As far as the unary & is concerned, the standard specifies that (at § 5.3.1/6 and thanks to jogojapan):
The address of an overloaded function can be taken only in a context
that uniquely determines which version of the overloaded function is
referred to.
but the & can also be omitted (at § 13.4/1):
The overloaded function name can be preceded by the & operator.
(again, emphasis mine) just like you did, in your example.
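As a minimal sketch of the rule quoted above, the following shows two contexts that supply a target type: an initialization and an explicit cast. The overload set pick and the helper names are hypothetical, chosen to mirror the f(int)/f(double) pair.

```cpp
#include <cassert>

// Hypothetical overload set, mirroring the f(int)/f(double) pair above.
int pick(int) { return 1; }
int pick(double) { return 2; }

int call_int_overload()
{
    int (*p)(int) = pick;   // the target type selects pick(int); & is optional
    return p(0);
}

int call_double_overload()
{
    // A cast also supplies a target type, here selecting pick(double).
    auto p = static_cast<int (*)(double)>(&pick);
    return p(0);
}

// Small self-check; call it from main() to try the sketch.
void check_overload_selection()
{
    assert(call_int_overload() == 1);
    assert(call_double_overload() == 2);
}
```

Without a target type, for example in auto p = pick;, there is nothing to select an overload and the initialization is ill-formed.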