Operator Functions Calling Mechanism - c++

Is the calling of an operator function similar to a normal function call?
When a function call is encountered, its local variables, parameters, and the return address are pushed onto the call stack. Does this happen when we use an operator? If it does, then the operator function's frame should be removed from the stack after execution finishes, right?
Well, some part of me says that it doesn't happen that way because we're returning a reference to a local object, which will be destroyed after the execution is finished.
I just want to know the details of it.
#include <stdio.h>
class OUT
{};
OUT & operator<<(OUT & out, int x)
{
    printf("%d",x);
    return out;
}
int main()
{
    OUT print;
    print<<3<<4;
}

Yes, a use of an overloaded operator function is semantically a function call.
[over.match.oper]/2 in the C++ Standard says, emphasis mine:
If either operand [in an operator expression] has a type that is a class or an enumeration, a user-defined operator function might be declared that implements this operator or a user-defined conversion can be necessary to convert the operand to a type that is appropriate for a built-in operator. In this case, overload resolution is used to determine which operator function or built-in operator is to be invoked to implement the operator. Therefore, the operator notation is first transformed to the equivalent function-call notation as summarized in Table 12 ....
So the Standard rules about object lifetimes apply in exactly the same ways. There's also no reason a compiler's manipulation of behind-the-scenes things like a call stack would need to be different.
Your example is fine not because there's something special about operator functions, but because it doesn't return a reference to a local object. In return out;, out names the function parameter with reference type, so it refers to some other object from outside of the function scope. In this case, out refers to the variable print in main, and the lifetime of print goes to the end of main.
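To make the transformation to function-call notation concrete, here is a minimal sketch built on the question's own OUT class. The helper name both_notations_return_caller_object is hypothetical and exists only for illustration.

```cpp
#include <cassert>
#include <cstdio>

class OUT {};

// Same operator as in the question: prints x and returns the reference it received.
OUT& operator<<(OUT& out, int x)
{
    std::printf("%d", x);
    return out;
}

// Hypothetical helper: checks that both notations hand back the caller's object.
bool both_notations_return_caller_object()
{
    OUT print;
    OUT& a = print << 3 << 4;                      // operator notation
    OUT& b = operator<<(operator<<(print, 3), 4);  // equivalent function-call notation
    assert(&a == &b);                              // both refer to the same object
    return &a == &print;                           // and that object is the caller's variable
}
```

Both forms are ordinary calls to the same function, and the reference that comes back refers to print in the caller, not to anything local to operator<<.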

Related

Copying c++ lambda to Function Pointer reference

I am not sure if I have defined behaviour in the following situation:
My Function pointer type:
typedef void (*DoAfter_cb_type)(void);
The Function which should assign callbacks:
void DoSomething(DoAfter_cb_type & DoAfter_cb)
{
    //...
    DoAfter_cb = [](){
        //...
    };
}
Caller:
DoAfter_cb_type DoAfter_cb = nullptr;
DoSomething(DoAfter_cb);
// Here is something that has to be done after DoSomething but before DoAfter_cb.
if( DoAfter_cb != nullptr){
    DoAfter_cb();
}
As I learned here lambdas can be implicitly converted to function pointers.
However, those are still pointers, and I fear that something important for calling the lambda is stored on the stack and would be out of scope once I just return the function pointer.
I have to use function pointers because I do not have access to std::function in my environment.
With std::function I would expect the lambda object to be stored in the reference variable and I would not have any problems.
Is the behaviour the same as if I had just defined an ordinary function, or do I have any side effects here?
Yes, it's the same. A captureless lambda is convertible to a regular function pointer because, to quote the C++ standard ([expr.prim.lambda.closure]/6, emphasis mine):
The closure type for a non-generic lambda-expression with no
lambda-capture has a conversion function to pointer to function with
C++ language linkage having the same parameter and return types as the
closure type's function call operator. The conversion is to “pointer
to noexcept function” if the function call operator has a non-throwing
exception specification. The value returned by this conversion
function is the address of a function F that, when invoked, has the
same effect as invoking the closure type's function call operator.
So while the lambda goes out of scope, that pointer is backed by a proper function, just as if you had written it yourself at file scope. Functions "live" throughout the entire execution of the program, so the pointer will be valid, always.
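A minimal sketch of the situation in the question, assuming a global counter (g_calls, hypothetical) so that the callback's effect can be observed. The helper run_callback_after_return is also made up for illustration.

```cpp
#include <cassert>

typedef void (*DoAfter_cb_type)(void);

// Hypothetical counter so the effect of the callback is observable.
static int g_calls = 0;

void DoSomething(DoAfter_cb_type& DoAfter_cb)
{
    // No captures, so the closure converts to a plain function pointer.
    // The closure object itself is a temporary and is gone after this
    // statement; the function the pointer refers to is not.
    DoAfter_cb = []() { ++g_calls; };
}

// Hypothetical helper: invokes the callback after DoSomething has returned.
int run_callback_after_return()
{
    g_calls = 0;
    DoAfter_cb_type cb = nullptr;
    DoSomething(cb);
    assert(cb != nullptr);
    cb();               // calls an ordinary function; nothing on DoSomething's stack is needed
    return g_calls;
}
```

This only works because the lambda captures nothing. A lambda with captures has no conversion to a function pointer, so the assignment would not compile in the first place.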

About lambdas, conversions to function pointers and visibility of private data members

Consider the following example:
#include <cassert>
struct S {
    auto func() { return +[](S &s) { s.x++; }; }
    int get() { return x; }
private:
    int x{0};
};
int main() {
    S s;
    s.func()(s);
    assert(s.get() == 1);
}
It compiles with both G++ and clang, so I'm inclined to assume that it is allowed by the standard.
However, the lambda has no capture list and it cannot have it because of the + that forces the conversion to a function pointer. Therefore, I expected it was not allowed to access private data members of S.
Instead, it behaves more or less as if it were defined as a static member function.
So far, so good. If I knew it before, I would have used this trick often to avoid writing redundant code.
What I'd like to know now is where in the standard (the working draft is fine) this is defined, for I've not been able to find the section, the bullet or whatever that rules about it.
Is there any limitation for the lambda, or does it work exactly as if it were defined as a static member function?
For lambda expressions inside the member function, according to §8.4.5.1/2 Closure types [expr.prim.lambda.closure]:
The closure type is declared in the smallest block scope, class scope, or namespace scope that contains the corresponding lambda-expression.
That means the lambda closure type will be declared inside the member function, i.e. a local class. And according to §14/2 Member access control [class.access]:
(emphasis mine)
A member of a class can also access all the names to which the class has access. A local class of a member
function may access the same names that the member function itself may access.
That means for the lambda expression itself, it could access the private members of S, same as the member function func.
And §8.4.5.1/7 Closure types [expr.prim.lambda.closure]:
(emphasis mine)
The closure type for a non-generic lambda-expression with no lambda-capture whose constraints (if any) are satisfied has a conversion function to pointer to function with C++ language linkage having the same parameter and return types as the closure type's function call operator. ... The value returned by this conversion function is the address of a function F that, when invoked, has the same effect as invoking the closure type's function call operator.
That means when the converted function pointer gets invoked the same rule applies.
However, the lambda has no capture list and it cannot have it because of the + that forces the conversion to a function pointer.
The unary + does select the closure type's conversion function, so func() returns a plain function pointer, but that conversion exists only because the lambda captures nothing, and it has no effect on access checking. The lambda body belongs to a closure type declared inside func, so it may access the same names that the member function itself may access, and calling through the pointer has the same effect as calling the lambda.
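As a quick check, the following sketch (assuming C++14 or later for the deduced return type) verifies what type func() returns and that a call through the result still reaches the private member. The helper increment_through_pointer is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

struct S {
    // Unary + applied to a captureless lambda yields a plain function pointer.
    auto func() { return +[](S& s) { s.x++; }; }
    int get() { return x; }
private:
    int x{0};
};

// The deduced return type of func() is a pointer to function, not a closure type.
static_assert(std::is_same<decltype(S{}.func()), void (*)(S&)>::value,
              "func() returns a pointer to function");

// Hypothetical helper: calls the lambda body twice through the pointer.
int increment_through_pointer()
{
    S s;
    void (*fp)(S&) = s.func();
    fp(s);                    // the lambda body was access-checked inside S::func
    assert(s.get() == 1);
    fp(s);
    return s.get();
}
```

Access is checked where the lambda body is written, which is inside a member function of S. It is not checked again where the pointer is called.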

Use "operator T*()" instead of "T* operator->()" for member access

The expression x->y requires x to be a pointer to a complete class type or, when x is an instance of a class, requires operator->() to be defined for x. But in the latter case, why can't I use a conversion function instead (i.e., convert the object x to a pointer)? For example:
struct A
{
    int mi;
    operator A*() { return this; }
};
int main()
{
    A a;
    a[1]; // ok: equivalent to *(a.operator A*() + 1);
    a->mi; // ERROR
}
This gives an error message:
error: base operand of '->' has non-pointer type 'A'
But the question is, why doesn't it use a.operator A*() instead, just like a[1] does?
This is due to the special overload resolution rules for operators in expressions. For most operators, if either operand has a type that is a class or an enumeration, operator functions and built-in operators compete with each other, and overload resolution determines which one is going to be used. This is what happens for a[1]. However, there are some exceptions, and the one that applies to your case is in paragraph [13.3.1.2p3.3] in the standard (emphasis mine in all quotes):
(3.3) — For the operator ,, the unary operator &, or the operator ->,
the built-in candidates set is empty. For all other operators, the
built-in candidates include all of the candidate operator functions
defined in 13.6 that, compared to the given operator,
have the same operator name, and
accept the same number of operands, and
accept operand types to which the given operand or operands can be converted according to 13.3.3.1, and
do not have the same parameter-type-list as any non-member candidate that is not a function template specialization.
So, for a[1], the user-defined conversion is used to get a pointer to which the built-in [] operator can be applied, but for the three exceptions up there, only operator functions are considered first (and there aren't any in this case). Later on, [13.3.1.2p9]:
If the operator is the operator ,, the unary operator &, or the
operator ->, and there are no viable functions, then the operator is
assumed to be the built-in operator and interpreted according to
Clause 5.
In short, for these three operators, the built-in versions are considered only if everything else fails, and then they have to work on the operands without any user-defined conversions.
As far as I can tell, this is done to avoid confusing or ambiguous behaviour. For example, built-in operators , and & would be viable for (almost) all operands, so overloading them wouldn't work if they were considered during the normal step of overload resolution.
Operator -> has an unusual behaviour when overloaded, as it can result in a chain of invocations of overloaded ->, as explained in [note 129]:
If the value returned by the operator-> function has class type, this
may result in selecting and calling another operator-> function. The
process repeats until an operator-> function returns a value of
non-class type.
I suppose the possibility that you'd start from a class that overloads ->, which returns an object of another class type, which doesn't overload -> but has a user-defined conversion to a pointer type, resulting in a final invocation of the built-in -> was considered a bit too confusing. Restricting this to explicit overloading of -> looks safer.
All quotes are from N4431, the current working draft, but the relevant parts haven't changed since C++11.
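A minimal sketch of the working alternative: give A an explicit operator->(), and the chaining described in the note follows. Wrapper and read_through_chain are hypothetical names added for illustration.

```cpp
#include <cassert>

struct A {
    int mi = 0;
    operator A*() { return this; }    // conversion function: not considered for ->
    A* operator->() { return this; }  // explicit overload: this is what -> uses
};

// Hypothetical wrapper whose operator-> returns a class type, so the
// compiler applies operator-> again to the result.
struct Wrapper {
    A a;
    A& operator->() { return a; }
};

// Hypothetical helper exercising both the direct and the chained case.
int read_through_chain()
{
    A direct;
    direct->mi = 1;          // A::operator->(), then built-in -> on the A*
    assert(direct.mi == 1);

    Wrapper w;
    w.a.mi = 7;
    return w->mi;            // Wrapper::operator->(), A::operator->(), then built-in ->
}
```

The chain ends as soon as an operator-> returns a raw pointer. At no point does the compiler fall back to operator A*() to obtain one.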
I don't have the standard to hand, perhaps someone can come in and present a better answer after me. However, from the narrative on cppreference.com:
The left operand of the built-in operator. and operator-> is an expression of complete scalar type T (for operator.) or pointer to complete scalar type T* (for operator->), which is evaluated before the operator can be called. The right operand is the name of a member object or member function of T or of one of T's base classes, e.g. expr.member, optionally qualified, e.g. expr.name::member, optionally using template disambiguator, e.g. expr.template member.
The expression A->B is exactly equivalent to (*A).B for builtin types. If a user-defined operator-> is provided, operator-> is called again on the value that it returns, recursively, until the operator-> is reached that returns a plain pointer. After that, builtin semantics are applied to that pointer.
Emphasis is mine.
If operator -> is to be called recursively on the result of another operator -> (which will have a pointer return type), it strongly implies that operator -> must be called on a pointer type.

What is "myfunc" vs "myfunc()"

A noob question that probably applies to C as well as C++. Let's say I have
void myfunc() {
    blah;
}
So, I call this function with:
myfunc();
However, no compiler error is produced when I "call" it with:
myfunc;
Program runs, but myfunc doesn't get called. So, what is C++ interpreting this as?
Now, I'm doing this in the Arduino IDE, all one big lump of code, so I don't get segfaults, etc. So maybe this would throw a runtime error on a dynamically linked host.
myfunc without the parens is the address of the function in memory.
For example, if you have to pass a function to some other function, you would do it with that.
A good example of this is in bsearch in the c standard library, where you need to pass a user defined comparator function in order to do a generic search.
The compiler just evaluates the expression. Since you're evaluating the name of a function, it's basically a no-op.
It's just like this:
int main() {
    42; // evaluates 42 but does nothing with it
}
Your compiler should warn you that the result of the expression is unused, anyway.
In C myfunc or any other function name represents the function itself, which will be implicitly converted to a function pointer
Function to pointer
An lvalue of function type T can be implicitly converted to a prvalue pointer to that function. This does not apply to non-static member functions because lvalues that refer to non-static member functions do not exist.
https://en.cppreference.com/w/cpp/language/implicit_conversion#Function_to_pointer
and () is an operator that, when applied to a function pointer or a function object, invokes that function
Built-in function call operator
The function call expressions have the form
E ( A1, A2, A3,... )
where
E is an expression that names a function
A1, A2, A3,... is a possibly empty list of arbitrary expressions, except the comma operator is not allowed at the top level to avoid ambiguity.
The expression that names the function can be
lvalue expression that refers to a function
pointer to function
explicit class member access expression that selects a member function
implicit class member access expression, e.g. member function name used within another member function.
So without the function-call operator myfunc; is just a no-op expression that contains a function pointer. If you've turned on compiler warnings (which you should really do) then they'd shout at you about the issue. GCC says that
statement is a reference, not call, to function 'func' [-Waddress]
warning: statement has no effect [-Wunused-value]
while Clang outputs warning: expression result unused [-Wunused-value]
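The difference can be sketched as follows, assuming a global counter (g_count, hypothetical) so that actual calls can be observed. The helper count_calls is also made up for illustration.

```cpp
#include <cassert>

static int g_count = 0;   // hypothetical counter to observe calls

void myfunc() { ++g_count; }

// Hypothetical helper: returns how many times myfunc actually ran.
int count_calls()
{
    g_count = 0;
    (void)myfunc;              // names the function; nothing is called
    assert(g_count == 0);
    void (*fp)() = myfunc;     // function-to-pointer conversion
    fp();                      // call through the pointer
    myfunc();                  // ordinary call
    return g_count;
}
```

The cast to void is there only to silence the unused-value warning quoted above; without it the statement behaves the same and the compiler warns.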

Why is f(x).swap(v) okay but v.swap(f(x)) is not?

We have:
vector<int> f(int);
vector<int> v;
This works:
f(x).swap(v);
This doesn't:
v.swap(f(x));
And why?
swap() takes a non-const reference to a vector<int>. A non-const reference cannot bind to an rvalue (a temporary object). A call to a function that returns by value (like f) is an rvalue.
The reason that f(x).swap(v) works is that a non-const member function may be called on a temporary. Inside std::vector<int>::swap, the temporary object returned by f(x) is reached through this, and *this is an lvalue.
You are allowed to call member functions on temporaries but in C++ they cannot be bound to non-const references.
For example:
int &x = 5; // illegal because temporary int(5) cannot be bound to non-const reference x
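A small sketch of the binding rules described above. The body of f is an assumption (the question only declares it), and swap_with_temporary is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed body for illustration: returns a vector of n ones.
std::vector<int> f(int n) { return std::vector<int>(n, 1); }

// Hypothetical helper: swaps a temporary into v and reports v's new size.
std::size_t swap_with_temporary()
{
    std::vector<int> v;
    f(3).swap(v);            // OK: member function called on a temporary
    // v.swap(f(3));         // error: non-const lvalue reference cannot bind to a temporary
    const int& r = 5;        // OK: a const reference may bind to a temporary
    assert(r == 5);
    return v.size();
}
```

Since C++11, v = f(3); is usually the simpler way to take over the temporary's contents, because the vector is move-assigned.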
Actually, while James' answer is certainly right (and so is Prasoon's), there is an underlying issue to grasp.
When we reduce f(x) to its result y, and y.swap(v) (or v.swap(y), it doesn't matter in this case) to use generalized identifier names, it becomes
y.func(v)
Now, func() being a member function with one argument, it actually has two arguments: what's been passed in as v, and the implicit this pointer every non-static member function receives, here bound to y. Tossing encapsulation aside, every member function called as y.func(v) could be made a non-member function to be called as func(y,v). (And in fact, there actually are non-member swap() functions. Also, each time you need to overload one of those binary operators that could be overloaded both as members or non-members, you have to make this decision.)
However, there are subtle differences between y.func(v) and func(y,v), because C++ treats the this argument, the argument that's passed by writing it before the . (the dot), differently from the other arguments, and it does so in many ways.
As you have discovered, the this argument might be an rvalue (temporary) even for non-const member functions, while for the other arguments, a non-const reference prevents rvalues from being bound to the argument. Also, the this argument's run-time type might influence which function is called (for virtual members), while the other arguments' run-time type is irrelevant, because a function is chosen depending only on their compile-time type. And implicit conversions are only ever applied to the explicit arguments of a member function, but never to its implicit this argument. (That's why you can pass a string literal for a const std::string&, but cannot call std::string::size() on a string literal.)
So, to conclude, despite the fact that what's before the . ends up as an (implicit) function argument, it's actually treated very differently from the other function arguments.
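The string-literal remark can be sketched like this. length_of and demo_implicit_conversion are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical free function taking its argument by const reference.
std::size_t length_of(const std::string& s) { return s.size(); }

// Hypothetical helper contrasting an explicit argument with the object expression.
std::size_t demo_implicit_conversion()
{
    // OK: the literal is implicitly converted to a temporary std::string.
    std::size_t n = length_of("hello");
    // "hello".size();       // error: no conversion is applied to the object expression
    assert(std::string("hello").size() == n);   // an explicitly created temporary works
    return n;
}
```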