I know this is silly and ugly, but I'm migrating some code automatically. My source language allows implicit conversion between strings and ints, and for example this is allowed:
var = "hello " + 2
print(var) # prints "hello 2"
How can I overload the + operator for const char* and int in C++? I'm getting the error:
error: ‘std::string operator+(char* const&, int)’ must have an
argument of class or enumerated type
What you are asking for is illegal
To legally overload an operator at least one of the operands involved has to be a user-defined type. Since neither char* nor int is user-defined, what you are trying to accomplish isn't possible.
What you are trying to do is intentionally and explicitly disallowed by the standard. Wouldn't it be weird if 1+3 suddenly evaluated to 42 because someone "clever" had defined an overload for operator+(int, int)?
What does the Standard say? (n3337)
13.3.1.2p1-2 Operators in expressions [over.match.oper]
If no operand of an operator in an expression has a type that is a class or an enumeration, the operator is assumed to be a built-in operator and interpreted according to Clause 5.
If either operand has a type that is a class or an enumeration, a user-defined operator function might be declared that implements this operator or a user-defined conversion can be necessary to convert the operand to a type that is appropriate for a built-in operator.
(Note: the wording is the same in both C++03 and the next revision of the standard, C++14.)
Related
In C++, I'm searching for the crucial sections of the standard
explaining the subtle difference in behavior I've observed between the
language's two pointer-to-member access operators, .* and ->*.
According to my test program shown below, whilst ->* seems to allow its
right-hand expression to be of any type implicitly convertible to
pointer to member of S, .* does not. When compiling with
gcc and clang, both compilers yield errors for the line marked '(2)'
stating that my class Offset cannot be used as a member pointer.
Test Program
https://godbolt.org/z/46nMPvKxE
#include <iostream>
struct S { int m; };
template<typename C, typename M>
struct Offset
{
M C::* value;
operator M C::* () { return value; } // implicit conversion function
};
int main()
{
S s{42};
S* ps = &s;
Offset<S, int> offset{&S::m};
std::cout << ps->*offset << '\n'; // (1) ok
std::cout << s.*offset << '\n'; // (2) error
std::cout.flush();
}
Compiler Output
GCC 12.2:
'offset' cannot be used as a member pointer, since it is of type 'Offset<S, int>'
clang 15.0:
right hand operand to .* has non-pointer-to-member type 'Offset<S, int>'
Program Variation
In order to prove that ->* actually performs an implicit conversion using Offset's
conversion function in the test program shown above, I declared it explicit for test purposes,
explicit operator M C::* () { return value; } // no longer implicit conversion function
which causes the compilers to also yield errors for the line marked '(1)':
GCC 12.2:
error: no match for 'operator->*' (operand types are 'S*' and 'Offset<S, int>')
note: candidate: 'operator->*(S*, int S::*)' (built-in)
note: no known conversion for argument 2 from 'Offset<S, int>' to 'int S::*'
clang 15.0:
error: right hand operand to ->* has non-pointer-to-member type 'Offset<S, int>'
Research
Whilst there is a well-documented difference between the two operators in that
->* is overloadable and .* is not, my code obviously does not make use of this option
but rather relies on the built-in operator ->* defined for raw pointer type S*.
Besides differences in overloadability, I merely found documentation stating the similarity
of the expressions. Cited from the standard (https://open-std.org/jtc1/sc22/wg21/docs/papers/2020/n4868.pdf):
[7.6.4.2] The binary operator .* binds its second operand, which shall be of type “pointer to member of T” to its first
operand, which shall be a glvalue of class T or of a class of which T is an unambiguous and accessible base
class. The result is an object or a function of the type specified by the second operand.
[7.6.4.3] [...] The expression E1->*E2 is converted into the equivalent form (*(E1)).*E2.
And cited from cppreference.com (https://en.cppreference.com/w/cpp/language/operator_member_access#Built-in_pointer-to-member_access_operators):
The second operand of both operators is an expression of type pointer to member ( data or function) of T or pointer to member of an unambiguous and accessible base class B of T.
The expression E1->*E2 is exactly equivalent to (*E1).*E2 for built-in types; that is why the following rules address only E1.*E2.
Nowhere have I found a notion of conversion of the right hand operand.
Question
What have I overlooked? Can someone point me to an explanation of this difference in behavior?
When overloadable operators are used and at least one operand is of class or enumeration type, overload resolution is performed using a candidate set that includes built-in candidates ([over.match.oper]/3) - for ->* in particular, see [over.built]/9.
In this case, a built-in candidate is selected, so the implicit conversion is applied to the second operand, and then ->* is interpreted as the built-in operator ([over.match.oper]/11).
With .*, there's no overload resolution at all, so no implicit conversion either.
Edit: I have reformatted the post to be clearer.
Why does this work:
struct A {};
struct B {
B(A){}
};
void operator+(const B&, const B&) {}
int main()
{
A a1, a2;
a1 + a2;
}
and this does not?
struct B {
B(const char*){}
};
void operator+(const B&, const B&) {}
int main()
{
"Hello" + "world"; // error: invalid operands of types 'const char [6]' and 'const char [6]' to binary 'operator+'
}
}
Essentially, in the first example a1 and a2 both convert to B objects through the implicit conversion and use the operator+(const B&, const B&) to add.
Following from this example, I would have expected "Hello" and "world" to convert to B objects, again through the implicit constructor, and use operator+(const B&, const B&) to add to each other. Instead there is an error, which indicates the C-style strings do not attempt a user-defined conversion to B in order to add. Why is this? Is there a fundamental property that prevents this?
In your first example, overload resolution is allowed to find your operator+:
[C++14: 13.3.1.2/2]: If either operand has a type that is a class or an enumeration, a user-defined operator function might be declared that implements this operator or a user-defined conversion can be necessary to convert the operand to a type that is appropriate for a built-in operator. In this case, overload resolution is used to determine which operator function or built-in operator is to be invoked to implement the operator. [..]
[C++14: 13.3.2/1]: From the set of candidate functions constructed for a given context (13.3.1), a set of viable functions is chosen, from which the best function will be selected by comparing argument conversion sequences for the best fit (13.3.3). The selection of viable functions considers relationships between arguments and function parameters other than the ranking of conversion sequences.
[C++14: 13.3.2/2]: First, to be a viable function, a candidate function shall have enough parameters to agree in number with the arguments in the list.
If there are m arguments in the list, all candidate functions having exactly m parameters are viable.
[..]
[C++14: 13.3.2/3]: Second, for F to be a viable function, there shall exist for each argument an implicit conversion sequence (13.3.3.1) that converts that argument to the corresponding parameter of F. [..]
(You may examine the wording for "implicit conversion sequence" yourself to see that the operator+ call is permissible; the rules are too verbose to warrant verbatim reproduction here.)
However, in your second example, overload resolution is constrained to a basic arithmetic addition mechanism (one which is not defined for const char[N] or const char*), effectively prohibiting any operator+ function from being considered:
[C++14: 13.3.1.2/1]: If no operand of an operator in an expression has a type that is a class or an enumeration, the operator is assumed to be a built-in operator and interpreted according to Clause 5.
[C++14: 5.7/1]: [..] For addition, either both operands shall have arithmetic or unscoped enumeration type, or one operand shall be a pointer to a completely-defined object type and the other shall have integral or unscoped enumeration type. [..]
[C++14: 5.7/3]: The result of the binary + operator is the sum of the operands.
1. Explaining your compiler error:
The reason you can't concatenate two string literals using the '+' operator,
is because string literals are simply arrays of characters, and you can't concatenate two arrays.
Arrays are implicitly converted to a pointer to their first element.
Or as the standard describes it:
[conv.array]
An lvalue or rvalue of type “array of N T” or “array of unknown bound
of T” can be converted to a prvalue of type “pointer to T”. The result
is a pointer to the first element of the array.
What you are really doing in the example above,
is trying to add two const char pointers together, and that is not possible.
2. Why the string literals aren't implicitly converted:
Since arrays and pointers are not class types, you can't give them an implicit conversion as you have done in your class example.
The main thing to keep in mind is that std::string knows how to take in char[], but char[] does not know how to become a std::string. In your first example, A plays the role of char[], but because A is a class type, overload resolution is performed and B's converting constructor can be applied.
3. Alternatives:
You can concatenate string literals by leaving out the plus operator.
"stack" "overflow"; //this will work as you intended
Optionally, you could make "stack" a std::string, and then use the std::string's overloaded '+' operator:
std::string("stack") + "overflow"; //this will work
The expression x->y requires x to be a pointer to a complete class type or, when x is an instance of a class, requires operator->() to be defined for x. But in the latter case, why can't I use a conversion function instead (i.e., convert object x to a pointer)? For example:
struct A
{
int mi;
operator A*() { return this; }
};
int main()
{
A a;
a[1]; // ok: equivalent to *(a.operator A*() + 1);
a->mi; // ERROR
}
This gives an error message:
error: base operand of '->' has non-pointer type 'A'
But the question is, why doesn't it use a.operator A*() instead, just like a[1] does?
This is due to the special overload resolution rules for operators in expressions. For most operators, if either operand has a type that is a class or an enumeration, operator functions and built-in operators compete with each other, and overload resolution determines which one is going to be used. This is what happens for a[1]. However, there are some exceptions, and the one that applies to your case is in paragraph [13.3.1.2p3.3] in the standard (emphasis mine in all quotes):
(3.3) — For the operator ,, the unary operator &, or the operator ->,
the built-in candidates set is empty. For all other operators, the
built-in candidates include all of the candidate operator functions
defined in 13.6 that, compared to the given operator,
have the same operator name, and
accept the same number of operands, and
accept operand types to which the given operand or operands can be converted according to 13.3.3.1, and
do not have the same parameter-type-list as any non-member candidate that is not a function template specialization.
So, for a[1], the user-defined conversion is used to get a pointer to which the built-in [] operator can be applied, but for the three exceptions up there, only operator functions are considered first (and there aren't any in this case). Later on, [13.3.1.2p9]:
If the operator is the operator ,, the unary operator &, or the
operator ->, and there are no viable functions, then the operator is
assumed to be the built-in operator and interpreted according to
Clause 5.
In short, for these three operators, the built-in versions are considered only if everything else fails, and then they have to work on the operands without any user-defined conversions.
As far as I can tell, this is done to avoid confusing or ambiguous behaviour. For example, built-in operators , and & would be viable for (almost) all operands, so overloading them wouldn't work if they were considered during the normal step of overload resolution.
Operator -> has an unusual behaviour when overloaded, as it can result in a chain of invocations of overloaded ->, as explained in [note 129]:
If the value returned by the operator-> function has class type, this
may result in selecting and calling another operator-> function. The
process repeats until an operator-> function returns a value of
non-class type.
I suppose the possibility that you'd start from a class that overloads ->, which returns an object of another class type, which doesn't overload -> but has a user-defined conversion to a pointer type, resulting in a final invocation of the built-in -> was considered a bit too confusing. Restricting this to explicit overloading of -> looks safer.
All quotes are from N4431, the current working draft, but the relevant parts haven't changed since C++11.
I don't have the standard to hand, perhaps someone can come in and present a better answer after me. However, from the narrative on cppreference.com:
The left operand of the built-in operator. and operator-> is an expression of complete scalar type T (for operator.) or pointer to complete scalar type T* (for operator->), which is evaluated before the operator can be called. The right operand is the name of a member object or member function of T or of one of T's base classes, e.g. expr.member, optionally qualified, e.g. expr.name::member, optionally using template disambiguator, e.g. expr.template member.
The expression A->B is exactly equivalent to (*A).B for builtin types. If a user-defined operator-> is provided, operator-> is called again on the value that it returns, recursively, until the operator-> is reached that returns a plain pointer. After that, builtin semantics are applied to that pointer.
Emphasis is mine.
If operator -> is to be called recursively on the result of another operator -> (which will have a pointer return type), it strongly implies that operator -> must be called on a pointer type.
I am trying to have a C++ class that can be implicitly converted to std::array. Conversion works, but it is not implicit.
#include <array>
class A {
private:
std::array<float, 7> data;
public:
operator std::array<float, 7>&() { return data; }
operator const std::array<float, 7>&() const { return data; }
};
int main() {
A a;
a[1] = 0.5f; // fails to compile
auto it = a.begin(); // fails to compile
A b;
static_cast<std::array<float, 7>>(b)[1] = 0.5f; //ok
auto it2 = static_cast<std::array<float, 7>>(b).begin(); //ok
return 0;
}
I understand the above example is quite convoluted, as it basically completely exposes a private member of the class. But this is an oversimplified example; I am just trying to understand why implicit conversions to std::array do not work.
I have tried the above example with both clang-3.2 and gcc-4.8. Neither compiles.
Even more perplexing is that if I use implicit conversion to pointer, compilation apparently succeeds:
operator float *() { return data.begin(); }
operator const float *() const { return data.cbegin(); }
But of course, this means losing the many niceties of std::array, which I will accept if there isn't a better solution.
I'm answering your question from a comment:
Could you please elaborate on why my conversion does not make sense? While trying to resolve operator[], why should the compiler not consider possible conversions?
Short answer: because that's how it works. Here, a conversion operator to a built-in type can be called, but not one to a user-defined type.
A bit longer answer:
When an operator is used in an expression, overload resolution follows the rules laid out in 13.3.1.2.
First:
2 If either operand has a type that is a class or an enumeration, a user-defined operator function might be
declared that implements this operator or a user-defined conversion can be necessary to convert the operand
to a type that is appropriate for a built-in operator. In this case, overload resolution is used to determine
which operator function or built-in operator is to be invoked to implement the operator [...].
For this purpose, a[1] is interpreted as a.operator[](1), as shown in Table 11 in the same section.
The lookup is then performed as follows:
3 For a unary operator # with an operand of a type whose cv-unqualified version is T1, and for a binary
operator # with a left operand of a type whose cv-unqualified version is T1 and a right operand of a type
whose cv-unqualified version is T2, three sets of candidate functions, designated member candidates, non-
member candidates and built-in candidates, are constructed as follows:
— If T1 is a complete class type, the set of member candidates is the result of the qualified lookup of
T1::operator# (13.3.1.1.1); otherwise, the set of member candidates is empty. [1]
— The set of non-member candidates is the result of the unqualified lookup of operator# in the context
of the expression according to the usual rules for name lookup in unqualified function calls (3.4.2)
except that all member functions are ignored. However, if no operand has a class type, only those
non-member functions in the lookup set that have a first parameter of type T1 or “reference to (possibly
cv-qualified) T1”, when T1 is an enumeration type, or (if there is a right operand) a second parameter
of type T2 or “reference to (possibly cv-qualified) T2”, when T2 is an enumeration type, are candidate
functions. [2]
— For the operator ,, the unary operator &, or the operator ->, the built-in candidates set is empty.
For all other operators, the built-in candidates include all of the candidate operator functions defined
in 13.6 that, compared to the given operator,
— have the same operator name, and
— accept the same number of operands, and
— accept operand types to which the given operand or operands can be converted according to
13.3.3.1, and [3]
— do not have the same parameter-type-list as any non-template non-member candidate.
The result is as follows:
[1] finds nothing (there's no operator[] in your class)
[2] finds nothing (there's no free function operator[] and neither operand is of enumeration type)
[3] finds built-in operator[](float*, std::ptrdiff_t) because A declares a conversion to float*
You can get them to work by overloading operator[] and begin() on A, or publicly inheriting from array (not recommended though).
The implicit conversion only works when it makes sense (say if you passed an A to a function that expects a std::array<float, 7>), not in your case. And that's a good thing if you ask me.
I was just browsing through the draft of the C++11 standard and found the following puzzling statement (§13.6/8):
For every type T there exist candidate operator functions of the form
T* operator+(T*);
How should this "unary +" operator on pointer be understood? Is this just a no-op in the normal case, which can nevertheless be overloaded? Or is there some deeper point I am missing here?
The + on pointers is a no-op except for turning things into rvalues. It is sometimes handy if you want to decay arrays or functions:
int a[] = { 1, 2, 3 };
auto &&x = +a;
Now x is an int*&& and not an int(&)[3]. If you want to pass x or +a to templates, this difference might become important. a + 0 is not always equivalent, consider
struct forward_decl;
extern forward_decl a[];
auto &&x = +a; // well-formed
auto &&y = a + 0; // ill-formed
The last line is ill-formed, because adding anything to a pointer requires the pointer's pointed-to class type to be completely defined (because it advances by sizeof(forward_decl) * N bytes).
The answer to your question is just a page above the quote you cited — §13.6/1:
The candidate operator functions that represent the built-in operators defined in Clause 5 are specified in this subclause. These candidate functions participate in the operator overload resolution process as described in 13.3.1.2 and are used for no other purpose. [ Note: Because built-in operators take only operands with non-class type, and operator overload resolution occurs only when an operand expression originally has class or enumeration type, operator overload resolution can resolve to a built-in operator only when an operand has a class type that has a user-defined conversion to a non-class type appropriate for the operator, or when an operand has an enumeration type that can be converted to a type appropriate for the operator. Also note that some of the candidate operator functions given in this subclause are more permissive than the built-in operators themselves. As described in 13.3.1.2, after a built-in operator is selected by overload resolution the expression is subject to the requirements for the built-in operator given in Clause 5, and therefore to any additional semantic constraints given there. If there is a user-written candidate with the same name and parameter types as a built-in candidate operator function, the built-in operator function is hidden and is not included in the set of candidate functions. —end note ]
Well, you could overload it to do whatever you want, but it's just there for symmetry with the unary - operator. As you mention, it's just a no-op most of the time.