Why is it not possible to overload the ternary operator ' ?: '?
I use the ternary operator often to consolidate if statements, and am curious why the language designers chose to forbid this operator from being overloaded. I looked for an explanation as to why in C++ Operator Overloading but did not find one describing why this isn't possible. The only information the footnote provides is that it cannot be overloaded.
My initial guess is that overloading the operator will almost always violate number one or two of the principles given in the link above. The meaning of the overload will rarely be obvious or clear or it will deviate from its original known semantics.
So my question is more of why is this not possible rather than how, as I know it cannot be done.
If you could overload the ternary operator, you would have to write something like this:
xxx operator ?: ( bool condition, xxx trueVal, xxx falseVal );
To call your overload, the compiler would have to evaluate both trueVal and falseVal. That's not how the built-in ternary operator works: it evaluates only one of them, which is why you can write things like:
return p == NULL ? 23 : p->value;
without worrying about indirecting through a NULL pointer.
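The difference can be made visible with a small sketch. The function `choose` below is a hypothetical stand-in for an overloaded `?:` (no such overload can actually be declared), and the counters and branch functions are names invented for this illustration.

```cpp
#include <cassert>

// Count how many times each branch expression is evaluated.
static int true_evals = 0;
static int false_evals = 0;

int true_branch()  { ++true_evals;  return 23; }
int false_branch() { ++false_evals; return 42; }

// Hypothetical stand-in for an overloaded ?: operator. Like any function,
// it receives arguments that have already been evaluated.
int choose(bool condition, int true_val, int false_val) {
    return condition ? true_val : false_val;
}

void demo_choose() {
    bool cond = true;

    true_evals = false_evals = 0;
    int a = cond ? true_branch() : false_branch();      // built-in: one branch runs
    assert(a == 23 && true_evals == 1 && false_evals == 0);

    true_evals = false_evals = 0;
    int b = choose(cond, true_branch(), false_branch()); // function: both run
    assert(b == 23 && true_evals == 1 && false_evals == 1);
}
```

With the function form, an argument such as `p->value` would be evaluated even when `p` is null, which is exactly the case the built-in operator protects against.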
I think the main reason at the time was that it didn't seem worth
the effort of inventing a new syntax just for that operator.
There is no token ?:, so you'd have to create a number of
special grammar rules just for it. (The current grammar rule
has operator followed by an operator, which is a single
token.)
As we've learned (from experience) to use operator overloading
more reasonably, it has become apparent that we really shouldn't
have allowed overloading of && and || either, for the
reasons other responses have pointed out, and probably not
operator comma as well (since the overloaded versions won't have
the sequence point which the user expects). So the motivation
to support it is even less than it was originally.
One of the principles of the ternary operator is that the true and false expressions are each evaluated only when the conditional expression selects them.
cond ? expr1 : expr2
In this example expr1 is only evaluated if cond is true while expr2 is only evaluated if cond is false. Keeping that in mind, let's look at what a signature for ternary overloading would look like (using fixed types here instead of a template for simplicity):
Result operator?(const Result& left, const Result& right) {
...
}
This signature simply isn't legal because it violates the exact semantics I described. In order to call this method the language would have to evaluate both expr1 and expr2 hence they are no longer conditionally evaluated. In order to support ternary the operator would either need to
Take a lambda for each value so it could produce them on demand. This would necessarily complicate the calling code though because it would have to take into account lambda call semantics where no lambda was logically present
The ternary operator would need to return a value to denote whether the compiler should use expr1 or expr2
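The first option can be sketched with an ordinary function today. `lazy_choose`, `Node`, and `value_or_default` are hypothetical names used only for this illustration; the point is just that wrapping each branch in a lambda defers its evaluation, at the cost of the extra syntax at the call site that the answer mentions.

```cpp
#include <cassert>

// Hypothetical helper (not a standard facility): each branch is passed as a
// callable, so only the selected one is ever invoked.
template <typename OnTrue, typename OnFalse>
auto lazy_choose(bool condition, OnTrue on_true, OnFalse on_false)
    -> decltype(on_true()) {
    return condition ? on_true() : on_false();
}

struct Node { int value; };

int value_or_default(const Node* p) {
    // The dereference sits inside a lambda, so it only runs when p is non-null.
    return lazy_choose(p == nullptr,
                       [] { return 23; },
                       [p] { return p->value; });
}
```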
EDIT
Some may argue that the lack of short circuiting in this scenario is fine. The reason being that C++ already allows you to violate short circuiting in operator overloads with || and &&
Result operator&&(const Result& left, const Result& right) {
...
}
Though I still find this behavior baffling even for C++.
The short and accurate answer is simply "because that's what Bjarne decided."
Although the arguments about which operands should be evaluated and in what sequence give a technically accurate description of what happens, they do little (nothing, really) to explain why this particular operator can't be overloaded.
In particular, the same basic arguments would apply equally well to other operators such as operator && and operator||. In the built-in version of each of these operators, the left operand is evaluated, then if and only if that produces 1 for && or a 0 for ||, the right operand is evaluated. Likewise, the (built in) comma operator evaluates its left operand, then its right operand.
In an overloaded version of any of these operators, both operands are always evaluated (in an unspecified sequence). As such, they're essentially identical to an overloaded ternary operator in this respect. They all lose the same guarantees about what operands are evaluated and in what order.
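A minimal sketch of that loss, assuming a made-up `Flag` type and a counter to record whether the right-hand operand was evaluated:

```cpp
#include <cassert>

struct Flag {
    bool value;
};

// User-defined overload: an ordinary function call, so both operands are
// evaluated before the body runs.
Flag operator&&(const Flag& lhs, const Flag& rhs) {
    return Flag{lhs.value && rhs.value};
}

static int rhs_calls = 0;

bool builtin_rhs() { ++rhs_calls; return true; }
Flag flag_rhs()    { ++rhs_calls; return Flag{true}; }

int calls_with_builtin() {
    rhs_calls = 0;
    bool lhs = false;
    bool r = lhs && builtin_rhs();        // short-circuits: builtin_rhs() never runs
    (void)r;
    return rhs_calls;
}

int calls_with_overload() {
    rhs_calls = 0;
    Flag r = Flag{false} && flag_rhs();   // overload: flag_rhs() always runs
    (void)r;
    return rhs_calls;
}

void demo_short_circuit() {
    assert(calls_with_builtin() == 0);
    assert(calls_with_overload() == 1);
}
```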
As to why Bjarne made that decision: I can see a few possibilities. One is that although it's technically an operator, the ternary operator is devoted primarily to flow control, so overloading it would be more like overloading if or while than it is like overloading most other operators.
Another possibility would be that it would be syntactically ugly, requiring the parser to deal with something like operator?:, which requires defining ?: as a token, etc. -- all requiring fairly serious changes to the C grammar. At least in my view, this argument seems pretty weak, as C++ already requires a much more complex parser than C does, and this change would really be much smaller than many other changes that have been made.
Perhaps the strongest argument of all is simply that it didn't seem like it would accomplish much. Since it is devoted primarily to flow control, changing what it does for some types of operands is unlikely to accomplish anything very useful.
For the same reason why you really should not (although you can) overload && or || operators - doing so would disable short-circuiting on those operators (evaluating only the necessary part and not everything), which can lead to severe complications.
Previous answers focused on short-circuiting, which is somewhat valid but not even the real problem with trying to do this IMO.
The closest possible implementation of the existing ternary operator (without short circuiting) would have to look like this:
template<typename T0, typename T1>
std::variant<T0, T1>&& operator?:(bool predicate, T0&& arg0, T1&& arg1)
{
if(predicate)
return { std::forward<T0&&>(arg0) };
return { std::forward<T1&&>(arg1) };
}
However, T0 might be void. T1 might be void. This won't build in either of those cases.
The variant is necessary because T0 and T1 might not be implicitly convertible to one another and the return type can't be used for function overload resolution, and that was a C++17 library addition. But it still doesn't really work, because variant isn't implicitly convertible to any of its possible types.
Does the C++ standard guarantee that (x!=y) always has the same truth value as !(x==y)?
I know there are many subtleties involved here: The operators == and != may be overloaded. They may be overloaded to have different return types (which only have to be implicitly convertible to bool). Even the !-operator might be overloaded on the return type. That's why I handwavingly referred to the "truth value" above, but trying to elaborate it further, exploiting the implicit conversion to bool, and trying to eliminate possible ambiguities:
bool ne = (x!=y);
bool e = (x==y);
bool result = (ne == (!e));
Is result guaranteed to be true here?
The C++ standard specifies the equality operators in section 5.10, but mainly seems to define them syntactically (and some semantics regarding pointer comparisons). The concept of being EqualityComparable exists, but there is no dedicated statement about the relationship of its operator == to the != operator.
There exist related documents from C++ working groups, saying that...
It is vital that equal/unequal [...] behave as boolean negations of each other. After all, the world would make no sense if both operator==() and operator!=() returned false! As such, it is common to implement these operators in terms of each other
However, this only reflects the Common Sense™, and does not specify that they have to be implemented like this.
Some background: I'm just trying to write a function that checks whether two values (of unknown type) are equal, and print an error message if this is not the case. I'd like to say that the required concept here is that the types are EqualityComparable. But for this, one would still have to write if (!(x==y)) {…} and could not write if (x!=y) {…}, because this would use a different operator, which is not covered with the concept of EqualityComparable at all, and which might even be overloaded differently...
I know that the programmer basically can do whatever he wants in his custom overloads. I just wondered whether he is really allowed to do everything, or whether there are rules imposed by the standard. Maybe one of these subtle statements that suggest that deviating from the usual implementation causes undefined behavior, like the one that NathanOliver mentioned in a comment, but which seemed to only refer to certain types. For example, the standard explicitly states that for container types, a!=b is equivalent to !(a==b) (section 23.2.1, table 95, "Container requirements").
But for general, user-defined types, it currently seems that there are no such requirements. The question is tagged language-lawyer, because I hoped for a definite statement/reference, but I know that this may nearly be impossible: While one could point out the section where it said that the operators have to be negations of each other, one can hardly prove that none of the ~1500 pages of the standard says something like this...
In doubt, and unless there are further hints, I'll upvote/accept the corresponding answers later, and for now assume that for comparing not-equality for EqualityComparable types should be done with if (!(x==y)) to be on the safe side.
Does the C++ standard guarantee that (x!=y) always has the same truth value as !(x==y)?
No it doesn't. Absolutely nothing stops me from writing:
struct Broken {
bool operator==(const Broken& ) const { return true; }
bool operator!=(const Broken& ) const { return true; }
};
Broken x, y;
That is perfectly well-formed code. Semantically, it's broken (as the name might suggest), but there's certainly nothing wrong with it from a pure C++ code functionality perspective.
The standard also clearly indicates this is okay in [over.oper]/7:
The identities among certain predefined operators applied to basic types (for example, ++a ≡ a+=1) need not hold for operator functions. Some predefined operators, such as +=, require an operand to be an lvalue when applied to basic types; this is not required by operator functions.
In the same vein, nothing in the C++ standard guarantees that operator< actually implements a valid ordering (or that x<y <==> !(x>=y), etc.). Some standard library implementations will actually add instrumentation to attempt to debug this for you in the ordered containers, but that is purely a quality-of-implementation matter, not something the standard requires.
Library solutions like Boost.Operators exist to at least make this a little easier on the programmer's side:
struct Fixed : equality_comparable<Fixed> {
bool operator==(const Fixed&) const;
// a consistent operator!= is provided for you
};
In C++14, Fixed is no longer an aggregate with the base class. However, in C++17 it's an aggregate again (by way of P0017).
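The idea behind such a library base class can be sketched by hand. `EqualityFromEq` and `Point` below are hypothetical names, and this is not Boost's actual implementation, only an illustration of the same CRTP technique: the base supplies `operator!=` as the negation of whatever `operator==` the derived class defines.

```cpp
#include <cassert>

// Hand-rolled sketch of the technique (not Boost's code): the base class
// template injects an operator!= that negates the derived class's operator==.
template <typename Derived>
struct EqualityFromEq {
    friend bool operator!=(const Derived& a, const Derived& b) {
        return !(a == b);
    }
};

struct Point : EqualityFromEq<Point> {
    int x;
    int y;
    Point(int x_, int y_) : x(x_), y(y_) {}

    // The only comparison the author of Point has to write.
    bool operator==(const Point& other) const {
        return x == other.x && y == other.y;
    }
};

void demo_derived_inequality() {
    assert(Point(1, 2) == Point(1, 2));
    assert(Point(1, 2) != Point(1, 3));   // found through the base class friend
    assert(!(Point(1, 2) != Point(1, 2)));
}
```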
With the adoption of P1185 for C++20, the library solution has effectively become a language solution - you just have to write this:
struct Fixed {
bool operator==(Fixed const&) const;
};
bool ne(Fixed const& x, Fixed const& y) {
return x != y;
}
The body of ne() becomes a valid expression that evaluates as !x.operator==(y) -- so you don't have to worry about keeping the two comparisons in sync nor rely on a library solution to help out.
In general, I don't think you can rely on it, because it doesn't always make sense for operator == and operator!= to always correspond, so I don't see how the standard could ever require it.
For example, consider the built-in floating point types, like doubles, for which NaNs always compare false, so operator== and operator!= can both return false at the same time. (Edit: Oops, this is wrong: for built-in floating point types NaN != NaN evaluates to true, so != is still the negation of ==; see hvd's comment.)
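For reference, the built-in behaviour can be checked directly. The sketch below assumes IEEE 754 doubles and default floating-point compiler settings (no fast-math style flags); `negation_holds` and `demo_nan` are names made up for the illustration.

```cpp
#include <cassert>
#include <limits>

// Returns true when != gives the logical negation of == for this pair.
bool negation_holds(double a, double b) {
    return (a != b) == !(a == b);
}

void demo_nan() {
    const double qnan = std::numeric_limits<double>::quiet_NaN();
    assert(!(qnan == qnan));                   // NaN never compares equal, even to itself
    assert(qnan != qnan);                      // so != yields true
    assert(!(qnan < qnan) && !(qnan > qnan));  // ordering comparisons are false
    assert(negation_holds(qnan, qnan));
    assert(negation_holds(1.0, 2.0));
}
```

So for the built-in types it is the ordering comparisons, not == and !=, that can all be false at once.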
As a result, if I'm writing a new class with floating point semantics (maybe a really_long_double), I have to implement the same behaviour to be consistent with the primitive types, so my operator== would have to behave the same and compare two NaNs as false, even though operator!= also compares them as false.
This might crop up in other circumstances, too. For example, if I was writing a class to represent a database nullable value I might run into the same issue, because all comparisons to database NULL are false. I might choose to implement that logic in my C++ code to have the same semantics as the database.
In practice, though, for your use case, it might not be worth worrying about these edge cases. Just document that your function compares the objects using operator== (or operator !=) and leave it at that.
No. You can write operator overloads for == and != that do whatever you wish. It probably would be a bad idea to do so, but the definition of C++ does not constrain those operators to be each other's logical opposites.
I'm designing my own programming language (called Lima, if you care it's on www.btetrud.com), and I'm trying to wrap my head around how to implement operator overloading. I'm deciding to bind operators on specific objects (it's a prototype-based language). (It's also a dynamic language, where 'var' is like 'var' in javascript - a variable that can hold any type of value).
For example, this would be an object with a redefined + operator:
x =
{ int member
operator +
self int[b]:
ret b+self
int[a] self:
ret member+a
}
I hope it's fairly obvious what that does. The operator is defined when x is both the right and left operand (using self to denote this).
The problem is what to do when you have two objects that define an operator in an open-ended way like this. For example, what do you do in this scenario:
A =
{ int x
operator +
self var[b]:
ret x+b
}
B =
{ int x
operator +
var[a] self:
ret x+a
}
a+b ;; is a's or b's + operator used?
So an easy answer to this question is "well duh, don't make ambiguous definitions", but it's not that simple. What if you include a module that has an A type of object, and then define a B type of object?
How do you create a language that guards against other objects hijacking what you want to do with your operators?
C++ has operator overloading defined as "members" of classes. How does C++ deal with ambiguity like this?
Most languages will give precedence to the class on the left. In C++, an operator defined as a member only ever applies when this type is on the left: when you define a member operator+, you are defining addition for when this type is on the left, for anything on the right. Handling the type on the right requires a separate non-member function.
In fact, it would not make sense if you allowed your operator + to work for when the type is on the right-hand side. It works for +, but consider -. If type A defines operator - in a certain way, and I do int x - A y, I don't want A's operator - to be called, because it will compute the subtraction in reverse!
In Python, which has more extensive operator overloading rules, there is a separate method for the reverse direction. For example, there is a __sub__ method which overloads the - operator when this type is on the left, and a __rsub__ which overloads the - operator when this type is on the right. This is similar to the capability, in your language, to allow the "self" to appear on the left or on the right, but it introduces ambiguity.
Python gives precedence to the thing on the left -- this works better in a dynamic language. If Python encounters x - y, it first calls x.__sub__(y) to see if x knows how to subtract y. This can either produce a result, or return a special value NotImplemented. If Python finds that NotImplemented was returned, it then tries the other way. It calls y.__rsub__(x), which would have been programmed knowing that y was on the right hand side. If that also returns NotImplemented, then a TypeError is raised, because the types were incompatible for that operation.
I think this is the ideal operator overloading strategy for dynamic languages.
Edit: To give a bit of a summary, you have an ambiguous situation, so you really only have three choices:
Give precedence to one side or the other (usually the one on the left). This prevents a class with a right-side overload from hijacking a class with a left-side overload, but not the other way around. (This works best in dynamic languages, as the methods can decide whether they can handle it, and dynamically defer to the other one.)
Make it an error (as #dave is suggesting in his answer). If there is ever more than one viable choice, it is a compiler error. (This works best in static languages, where you can catch this thing in advance.)
Only allow the left-most class to define operator overloads, as with member operators in C++. (Then your class B would be illegal.)
The only other option is to introduce a complex system of precedence to the operator overloads, but then you said you want to reduce the cognitive overhead.
I'm going to answer this question by saying "duh, don't make ambiguous definitions".
If I recreate your example in C++ (using a function f instead of the + operator and int/float instead of A/B, but there really isn't much difference)...
#include <iostream>

template<class t>
void f(int a, t b)
{
std::cout << "me! me! me!";
}
template<class t>
void f(t a, float b)
{
std::cout << "no, me!";
}
int main(void)
{
f(1, 1.0f);
return 0;
}
...the compiler will tell me precisely that: error C2668: 'f' : ambiguous call to overloaded function
If you create a language powerful enough, it's always going to be possible to create things in it that don't make sense. When this happens, it's probably ok to just throw up your hands and say "this doesn't make sense".
In C++, a op b means a.op(b), so it's unambiguous; the order settles it. If, in C++, you want to define an operator whose left operand is a built-in type, then the operator has to be a global function with two arguments, not a member; again, though, the order of the operands determines which function to call. It is illegal to define an operator where both operands are of built-in types.
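A short sketch of the two forms, using subtraction because operand order matters there. `Meters` is a hypothetical type invented for the example.

```cpp
#include <cassert>

struct Meters {
    int value;

    // Member form: *this is always the left-hand operand (m - 3).
    Meters operator-(int rhs) const { return Meters{value - rhs}; }
};

// Non-member form: needed when the left-hand operand is a built-in type (10 - m).
Meters operator-(int lhs, const Meters& rhs) {
    return Meters{lhs - rhs.value};
}

void demo_operand_order() {
    Meters m{4};
    assert((m - 3).value == 1);    // calls m.operator-(3)
    assert((10 - m).value == 6);   // calls operator-(10, m)
}
```

Because each overload fixes the position of its operands, neither can be called "in reverse" by accident.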
I would suggest that given X + Y, the compiler should look for both X.op_plus(Y) and Y.op_added_to(X); each implementation should include an attribute indicating whether it should be a 'preferred', 'normal', or 'fallback' implementation, and optionally also indicating that it is "common". If both implementations are defined, and the implementations are of different priorities (e.g. "preferred" and "normal"), use the type to select a preference. If both are defined to be of the same priority, and both are "common", favor the X.op_plus(Y) form. If both are defined with the same priority and they are not both "common", flag an error.
I would suggest that the ability to prioritize overloads and conversions would IMHO be a very important feature for a language to have. It is not helpful for languages to squawk about ambiguous overloads in cases where both candidates would do the same thing, but languages should squawk in cases where two possible overloads would have different meanings, each of which would be useful in certain contexts. For example, given someFloat==someDouble or someDouble==someLong, a compiler should squawk, since there can be usefulness to knowing whether the numerical quantities represented by two values match, and there can also be usefulness in knowing whether the left-hand operand holds the best possible representation (for its type) of the value in the right-hand operand. Java and C# do not flag ambiguity in either case, opting instead to use the first meaning for the first expression and the second for the second, even though either meaning might be useful in either case. I would suggest that it would be better to reject such comparisons than to have them implement inconsistent semantics.
Overall, I'd suggest as a philosophy that a good language design should let a programmer indicate what's important and what isn't. If a programmer knows that certain "ambiguities" aren't problems, but other ones are, it should be easy to have the compiler flag the latter but not the former.
Addendum
I looked briefly through your proposal; it seems you're expecting bindings to be fully dynamic. I've worked with a language like that (HyperTalk, circa 1988) and it was "interesting". Consider, for example, that "2X" < "3" < 4 < 10 < "11" < "2X". Double dispatch can sometimes be useful, but only in cases where operator overloads with different semantics (e.g. string and numeric comparisons) are limited to operating on disjoint sets of things. Forbidding ambiguous operations at compile time is a good thing, since the programmer will be in a position to specify what's intended. Having such ambiguity trigger a run-time error is a bad thing, because the programmer may be long gone by the time an error surfaces. Consequently, I really can't offer any advice for how to do run-time double dispatch for operators except to say "don't", unless at compile time you restrict the operands to combinations where any possible overload would always have the same semantics.
For example, if you had an abstract "immutable list of numbers" type, with a member to report the length or return the number at a particular index, you could specify that two instances are equal if they have the same length, and for every index they return the same number. While it would be possible to compare any two instances for equality by examining every item, that could be inefficient if e.g. one instance was a "BunchOfZeroes" type which simply held an integer N=1000000 and didn't actually store any items, and the other was an "NCopiesOfArray" which held N=500000 and {0,0} as the array to be copied. If many instances of those types are going to be compared, efficiency could be improved by having such comparisons invoke a method which, after checking overall array length, checks whether the "template" array contains any non-zero elements. If it doesn't, then it can be reported as equal to the bunch-of-zeroes array without having to perform 1,000,000 element comparisons. Note that the invocation of such a method by double dispatch would not alter the program's behavior--it would merely allow it to execute more quickly.
Is it a bad idea to overload &&, || or comma operator and Why?
I wouldn't overload operator&& or operator||. Even if you define a class that gives rise to a Boolean algebra (finite sets, for example), it would probably be a better choice to overload operator& and operator|.
The reason is that C++ programmers expect special semantics for operator&& and operator||: they are short-circuited, i.e. don't evaluate their right-hand argument if not necessary. You can't get this behavior by overloading, since you'll be defining a function.
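As an illustration of the suggested alternative, here is a sketch of a set-like type that overloads `operator&` and `operator|` rather than the logical operators. `SmallSet` and `make_set` are hypothetical names; the type is assumed to hold integers in the range 0 to 31 as a bit mask.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical small set of integers 0..31, stored as a bit mask.
struct SmallSet {
    std::uint32_t bits;

    bool contains(unsigned element) const { return (bits >> element) & 1u; }
};

SmallSet make_set(unsigned a, unsigned b) {
    return SmallSet{(std::uint32_t{1} << a) | (std::uint32_t{1} << b)};
}

// Intersection and union always need both operands, so nothing is lost by
// these being ordinary function calls.
SmallSet operator&(SmallSet lhs, SmallSet rhs) { return SmallSet{lhs.bits & rhs.bits}; }
SmallSet operator|(SmallSet lhs, SmallSet rhs) { return SmallSet{lhs.bits | rhs.bits}; }

void demo_small_set() {
    SmallSet a = make_set(1, 2);
    SmallSet b = make_set(2, 3);
    assert((a & b).contains(2) && !(a & b).contains(1));
    assert((a | b).contains(1) && (a | b).contains(3));
}
```

Readers do not expect `&` and `|` to short-circuit, so the overloads match the expectations the built-in operators set.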
Overloading operator, has been done in e.g. the Boost.Assign library. This is also the only example of its overloading that I know, and I've never even considered overloading it myself. You'd better have a very specific use case for it where no other operator is suited.
For overloading the logical operators in C++, the operands must be evaluated, which isn't the way things normally work with short circuiting of built-in types.
Look at the below link.
Don't overload logical operators in C++
You shouldn't overload any operators in a way that is surprising. :-)
If you can do it in a way that makes sense (not only to you), it is fine to do so.
Like others have said, the logical operators are special in that they have the effect of lazy evaluation. So your overloads should probably preserve this lazy effect, like with expression templates, or only be used where people don't expect this effect anyway.
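One way to preserve the lazy effect is to have the overload combine deferred computations instead of values. The sketch below is a simplified stand-in for what expression-template libraries do, not their actual design: `Lazy` is a hypothetical wrapper around `std::function`, and the overloaded `&&` only builds a new deferred computation.

```cpp
#include <cassert>
#include <functional>

// Hypothetical wrapper: holds a computation instead of its result.
struct Lazy {
    std::function<bool()> eval;
};

// Combines two deferred computations. The built-in && inside the lambda
// restores short-circuiting when the combined expression is finally evaluated.
Lazy operator&&(Lazy lhs, Lazy rhs) {
    return Lazy{[lhs, rhs] { return lhs.eval() && rhs.eval(); }};
}

static int expensive_calls = 0;

void demo_lazy_and() {
    expensive_calls = 0;
    Lazy cheap{[] { return false; }};
    Lazy expensive{[] { ++expensive_calls; return true; }};

    Lazy both = cheap && expensive;   // nothing has run yet
    assert(expensive_calls == 0);
    assert(both.eval() == false);     // cheap is false, so expensive is skipped
    assert(expensive_calls == 0);
}
```

Both `Lazy` operands are still constructed eagerly; what is deferred is the work they wrap.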
It is usually a bad idea: those three operators have a sequencing effect which is lost when you overload them. Losing that sequencing effect can cause harm (i.e. strange bugs) to those not expecting that loss.
There are cases with template expressions where you can keep the sequencing effect, in those cases I see no problem in overloading them.
The overloads of operator, that I know of have another problem: they work in such a way that the apparent chain of operations isn't the real one. Usually, they are used in contexts where it makes no difference, but once in a blue moon, that is another source of strange bugs.
I'd say it depends on what your overloads are doing. For example, && and || are expected to work as logical conditions, so if your overload semantics work somehow differently, they can confuse other people (or even yourself, if you don't use them for a while and forget what they do). Consider what you would expect the operators to do if you wouldn't know how they are overloaded and if it would be clearer to just use normal methods instead.
As the others have said, missing lazy evaluation is the main reason to avoid overloading logical operators.
However, there is one very good reason to overload them: Expression templates. The Boost.Lambda library does this, and it's very useful!
It's a bad idea except in situations where your classes represent some logical entity, because overloaded operators will disorient readers and can cause new bugs in code.
It is reasonable that sizeof and typeid cannot be overloaded, but I can't see the harm in overloading ?:, .* and .. Are there technical reasons for this?
To quote Bjarne Stroustrup:
There is no fundamental reason to
disallow overloading of ?:. I just
didn't see the need to introduce the
special case of overloading a ternary
operator. Note that a function
overloading expr1?expr2:expr3 would
not be able to guarantee that only one
of expr2 and expr3 was executed.
...
Operator . (dot) could in principle be
overloaded using the same technique as
used for ->. However, doing so can
lead to questions about whether an
operation is meant for the object
overloading . or an object referred to
by . ... This problem can be solved in
several ways. At the time of
standardization, it was not obvious
which way would be best.
Source
If you overload ., how would you access class members? What would be the meaning of obj.data?
What would the syntax be?
In fact, there are good reasons for not overloading any operator
which doesn't evaluate all of its operands: you shouldn't
overload && or || either (except in special cases). You can't
simulate this with an overloaded operator. Consider something
like:
p == NULL ? defaultValue : p->getValue()
where the type of defaultValue or p->getValue() causes overload
resolution to pick up your overload. It's a common idiom, but
it can't be made to work if you overloaded ?:.
Here's some reading material C++ FAQ Lite :)
In general there would be no benefit to overloading the operators above. What additional semantics would you be trying to implement?
The reason for overloading operators is to provide intuitive syntax to the user of your class. For example, it makes sense to overload + and += for strings. It's obvious to another developer what that means.
It's really not obvious what you would overload ?: for ... That said there are no technical reasons I am aware of that prevented these operators from being overloaded.
Overloading the -> operator allows for reference counted pointers to be created such as boost::shared_ptr. The concept of 'negating' an object might have different meanings in different contexts, so it's reasonable to occasionally overload this operator.
Defining "operator bool" is enough for ?: to work.
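A small sketch of that point, with a hypothetical `Handle` type: the first operand of `?:` is contextually converted to bool, so a conversion operator (even an explicit one) lets a class type serve as the condition.

```cpp
#include <cassert>

// Hypothetical handle type; validity is exposed through a conversion to bool.
struct Handle {
    int fd;
    explicit operator bool() const { return fd >= 0; }
};

int fd_or(const Handle& h, int fallback) {
    // The condition of ?: is contextually converted to bool, so the explicit
    // conversion operator is enough here.
    return h ? h.fd : fallback;
}

void demo_handle() {
    assert(fd_or(Handle{3}, -1) == 3);
    assert(fd_or(Handle{-5}, 0) == 0);
}
```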
For operator . think of this: SomeClass."SomeString!!"
These overloadings prohibit compiler's lexer from parsing the file correctly.
The reason you can overload most operators is to be able to simulate built in types. Since none of the built in types can use the . operator, it wouldn't serve any purpose. operator* and operator-> are there so you can make your own pointer classes. All the math and boolean operators are there to be able to make your own numeric classes.
The C++ standard defines the expression using subscripts as a postfix expression. AFAIK, this operator always takes two arguments (the first is the pointer to T and the other is the enum or integral type). Hence it should qualify as a binary operator.
However, MSDN and IBM do not list it as a binary operator.
So the question is, what is the subscript operator? Is it unary or binary? For sure, it is not unary, as it is not mentioned in $5.3 (at least not straight away).
What does it mean when the Standard mentions its usage in the context of a postfix expression?
I'd tend to agree with you in that operator[] is a binary operator in the strictest sense, since it does take two arguments: a (possibly implicit) reference to an object, and a value of some other type (not necessarily enumerated or integral). However, since it is a bracketing operator, you might say that the sequence of tokens [x], where x might be any valid subscript-expression, qualifies as a postfix unary operator in an abstract sense; think currying.
Also, you cannot overload a global operator[](const C&, size_t), for example. The compiler complains that operator[] must be a nonstatic member function.
You are correct that operator[] is a binary operator but it is special in that it must also be a member function.
Similar to operator()
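A minimal sketch of an overloaded subscript operator, using a hypothetical `Buffer` class. The object is the implicit first operand and the index is the second, which is the sense in which the operator is binary.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size buffer. operator[] is declared as a member function;
// the object is the implicit first operand and the index is the second.
class Buffer {
public:
    int& operator[](std::size_t i) { return data_[i]; }
    const int& operator[](std::size_t i) const { return data_[i]; }
private:
    int data_[4] = {0, 0, 0, 0};
};

void demo_subscript() {
    Buffer b;
    b[2] = 7;                       // same as b.operator[](2) = 7
    assert(b.operator[](2) == 7);
    const Buffer& cb = b;
    assert(cb[2] == 7);             // picks the const overload
}
```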
You can read up on postfix expressions here
I just found an interesting article about operator[] and postfix expression, here
I think it's the context that [] is used in that counts. In section 5.2.1 the symbol [] is used in the context of a postfix expression that 'is identical (by definition) to *((E1)+(E2))'. In this context, [] isn't an operator. In section 13.5.5 it's used to mean the subscripting operator. In this case it's an operator that takes one argument. For example, if I wrote:
x = a[2];
It's not necessarily the case that the above statement evaluates to:
x = *(a + 2);
because 'a' might be an object. If a is an object type then in this context, [] is used as a subscript operator.
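For the built-in case, the identity quoted from section 5.2.1 can be checked directly. This sketch covers only arrays and pointers; `demo_builtin_subscript` is a name made up for the illustration.

```cpp
#include <cassert>

void demo_builtin_subscript() {
    int a[4] = {10, 20, 30, 40};
    assert(a[2] == *(a + 2));   // built-in subscript is defined as *((E1)+(E2))
    assert(2[a] == a[2]);       // which is why the operands can be swapped
    assert(&a[2] == a + 2);     // both name the same element
}
```

None of these identities is guaranteed once `a` is a class type with its own `operator[]`.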
Anyway that's the best explanation I can derive from the standard that resolves apparent contradictions.
If you take a close look at http://en.wikipedia.org/wiki/Operators_in_C_and_C%2B%2B it explains that standard C++ recognizes operator[] as a binary operator, as you said.
Operator[] is, generally speaking, binary, and although it is possible to make it unary, it should always be used as binary inside a class, not least because it makes no sense outside a class.
It is well explained in the link I provided you...
Notice that many programmers overload operators without thinking too much about what they are doing, sometimes overloading them incorrectly; the compiler is lenient about this and accepts it, but it probably was not the correct way to overload that operator.
Following guides like the one I provided is a good way to do things correctly.
So, always beware of examples where operators are overloaded without following good practice (outside the standard); refer first to the standard methods, and use the examples that comply with them.