How to construct expression templates from operations on fundamental data types? - c++

I'm trying to build expression templates from maths operations in order to correctly sequence unsequenced operations (such as operator+).
The reason is that operator co_await combined with operator+ appears to be unsequenced (resulting in incorrect results from generators/yielding tasks). See here: C++20 Coroutines, Unexpected reordering of await_resume, return_value and yield_value.
If I can use expression templates on primitive data types, I can manually sequence the execution. This is mainly an exercise to determine whether it's possible (not whether it's a good idea), i.e. whether a workaround for the issue exists.
If it's not possible to overload global operators for primitive types, then does anybody have any ideas how to inject expression templates into existing operator-based maths code?
Ongoing Research
Operator overloads require at least one parameter of class type:
why C++ operator overloading requires "having at least one parameter of class type"?
So, I cannot do it for fundamental data types directly.
That shifts this question's focus to other ways of transforming fundamental-data-type expressions into expression templates.
As per comment below from Peter.
As the arguments of operator+(a, b) are unsequenced, this approach will not work :(
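For reference, here is the kind of wrapper I was experimenting with (a minimal sketch; Term, AddExpr and eval are my own names). Lifting a fundamental value into a class type does make the overloads kick in, but the operands of operator+ are still unsequenced:

// A minimal expression-template leaf: wrapping the fundamental value in a
// class type is what lets user-defined operators participate at all.
template <typename T>
struct Term {
    T value;
};

// Node representing a deferred addition of two leaves.
template <typename L, typename R>
struct AddExpr {
    L lhs;
    R rhs;
    auto eval() const { return lhs.value + rhs.value; }
};

// Building the tree defers the addition itself, but the function arguments
// a and b are still evaluated unsequenced - the original problem remains.
template <typename L, typename R>
AddExpr<Term<L>, Term<R>> operator+(Term<L> a, Term<R> b) {
    return {a, b};
}

int main() {
    auto expr = Term<int>{1} + Term<int>{2};
    return expr.eval() == 3 ? 0 : 1;
}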

Related

Operator vs functions behaviour

I am reading through the following document,
https://code.google.com/p/go-wiki/wiki/GoForCPPProgrammers
and found the statement below a bit ambiguous:
Unlike in C++, new is a function, not an operator; new int is a syntax error.
In C++ we implement operators as functions, e.g. + using operator+.
So what is the exact difference between an operator and a function in programming languages in general?
The actual distinction between functions and operators depends on the programming language. In plain C, operators are a part of the language itself. One cannot add an operator, nor change the behavior of an existing operator. This is not the case with C++, where operators are resolved to functions.
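A small sketch of what "resolved to functions" means in C++ (Vec is an illustrative name); the overload can be called like any ordinary function:

struct Vec { int x; };

// The overload is an ordinary function with an unusual name...
Vec operator+(Vec a, Vec b) { return {a.x + b.x}; }

int main() {
    Vec a{1}, b{2};
    Vec c = a + b;            // ...invoked via operator syntax,
    Vec d = operator+(a, b);  // or explicitly via function-call syntax.
    return (c.x == d.x) ? 0 : 1;
}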
From a totally different point of view, consider Haskell, where ANY (binary) function may be treated as a binary operator:
If you don't speak Haskell, but know about dot products, this example should still be fairly straight-forward. Given:
dotP :: (Double, Double) -> (Double, Double) -> Double
dotP (x1, y1) (x2, y2) = x1 * x2 + y1 * y2
Both
dotP (1,2) (3,4)
and
(1,2) `dotP` (3,4)
will give 11.
To address the quote in the Go documentation: The Go developers are simply stressing that where in C++, one would treat new as a keyword with its own syntax, one should treat new in Go as any other function.
Although I still think the question is basically a duplicate of Difference between operator and function in C++?, it may be worthwhile to clarify what the difference means in the specific context you quoted.
The point there is that a function in C++ is something that has a name and possibly function arguments, and is called using this syntax:
func(arg1,arg2,...)
In other words, the name first, then a round bracket, then the comma-separated list of arguments. This is the function call syntax of C++.
Whereas an operator is used in the way described by clause 5 of the Standard. The details of the syntax vary depending on the kind of operator, e.g. there are unary operators like &, binary operators like +, * etc.; there is also the ternary conditional operator ? :, and then there are special keywords like new, delete, sizeof, some of which translate to function calls for user-defined types, but they don't use the function call syntax described above. I.e. you don't call
new(arg1,arg2,...)
but instead, you use a special "unary expression syntax" (§5.3), which implies, among other things, that there are no round brackets immediately after the keyword new (at least, not necessarily).
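A minimal sketch contrasting the two syntaxes; the new expression below uses the unary expression syntax, while the underlying allocation function can be called with ordinary function-call syntax:

#include <new>

int main() {
    int* p = new int(42);                    // 'new' expression: unary expression syntax
    delete p;

    void* raw = operator new(sizeof(int));   // allocation function: ordinary call syntax
    operator delete(raw);
}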
It's this syntactic difference that the authors talk about in the section you quoted.
"What is the difference between operators and functions?"
Syntax. But in fact, it's purely a convention with regards to the language: in C++, + is an infix operator (and only operators can be infix), and func() would be a function. But even this isn't always true: MyClass::operator+() is a function, but it can be, and usually is, invoked using the operator syntax.
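A minimal sketch of that dual syntax (MyClass is an illustrative name):

struct MyClass {
    int v;
    MyClass operator+(MyClass rhs) const { return {v + rhs.v}; }
};

int main() {
    MyClass a{1}, b{2};
    MyClass c = a + b;           // operator syntax
    MyClass d = a.operator+(b);  // the very same function, call syntax
    return c.v == d.v ? 0 : 1;
}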
Other languages have different rules: in languages like Lisp, there is no real difference. One can distinguish between built-in functions and user-defined functions, but the distinction is somewhat artificial, since you could easily extend Lisp to add additional built-in functions. And there are languages which allow using infix notation for user-defined functions. And languages like Python map between them: lhs + rhs maps to the function call lhs.__add__(rhs) (so "operators" are really just syntactic sugar).
In sum, there is no rule for programming languages in general. There are simply two different words, and each language is free to use them as it pleases, to best describe the language.
So what is the exact difference between an operator and a function in programming languages in general?
It is broad. In an abstract syntax tree, operators are unary, binary or sometimes ternary nodes, binding expressions together with a certain precedence; e.g. + has lower precedence than *, which in turn has lower precedence than new.
Functions are a much more abstract concept. As a primitive, they are typed subroutine entry points that, depending on the language, can be used as rvalues with lexical scope.
C++ allows you to override (overload) operators by resolving operator evaluation to user-supplied functions. This is a language "feature" that - as the existence of this question implies - mostly confuses people and is not available in Go.
Operators are part of the C++ language syntax. In C++ you may 'overload' them as functions if you don't want the default behaviour. For complex or user-defined types the language may not know the operator's semantics, so the user can overload them with their own implementation.

How to dis-ambiguate operator definitions between objects/classes in a programming language?

I'm designing my own programming language (called Lima; if you care, it's on www.btetrud.com), and I'm trying to wrap my head around how to implement operator overloading. I've decided to bind operators on specific objects (it's a prototype-based language). (It's also a dynamic language, where 'var' is like 'var' in javascript - a variable that can hold any type of value.)
For example, this would be an object with a redefined + operator:
x =
{ int member
  operator +
    self int[b]:
      ret b+self
    int[a] self:
      ret member+a
}
I hope it's fairly obvious what that does. The operator is defined for when x is either the right or the left operand (using self to denote this).
The problem is what to do when you have two objects that define an operator in an open-ended way like this. For example, what do you do in this scenario:
A =
{ int x
  operator +
    self var[b]:
      ret x+b
}
B =
{ int x
  operator +
    var[a] self:
      ret x+a
}
a+b ;; is a's or b's + operator used?
So an easy answer to this question is "well duh, don't make ambiguous definitions", but it's not that simple. What if you include a module that has an A type of object, and then define a B type of object yourself?
How do you create a language that guards against other objects hijacking what you want to do with your operators?
C++ has operator overloading defined as "members" of classes. How does C++ deal with ambiguity like this?
Most languages will give precedence to the class on the left. In C++, a member operator overload applies only when the type is on the left: when you define a member operator+, you are defining addition for when this type is on the left, for anything on the right. (A non-member overload can put the class type on the right, but it must be written as a separate function.)
In fact, it would not make sense to reuse the same definition when the type is on the right-hand side. It happens to work for +, but consider -. If type A defines operator - in a certain way, and I write x - y where x is an int and y is an A, I don't want A's operator - to be called, because it would compute the subtraction in reverse!
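A minimal C++ sketch of that point (the type A and its members are illustrative); the right-hand form has to be written as a separate overload that knows the operands are reversed:

struct A { int v; };

// A on the left: x - y computes normally.
int operator-(A lhs, int rhs) { return lhs.v - rhs; }

// A on the right: a separate overload, written knowing the operands are
// reversed; reusing the left-hand logic would subtract backwards.
int operator-(int lhs, A rhs) { return lhs - rhs.v; }

int main() {
    A a{3};
    return (a - 1 == 2 && 10 - a == 7) ? 0 : 1;
}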
In Python, which has more extensive operator overloading rules, there is a separate method for the reverse direction. For example, there is a __sub__ method which overloads the - operator when this type is on the left, and a __rsub__ which overloads the - operator when this type is on the right. This is similar to the capability, in your language, to allow the "self" to appear on the left or on the right, but it introduces ambiguity.
Python gives precedence to the thing on the left -- this works better in a dynamic language. If Python encounters x - y, it first calls x.__sub__(y) to see if x knows how to subtract y. This can either produce a result, or return a special value NotImplemented. If Python finds that NotImplemented was returned, it then tries the other way. It calls y.__rsub__(x), which would have been programmed knowing that y was on the right hand side. If that also returns NotImplemented, then a TypeError is raised, because the types were incompatible for that operation.
I think this is the ideal operator overloading strategy for dynamic languages.
Edit: To give a bit of a summary, you have an ambiguous situation, so you really have only three choices:
Give precedence to one side or the other (usually the one on the left). This prevents a class with a right-side overload from hijacking a class with a left-side overload, but not the other way around. (This works best in dynamic languages, as the methods can decide whether they can handle it, and dynamically defer to the other one.)
Make it an error (as #dave is suggesting in his answer). If there is ever more than one viable choice, it is a compiler error. (This works best in static languages, where you can catch this thing in advance.)
Only allow the left-most class to define operator overloads, as in C++. (Then your class B would be illegal.)
The only other option is to introduce a complex system of precedence to the operator overloads, but then you said you want to reduce the cognitive overhead.
I'm going to answer this question by saying "duh, don't make ambiguous definitions".
If I recreate your example in C++ (using a function f instead of the + operator and int/float instead of A/B, but there really isn't much difference)...
#include <iostream>

template<class t>
void f(int a, t b)      // viable when the first argument is int
{
    std::cout << "me! me! me!";
}

template<class t>
void f(t a, float b)    // viable when the second argument is float
{
    std::cout << "no, me!";
}

int main(void)
{
    f(1, 1.0f);         // both templates match equally well
    return 0;
}
...the compiler will tell me precisely that: error C2668: 'f' : ambiguous call to overloaded function
If you create a language powerful enough, it's always going to be possible to create things in it that don't make sense. When this happens, it's probably ok to just throw up your hands and say "this doesn't make sense".
In C++, a op b means a.op(b) for member operators, so it's unambiguous; the order settles it. If, in C++, you want to define an operator whose left operand is a built-in type, then the operator has to be a global function with two arguments, not a member; again, though, the order of the operands determines which function to call. It is illegal to define an operator where both operands are of built-in types.
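A minimal sketch of both forms (Meters is an illustrative name):

struct Meters {
    double v;
    Meters operator+(Meters rhs) const { return {v + rhs.v}; }  // Meters + Meters
};

// Left operand is a built-in type, so this cannot be a member:
// it must be a free function taking two arguments.
Meters operator+(double lhs, Meters rhs) { return {lhs + rhs.v}; }

int main() {
    Meters m{1.0};
    Meters n = 2.0 + m;   // resolves to the free function
    return n.v == 3.0 ? 0 : 1;
}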
I would suggest that given X + Y, the compiler should look for both X.op_plus(Y) and Y.op_added_to(X); each implementation should include an attribute indicating whether it is a 'preferred', 'normal', or 'fallback' implementation, and optionally also indicating that it is "common". If both implementations are defined and they are of different priorities (e.g. "preferred" and "normal"), use the priority to select a preference. If both are defined to be of the same priority, and both are "common", favor the X.op_plus(Y) form. If both are defined with the same priority and they are not both "common", flag an error.
I would suggest that the ability to prioritize overloads and conversions would IMHO be a very important feature for a language to have. It is not helpful for languages to squawk about ambiguous overloads in cases where both candidates would do the same thing, but languages should squawk in cases where two possible overloads would have different meanings, each of which would be useful in certain contexts. For example, given someFloat==someDouble or someDouble==someLong, a compiler should squawk, since there can be usefulness to knowing whether the numerical quantities represented by two values match, and there can also be usefulness in knowing whether the left-hand operand holds the best possible representation (for its type) of the value in the right-hand operand. Java and C# do not flag ambiguity in either case, opting instead to use the first meaning for the first expression and the second for the second, even though either meaning might be useful in either case. I would suggest that it would be better to reject such comparisons than to have them implement inconsistent semantics.
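For what it's worth, a minimal C++ sketch of the someFloat==someDouble case (the values are illustrative):

int main() {
    float f = 0.1f;
    double d = 0.1;
    // C++ applies the usual arithmetic conversions: f is widened to double,
    // so this asks whether f's value, exactly converted, equals d.
    bool same = (f == d);   // false: 0.1f and 0.1 round to different values
    return same ? 1 : 0;
}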
Overall, I'd suggest as a philosophy that a good language design should let a programmer indicate what's important and what isn't. If a programmer knows that certain "ambiguities" aren't problems, but other ones are, it should be easy to have the compiler flag the latter but not the former.
Addendum
I looked briefly through your proposal; it seems you're expecting bindings to be fully dynamic. I've worked with a language like that (HyperTalk, circa 1988) and it was "interesting". Consider, for example, that "2X" < "3" < 4 < 10 < "11" < "2X". Double dispatch can sometimes be useful, but only in cases where operator overloads with different semantics (e.g. string and numeric comparisons) are limited to operating on disjoint sets of things. Forbidding ambiguous operations at compile time is a good thing, since the programmer will be in a position to specify what's intended. Having such ambiguity trigger a run-time error is a bad thing, because the programmer may be long gone by the time an error surfaces. Consequently, I really can't offer any advice for how to do run-time double dispatch for operators except to say "don't", unless at compile time you restrict the operands to combinations where any possible overload would always have the same semantics.
For example, if you had an abstract "immutable list of numbers" type, with a member to report the length or return the number at a particular index, you could specify that two instances are equal if they have the same length and, for every index, they return the same number. While it would be possible to compare any two instances for equality by examining every item, that could be inefficient if e.g. one instance was a "BunchOfZeroes" type which simply held an integer N=1000000 and didn't actually store any items, and the other was an "NCopiesOfArray" which held N=500000 and {0,0} as the array to be copied. If many instances of those types are going to be compared, efficiency could be improved by having such comparisons invoke a method which, after checking overall array length, checks whether the "template" array contains any non-zero elements. If it doesn't, then it can be reported as equal to the bunch-of-zeroes array without having to perform 1,000,000 element comparisons. Note that the invocation of such a method by double dispatch would not alter the program's behavior - it would merely allow it to execute more quickly.

Overloading logical operators considered bad practice?

Is it a bad idea to overload &&, || or comma operator and Why?
I wouldn't overload operator&& or operator||. Even if you define a class that gives rise to a Boolean algebra (finite sets, for example), it would probably be a better choice to overload operator& and operator|.
The reason is that C++ programmers expect special semantics for operator&& and operator||: they are short-circuited, i.e. don't evaluate their right-hand argument if not necessary. You can't get this behavior by overloading, since you'll be defining a function.
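A minimal sketch of the lost guarantee (Flag and noisy are illustrative names):

#include <iostream>

struct Flag { bool value; };

// An overloaded && is an ordinary function: both arguments must be
// evaluated before the call, so there is no short-circuiting.
Flag operator&&(Flag lhs, Flag rhs) { return {lhs.value && rhs.value}; }

Flag noisy(bool v, const char* name) {
    std::cout << "evaluating " << name << '\n';
    return {v};
}

int main() {
    // Prints both messages; the built-in && would have stopped after lhs.
    Flag r = noisy(false, "lhs") && noisy(true, "rhs");
    std::cout << r.value << '\n';
}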
Overloading operator, has been done in e.g. the Boost.Assign library. This is also the only example of its overloading that I know, and I've never even considered overloading it myself. You'd better have a very specific use case for it where no other operator is suited.
When the logical operators are overloaded in C++, both operands must be evaluated, which isn't the way things normally work with the short-circuiting of built-in types.
Look at the link below.
Don't overload logical operators in C++
You shouldn't overload any operators in a way that is surprising. :-)
If you can do it in a way that makes sense (not only to you), it is fine to do so.
Like others have said, the logical operators are special in that they have the effect of lazy evaluation. So your overloads should probably preserve this lazy effect, like with expression templates, or only be used where people don't expect this effect anyway.
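One hedged sketch of preserving laziness: wrap callables rather than evaluated values, so the overloaded && can still skip work (Lazy and lazy are illustrative names):

template <typename F>
struct Lazy { F f; };

template <typename F>
Lazy<F> lazy(F f) { return {f}; }

// Both Lazy wrappers are constructed (that much is unavoidable), but the
// wrapped callables run lazily: the built-in && on bools short-circuits.
template <typename L, typename R>
bool operator&&(Lazy<L> a, Lazy<R> b) { return a.f() && b.f(); }

int main() {
    int* p = nullptr;
    bool ok = lazy([&]{ return p != nullptr; }) && lazy([&]{ return *p > 0; });
    return ok ? 1 : 0;   // ok is false and *p was never evaluated
}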
It is usually a bad idea: those three operators have a sequencing effect which is lost when you overload them. Losing that sequencing effect can cause harm (i.e. strange bugs) for those not expecting that loss.
There are cases with expression templates where you can keep the sequencing effect; in those cases I see no problem in overloading them.
The overloads of operator, I know of have another problem: they work in such a way that the apparent chain of operations isn't the real one. Usually they are used in contexts where it makes no difference, but once in a blue moon it is another source of strange bugs.
I'd say it depends on what your overloads are doing. For example, && and || are expected to work as logical conditions, so if your overload semantics work somehow differently, they can confuse other people (or even yourself, if you don't use them for a while and forget what they do). Consider what you would expect the operators to do if you wouldn't know how they are overloaded and if it would be clearer to just use normal methods instead.
As the others have said, missing lazy evaluation is the main reason to avoid overloading logical operators.
However, there is one very good reason to overload them: Expression templates. The Boost.Lambda library does this, and it's very useful!
It's a bad idea except in situations where your classes represent some logical entity, because overloaded operators will disorient readers and can cause new bugs in code.

Compile time operators

Could someone please list all the compile-time operators available in C++?
There are two operators in C++ whose result can always be determined at compile-time, regardless of the operand(s), and those are sizeof[1] and ::[2].
Of course there are plenty of particular uses of other operators that can be resolved at compile-time, for example those listed in the standard for integer constant expressions.
[1] C99, unlike C++, has variable length array types. sizeof applied to a VLA can't be determined at compile-time. Some C++ compilers provide VLAs as an extension.
[2] that is, it can be determined at compile time what entity is the result of the expression. If the entity is an object, then the object's value is another matter.
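A short sketch showing sizeof used where only a compile-time constant is allowed (Tag is an illustrative name):

#include <cstddef>

// sizeof yields an integral constant expression, so it is usable
// anywhere the language demands a compile-time value.
template <std::size_t N> struct Tag {};

int buffer[sizeof(long)];   // array bound: must be known at compile time
Tag<sizeof(int)> tag;       // non-type template argument: likewise

int main() { return 0; }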
There is no such term in the standard.
But here's a list of all operators: http://en.wikipedia.org/wiki/Operators_in_C_and_C%2B%2B (I'm sure there are others...). It should be fairly easy to categorise them one way or the other.

Conditional execution using if or logic

When using objects I sometimes test for their existence, e.g.
if(object)
object->Use();
could i just use
(object && object->Use());
and what differences are there, if any?
They're the same assuming object->Use() returns something that's valid in a boolean context; if it returns void the compiler will complain that a void return isn't being ignored like it should be, and other return types that don't fit will give you something like no match for 'operator&&'
One enormous difference is that the two function very differently if operator&& has been overloaded. Short circuit evaluation is only provided for the built in operators. In the case of an overloaded operator, both sides will be evaluated [in an unspecified order; operator&& also does not define a sequence point in this case], and the results passed to the actual function call.
If object and the return type of object->Use() are both primitive types, then you're okay. But if either are of class type, then it is possible object->Use() will be called even if object evaluates to false.
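For the primitive-type case, a minimal sketch of the built-in short-circuit guarantee (Widget and Use are illustrative names):

#include <iostream>

struct Widget {
    bool Use() { std::cout << "used\n"; return true; }
};

int main() {
    Widget* object = nullptr;

    if (object)          // classic form
        object->Use();

    // Built-in && short-circuits: Use() is never called when object is null.
    bool ok = object && object->Use();
    std::cout << ok << '\n';   // prints 0
}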
They are effectively the same thing but the second is not as clear as your first version, whose intent is obvious. Execution speed is probably no different, either.
Functionally they are the same, and a decent compiler should be able to optimize both equally well. However, writing an expression with operators like this and not checking the result is very odd. Perhaps if this style were common, it would be considered concise and easy to read, but it's not - right now it's just weird. You may get used to it and it could make perfect sense to you, but to others who read your code, their first impression will be, "What the heck is this?" Thus, I recommend going with the first, commonly used version if only to avoid making your fellow programmers insane.
When I was younger I think I would have found that appealing. I always wanted to trim down lines of code, but I realized later on that when you deviate too far from the norm, it'll bite you in the long run when you start working with a team. If you want to achieve zen-programming with minimum lines of code, focus on the logic more than the syntax.
I wouldn't do that. If you overloaded operator&& for the pointer type of object and the class type returned by object->Use(), all bets are off and there is no short-circuit evaluation.
Yes, you can. You see, the C language, as well as C++, is a mix of two fairly independent worlds, or realms, if you will. There's the realm of statements and the realm of expressions. Each one can be seen as a separate sub-language in itself, with its own implementations of basic programming constructs.
In the realm of statements, the sequencing is achieved by the ; at the end of the single statement or by the } at the end of compound statement. In the realm of expressions the sequencing is provided by the , operator.
Branching in the realm of statements is implemented by if statement, while in the realm of expressions it can be implemented by either ?: operator or by use of the short-circuit evaluation properties of && and || operators (which is what you just did, assuming your expression is valid).
The realm of expressions has no cycles (loops), but it has recursion that can replace them (that requires function calls though, which inevitably forces us to switch to statements).
Obviously these realms are far from being equivalent in their power. C and C++ are languages dominated by statements. However, often one can implement fairly complex constructs using the language of expressions alone.
What you did above does implement equivalent branching in the language of expressions. Keep in mind that many people will find it hard to read in normal code (mostly because, once again, they are used to statement-dominated C and C++ code). But it often comes in very handy in some specific contexts, like template metaprogramming, for one example.
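A minimal sketch of branching in the expression realm (the values are illustrative):

#include <cstdio>

int main() {
    int x = -5;

    // Branching in the expression realm: ?: selects a value...
    int magnitude = (x < 0) ? -x : x;

    // ...and short-circuiting && runs its right operand conditionally.
    bool printed = (x > 0) && (std::printf("%d\n", x) > 0);

    return magnitude == 5 && !printed ? 0 : 1;
}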