Optimizing composite std::functions

Is it possible to optimize a series of "glued together" std::functions and/or is there any implementation that attempts to do this?
What I mean is most easily expressed mathematically: say I want to make a std::function that is a function of a function:
f(x,y,z) = x^2 * y^3 * z^4
g(x,y,z) = f(x,y,z) / (x*y^2)
Is there a way for an STL/compiler implementor to optimize away parts of the arithmetic in calling a function object of g, created from a function object of f?
This would be a kind of symbolic simplification of the functions, but because this is a std::function, it would have to be spotted on a machine level.
Due to this being an optimization, which takes time, and probably isn't free (in clock cycles and/or memory), it probably isn't allowed by the Standard? It leans very close to a language that is typically run through a VM. (I'm thinking LLVM more than Java here, with runtime optimizations).
EDIT: In order to make the discussion "more useful", here's a short code snippet (I understand a lambda is not a std::function, but a lambda can be stored in a std::function, so assuming auto below means std::function<T> with the appropriate T will express perfectly what I meant above):
auto f = [](const double x, const double y, const double z){ return x*x*y*y*y*z*z*z*z; };
auto g = [](const double x, const double y, const double z){ return f(x,y,z)/(x*y*y); };
A "trivial" compiler would make g equivalent to
double g(const double x, const double y, const double z){ return x*x*y*y*y*z*z*z*z/(x*y*y); }
While an optimized std::function could make it (mathematically and in every other sense correct!):
double g( const double x, const double y, const double z){ return x*y*z*z*z*z; }
Note that although I'm talking about mathematical functions here, similar transformations could be made for functions in the general sense, but that would take more introspection, which means overhead.
I can see this being very important when designing mathematical and physics simulations, where the generality of compositing existing library functions into user-case functions, with all the usual mathematical simplifications could make for a nice method of expressive, yet performant calculation software.

This is why you leave the optimizing to the compiler. They're algebraically equivalent but not numerically equivalent, due to FP imprecision. Your two versions of g would yield subtly different answers, which could be very important if called in an inner loop, not to mention the behavioural difference if x or y were 0.
Secondly, as the contents of function are unknown until run-time, there's no way the compiler could perform such optimizations as it doesn't have the data it needs.

The compiler is allowed to optimize in specific allowed cases, or if the optimized code behaves "as if" it were the unoptimized code.
In this case not only would x or y being 0 change the results, but if f overflowed, or the data types were floating point or user defined the results could change as a result of such optimization. Thus I suspect in practice you'll never see it happen and would have to (if possible) compose a combined function at compile time (presumably using templates).
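As a rough sketch of that last suggestion, the composition can be expressed with a template so that the inner callable's type is known at compile time. The names make_g, g_simplified and compare_g are hypothetical and not from the question. The sketch also shows why the compiler may not rewrite g into the simplified form on its own: the two disagree when x is 0.

```cpp
#include <cassert>
#include <cmath>

// f(x, y, z) = x^2 * y^3 * z^4, written as a plain lambda (no std::function).
const auto f = [](double x, double y, double z) {
    return x * x * y * y * y * z * z * z * z;
};

// Hypothetical helper: builds g = inner / (x * y^2) from any callable.
// The callable's type is a template parameter, so the compiler can inline it.
template <typename F>
auto make_g(F inner) {
    return [inner](double x, double y, double z) {
        return inner(x, y, z) / (x * y * y);
    };
}

// The hand-simplified form from the question.
inline double g_simplified(double x, double y, double z) {
    return x * y * z * z * z * z;
}

// Shows where the two forms agree and where they do not.
inline void compare_g() {
    const auto g = make_g(f);
    assert(g(2.0, 3.0, 4.0) == g_simplified(2.0, 3.0, 4.0)); // both 1536
    assert(std::isnan(g(0.0, 1.0, 1.0)));       // 0/0 is NaN under IEEE 754
    assert(g_simplified(0.0, 1.0, 1.0) == 0.0); // simplified form yields 0
}
```

Whether the composed version is actually inlined depends on the compiler and optimization level; this only removes the type erasure that std::function would add.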

Related

swapping values of fundamental type

I guess we all know how to swap values, but there is another way:
constexpr auto assign(auto& ...a) noexcept
{
return [&](auto const ...v) noexcept { ((a = v), ...); };
}
To swap, we need to invoke assign(a, b)(b, a).
Is this an efficient way to swap values for fundamental types? Does it offer more room for optimizations than the usual way?

This is not a good way to swap values, even for fundamental types. There is no standard "assign idiom"; this is purely something that you made up.
Even if it works, there's no reason to expect that it would produce faster code than std::swap, which is a standard, well-known idiom.
I strongly recommend that you use std::swap instead, so folks reading your code will understand, without having to Google.

Since you keep asking for theoreticals in the comments: Compiler optimizations are based around whether the compiler can see common patterns to transform them into better code based on whatever the optimization metric is.
To that end, the compiler can be really smart at figuring out that the following is a swap:
// Conventional (simplified) swap definition
auto swap(int& a, int& b) -> void {
const int tmp = a;
a = b;
b = tmp;
}
In the code you provide, it should be almost equivalent for fundamental types only, since it will see the parameters const auto...v as the tmp object -- however it will see two sets of parameters for the copies. If we flatten a transformation of assign(a,b)(b,a), what the compiler really sees from the template expansion is:
const auto v1 = a;
const auto v2 = b;
b = v1;
a = v2;
It's a similar expression to the standard swap, but not quite the same as what compilers have been trained to recognize over several decades of optimizations. Most likely, the compilers will see this as an equivalent transformation, and produce the same assembly -- which is the case with both gcc and clang (Credit to @HolyBlackCat for the Godbolt link).
Please note that compilers are really smart at optimizing code written in conventional/expected ways. What they tend to dislike and struggle with is attempts to be clever. In particular, assign(a,b)(b,a) requires the compiler to flatten the inputs to make the optimization in the first place -- whereas the conventional swap(a,b) is spelt out for it. Basically: at best you will get the same as just doing things the conventional way.
To that end, this is not a good way to swap values. This likely does not provide better optimizations (if anything, likely slightly worse).
If we expand this definition to include generics, it gets worse since the const auto provides more copies, and does not perform proper moves (and proper move-semantics also help the compiler as well). Additionally it doesn't semantically read as a "swap", whereas swap(a,b) or even std::tie(a,b) = std::make_tuple(b,a) is much less ambiguous.
Also this is not an "idiom" as this has never been established by usage as having been a pattern.
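For reference, here is a minimal sketch that runs both spellings side by side. The assign helper is the one from the question, rewritten with an explicit template parameter pack (an assumption made so that it also compiles as C++17); swap_both_ways is a hypothetical name.

```cpp
#include <cassert>
#include <utility>

// The helper from the question, spelled with an explicit template parameter
// pack so it also compiles as C++17.
template <typename... Ts>
constexpr auto assign(Ts&... a) noexcept {
    return [&](auto const... v) noexcept { ((a = v), ...); };
}

inline void swap_both_ways() {
    int a = 1, b = 2;

    assign(a, b)(b, a);   // b and a are copied into v... before any assignment
    assert(a == 2 && b == 1);

    std::swap(a, b);      // conventional spelling, swaps back
    assert(a == 1 && b == 2);
}
```

Both calls leave the variables in the expected state; the difference is only in how directly the intent is stated to the reader and to the optimizer.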

I think it makes some sense if you were assigning multiple values at once, where some of the assignee (not sure that's the term?) might be changed in the process.
For instance, doing a rotate:
int a = 10, b = 20, c = 30;
a = b;
b = c;
c = a; // oops, you might have wanted c = 10, but c is actually 20 now.
However, if you are only swapping 2 values, then std::swap is probably the better way to go, in both readability and optimizability.
While readability is subjective, at least to me, swap(a, b) is easier to understand, whereas assign(a, b)(b, a) doesn't imply the idea of swapping directly.
Optimizability needs to be benchmarked to be certain, but as far as I know, std::swap is well optimized for basic types, and I do not think your way would further optimize it. However, I can not give you a sure answer without actually benchmarking it.
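If the goal really is a multi-value rotate, one standard-library way to get it right is std::tie with std::make_tuple, as mentioned in another answer. The following is only a sketch; rotate_left and rotate_demo are hypothetical names.

```cpp
#include <cassert>
#include <tuple>

// Rotates three values left: a takes b, b takes c, c takes the old a.
inline void rotate_left(int& a, int& b, int& c) {
    // make_tuple copies b, c and a first; only then are a, b and c overwritten.
    std::tie(a, b, c) = std::make_tuple(b, c, a);
}

inline void rotate_demo() {
    int a = 10, b = 20, c = 30;
    rotate_left(a, b, c);
    assert(a == 20 && b == 30 && c == 10);   // c received the original a
}
```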

Why is C++ auto risky [duplicate]

It seems that auto was a fairly significant feature to be added in C++11 that seems to follow a lot of the newer languages. As with a language like Python, I have not seen any explicit variable declaration (I am not sure if it is possible using Python standards).
Is there a drawback to using auto to declare variables instead of explicitly declaring them?

The question is about drawbacks of auto, so this answer highlights some of those. A drawback of using a programming language feature (in this case, a facility associated with a language keyword) does not mean that feature is unacceptable, nor does it mean that feature should be avoided entirely. It means there are disadvantages along with advantages, so a decision to use auto type deduction over alternatives must consider engineering trade-offs.
When used well, auto has several advantages as well - which is not the subject of the question. The drawbacks result from ease of abuse, and from increased potential for code to behave in unintended or unexpected ways.
The main drawback is that, by using auto, you don't necessarily know the type of object being created. There are also occasions where the programmer might expect the compiler to deduce one type, but the compiler adamantly deduces another.
Given a declaration like
auto result = CallSomeFunction(x,y,z);
you don't necessarily have knowledge of what type result is. It might be an int. It might be a pointer. It might be something else. All of those support different operations. You can also dramatically change the code by a minor change like
auto result = CallSomeFunction(a,y,z);
because, depending on what overloads exist for CallSomeFunction(), the type of result might be completely different - and subsequent code may therefore behave completely differently than intended. You might suddenly trigger error messages in later code (e.g. subsequently trying to dereference an int, trying to change something which is now const). The more sinister change is where your change sails past the compiler, but subsequent code behaves in different and unknown - possibly buggy - ways. For example (as noted by sashoalm in comments), the deduced type of a variable may change from an integral type to a floating point type, so that subsequent code is unexpectedly and silently affected by loss of precision.
Not having explicit knowledge of the type of some variables therefore makes it harder to rigorously justify a claim that the code works as intended. This means more effort to justify claims of "fit for purpose" in high-criticality (e.g. safety-critical or mission-critical) domains.
The other, more common drawback, is the temptation for a programmer to use auto as a blunt instrument to force code to compile, rather than thinking about what the code is doing, and working to get it right.
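A small sketch of the overload scenario described above, using a made-up overload set (half_of is hypothetical, standing in for CallSomeFunction). The same line of code deduces a different type, and computes a different value, depending only on the argument type.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical overload set: the return type depends on the argument type.
inline int    half_of(int value)    { return value / 2; }   // integer division
inline double half_of(double value) { return value / 2.0; }

inline void deduced_types() {
    int    count = 7;
    double ratio = 7.0;

    auto from_int    = half_of(count);
    auto from_double = half_of(ratio);

    static_assert(std::is_same<decltype(from_int), int>::value, "deduced int");
    static_assert(std::is_same<decltype(from_double), double>::value, "deduced double");

    assert(from_int == 3);        // 7 / 2 truncates
    assert(from_double == 3.5);
}
```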

This isn't a drawback of auto in a principled way exactly, but in practical terms it seems to be an issue for some. Basically, some people either: a) treat auto as a savior for types and shut their brain off when using it, or b) forget that auto always deduces to value types. This causes people to do things like this:
auto x = my_obj.method_that_returns_reference();
Oops, we just deep copied some object. It's often either a bug or a performance fail. Then, you can swing the other way too:
const auto& stuff = *func_that_returns_unique_ptr();
Now you get a dangling reference. These problems aren't caused by auto at all, so I don't consider them legitimate arguments against it. But it does seem like auto makes these issues more common (from my personal experience), for the reasons I listed at the beginning.
I think given time people will adjust, and understand the division of labor: auto deduces the underlying type, but you still want to think about reference-ness and const-ness. But it's taking a bit of time.
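To make the copy-versus-reference point concrete, here is a minimal sketch with a hypothetical Inventory class whose accessor returns a reference. Plain auto drops the reference and copies; auto& keeps it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical class whose accessor returns a reference to internal state.
class Inventory {
public:
    std::vector<int>& items() { return items_; }
private:
    std::vector<int> items_{1, 2, 3};
};

inline void copy_versus_reference() {
    Inventory inv;

    auto copy = inv.items();   // deduces std::vector<int>: a full copy
    copy.push_back(4);
    assert(inv.items().size() == 3);   // original untouched

    auto& ref = inv.items();   // deduces std::vector<int>&: an alias
    ref.push_back(4);
    assert(inv.items().size() == 4);   // original modified
}
```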

Other answers are mentioning drawbacks like "you don't really know what the type of a variable is." I'd say that this is largely related to sloppy naming convention in code. If your interfaces are clearly-named, you shouldn't need to care what the exact type is. Sure, auto result = callSomeFunction(a, b); doesn't tell you much. But auto valid = isValid(xmlFile, schema); tells you enough to use valid without having to care what its exact type is. After all, with just if (callSomeFunction(a, b)), you wouldn't know the type either. The same with any other subexpression temporary objects. So I don't consider this a real drawback of auto.
I'd say its primary drawback is that sometimes, the exact return type is not what you want to work with. In effect, sometimes the actual return type differs from the "logical" return type as an implementation/optimisation detail. Expression templates are a prime example. Let's say we have this:
SomeType operator* (const Matrix &lhs, const Vector &rhs);
Logically, we would expect SomeType to be Vector, and we definitely want to treat it as such in our code. However, it is possible that for optimisation purposes, the algebra library we're using implements expression templates, and the actual return type is this:
MultExpression<Matrix, Vector> operator* (const Matrix &lhs, const Vector &rhs);
Now, the problem is that MultExpression<Matrix, Vector> will in all likelihood store a const Matrix& and const Vector& internally; it expects that it will convert to a Vector before the end of its full-expression. If we have this code, all is well:
extern Matrix a, b, c;
extern Vector v;
void compute()
{
Vector res = a * (b * (c * v));
// do something with res
}
However, if we had used auto here, we could get in trouble:
void compute()
{
auto res = a * (b * (c * v));
// Oops! Now `res` is referring to temporaries (such as (c * v)) which no longer exist
}
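A stripped-down sketch of the same hazard, using a hypothetical one-operand proxy (Scaled) instead of a full expression-template library. It only exercises the safe path; the static_assert shows what auto would have deduced.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Vec { double value; };

// Hypothetical proxy: remembers its operand by reference and only computes
// the result when it is converted to Vec.
struct Scaled {
    const Vec& operand;
    double factor;
    operator Vec() const { return Vec{operand.value * factor}; }
};

inline Scaled operator*(double factor, const Vec& v) { return Scaled{v, factor}; }

// auto would deduce the proxy type, not Vec.
static_assert(std::is_same<decltype(2.0 * std::declval<Vec>()), Scaled>::value,
              "operator* yields the proxy");

inline Vec doubled() {
    // Naming the type forces the conversion while the temporary Vec is alive.
    Vec result = 2.0 * Vec{3.0};
    assert(result.value == 6.0);
    return result;
}
```

Writing auto result = 2.0 * Vec{3.0}; instead would store a Scaled whose operand reference outlives the temporary it refers to.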

It makes your code a little harder, or tedious, to read.
Imagine something like that:
auto output = doSomethingWithData(variables);
Now, to figure out the type of output, you'd have to track down signature of doSomethingWithData function.

One of the drawbacks is that sometimes you can't declare const_iterator with auto. You will get ordinary (non const) iterator in this example of code taken from this question:
map<string,int> usa;
//...init usa
auto city_it = usa.find("New York");
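One possible workaround, sketched under the assumption that C++17 is available, is to call find through std::as_const so that the const overload is selected. CityMap and has_city are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <type_traits>
#include <utility>

using CityMap = std::map<std::string, int>;

inline bool has_city(CityMap& usa, const std::string& name) {
    auto plain = usa.find(name);                  // CityMap::iterator
    auto konst = std::as_const(usa).find(name);   // CityMap::const_iterator (C++17)

    static_assert(std::is_same<decltype(plain), CityMap::iterator>::value, "");
    static_assert(std::is_same<decltype(konst), CityMap::const_iterator>::value, "");

    // Both point at the same element (or are both end()).
    CityMap::const_iterator converted = plain;
    assert(converted == konst);

    return konst != usa.cend();
}
```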

Like this developer, I hate auto. Or rather, I hate how people misuse auto.
I'm of the (strong) opinion that auto is for helping you write generic code, not for reducing typing.
C++ is a language whose goal is to let you write robust code, not to minimize development time.
This is fairly obvious from many features of C++, but unfortunately a few of the newer ones like auto that reduce typing mislead people into thinking they should start being lazy with typing.
In pre-auto days, people used typedefs, which was great because typedef allowed the designer of the library to help you figure out what the return type should be, so that their library works as expected. When you use auto, you take away that control from the class's designer and instead ask the compiler to figure out what the type should be, which removes one of the most powerful C++ tools from the toolbox and risks breaking their code.
Generally, if you use auto, it should be because your code works for any reasonable type, not because you're just too lazy to write down the type that it should work with.
If you use auto as a tool to help laziness, then what happens is that you eventually start introducing subtle bugs in your program, usually caused by implicit conversions that did not happen because you used auto.
Unfortunately, these bugs are difficult to illustrate in a short example here because their brevity makes them less convincing than the actual examples that come up in a user project -- however, they occur easily in template-heavy code that expect certain implicit conversions to take place.
If you want an example, there is one here. A little note, though: before being tempted to jump and criticize the code: keep in mind that many well-known and mature libraries have been developed around such implicit conversions, and they are there because they solve problems that can be difficult if not impossible to solve otherwise. Try to figure out a better solution before criticizing them.

auto does not have drawbacks per se, and I advocate to (hand-wavily) use it everywhere in new code. It allows your code to consistently type-check, and consistently avoid silent slicing. (If B derives from A and a function returning A suddenly returns B, then auto behaves as expected to store its return value)
Although, pre-C++11 legacy code may rely on implicit conversions induced by the use of explicitly-typed variables. Changing an explicitly-typed variable to auto might change code behaviour, so you'd better be cautious.

The auto keyword simply deduces the type from the initializer. Therefore, it is not equivalent to a Python object, e.g.
# Python
a
a = 10 # OK
a = "10" # OK
a = ClassA() # OK
// C++
auto a; // Unable to deduce variable a
auto a = 10; // OK
a = "10"; // Value of const char* can't be assigned to int
a = ClassA{}; // Value of ClassA can't be assigned to int
a = 10.0; // OK, implicit conversion to int (may produce a warning)
Since auto is deduced during compilation, it won't have any drawback at runtime whatsoever.

What no one mentioned here so far, but for itself is worth an answer if you asked me.
Since (even if everyone should be aware that C != C++) code written in C can easily be designed to provide a base for C++ code and therefore be designed without too much effort to be C++ compatible, this could be a requirement for design.
I know about some rules where some well defined constructs from C are invalid for C++ and vice versa. But this would simply result in broken executables and the known UB-clause applies which most times is noticed by strange loopings resulting in crashes or whatever (or even may stay undetected, but that doesn't matter here).
But auto is the first time [1] this changes!
Imagine you used auto as storage-class specifier before and transfer the code. It would not even necessarily (depending on the way it was used) "break"; it actually could silently change the behaviour of the program.
That's something one should keep in mind.
[1] At least the first time I'm aware of.

As I described in this answer auto can sometimes result in funky situations you didn't intend.
You have to explicitly say auto& to get a reference type, while plain auto can deduce a pointer type. Omitting the specifier altogether can cause confusion, resulting in a copy of the referenced object instead of an actual reference.

One reason that I can think of is that you lose the opportunity to coerce the class that is returned. If your function or method returned a long 64 bit, and you only wanted a 32 unsigned int, then you lose the opportunity to control that.

I think auto is good when used in a localized context, where the reader can easily and obviously deduce its type, or where it is well documented with a comment stating its type or a name that implies the actual type. Those who don't understand how it works might use it in the wrong ways, like using it instead of a template or similar. Here are some good and bad use cases in my opinion.
void test (const int & a)
{
// b is not const
// b is not a reference
auto b = a;
// b type is decided by the compiler based on value of a
// a is int
}
Good Uses
Iterators
std::vector<boost::tuple<ClassWithLongName1,std::vector<ClassWithLongName2>,int>> v;
..
std::vector<boost::tuple<ClassWithLongName1,std::vector<ClassWithLongName2>,int>>::iterator it = v.begin();
// VS
auto vi = v.begin();
Function Pointers
int test (ClassWithLongName1 a, ClassWithLongName2 b, int c)
{
..
}
..
int (*fp)(ClassWithLongName1, ClassWithLongName2, int) = test;
// VS
auto *f = test;
Bad Uses
Data Flow
auto input = "";
..
auto output = test(input);
Function Signature
auto test (auto a, auto b, auto c)
{
..
}
Trivial Cases
for(auto i = 0; i < 100; i++)
{
..
}

Another irritating example:
for (auto i = 0; i < s.size(); ++i)
generates a warning (comparison between signed and unsigned integer expressions [-Wsign-compare]), because i is a signed int. To avoid this you need to write e.g.
for (auto i = 0U; i < s.size(); ++i)
or perhaps better:
for (auto i = 0ULL; i < s.size(); ++i)
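Another option, shown here only as a sketch, is to take the index type from the container itself with decltype, so it matches s.size() on every platform. count_spaces and count_spaces_demo are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts the spaces in s using an index whose type matches s.size().
inline std::size_t count_spaces(const std::string& s) {
    std::size_t spaces = 0;
    // decltype(s.size()) is std::string::size_type, so no sign mismatch.
    for (decltype(s.size()) i = 0; i < s.size(); ++i) {
        if (s[i] == ' ') ++spaces;
    }
    return spaces;
}

inline void count_spaces_demo() {
    assert(count_spaces("a b c") == 2);
    assert(count_spaces("") == 0);
}
```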

I'm surprised nobody has mentioned this, but suppose you are calculating the factorial of something:
#include <iostream>
using namespace std;
int main() {
auto n = 40;
auto factorial = 1;
for(int i = 1; i <=n; ++i)
{
factorial *= i;
}
cout << "Factorial of " << n << " = " << factorial <<endl;
cout << "Size of factorial: " << sizeof(factorial) << endl;
return 0;
}
This code will output this:
Factorial of 40 = 0
Size of factorial: 4
That was definitely not the expected result. It happened because auto deduced the type of the variable factorial as int, since it was initialized with 1.
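A sketch of two ways to avoid the surprise, with hypothetical function names. Note that 40! does not fit in a 64-bit integer either, so the exact version is only valid up to 20!, and the floating-point version trades exactness for range.

```cpp
#include <cassert>
#include <cstdint>

// Exact for n <= 20; 21! no longer fits in 64 bits.
inline std::uint64_t factorial_u64(unsigned n) {
    std::uint64_t result = 1;            // type chosen explicitly, not deduced from 1
    for (unsigned i = 2; i <= n; ++i) result *= i;
    return result;
}

// Approximate, but has the range for 40!.
inline double factorial_approx(unsigned n) {
    auto result = 1.0;                   // the literal 1.0 makes auto deduce double
    for (unsigned i = 2; i <= n; ++i) result *= i;
    return result;
}

inline void factorial_demo() {
    assert(factorial_u64(20) == 2432902008176640000ULL);
    assert(factorial_approx(40) > 8.15e47 && factorial_approx(40) < 8.16e47);
}
```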

How to define floating point constants within template. Avoid casts at run-time

Say I have a simple function that does something like this:
template<typename T>
T get_half(T a){
return 0.5*a;
}
this function will typically be evaluated with T being double or float.
The standard specifies that 0.5 will be a double (0.5f for float).
How can I write the above code so that 0.5 will always be of type T so that there is no cast when evaluating either the product or the return?
What I want is 0.5 to be a constant of type T at compile time. The point of this question is that I want to avoid conversion at run time.
For example, if I write:
template<typename T>
T get_half(T a){
return T(0.5)*a;
}
Can I be absolutely sure that T(0.5) is evaluated at compile time?
if not, what would be the proper approach to accomplish this? I'm ok with using c++11 if that is needed.
Thank you in advance.
In c++11 I have a numeric_traits class something as follows (within a header file)
template<typename Scalar>
struct numeric_traits{
static constexpr Scalar one_half = 0.5;
//Many other useful constants ....
};
so within my code I would use this as:
template<typename T>
T get_half(T a){
return numeric_traits<T>::one_half*a;
}
This does what I want i.e. 0.5 is resolved at compile time with the precision I need and no casts happen at run-time. However the downsides are:
I need to modify numeric_traits every time I need a new constant
The syntax is probably too verbose and annoying? (not a big issue really, of course)
It'd be nice to maybe have something like constant(0.5), which resolves to type T at compile time.
Thank you in advance again.

There isn't and cannot be any way of forcing constants to never be computed at run-time, because some machines simply don't have a single instruction that can load all possible values of a type. For instance, machines may only have a 16-bit load constant instruction, where 0x12345678 would need to be computed, at run-time, as 0x1234 << 16 | 0x5678. Alternatively, such a constant might be loaded from memory, but that could be an even more costly operation than computing it.
You need to trust your compiler a little bit. On systems where it is feasible, any compiler that has any amount of optimisation at all will translate T(0.5) the same way it will translate 0.5f, assuming T is float. And 0.5f will be computed in the most sensible way for your platform. That might involve loading it as a constant, or that might involve computing it. Or who knows, your compiler might change T(0.5)*a to a/2 if that gives the same results.
In your question you give an example of adding a numeric_traits helper class. This, IMO, is overkill. In the extremely unlikely case that constexpr makes a difference, you can just write
template <typename T>
T get_half(T a) {
    constexpr T half = 0.5;
    return half * a;
}
However, this still does more harm than good, in my opinion: your get_half can now no longer be used with non-literal types. It requires the type to support conversions from double in constant expressions. Suppose you have an arbitrary-precision rational type, written without constexpr in mind. Now your get_half can not be used, because the initialisation constexpr T half = 0.5; is invalid, even if 0.5 * a might otherwise have compiled.
This is the case even with your numeric_traits helper class; it's not invalid just because I moved it into the function body.
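As a rough illustration of the point about non-literal types, here is a sketch in which the type Fixed and its thousandths representation are hypothetical, invented only for this example. Fixed has a constructor from double that is not constexpr, so a constexpr T half = 0.5; inside get_half would be rejected for it, while the plain T(0.5) form still compiles.

```cpp
#include <type_traits>

// Works for any T constructible from double and multipliable by T.
template <typename T>
T get_half(T a) {
    // For float and double, optimizing compilers are expected to fold
    // T(0.5) into a constant of type T.
    return T(0.5) * a;
}

// Hypothetical non-literal type: its constructor is not constexpr,
// so `constexpr Fixed half = 0.5;` would not compile.
struct Fixed {
    long long thousandths;
    Fixed(double v) : thousandths(static_cast<long long>(v * 1000.0)) {}
};

// Fixed-point multiplication on the thousandths representation.
inline Fixed operator*(Fixed lhs, Fixed rhs) {
    Fixed out(0.0);
    out.thousandths = lhs.thousandths * rhs.thousandths / 1000;
    return out;
}

// For float, the product T(0.5) * a already has type float,
// so no conversion is needed on return.
static_assert(std::is_same<decltype(float(0.5) * 1.0f), float>::value, "");
```

Calling get_half(4.0f), get_half(3.0) and get_half(Fixed(3.0)) all compile with this version; only the first two would compile with the constexpr variant.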

return by value inline functions

I'm implementing some math types and I want to optimize the operators to minimize the amount of memory created, destroyed, and copied. To demonstrate I'll show you part of my Quaternion implementation.
class Quaternion
{
public:
    double w,x,y,z;
    ...
    Quaternion operator+(const Quaternion &other) const;
};
I want to know how the two following implementations differ from each other. I do have a += implementation that operates in place, so that no new object is created, but in some higher-level operations on quaternions it is useful to use + rather than +=.
__forceinline Quaternion Quaternion::operator+( const Quaternion &other ) const
{
    return Quaternion(w+other.w,x+other.x,y+other.y,z+other.z);
}
and
__forceinline Quaternion Quaternion::operator+( const Quaternion &other ) const
{
    Quaternion q(w+other.w,x+other.x,y+other.y,z+other.z);
    return q;
}
My C++ is completely self-taught, so when it comes to some optimizations I'm unsure what to do, because I do not know exactly how the compiler handles these things. Also, how do these mechanics translate to non-inline implementations?
Any other criticisms of my code are welcomed.

Your first example allows the compiler to potentially use something called "Return Value Optimization" (RVO).
The second example allows the compiler to potentially use something called "Named Return Value Optimization" (NRVO). These 2 optimizations are clearly closely related.
Some details of Microsoft's implementation of NRVO can be found here:
http://msdn.microsoft.com/en-us/library/ms364057.aspx
Note that the article indicates that NRVO support started with VS 2005 (MSVC 8.0). It doesn't specifically say whether the same applies to RVO or not, but I believe that MSVC used RVO optimizations before version 8.0.
This article about Move Constructors by Andrei Alexandrescu has good information about how RVO works (and when and why compilers might not use it).
Including this bit:
you'll be disappointed to hear that each compiler, and often each compiler version, has its own rules for detecting and applying RVO. Some apply RVO only to functions returning unnamed temporaries (the simplest form of RVO). The more sophisticated ones also apply RVO when there's a named result that the function returns (the so-called Named RVO, or NRVO).
In essence, when writing code, you can count on RVO being portably applied to your code depending on how you exactly write the code (under a very fluid definition of "exactly"), the phase of the moon, and the size of your shoes.
The article was written in 2003 and compilers should be much improved by now; hopefully, the phase of the moon is less important to when the compiler might use RVO/NRVO (maybe it's down to day-of-the-week). As noted above it appears that MS didn't implement NRVO until 2005. Maybe that's when someone working on the compiler at Microsoft got a new pair of more comfortable shoes a half-size larger than before.
Your examples are simple enough that I'd expect both to generate equivalent code with more recent compiler versions.
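To check what a particular compiler does with the two forms, one option is an instrumented type. The following is only a sketch, assuming C++11 or later; Tracked, add_unnamed and add_named are hypothetical names standing in for the Quaternion code in the question.

```cpp
// Hypothetical instrumented type: counts copy and move constructions.
struct Tracked {
    double w, x, y, z;
    static int copies;
    static int moves;

    Tracked(double w_, double x_, double y_, double z_)
        : w(w_), x(x_), y(y_), z(z_) {}
    Tracked(const Tracked& o) : w(o.w), x(o.x), y(o.y), z(o.z) { ++copies; }
    Tracked(Tracked&& o) noexcept : w(o.w), x(o.x), y(o.y), z(o.z) { ++moves; }
};

int Tracked::copies = 0;
int Tracked::moves = 0;

// Returns an unnamed temporary: a candidate for RVO.
Tracked add_unnamed(const Tracked& a, const Tracked& b) {
    return Tracked(a.w + b.w, a.x + b.x, a.y + b.y, a.z + b.z);
}

// Returns a named local: a candidate for NRVO.
Tracked add_named(const Tracked& a, const Tracked& b) {
    Tracked q(a.w + b.w, a.x + b.x, a.y + b.y, a.z + b.z);
    return q;
}
```

After calling both functions, Tracked::copies stays at zero in either form, because a returned local or temporary is at worst moved. Tracked::moves shows whether the compiler actually elided the construction: since C++17 the unnamed form must be constructed in place, while elision of the named form remains optional.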

Between the two implementations you presented, there really is no difference. Any compiler doing any sort of optimizations whatsoever will optimize your local variable out.
As for the += operator, a slightly more involved discussion about whether or not you want your Quaternions to be immutable objects is probably required... I would always lean towards creating objects like this as immutable objects (but then again, I'm more of a managed coder as well).

If these two implementations do not generate exactly the same assembly code when optimization is turned on, you should consider using a different compiler. :) And I don't think it matters whether or not the function is inlined.
By the way, be aware that __forceinline is very non-portable. I would just use plain old standard inline and let the compiler decide.

The current consensus is that you should implement first all your ?= operators that do not create new objects. Depending on whether exception safety is a problem (in your case it probably is not) or a goal the definition of ?= operator can be different. After that you implement operator? as a free function in terms of the ?= operator using pass-by-value semantics.
// thread safety is not a problem
class Q
{
    double w,x,y,z;
public:
    // constructors, other operators, other methods... omitted
    Q& operator+=( Q const & rhs ) {
        w += rhs.w;
        x += rhs.x;
        y += rhs.y;
        z += rhs.z;
        return *this;
    }
};
Q operator+( Q lhs, Q const & rhs ) {
    lhs += rhs;
    return lhs;
}
This has the following advantages:
Only one implementation of the logic. If the class changes you only need to reimplement operator?= and operator? will adapt automatically.
The free function operator is symmetric with respect to implicit compiler conversions
It is the most efficient implementation of operator? you can find with respect to copies
Efficiency of operator?
When you call operator? on two elements, a third object must be created and returned. Using the approach above, the copy is performed in the method call. As it is, the compiler is able to elide the copy when you are passing a temporary object. Note that this should be read as 'the compiler knows that it can elide the copy', not as 'the compiler will elide the copy'. Mileage will vary with different compilers, and even the same compiler can yield different results in different compilation runs (due to different parameters or resources available to the optimizer).
In the following code, a temporary will be created with the sum of a and b, and that temporary must be passed again to operator+ together with c to create a second temporary with the final result:
Q a, b, c;
// initialize values
Q d = a + b + c;
If operator+ has pass-by-value semantics, the compiler can elide the pass-by-value copy: it knows that the temporary will be destroyed right after the second operator+ call, so it does not need to create a separate copy to pass in.
Even if the operator? could be implemented as a one line function (Q operator+( Q lhs, Q const & rhs ) { return lhs+=rhs; }) in the code, it should not be so. The reason is that the compiler cannot know whether the reference returned by operator?= is in fact a reference to the same object or not. By making the return statement explicitly take the lhs object, the compiler knows that the return copy can be elided.
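The difference between the two forms can be made visible with a counting type. This is a sketch assuming C++11 or later; CountedQ, add_two_step and add_one_line are hypothetical names. Under those rules the named parameter in return lhs; is treated as an rvalue and moved (the standard does not permit eliding a function parameter outright), whereas return lhs += rhs; returns through a reference and has to copy.

```cpp
// Hypothetical counting type modelled on the Q class above.
struct CountedQ {
    double w, x, y, z;
    static int copies;
    static int moves;

    CountedQ(double w_, double x_, double y_, double z_)
        : w(w_), x(x_), y(y_), z(z_) {}
    CountedQ(const CountedQ& o) : w(o.w), x(o.x), y(o.y), z(o.z) { ++copies; }
    CountedQ(CountedQ&& o) noexcept : w(o.w), x(o.x), y(o.y), z(o.z) { ++moves; }

    CountedQ& operator+=(const CountedQ& rhs) {
        w += rhs.w; x += rhs.x; y += rhs.y; z += rhs.z;
        return *this;
    }
};

int CountedQ::copies = 0;
int CountedQ::moves = 0;

// Two-statement form: the return statement names the parameter itself.
CountedQ add_two_step(CountedQ lhs, const CountedQ& rhs) {
    lhs += rhs;
    return lhs;  // moved from, not copied
}

// One-line form: the return expression is a CountedQ& of unknown origin.
CountedQ add_one_line(CountedQ lhs, const CountedQ& rhs) {
    return lhs += rhs;  // must copy-construct the result
}
```

Called with two named (lvalue) arguments, add_two_step performs one copy (for the lhs parameter) while add_one_line performs two (the parameter and the return value).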
Symmetry with respect to types
If there is an implicit conversion from type T to type Q, and you have two instances t and q respectively of each type, then you expect (t+q) and (q+t) both to be callable. If you implement operator+ as a member function inside Q, then the compiler will not be able to convert the t object into a temporary Q object and later call (Q(t)+q) as it cannot perform type conversions in the left hand side to call a member function. Thus with a member function implementation t+q will not compile.
Note that this is also true for operators that are not symmetric in arithmetic terms; we are talking about types here. If you can subtract a T from a Q by promoting the T to a Q, then there is no reason not to be able to subtract a Q from a T with another automatic promotion.
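The symmetry argument can be checked at compile time. In this sketch, which assumes C++11 or later, FreeQ, MemberQ and the can_add trait are hypothetical names introduced for illustration; double plays the role of T, and both classes convert implicitly from it.

```cpp
#include <type_traits>
#include <utility>

// operator+ as a free function: either operand may be converted.
struct FreeQ {
    double w;
    FreeQ(double w_) : w(w_) {}  // implicit conversion double -> FreeQ
    FreeQ& operator+=(const FreeQ& rhs) { w += rhs.w; return *this; }
};
FreeQ operator+(FreeQ lhs, const FreeQ& rhs) { lhs += rhs; return lhs; }

// operator+ as a member: the left operand must already be a MemberQ.
struct MemberQ {
    double w;
    MemberQ(double w_) : w(w_) {}  // implicit conversion double -> MemberQ
    MemberQ operator+(const MemberQ& rhs) const { return MemberQ(w + rhs.w); }
};

// Detection trait: true when an expression of type L + R compiles.
template <typename L, typename R, typename = void>
struct can_add : std::false_type {};
template <typename L, typename R>
struct can_add<L, R, decltype(void(std::declval<L>() + std::declval<R>()))>
    : std::true_type {};

static_assert(can_add<FreeQ, double>::value, "q + t works with the free function");
static_assert(can_add<double, FreeQ>::value, "t + q works with the free function");
static_assert(can_add<MemberQ, double>::value, "q + t works with the member");
static_assert(!can_add<double, MemberQ>::value, "t + q fails with the member");
```

The last assertion captures the asymmetry: with the member operator, 2.0 + q is rejected because no conversion is applied to the left-hand operand to find a member function.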
