C++ Expression Templates

I currently use C for numerical computations. I've heard that using C++ Expression Templates is better for scientific computing. What are C++ Expression Templates in simple terms?
Are there books around that discuss numerical methods/computations using C++ Expression Templates?
In what way are C++ Expression Templates better than using pure C?

What are C++ Expression Templates in simple terms?
Expression templates are a category of C++ template metaprogramming which delays evaluation of subexpressions until the full expression is known, so that optimizations (especially the elimination of temporaries) can be applied.
Are there books around that discuss numerical methods/computations using C++ Expression Templates?
I believe ETs were invented by Todd Veldhuizen, who published a paper on them about 15 years ago. (It seems that many older links to it are dead by now, but currently here is a version of it.) Some material about them is in David Vandevoorde and Nicolai Josuttis's C++ Templates: The Complete Guide.
In what way are C++ Expression Templates better than using pure C?
They allow you to write your code in an expressive, high-level way without losing performance. For example,
void f(const my_array<double>& a1, const my_array<double>& a2)
{
    my_array<double> a3 = 1.2 * a1 + a1 * a2;
    // ..
}
can be optimized all the way down to
for( my_array<double>::size_type idx = 0; idx < a1.size(); ++idx )
    a3[idx] = 1.2 * a1[idx] + a1[idx] * a2[idx];
which is faster, but harder to understand.
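To sketch the machinery behind this (a minimal, untested illustration with hypothetical names; a real library also supports scalar multiplication and constrains its operators):

#include <cstddef>
#include <vector>

// A lazy node representing lhs + rhs. Nothing is computed when it is built;
// operator[] evaluates one element on demand.
template <typename L, typename R>
struct add_expr {
    const L& lhs;
    const R& rhs;
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// In real code this overload would be constrained to array/expression types.
template <typename L, typename R>
add_expr<L, R> operator+(const L& lhs, const R& rhs) { return {lhs, rhs}; }

struct my_array {
    std::vector<double> data;
    double operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }

    // Assignment walks the whole expression tree in a single loop,
    // so no temporary arrays are ever materialized.
    template <typename Expr>
    my_array& operator=(const Expr& e) {
        data.resize(e.size());
        for (std::size_t i = 0; i < e.size(); ++i) data[i] = e[i];
        return *this;
    }
};

// Usage:
//   my_array a, b, c;   // ... filled with equal-sized data ...
//   c = a + b + a;      // builds add_expr<add_expr<...>, ...>, then one loop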

Adding to sbi's answer, expression templates implement high-level peephole optimizations using templates for pattern matching and synthesis.
They also add syntactic sugar, or make your code more readable, by allowing you to specify the algorithm in terms of simple operations. So, in this case, simplicity and elegance are achieved through optimization by metaprogramming. At least, if you do everything right.

There is a nice article on C++ template math in the good old Flipcode archive (sure brings back memories):
http://www.flipcode.com/archives/Faster_Vector_Math_Using_Templates.shtml

Related

What are the use cases of C++20 Concepts?

I found out about Concepts while reviewing C++20 features. I found that they add validation to template arguments, but apart from that I don't understand what the real-world use cases of C++20 concepts are.
C++ already has things like std::is_integral and they can perform validation very well.
I'm sure I am missing something about C++20 concepts and what it enables.
SFINAE (see here & here) was an accidentally Turing-complete sublanguage that executes at overload resolution and template specialization selection time.
Turns out it is used a lot in template code.
Concepts and requires clauses are an attempt to take that accidentally useful language feature and make it suck less.
The original design for concepts had three pieces: (a) describing what is required of a given bit of template code in a clean way, (b) providing a way to map other types to satisfy those requirements non-intrusively, and (c) checking template code so that any type which satisfies the concept is guaranteed to compile.
All attempts at (a) plus (c) sucked, usually taking forever to compile and/or restricting what you can check with (a). (b) was also dropped to make (a) better; you can write such concept-map machinery manually in many cases, but C++ doesn't provide it for you.
So, now what is it good for?
auto sum( Addable auto... values )
that uses the concept Addable to concisely express the interface of a template. The error messages you get when passing a non-addable type state that it isn't Addable, and show the expression that doesn't work.
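For illustration, here is a minimal sketch of one plausible definition (Addable is not a standard library concept; this particular definition is an assumption for the example):

#include <concepts>

// One plausible definition: a type is Addable if adding two values
// yields something convertible back to the same type.
template <typename T>
concept Addable = requires(T a, T b) {
    { a + b } -> std::convertible_to<T>;
};

// Abbreviated function template: every argument must satisfy Addable.
auto sum(Addable auto... values)
{
    return (values + ...); // fold over operator+ (ill-formed for an empty pack)
}

// sum(1, 2, 3) == 6, while sum(nullptr) fails to compile with an error
// that names the unsatisfied Addable concept.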
template<class T, class A>
struct vector {
    bool operator==(vector<T, A> const& o) const requires EqualityComparable<T>;
};
Here we state that this vector has an == if and only if the T does. Doing this before concepts was an annoying undertaking, and even writing the specification for it into the standard was.
This is the Turing tar pit: everything is equivalent, but nothing is easy. All programs could be written with I/O plus a single 3-argument instruction (a = a - b; if (a < 0) goto c;), but a richer language makes programs suck less. Concepts take an esoteric branch of C++, SFINAE, and make it clean and simpler (so more people can leverage it), and improve error messages.
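To make the "clean and simpler" point concrete, here is a sketch (my example, not from the original answer) of the same constraint written with classic SFINAE and then with a concept:

#include <concepts>
#include <type_traits>

// C++11 SFINAE: the constraint hides inside enable_if machinery, and a
// failed call produces errors that point into that machinery.
template <class T,
          typename std::enable_if<std::is_integral<T>::value, int>::type = 0>
T twice_old(T x) { return x + x; }

// C++20: the constraint is part of the declaration, and a failed call
// reports that the std::integral concept was not satisfied.
template <std::integral T>
T twice_new(T x) { return x + x; }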

What are the benefits of using Boost.Phoenix?

I cannot understand what the real benefits of using Boost.Phoenix are.
When I use it with Boost.Spirit grammars, it's really useful:
double_[ boost::phoenix::push_back( boost::phoenix::ref( v ), _1 ) ]
When I use it for lambda functions, it's also useful and elegant:
boost::range::for_each( my_string, if_ ( '\\' == arg1 ) [ arg1 = '/' ] );
But what are the benefits of everything else in this library? The documentation says: "Functors everywhere". I don't understand what the good of that is.
Let me point out the critical difference between Boost.Lambda and Boost.Phoenix:
Boost.Phoenix supports (statically) polymorphic functors, while Boost.Lambda binds are always monomorphic.
(At the same time, in many aspects the two libraries can be combined, so they are not exclusive choices.)
Let me illustrate (Warning: Code not tested.):
Phoenix
In Phoenix, a functor can be converted into a Phoenix "lazy function" (from http://www.boost.org/doc/libs/1_54_0/libs/phoenix/doc/html/phoenix/starter_kit/lazy_functions.html):
struct is_odd_impl {
    typedef bool result_type; // less necessary in C++11

    template <typename Arg>
    bool operator()(Arg arg1) const {
        return arg1 % 2 == 1;
    }
};
boost::phoenix::function<is_odd_impl> is_odd;
is_odd is truly polymorphic (as the functor is_odd_impl is). That is, is_odd(_1) can act on anything (that makes sense). For example, is_odd(_1)(3u)==true and is_odd(_1)(3l)==true. is_odd can be combined into a more complex expression without losing its polymorphic behavior.
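For example (an untested sketch, assuming the is_odd lazy function defined above), the same lazy function can be passed to algorithms over different element types without rebinding:

#include <boost/phoenix.hpp>
#include <algorithm>
#include <vector>

void demo()
{
    using boost::phoenix::arg_names::_1;
    std::vector<int>  vi = {1, 2, 3, 4, 5};
    std::vector<long> vl = {1L, 2L, 3L};
    // One polymorphic lazy function, two element types:
    auto odd_ints  = std::count_if(vi.begin(), vi.end(), is_odd(_1)); // 3
    auto odd_longs = std::count_if(vl.begin(), vl.end(), is_odd(_1)); // 2
}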
Lambda attempt
What is the closest we can get to this in Boost.Lambda? We could define two overloads:
bool is_odd_overload(unsigned arg1){return arg1 % 2 == 1;}
bool is_odd_overload(long arg1){return arg1 % 2 == 1;}
but to create a Lambda "lazy function" we will have to choose one of the two:
using boost::lambda::bind;
auto f0 = bind(&is_odd_overload, _1); // not ok, cannot resolve which of the two.
auto f1 = bind(static_cast<bool(*)(unsigned)>(&is_odd_overload), _1); //ok, but choice has been made
auto f2 = bind(static_cast<bool(*)(long)>(&is_odd_overload), _1); //ok, but choice has been made
Even if we define a template version
template<class T>
bool is_odd_template(T arg1){return arg1 % 2 == 1;}
we will have to bind to a particular instance of the template function, for example
auto f3 = bind(&is_odd_template<unsigned>, _1); // not tested
Neither f1 nor f2 nor f3 are truly polymorphic since a choice has been made at the time of binding.
(Note 1: this may not be the best example, since things may seem to work due to implicit conversions from unsigned to long, but that is another matter.)
To summarize: given a polymorphic function/functor, Lambda cannot bind to the polymorphic function (as far as I know), while Phoenix can. It is true that Phoenix relies on the "Result Of protocol" http://www.boost.org/doc/libs/1_54_0/libs/utility/utility.htm#result_of, but 1) at least it is possible, and 2) this is less of a problem in C++11, where return types are very easy to deduce and it can be done automatically.
In fact, in C++11, Phoenix lambdas are still more powerful than C++11 built-in lambdas. Even in C++14, where generic lambdas are implemented, Phoenix is still more general, because it allows a certain level of introspection. (For this and other things, Joel de Guzman (the developer of Phoenix) was and still is well ahead of his time.)
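For comparison, the C++14 built-in counterpart of the polymorphic functor above is a generic lambda. It is just as polymorphic, but it is an opaque closure, whereas a Phoenix expression remains an inspectable expression tree:

// C++14 generic lambda: polymorphic like the Phoenix version above,
// but it cannot be introspected or transformed after the fact.
auto is_odd_lambda = [](auto arg) { return arg % 2 == 1; };

bool a = is_odd_lambda(3u); // true
bool b = is_odd_lambda(3L); // true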
Well, it's a very powerful lambda language.
I used it to create a prototype for a math-like DSL:
http://code.google.com/p/asadchev/source/browse/trunk/work/cxx/interval.hpp
and many other things:
http://code.google.com/p/asadchev/source/browse/#svn%2Ftrunk%2Fprojects%2Fboost%2Fphoenix
I have never used Phoenix, but...
From the Phoenix Library docs:
The Phoenix library enables FP techniques such as higher order functions, lambda (unnamed functions), currying (partial function application) and lazy evaluation in C++
From the Wikipedia article on Functional programming:
... functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. It emphasizes the application of functions, in contrast to the imperative programming style, which emphasizes changes in state
So, Phoenix is a library for enabling Functional Programming in C++.
The major interest in Functional Programming these days seems to stem from the perceived advantages in correctness, and performance, due to limiting or eliminating side-effects.
Correctness, because without side-effects, the code that you see is everything going on in the system. Some other code won't be changing your state underneath you. You can much more easily write bug-free code in this sort of environment.
Performance, because without side-effects, the code you write can safely run in parallel, without any resource managing primitives, or atomic-access tricks. Multi-threading can be enabled extremely easily, even automatically, and operate extremely efficiently.
Don't look at Boost.Phoenix2.
Evolution of lambda expressions in boost looks like:
Bind -> Lambda, Phoenix2 (as Spirit part) -> Phoenix3 (as separate library, under development).
The result is a single lambda library with polymorphic functor support (the others are going to be deprecated).
Functional programming in C++. It's hard to explain unless you have previously used a language with proper support for functional programming, such as SML. I tried to use Phoenix and found it nice, but very impractical in real-life projects, because it greatly increases compilation times and the error messages are awful when you do something wrong. I remember getting a few megabytes of errors from GCC when I played with Phoenix. Also, debugging deeply nested template instantiations is a PITA. (Actually, these are also all the arguments against using most of Boost.)

Why does C++ not allow user-defined operators?

I've been wondering this for quite some time. There are already a whole bunch of them and they can be overloaded, so why not do it to the end and allow custom operators? I think it could be a great addition.
I've been told that this would make the language too hard to compile. This makes me wonder: C++ cannot really be said to be designed for easy compilation anyway, so is it really undoable? Of course, if you use an LR parser with a static table and a grammar such as
E → T + E | T
T → F * T | F
F → id | '(' E ')'
it wouldn't work. In Prolog, which AFAIK is usually parsed with an operator-precedence parser, new operators can easily be defined, but the language is much simpler. Now, the grammar could obviously be rewritten to accept identifiers in every place where an operator is hard-coded into the grammar.
What other solutions and parser schemes are there and what other things have influenced that design decision?
http://www2.research.att.com/~bs/bs_faq2.html#overload-operator
The possibility has been considered several times, but each time I/we decided that the likely problems outweighed the likely benefits.
It's not a language-technical problem. Even when I first considered it in 1983, I knew how it could be implemented. However, my experience has been that when we go beyond the most trivial examples, people seem to have subtly different opinions of "the obvious" meaning of uses of an operator. A classical example is a**b**c. Assume that ** has been made to mean exponentiation. Now should a**b**c mean (a**b)**c or a**(b**c)? I thought the answer was obvious and my friends agreed - and then we found that we didn't agree on which resolution was the obvious one. My conjecture is that such problems would lead to subtle bugs.
It would become even harder to compile than it already is. Also, there would be problems with operator precedence: how do you define it? You need a way to tell the compiler that a user-defined operator has precedence over another operator.
Almost surely it's feasible, but I think that C++ doesn't need other ways to shoot yourself in the foot :-)
This would make the language even more complex. And that obviously wouldn't be desirable.
Still, check out Boost Spirit. It goes a long way to make stuff like you mentioned possible using lots of template metaprogramming tricks.
Actually, C is designed to be very easy to parse and compile. It has 32 defined keywords; all other tokens are functions and variables.
C++ only has a few more. One can easily identify which token is which, so the compiler knows what to look for when it encounters the + token or whatever.
The problem with allowing custom operators is that you also have to allow the programmer to specify the syntax for how the operators should be used. I suppose the C++ type system could help a little, but you would still have to resolve issues like precedence and associativity.
It would make the already complex language much more complex...
This is usually avoided because most code is written by more than one person, so the code should be "reviewable", and this is hardly a "desired" feature of a language.
Joel Spolsky has a good article about this.
I just found out that it's actually possible to achieve something very similar to overloaded operators. Consider
Vector v, a, b;
v = a /vectorProduct/ b;
It turns out you can achieve the behaviour of a custom operator by using dummy classes delimited by existing operators. =)
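Here is a sketch of how that trick can work (hypothetical names, untested; note that the fake operator inherits the precedence and associativity of /):

struct Vector { double x, y, z; };

// Tag object whose only job is to sit between two applications of '/'.
struct VectorProductTag {};
const VectorProductTag vectorProduct = {};

// First '/': capture the left-hand operand together with the tag.
struct VectorProductLhs { const Vector& v; };
VectorProductLhs operator/(const Vector& v, VectorProductTag) { return {v}; }

// Second '/': combine with the right-hand operand (here, a cross product).
Vector operator/(VectorProductLhs lhs, const Vector& b)
{
    const Vector& a = lhs.v;
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

// Usage: Vector v = a /vectorProduct/ b;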

Intermediate results using expression templates

In C++ Template Metaprogramming: Concepts, Tools, and Techniques from Boost and Beyond:
... One drawback of expression templates is that they tend to encourage writing large, complicated expressions, because evaluation is only delayed until the assignment operator is invoked. If a programmer wants to reuse some intermediate result without evaluating it early, she may be forced to declare a complicated type like:
Expression<
    Expression<Array, plus, Array>,
    plus,
    Expression<Array, minus, Array>
> intermediate = a + b + (c - d);
(or worse). Notice how this type not only exactly and redundantly reflects the structure of the computation (and so would need to be maintained as the formula changes) but also overwhelms it? This is a long-standing problem for C++ DSELs. The usual workaround is to capture the expression using type erasure, but in that case one pays for dynamic dispatching. There has been much discussion recently, spearheaded by Bjarne Stroustrup himself, about reusing the vestigial auto keyword to get type deduction in variable declarations, so that the above could be rewritten as:
auto intermediate = a + b + (c - d);
This feature would be a huge advantage to C++ DSEL authors and users alike...
Is it possible to solve this problem with the current C++ standard (not C++0x)?
For example, I want to write an expression like:
Expr X,Y
Matrix A,B,C,D
X=A+B+C
Y=X+C
D:=X+Y
where the operator := evaluates the expression at the latest possible time.
For now, you can always use BOOST_AUTO() in the place of C++0x's auto keyword to get intermediate results more easily.
#include <boost/typeof/typeof.hpp> // provides BOOST_AUTO

Matrix x, y;
BOOST_AUTO(result, (x + y) * (x + y)); // or whatever.
I don't understand your question. auto is going to be reused in C++0x for automatic type inference.
I personally see this as a drawback of expression templates, since they often rely on having a shorter lifespan than the objects they are built from, which can turn out to be false if the expression template is captured, as I explain here.
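A minimal, untested sketch of that hazard (hypothetical expression-template types):

struct Vec {
    double v[3];
    double operator[](int i) const { return v[i]; }
};

// Expression node that stores references rather than copies.
struct Sum {
    const Vec& a;
    const Vec& b;
    double operator[](int i) const { return a[i] + b[i]; }
};

Sum operator+(const Vec& a, const Vec& b) { return {a, b}; }

Vec make() { return Vec{{1, 2, 3}}; }

int main()
{
    double ok = (make() + make())[0]; // fine: temporaries live until the ';'
    Sum expr = make() + make();       // expr now refers to dead temporaries
    double bad = expr[0];             // undefined behavior: dangling references
    (void)ok; (void)bad;
}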

Is Template Metaprogramming faster than the equivalent C code?

Is template metaprogramming faster than the equivalent C code? (I'm talking about runtime performance.) :)
First, a disclaimer:
What I think you're asking about is not just template metaprogramming, but also generic programming. The two concepts are closely related, and there's no exact definition of what each encompasses. But in short, template metaprogramming is essentially writing a program using templates which is evaluated at compile time. That makes it entirely free at runtime: nothing happens. The value (or, more commonly, the type) has already been computed by the compiler, and is available either as a constant (a const variable or an enum) or as a typedef nested in a class (if you've used it to "compute" a type).
Generic programming is using templates and, when necessary, template metaprogramming, to create generic code which works the same (and with no loss in performance) with any and all types. I'm going to use examples of both in the following.
A common use for template metaprogramming is to enable types to be used in generic programming, even if they were not designed for it.
Since template metaprogramming technically takes place entirely at compile-time, your question is a bit more relevant for generic programming, which still takes place at runtime, but is efficient because it can be specialized for the precise types it's used with at compile-time.
Anyway...
Depends on how you define "the equivalent C code".
The trick about template metaprogramming (or generic programming in general) is that it allows a lot of computation to be moved to compile-time, and it enables flexible, parametrized code that is just as efficient as hardcoded values.
The code displayed here, for example, computes a number in the Fibonacci sequence at compile time.
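Since the original link may no longer resolve, the metaprogram in question looks roughly like this (a sketch of the classic formulation):

// Compile-time recursion: fibonacci<N>::value is computed during compilation.
template <unsigned long N>
struct fibonacci {
    static const unsigned long value =
        fibonacci<N - 1>::value + fibonacci<N - 2>::value;
};

// Base cases terminate the recursion.
template <> struct fibonacci<0uL> { static const unsigned long value = 0uL; };
template <> struct fibonacci<1uL> { static const unsigned long value = 1uL; };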
The C++ code 'unsigned long fib11 = fibonacci<11uL>::value;' relies on the template metaprogram defined in that link, and is as efficient as the C code 'unsigned long fib11 = 89uL;'. The templates are evaluated at compile time, yielding a constant that can be assigned to a variable. So at runtime, the code is actually identical to a simple assignment.
So if that is the "equivalent C code", the performance is the same.
If the equivalent C code is "a program that can compute arbitrary fibonacci numbers, applied to find the 11th number in the sequence", then the C version will be much slower, because it has to be implemented as a function, which computes the value at runtime. But this is the "equivalent C code" in the sense that it is a C program that exhibits the same flexibility (it is not just a hardcoded constant, but an actual function that can return any number in the fibonacci sequence).
Of course, this isn't often useful. But it's pretty much the canonical example of template metaprogramming.
A more realistic example of generic programming is sorting.
In C, you have the qsort standard library function, taking an array and a comparator function pointer. The call to this function pointer cannot be inlined (except in trivial cases), because at compile time it is not known which function is going to be called.
Of course the alternative is a hand-written sorting function designed for your specific datatype.
In C++, the equivalent is the function template std::sort. It too takes a comparator, but instead of this being a function pointer, it is a function object, looking like this:
struct MyComp {
    bool operator()(const MyType& lhs, const MyType& rhs) const {
        // return true if lhs < rhs, however that operation is defined for MyType objects
    }
};
and this can be inlined. The std::sort function is passed a template argument, so it knows the exact type of the comparator, and so it knows that the comparator function is not just an unknown function pointer, but MyComp::operator().
The end result is that the C++ function std::sort is exactly as efficient as your hand-coded implementation in C of the same sorting algorithm.
So again, if that is "the equivalent C code", then the performance is the same.
But if the "equivalent C code" is "a generalized sorting function which can be applied to any type, and allows user-defined comparators", then the generic programming-version in C++ is vastly more efficient.
That's really the trick: generic programming and template metaprogramming are not "faster than C". They are methods to achieve general, reusable code which is as fast as handcoded, hardcoded C.
It is a way to get the best of both worlds: the performance of hardcoded algorithms, and the flexibility and reusability of general, parameterized ones.
Template Metaprogramming (TMP) is 'run' at compile time, so it's not really comparing apples to apples when comparing it to normal C/C++ code.
But, if you have something evaluated by TMP, then there's no runtime cost at all.
If you mean reusable code, then yes, without a doubt. Metaprogramming is a superior way to produce libraries, not client code. Client code is not generic; it is written to do specific things.
For example, look at qsort from the C standard library, and the C++ standard sort. This is how qsort works:
#include <stdlib.h>

int compare(const void* a, const void* b)
{
    int x = *(const int*)a;
    int y = *(const int*)b;
    // qsort expects a negative, zero, or positive result, not a boolean
    return (x > y) - (x < y);
}

int main()
{
    int data[5] = {5, 4, 3, 2, 1};
    qsort(data, 5, sizeof(int), compare);
}
Now look at sort:

#include <algorithm>

struct compare
{
    bool operator()(int a, int b) const
    { return a < b; }
};

int main()
{
    int data[5] = {5, 4, 3, 2, 1};
    std::sort(data, data + 5, compare());
}
sort is cleaner, safer and more efficient, because the comparison function is inlined inside the sort. That is the benefit of metaprogramming in my opinion: you write generic code, but the compiler produces code like the hand-coded one!
Another place where I find metaprogramming very beautiful is when you write a library like boost::spirit or boost::xpressive. With Spirit you can write EBNF inside C++ and let the compiler check the EBNF syntax for you, and with xpressive you can write regexes and let the compiler check the regex syntax for you as well!
I am not sure whether by TMP the questioner means calculating values at compile time. This is an example I wrote using Boost :)
#include <boost/math/common_factor_ct.hpp> // boost::math::static_gcd

unsigned long greatestCommonDivisor = boost::math::static_gcd<25657, 54887524>::value;
Whatever you do in C, you can't mimic the above code; basically you have to hand-calculate the value and then assign the result to greatestCommonDivisor!
The answer is it depends.
Template metaprogramming can be used to easily write recursive descent language parsers and these can be inefficient compared to a carefully crafted C program or a table-based implementation (e.g. flex/bison/yacc).
On the other hand, you can write metaprograms that generate unrolled loops, which can be more efficient than a conventional C implementation that uses loops.
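For instance, here is a sketch (untested, my own example) of a metaprogram that unrolls a fixed-size loop at compile time:

// Compile-time recursion that the compiler flattens into N straight-line
// assignments; there is no loop counter or branch left at runtime.
template <unsigned N>
struct unrolled_copy {
    static void apply(const double* src, double* dst) {
        dst[N - 1] = src[N - 1];
        unrolled_copy<N - 1>::apply(src, dst);
    }
};

template <>
struct unrolled_copy<0> {
    static void apply(const double*, double*) {} // recursion terminator
};

// Usage: unrolled_copy<4>::apply(src, dst); // four assignments, no loop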
The main benefit is that metaprograms allow the programmer to do more with less code.
The downside is that it also gives you a gatling gun to shoot yourself in the foot with.
Template metaprogramming can be thought of as compile-time execution.
Compilation is going to take longer, since the compiler has to instantiate and evaluate the templates, generate code, and then compile that too.
The run-time overhead I am not sure about; it shouldn't be much more than if you wrote it yourself in C code, I would imagine.
I worked on a project where another programmer had tried out metaprogramming. It was terrible. It was a complete headache. I'm an average programmer with a lot of C++ experience, and trying to figure out what the hell they were trying to do took way more time than if they had written it straight out to begin with.
I'm jaded against C++ MetaProgramming because of this experience.
I'm a firm believer that the best code is most easily readable by an average developer. It's the readability of the software that is the #1 priority. I can make anything work using any language... but the skill is in making it readable and easily workable for the next person on the project. C++ MetaProgramming fails to pass muster.
Template metaprogramming does not give you any magical powers in terms of performance. It's basically a very sophisticated preprocessor; you can always write the equivalent in C or C++, it just might take you a very long time.
I do not think there is any hype; a clear and simple answer about templates is given by the C++ FAQ: https://isocpp.org/wiki/faq/templates#overview-templates
About the original question: it cannot be answered, as those things are not comparable.