Intermediate results using expression templates - c++

in C++ Template Metaprogramming: Concepts, Tools, and Techniques from Boost and Beyond
... One drawback of expression templates is that they tend to encourage writing large, complicated expressions, because evaluation is only delayed until the assignment operator is invoked. If a programmer wants to reuse some intermediate result without evaluating it early, she may be forced to declare a complicated type like:
Expression<
    Expression<Array, plus, Array>,
    plus,
    Expression<Array, minus, Array>
> intermediate = a + b + (c - d);
(or worse). Notice how this type not only exactly and redundantly reflects the structure of the computation (and so would need to be maintained as the formula changes) but also overwhelms it? This is a long-standing problem for C++ DSELs. The usual workaround is to capture the expression using type erasure, but in that case one pays for dynamic dispatching. There has been much discussion recently, spearheaded by Bjarne Stroustrup himself, about reusing the vestigial auto keyword to get type deduction in variable declarations, so that the above could be rewritten as:
auto intermediate = a + b + (c - d);
This feature would be a huge advantage to C++ DSEL authors and users alike...
Is it possible to solve this problem with the current C++ standard (not C++0x)?
For example, I want to write an expression like:
Expr X, Y
Matrix A, B, C, D
X = A + B + C
Y = X + C
D := X + Y
where the := operator evaluates the expression at the latest possible time.

For now, you can always use BOOST_AUTO() in place of C++0x's auto keyword to get intermediate results more easily.
Matrix x, y;
BOOST_AUTO(result, (x + y) * (x + y)); // or whatever.

I don't understand your question. auto is going to be reused in C++0x for automatic type inference.
I personally see this as a drawback for expression templates, since they often rely on having a shorter lifespan than the objects they are built upon, which can turn out to be false if the expression template is captured, as I explain here.
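To illustrate that lifetime issue, here is a minimal sketch (illustrative types, not from any real library): an expression node stores references to its operands, so capturing the node past the operands' lifetime leaves dangling references.

#include <cstddef>

struct Array {
    double data[4];
    double operator[](std::size_t i) const { return data[i]; }
};

template <class L, class R>
struct Plus {
    const L& lhs;   // nodes store references, not copies
    const R& rhs;
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
};

template <class L, class R>
Plus<L, R> operator+(const L& l, const R& r) { return Plus<L, R>{l, r}; }

Array make_array();   // returns a temporary

int main() {
    Array a = {{1, 2, 3, 4}}, b = {{5, 6, 7, 8}};
    Plus<Array, Array> ok = a + b;   // fine: a and b outlive the node
    // Plus<Array, Array> bad = make_array() + b;   // the temporary dies at ';'
    // bad[0];                                      // would read a dangling reference
    return static_cast<int>(ok[0]);
}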

Related

Are there any C++ language obstacles that prevent adopting D ranges?

This is a C++ / D cross-over question. The D programming language has ranges that, in contrast to C++ libraries such as Boost.Range, are not based on iterator pairs. The official C++ Ranges Study Group seems to have been bogged down in nailing down a technical specification.
Question: does the current C++11 or the upcoming C++14 Standard have any obstacles that prevent adopting D ranges, as well as a suitably rangefied version of <algorithm>, wholesale?
I don't know D or its ranges well enough, but they seem lazy and composable as well as capable of providing a superset of the STL's algorithms. Given their claim to success for D, it would seem very nice to have them as a library for C++. I wonder how essential D's unique features (e.g. string mixins, uniform function call syntax) were for implementing its ranges, and whether C++ could mimic that without too much effort (e.g. C++14 constexpr seems quite similar to D's compile-time function evaluation).
Note: I am seeking technical answers, not opinions on whether D ranges are the right design to have as a C++ library.
I don't think there is any inherent technical limitation in C++ which would make it impossible to define a system of D-style ranges and corresponding algorithms in C++. The biggest language-level problem would be that C++ range-based for loops require that begin() and end() can be used on the ranges, but assuming we would go to the lengths of defining a library using D-style ranges, extending range-based for loops to deal with them seems a marginal change.
The main technical problem I have encountered when experimenting with algorithms on D-style ranges in C++ was that I couldn't make the algorithms as fast as my iterator-based (actually, cursor-based) implementations. Of course, this could just be my algorithm implementations, but I haven't seen anybody provide a reasonable set of D-style range-based algorithms in C++ which I could profile against. Performance is important, and the C++ standard library shall provide, at least, weakly efficient implementations of algorithms (a generic implementation of an algorithm is called weakly efficient if it is at least as fast when applied to a data structure as a custom implementation of the same algorithm using the same data structure and the same programming language). I wasn't able to create weakly efficient algorithms based on D-style ranges, and my objective is actually strongly efficient algorithms (similar to weakly efficient, but allowing any programming language and only assuming the same underlying hardware).
When experimenting with D-style range-based algorithms I found the algorithms a lot harder to implement than iterator-based algorithms and found it necessary to deal with kludges to work around some of their limitations. Of course, not everything in the current way algorithms are specified in C++ is perfect either. A rough outline of how I want to change the algorithms and the abstractions they work with is on my STL 2.0 page. That page doesn't really deal much with ranges, however, as this is a related but somewhat different topic. I would rather envision iterator-based (well, really cursor-based) ranges than D-style ranges, but the question wasn't about that.
One technical problem all range abstractions in C++ do face is having to deal with temporary objects in a reasonable way. For example, consider this expression:
auto result = ranges::unique(ranges::sort(std::vector<int>{ read_integers() }));
Independent of whether ranges::sort() or ranges::unique() are lazy or not, the representation of the temporary range needs to be dealt with. Merely providing a view of the source range isn't an option for either of these algorithms because the temporary object will go away at the end of the expression. One possibility could be to move the range if it comes in as an r-value, requiring different results for ranges::sort() and ranges::unique() to distinguish the cases of the actual argument being either a temporary object or an object kept alive independently. D doesn't have this particular problem because it is garbage collected and the source range would, thus, be kept alive in either case.
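A minimal sketch of that value-category distinction in C++11, using std::vector<int> as a stand-in for a range type (the names are illustrative, not from any proposal):

#include <algorithm>
#include <utility>
#include <vector>

using IntRange = std::vector<int>;

// lvalue argument: the caller keeps the range alive, so sort in place
// and hand back a reference.
IntRange& my_sort(IntRange& r) {
    std::sort(r.begin(), r.end());
    return r;
}

// rvalue argument: the temporary dies at the end of the full expression,
// so move it into the result to keep the storage alive.
IntRange my_sort(IntRange&& r) {
    IntRange owned(std::move(r));
    std::sort(owned.begin(), owned.end());
    return owned;
}

Here my_sort(IntRange{3, 1, 2}) returns an owning range, while my_sort(v) mutates v and returns a reference to it.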
The above example also shows one of the problems with possibly lazily evaluated algorithms: since any type, including types which can't be spelled out otherwise, can be deduced by auto variables or templated functions, there is nothing forcing the lazy evaluation at the end of an expression. Thus, the results from the expression templates can be obtained and the algorithm isn't really executed. That is, if an l-value is passed to an algorithm, it needs to be made sure that the expression is actually evaluated to obtain the actual effect. For example, any sort() algorithm mutating the entire sequence clearly does the mutation in place (if you want a version that doesn't do it in place, just copy the container and apply the in-place version; if you only have a non-in-place version, you can't avoid the extra sequence, which may be an immediate problem, e.g., for gigantic sequences). Assuming it is lazy in some way, the l-value access to the original sequence provides a peek into the current state, which is almost certainly a bad thing. This may imply that lazy evaluation of mutating algorithms isn't such a great idea anyway.
In any case, there are some aspects of C++ which make it impossible to immediately adopt the D-style ranges, although the same considerations also apply to other range abstractions. I'd think these considerations are, thus, somewhat out of scope for the question, too. Also, the obvious "solution" to the first of the problems (add garbage collection) is unlikely to happen. I don't know if there is a solution to the second problem in D. There may emerge a solution to the second problem (tentatively dubbed operator auto) but I'm not aware of a concrete proposal or of what such a feature would actually look like.
BTW, the Ranges Study Group isn't really bogged down by any technical details. So far, we merely tried to find out what problems we are actually trying to solve and to scope out, to some extent, the solution space. Also, groups generally don't get any work done, at all! The actual work is always done by individuals, often by very few individuals. Since a major part of the work is actually designing a set of abstractions, I would expect that the foundations of any results of the Ranges Study Group will be laid by 1 to 3 individuals who have some vision of what is needed and of what it should look like.
My C++11 knowledge is much more limited than I'd like it to be, so there may be newer features which improve things that I'm not aware of yet, but there are three areas that I can think of at the moment which are at least problematic: template constraints, static if, and type introspection.
In D, a range-based function will usually have a template constraint on it indicating which type of ranges it accepts (e.g. forward range vs random-access range). For instance, here's a simplified signature for std.algorithm.sort:
auto sort(alias less = "a < b", Range)(Range r)
    if (isRandomAccessRange!Range &&
        hasSlicing!Range &&
        hasLength!Range)
{...}
It checks that the type being passed in is a random-access range, that it can be sliced, and that it has a length property. Any type which does not satisfy those requirements will not compile with sort, and when the template constraint fails, it makes it clear to the programmer why their type won't work with sort (rather than just giving a nasty compiler error from the middle of the templated function when it fails to compile with the given type).
Now, while that may just seem like a usability improvement over just giving a compilation error when sort fails to compile because the type doesn't have the right operations, it actually has a large impact on function overloading as well as type introspection. For instance, here are two of std.algorithm.find's overloads:
R find(alias pred = "a == b", R, E)(R haystack, E needle)
    if (isInputRange!R &&
        is(typeof(binaryFun!pred(haystack.front, needle)) : bool))
{...}

R1 find(alias pred = "a == b", R1, R2)(R1 haystack, R2 needle)
    if (isForwardRange!R1 && isForwardRange!R2 &&
        is(typeof(binaryFun!pred(haystack.front, needle.front)) : bool) &&
        !isRandomAccessRange!R1)
{...}
The first one accepts a needle which is only a single element, whereas the second accepts a needle which is a forward range. The two are able to have different parameter types based purely on the template constraints and can have drastically different code internally. Without something like template constraints, you can't have templated functions which are overloaded on attributes of their arguments (as opposed to being overloaded on the specific types themselves), which makes it much harder (if not impossible) to have different implementations based on the genre of range being used (e.g. input range vs forward range) or other attributes of the types being used. Some work has been done in this area in C++ with concepts and similar ideas, but AFAIK, C++ is still seriously lacking in the features necessary to overload templates (be they templated functions or templated types) based on the attributes of their argument types rather than specializing on specific argument types (as occurs with template specialization).
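For comparison, the closest widely used C++11 idiom is tag dispatch on iterator categories (std::enable_if is the other common tool); a minimal sketch, which is in the same spirit as D's constraints but far less direct:

#include <iterator>

// Select an implementation from a type attribute (the iterator category),
// the standard library's usual dispatch technique.
template <class It>
void my_advance_impl(It& it, int n, std::input_iterator_tag) {
    while (n-- > 0) ++it;   // input iterators: one step at a time
}

template <class It>
void my_advance_impl(It& it, int n, std::random_access_iterator_tag) {
    it += n;                // random access: constant-time jump
}

template <class It>
void my_advance(It& it, int n) {
    my_advance_impl(it, n,
        typename std::iterator_traits<It>::iterator_category());
}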
A related feature would be static if. It's the same as if, except that its condition is evaluated at compile time, and whether it's true or false will actually determine which branch is compiled in as opposed to which branch is run. It allows you to branch code based on conditions known at compile time. e.g.
static if(isDynamicArray!T)
{}
else
{}
or
static if(isRandomAccessRange!Range)
{}
else static if(isBidirectionalRange!Range)
{}
else static if(isForwardRange!Range)
{}
else static if(isInputRange!Range)
{}
else
    static assert(0, Range.stringof ~ " is not a valid range!");
static if can to some extent obviate the need for template constraints, as you can essentially put the overloads for a templated function within a single function. e.g.
R find(alias pred = "a == b", R, E)(R haystack, E needle)
{
    static if (isInputRange!R &&
               is(typeof(binaryFun!pred(haystack.front, needle)) : bool))
    {...}
    else static if (isForwardRange!R && isForwardRange!E &&
                    is(typeof(binaryFun!pred(haystack.front, needle.front)) : bool) &&
                    !isRandomAccessRange!R)
    {...}
}
but that still results in nastier errors when compilation fails and actually makes it so that you can't overload the template (at least with D's implementation), because overloading is determined before the template is instantiated. So, you can use static if to specialize pieces of a template implementation, but it doesn't quite get you enough of what template constraints get you to not need template constraints (or something similar).
Rather, static if is excellent for doing stuff like specializing only a piece of your function's implementation or for making it so that a range type can properly inherit the attributes of the range type that it's wrapping. For instance, if you call std.algorithm.map on an array of integers, the resultant range can have slicing (because the source range does), whereas if you called map on a range which didn't have slicing (e.g. the ranges returned by std.algorithm.filter can't have slicing), then the resultant ranges won't have slicing. In order to do that, map uses static if to compile in opSlice only when the source range supports it. Currently, map's code that does this looks like
static if (hasSlicing!R)
{
    static if (is(typeof(_input[ulong.max .. ulong.max])))
        private alias opSlice_t = ulong;
    else
        private alias opSlice_t = uint;

    static if (hasLength!R)
    {
        auto opSlice(opSlice_t low, opSlice_t high)
        {
            return typeof(this)(_input[low .. high]);
        }
    }
    else static if (is(typeof(_input[opSlice_t.max .. $])))
    {
        struct DollarToken{}
        enum opDollar = DollarToken.init;

        auto opSlice(opSlice_t low, DollarToken)
        {
            return typeof(this)(_input[low .. $]);
        }

        auto opSlice(opSlice_t low, opSlice_t high)
        {
            return this[low .. $].take(high - low);
        }
    }
}
This is code in the type definition of map's return type, and whether that code is compiled in or not depends entirely on the results of the static ifs, none of which could be replaced with template specializations based on specific types without having to write a new specialized template for map for every new type that you use with it (which obviously isn't tenable). In order to compile in code based on attributes of types rather than with specific types, you really need something like static if (which C++ does not currently have).
The third major item which C++ is lacking (and which I've more or less touched on throughout) is type introspection. The fact that you can do something like is(typeof(binaryFun!pred(haystack.front, needle)) : bool) or isForwardRange!Range is crucial. Without the ability to check whether a particular type has a particular set of attributes or that a particular piece of code compiles, you can't even write the conditions which template constraints and static if use. For instance, std.range.isInputRange looks something like this
template isInputRange(R)
{
    enum bool isInputRange = is(typeof(
    {
        R r = void;       // can define a range object
        if (r.empty) {}   // can test for empty
        r.popFront();     // can invoke popFront()
        auto h = r.front; // can get the front of the range
    }));
}
It checks that a particular piece of code compiles for the given type. If it does, then that type can be used as an input range. If it doesn't, then it can't. AFAIK, it's impossible to do anything even vaguely like this in C++. But to sanely implement ranges, you really need to be able to do stuff like have isInputRange or test whether a particular type compiles with sort - is(typeof(sort(myRange))). Without that, you can't specialize implementations based on what types of operations a particular range supports, you can't properly forward the attributes of a range when wrapping it (and range functions wrap their arguments in new ranges all the time), and you can't even properly protect your function against being compiled with types which won't work with it. And, of course, the results of static if and template constraints also affect the type introspection (as they affect what will and won't compile), so the three features are very much interconnected.
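The nearest C++11 workaround I know of is an expression-SFINAE trait that must be written once per expression to be tested; a sketch for a hypothetical popFront() member (the member name is just for illustration):

#include <type_traits>
#include <utility>

// Primary template: assume the expression does not compile.
template <class T, class = void>
struct has_pop_front : std::false_type {};

// Specialization selected only when r.popFront() is a valid expression.
template <class T>
struct has_pop_front<T,
    decltype(void(std::declval<T&>().popFront()))> : std::true_type {};

struct R1 { void popFront(); };
struct R2 {};

static_assert(has_pop_front<R1>::value, "R1 has popFront()");
static_assert(!has_pop_front<R2>::value, "R2 does not");

This detects a single operation; replicating a whole isInputRange-style check means stacking several such traits, which is considerably more verbose than the D version above.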
Really, the main reasons that ranges don't work very well in C++ are the same reasons that metaprogramming in C++ is primitive in comparison to metaprogramming in D. AFAIK, there's no reason that these features (or similar ones) couldn't be added to C++ and fix the problem, but until C++ has metaprogramming capabilities similar to those of D, ranges in C++ are going to be seriously impaired.
Other features such as mixins and Uniform Function Call Syntax would also help, but they're nowhere near as fundamental. Mixins would help primarily with reducing code duplication, and UFCS helps primarily with making it so that generic code can just call all functions as if they were member functions so that if a type happens to define a particular function (e.g. find) then that would be used instead of the more general, free function version (and the code still works if no such member function is declared, because then the free function is used). UFCS is not fundamentally required, and you could even go the opposite direction and favor free functions for everything (like C++11 did with begin and end), though to do that well, it essentially requires that the free functions be able to test for the existence of the member function and then call the member function internally rather than using their own implementations. So, again you need type introspection along with static if and/or template constraints.
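A sketch of that test-and-forward technique in C++11 (my_find and find_impl are illustrative names): a free function that prefers a member find() when one exists and falls back to a generic loop otherwise:

#include <iterator>
#include <map>
#include <vector>

// Preferred overload: participates in overload resolution only if
// c.find(v) is a valid expression.
template <class C, class V>
auto find_impl(C& c, const V& v, int) -> decltype(c.find(v)) {
    return c.find(v);                       // use the member version
}

// Fallback: a plain linear search over the range.
template <class C, class V>
auto find_impl(C& c, const V& v, long) -> decltype(std::begin(c)) {
    auto it = std::begin(c);
    for (; it != std::end(c); ++it)
        if (*it == v) break;
    return it;
}

template <class C, class V>
auto my_find(C& c, const V& v) -> decltype(find_impl(c, v, 0)) {
    return find_impl(c, v, 0);   // 0 is an int, so the member version wins if viable
}

int main() {
    std::map<int, int> m{{1, 2}};
    std::vector<int> v{1, 2, 3};
    my_find(m, 1);   // dispatches to std::map::find
    my_find(v, 2);   // dispatches to the generic loop
}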
As much as I love ranges, at this point, I've pretty much given up on attempting to do anything with them in C++, because the features to make them sane just aren't there. But if other folks can figure out how to do it, all the more power to them. Regardless of ranges though, I'd love to see C++ gain features such as template constraints, static if, and type introspection, because without them, metaprogramming is way less pleasant, to the point that while I do it all the time in D, I almost never do it in C++.

Does C++11 support types recursion in templates?

I want to explain the question in detail. In many languages with strong type systems (like Felix, Ocaml, Haskell) you can define a polymorphic list by composing type constructors. Here's the Felix definition:
typedef list[T] = 1 + T * list[T];
typedef list[T] = (1 + T * self) as self;
In Ocaml:
type 'a list = Empty | Cons of 'a * 'a list
In C, this is recursive but neither polymorphic nor compositional:
struct int_list { int elt; struct int_list *next; };
In C++ it would be done like this, if C++ supported type recursion:
struct unit {};
template<typename T>
using list = variant< unit, tuple<T, list<T>> >;
given a suitable definition for tuple (aka pair) and variant (but not the broken one used in Boost). Alternatively:
using list = variant< unit, tuple<T, &list<T>> >;
might be acceptable given a slightly different definition of variant. It was not possible to even write this in C++ < C++11 because without template typedefs, there's no way to get polymorphism, and without a sane syntax for typedefs, there's no way to get the target type in scope. The using syntax above solves both these problems, however this does not imply recursion is permitted.
In particular please note that allowing recursion has a major impact on the ABI, i.e. on name mangling (it can't be done unless the name mangling scheme allows for representation of fixpoints).
My question: is this required to work in C++11?
[Assuming the expansion doesn't result in an infinitely large struct]
Edit: just to be clear, the requirement is for general structural typing. Templates provide precisely that, for example
pair<int, double>
pair<int, pair <long, double> >
are anonymously (structurally) typed, and pair is clearly polymorphic. However recursion in C++ < C++11 cannot be stated, not even with a pointer. In C++11 you can state the recursion, albeit with a template typedef (with the new using syntax the expression on the LHS of the = sign is in scope on the RHS).
Structural (anonymous) typing with polymorphism and recursion are minimal requirements for a type system.
Any modern type system must support polynomial type functors or the type system is too clumsy to do any kind of high-level programming. The combinators required for this are usually stated by type theoreticians like:
1 | * | + | fix
where 1 is the unit type, * is tuple formation, + is variant formation, and fix is recursion. The idea is simply that:
if t is a type and u is a type then t + u and t * u are also types
In C++, struct unit{} is 1, tuple is *, variant is + and fixpoints might be obtained with the using = syntax. It's not quite anonymous typing because the fixpoint would require a template typedef.
Edit: Just an example of polymorphic type constructor in C:
T* // pointer formation
T (*)(U) // one argument function type
T[2] // array
Unfortunately in C, function values aren't compositional, and pointer formation is subject to lvalue constraint, and the syntactic rules for type composition are not themselves compositional, but here we can say:
if T is a type T* is a type
if T and U are types, T (*)(U) is a type
if T is a type T[2] is a type
so these type constructors (combinators) can be applied recursively to get new types without having to create a new intermediate type. In C++ we can easily fix the syntactic problem:
template<typename T> using ptr = T*;
template<typename T, typename U> using fun = T (*)(U);
template<typename T> using arr2 = T[2];
so now you can write:
arr2<fun<double, ptr<int>>>
and the syntax is compositional, as well as the typing.
No, that is not possible. Even indirect recursion through alias templates is forbidden.
C++11, 14.5.7/3:
The type-id in an alias template declaration shall not refer to the alias template being declared. The type produced by an alias template specialization shall not directly or indirectly make use of that specialization. [ Example:
template <class T> struct A;
template <class T> using B = typename A<T>::U;
template <class T> struct A {
    typedef B<T> U;
};
B<short> b; // error: instantiation of B<short> uses own type via A<short>::U
— end example ]
If you want this, stick to your Felix, Ocaml, or Haskell. You will easily realize that very few (none?) successful languages have type systems as rich as those three. And in my opinion, if all languages were the same, learning new ones wouldn't be worth it.
template<typename T>
using list = variant< unit, tuple<T, list<T>> >;
In C++ doesn't work because an alias template doesn't define a new type. It's purely an alias, a synonym, and it is equivalent to its substitution. This is a feature, btw.
That alias template is equivalent to the following piece of Haskell:
type List a = Either () (a, List a)
GHCi rejects this because "[cycles] in type synonym declarations" are not allowed. I'm not sure if this is outright banned in C++, or if it is allowed but causes infinite recursion when substituted. Either way, it doesn't work.
The way to define new types in C++ is with the struct, class, union, and enum keywords. If you want something like the following Haskell (I insist on Haskell examples, because I don't know the other two languages), then you need to use those keywords.
newtype List a = List (Either () (a, List a))
I think you may need to review your type theory, as several of your assertions are incorrect.
Let's address your main question (and backhanded point) - as others have pointed out type recursion of the type you requested is not allowed. This does not mean that c++ does not support type recursion. It supports it perfectly well. The type recursion you requested is type name recursion, which is a syntactic flair that actually has no consequence on the actual type system.
C++ allows tuple membership recursion by proxy. For instance, C++ allows
class A
{
    A* oneOfMe_;
};
That is type recursion that has real consequences. (And obviously no language can do this without internal proxy representation because size is infinitely recursive otherwise).
Also, C++ allows translation-time polymorphism, which allows for the creation of objects that act like any type you may create using name recursion. The name recursion is only used to unload types to members or provide translation-time behavior assignments in the type system. Type tags, type traits, etc. are well-known C++ idioms for this.
To prove that type name recursion does not add functionality to a type system, it only needs to be pointed out that C++'s type system allows fully Turing-complete type calculations, using metaprogramming on compile-time constants (and typelists of them), through simple mapping of names to constants. This means there is a function MakeItC++:YourIdeaOfPrettyName->TypeParametrisedByTypelistOfInts that makes any Turing-computable type system you want.
As you know, being a student of type theory, variants are dual to tuple products. In the type category, any property of variants has a dual property of tuple products with arrows reversed. If you work consistently with the duality, you do not get properties with "new capabilities" (in terms of type calculations). So on the level of type calculations, you obviously don't need variants. (This should also be obvious from the Turing Completeness.)
However, in terms of runtime behavior in an imperative language, you do get different behavior. And it is bad behavior. Whereas products restrict semantics, variants relax semantics. You should never want this, as it provably destroys code correctness. The history of statically typed programming languages has been moving towards greater and greater expression of the semantics in the type system, with the goal that the compiler should be able to understand when the program does not mean what you want it to. The goal has been to turn the compiler into the program verification system.
For instance, with type units, you can express that a particular value isn't just an int but is actually an acceleration measured in meters per square second. Assigning a value that is a velocity expressed in feet per hour divided by a timespan of minutes shouldn't just divide the two values - it should note that a conversion is necessary (and either perform it, or fail compilation, or... do the right thing). Assigning a force should fail compilation. Doing these kinds of checks on program meaning could potentially have given us more Martian exploration, for instance.
Variants are the opposite direction. Sure, "if you code correctly, they work correctly", but that's not the point with code verification. They provably add code loci where a different engineer, unfamiliar with current type usage, can introduce the incorrect semantic assumption without translation failure. And, there is always a code transformation that changes an imperative code section from one that uses Variants unsafely to one that use semantically validated non-variant types, so their use is also "always suboptimal".
The majority of runtime uses for variants are typically those that are better encapsulated in runtime polymorphism. Runtime polymorphism has statically verified semantics that may have associated runtime invariant checking and, unlike variants (where the sum type is universally declared in one code locus), actually supports the Open-Closed Principle. By needing to declare a variant in one location, you must change that location every time you add a new functional type to the sum. This means that code never closes to change, and therefore may have bugs introduced. Runtime polymorphism, though, allows new behaviors to be added in separate code loci from the other behaviors.
(And besides, most real language type systems are not distributive anyway. (a, b | c) =/= (a, b) | (a, c) so what is the point here?)
I would be careful making blanket statements about what makes a type system good without getting some experience in the field, particularly if your point is to be provocative and political and enact change. I do not see anything in your post that actually points to healthy changes for any computer language. I do not see features, safety, or any of the other actual real-world concerns being addressed. I totally get the love of type theory. I think every computer scientist should know Category Theory and the denotational semantics of programming languages (domain theory, cartesian categories, all the good stuff). I think if more people understood the Curry-Howard isomorphism as an ontological manifesto, constructivist logics would get more respect.
But none of that provides reasons to attack the c++ type system. There are legitimate attacks for nearly every language - type name recursion and variant availability are not them.
EDIT: Since my point about Turing completeness does not seem to be understood, nor my comment about the c++ way of using type tags and traits to offload type calculations, maybe an example is in order.
Now the OP claims to want this in a usage case for lists, which my earlier point on the layout easily handles. Better, just use std::list. But from other comments and elsewhere, I think they really want this to work on the Felix->C++ translation.
So, what I think the OP thinks they want is something like
template <typename Type>
class SomeClass
{
// ...
};
and then be able to build a type
SomeClass< /*insert the SomeClass<...> type created here*/ >
I've mentioned that this is just a naming convention that's wanted. Nobody wants typenames - they are transients of the translation process. What is actually wanted is what you will do with Type later on in the structural composition of the type. It will be used in typename calculations to produce member data and method signatures.
So, what can be done in c++ is
struct SelfTag {};
Then, when you want to refer to self, just put this type tag there.
When it's meaningful to do the type calculation, you have a template specialisation on SelfTag that will substitute SomeClass<SelfTag> instead of substituting SelfTag in the appropriate place of the type calculation.
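A minimal sketch of that SelfTag idea (all names are illustrative):

struct SelfTag {};

template <class Type> class SomeClass;

// Type calculation: rewrite SelfTag into the enclosing type and leave
// every other type alone.
template <class T, class Self> struct Substitute { typedef T type; };
template <class Self> struct Substitute<SelfTag, Self> { typedef Self type; };

template <class Type>
class SomeClass
{
    typedef typename Substitute<Type, SomeClass>::type Payload;
    Payload* next_;   // indirection keeps the object size finite
};

SomeClass<SelfTag> node;   // behaves like a self-referential type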
My point here is that the C++ type system is Turing complete - and that means a lot more than what I think the OP is reading every time I've written that. Any type calculation may be done (given the constraints of compiler recursion), and that really does mean that if you have a problem in one type system in a completely different language, you can find a translation here. I hope this makes my point even clearer. Coming back and saying "well, you still can't do XYZ in the type system" would be clearly missing the point.
C++ does have the "curiously recurring template pattern", or CRTP. It's not specific to C++11, however. It means you can do the following (shamelessly copied from Wikipedia):
template <typename T>
struct base
{
    // ...
};

struct derived : base<derived>
{
    // ...
};
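A brief sketch of what the pattern buys you: the base template can refer to the derived type at compile time, giving static dispatch without virtual functions (names are illustrative):

template <typename T>
struct crtp_base
{
    // The base knows the derived type, so it can call into it directly.
    int interface() { return static_cast<T*>(this)->implementation(); }
};

struct widget : crtp_base<widget>
{
    int implementation() { return 42; }
};

int main()
{
    widget w;
    return w.interface();   // statically dispatched to widget::implementation
}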
@jpalcek answered my question. However, my actual problem (as hinted at in the examples) can be solved without recursive aliases like this:
// core combinators
struct unit;
struct point;
template<class T, class U> struct fix;
template<class T, class U> struct tup2;
template<class T, class U> struct var2;

template <> struct
fix<
    point,
    var2<unit, tup2<int, point> >
>
{
    // definition goes here
};
using the fix and point types to represent recursion. I happen not to require any of the templates to be defined, I only need to define the specialisations. What I needed was a name that would be the same in two distinct translation units for external linkage: the name had to be a function of the type structure.
@Ex0du5 prompted thinking about this. The actual solution is also related to a correspondence from Gabriel Dos Reis many years ago. I want to thank everyone who contributed.

Why is C++11 constexpr so restrictive?

As you probably know, C++11 introduces the constexpr keyword.
C++11 introduced the keyword constexpr, which allows the user to
guarantee that a function or object constructor is a compile-time
constant.
[...]
This allows the compiler to understand, and verify, that [function name] is a
compile-time constant.
My question is why there are such strict restrictions on the form of the functions that can be declared constexpr. I understand the desire to guarantee that the function is pure, but consider this:
The use of constexpr on a function imposes some limitations on what
that function can do. First, the function must have a non-void return
type. Second, the function body cannot declare variables or define new
types. Third, the body may only contain declarations, null statements
and a single return statement. There must exist argument values such
that, after argument substitution, the expression in the return
statement produces a constant expression.
That means that this pure function is illegal:
constexpr int maybeInCppC1Y(int a, int b)
{
    if (a > 0)
        return a + b;
    else
        return a - b;
    // can be written as return (a > 0) ? (a + b) : (a - b); but that isn't the point
}
Also, you can't define local variables... :(
So I'm wondering: is this a design decision, or do compilers suck when it comes to proving that a function is pure?
The reason you'd need to write statements instead of expressions is that you want to take advantage of the additional capabilities of statements, particularly the ability to loop. But to be useful, that would require the ability to declare variables (also banned).
If you combine a facility for looping, with mutable variables, with logical branching (as in if statements) then you have the ability to create infinite loops. It is not possible to determine if such a loop will ever terminate (the halting problem). Thus some sources would cause the compiler to hang.
By using recursive pure functions it is possible to cause infinite recursion, which can be shown to be equivalently powerful to the looping capabilities described above. However, C++ already has that problem at compile time - it occurs with template expansion - and so compilers already have to have a switch for "template stack depth" so they know when to give up.
So the restrictions seem designed to ensure that this problem (of determining if a C++ compilation will ever finish) doesn't get any thornier than it already is.
The rules for constexpr functions are designed such that it's impossible to write a constexpr function that has any side-effects.
By requiring constexpr to have no side-effects it becomes impossible for a user to determine where/when it was actually evaluated. This is important since constexpr functions are allowed to happen at both compile time and run time at the discretion of the compiler.
If side-effects were allowed then there would need to be some rules about the order in which they would be observed. That would be incredibly difficult to define - even harder than the static initialisation order problem.
A relatively simple set of rules for guaranteeing these functions to be side-effect free is to require that they be just a single expression (with a few extra restrictions on top of that). This sounds limiting initially, and it rules out the if statement as you noted. Whilst that particular case would have no side-effects, it would have introduced extra complexity into the rules, and given that you can write the same things using the ternary operator or recursion, it's not really a huge deal.
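For instance, here is how the banned if statement is typically expressed under the C++11 rules, via the ternary operator and recursion (a compile-time factorial as a stock example):

constexpr int factorial(int n)
{
    return n <= 1 ? 1 : n * factorial(n - 1);
}

static_assert(factorial(5) == 120, "evaluated at translation time");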
n2235 is the paper that proposed the constexpr addition in C++. It discusses the rationale for the design - the relevant quote seems to be this one from a discussion on destructors, but it is relevant generally:
The reason is that a constant-expression is intended to be evaluated by the compiler
at translation time just like any other literal of built-in type; in particular no
observable side-effect is permitted.
Interestingly, the paper also mentions that a previous proposal suggested that the compiler figure out automatically which functions were constexpr, without the new keyword, but this was found to be unworkably complex, which seems to support my suggestion that the rules were designed to be simple.
(I suspect there will be other quotes in the references cited in the paper, but this covers the key point of my argument about the no side-effects)
Actually, the C++ standardization committee is thinking about removing several of these constraints for C++14. See the following working document: http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2013/n3597.html
The restrictions could certainly be lifted quite a bit without enabling code which cannot be executed during compile time, or which cannot be proven to always halt. However I guess it wasn't done because
it would complicate the compiler for minimal gain. C++ compilers are quite complex as it is
specifying exactly how much is allowed without violating the restrictions above would have been time-consuming, and given that desired features have been postponed in order to get the standard out of the door, there probably was little incentive to add more work (and further delay the standard) for little gain
some of the restrictions would have been either rather arbitrary or rather complicated (especially on loops, given that C++ doesn't have the concept of a native incrementing for loop, but both the end condition and the increment code have to be explicitly specified in the for statement, making it possible to use arbitrary expressions for them)
Of course, only a member of the standards committee could give an authoritative answer whether my assumptions are correct.
I think constexpr is just for const objects. I mean: you can now have static const objects like String::empty_string constructed statically (without hacking!). This may reduce the time spent before 'main' is called. And static const objects may have functions like .length(), operator==, ... so this is why the 'expr' part is needed. In C you can create static constant structs like below:
static const Foos foo = { .a = 1, .b = 2, };
The Linux kernel has tons of structs of this kind. In C++ you can now do this with constexpr.
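A small sketch of that idea (Foos is just the hypothetical struct from above, given a constexpr constructor):

struct Foos {
    int a;
    int b;
    constexpr Foos(int a_, int b_) : a(a_), b(b_) {}
};

constexpr Foos foo(1, 2);   // statically initialized, like the C struct above
static_assert(foo.a == 1, "no runtime initialization needed");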
note: I don't know why, but the code below should not be accepted either, just like the if version:
constexpr int maybeInCppC1Y(int a, int b) { return (a > 0) ? (a + b) : (a - b); }

What are the benefits of using Boost.Phoenix?

I cannot understand what the real benefits of using Boost.Phoenix are.
When I use it with Boost.Spirit grammars, it's really useful:
double_[ boost::phoenix::push_back( boost::phoenix::ref( v ), _1 ) ]
When I use it for lambda functions, it's also useful and elegant:
boost::range::for_each( my_string, if_ ( '\\' == arg1 ) [ arg1 = '/' ] );
But what are the benefits of everything else in this library? The documentation says: "Functors everywhere". I don't understand what the good of that is.
I'll point out what the critical difference between Boost.Lambda and Boost.Phoenix is:
Boost.Phoenix supports (statically) polymorphic functors, while Boost.Lambda binds are always monomorphic.
(At the same time, in many aspects the two libraries can be combined, so they are not exclusive choices.)
Let me illustrate (Warning: Code not tested.):
Phoenix
In Phoenix, a functor can be converted into a Phoenix "lazy function" (from http://www.boost.org/doc/libs/1_54_0/libs/phoenix/doc/html/phoenix/starter_kit/lazy_functions.html):
struct is_odd_impl {
    typedef bool result_type; // less necessary in C++11

    template <typename Arg>
    bool operator()(Arg arg1) const {
        return arg1 % 2 == 1;
    }
};

boost::phoenix::function<is_odd_impl> is_odd;
is_odd is truly polymorphic (as the functor is_odd_impl is). That is, is_odd(_1) can act on anything (that makes sense). For example, is_odd(_1)(3u)==true and is_odd(_1)(3l)==true. is_odd can be combined into a more complex expression without losing its polymorphic behavior.
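A short, untested usage sketch (assuming the is_odd definition above): the one lazy function is reused unchanged at different argument types:

#include <algorithm>
#include <vector>
#include <boost/phoenix.hpp>

struct is_odd_impl {
    typedef bool result_type;
    template <typename Arg>
    bool operator()(Arg arg1) const { return arg1 % 2 == 1; }
};

boost::phoenix::function<is_odd_impl> is_odd;

int main() {
    using boost::phoenix::arg_names::arg1;
    std::vector<unsigned> u(5, 3u);
    std::vector<long> l(5, 3L);
    std::count_if(u.begin(), u.end(), is_odd(arg1));   // acts on unsigned
    std::count_if(l.begin(), l.end(), is_odd(arg1));   // and on long
}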
Lambda attempt
What is the closest we can get to this in Boost.Lambda? We could define two overloads:
bool is_odd_overload(unsigned arg1){return arg1 % 2 == 1;}
bool is_odd_overload(long arg1){return arg1 % 2 == 1;}
but to create a Lambda "lazy function" we will have to choose one of the two:
using boost::lambda::bind;
auto f0 = bind(&is_odd_overload, _1); // not ok, cannot resolve which of the two
auto f1 = bind(static_cast<bool(*)(unsigned)>(&is_odd_overload), _1); // ok, but a choice has been made
auto f2 = bind(static_cast<bool(*)(long)>(&is_odd_overload), _1); // ok, but a choice has been made
Even if we define a template version
template<class T>
bool is_odd_template(T arg1){return arg1 % 2 == 1;}
we will have to bind to a particular instance of the template function, for example
auto f3 = bind(&is_odd_template<unsigned>, _1); // not tested
None of f1, f2, or f3 is truly polymorphic, since a choice has been made at the time of binding.
(Note1: this may not be the best example since things may seem to work due to implicit conversions from unsigned to long, but that is another matter.)
To summarize, given a polymorphic function/functor, Lambda cannot bind to the polymorphic function (as far as I know), while Phoenix can. It is true that Phoenix relies on the "Result Of protocol" http://www.boost.org/doc/libs/1_54_0/libs/utility/utility.htm#result_of but 1) at least it is possible, and 2) this is less of a problem in C++11, where return types are very easy to deduce and it can be done automatically.
In fact, in C++11, Phoenix lambdas are still more powerful than C++11
built-in lambdas. Even in C++14, where template generic lambdas are
implemented, Phoenix is still more general, because it allows a
certain level of introspection. (For this and other things, Joel de
Guzman (developer of Phoenix) was and still is well ahead of his
time.)
Well, it's a very powerful lambda language.
I used it to create a prototype for a math-like DSL:
http://code.google.com/p/asadchev/source/browse/trunk/work/cxx/interval.hpp
and many other things:
http://code.google.com/p/asadchev/source/browse/#svn%2Ftrunk%2Fprojects%2Fboost%2Fphoenix
I have never used Phoenix, but...
From the Phoenix Library docs:
The Phoenix library enables FP techniques such as higher order functions, lambda (unnamed functions), currying (partial function application) and lazy evaluation in C++
From the Wikipedia article on Functional programming:
... functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. It emphasizes the application of functions, in contrast to the imperative programming style, which emphasizes changes in state
So, Phoenix is a library for enabling Functional Programming in C++.
The major interest in Functional Programming these days seems to stem from the perceived advantages in correctness, and performance, due to limiting or eliminating side-effects.
Correctness, because without side-effects, the code that you see is everything going on in the system. Some other code won't be changing your state underneath you. You can much more easily write bug-free code in this sort of environment.
Performance, because without side-effects, the code you write can safely run in parallel, without any resource managing primitives, or atomic-access tricks. Multi-threading can be enabled extremely easily, even automatically, and operate extremely efficiently.
Don't look at Boost.Phoenix2.
The evolution of lambda expressions in Boost looks like:
Bind -> Lambda, Phoenix2 (as part of Spirit) -> Phoenix3 (as a separate library, under development).
The result is a single lambda library with polymorphic functor support (the others are going to become deprecated).
Functional programming in C++. It's hard to explain unless you have previously used a language with proper support for functional programming, such as SML. I tried to use Phoenix and found it nice, but very impractical in real-life projects because it greatly increases compilation times, and error messages are awful when you do something wrong. I remember getting a few megabytes of errors from GCC when I played with Phoenix. Also, debugging deeply nested template instantiations is a PITA. (Actually, these are also all the arguments against using most of Boost.)

Explain ML type inference to a C++ programmer

How does ML perform the type inference in the following function definition:
let add a b = a + b
Is it like C++ templates, where no type-checking is performed until the point of template instantiation, after which, if the type supports the necessary operations, the function works, or else a compilation error is thrown?
i.e. for example, the following function template
template <typename NumType>
NumType add(NumType a, NumType b) {
return a + b;
}
will work for
add<int>(23, 11);
but won't work for
add<ostream>(cout, fout);
Is my guess correct, or does ML type inference work differently?
PS: Sorry for my poor English; it's not my native language.
I suggest you have a look at this article: What is Hindley-Milner? (and why is it cool)
Here is the simplest example they use to explain type inference (it's not ML, but the idea is the same):
def foo(s: String) = s.length
// note: no explicit types
def bar(x, y) = foo(x) + y
Just looking at the definition of bar, we can easily see that its type must be (String, Int)=>Int. That's type inference in a nutshell. Read the whole article for more information and examples.
I'm not a C++ expert, but I think templates are closer to genericity/parametricity, which is something different.
I think trying to relate ML type inference to almost anything in C++ is more likely to lead to confusion than understanding. C++ just doesn't have anything that's much like type inference at all.
The only part of C++ that doesn't make typing explicit is templates, but (for the most part) they support generic programming. A C++ function template like the one you've given might apply equally to an unbounded set of types; just for example, the code you have uses NumType as the template parameter, but it would work with strings. A single program could instantiate your add to add two strings in one place and two numbers in another place.
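To make that concrete, the same template instantiates at two unrelated types within one program, which is genericity rather than inference:

#include <string>

template <typename NumType>
NumType add(NumType a, NumType b) { return a + b; }

int main() {
    add(23, 11);                                   // NumType deduced as int
    add(std::string("foo"), std::string("bar"));   // and here as std::string
}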
ML type inference isn't for generic programming. In C or C++, you explicitly define the type of a parameter, and then the compiler checks that everything you try to do with that parameter is allowed by that type. ML reverses that: it looks at the things you do with the parameter, and figures out what the type has to be for you to be able to do those things. If you've tried to do things that contradict each other, it'll tell you there is no type that can satisfy the constraints.
This would be pretty close to impossible in C or C++, largely because of all the implicit type conversions that are allowed. Just for example, if I have something like a + b in ML, it can immediately conclude that a and b must be ints - but in C or C++, they could be almost any combination of integer, pointer, or floating-point types (with the constraint that they can't both be pointers), or user-defined types that overload operator+ (e.g., std::string). In ML, finding types can be exponential in the worst case, but it is almost always pretty fast. In C++, I'd estimate it being exponential much more often, and in a lot of cases it would probably be under-constrained, so a given function could have any of a number of different signatures.
ML uses Hindley-Milner type inference. In this simple case all it has to do is look at the body of the function and see that it uses + with the arguments and returns that. Thus it can infer that the arguments must be the type of arguments that + accepts (i.e. ints) and the function returns the type that + returns (also int). Thus the inferred type of add is int -> int -> int.
Note that in SML (but not CAML) + is also defined for types other than int, but it will still infer int when there are multiple possibilities (i.e. the add function you defined cannot be used to add two floats).
F# and ML are somewhat similar with regard to type inference, so you might find
Overview of type inference in F#
helpful.