Why haven't modern compilers removed the need for expression templates? - c++

The standard pitch for expression templates in C++ is that they increase efficiency by removing unnecessary temporary objects. Why can't C++ compilers already remove these unnecessary temporary objects?
This is a question that I think I already know the answer to but I want to confirm since I couldn't find a low-level answer online.
Expression templates essentially allow/force an extreme degree of inlining. However, even with inlining, compilers cannot optimize out calls to operator new and operator delete because they treat those calls as opaque since those calls can be overridden in other translation units. Expression templates completely remove those calls for intermediate objects.
These superfluous calls to operator new and operator delete can be seen in a simple example where we only copy:
#include <array>
#include <vector>
std::vector<int> foo(std::vector<int> x)
{
    std::vector<int> y{x};
    std::vector<int> z{y};
    return z;
}

std::array<int, 3> bar(std::array<int, 3> x)
{
    std::array<int, 3> y{x};
    std::array<int, 3> z{y};
    return z;
}
In the generated code, we see that foo() compiles to a relatively lengthy function with two calls to operator new and one call to operator delete while bar() compiles to only a transfer of registers and doesn't do any unnecessary copying.
Is this analysis correct?
Could any C++ compiler legally elide the copies in foo()?

However, even with inlining, compilers cannot optimize out calls to operator new and operator delete because they treat those calls as opaque since those calls can be overridden in other translation units.
Since C++14 this is no longer true: allocation calls can be optimized out or reused under certain conditions:
[expr.new#10] An implementation is allowed to omit a call to a replaceable global allocation function. When it does so, the storage is instead provided by the implementation or provided by extending the allocation of another new-expression. [conditions follow]
So foo() may be legally optimized to something equivalent to bar() nowadays ...
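As a minimal illustration of the latitude this rule grants (a sketch, not a claim about what any particular compiler will emit), a new/delete pair that is fully visible to the optimizer may be removed outright; the same permission is what would let the intermediate vectors in foo() disappear:

int no_allocation_needed()
{
    int* p = new int(42); // replaceable global allocation
    int v = *p;
    delete p;             // released before anything observable happens
    return v;             // optimizers may fold the whole body to "return 42;"
}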
Expression templates essentially allow/force an extreme degree of inlining
IMO the point of expression templates is not so much about inlining per se; it's rather about exploiting the symmetries of the type system of the domain-specific language the expression models.
For example, when you multiply three, say, Hermitian matrices, an expression template can use a space-time optimized algorithm exploiting the fact that the product is associative and that Hermitian matrices are adjoint-symmetric, resulting in a reduction of total operation count (and possibly even better accuracy). And all this occurs at compile time.
Conversely, a compiler cannot know what a Hermitian matrix is; it is constrained to evaluating the expression the brute-force way (according to your implementation's floating-point semantics).

There are two kinds of expression templates.
One kind is about domain specific languages embedded directly in C++. Boost.Spirit turns expressions into recursive descent parsers. Boost.Xpressive turns them into regular expressions. Good old Boost.Lambda turns them into function objects with argument placeholders.
Obviously there is nothing a compiler can do to get rid of that need. It would take special-purpose language extensions to add the capabilities that the eDSL adds, the way lambdas were added in C++11. But it's not productive to do that for every eDSL written; it would make the language gigantic and impossible to comprehend, among other problems.
The second kind is the expression templates that keep high-level semantics the same but optimize execution. They apply domain-specific knowledge to transform expressions into more efficient execution paths, while keeping the semantics the same. A linear algebra library might do that as Massimiliano explained in his answer, or a SIMD library like Boost.Simd might translate multiple operations into a single fused operation like multiply-add.
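For illustration, here is a bare-bones sketch of that second kind (hypothetical types, far simpler than any real library): the addition is captured as a lightweight expression node and only evaluated element by element on assignment, so no intermediate vector is ever materialized.

#include <cstddef>
#include <vector>

namespace et {

// Node representing lhs + rhs without evaluating it yet.
template <typename L, typename R>
struct Sum {
    const L& lhs;
    const R& rhs;
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// A real library would constrain this operator; the sketch keeps it minimal.
template <typename L, typename R>
Sum<L, R> operator+(const L& lhs, const R& rhs) { return {lhs, rhs}; }

struct Vec {
    std::vector<double> data;
    double operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }

    // Assigning from an expression walks the whole tree in one loop,
    // so a + b + c never allocates a temporary Vec.
    template <typename Expr>
    Vec& operator=(const Expr& e) {
        data.resize(e.size());
        for (std::size_t i = 0; i < e.size(); ++i) data[i] = e[i];
        return *this;
    }
};

} // namespace et

// Usage: with et::Vec a, b, c, d;  d = a + b + c;  builds a
// Sum<Sum<Vec, Vec>, Vec> on the stack and evaluates it element-wise.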
These libraries provide services that a compiler could, in theory, perform without modifying the language specification. However, in order to do so, the compiler would have to recognize the domain in question and have all the built-in domain knowledge to do the transformations. This approach is way too complex and would make compilers huge and even slower than they are.
An alternative approach to expression templates for these kinds of libraries would be compiler plugins, i.e. instead of writing a special matrix class that has all the expression template magic, you write a plugin for the compiler that knows about the matrix type and transforms the AST that the compiler uses. The problem with this approach is that either compilers would have to agree on a plugin API (not going to happen, they work too differently internally) or the library author has to write a separate plugin for every compiler they want their library to be usable (or at least performant) with.

Related

Are there any C++ language obstacles that prevent adopting D ranges?

This is a C++ / D cross-over question. The D programming language has ranges that, in contrast to C++ libraries such as Boost.Range, are not based on iterator pairs. The official C++ Ranges Study Group seems to have been bogged down in nailing down a technical specification.
Question: does the current C++11 or the upcoming C++14 Standard have any obstacles that prevent adopting D ranges, as well as a suitably rangefied version of <algorithm>, wholesale?
I don't know D or its ranges well enough, but they seem lazy and composable as well as capable of providing a superset of the STL's algorithms. Given their claim to success for D, it would seem very nice to have as a library for C++. I wonder how essential D's unique features (e.g. string mixins, uniform function call syntax) were for implementing its ranges, and whether C++ could mimic that without too much effort (e.g. C++14 constexpr seems quite similar to D compile-time function evaluation)
Note: I am seeking technical answers, not opinions whether D ranges are the right design to have as a C++ library.
I don't think there is any inherent technical limitation in C++ which would make it impossible to define a system of D-style ranges and corresponding algorithms in C++. The biggest language level problem would be that C++ range-based for-loops require that begin() and end() can be used on the ranges but assuming we would go to the length of defining a library using D-style ranges, extending range-based for-loops to deal with them seems a marginal change.
The main technical problem I have encountered when experimenting with algorithms on D-style ranges in C++ was that I couldn't make the algorithms as fast as my iterator (actually, cursor) based implementations. Of course, this could just be my algorithm implementations, but I haven't seen anybody providing a reasonable set of D-style range based algorithms in C++ which I could profile against. Performance is important, and the C++ standard library shall provide, at least, weakly efficient implementations of algorithms (a generic implementation of an algorithm is called weakly efficient if it is at least as fast when applied to a data structure as a custom implementation of the same algorithm using the same data structure in the same programming language). I wasn't able to create weakly efficient algorithms based on D-style ranges, and my objective is actually strongly efficient algorithms (similar to weakly efficient but allowing any programming language and only assuming the same underlying hardware).
When experimenting with D-style range based algorithms I found the algorithms a lot harder to implement than iterator-based algorithms, and found it necessary to deal with kludges to work around some of their limitations. Of course, not everything in the current way algorithms are specified in C++ is perfect either. A rough outline of how I want to change the algorithms and the abstractions they work with is on my STL 2.0 page. This page doesn't really deal much with ranges, however, as this is a related but somewhat different topic. I would rather envision iterator (well, really cursor) based ranges than D-style ranges, but the question wasn't about that.
One technical problem all range abstractions in C++ do face is having to deal with temporary objects in a reasonable way. For example, consider this expression:
auto result = ranges::unique(ranges::sort(std::vector<int>{ read_integers() }));
Independent of whether ranges::sort() or ranges::unique() are lazy or not, the representation of the temporary range needs to be dealt with. Merely providing a view of the source range isn't an option for either of these algorithms because the temporary object will go away at the end of the expression. One possibility could be to move the range if it comes in as an r-value, requiring different result types from ranges::sort() and ranges::unique() to distinguish the cases of the actual argument being either a temporary object or an object kept alive independently. D doesn't have this particular problem because it is garbage collected and the source range would, thus, be kept alive in either case.
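A rough sketch of that distinction using plain containers instead of a real range library (hypothetical names, not an actual proposal): the rvalue overload takes ownership of the temporary, while the lvalue overload can safely hand back a non-owning result.

#include <algorithm>
#include <utility>
#include <vector>

namespace sketch {

// Lvalue argument: the caller keeps the vector alive, so sorting in place and
// returning a cheap, non-owning "view" (here simply a reference) is safe.
inline std::vector<int>& sort(std::vector<int>& v) {
    std::sort(v.begin(), v.end());
    return v;
}

// Rvalue argument: the vector is a temporary, so the result must take
// ownership or it would dangle at the end of the full expression.
inline std::vector<int> sort(std::vector<int>&& v) {
    std::sort(v.begin(), v.end());
    return std::move(v);
}

} // namespace sketch

// sketch::sort(std::vector<int>{3, 1, 2}) owns its data;
// sketch::sort(some_lvalue_vector) merely refers to the caller's object.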
The ranges::unique(ranges::sort(...)) expression above also shows one of the problems with possibly lazily evaluated algorithms: since any type, including types which can't be spelled out otherwise, can be deduced by auto variables or templated functions, there is nothing forcing the lazy evaluation at the end of an expression. Thus, the results from the expression templates can be obtained and the algorithm isn't really executed. That is, if an l-value is passed to an algorithm, it needs to be made sure that the expression is actually evaluated to obtain the actual effect. For example, any sort() algorithm mutating the entire sequence clearly does the mutation in-place (if you want a version that doesn't do it in-place, just copy the container and apply the in-place version; if you only have a non-in-place version you can't avoid the extra sequence, which may be an immediate problem, e.g., for gigantic sequences). Assuming it is lazy in some way, the l-value access to the original sequence provides a peek into the current status, which is almost certainly a bad thing. This may imply that lazy evaluation of mutating algorithms isn't such a great idea anyway.
In any case, there are some aspects of C++ which make it impossible to immediately adopt D-style ranges, although the same considerations also apply to other range abstractions. I'd think these considerations are, thus, somewhat out of scope for the question, too. Also, the obvious "solution" to the first of the problems (add garbage collection) is unlikely to happen. I don't know if there is a solution to the second problem in D. There may emerge a solution to the second problem (tentatively dubbed operator auto) but I'm not aware of a concrete proposal or of what such a feature would actually look like.
BTW, the Ranges Study Group isn't really bogged down by any technical details. So far, we merely tried to find out what problems we are actually trying to solve and to scope out, to some extent, the solution space. Also, groups generally don't get any work done, at all! The actual work is always done by individuals, often by very few individuals. Since a major part of the work is actually designing a set of abstractions, I would expect that the foundations of any results of the Ranges Study Group will be laid by 1 to 3 individuals who have some vision of what is needed and of how it should look.
My C++11 knowledge is much more limited than I'd like it to be, so there may be newer features which improve things that I'm not aware of yet, but there are three areas that I can think of at the moment which are at least problematic: template constraints, static if, and type introspection.
In D, a range-based function will usually have a template constraint on it indicating which type of ranges it accepts (e.g. forward range vs random-access range). For instance, here's a simplified signature for std.algorithm.sort:
auto sort(alias less = "a < b", Range)(Range r)
    if (isRandomAccessRange!Range &&
        hasSlicing!Range &&
        hasLength!Range)
{...}
It checks that the type being passed in is a random-access range, that it can be sliced, and that it has a length property. Any type which does not satisfy those requirements will not compile with sort, and when the template constraint fails, it makes it clear to the programmer why their type won't work with sort (rather than just giving a nasty compiler error from in the middle of the templated function when it fails to compile with the given type).
Now, while that may just seem like a usability improvement over just giving a compilation error when sort fails to compile because the type doesn't have the right operations, it actually has a large impact on function overloading as well as type introspection. For instance, here are two of std.algorithm.find's overloads:
R find(alias pred = "a == b", R, E)(R haystack, E needle)
    if (isInputRange!R &&
        is(typeof(binaryFun!pred(haystack.front, needle)) : bool))
{...}

R1 find(alias pred = "a == b", R1, R2)(R1 haystack, R2 needle)
    if (isForwardRange!R1 && isForwardRange!R2 &&
        is(typeof(binaryFun!pred(haystack.front, needle.front)) : bool) &&
        !isRandomAccessRange!R1)
{...}
The first one accepts a needle which is only a single element, whereas the second accepts a needle which is a forward range. The two are able to have different parameter types based purely on the template constraints and can have drastically different code internally. Without something like template constraints, you can't have templated functions which are overloaded on attributes of their arguments (as opposed to being overloaded on the specific types themselves), which makes it much harder (if not impossible) to have different implementations based on the genre of range being used (e.g. input range vs forward range) or other attributes of the types being used. Some work has been done in this area in C++ with concepts and similar ideas, but AFAIK, C++ is still seriously lacking in the features necessary to overload templates (be they templated functions or templated types) based on the attributes of their argument types rather than specializing on specific argument types (as occurs with template specialization).
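For comparison, the closest C++11 equivalent is SFINAE via std::enable_if, which can likewise steer overload selection on properties of the argument types, though far less readably (a sketch; the range traits here are hypothetical placeholders that a real program would have to specialize per container):

#include <type_traits>

// Hypothetical traits standing in for D's isInputRange / isForwardRange;
// a real program would specialize them for each range type it supports.
template <typename R> struct is_input_range   : std::false_type {};
template <typename R> struct is_forward_range : std::false_type {};

// Selected when R models only an input range (single-element needle).
template <typename R, typename E>
typename std::enable_if<is_input_range<R>::value && !is_forward_range<R>::value, R>::type
find(R haystack, const E& needle) { /* element search */ return haystack; }

// Selected when R models a forward range (sub-range needle, etc.).
template <typename R, typename E>
typename std::enable_if<is_forward_range<R>::value, R>::type
find(R haystack, const E& needle) { /* sub-range search */ return haystack; }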
A related feature would be static if. It's the same as if, except that its condition is evaluated at compile time, and whether it's true or false will actually determine which branch is compiled in as opposed to which branch is run. It allows you to branch code based on conditions known at compile time. e.g.
static if(isDynamicArray!T)
{}
else
{}
or
static if(isRandomAccessRange!Range)
{}
else static if(isBidirectionalRange!Range)
{}
else static if(isForwardRange!Range)
{}
else static if(isInputRange!Range)
{}
else
static assert(0, Range.stringof ~ " is not a valid range!");
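C++ (as of C++11) has no direct counterpart; the usual workaround is tag dispatching, which spreads what would be a single static if chain across several overloads. A small sketch using the standard iterator category tags:

#include <iterator>

// Each "branch" becomes its own overload, selected by the category tag.
template <typename It>
void advance_impl(It& it, int n, std::random_access_iterator_tag) {
    it += n;                     // O(1) branch
}

template <typename It>
void advance_impl(It& it, int n, std::input_iterator_tag) {
    while (n-- > 0) ++it;        // O(n) fallback branch
}

template <typename It>
void my_advance(It& it, int n) {
    advance_impl(it, n,
                 typename std::iterator_traits<It>::iterator_category{});
}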
static if can to some extent obviate the need for template constraints, as you can essentially put the overloads for a templated function within a single function. e.g.
R find(alias pred = "a == b", R, E)(R haystack, E needle)
{
    static if (isInputRange!R &&
               is(typeof(binaryFun!pred(haystack.front, needle)) : bool))
    {...}
    else static if (isForwardRange!R && isForwardRange!E &&
                    is(typeof(binaryFun!pred(haystack.front, needle.front)) : bool) &&
                    !isRandomAccessRange!R)
    {...}
}
but that still results in nastier errors when compilation fails and actually makes it so that you can't overload the template (at least with D's implementation), because overloading is determined before the template is instantiated. So, you can use static if to specialize pieces of a template implementation, but it doesn't quite get you enough of what template constraints get you to not need template constraints (or something similar).
Rather, static if is excellent for doing stuff like specializing only a piece of your function's implementation or for making it so that a range type can properly inherit the attributes of the range type that it's wrapping. For instance, if you call std.algorithm.map on an array of integers, the resultant range can have slicing (because the source range does), whereas if you called map on a range which didn't have slicing (e.g. the ranges returned by std.algorithm.filter can't have slicing), then the resultant ranges won't have slicing. In order to do that, map uses static if to compile in opSlice only when the source range supports it. Currently, map's code that does this looks like
static if (hasSlicing!R)
{
    static if (is(typeof(_input[ulong.max .. ulong.max])))
        private alias opSlice_t = ulong;
    else
        private alias opSlice_t = uint;

    static if (hasLength!R)
    {
        auto opSlice(opSlice_t low, opSlice_t high)
        {
            return typeof(this)(_input[low .. high]);
        }
    }
    else static if (is(typeof(_input[opSlice_t.max .. $])))
    {
        struct DollarToken{}
        enum opDollar = DollarToken.init;

        auto opSlice(opSlice_t low, DollarToken)
        {
            return typeof(this)(_input[low .. $]);
        }

        auto opSlice(opSlice_t low, opSlice_t high)
        {
            return this[low .. $].take(high - low);
        }
    }
}
This is code in the type definition of map's return type, and whether that code is compiled in or not depends entirely on the results of the static ifs, none of which could be replaced with template specializations based on specific types without having to write a new specialized template for map for every new type that you use with it (which obviously isn't tenable). In order to compile in code based on attributes of types rather than with specific types, you really need something like static if (which C++ does not currently have).
The third major item which C++ is lacking (and which I've more or less touched on throughout) is type introspection. The fact that you can do something like is(typeof(binaryFun!pred(haystack.front, needle)) : bool) or isForwardRange!Range is crucial. Without the ability to check whether a particular type has a particular set of attributes or that a particular piece of code compiles, you can't even write the conditions which template constraints and static if use. For instance, std.range.isInputRange looks something like this
template isInputRange(R)
{
    enum bool isInputRange = is(typeof(
    {
        R r = void;       // can define a range object
        if (r.empty) {}   // can test for empty
        r.popFront();     // can invoke popFront()
        auto h = r.front; // can get the front of the range
    }));
}
It checks that a particular piece of code compiles for the given type. If it does, then that type can be used as an input range. If it doesn't, then it can't. AFAIK, it's impossible to do anything even vaguely like this in C++. But to sanely implement ranges, you really need to be able to do stuff like have isInputRange or test whether a particular type compiles with sort - is(typeof(sort(myRange))). Without that, you can't specialize implementations based on what types of operations a particular range supports, you can't properly forward the attributes of a range when wrapping it (and range functions wrap their arguments in new ranges all the time), and you can't even properly protect your function against being compiled with types which won't work with it. And, of course, the results of static if and template constraints also affect the type introspection (as they affect what will and won't compile), so the three features are very much interconnected.
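In fairness, C++11 expression SFINAE can approximate this sort of check, but it has to be hand-rolled per property and is far clumsier than D's is(typeof(...)); a sketch of such a detector (hypothetical name has_pop_front):

#include <list>
#include <type_traits>
#include <utility>

// Does T have a member pop_front()? Hand-rolled C++11 detection.
template <typename T>
class has_pop_front {
    template <typename U>
    static auto test(int) -> decltype(std::declval<U&>().pop_front(), std::true_type{});
    template <typename>
    static std::false_type test(...);
public:
    static constexpr bool value = decltype(test<T>(0))::value;
};

static_assert(has_pop_front<std::list<int>>::value, "list has pop_front");
static_assert(!has_pop_front<int>::value, "int does not");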
Really, the main reasons that ranges don't work very well in C++ are the same reasons that metaprogramming in C++ is primitive in comparison to metaprogramming in D. AFAIK, there's no reason that these features (or similar ones) couldn't be added to C++ and fix the problem, but until C++ has metaprogramming capabilities similar to those of D, ranges in C++ are going to be seriously impaired.
Other features such as mixins and Uniform Function Call Syntax would also help, but they're nowhere near as fundamental. Mixins would help primarily with reducing code duplication, and UFCS helps primarily with making it so that generic code can just call all functions as if they were member functions so that if a type happens to define a particular function (e.g. find) then that would be used instead of the more general, free function version (and the code still works if no such member function is declared, because then the free function is used). UFCS is not fundamentally required, and you could even go the opposite direction and favor free functions for everything (like C++11 did with begin and end), though to do that well, it essentially requires that the free functions be able to test for the existence of the member function and then call the member function internally rather than using their own implementations. So, again you need type introspection along with static if and/or template constraints.
As much as I love ranges, at this point, I've pretty much given up on attempting to do anything with them in C++, because the features to make them sane just aren't there. But if other folks can figure out how to do it, all the more power to them. Regardless of ranges though, I'd love to see C++ gain features such as template constraints, static if, and type introspection, because without them, metaprogramming is way less pleasant, to the point that while I do it all the time in D, I almost never do it in C++.

Will C++ compiler generate code for each template type?

I have two questions about templates in C++. Let's imagine I have written a simple List and now I want to use it in my program to store pointers to different object types (A*, B* ... ALot*). My colleague says that for each type there will be generated a dedicated piece of code, even though all pointers in fact have the same size.
If this is true, can somebody explain to me why? For example, in Java generics have the same purpose as templates for pointers in C++. Generics are only used for compile-time type checking and are erased during compilation. And of course the same byte code is used for everything.
The second question is: will dedicated code also be generated for char and short (considering that they both have the same size and there are no specializations)?
If this makes any difference, we are talking about embedded applications.
I have found a similar question, but it did not completely answer my question: Do C++ template classes duplicate code for each pointer type used?
Thanks a lot!
I have two questions about templates in C++. Let's imagine I have written a simple List and now I want to use it in my program to store pointers to different object types (A*, B* ... ALot*). My colleague says that for each type there will be generated a dedicated piece of code, even though all pointers in fact have the same size.
Yes, this is equivalent to having each version written out separately.
Some linkers will detect the identical functions and eliminate them. Some libraries are aware that their linker doesn't have this feature, and factor out common code into a single implementation, leaving only a casting wrapper around the common code. I.e., a std::vector<T*> specialization may forward all work to a std::vector<void*>, then do casting on the way out.
Now, comdat folding is delicate: it is relatively easy to make functions you think are identical, but end up not being the same, so two functions are generated. As a toy example, you could go off and print the typename via typeid(x).name(). Now each version of the function is distinct, and they cannot be eliminated.
In some cases, you might do something like this thinking that it is a run time property that differs, and hence identical code will be created, and the identical functions eliminated -- but a smart C++ compiler might figure out what you did, use the as-if rule and turn it into a compile-time check, and block not-really-identical functions from being treated as identical.
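A sketch of that "single common implementation" technique (hypothetical types, not how any particular standard library is actually written): the pointer-typed wrapper is a thin casting layer over one shared void* container, so only one copy of the real code needs to exist.

#include <cstddef>
#include <vector>

// One non-template implementation shared by every pointer instantiation.
class PtrListImpl {
public:
    void push_back(void* p) { data_.push_back(p); }
    void* at(std::size_t i) const { return data_[i]; }
    std::size_t size() const { return data_.size(); }
private:
    std::vector<void*> data_;
};

// Thin casting wrapper: the per-T code is only the casts, which inline away.
template <typename T>
class PtrList {
public:
    void push_back(T* p) { impl_.push_back(p); }
    T* at(std::size_t i) const { return static_cast<T*>(impl_.at(i)); }
    std::size_t size() const { return impl_.size(); }
private:
    PtrListImpl impl_;
};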
If this is true, can somebody explain to me why? For example, in Java generics have the same purpose as templates for pointers in C++. Generics are only used for compile-time type checking and are erased during compilation. And of course the same byte code is used for everything.
No, they aren't. Generics are roughly equivalent to the C++ technique of type erasure, such as what std::function<void()> does to store any callable object. In C++, type erasure is often done via templates, but not all uses of templates are type erasure!
The things that C++ does with templates that are not in essence type erasure are generally impossible to do with Java generics.
In C++, you can create a type erased container of pointers using templates, but std::vector doesn't do that -- it creates an actual container of pointers. The advantage to this is that all type checking on the std::vector is done at compile time, so there doesn't have to be any run time checks: a safe type-erased std::vector may require run time type checking and the associated overhead involved.
The second question is: will dedicated code also be generated for char and short (considering that they both have the same size and there are no specializations)?
They are distinct types. I can write code that will behave differently with a char or short value. As an example:
std::cout << x << "\n";
with x being a short, this prints an integer whose value is x -- with x being a char, this prints the character corresponding to x.
Now, almost all template code exists in header files, and is implicitly inline. While inline doesn't mean what most folk think it means, it does mean that the compiler can hoist the code into the calling context easily.
If this makes any difference, we are talking about embedded applications.
What really makes a difference is what your particular compiler and linker is, and what settings and flags they have active.
The answer is maybe. In general, each instantiation of a template is a unique type, with a unique implementation, and will result in a totally independent instance of the code. Merging the instances is possible, but would be considered "optimization" (under the "as if" rule), and this optimization isn't widespread.
With regards to comparisons with Java, there are several points to keep in mind:
C++ uses value semantics by default. An std::vector, for example, will actually insert copies. And whether you're copying a short or a double does make a difference in the generated code. In Java, short and double will be boxed, and the generated code will clone a boxed instance in some way; cloning doesn't require different code, since it calls a virtual function of Object, but physically copying does.
C++ is far more powerful than Java. In particular, it allows comparing things like the addresses of functions, and it requires that the functions in different instantiations of templates have different addresses. Usually, this is not an important point, and I can easily imagine a compiler with an option which tells it to ignore this point, and to merge instances which are identical at the binary level. (I think VC++ has something like this.)
Another issue is that the implementation of a template in C++ must be present in the header file. In Java, of course, everything must be present, always, so this issue affects all classes, not just templates. This is, of course, one of the reasons why Java is not appropriate for large applications. But it means that you don't want any complicated functionality in a template; doing so loses one of the major advantages of C++, compared to Java (and many other languages). In fact, it's not rare, when implementing complicated functionality in templates, to have the template inherit from a non-template class which does most of the implementation in terms of void*. While implementing large blocks of code in terms of void* is never fun, it does have the advantage of offering the best of both worlds to the client: the implementation is hidden in compiled files, invisible in any way, shape or manner to the client.

Why is C++11 constexpr so restrictive?

As you probably know, C++11 introduces the constexpr keyword.
C++11 introduced the keyword constexpr, which allows the user to guarantee that a function or object constructor is a compile-time constant.
[...]
This allows the compiler to understand, and verify, that [function name] is a compile-time constant.
My question is why there are such strict restrictions on the form of the functions that can be declared constexpr. I understand the desire to guarantee that the function is pure, but consider this:
The use of constexpr on a function imposes some limitations on what that function can do. First, the function must have a non-void return type. Second, the function body cannot declare variables or define new types. Third, the body may only contain declarations, null statements and a single return statement. There must exist argument values such that, after argument substitution, the expression in the return statement produces a constant expression.
That means that this pure function is illegal:
constexpr int maybeInCppC1Y(int a, int b)
{
    if (a > 0)
        return a + b;
    else
        return a - b;
    // can be written as return (a > 0) ? (a + b) : (a - b); but that isn't the point
}
Also you can't define local variables... :(
So I'm wondering: is this a design decision, or do compilers suck when it comes to proving that a function is pure?
The reason you'd need to write statements instead of expressions is that you want to take advantage of the additional capabilities of statements, particularly the ability to loop. But to be useful, that would require the ability to declare variables (also banned).
If you combine a facility for looping, with mutable variables, with logical branching (as in if statements) then you have the ability to create infinite loops. It is not possible to determine if such a loop will ever terminate (the halting problem). Thus some sources would cause the compiler to hang.
By using recursive pure functions it is possible to cause infinite recursion, which can be shown to be equivalently powerful to the looping capabilities described above. However, C++ already has that problem at compile time - it occurs with template expansion - and so compilers already have to have a switch for "template stack depth" so they know when to give up.
So the restrictions seem designed to ensure that this problem (of determining if a C++ compilation will ever finish) doesn't get any thornier than it already is.
The rules for constexpr functions are designed such that it's impossible to write a constexpr function that has any side-effects.
By requiring constexpr to have no side-effects it becomes impossible for a user to determine where/when it was actually evaluated. This is important since constexpr functions are allowed to happen at both compile time and run time at the discretion of the compiler.
If side-effects were allowed then there would need to be some rules about the order in which they would be observed. That would be incredibly difficult to define - even harder than the static initialisation order problem.
A relatively simple set of rules for guaranteeing these functions to be side-effect free is to require that they be just a single expression (with a few extra restrictions on top of that). This sounds limiting initially and rules out the if statement as you noted. Whilst that particular case would have no side-effects, it would have introduced extra complexity into the rules, and given that you can write the same things using the ternary operator or recursion, it's not really a huge deal.
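For instance, the loop- and branch-free shape C++11 demands can still express non-trivial computations via the conditional operator plus recursion; this is legal C++11:

// Legal C++11 constexpr: branching via ?: and iteration via recursion.
constexpr int factorial(int n)
{
    return n <= 1 ? 1 : n * factorial(n - 1);
}

static_assert(factorial(5) == 120, "evaluated at compile time");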
n2235 is the paper that proposed the constexpr addition to C++. It discusses the rationale for the design - the relevant quote seems to be this one from a discussion on destructors, but it is relevant generally:
The reason is that a constant-expression is intended to be evaluated by the compiler at translation time just like any other literal of built-in type; in particular no observable side-effect is permitted.
Interestingly, the paper also mentions that a previous proposal suggested that the compiler figure out automatically which functions were constexpr without the new keyword, but this was found to be unworkably complex, which seems to support my suggestion that the rules were designed to be simple.
(I suspect there will be other quotes in the references cited in the paper, but this covers the key point of my argument about the no side-effects)
Actually the C++ standardization committee is thinking about removing several of these constraints for C++14. See the following working document: http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2013/n3597.html
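For illustration, this is the kind of body that relaxation permits (and that C++14 ultimately accepted): local variables, loops and if statements are fine as long as the evaluation has no side effects visible outside the call.

// Valid C++14 constexpr (ill-formed under the C++11 rules):
constexpr int factorial(int n)
{
    int result = 1;               // local variable
    for (int i = 2; i <= n; ++i)  // loop
        result *= i;
    return result;
}

static_assert(factorial(5) == 120, "still a constant expression");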
The restrictions could certainly be lifted quite a bit without enabling code which cannot be executed during compile time, or which cannot be proven to always halt. However I guess it wasn't done because
it would complicate the compiler for minimal gain. C++ compilers are quite complex as is
specifying exactly how much is allowed without violating the restrictions above would have been time consuming, and given that desired features have been postponed in order to get the standard out of the door, there probably was little incentive to add more work (and further delay of the standard) for little gain
some of the restrictions would have been either rather arbitrary or rather complicated (especially on loops, given that C++ doesn't have the concept of a native incrementing for loop, but both the end condition and the increment code have to be explicitly specified in the for statement, making it possible to use arbitrary expressions for them)
Of course, only a member of the standards committee could give an authoritative answer whether my assumptions are correct.
I think constexpr is just for const objects. I mean: you can now have static const objects like String::empty_string constructed statically (without hacking!). This may reduce the time spent before main() is called. And static const objects may have functions like .length(), operator==, ... which is why the 'expr' part is needed. In C you can create static constant structs like below:
static const Foos foo = { .a = 1, .b = 2, };
The Linux kernel has tons of structs of this kind. In C++ you can now do this with constexpr.
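A sketch of the C++11 equivalent, assuming a simple hypothetical type that is given a constexpr constructor so the object can be initialized at compile time instead of before main():

struct Foos {
    int a;
    int b;
    constexpr Foos(int a_, int b_) : a(a_), b(b_) {}
    constexpr int sum() const { return a + b; } // usable in constant expressions
};

// Statically initialized; no dynamic-initialization code runs before main().
constexpr Foos foo(1, 2);
static_assert(foo.sum() == 3, "computed at compile time");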
Note: I'm not sure, but in my view the code below should not be accepted either, just like the if version:
constexpr int maybeInCppC1Y(int a, int b) { return (a > 0) ? (a + b) : (a - b); }

Whyever **not** declare a function to be `constexpr`?

Any function that consists of a return statement only could be declared constexpr and would thus allow being evaluated at compile time if all arguments are constexpr and only constexpr functions are called in its body. Is there any reason not to declare any such function constexpr?
Example:
constexpr int sum(int x, int y) { return x + y; }
constexpr int i = 10;
static_assert(sum(i, 13) == 23, "sum correct");
Could anyone provide an example where declaring a function constexpr would do any harm?
Some initial thoughts:
Even if there should be no good reason for ever declaring a function not constexpr, I could imagine that the constexpr keyword has a transitional role: its absence in code that does not need compile-time evaluations would allow compilers that do not implement compile-time evaluations still to compile that code (but to fail reliably on code that needs them, as made explicit by using constexpr).
But what I do not understand: if there should be no good reason for ever declaring a function not constexpr, why is not every function in the standard library declared constexpr? (You cannot argue that it is not done yet because there was not sufficient time to do it, because doing it for all is a no-brainer - contrary to deciding for every single function whether to make it constexpr or not.)
--- I am aware that N2976 deliberately does not require constexpr constructors for many standard library types such as the containers, as this would be too limiting for possible implementations. Let's exclude them from the argument and just wonder: once a type in the standard library actually has a constexpr constructor, why is not every function operating on it declared constexpr?
In most cases you also cannot argue that you may prefer not to declare a function constexpr simply because you do not envisage any compile-time usage: because if others eventually use your code, they may see such a use that you do not. (But granted, this does not apply to type traits and the like, of course.)
So I guess there must be a good reason and a good example for deliberately not declaring a function constexpr?
(With "every function" I always mean: every function that meets the requirements for being constexpr, i.e., is defined as a single return statement, takes only arguments of types with constexpr constructors and calls only constexpr functions. Since C++14, much more is allowed in the body of such a function: e.g., C++14 constexpr functions may use local variables and loops, so an even wider class of functions could be declared constexpr.)
The question Why does std::forward discard constexpr-ness? is a special case of this one.
Functions can only be declared constexpr if they obey the rules for constexpr --- no dynamic casts, no memory allocation, no calls to non-constexpr functions, etc.
Declaring a function in the standard library as constexpr requires that ALL implementations obey those rules.
Firstly, this requires checking for each function that it can be implemented as constexpr, which is a long job.
Secondly, this is a big constraint on the implementations, and will outlaw many debugging implementations. It is therefore only worth it if the benefits outweigh the costs, or the requirements are sufficiently tight that the implementation pretty much has to obey the constexpr rules anyway. Making this evaluation for each function is again a long job.
I think what you're referring to is called partial evaluation. What you're touching on is that some programs can be split into two parts - a piece that requires runtime information, and a piece that can be done without any runtime information - and that in theory you could just fully evaluate the part of the program that doesn't need any runtime information before you even start running the program. There are some programming languages that do this. For example, the D programming language has an interpreter built into the compiler that lets you execute code at compile-time, provided that it meets certain restrictions.
There are a few main challenges in getting partial evaluation working. First, it dramatically complicates the logic of the compiler, because the compiler will need to have the ability to simulate all of the operations that you could put into an executable program at compile-time. This, in the worst case, requires you to have a full interpreter inside of the compiler, taking a difficult problem (writing a good C++ compiler) and making it orders of magnitude harder to do.
I believe that the reason for the current specification about constexpr is simply to limit the complexity of compilers. The cases it's limited to are fairly simple to check. There's no need to implement loops in the compiler (which could cause a whole other slew of problems, like what happens if you get an infinite loop inside the compiler). It also avoids the compiler potentially having to evaluate statements that could cause segfaults at runtime, such as following a bad pointer.
Another consideration to keep in mind is that some functions have side-effects, such as reading from cin or opening a network connection. Functions like these fundamentally can't be optimized at compile-time, since doing so would require knowledge only available at runtime.
To summarize, there's no theoretical reason you couldn't partially evaluate C++ programs at compile-time. In fact, people do this all the time. Optimizing compilers, for example, are essentially programs that try to do this as much as possible. Template metaprogramming is one instance where C++ programmers try to execute code inside the compiler, and it's possible to do some great things with templates partially because the rules for templates form a functional language, which the compiler has an easier time implementing. Moreover, if you think of the tradeoff between compiler author hours and programming hours, template metaprogramming shows that if you're okay making programmers bend over backwards to get what they want, you can build a pretty weak language (the template system) and keep the language complexity simple. (I say "weak" as in "not particularly expressive," not "weak" in the computability theory sense).
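As a small illustration of that point, the same compile-time computation can be written in the template "functional language"; the compiler evaluates it entirely during translation, but only by bending the type system into service:

// Compile-time Fibonacci in the template metalanguage: recursion is expressed
// through specialization, and the compiler does all the evaluation.
template <unsigned N> struct Fib {
    static const unsigned value = Fib<N - 1>::value + Fib<N - 2>::value;
};
template <> struct Fib<1> { static const unsigned value = 1; };
template <> struct Fib<0> { static const unsigned value = 0; };

static_assert(Fib<10>::value == 55, "evaluated inside the compiler");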
Hope this helps!
If the function has side effects, you would not want to mark it constexpr. Example
I can't get any unexpected results from that, actually it looks like gcc 4.5.1 just ignores constexpr

Should const functionality be expanded?

EDIT: this question could probably use a more apropos title. Feel free to suggest one in the comments.
In using C++ with a large class set I once came upon a situation where const became a hassle, not because of its functionality, but because it's got a very simplistic definition. Its applicability to an integer or string is obvious, but for more complicated classes there are often multiple properties that could be modified independently of one another. I imagine many people forced to learn what the mutable keyword does might have had similar frustrations.
The most apparent example to me would be a matrix class, representing a 3D transform. A matrix will represent both a translation and a rotation each of which can be changed without modifying the other. Imagine the following class and functions with the hypothetical addition of 'multi-property const'.
class Matrix {
    void translate(const Vector & translation) const("rotation");
    void rotate(const Quaternion & rotation) const("translation");
};

void spin180(const("translation") Matrix & matrix);
void moveToOrigin(const("rotation") Matrix & matrix);
Or imagine predefined const keywords like "_comparable" which allow you to define functions that modify the object at will as long as you promise not to change anything that would affect the sort order of the object, easing the use of objects in sorted containers.
What would be some of the pros and cons of this kind of functionality? Can you imagine a practical use for it in your code? Is there a good approach to achieving this kind of functionality with the current const keyword functionality?
Bear in mind
I know such a language feature could easily be abused. The same can be said of many C++ language features
Like const I would expect this to be a strictly compile-time bit of functionality.
If you already think const is the stupidest thing since sliced mud, I'll take it as read that you feel the same way about this. No need to post, thanks.
EDIT:
In response to SBK's comment about member markup, I would suggest that you don't have any. For classes / members marked const, it works exactly as it always has. For anything marked const("foo") it treats all the members as mutable unless otherwise marked, leaving it up to the class author to ensure that his functions work as advertised. Besides, in a matrix represented as a 2D array internally, you can't mark the individual fields as const or non-const for translation or rotation because all the degrees of freedom are inside a single variable declaration.
Scott Meyers was working on a system of expanding the language with arbitrary constraints (using templates).
So you could say a function/method was Verified, ThreadSafe (etc., or any other constraints you liked). Then such constrained functions could only call other functions which had at least the same (or more) constraints (e.g. a method marked ThreadSafe could only call another method marked ThreadSafe, unless the coder explicitly cast away that constraint).
Here is the article:
http://www.artima.com/cppsource/codefeatures.html
The cool concept I liked was that the constraints were enforced at compile time.
In cases where you have groups of members that are either const together or mutable together, wouldn't it make as much sense to formalize that by putting them in their own class together? That can be done today without changing the language.
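A sketch of that idea with the matrix example from the question (hypothetical member types): group the independently mutable pieces into their own sub-objects, and let each function take only the piece it needs, const or not.

struct Rotation    { double x = 0, y = 0, z = 0, w = 1; }; // e.g. a quaternion
struct Translation { double x = 0, y = 0, z = 0; };

struct Transform {
    Rotation    rotation;
    Translation translation;
};

// The signature itself now says "reads the rotation, writes the translation".
void moveToOrigin(const Rotation& rotation, Translation& translation);

// Call site: moveToOrigin(t.rotation, t.translation);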
Refinement
When an ADT is indistinguishable from itself after some operation the const property holds for the entire ADT. You wish to define partial constness.
In your sort order example you are asserting that operator< of the ADT is invariant under some other operation on the ADT. Your ad-hoc const names such as "rotation" are defined by the set of operations for which the ADT is invariant. We could leave the invariant unnamed and just list the operations that are invariant inside const(). Due to overloading functions would need to be specified with their full declaration.
void set_color (Color c) const (operator<, std::string get_name());
void set_name (std::string name) const (Color get_color());
So the const names can be seen as a formalism - their existence or absence doesn't change the power of the system. But 'typedef' could be used to name a list of invariants if that proves useful.
typedef const(operator<, std::string get_name()) DontWorryOnlyNameChanged;
It would be hard to think of good names for many cases.
Usefulness
The value in const is that the compiler can check it. This is a different kind of const.
But I see one big flaw in all of this. From your matrix example I might incorrectly infer that rotation and translation are independent and therefore commutative. But there is an obvious data dependency and matrix multiplication is not commutative. Interestingly, this is an example where partial constness is invariant under repeated application of one or the other but not both. 'translate' would be surprised to find that its object had been translated due to a rotation after a previous translation. Perhaps I am misunderstanding the meaning of rotate and translate. But that's the problem: constness now seems open to interpretation. So we need ... drum roll ... Logic.
Logic
It appears your proposal is analogous to dependent typing. With a powerful enough type system almost anything is provable at compile time. Your interest is in theorem provers and type theory, not C++. Look into intuitionistic logic, sequent calculus, Hoare logic, and Coq.
Now I've come full circle. Naming makes sense again,
int times_2(int n) const("divisible_by_3");
since divisible_by_3 is actually a type. Here's a prime number type in Qi. Welcome to the rabbit hole. And I pretended to be getting somewhere. What is this place? Why are there no clocks in here?
Such high level concepts are useful for a programmer.
If I wanted to make const-ness fine-grained, I'd do it structurally:
struct C { int x; int y; };
C const<x> *c;
C const<x,y> *d;
C const& e;
C &f;
c=&e; // fail, c->y is mutable via c
d=&e;
c=&f;
d=c;
If you allow me to express, for a given scope, a preference that maximally const methods be preferred (normal overloading would prefer the non-const method if my ref/pointer is non-const), then the compiler or a standalone static analysis could deduce the sets of must-be-const members for me.
Of course, this is all moot unless you plan on implementing a preprocessor that takes the nice high-level finely grained const C++ and translates it into casting-away-const C++. We don't even have C++0x yet.
I don't think that you can achieve this as strictly compile-time functionality.
I can't think of a good example so this strictly functional one will have to do:
struct Foo {
    int bar;
};

bool operator<(Foo l, Foo r) {
    return (l.bar & 0xFF) < (r.bar & 0xFF);
}
Now I put some Foos into a sorted set. Obviously the lower 8 bits of bar must remain unchanged so that the order is preserved. The upper bits can however be freely changed. This means the Foos in the set aren't const but aren't mutable either. However, I don't see any way you could describe this level of constness in a generally useful form without using runtime checking.
If you formalized the requirements, I could even imagine that you could prove that no compiler capable of doing this (at compile time) could exist.
It could be interesting, but one of the useful features of const's simple definition is that the compiler can check it. If you start adding arbitrary constraints, such as "cannot change sort order", the compiler as it stands now cannot check it. Further, the problem of compile-time checking of arbitrary constraints is, in the general case, impossible to solve due to the halting problem. I would rather see the feature remain limited to what can actually be checked by a compiler.
There is work on enabling compilers to check more and more things — sophisticated type systems (including dependent type systems), and work such as that done in SPARK Ada, allowing for compiler-aided verification of various constraints — but they all eventually hit the theoretical limits of computer science.
I don't think the core language, and especially the const keyword, would be the right place for this. The concept of const in C++ is meant to express the idea that a particular action will not modify a certain area of memory. It is a very low-level idea.
What you are proposing is a logical const-ness that has to do with the high-level semantics of your program. The main problem, as I see it, is that semantics can vary so much between different classes and different programs that there would be no way for there to be a one-size-fits all language construct for this.
What would need to happen is that the programmer would need to be able to write validation code that the compiler would run in order to check that particular operations met his definition of semantic (or "logical") const-ness. When you think about it, though, such code, if it ran at compile-time, would not be very different from a unit test.
Really what you want is for the compiler to test whether functions adhere to a particular semantic contract. That's what unit tests are for. So what you're asking is that there be a language feature that automatically runs unit tests for you during the compilation step. I think that's not terribly useful, given how complicated the system would need to be.