This question is just out of curiosity. In recursive templates, if we forget to provide one particular specialization, the compiler performs a large number of instantiations and then eventually stops with an error such as:
error: incomplete type ‘X<-0x000000000000001ca>’ used in nested name specifier
In certain cases, the compilation never finishes. For example, see the code below (just for illustration; compiled with gcc 4.4.1):
template<int I>
struct Infinite
{
enum { value = (I & 0x1)? Infinite<I+1>::value : Infinite<I-1>::value };
};
int main ()
{
int i = Infinite<1>::value;
}
Shouldn't the compiler be smart enough to stop at some time?
Edit: The compilation error shown above is for other code. For the sample code, compilation never stops (however, I get to see such errors in between)
If I understand your question correctly, you want the compiler to recognise that it will never stop iterating. Apart from simply stopping after a fixed nesting depth, what you want is provably impossible: as far as I can see, you can express any Turing machine in this fashion (at least the templates in D are Turing complete).
So if you could build a compiler that recognises, before actually trying, that it will nest types forever, you would have decided the halting problem, which is undecidable for Turing machines.
However, I could very well be mistaken that you can put any computation in the parameter list (but simulating a register machine appears to be possible, as we can encode each register in a separate integer template parameter; yes, int is finite, but large enough to be practically unbounded).
Getting the parser into an infinite loop using templates is not new.
// Stresses the compiler infinitely
// from: http://www.fefe.de/c++/c%2b%2b-talk.pdf
template<class T> struct Loop { Loop<T*> operator->(); };
Loop<int> i, j = i->hooray;
The compiler does what you ask it to do. You asked it to engage in infinite recursion, and it did exactly that. If you want it to "stop at some time", you have to ask it to stop at "some time" and tell it exactly what "some time" means.
Template recursion is no different from any other recursion in a C++ program: it is your responsibility to specify where the recursion bottoms out.
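To make "bottoming out" concrete, here is a minimal sketch using a compile-time factorial. The name Factorial is hypothetical and not from the question; the point is only that the explicit specialization for 0 is what ends the chain of instantiations.

```cpp
// Hypothetical example, not from the question: compile-time factorial.
template <int N>
struct Factorial {
    // Recursive case: instantiating Factorial<N> requires Factorial<N - 1>.
    static constexpr unsigned long long value = N * Factorial<N - 1>::value;
};

// Base case: this is where the recursion bottoms out. Without it, N - 1
// would keep decreasing below zero and the compiler would keep
// instantiating until it hits its own depth limit.
template <>
struct Factorial<0> {
    static constexpr unsigned long long value = 1;
};

static_assert(Factorial<5>::value == 120, "5! should be 120");
```

Removing the Factorial<0> specialization turns this into the same kind of runaway instantiation as in the question.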
Shouldn't the compiler be smart enough to stop at some time?
How do you define the phrase "at some time"? How would the compiler know your definition of it? How would it know when it must stop if you don't tell it explicitly? You have to tell it by defining specialization(s) of the non-stopping class template (what you've written is a non-stopping class template).
In your case, you need two specializations of the class template, one in each direction (increasing and decreasing). Something like this:
template<>
struct Infinite<100> //this is to stop template with <I+1> argument
{
enum { value = 678678 }; //see, it doesn't use Infinite<> any further!
};
template<>
struct Infinite<-100> //this is to stop template with <I-1> argument
{
enum { value = -67878 }; //see, it too doesn't use Infinite<> any further!
};
Essentially I want to repeat a line of code while changing the value of a single variable, just like a basic for loop.
Right now I'm looking at this:
#define UNROLL2(body) n = 0; body n++; body
int n;
UNROLL2(std::cout << "hello " << n << "\n";)
This works all right, but I have one issue with it.
It relies on the compiler to optimize out the iteration of n and hopefully turn the variable indices into constants.
Is there a better way to construct such a macro? One that wouldn't rely on compiler optimizations?
At first I thought I could just use a defined value as n and redefine it as the macro churns on, but that can't be done.
Also, yes I'm aware most answers on similar topics despise macros and it is theoretically possible to unroll loops with templates. Using MSVC though, I found the results to be inconsistent if the code body requires capturing and while I could make it work without captures, it would make everything look far more confusing than just using macros.
In your question, you presented an example that performed console I/O, and I made the comment that console I/O has an overhead substantially larger than that of a loop construct (conditional branch), so it makes very little sense for this type of thing to be unrolled. This is a case where a smart optimizer would probably not unroll, because the increase in code size would not pay dividends in speed. However, based on your follow-up comments, it appears that this was just a little throw-away example, and that I shouldn't have focused so much on the specifics.
In fact, I completely understand what you are saying about MSVC not unrolling loops. Even with optimizations enabled, it tends not to do loop-unrolling unless you are using profile-guided optimizations. Something as trivial as:
void Leaf();
void MyFunction()
{
for (int i = 0; i < 2; ++i) { Leaf(); }
}
gets transformed into:
push rbx
sub rsp, 32
mov ebx, 2 ; initialize loop counter to 2
npad 5
Loop:
call Leaf
sub rbx, 1 ; decrement loop counter
jne SHORT Loop ; loop again if loop counter != 0
add rsp, 32
pop rbx
ret
even at /O2, which is just pathetic.
I discovered this a while ago, and looked to see if it had already been reported as a defect. Unfortunately, Microsoft recently performed a massive purge of all their old bugs from Connect, so you can't go back very far in the archives, but I did find this similar bug. That one got closed as being related to intrinsics, which was either a misunderstanding or a cop-out, so I opened a new, simplified one, based on the code shown above. I'm still awaiting a meaningful response. Seems like pretty low-hanging fruit to me, as far as optimizations go, and all competing compilers will do this, so this is extremely embarrassing for Microsoft's compiler.
So yeah, if you can't switch compilers, and PGO isn't helping you (or you can't use it either), I totally understand why you might want to do some type of manual unrolling. But I don't really understand why you are template-averse. The reason to use templates isn't about despising macros, but rather because they provide a much cleaner, more powerful syntax, while equally guaranteeing that they will be evaluated/expanded at compile time.
You can have something like:
template <int N>
struct Unroller
{
template <typename T>
void operator()(T& t)
{
t();
Unroller<N-1>()(t);
}
};
template <>
struct Unroller<0>
{
template <typename T>
void operator()(T&)
{ }
};
and combine it with a functor that can be as simple or as complex as you need it to be:
struct MyOperation
{
inline void operator()() { Leaf(); }
};
so that, with the magic of recursive template expansion, you can do:
void MyFunction()
{
MyOperation op;
Unroller<16>()(op);
}
and get precisely the output you expect:
sub rsp, 40
call Leaf
call Leaf
call Leaf
call Leaf
call Leaf
call Leaf
call Leaf
call Leaf
call Leaf
call Leaf
call Leaf
call Leaf
call Leaf
call Leaf
call Leaf
add rsp, 40
jmp Leaf
Naturally, this is a simple example, but it shows you that the optimizer is even able to do a tail-call optimization here. Because the template magic works with a functor, as I said above, you can make the logic to be unrolled as complicated as it needs to be, adding member variables, etc. It'll all get unrolled because the templates are expanded recursively at compile time.
Literally the only disadvantage that I can find to this is that it bloats the object file a bit with all of the template expansions. In this case, with Unroller<16>, I get 17 different function definitions emitted in the object file. But, aside from a minor impact on compile times, that's no big deal, because they won't be included in the final binary output. It would obviously be better if the compiler's optimizer was smart enough to do this on its own, but until that time, this is a viable solution for holding its hand and forcing it to generate the code you want, and I think it's much cleaner than the macro-based approach.
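The question also asked for the loop index to be a constant in each repetition. Here is a rough sketch of one way to get that, assuming a C++17 compiler; the helper names unroll and unroll_impl are my own and not part of the Unroller shown above. Each call receives the index as a std::integral_constant, so the body can use it where a constant expression is required.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical helpers, not part of the answer above.
// Calls f(integral_constant<size_t, 0>{}), f(integral_constant<size_t, 1>{}), ...
// in order, so the index is a compile-time constant inside each call.
template <typename F, std::size_t... Is>
constexpr void unroll_impl(F& f, std::index_sequence<Is...>) {
    (f(std::integral_constant<std::size_t, Is>{}), ...);  // C++17 fold expression
}

template <std::size_t N, typename F>
constexpr void unroll(F f) {
    unroll_impl(f, std::make_index_sequence<N>{});
}

// Usage sketch: sum the squares of 0..3 with the body written once.
constexpr int sum_of_squares() {
    int total = 0;
    unroll<4>([&total](auto i) {
        // The index is available as a real constant expression here.
        constexpr int n = static_cast<int>(decltype(i)::value);
        total += n * n;
    });
    return total;
}

static_assert(sum_of_squares() == 14, "0 + 1 + 4 + 9");
```

The expansion into N separate calls happens at compile time, but whether those calls end up as straight-line code still depends on the optimizer inlining them.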
This can be done using the Boost Preprocessor library (with the macro BOOST_PP_REPEAT), but please bear in mind that the fact that you can does not mean that you should.
#include <iostream>
#include <boost/preprocessor/repetition/repeat.hpp>
#define DECL(z, n, text) std::cout << "n = " << n << std::endl;
int main()
{
int n = 0;
BOOST_PP_REPEAT(5, DECL, "");
}
I found this example at cppreference.com, and it seems to be the de facto example used throughout StackOverflow:
template<int N>
struct S {
int a[N];
};
Surely, non-type templatization has more value than this example. What other optimizations does this syntax enable? Why was it created?
I am curious, because I have code that is dependent on the version of a separate library that is installed. I am working in an embedded environment, so optimization is important, but I would like to have readable code as well. That being said, I would like to use this style of templating to handle version differences (examples below). First, am I thinking of this correctly, and MOST IMPORTANTLY does it provide a benefit or drawback over using a #ifdef statement?
Attempt 1:
template<int VERSION = 500>
void print (char *s);
template<int VERSION>
void print (char *s) {
std::cout << "ERROR! Unsupported version: " << VERSION << "!" << std::endl;
}
template<>
void print<500> (char *s) {
// print using 500 syntax
}
template<>
void print<600> (char *s) {
// print using 600 syntax
}
OR - Since the template is constant at compile time, could a compiler consider the other branches of the if statement dead code using syntax similar to:
Attempt 2:
template<int VERSION = 500>
void print (char *s) {
if (VERSION == 500) {
// print using 500 syntax
} else if (VERSION == 600) {
// print using 600 syntax
} else {
std::cout << "ERROR! Unsupported version: " << VERSION << "!" << std::endl;
}
}
Would either attempt produce output comparable in size to this?
// assumes VERSION is defined as a preprocessor macro (for example via -DVERSION=500)
void print (char *s) {
#if VERSION == 500
// print using 500 syntax
#elif VERSION == 600
// print using 600 syntax
#else
std::cout << "ERROR! Unsupported version: " << VERSION << "!" << std::endl;
#endif
}
If you can't tell I'm somewhat mystified by all this, and the deeper the explanation the better as far as I'm concerned.
Compilers find dead code elimination easy. That is the case where you have a chain of ifs depending (only) on a template parameter's value or type. All branches must contain valid code, but when compiled and optimized the dead branches evaporate.
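If C++17 is available, `if constexpr` goes one step further than relying on the optimizer: the untaken branches are discarded when the template is instantiated, so they never reach code generation at all. A small sketch under that assumption follows; header_bytes and the returned numbers are placeholders I made up, and only the version values 500 and 600 come from the question.

```cpp
// Hypothetical function; the return values stand in for
// "do it the 500 way" and "do it the 600 way".
template <int VERSION>
constexpr int header_bytes() {
    if constexpr (VERSION == 500) {
        return 16;
    } else if constexpr (VERSION == 600) {
        return 24;
    } else {
        // Only instantiated for other versions, so an unsupported
        // version becomes a compile-time error instead of a run-time message.
        static_assert(VERSION == 500 || VERSION == 600, "Unsupported version");
        return 0;
    }
}

static_assert(header_bytes<500>() == 16, "version 500 path");
static_assert(header_bytes<600>() == 24, "version 600 path");
```

Writing header_bytes<700>() anywhere in the program would fail to compile with the "Unsupported version" message.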
A classic example is a per pixel operation written with template parameters that control details of code flow. The body can be full of branches, yet the compiled output branchless.
Similar techniques can be used to unroll loops (say, scanline loops). Care must be taken to understand the code-size multiplication that can result, especially if your toolchain lacks ICF (identical code folding, aka COMDAT folding), which the gold linker used with gcc and msvc (among others) provide.
Fancier things can also be done, like manual jump tables.
You can do pure compile-time type checks with no runtime behaviour at all, such as dimensional analysis, or distinguishing between points and vectors in n-space.
Enums can be used to name types or switches. Pointers to functions to enable efficient inlining. Pointers to data to allow 'global' state that is mockable, or siloable, or decoupled from implementation. Pointers to strings to allow efficient readable names in code. Lists of integral values for myriads of purposes, like the indexes trick to unpack tuples. Complex operations on static data, like compile time sorting of data in multiple indexes, or checking integrity of static data with complex invariants.
I am sure I missed some.
An obvious optimization is that, when using an integer template parameter, the compiler has a constant rather than a variable:
int foo(size_t); // definition not visible
// vs
template<size_t N>
size_t foo() {return N*N;}
With the template, there's nothing to compute at runtime, and the result may be used as a constant, which can aid other optimizations. You can take this example further by declaring it constexpr, as 5gon12eder mentioned below.
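As a rough illustration of "the result may be used as a constant", here is a constexpr variant, assuming C++11 or later. The names square and Grid are mine, chosen only for this sketch.

```cpp
#include <array>
#include <cstddef>

// Hypothetical constexpr variant of the template above.
template <std::size_t N>
constexpr std::size_t square() { return N * N; }

// Because the result is a constant expression, it can size an array type.
using Grid = std::array<int, square<4>()>;

static_assert(std::tuple_size<Grid>::value == 16, "4 * 4 elements");
```

A function taking the value as an ordinary run-time argument could not be used in that position unless it were itself constexpr and called with a constant.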
Next example:
int foo(double, size_t); // definition not visible
// vs
template<size_t N>
size_t foo(double p) {
double r(p);
for (size_t i(0); i < N; ++i) {
r *= p;
}
return r;
}
Ok. Now the number of iterations of the loop is known. The loop may be unrolled/optimized accordingly, which can be good for size, speed, and eliminating branches.
Also, building on your example, std::array<> exists. std::array<> can be much better than std::vector<> in some contexts, because std::vector<> uses heap allocations and non-local memory.
There's also the possibility that some specializations will have different implementations. You can separate those and (potentially) reduce other referenced definitions.
Of course, templates can also work against you by causing unnecessary duplication in your programs.
Templates also require longer symbol names.
Getting back to your version example: Yes, it's certainly possible that if VERSION is known at compilation, the code which is never executed can be deleted and you may also be able to reduce referenced functions. The primary difference will be that void print (char *s) will have a shorter name than the template (whose symbol name includes all template parameters). For one function, that's counting bytes. For complex programs with many functions and templates, that cost can go up quickly.
There is an enormous range of potential applications of non-typename template parameters. In his book The C++ Programming Language, Stroustrup gives an interesting example that sketches out a type-safe zero-overhead framework for dealing with physical quantities. Basically, the idea is that he writes a template that accepts integers denoting the powers of fundamental physical quantities such as length or mass and then defines arithmetic on them. In the resulting framework, you can add speed with speed or divide distance by time but you cannot add mass to time. Have a look at Boost.Units for an industry-strength implementation of this idea.
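Here is a heavily simplified sketch of that idea, not Stroustrup's or Boost.Units' actual code. The template Quantity and its three exponents (mass, length, time) are my own naming; the integer parameters carry the dimension, and mismatched dimensions simply fail to find a matching operator.

```cpp
#include <type_traits>

// Hypothetical sketch: exponents of mass (M), length (L) and time (T).
template <int M, int L, int T>
struct Quantity {
    double value;
};

// Addition is only defined for operands with identical dimensions.
template <int M, int L, int T>
constexpr Quantity<M, L, T> operator+(Quantity<M, L, T> a, Quantity<M, L, T> b) {
    return {a.value + b.value};
}

// Division subtracts the exponents of the divisor.
template <int M1, int L1, int T1, int M2, int L2, int T2>
constexpr Quantity<M1 - M2, L1 - L2, T1 - T2>
operator/(Quantity<M1, L1, T1> a, Quantity<M2, L2, T2> b) {
    return {a.value / b.value};
}

using Length = Quantity<0, 1, 0>;
using Time   = Quantity<0, 0, 1>;
using Speed  = Quantity<0, 1, -1>;

constexpr Speed average_speed(Length distance, Time elapsed) {
    return distance / elapsed;  // Length / Time has exactly the type Speed
}

// Length{1.0} + Time{1.0};  // would not compile: no matching operator+

static_assert(average_speed(Length{100.0}, Time{20.0}).value == 5.0, "100 / 20");
```

Since Quantity holds nothing but a double, the dimension checking costs nothing at run time.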
For your second question. Any reasonable compiler should be able to produce exactly the same machine code for
#define FOO
#ifdef FOO
do_foo();
#else
do_bar();
#endif
and
#define FOO_P 1
if (FOO_P)
do_foo();
else
do_bar();
except that the second version is much more readable and the compiler can catch errors in both branches simultaneously. Using a template is a third way to generate the same code but I doubt that it will improve readability.
I'm in a position where this design would greatly improve the clarity and maintenance-needs for my code.
What I'm looking for is something like this:
#define MY_MACRO(arg) #if (arg)>0 cout<<((arg)*5.0)<<endl; #else cout<<((arg)/5.0)<<endl; #endif
The idea here:
The pre-processor substitutes different lines of code depending on the compile-time (constant) value of the macro argument. Of course, I know this syntax doesn't work, as the # is seen as the string-ize operator instead of the standard #if, but I think this demonstrates the pre-processor functionality I am trying to achieve.
I know that I could just put a standard if statement in there, and the compiler/runtime would be left to check the value. But this is unnecessary work for the application when arg will always be passed a constant value, like 10.8 or -12.5 that only needs to be evaluated at compile-time.
Performance needs for this number-crunching application require that all unnecessary runtime conditions be eliminated whenever possible, and many constant values and macros have been used (in place of variables and functions) to make this happen. The ability to continue this trend without having to mix pre-processor code with real if conditions would make this much cleaner - and, of course, code-cleanliness is one of the biggest concerns when using macros, especially at this level.
As far as I know, you cannot have #if (or anything similar) inside your macro.
However, if the condition is known at compile time, you can safely use a normal if statement. The compiler will optimise it away (assuming you have optimisations turned on).
This is called "dead code elimination".
Simple, use real C++:
template <bool B> void foo_impl (int arg) { cout << arg*5.0 << endl; }
template < > void foo_impl<false>(int arg) { cout << arg/5.0 << endl; }
template <int I> void foo ( ) { foo_impl< (I>0) >(I); }
[edit]
Or in modern C++, if constexpr.
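A minimal sketch of that, assuming C++17. The function name scaled is made up, and like the template version above it takes an int, because floating-point template parameters such as 10.8 are not allowed before C++20.

```cpp
// Hypothetical single-function version of foo_impl/foo above.
template <int I>
constexpr double scaled() {
    if constexpr (I > 0) {
        return I * 5.0;   // only this branch is compiled when I > 0
    } else {
        return I / 5.0;   // only this branch is compiled otherwise
    }
}

static_assert(scaled<10>() == 50.0, "positive argument is multiplied");
static_assert(scaled<-10>() == -2.0, "non-positive argument is divided");
```

The branch that is not selected is discarded at instantiation time, so no run-time condition remains.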
const int bob = 0;
if(bob)
{
int fred = 6/bob;
}
you will get an error on the line where the divide is done:
"error C2124: divide or mod by zero"
which is lame, because it is just as inevitable that the 'if' check will fail as it is that the divide will result in a division by zero. Quite frankly, I see no reason for the compiler to even evaluate anything in the 'if', except to ensure brace integrity.
Anyway, obviously that example isn't my problem. My problem comes when doing complicated template stuff to try to do as much at compile time as possible; in some cases arguments may be 0.
Is there any way to fix this error, or disable it? Or are there any better workarounds than the following?
Currently the only workaround I can think of (which I've used before when I encountered the same problem with recursive enum access) is to use template specialization to do the 'if'.
Oh yeah, I'm using Visual Studio Professional 2005 SP1 with the vista/win7 fix.
I suppose your compiler tries to optimize the code snippet since bob is defined const, so that the initial value of fred can be determined at compile time. Maybe you can prevent this optimization by declaring bob non-const or using the volatile keyword.
Can you provide more detail on what you're trying to do with templates? Perhaps you can use a specialised template for 0 that does nothing like in the good old Factorial example and avoid the error altogether.
template <int N>
struct Blah
{
enum { value = 6 / N };
};
template <>
struct Blah<0>
{
enum { value = 0 };
};
The problem - and the compiler has no choice in this - is that bob is an Integral Constant Expression, as is 6. Therefore 6/bob is also an ICE, and must be evaluated at compile time.
There's a very simple solution: inline int FredFromBob(int bob) { return 6/bob; } - a function call expression is never an ICE, even if the function is trivial and declared inline.
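For readers on a C++11 or later compiler (so not the Visual Studio 2005 from the question), a guarded constexpr function is another option, sketched below. The name six_over is made up. The conditional operator only evaluates the selected operand, so the division is never evaluated when n is 0, even at compile time.

```cpp
// Hypothetical helper; requires C++11 constexpr, which VS2005 does not have.
constexpr int six_over(int n) {
    return n != 0 ? 6 / n : 0;  // the division is skipped entirely when n == 0
}

static_assert(six_over(3) == 2, "6 / 3");
static_assert(six_over(0) == 0, "guarded case");
```

Unlike the plain inline function, the result can still be used as a template argument or enum initializer.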
Could you give an example where static_assert(...) ('C++11') would solve the problem in hand elegantly?
I am familiar with run-time assert(...). When should I prefer static_assert(...) over regular assert(...)?
Also, in boost there is something called BOOST_STATIC_ASSERT, is it the same as static_assert(...)?
Static assert is used to make assertions at compile time. When the static assertion fails, the program simply doesn't compile. This is useful in different situations, like, for example, if you implement some functionality by code that critically depends on unsigned int object having exactly 32 bits. You can put a static assert like this
static_assert(sizeof(unsigned int) * CHAR_BIT == 32);
in your code. On another platform, with differently sized unsigned int type the compilation will fail, thus drawing attention of the developer to the problematic portion of the code and advising them to re-implement or re-inspect it.
For another example, you might want to pass some integral value as a void * pointer to a function (a hack, but useful at times) and you want to make sure that the integral value will fit into the pointer
int i;
static_assert(sizeof(void *) >= sizeof i);
foo((void *) i);
You might want to assert that char type is signed
static_assert(CHAR_MIN < 0);
or that integral division with negative values rounds towards zero
static_assert(-5 / 2 == -2);
And so on.
Run-time assertions in many cases can be used instead of static assertions, but run-time assertions only work at run-time and only when control passes over the assertion. For this reason a failing run-time assertion may lie dormant, undetected for extended periods of time.
Of course, the expression in static assertion has to be a compile-time constant. It can't be a run-time value. For run-time values you have no other choice but use the ordinary assert.
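To put the two side by side, here is a small sketch. The function bit_count is a made-up example: the property of the platform is checked once by the compiler, while the property of the argument can only be checked when the call actually runs.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Compile-time property of the platform: checked once, when this file is compiled.
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Hypothetical helper: number of bits in a buffer of `bytes` bytes.
inline std::size_t bit_count(std::size_t bytes) {
    // Run-time property of an argument: only checked when this call executes,
    // and only in builds where NDEBUG is not defined.
    assert(bytes <= static_cast<std::size_t>(-1) / CHAR_BIT && "bit count would overflow");
    return bytes * CHAR_BIT;
}
```

If the static_assert fails, no executable is produced at all; if the assert would fail, you only find out on the code path and input that trigger it.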
Off the top of my head...
#include "SomeLibrary.h"
static_assert(SomeLibrary::Version > 2,
"Old versions of SomeLibrary are missing the foo functionality. Cannot proceed!");
class UsingSomeLibrary {
// ...
};
Assuming that SomeLibrary::Version is declared as a static const, rather than being #defined (as one would expect in a C++ library).
Contrast with having to actually compile SomeLibrary and your code, link everything, and run the executable only then to find out that you spent 30 minutes compiling an incompatible version of SomeLibrary.
@Arak, in response to your comment: yes, you can have static_assert just sitting out wherever, from the look of it:
class Foo
{
public:
static const int bar = 3;
};
static_assert(Foo::bar > 4, "Foo::bar is too small :(");
int main()
{
return Foo::bar;
}
$ g++ --std=c++0x a.cpp
a.cpp:7: error: static assertion failed: "Foo::bar is too small :("
I use it to ensure my assumptions about compiler behaviour, headers, libs and even my own code are correct. For example here I verify that the struct has been correctly packed to the expected size.
struct LogicalBlockAddress
{
#pragma pack(push, 1)
Uint32 logicalBlockNumber;
Uint16 partitionReferenceNumber;
#pragma pack(pop)
};
BOOST_STATIC_ASSERT(sizeof(LogicalBlockAddress) == 6);
In a class wrapping stdio.h's fseek(), I have taken some shortcuts with enum Origin and check that those shortcuts align with the constants defined by stdio.h
uint64_t BasicFile::seek(int64_t offset, enum Origin origin)
{
BOOST_STATIC_ASSERT(SEEK_SET == Origin::SET);
You should prefer static_assert over assert when the behaviour is defined at compile time, and not at runtime, such as the examples I've given above. An example where this is not the case would include parameter and return code checking.
BOOST_STATIC_ASSERT is a pre-C++0x macro that generates illegal code if the condition is not satisfied. The intentions are the same, albeit static_assert is standardised and may provide better compiler diagnostics.
BOOST_STATIC_ASSERT is a cross platform wrapper for static_assert functionality.
Currently I am using static_assert in order to enforce "Concepts" on a class.
example:
template <typename T, typename U>
struct Type
{
BOOST_STATIC_ASSERT(boost::is_base_of<T, Interface>::value);
BOOST_STATIC_ASSERT(std::numeric_limits<U>::is_integer);
/* ... more code ... */
};
This will cause a compile time error if any of the above conditions are not met.
One use of static_assert might be to ensure that a structure (that is an interface with the outside world, such as a network or file) is exactly the size that you expect. This would catch cases where somebody adds or modifies a member from the structure without realising the consequences. The static_assert would pick it up and alert the user.
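A sketch of what that could look like, with an invented PacketHeader whose field names are purely illustrative. The size and offset values assume the usual fixed-width integer types without padding between these particular members.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical on-the-wire header; the fields are illustrative only.
struct PacketHeader {
    std::uint16_t type;
    std::uint16_t length;
    std::uint32_t sequence;
};

// If someone adds, removes or reorders a member, compilation fails here.
static_assert(std::is_standard_layout<PacketHeader>::value,
              "PacketHeader must have a predictable layout");
static_assert(sizeof(PacketHeader) == 8,
              "PacketHeader must stay 8 bytes on the wire");
static_assert(offsetof(PacketHeader, sequence) == 4,
              "sequence must start at byte 4");
```

The messages end up in the compiler output, so whoever changed the struct sees immediately which assumption was broken.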
In absence of concepts one can use static_assert for simple and readable compile-time type checking, for example, in templates:
template <class T>
void MyFunc(T value)
{
static_assert(std::is_base_of<MyBase, T>::value,
"T must be derived from MyBase");
// ...
}
This doesn't directly answer the original question, but makes an interesting study into how to enforce these compile time checks prior to C++11.
Chapter 2 (Section 2.1) of Modern C++ Design by Andrei Alexandrescu implements this idea of compile-time assertions like this
template<int> struct CompileTimeError;
template<> struct CompileTimeError<true> {};
#define STATIC_CHECK(expr, msg) \
{ CompileTimeError<((expr) != 0)> ERROR_##msg; (void)ERROR_##msg; }
Compare the macro STATIC_CHECK() and static_assert()
STATIC_CHECK(0, COMPILATION_FAILED);
static_assert(0, "compilation failed");
To add on to all the other answers, it can also be useful when using non-type template parameters.
Consider the following example.
Let's say you want to define some kind of function whose particular functionality can be somewhat determined at compile time, such as a trivial function below, which returns a random integer in the range determined at compile time. You want to check, however, that the minimum value in the range is less than the maximum value.
Without static_assert, you could do something like this:
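#include <cstdlib>   // srand, rand
#include <ctime>     // time
#include <iostream>
#include <random>
#include <stdexcept> // std::invalid_argument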
template <int min, int max>
int get_number() {
if constexpr (min >= max) {
throw std::invalid_argument("Min. val. must be less than max. val.\n");
}
srand(time(nullptr));
static std::uniform_int_distribution<int> dist{min, max};
std::mt19937 mt{(unsigned int) rand()};
return dist(mt);
}
If min < max, all is fine and the if constexpr branch gets rejected at compile time. However, if min >= max, the program still compiles, but now you have a function that, when called, will throw an exception with 100% certainty. Thus, in the latter case, even though the "error" (of min being greater than or equal to max) was present at compile-time, it will only be discovered at run-time.
This is where static_assert comes in.
Since static_assert is evaluated at compile-time, if the boolean constant expression it is testing is evaluated to be false, a compile-time error will be generated, and the program will not compile.
Thus, the above function can be improved as so:
#include <cstdlib>   // srand, rand
#include <ctime>     // time
#include <iostream>
#include <random>
template <int min, int max>
int get_number() {
static_assert(min < max, "Min. value must be less than max. value.\n");
srand(time(nullptr));
static std::uniform_int_distribution<int> dist{min, max};
std::mt19937 mt{(unsigned int) rand()};
return dist(mt);
}
Now, if the function template is instantiated with a value for min that is equal to or greater than max, then static_assert will evaluate its boolean constant expression to be false, and will throw a compile-time error, thus alerting you to the error immediately, without giving the opportunity for an exception at runtime.
(Note: the above method is just an example and should not be used for generating random numbers, as repeated calls in quick succession to the function will generate the same numbers due to the seed value passed to the std::mt19937 constructor through rand() being the same (due to time(nullptr) returning the same value) - also, the range of values generated by std::uniform_int_distribution is actually a closed interval, so the same value can be passed to its constructor for upper and lower bounds (though there wouldn't be any point))
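For completeness, here is one possible way to address both points in that note, offered as a sketch rather than a recommendation for production use. It seeds a single engine once from std::random_device instead of reseeding from the clock on every call, and it accepts min == max because the distribution's range is closed.

```cpp
#include <cassert>
#include <random>

// Sketch of a variant of get_number; same name, different seeding strategy.
template <int min, int max>
int get_number() {
    static_assert(min <= max, "Min. value must not exceed max. value.");
    // Seeded once, on the first call, and reused afterwards.
    static std::mt19937 engine{std::random_device{}()};
    static std::uniform_int_distribution<int> dist{min, max};
    return dist(engine);
}
```

Note that the shared static engine makes concurrent calls from several threads unsafe without extra synchronization.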
The static_assert can be used to forbid the use of the delete keyword this way:
#define delete static_assert(0, "The keyword \"delete\" is forbidden.");
A modern C++ developer may want to do that when using a conservative garbage collector: every class and struct overloads operator new to call a function that allocates memory on the collector's heap, and the collector itself is initialized by calling some function at the beginning of the main function.
For example, a developer who wants to use the Boehm-Demers-Weiser conservative garbage collector will write the following at the beginning of the main function:
GC_init();
And in every class and struct, overload operator new this way:
void* operator new(size_t size)
{
return GC_malloc(size);
}
And now that operator delete is no longer needed, because the Boehm-Demers-Weiser conservative garbage collector is responsible for freeing every block of memory once it is no longer needed, the developer wants to forbid the delete keyword.
One way is overloading the delete operator this way:
void operator delete(void* ptr)
{
assert(0);
}
But this is not recommended, because the developer would only learn at run time that delete was mistakenly invoked; it is better to learn this early, at compile time.
So, in my opinion, the best solution in this scenario is to use static_assert as shown at the beginning of this answer.
Of course, this can also be done with BOOST_STATIC_ASSERT, but I think that static_assert is better and should be preferred.