How to enumerate a constant array at compile time in C++? - c++

I am trying to generate a hash at COMPILE TIME from a literal string (array of characters). For example:
unsigned long long compiledHash = ComputeHash("literal string");
I am currently stuck on finding a way to enumerate all characters in the string and create a unique hash. If I use a for loop like I normally would, the compiler won't generate the hash at compile time, which is not what I want.
I might have found a way to do so, but the compiler gets stuck in an infinite loop when calculating the hash.
template <size_t _length, typename T, int n> struct CostructHash {
    unsigned long long Value;
    constexpr __forceinline CostructHash(const T(&str)[_length]) :
        Value(str[n] ^ n + (n > 0 ? CostructHash<_length, T, n - 1>(str).Value : 0)) {}
};

template<size_t _length>
constexpr __forceinline unsigned long long ComputeHash(const char(&str)[_length]) {
    return CostructHash<_length, char, _length - 1>(str).Value;
}
As you can see I use recursion to go through all characters in the string, but I must have messed up somewhere, because as I said the compiler freezes forever when it calls ComputeHash.
I understand that I must be missing the base case that stops the recursion, but as far as I understand (n > 0 ? CostructHash<_length, T, n - 1>(str).Value : 0) should do the job since I am always decreasing n by 1 and checking if n is bigger than 0. So why is the recursion not stopping?
Also, there may be an easier way to do what I am trying?

Do you see the problem in the code?
The recursion is infinite because there is no base case for the template instantiations.
but as far as I understand (n > 0 ? CostructHash<_length, T, n - 1>(str).Value : 0) should do the job since I am always decreasing n by 1 and checking if n is bigger than 0. So why is the recursion not stopping?
The template is instantiated before the compiler decides whether that branch will be taken. You have to use if constexpr instead of the ternary conditional, or you have to specialise the template for the base case.
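For illustration, a rough sketch of the specialisation route applied to the struct from the question (the original names and hash expression are kept; __forceinline is omitted for brevity):
template <size_t _length, typename T, int n>
struct CostructHash {
    unsigned long long Value;
    constexpr CostructHash(const T(&str)[_length]) :
        Value(str[n] ^ n + CostructHash<_length, T, n - 1>(str).Value) {}
};

// Explicit base case: stops the chain of instantiations at n == 0.
// Equivalent to str[0] ^ (0 + 0) from the general formula.
template <size_t _length, typename T>
struct CostructHash<_length, T, 0> {
    unsigned long long Value;
    constexpr CostructHash(const T(&str)[_length]) : Value(str[0]) {}
};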
Also, there may be an easier way to do what I am trying?
This seems to work fine:
#include <cstddef>
#include <string_view>

constexpr std::size_t ComputeHash(std::string_view str) {
    std::size_t result = 0;
    std::size_t i = 0;
    for (auto c : str) {
        result += c ^ i++;
    }
    return result;
}
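Marking the result constexpr (or using it in a static_assert) then forces the hash to be computed at compile time; a brief usage sketch:
constexpr auto compiledHash = ComputeHash("literal string");   // evaluated at compile time
static_assert(ComputeHash("") == 0, "empty string hashes to zero with this scheme");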

Related

Template metaprogramming: ternary value not hitting base case

I am writing a simple test program using TMP to calculate the Nth fibonacci number. I have already found many ways to do this, but I'm just trying out a bunch of ways to get my understanding better. The way I am having a problem with is this:
template<int A>
struct fib
{
    static const bool value = (A<2);
    static const int num = (value?A:(fib<A-1>::num + fib<A-2>::num));
};
The error message I am getting is:
error: template instantiation depth exceeds maximum of 900 (use -ftemplate-depth= to increase the maximum) instantiating 'fib<-1796>::value'|
I have tried substituting many values into the "false" field of the ternary, just to play with it and see what it does. I still do not understand why this does not work. Can anyone help me and tell me why? Thanks.
EDIT: My guess is that the compiler might be evaluating the T/F fields of the ternary before checking to see if the value is true or false, but I'm not sure, since that's not how an if statement is supposed to work at all, and these are supposed to roughly emulate if statements.
I must admit that I'm not that experienced with template programming, but in the OP's case a simple solution would be template specialization.
Sample code:
#include <iostream>

template<int A>
struct fib
{
    static const int num = fib<A-1>::num + fib<A-2>::num;
};

template<>
struct fib<1>
{
    static const int num = 1;
};

template<>
struct fib<2>
{
    static const int num = 1;
};

int main()
{
    fib<10> fib10;
    std::cout << "fib<10>: " << fib10.num << '\n';
    return 0;
}
Output:
fib<10>: 55
One way to write this in a more straightforward manner is to use if constexpr. Unlike with regular if (and with the ternary operator), templates in the not taken branch are not instantiated.
template <int n>
struct fib {
    constexpr static int eval() {
        if constexpr (n < 2)
            return n;
        else
            return fib<n-1>::eval() + fib<n-2>::eval();
    }
};
Of course once you have if constexpr you don't really need templates to make a compile-time function of this type. A constexpr non-template function will do just fine. This is just an illustration of the technique.
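For example, the whole computation collapses to an ordinary constexpr function; a minimal sketch of that non-template variant:
constexpr int fib(int n) {
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

static_assert(fib(10) == 55, "computed at compile time");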
The first comment on my original post is the correct answer; the author is (n.m.). Templates named in both branches of the ternary are instantiated, even though only one branch is evaluated. To solve this, I will look into std::conditional or std::enable_if.

Idiomatic way to calculate template parameter depending on other parameters

I am looking for an idiomatic way to optimize this template I wrote.
My main concern is how to correctly define the template parameter n and use it in the return type, while ensuring the user cannot override it.
I am also open for other suggestions on how to write this template in an idiomatic C++14 way.
template<
    typename InType=uint32_t,
    typename OutType=float,
    unsigned long bits=8,
    unsigned long n=(sizeof(InType) * 8) / bits
>
std::array<OutType,n> hash_to_color(InType in) noexcept {
    InType mask = ~0;
    mask = mask << bits;
    mask = ~mask;
    std::array<OutType,n> out;
    auto out_max = static_cast<OutType>((1 << bits) - 1);
    for (auto i = 0; i < n; i++) {
        auto selected = (in >> (i * bits)) & mask;
        out[i] = static_cast<OutType>(selected) / out_max;
    }
    return out;
}
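For context, with the default parameters this splits a 32-bit value into four 8-bit channels normalized to [0, 1]; a hypothetical call (values shown for illustration only):
auto rgba = hash_to_color(0xFF8040C0u);  // std::array<float, 4>: { 0xC0/255.f, 0x40/255.f, 0x80/255.f, 0xFF/255.f }, lowest byte first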
Regarding the n template parameter, you can avoid it by using auto as the return type in C++14. Here's a simpler example of the principle:
template<int N>
auto f()
{
    constexpr int bar = N * 3;
    std::array<int, bar> foo;
    return foo;
}
Naturally the calculation of the array template parameter must be a constant expression.
Another option (compatible with C++11) is trailing-return-type:
template<int N>
auto f() -> std::array<int, N * 3>
{
    return {};
}
This is a wee bit more verbose than relying on C++14's return type deduction from the return statement.
Note: ~0 in your code is wrong because 0 is an int, it should be ~(InType)0. Also (1 << bits) - 1 has potential overflow issues.
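A minimal sketch of how the mask and maximum could be built in InType instead, assuming bits is smaller than the bit width of InType:
InType mask = static_cast<InType>(~InType{0});  // all bits set, computed as InType rather than int
mask = static_cast<InType>(mask << bits);       // clear the low `bits` bits
mask = static_cast<InType>(~mask);              // keep only the low `bits` bits
auto out_max = static_cast<OutType>(mask);      // sidesteps (1 << bits) - 1 overflowing int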
I think M.M.'s answer is excellent, and, in your case, I'd definitely use one of the two alternatives suggested there.
Suppose you later encounter a situation where the logic is, given n, not 3n but something more complicated, e.g., n² + 3n + 1. Alternatively, maybe the logic is not very complicated, but it is subject to change.
The first option - using automatically deduced auto, is pithy, but the omission sometimes makes the declaration less clear.
The second option - trailing return type - violates DRY to some extent.
(Just to clarify again, I don't think that these are significant problems in the context of your question or M.M.'s answer.)
So, a third option would be to factor out the logic to a constexpr function:
#include <array>
constexpr int arr_size(int n) { return n * n + 3 * n + 1; }
Since it's constexpr, it can be used to instantiate the template:
template<int N>
std::array<int, arr_size(N)> f() {
    return std::array<int, arr_size(N)>();
}
Note that now the function has an explicit return type, but the logic of arr_size appears only once.
You could use this as usual:
int main() {
    auto a = f<10>();
    a[0] = 3;
}

constexpr depth limit with clang (fconstexpr-depth doesn't seem to work)

Is there any way to configure the constexpr instantiation depth?
I am running with -fconstexpr-depth=4096 (using clang/XCode).
But still fail to compile this code with error:
Constexpr variable fib_1 must be initialized by a constant expression.
The code fails irrespective of whether option -fconstexpr-depth=4096 is set or not.
Is this a bug in clang, or is it expected to behave this way?
Note: this works fine up to fib_cxpr(26); 27 is where it starts to fail.
Code:
constexpr int fib_cxpr(int idx) {
    return idx == 0 ? 0 :
           idx == 1 ? 1 :
           fib_cxpr(idx-1) + fib_cxpr(idx-2);
}

int main() {
    constexpr auto fib_1 = fib_cxpr(27);
    return 0;
}
TL;DR:
For clang, you want the command-line argument -fconstexpr-steps=1271242, and you do not need more than -fconstexpr-depth=27.
The recursive method of calculating Fibonacci numbers does not require very much recursion depth. The depth required for fib(n) is in fact no more than n. This is because the longest chain of calls is through the fib(i-1) recursive call.
constexpr auto fib_1 = fib_cxpr(3); // fails with -fconstexpr-depth=2, works with -fconstexpr-depth=3
constexpr auto fib_1 = fib_cxpr(4); // fails with -fconstexpr-depth=3, works with -fconstexpr-depth=4
So we can conclude that -fconstexpr-depth is not the setting that matters.
Furthermore, the error messages also indicate a difference:
constexpr auto fib_1 = fib_cxpr(27);
Compiled with -fconstexpr-depth=26, to be sure we hit that limit, clang produces the message:
note: constexpr evaluation exceeded maximum depth of 26 calls
But compiling with -fconstexpr-depth=27, which is enough depth, produces the message:
note: constexpr evaluation hit maximum step limit; possible infinite loop?
So we know that clang is distinguishing between two failures: recursion depth and 'step limit'.
The top Google results for 'clang maximum step limit' lead to pages about the clang patch implementing this feature, including the implementation of the command-line option: -fconstexpr-steps. Further Googling of this option indicates that there's no user-level documentation.
So there's no documentation about what clang counts as a 'step' or how many 'steps' clang requires for fib(27). We could just set this really high, but I think that's a bad idea. Instead some experimentation shows:
n : steps
0 : 2
1 : 2
2 : 6
3 : 10
4 : 18
This indicates that steps(fib(n)) == steps(fib(n-1)) + steps(fib(n-2)) + 2. A bit of calculation shows that, according to this, fib(27) should require 1,271,242 of clang's steps. So compiling with -fconstexpr-steps=1271242 should allow the program to compile, which indeed it does. Compiling with -fconstexpr-steps=1271241 results in the same error as before, so we know we have an exact limit.
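One way to sanity-check the 1,271,242 figure: with steps(0) = steps(1) = 2, the quantity steps(n) + 2 obeys the plain Fibonacci recurrence, so steps(n) = 4·F(n+1) − 2 (with F(1) = F(2) = 1). For n = 27 that gives 4·317811 − 2 = 1,271,242, matching the measured limit.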
An alternative, less exact method involves observing from the patch that the default step limit is 1,048,576 (2^20), which is obviously sufficient for fib(26). Intuitively, doubling that should be plenty, and from the earlier analysis we know that two million is plenty. A tight limit would be ⌈φ · steps(fib(26))⌉ (which does happen to be exactly 1,271,242).
Another thing to note is that these results clearly show that clang is not doing any memoization of constexpr evaluation. GCC does, but it appears that this is not implemented in clang at all. Although memoization increases the memory requirements it can sometimes, as in this case, vastly reduce the time required for evaluation. The two conclusions I draw from this are that writing constexpr code that requires memoization for good compile times is not a good idea for portable code, and that clang could be improved with support for constexpr memoization and a command line option to enable/disable it.
You can also refactor your Fibonacci algorithm to include explicit memoization which will work in clang.
// Copyright 2021 Google LLC.
// SPDX-License-Identifier: Apache-2.0
#include <iostream>

template <int idx>
constexpr int fib_cxpr();

// This constexpr template value acts as the explicit memoization for the fib_cxpr function.
template <int i>
constexpr int kFib = fib_cxpr<i>();

// Arguments cannot be used in constexpr contexts (like the if constexpr),
// so idx is refactored as a template value argument instead.
template <int idx>
constexpr int fib_cxpr() {
    if constexpr (idx == 0 || idx == 1) {
        return idx;
    } else {
        return kFib<idx-1> + kFib<idx-2>;
    }
}

int main() {
    constexpr auto fib_1 = fib_cxpr<27>();
    std::cout << fib_1 << "\n";
    return 0;
}
This version works for arbitrary inputs to fib_cxpr and compiles with only 4 steps.
https://godbolt.org/z/9cvz3hbaE
This isn't directly answering the question, but I apparently don't have enough reputation to add this as a comment...
Unrelated to "depth limit" but strongly related to Fibonacci number calculation.
Recursion may be the wrong approach here and is not needed.
An ultra-fast solution with a low memory footprint is possible.
So, we could use a compile-time precalculation of all Fibonacci numbers that fit into a 64-bit value.
One important property of the Fibonacci series is that the values grow exponentially, so all built-in integer data types will overflow rather quickly.
With Binet's formula you can calculate that the 93rd Fibonacci number is the last that will fit into a 64-bit unsigned value.
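For reference, Binet's formula is F(n) = (φⁿ − ψⁿ)/√5 with φ = (1 + √5)/2 and ψ = 1 − φ; it gives F(93) ≈ 1.22 · 10¹⁹, which still fits below 2⁶⁴ − 1 ≈ 1.84 · 10¹⁹, while F(94) already does not.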
And calculating 93 values during compilation is a really simple task.
We will first define the default approach for calculating a Fibonacci number as a constexpr function:
// Constexpr function to calculate the nth Fibonacci number
constexpr unsigned long long getFibonacciNumber(size_t index) noexcept {
    // Initialize the first two Fibonacci numbers
    unsigned long long f1{ 0 }, f2{ 1 };
    // Calculate the Fibonacci value iteratively
    while (index--) {
        // Get the next value of the Fibonacci sequence
        unsigned long long f3 = f2 + f1;
        // Move to the next number
        f1 = f2;
        f2 = f3;
    }
    return f2;
}
With that, Fibonacci numbers can easily be calculated at compile time. Then, we fill a std::array with all Fibonacci numbers. We also make this a constexpr function template with a variadic parameter pack.
We use std::integer_sequence to create a Fibonacci number for each index 0, 1, 2, 3, 4, 5, ....
That is straightforward and not complicated:
template <size_t... ManyIndices>
constexpr auto generateArrayHelper(std::integer_sequence<size_t, ManyIndices...>) noexcept {
    return std::array<unsigned long long, sizeof...(ManyIndices)>{ { getFibonacciNumber(ManyIndices)... } };
};
This function will be fed with an integer sequence 0,1,2,3,4,... and return a std::array<unsigned long long, ...> with the corresponding Fibonacci numbers.
We know that we can store a maximum of 93 values. Therefore we add a next function that calls the above with the integer sequence 0, 1, 2, ..., 92, like so:
constexpr auto generateArray() noexcept {
    return generateArrayHelper(std::make_integer_sequence<size_t, MaxIndexFor64BitValue>());
}
And now, finally,
constexpr auto FIB = generateArray();
will give us a compile-time std::array<unsigned long long, 93> with the name FIB containing all Fibonacci numbers. And if we need the i'th Fibonacci number, then we can simply write FIB[i]. There will be no calculation at runtime.
I do not think that there is a faster way to calculate the n'th Fibonacci number.
Please see the complete program below:
#include <iostream>
#include <array>
#include <utility>

// ----------------------------------------------------------------------
// All the following will be done at compile time

// Constexpr function to calculate the nth Fibonacci number
constexpr unsigned long long getFibonacciNumber(size_t index) {
    // Initialize the first two Fibonacci numbers
    unsigned long long f1{ 0 }, f2{ 1 };
    // Calculate the Fibonacci value iteratively
    while (index--) {
        // Get the next value of the Fibonacci sequence
        unsigned long long f3 = f2 + f1;
        // Move to the next number
        f1 = f2;
        f2 = f3;
    }
    return f2;
}

// We will automatically build an array of Fibonacci numbers at compile time
// Generate a std::array with n elements
template <size_t... ManyIndices>
constexpr auto generateArrayHelper(std::integer_sequence<size_t, ManyIndices...>) noexcept {
    return std::array<unsigned long long, sizeof...(ManyIndices)>{ { getFibonacciNumber(ManyIndices)... } };
};

// Max index for Fibonacci numbers that fit in a 64-bit unsigned value (Binet's formula)
constexpr size_t MaxIndexFor64BitValue = 93;

// Generate the required number of elements
constexpr auto generateArray() noexcept {
    return generateArrayHelper(std::make_integer_sequence<size_t, MaxIndexFor64BitValue>());
}

// This is a constexpr array of all Fibonacci numbers
constexpr auto FIB = generateArray();
// ----------------------------------------------------------------------

// Test
int main() {
    // Print all possible Fibonacci numbers
    for (size_t i{}; i < MaxIndexFor64BitValue; ++i)
        std::cout << i << "\t--> " << FIB[i] << '\n';
    return 0;
}
Developed and tested with Microsoft Visual Studio Community 2019, Version 16.8.2.
Additionally compiled and tested with clang11.0 and gcc10.2
Language: C++17

Is strict template evaluation in principle impossible in C++?

I think I understand how templates are evaluated lazily in C++ e.g. a la recursive replacements and a final simplification of the expansion. This typically limits the recursion depth available. I wonder if with the features new in C++11 (e.g. variadic templates or template packs) or with some Boost it is possible to force strict template evaluation. Or is this in principle impossible in C++?
Consider for example a template which sums all integer values 0..n:
template <int n>
struct sumAll { enum { value = n + sumAll<n-1>::value }; };

template <>
struct sumAll<0> { enum { value = 0 }; };

#include <iostream>
int main() { std::cout << sumAll<10000>::value << std::endl; }
Here sumAll<10>::value would be expanded to
sumAll<10>::value = 10 + sumAll<9>::value
= 10 + 9 + sumAll<8>::value
= 10 + 9 + 8 + sumAll<7>::value
= ...
= 10 + 9 + 8 + 7 + 6 + 5 + 4 + 3 + 2 + 1 + 0
and the final summation would only be performed once the template has been completely expanded. If that final expansion gets too long (e.g. in complex series expansions with many terms) the compiler will ultimately run out of space to store additional terms.
My question was in essence if there was a way to perform simplifications (like above summation) earlier.
You decide the recursion depth yourself. And just like normal recursion can cause stack overflows, template recursion can. But that's often fixable by a better recursive algorithm. Trivially:
template <int n>
struct sumAll { enum { value = n + n-1 + sumAll<n-2>::value }; };

template <>
struct sumAll<1> { enum { value = 1 }; };

template <>
struct sumAll<0> { enum { value = 0 }; };
Smarter:
template <int n>
struct sumAll { enum { value = n*(n+1)/2 }; };
Of course, you may complain that the latter is just being silly and real examples are more complex. But isn't that the whole problem? The compiler can't magically make that complexity go away for you.
C++ templates are Turing-complete, which means that you can use them to evaluate any computable function at compile time. It then follows from the halting theorem that:
You cannot, in general, compute the amount of memory required to compile a C++ program in advance. (I.e., there is no computable function which maps every C++ program to a memory bound for its compilation.)
You cannot, in general, decide whether the compiler will ever finish instantiating a template, or will go on forever.
So while you might be able to tweak a compiler to use less memory in some cases, you cannot solve the general problem of it running out of memory sometimes.

How can the allowed range of an integer be restricted with compile time errors?

I would like to create a type that is an integer value, but with a restricted range.
Attempting to create an instance of this type with a value outside the allowable range should cause a compile time error.
I have found examples that allow compile time errors to be triggered when an enumeration value outside those specified is used, but none that allow a restricted range of integers (without names).
Is this possible?
Yes but it's clunky:
// Defining as template but the main class can have the range hard-coded
template <int Min, int Max>
class limited_int {
private:
    limited_int(int i) : value_(i) {}
    int value_;
public:
    template <int Val> // This needs to be a template for compile time errors
    static limited_int make_limited() {
        static_assert(Val >= Min && Val <= Max, "Bad! Bad value.");
        // If you don't have static_assert upgrade your compiler or use:
        //typedef char assert_in_range[Val >= Min && Val <= Max];
        return Val;
    }
    int value() const { return value_; }
};

typedef limited_int<0, 9> digit;
typedef limited_int<0, 9> digit;
int main(int argc, const char**)
{
    // Error, can't create directly (ctor is private)
    //digit d0 = 5;

    // OK
    digit d1 = digit::make_limited<5>();

    // Compilation error, out of range (can't create zero sized array)
    //digit d2 = digit::make_limited<10>();

    // Error, can't determine at compile time if argc is in range
    //digit d3 = digit::make_limited<argc>();
}
Things will be much easier when C++0x is out with constexpr, static_assert and user defined literals.
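Purely as an illustration (not part of the original answer), a rough sketch of how the same class could look once constexpr and static_assert are available:
template <int Min, int Max>
class limited_int {
    int value_;
    constexpr explicit limited_int(int v) : value_(v) {}
public:
    template <int Val>
    static constexpr limited_int make() {
        static_assert(Val >= Min && Val <= Max, "value out of range");
        return limited_int(Val);
    }
    constexpr int value() const { return value_; }
};

typedef limited_int<0, 9> digit;
constexpr digit d = digit::make<5>();      // OK, checked at compile time
// constexpr digit e = digit::make<10>();  // error: static_assert fires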
Might be able to do something similar by combining macros and C++0x's static assert.
#define SET_CHECK(a,b) { static_assert((b) > 3 && (b) < 7, "value out of range"); (a) = (b); }
A runtime integer's value can only be checked at runtime, since it only exists at runtime, but if you make a runtime check on all writing methods, you can guarantee its contents. You can build a regular integral replacement class with given restrictions for that.
For constant integers, you could use a template to enforce such a thing.
template<bool cond, typename truetype> struct enable_if {
};

template<typename truetype> struct enable_if<true, truetype> {
    typedef truetype type;
};

// lowerbound and upperbound are assumed to be integral constants defined elsewhere
class RestrictedInt {
    int value;
    RestrictedInt(int N)
        : value(N) {
    }
public:
    template<int N>
    static typename enable_if<(N > lowerbound) && (N < upperbound), RestrictedInt>::type Create() {
        return RestrictedInt(N);
    }
};
Attempting to create this class with a template value that isn't within the range will cause a substitution failure and a compile-time error. Of course, it will still require adornment with operators et al to replace int, and if you want to compile-time guarantee other operations, you will have to provide static functions for them (there are easier ways to guarantee compile-time arithmetic).
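Hypothetical usage, assuming lowerbound and upperbound are integral constants in scope (e.g. 0 and 10):
RestrictedInt ok = RestrictedInt::Create<5>();        // compiles: 5 is inside the range
// RestrictedInt bad = RestrictedInt::Create<1000>(); // no enable_if<...>::type -> compile-time error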
Well, as you noticed, there is already a form of diagnostic for enumerations.
It's generally crude: i.e., the checking is "loose", but it could provide a rough form of check as well.
enum Range { Min = 0, Max = 31 };
You can generally assign (without complaint) any values between the minimal and maximal values defined.
You can in fact often assign a bit more (I think gcc works with powers of 2).