I'm working on a rendering engine using Vulkan and Visual Studio 2017, and I bumped into the following type of problem recently.
I have a template struct template<uint32_t id> struct A;. This struct is defined (in separate header files) for id=0, ... , N-1. All of the definitions have a static constexpr std::array<B, M(id)> member for some struct B and a number M depending on id. I have a constexpr function (and a helper function) which, for a given value b of type B, counts how many elements of all of these arrays are equal to b. It looks something like this:
Helper function:
template<size_t Size>
constexpr void count_in_array(B b, const std::array<B, Size>& a, uint32_t& count)
{
for(auto& e : a)
{
if(e==b)
++count;
}
}
Main function:
template<uint32_t... ids>
constexpr uint32_t count_in_arrays(B b, std::index_sequence<ids...>)
{
uint32_t count=0;
auto l ={ (count_in_array(b, A<ids>::member, count), 0)... };
return count;
}
When I compile, I get a C1001 internal compiler error. The strange thing is that my functions work, because if I use them to define a constexpr variable
constexpr uint32_t var=count_in_arrays(b, std::make_index_sequence<N>());
(for a constexpr B b),
and I hover the mouse over that variable, I see the computed (and correct) number in the tooltip that appears.
I am not familiar with compiler switches; I only tried using #pragma optimize("", on/off) around the above functions, but that didn't help. Does anybody have an idea how to make Visual Studio compile my code?
Remark: I am pretty sure that the struct B is not important here, in my case, it is a simple data struct containing some built-in variables.
First, an internal compiler error is always a compiler bug. Please report this to MSVC.
Second, this implementation is a bit odd. When you write constexpr functions you want to think in a more functionally-oriented way - input-only arguments, output-only results. count_in_array should surely just return a number:
template <size_t Size>
constexpr uint32_t count_in_array(B b, const std::array<B, Size>& a)
{
uint32_t count = 0;
for(auto& e : a)
{
if(e==b)
++count;
}
return count;
}
This is a more reasonable implementation - count returns a count. Not only that, but it composes really nicely. How do you get all the counts? You sum them:
template <size_t... Ids>
constexpr uint32_t count_in_arrays(B b, std::index_sequence<Ids...>)
{
return (count_in_array(b, A<Ids>::member) + ...);
}
Much clearer.
Note that, while I think fold-expressions don't quite work in MSVC yet (though they might soon), that in and of itself is not a reason to implement this differently. It just means that you need to sum manually, not that count_in_array() shouldn't return a count.
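For a compiler without fold expressions, the manual sum could look roughly like the sketch below. B, A<0>, A<1> and their contents are made-up stand-ins for the asker's types; only count_in_array and count_in_arrays follow the answer. The sketch assumes C++17 for the implicitly inline static constexpr members, although the summing code itself only needs C++14.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>

// Hypothetical stand-in for the asker's plain data struct.
struct B {
    int x;
    int y;
};

constexpr bool operator==(const B& lhs, const B& rhs) {
    return lhs.x == rhs.x && lhs.y == rhs.y;
}

// Hypothetical stand-ins for the per-id definitions of A.
template <std::uint32_t Id> struct A;

template <> struct A<0> {
    static constexpr std::array<B, 2> member{{{1, 2}, {3, 4}}};
};

template <> struct A<1> {
    static constexpr std::array<B, 3> member{{{1, 2}, {5, 6}, {1, 2}}};
};

// Returns the number of elements of a that compare equal to b.
template <std::size_t Size>
constexpr std::uint32_t count_in_array(B b, const std::array<B, Size>& a) {
    std::uint32_t count = 0;
    for (std::size_t i = 0; i < Size; ++i) {
        if (a[i] == b) ++count;
    }
    return count;
}

template <std::size_t... Ids>
constexpr std::uint32_t count_in_arrays(B b, std::index_sequence<Ids...>) {
    // Expand the pack into a plain array; the leading 0 keeps it non-empty.
    const std::uint32_t counts[] = {0u, count_in_array(b, A<Ids>::member)...};
    std::uint32_t total = 0;
    for (std::uint32_t c : counts) total += c;
    return total;
}

// {1, 2} occurs once in A<0>::member and twice in A<1>::member.
static_assert(count_in_arrays(B{1, 2}, std::make_index_sequence<2>()) == 3, "");
```

The per-array function still returns its own count; only the way the counts are added up changes.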
I want to optimize a little program/library I'm writing. I have been somewhat stuck for two weeks and am now wondering whether what I had in mind is even possible.
(Please be gentle, I don't have very much experience in meta-programming.)
My goal is of course to have certain computations done by the compiler, so that the programmer (hopefully) only has to edit code at one point in the program and the compiler "creates" all the boilerplate. I have a reasonably good idea how to do what I want with macros, but I have been asked to do it with templates if possible.
My goal is:
Let's say I have a class that a programmer using the library can derive from. There he can have multiple incoming and outgoing data types that I want to register somehow so that the base class can do its operations on them.
class my_own_multiply : function_base {
in<int> a;
in<float> b;
out<double> c;
// ["..."] // other content of the class that actually does something but is irrelevant
register_ins<a, b> ins_of_function; // example meta-function calls
register_outs<c> outs_of_function;
};
The meta-code I have so far is this (but it's not yet working/complete):
template <typename... Ts>
struct register_ins {
const std::array<std::unique_ptr<in_type_erasured>, sizeof...(Ts)> ins;
constexpr std::array<std::unique_ptr<in_type_erasured>, sizeof...(Ts)>
build_ins_array() {
std::array<std::unique_ptr<in_type_erasured>, sizeof...(Ts)> ins_build;
for (unsigned int i = 0; i < sizeof...(Ts); ++i) {
ins_build[i] = std::make_unique<in_type_erasured>();
}
return ins_build;
}
constexpr register_ins() : ins(build_ins_array()) {
}
template <typename T>
T getValueOf(unsigned int in_nr) {
return ins[in_nr]->getValue();
}
};
As you may see, I want to call my meta-template code with a variable number of ins. (Variable in the sense that the programmer can put however many he likes in there, but they won't change at runtime, so they can be "baked in" at compile time.)
The meta-code is supposed to create an array whose length equals the number of ins and which is initialized so that every field points to the original in in the my_own_multiply class. Basically, this gives him an indexable data structure that always has the correct size, and one that I can access from the function_base class to use all ins in certain functions. It is also iterable, which makes things convenient for me.
Now I have looked into how one might do that, but I am getting the feeling that I might not be allowed to "create" this array at compile time in a way that still lets the ins a and b be non-static and non-const so that I can mutate them. From my side they wouldn't have to be const anyway, but my compiler seems not to like them to be free. The only thing I need to be const is the array with the pointers. But does using constexpr possibly "make" me make them const?
Okay, I will clarify what I don't get:
When I try to create an "instance" of my meta-stuff structure, it fails because it expects all kinds of const, constexpr and so on. But I don't want them, since I need to be able to mutate most of those variables. I only need this meta-stuff to create an array of the correct size at compile time, and I don't want to have to make everything static and const in order to achieve this. So is this even possible under these terms?
I do not get all the things you have in mind (also regarding that std::unique_ptr in your example), but maybe this helps:
Starting from C++14 (or C++11, but that is strictly limited) you may write constexpr functions which can be evaluated at compile time. As a precondition (in simple words), all arguments "passed by the caller" must be constant expressions. If you want to enforce that the compiler replaces that "call" by the result of a compile-time computation, you must assign the result to a constexpr variable.
Writing ordinary functions (just with constexpr added) lets you write code which is simple to read. Moreover, you can use the same code for both compile-time and run-time computations.
C++17 example (similar things are possible in C++14, although some stuff from std is just missing the constexpr qualifier):
http://coliru.stacked-crooked.com/a/154e2dfcc41fb6c7
#include <cassert>
#include <array>
template<class T, std::size_t N>
constexpr std::array<T, N> multiply(
const std::array<T, N>& a,
const std::array<T, N>& b
) {
// may be evaluated in `constexpr` or in non-`constexpr` context
// ... in simple man's words this means:
// inside this function, `a` and `b` are not `constexpr`
// but the return can be used as `constexpr` if all arguments are `constexpr` for the "caller"
std::array<T, N> ret{};
for(size_t n=0; n<N; ++n) ret[n] = a[n] * b[n];
return ret;
}
int main() {
{// compile-time evaluation is possible if the input data is `constexpr`
constexpr auto a = std::array{2, 4, 6};
constexpr auto b = std::array{1, 2, 3};
constexpr auto c = multiply(a, b);// assigning to a `constexpr` guarantees compile-time evaluation
static_assert(c[0] == 2);
static_assert(c[1] == 8);
static_assert(c[2] == 18);
}
{// for run-time data, the same function can be used
auto a = std::array{2, 4, 6};
auto b = std::array{1, 2, 3};
auto c = multiply(a, b);
assert(c[0] == 2);
assert(c[1] == 8);
assert(c[2] == 18);
}
return 0;
}
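Coming back to the registration problem in the question: the array length can be fixed at compile time without making the inputs themselves const or static. The sketch below is one possible reading of what was asked, not the asker's design. in_base, as_double, registered_ins, the free function register_ins and product are hypothetical names, and raw pointers replace the std::unique_ptr from the question because the array only refers to members it does not own.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical type-erased base for inputs.
struct in_base {
    virtual ~in_base() = default;
    virtual double as_double() const = 0;
};

// Hypothetical typed input; its value stays mutable at run time.
template <typename T>
struct in : in_base {
    T value{};
    double as_double() const override { return static_cast<double>(value); }
};

// The array length is part of the type, so it is fixed at compile time,
// while the stored pointers are ordinary run-time values.
template <std::size_t N>
struct registered_ins {
    std::array<in_base*, N> ptrs;
};

// Deduces the array length from the number of arguments.
template <typename... Ins>
registered_ins<sizeof...(Ins)> register_ins(Ins&... ins) {
    return registered_ins<sizeof...(Ins)>{{{&ins...}}};
}

struct my_own_multiply {
    in<int> a;
    in<float> b;
    // Registered once per object; nothing here is static, const or constexpr.
    decltype(register_ins(a, b)) ins = register_ins(a, b);

    // Iterates over all registered inputs through the type-erased interface.
    double product() const {
        double result = 1.0;
        for (const in_base* p : ins.ptrs) result *= p->as_double();
        return result;
    }
};
```

Because the pointers refer to members of one particular object, copying or moving a my_own_multiply would leave the copy pointing at the original; a real design would have to delete or rewrite the copy operations.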
If I want to do something like iterate over a tuple, I have to resort to crazy template metaprogramming and template helper specializations. For example, the following program won't work:
#include <iostream>
#include <tuple>
#include <utility>
constexpr auto multiple_return_values()
{
return std::make_tuple(3, 3.14, "pi");
}
template <typename T>
constexpr void foo(T t)
{
for (auto i = 0u; i < std::tuple_size<T>::value; ++i)
{
std::get<i>(t);
}
}
int main()
{
constexpr auto ret = multiple_return_values();
foo(ret);
}
Because i can't be const or we wouldn't be able to implement it. But for loops are a compile-time construct that can be evaluated statically. Compilers are free to remove it, transform it, fold it, unroll it or do whatever they want with it thanks to the as-if rule. But then why can't loops be used in a constexpr manner? There's nothing in this code that needs to be done at "runtime". Compiler optimizations are proof of that.
I know that you could potentially modify i inside the body of the loop, but the compiler would still be able to detect that. Example:
// ...snip...
template <typename T>
constexpr int foo(T t)
{
/* Dead code */
for (auto i = 0u; i < std::tuple_size<T>::value; ++i)
{
}
return 42;
}
int main()
{
constexpr auto ret = multiple_return_values();
/* No error */
std::array<int, foo(ret)> arr;
}
Since std::get<>() is a compile-time construct, unlike std::cout.operator<<, I can't see why it's disallowed.
πάντα ῥεῖ gave a good and useful answer, I would like to mention another issue though with constexpr for.
In C++, at the most fundamental level, all expressions have a type which can be determined statically (at compile-time). There are things like RTTI and boost::any of course, but they are built on top of this framework, and the static type of an expression is an important concept for understanding some of the rules in the standard.
Suppose that you can iterate over a heterogeneous container using a fancy for syntax, like this maybe:
std::tuple<int, float, std::string> my_tuple;
for (const auto & x : my_tuple) {
f(x);
}
Here, f is some overloaded function. Clearly, the intended meaning of this is to call different overloads of f for each of the types in the tuple. What this really means is that in the expression f(x), overload resolution has to run three different times. If we play by the current rules of C++, the only way this can make sense is if we basically unroll the loop into three different loop bodies, before we try to figure out what the types of the expressions are.
What if the code is actually
for (const auto & x : my_tuple) {
auto y = f(x);
}
auto is not magic, it doesn't mean "no type info", it means, "deduce the type, please, compiler". But clearly, there really need to be three different types of y in general.
On the other hand, there are tricky issues with this kind of thing -- in C++ the parser needs to be able to know what names are types and what names are templates in order to correctly parse the language. Can the parser be modified to do some loop unrolling of constexpr for loops before all the types are resolved? I don't know but I think it might be nontrivial. Maybe there is a better way...
To avoid this issue, in current versions of C++, people use the visitor pattern. The idea is that you will have an overloaded function or function object and it will be applied to each element of the sequence. Then each overload has its own "body" so there's no ambiguity as to the types or meanings of the variables in them. There are libraries like boost::fusion or boost::hana that let you do iteration over heterogeneous sequences using a given visitor -- you would use their mechanism instead of a for-loop.
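As a rough illustration of the visitor idea without any library, the sketch below applies an overloaded function object to each tuple element using C++17's std::apply and a fold over the comma operator. type_namer, visit_each and name_elements are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical visitor: one overload per element type, each with its own body.
struct type_namer {
    std::vector<std::string>& out;
    void operator()(int) const { out.push_back("int"); }
    void operator()(float) const { out.push_back("float"); }
    void operator()(const std::string&) const { out.push_back("string"); }
};

// Applies the visitor to every element of the tuple, in order (C++17).
template <typename Tuple, typename Visitor>
void visit_each(const Tuple& t, Visitor visitor) {
    std::apply([&](const auto&... elems) { (visitor(elems), ...); }, t);
}

// Runs the visitor over a small tuple and returns the recorded type names.
inline std::vector<std::string> name_elements() {
    std::vector<std::string> names;
    std::tuple<int, float, std::string> my_tuple{1, 2.0f, "three"};
    visit_each(my_tuple, type_namer{names});
    return names;
}
```

Overload resolution runs once per element when the pack is expanded, which is exactly the per-type "unrolling" that an ordinary for loop cannot express.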
If you could do constexpr for with just ints, e.g.
for (constexpr i = 0; i < 10; ++i) { ... }
this raises the same difficulty as the heterogeneous for loop. If you can use i as a template parameter inside the body, then you can make variables that refer to different types in different runs of the loop body, and then it's not clear what the static types of the expressions should be.
So, I'm not sure, but I think there may be some nontrivial technical issues associated with actually adding a constexpr for feature to the language. The visitor pattern / the planned reflection features may end up being less of a headache IMO... who knows.
Let me give another example I just thought of that shows the difficulty involved.
In normal C++, the compiler knows the static type of every variable on the stack, and so it can compute the layout of the stack frame for that function.
You can be sure that the address of a local variable won't change while the function is executing. For instance,
std::array<int, 3> a{{1,2,3}};
for (int i = 0; i < 3; ++i) {
auto x = a[i];
int y = 15;
std::cout << &y << std::endl;
}
In this code, y is a local variable in the body of a for loop. It has a well-defined address throughout this function, and the address printed by the program will be the same each time.
What should be the behavior of similar code with constexpr for?
std::tuple<int, long double, std::string> a{};
for (int i = 0; i < 3; ++i) {
auto x = std::get<i>(a);
int y = 15;
std::cout << &y << std::endl;
}
The point is that the type of x is deduced differently in each pass through the loop -- since it has a different type, it may have different size and alignment on the stack. Since y comes after it on the stack, that means that y might change its address on different runs of the loop -- right?
What should be the behavior if a pointer to y is taken in one pass through the loop, and then dereferenced in a later pass? Should it be undefined behavior, even though it would probably be legal in the similar "no-constexpr for" code with std::array shown above?
Should the address of y not be allowed to change? Should the compiler have to pad the address of y so that the largest of the types in the tuple can be accommodated before y? Does that mean that the compiler can't simply unroll the loops and start generating code, but must unroll every instance of the loop before-hand, then collect all of the type information from each of the N instantiations and then find a satisfactory layout?
I think you are better off just using a pack expansion, it's a lot more clear how it is supposed to be implemented by the compiler, and how efficient it's going to be at compile and run time.
Here's a way to do it that does not need too much boilerplate, inspired from http://stackoverflow.com/a/26902803/1495627 :
template<std::size_t N>
struct num { static const constexpr auto value = N; };
template <class F, std::size_t... Is>
void for_(F func, std::index_sequence<Is...>)
{
using expander = int[];
(void)expander{0, ((void)func(num<Is>{}), 0)...};
}
template <std::size_t N, typename F>
void for_(F func)
{
for_(func, std::make_index_sequence<N>());
}
Then you can do :
for_<N>([&] (auto i) {
std::get<i.value>(t); // do stuff
});
If you have a C++17 compiler accessible, it can be simplified to
template <class F, std::size_t... Is>
void for_(F func, std::index_sequence<Is...>)
{
(func(num<Is>{}), ...);
}
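Putting the pieces together, a complete C++14 sketch might look as follows. It uses std::integral_constant in place of the num helper and reads the index with decltype(i)::value, which avoids relying on i.value being accepted as a constant expression by every compiler. sum_elements is a hypothetical example function.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

template <class F, std::size_t... Is>
void for_(F func, std::index_sequence<Is...>) {
    // C++14-compatible expansion trick: one call per index, in order.
    using expander = int[];
    (void)expander{0, ((void)func(std::integral_constant<std::size_t, Is>{}), 0)...};
}

template <std::size_t N, class F>
void for_(F func) {
    for_(func, std::make_index_sequence<N>());
}

// Hypothetical example: adds up all elements of a tuple of numeric types.
template <class... Ts>
double sum_elements(const std::tuple<Ts...>& t) {
    double total = 0.0;
    for_<sizeof...(Ts)>([&](auto i) {
        // The index lives in the type of i, so it can be a template argument.
        total += static_cast<double>(std::get<decltype(i)::value>(t));
    });
    return total;
}
```

Each call of the generic lambda is a separate instantiation, so the element type can differ from one "iteration" to the next.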
In C++20 most of the algorithms in <algorithm> are constexpr. Using std::transform, for example, many operations that require a loop can be done at compile time. Consider this example, which calculates the factorial of every number in an array at compile time (adapted from the Boost.Hana documentation):
#include <algorithm>
#include <array>
#include <cstddef>
#include <type_traits>
constexpr int factorial(int n) {
    return n == 0 ? 1 : n * factorial(n - 1);
}
// std::result_of_t was removed in C++20, so std::invoke_result_t is used instead.
template <typename T, std::size_t N, typename F>
constexpr std::array<std::invoke_result_t<F, T>, N>
transform_array(std::array<T, N> array, F f) {
    auto array_f = std::array<std::invoke_result_t<F, T>, N>{};
    // This is a constexpr "loop":
    std::transform(array.begin(), array.end(), array_f.begin(), [&f](auto el){return f(el);});
    return array_f;
}
int main() {
    constexpr std::array<int, 4> ints{{1, 2, 3, 4}};
    // This can be done at compile time!
    constexpr std::array<int, 4> facts = transform_array(ints, factorial);
    static_assert(facts == std::array<int, 4>{{1, 2, 6, 24}}, "");
}
See how the array facts can be computed at compile time using a "loop", i.e. a standard algorithm. At the time of writing this, you need an experimental version of the newest clang or gcc release which you can try out on godbolt.org. But soon C++20 will be fully implemented by all the major compilers in the release versions.
The "Expansion Statements" proposal is interesting and worth reading for further explanation.
The proposal introduces the syntactic sugar for..., similar in spirit to the sizeof... operator. A for... statement is expanded at compile time, so no loop remains at run time.
For example:
std::tuple<int, float, char> Tup1 {5, 3.14, 'K'};
for... (auto elem : Tup1) {
std::cout << elem << " ";
}
The compiler generates the code at compile time, and the result is equivalent to this:
std::tuple<int, float, char> Tup1 {5, 3.14, 'K'};
{
auto elem = std::get<0>(Tup1);
std::cout << elem << " ";
}
{
auto elem = std::get<1>(Tup1);
std::cout << elem << " ";
}
{
auto elem = std::get<2>(Tup1);
std::cout << elem << " ";
}
Thus, the expansion statement is not a loop but a repeated version of the loop body as it was said in the document.
Since this proposal is not part of the current C++ standard or of a technical specification (even if it is eventually accepted), we can use an alternative from the Boost library, namely hana::for_each from <boost/hana/for_each.hpp> together with Hana's tuple from <boost/hana/tuple.hpp>.
#include <boost/hana/for_each.hpp>
#include <boost/hana/tuple.hpp>
using namespace boost;
...
hana::tuple<int, std::string, float> Tup1 {5, "one", 5.55};
hana::for_each(Tup1, [](auto&& x){
std::cout << x << " ";
});
// Which will print:
// 5 one 5.55
The first argument of boost::hana::for_each must model Hana's Foldable concept.
Why isn't a for-loop a compile-time expression?
Because a for() loop is used to define runtime control flow in the c++ language.
Generally variadic templates cannot be unpacked within runtime control flow statements in c++.
std::get<i>(t);
cannot be instantiated, since i is a run-time variable and a template argument must be a constant expression.
Use variadic template parameter unpacking instead.
You might also find this post useful (it may even count as a duplicate that already answers your question):
iterate over tuple
Here are two examples attempting to replicate a compile-time for loop (which isn't part of the language at this time), using fold expressions and std::integer_sequence. The first example shows a simple assignment in the loop, and the second example shows tuple indexing and uses a lambda with template parameters available in C++20.
For a function with a template parameter, e.g.
template <int n>
constexpr int factorial() {
if constexpr (n == 0) { return 1; }
else { return n * factorial<n - 1>(); }
}
Where we want to loop over the template parameter, like this:
template <int N>
constexpr auto example() {
std::array<int, N> vals{};
for (int i = 0; i < N; ++i) {
vals[i] = factorial<i>(); // this doesn't work
}
return vals;
}
One can do this:
template <int... Is>
constexpr auto get_array(std::integer_sequence<int, Is...> a) -> std::array<int, a.size()> {
std::array<int, a.size()> vals{};
((vals[Is] = factorial<Is>()), ...);
return vals;
}
And then get the result at compile time:
constexpr auto x = get_array(std::make_integer_sequence<int, 5>{});
// x = {1, 1, 2, 6, 24}
Similarly, for a tuple:
constexpr auto multiple_return_values()
{
return std::make_tuple(3, 3.14, "pi");
}
int main(void) {
static constexpr auto ret = multiple_return_values();
constexpr auto for_constexpr = [&]<int... Is>(std::integer_sequence<int, Is...> a) {
((std::get<Is>(ret)), ...); // std::get<i>(t); from the question
return 0;
};
// use it:
constexpr auto w = for_constexpr(std::make_integer_sequence<int, std::tuple_size_v<decltype(ret)>>{});
}
So I have a function that takes a variable length argument list, for example:
int avg(int count,...){
//stuff
}
I can call it with avg(4,2,3,9,4); and it works fine. It needs to maintain this functionality.
Is there a way for me to also call it with an array instead of listing the variables? For example:
avg(4,myArray[5]) such that the function avg doesn't see any difference?
No, there is no such way. You can however make two functions, one that takes a variable number of arguments, and one that takes an array (or better yet, a std::vector). The first function simply packs the arguments into the array (or vector) and calls the second function.
void f() {}
template<typename T, std::size_t N>
void f(T (&array)[N]) // taken by reference so that N can be deduced
{
}
template<typename T, typename... Args>
void f(const T& value, const Args&... args)
{
process(value);
f(args...);
}
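Applied to the averaging function from the question, the "pack into a vector and forward" idea could be sketched like this. avg_of is a hypothetical name for the variadic front end, and unlike the original avg it does not take a leading count because the pack already knows its own size.

```cpp
#include <cassert>
#include <numeric>
#include <stdexcept>
#include <vector>

// Does the real work on a container.
inline int avg(const std::vector<int>& values) {
    if (values.empty()) throw std::invalid_argument("avg of an empty sequence");
    const int sum = std::accumulate(values.begin(), values.end(), 0);
    return sum / static_cast<int>(values.size());
}

// Hypothetical variadic front end: packs its arguments and forwards them.
template <typename... Ints>
int avg_of(Ints... values) {
    return avg(std::vector<int>{values...});
}
```

Callers with a list of values use avg_of(2, 3, 9, 4), and callers that already hold a container call avg directly.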
No. A C-style variadic function receives no type information about its arguments, so it cannot tell an array (which decays to a pointer) apart from the integers it expects. Alternatively (as I am sure you wanted to avoid), you would have to do:
avg( 4, myArray[ 0 ], ..., myArray[ 3 ] );
... where ... is myArray at positions 1 and 2 if you wanted to conform with the same parameters as your previous function. There are other ways to do this, such as using C++ vectors.
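You can do it with a hack that relies on how your platform passes structs to variadic functions (it is not portable, and formally it is undefined behavior):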
struct{int arr[100];}p;
double avg2(int count,int* arr){
memcpy(&p,arr,count*sizeof(int));
return avg(count,p);
}
A better approach would be to get rid of the variadic arguments. They were inherited from C, and it is good practice to avoid them as much as possible.
Now, your example avg(4,myArray[5]) is a bit fuzzy. I assume that the first argument says how many items to take from the array and that the second argument is meant to be the array itself. I take the index operator to be a typo or a clumsy way of showing the array size.
So you expect something like this:
int avg(int count, ...)
{
int sum = 0;
std::va_list args;
va_start(args, count);
for (int i = 0; i < count; ++i) {
sum += va_arg(args, int);
}
va_end(args);
return sum / count;
}
template <size_t N, size_t... I>
int avg_helper(size_t count, const int (&arr)[N], std::index_sequence<I...>)
{
return avg(count, arr[I]...);
}
template <size_t N>
int avg(int count, const int (&arr)[N])
{
if (count > N)
throw std::invalid_argument { "too large count passed" };
return avg_helper(count, arr, std::make_index_sequence<N> {});
}
https://godbolt.org/z/7v1n7zaWq
Now note that in overload resolution a variadic function is matched last. So when the compiler can match the template, it selects it instead of the variadic function.
Note there is a trap: if you pass a pointer (for example, a decayed array), the variadic function kicks in again. As protection, I've added an extra overload which triggers a static_assert warning about array decay.
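The guard overload is not shown in the code above. One way such a guard might look is sketched below, using a deleted overload rather than the static_assert the answer mentions. The constraint on pointer types and the avg_accepts detection helper are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstdarg>
#include <type_traits>
#include <utility>

// C-style variadic average, as in the answer above.
inline int avg(int count, ...) {
    int sum = 0;
    std::va_list args;
    va_start(args, count);
    for (int i = 0; i < count; ++i) sum += va_arg(args, int);
    va_end(args);
    return sum / count;
}

// Hypothetical guard: preferred over the ellipsis whenever the second
// argument is a pointer (for example a decayed array), and deleted so that
// such a call fails to compile instead of reading garbage.
template <typename T, std::enable_if_t<std::is_pointer<T>::value, int> = 0>
int avg(int, T) = delete;

// Detection helper used only to demonstrate the effect at compile time.
template <typename Arg, typename = void>
struct avg_accepts : std::false_type {};

template <typename Arg>
struct avg_accepts<Arg, decltype(void(avg(1, std::declval<Arg>())))>
    : std::true_type {};

static_assert(avg_accepts<int>::value, "plain ints still reach the variadic avg");
static_assert(!avg_accepts<const int*>::value, "pointers are rejected");
```

The constraint keeps a literal 0 from being treated as a null pointer, so avg(1, 0) still reaches the variadic function.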
Can I avoid specifying the number of arguments of a function that takes a variable number of arguments? As an example: can the following interface be implemented?
int sum(...) { ... }
sum(1, 2, 3, 4); // return 10
Conventional variadic functions are messy and not type-safe, but in C++11 you can do this cleanly using variadic templates and (compile-time) recursion:
// Base case for recursion
template <typename T>
inline T sum(T n) {
return n;
}
// Recursive case
template <typename T, typename... Args>
inline T sum(T n, Args... args) {
return n + sum(args...);
}
Since it's a template, this'll work for any types that have an operator+ defined:
std::cout << sum(1, 2, 3) << std::endl; // Prints 6
std::cout << sum(3.14, 2.72) << std::endl; // Prints 5.86
However, because the return type of the recursive template function is taken from the first argument only, you can get surprising results if you mix different argument types in one call: sum(2.5, 2) returns 4.5 as expected, but sum(2, 2.5) returns 4 because the return type is int, not double, so the result 4.5 is truncated. If you want to be fancy, you can use the new alternative function syntax to specify that the return type is whatever the natural type of n + sum(args...) would be:
// Recursive case
template <typename T, typename... Args>
inline auto sum(T n, Args... args) -> decltype(n + sum(args...)) {
return n + sum(args...);
}
Now sum(2.5, 2) and sum(2, 2.5) both return 4.5.
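One caveat with the decltype version: inside its own trailing return type the recursive overload is not yet declared, so a call with three or more built-in arguments, such as sum(1, 2, 3), may fail to compile. A minimal sketch of a workaround, assuming a C++14 compiler, lets the compiler deduce the return type from the body instead:

```cpp
#include <cassert>
#include <type_traits>

// Base case for recursion
template <typename T>
constexpr T sum(T n) {
    return n;
}

// Recursive case: the return type is deduced from n + sum(args...),
// and inside the body the function's own name is already visible.
template <typename T, typename... Args>
constexpr auto sum(T n, Args... args) {
    return n + sum(args...);
}

static_assert(sum(1, 2, 3) == 6, "three ints work");
static_assert(std::is_same<decltype(sum(2, 2.5)), double>::value,
              "mixed arguments yield the natural result type");
```

With a C++17 compiler the recursion can be replaced by a fold expression such as (args + ...).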
If your actual logic is more complex than summation, and you don't want it inlined, you can use the inline template functions to put all the values into some sort of container (such as a std::vector or std::array) and pass that into the non-inline function to do the real work at the end.
You probably want to do this by writing the function to take something like a vector<int>, which you'll construct on the fly with a braced initializer list:
int sum(std::vector<int> const &n) {
return std::accumulate(begin(n), end(n), 0);
}
If there's some possibility the numbers might be (for example) floating point instead, you probably want to write it as a template instead:
template <class T>
T sum(std::vector<T> const &n) {
return std::accumulate(begin(n), end(n), T());
}
Either way, you'd invoke this just marginally differently:
int x = sum({1,2,3,4});
Warning: this feature was added to C++ fairly recently, so some compilers (e.g., VC++) don't support it yet -- though others (e.g., g++ 4.7+), do.
No, you can't.
Just don't use variable arguments. They suck in every conceivable fashion and are completely not worth anybody's time.
A C++ variadic function must know how many (and what type) of arguments it was passed. For example, printf's format string tells it what extra arguments to expect.
Your sum has no way of knowing if it got 4 ints or 10. You could make the 1st argument a length:
int sum(int howmany, ...) { ... }
so the function knows how many ints follow. But really you should just pass an array (or vector if you're feeling C++'y)
There are multiple ways to solve your issue. I'll go over a few:
Method 1:
-Create a series of overloaded sum functions to suit your needs.
Cons
-code bloat
This can be implemented by making multiple functions with headers:
int sum(int a);
int sum(int a, int b);
int sum(int a, int b, int c);
etc...
Method 2:
-create a custom class with a linked list, and pass in a pointer to the head of the linked list. This is probably your best move in this case, assuming you don't know the amount of data to be passed in.
Function header:
int sum(LinkedList *headPointer);
Method 3:
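-pass in an array of variables together with its length (a bare array parameter decays to a pointer and carries no size)
Function header:
int sum(const int input[], std::size_t length);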
Method 4:
-create a function with auto-set variables
Function header:
int sum(int a=0, int b=0, int c=0, int d=0,... int n=0);
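For Method 3, the element count has to reach the function somehow. As a minimal sketch, a function template can take a built-in array by reference and let the compiler deduce its length, so no separate count is passed. This template form is an assumption for illustration and not part of the answer.

```cpp
#include <cassert>
#include <cstddef>

// N is deduced from the array argument, so the size cannot get out of sync.
template <std::size_t N>
constexpr int sum(const int (&input)[N]) {
    int total = 0;
    for (std::size_t i = 0; i < N; ++i) total += input[i];
    return total;
}
```

This only works for real arrays; once an array has decayed to a pointer, the length must be passed explicitly.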