There are many ways to iterate through a std::tuple, most of them similar to a range-based for loop. I want to do something like that, but with the indices of the tuple, so that I can access the corresponding elements of several tuples.
For example, I have a tuple of different types that all provide the same free functions / operators, such as std::tuple<float, std::complex, vec3, vec4>, and I want to perform some operation between two or more such tuples.
I tried to write something like this:
template<typename Lambda, typename... Types, int... Indices>
void TupleIndexElems_Indexed(TTuple<Types...>, Lambda&& Func, TIntegerSequence<int, Indices...>)
{
Func.template operator()<Indices...>();
}
template<typename TupleType, typename Lambda>
void TupleIndexElems(Lambda&& Func)
{
TupleIndexElems_Impl(TupleType{}, Func);
}
template<typename... Types, typename Lambda>
void TupleIndexElems_Impl(TTuple<Types...>, Lambda&& Func)
{
TupleIndexElems_Indexed(TTuple<Types...>{}, Func, TMakeIntegerSequence<int, sizeof...(Types)>{});
}
Usage:
FSkyLightSettings& operator+=(FSkyLightSettings& Other)
{
auto Tup1 = AsTuple();
auto Tup2 = Other.AsTuple();
using TupType = TTuple<float*, FLinearColor*, FLinearColor*>;
auto AddFunc = [] <typename Tup, int Index> (Tup t1, Tup t2)
{
*t1.template Get<Index>() = (*t1.template Get<Index>()) + (*t2.template Get<Index>());
};
TupleIndexElems<TupType>([=]<int... Indices>
{
AddFunc.template operator()<TupType, Indices>(Tup1, Tup2); // How to fold it?
});
return *this;
}
I thought the best way to do it was a variadic lambda template, but when I tried to call it I got stuck, because I could not find a way to use a fold expression.
Are there any elegant solutions for this (for various versions of C++)?
UPD: I've also tried to use a recursive lambda, but that fails with compiler error C3536:
auto PlusVariadic = [=]<int Index, int... Indices>
{
Plus.template operator()<TupType, Index>(Tup1, Tup2); // How to fold it?
if constexpr (Index != 0)
{
PlusVariadic.operator()<Indices...>();
}
};
One convenient way in C++20 I use to iterate tuples is to create a constexpr_for function that calls a lambda with a std::integral_constant parameter to allow indexing, as described in my Achieving 'constexpr for' with indexing post.
#include <utility>
#include <type_traits>
template<size_t Size, typename F>
constexpr void constexpr_for(F&& function) {
auto unfold = [&]<size_t... Ints>(std::index_sequence<Ints...>) {
(std::forward<F>(function)(std::integral_constant<size_t, Ints>{}), ...);
};
unfold(std::make_index_sequence<Size>());
}
example usage:
#include <tuple>
#include <iostream>
int main() {
auto Tup1 = std::make_tuple(1, 2.0, 3ull, 4u);
auto Tup2 = std::make_tuple(1ull, 2.0f, 3.0, (char)4);
constexpr auto size = std::tuple_size_v<decltype(Tup1)>;
constexpr_for<size>([&](auto i) {
std::get<i>(Tup1) += std::get<i>(Tup2);
std::cout << "tuple<" << i << "> = " << std::get<i>(Tup1) << '\n';
});
}
Output:
tuple<0> = 2
tuple<1> = 4
tuple<2> = 6
tuple<3> = 8
Try it out on godbolt.
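For compilers without C++20 template lambdas, roughly the same effect can be had in C++17 by moving the pack expansion into a helper function that receives the std::index_sequence. The following is only a sketch under that assumption; for_each_index and add_assign are hypothetical names, not part of the answer above or of any library.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical helper: calls f(std::integral_constant<std::size_t, I>{})
// once for every I in the index sequence, in order.
template <typename F, std::size_t... Is>
constexpr void for_each_index_impl(F&& f, std::index_sequence<Is...>) {
    (f(std::integral_constant<std::size_t, Is>{}), ...);  // fold over the comma operator
}

template <std::size_t N, typename F>
constexpr void for_each_index(F&& f) {
    for_each_index_impl(std::forward<F>(f), std::make_index_sequence<N>{});
}

// Element-wise += between two tuples of equal size.
template <typename... Ts, typename... Us>
void add_assign(std::tuple<Ts...>& lhs, const std::tuple<Us...>& rhs) {
    static_assert(sizeof...(Ts) == sizeof...(Us), "tuple sizes must match");
    for_each_index<sizeof...(Ts)>([&](auto i) {
        // i converts to a compile-time constant, so it can index std::get
        std::get<i>(lhs) += std::get<i>(rhs);
    });
}
```

Given auto a = std::make_tuple(1, 2.5) and auto b = std::make_tuple(10, 0.5), calling add_assign(a, b) leaves a holding 11 and 3.0.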
I am writing a little variadic summing function (using c++20, but my question would remain the same with c++17 syntax). I would like to make the following code as short and clear as possible (but without using folding expressions. This is only a toy problem, but in later applications I would like to avoid fold expressions):
Additive auto sum(Additive auto&& val, Additive auto&&... vals) {
auto add = [](Additive auto&& val1, Additive auto&& val2) {
return val1 + val2;
}; // neccessary??
if constexpr(sizeof...(vals) == 1) {
return add(val, std::forward<decltype(vals)>(vals)...); // (1)
//return val + std::forward<decltype(vals)>(vals)...; // (2)
}
else return val + sum(std::forward<decltype(vals)>(vals)...);
}
Using line (1) the above code compiles, but it makes the definition of the 'add' lambda necessary. Line (2), however, does not compile; I get the following error with gcc: parameter packs not expanded with ‘...’. If I add parentheses around the std::forward expression in line (2), I get the following error: expected binary operator before ‘)’ token.
Is there any way to pass a parameter pack with length 1 to an operator?
Embrace the power of negative thinking and start induction with zero instead of one:
auto sum(auto &&val, auto &&...vals) {
if constexpr (sizeof...(vals) == 0)
return val;
else
return val + sum(std::forward<decltype(vals)>(vals)...);
}
The above definition has the side effect that sum(x) will now compile and return x. (In fact, you can even make the function work with no arguments, by having it return zero, but then the question arises: zero of which type? To avoid having to go there, I left this case undefined.) If you insist on sum being defined only from arity 2 upwards, you can use this instead:
auto sum(auto &&val0, auto &&val1, auto &&...vals) {
if constexpr (sizeof...(vals) == 0)
return val0 + val1;
else
return val0 + sum(std::forward<decltype(val1)>(val1),
std::forward<decltype(vals)>(vals)...);
}
However, you should probably allow the ‘vacuous’ case whenever it makes sense to do so: it makes for simpler and more general code. Notice for example how in the latter definition the addition operator appears twice: this is effectively duplicating the folding logic between the two cases (in this case it’s just one addition, so it’s relatively simple, but with more complicated operations it might be more burdensome), whereas handling the degenerate case is usually trivial and doesn’t duplicate anything.
(I omitted concept annotations, as they do not seem particularly relevant to the main problem.)
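If abbreviated function templates (auto parameters) are not available, the same base-case-first recursion can be spelled with ordinary template parameters in C++17. This is a sketch only; sum_all is a hypothetical name chosen for the illustration.

```cpp
#include <utility>

// Hypothetical C++17 spelling of the recursion that starts at one argument.
template <typename T, typename... Ts>
constexpr auto sum_all(T&& val, Ts&&... vals) {
    if constexpr (sizeof...(vals) == 0)
        return val;  // only one argument left: return a copy of it
    else
        return val + sum_all(std::forward<Ts>(vals)...);
}

static_assert(sum_all(1, 2, 3) == 6, "evaluated at compile time");
```

As in the version above, sum_all(x) compiles and simply returns x.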
template<class... Additive> decltype(auto) sum(Additive &&...val) {
return (std::forward<Additive>(val) + ...);
}
?
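A unary fold like the one above needs at least one argument. If fold expressions are acceptable at all, a binary fold with a caller-supplied starting value also covers the empty case and sidesteps the "zero of which type?" question raised earlier. A minimal sketch, with sum_from as a hypothetical name:

```cpp
#include <utility>

// Hypothetical helper: the caller picks the starting value, and with it the
// type of the result when no further arguments are given.
template <typename Init, typename... Ts>
constexpr auto sum_from(Init init, Ts&&... vals) {
    return (init + ... + std::forward<Ts>(vals));  // binary left fold
}

static_assert(sum_from(0) == 0, "empty pack yields the starting value");
static_assert(sum_from(0, 1, 2, 3) == 6, "((0 + 1) + 2) + 3");
```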
Offtopic: unsure about Op's real needs, I've accidentally quickdesigned one thing I've been thinking of, from time to time. :D
#include <iostream>
#include <functional>
#include <type_traits>
template<class... Fs> struct Overloads;
template<class F, class... Fs> struct Overloads<F, Fs...>: Overloads<Fs...> {
using Fallback = Overloads<Fs...>;
constexpr Overloads(F &&f, Fs &&...fs): Fallback(std::forward<Fs>(fs)...), f(std::forward<F>(f)) {}
template<class... Args> constexpr decltype(auto) operator()(Args &&...args) const {
if constexpr(std::is_invocable_v<F, Args...>) return std::invoke(f, std::forward<Args>(args)...);
else return Fallback::operator()(std::forward<Args>(args)...);
}
private:
F f;
};
template<class... Fs> Overloads(Fs &&...fs) -> Overloads<Fs...>;
template<class F> struct Overloads<F> {
constexpr Overloads(F &&f): f(std::forward<F>(f)) {}
template<class... Args> constexpr decltype(auto) operator()(Args &&...args) const {
return std::invoke(f, std::forward<Args>(args)...);
}
private:
F f;
};
template<> struct Overloads<> {
template<class... Args> constexpr void operator()(Args &&...) const noexcept {}
};
constexpr int f(int x, int y) noexcept { return x + y; }
void g(int x) { std::cout << x << '\n'; }
template<class... Vals> decltype(auto) omg(Vals &&...vals) {
static constexpr auto fg = Overloads(f, g);
return fg(std::forward<Vals>(vals)...);
}
int main() {
omg(omg(40, 2));
}
>_<
You can unpack the one item into a variable and use that:
if constexpr (sizeof...(vals) == 1) {
auto&& only_value(std::forward<decltype(vals)>(vals)...);
return val + only_value;
}
What's the easiest way to default construct an std::variant from the index of the desired type, when the index is only known at runtime? In other words, I want to write:
const auto indx = std::variant<types...>{someobject}.index();
//...somewhere later, indx having been passed around...
std::variant<types...> var = variant_from_index(indx);
///var is now set to contain a default constructed someobject
Note that indx cannot be made constexpr, so std::in_place_index doesn't work here.
The problem here is of course that since it isn't known which constructor from types... to call at compile time, somehow basically a table of all possible constructors (or maybe default constructed variants to copy from) has to be built at compile time and then accessed at run time. Some template magic is apparently in place here, but what would be the cleanest way?
I tried the following (on coliru), but the index sequence seems to come out wrong (the print in the end gives 2 0 0), and I'm confused as to why:
Edit: it works as fixed below, I had the constexpr array initialization wrong. So the question is now, is there a neater way to do this?
#include <variant>
#include <iostream>
using var_t = std::variant<int, float, const char *>;
//For debug
template<class ...types>
struct WhichType;
template<class T, class U>
struct default_variants;
template<class...Params, std::size_t... I>
struct default_variants<std::variant<Params...>, std::index_sequence<I...>> {
using variant_t = std::variant<Params...>;
//Uncomment to see the index sequence
//WhichType<std::index_sequence<I...>> idx{};
constexpr static variant_t variants[sizeof...(Params)]{variant_t{std::in_place_index<I>}...};
constexpr static std::size_t indices[sizeof...(Params)]{I...};
};
template<class T>
struct default_variants_builder;
template<class...Params>
struct default_variants_builder<std::variant<Params...>> {
using indices = std::make_index_sequence<sizeof...(Params)>;
using type = default_variants<std::variant<Params...>, indices>;
};
int main() {
using builder_t = typename default_variants_builder<var_t>::type;
var_t floatvar{1.2f};
var_t variant2 = builder_t::variants[floatvar.index()];
std::cout << "Contained " << floatvar.index() << "; Now contains " << variant2.index() << "\n";
}
With Boost.Mp11 this is basically a one-liner (as always):
template <typename V>
auto variant_from_index(size_t index) -> V
{
return mp_with_index<mp_size<V>>(index,
[](auto I){ return V(std::in_place_index<I>); });
}
Your description of the problem is accurate - you need a way to turn a runtime index into a compile-time index. mp_with_index does that for you - you give it the runtime index and the maximum compile-time index (mp_size<V> here, which would give the same value as std::variant_size_v<V> if you prefer that instead) and it will invoke a function you provide with the correct constant (I has type integral_constant<size_t, index> here, except with index being a constant expression).
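Without Boost, one way to get the same runtime-to-compile-time dispatch is the table the question hints at: an array of function pointers, one per alternative, built from an index sequence. The sketch below assumes C++17; make_alternative and variant_from_runtime_index are hypothetical names, and this is not how mp_with_index itself is implemented.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <variant>

// Hypothetical factory: default-constructs alternative number I of V.
template <typename V, std::size_t I>
V make_alternative() {
    return V(std::in_place_index<I>);
}

template <typename V, std::size_t... Is>
V variant_from_runtime_index_impl(std::size_t index, std::index_sequence<Is...>) {
    using factory = V (*)();
    // One entry per alternative, filled in at compile time.
    static constexpr std::array<factory, sizeof...(Is)> table{{&make_alternative<V, Is>...}};
    if (index >= table.size())
        throw std::out_of_range("variant index out of range");
    return table[index]();  // runtime lookup, compile-time index inside
}

template <typename V>
V variant_from_runtime_index(std::size_t index) {
    return variant_from_runtime_index_impl<V>(
        index, std::make_index_sequence<std::variant_size_v<V>>{});
}
```

The lookup is a single indexed call, at the cost of one small function instantiation per alternative.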
How about this?
template <class Variant, std::size_t I = 0>
Variant variant_from_index(std::size_t index) {
if constexpr(I >= std::variant_size_v<Variant>)
throw std::runtime_error{"Variant index " + std::to_string(I + index) + " out of bounds"};
else
return index == 0
? Variant{std::in_place_index<I>}
: variant_from_index<Variant, I + 1>(index - 1);
}
See it live on Wandbox
Not sure if this is very elegant or not but I think it works:
#include <variant>
#include <stdexcept>
#include <iostream>
template<typename V, std::size_t N = std::variant_size_v<V>>
struct variant_by_index {
V make_default(std::size_t i) {
if (i >= std::variant_size_v<V>) {
throw std::invalid_argument("bad type index.");
}
constexpr size_t index = std::variant_size_v<V> - N;
if (i == index) {
return std::variant_alternative_t<index, V>();
} else {
return variant_by_index<V, N - 1>().make_default(i);
}
}
};
template<typename V>
struct variant_by_index<V, 0> {
V make_default(std::size_t i) {
throw std::bad_variant_access(); // bad_variant_access has no constructor taking a message
}
};
using var_t = std::variant<int, float, const char *>;
int main() {
variant_by_index<var_t> type_indexer;
var_t my_var_0 = type_indexer.make_default(0);
std::cout << "my_var_0 has int? " << std::holds_alternative<int>(my_var_0) << "\n";
var_t my_var_1 = type_indexer.make_default(1);
std::cout << "my_var_1 has float? " << std::holds_alternative<float>(my_var_1) << "\n";
try {
var_t my_var_1 = type_indexer.make_default(3);
} catch(const std::invalid_argument&) { // index 3 is rejected by the range check in make_default
std::cout << "Could not create with type 3.\n";
}
return 0;
}
I believe a (somewhat) elegant way might be using a more general idiom for choosing a numeric template parameter value at run time, as discussed in this question:
Idiom for simulating run-time numeric template parameters?
The foo function there will be std::get<std::size_t I> (or a lambda which captures the variant and takes no arguments).
(This question has been dramatically edited from the original, without changing the real intent of the original question)
If we add up all the elements in a vector<int>, then the answer could overflow, requiring something like intmax_t to store the answer accurately and without overflow. But intmax_t isn't suitable for vector<double>.
I could manually specify the types:
template<typename>
struct sum_traits;
template<>
struct sum_traits<int> {
typedef long accumulate_safely_t;
};
and then use them as follows:
template<typename C>
auto sum(const C& c) {
typename sum_traits<typename C::value_type>::accumulate_safely_t r = 0;
for(auto &el: c)
r += el;
return r;
}
My questions: Is it possible to automatically identify a suitable type, a large and accurate type, so I don't have to manually specify each one via the type trait?
The main problem with your code is that auto r = 0 is equivalent to int r = 0. That's not how your C++98 code worked. In general, you can't find a perfect target type. Your code is just a variant of std::accumulate, so we can look at how the Standard solved this problem: it allows you to pass in the initial value for the accumulator, but also its type: long sum = std::accumulate(begin, end, long{0});
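To make the point about the initial value concrete: the type of the third argument to std::accumulate is the type of the accumulator. A small sketch, where sum_as is a hypothetical convenience wrapper and not a standard function:

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical wrapper: the caller names the accumulator type explicitly.
template <typename Acc, typename Container>
Acc sum_as(const Container& c) {
    // Acc{0} fixes both the starting value and the type used for every addition.
    return std::accumulate(c.begin(), c.end(), Acc{0});
}
```

For a std::vector<int> holding two values of 2000000000, sum_as<std::int64_t>(v) yields 4000000000, whereas accumulating into a 32-bit int would overflow.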
Given:
If we add up all the elements in a vector, then the answer could overflow, requiring something like intmax_t to store the answer accurately and without overflow.
Question:
My questions: Is it possible to automatically identify a suitable type, a large and accurate type, so I don't have to manually specify each one via the type trait?
The problem here is that you want to take runtime data (a vector) and from it deduce a type (a compile-time thing).
Since type deduction is a compile-time operation, we must use only the information available to us at compile time to make this decision.
The only information we have at compile-time (unless you supply more) is std::numeric_limits<int>::max() and std::numeric_limits<std::vector<int>::size_type>::max().
You don't even have std::vector<int>::max_size() at this stage, as it's not mandated to be constexpr. Neither can you rely on std::vector<int>::allocator_type::max_size() because it's:
a member function
optional
deprecated in c++17
So what we're left with is a maximum possible sum of:
std::numeric_limits<int>::max() * std::numeric_limits<std::vector<int>::size_type>::max()
We could now use a compile-time disjunction (something involving std::conditional) to find an appropriate integer type, if such a type exists.
This doesn't make the type adapt to runtime conditions, but it will at least adapt to the architecture for which you're compiling.
Something like this:
#include <cstddef>
#include <cstdint>
#include <functional>
#include <iostream>
#include <limits>
#include <numeric>
#include <type_traits>
#include <utility>
#include <vector>
template <bool Signed, unsigned long long NofBits>
struct smallest_integer
{
template<std::size_t Bits, class...Candidates>
struct select_candidate;
template<std::size_t Bits, class...Candidates>
using select_candidate_t = typename select_candidate<Bits, Candidates...>::type;
template<std::size_t Bits, class Candidate, class...Rest>
struct select_candidate<Bits, Candidate, Rest...>
{
using type = std::conditional_t<std::numeric_limits<Candidate>::digits >= Bits, Candidate, select_candidate_t<Bits, Rest...>>;
};
template<std::size_t Bits, class Candidate>
struct select_candidate<Bits, Candidate>
{
using type = std::conditional_t<std::numeric_limits<Candidate>::digits >= Bits, Candidate, void>;
};
using type =
std::conditional_t<Signed,
select_candidate_t<NofBits, std::int8_t, std::int16_t, std::int32_t, std::int64_t, __int128_t>,
select_candidate_t<NofBits, std::uint8_t, std::uint16_t, std::uint32_t, std::uint64_t, __uint128_t>>;
};
template<bool Signed, unsigned long long NofBits> using smallest_integer_t = typename smallest_integer<Signed, NofBits>::type;
template<class L, class R>
struct result_of_multiply
{
static constexpr auto lbits = std::numeric_limits<L>::digits;
static constexpr auto rbits = std::numeric_limits<R>::digits;
static constexpr auto is_signed = std::numeric_limits<L>::is_signed or std::numeric_limits<R>::is_signed;
static constexpr auto result_bits = lbits + rbits;
using type = smallest_integer_t<is_signed, result_bits>;
};
template<class L, class R> using result_of_multiply_t = typename result_of_multiply<L, R>::type;
struct safe_multiply
{
template<class L, class R>
auto operator()(L const& l, R const& r) const -> result_of_multiply_t<L, R>
{
return result_of_multiply_t<L, R>(l) * result_of_multiply_t<L, R>(r);
}
};
template<class T>
auto accumulate_values(const std::vector<T>& v)
{
using result_type = result_of_multiply_t<T, decltype(std::declval<std::vector<T>>().max_size())>;
return std::accumulate(v.begin(), v.end(), result_type(0), std::plus<>());
}
struct uint128_t_printer
{
std::ostream& operator()(std::ostream& os) const
{
auto n = n_;
if (n == 0) return os << '0';
char str[40] = {0}; // log10(1 << 128) + '\0'
char *s = str + sizeof(str) - 1; // start at the end
while (n != 0) {
*--s = "0123456789"[n % 10]; // save last digit
n /= 10; // drop it
}
return os << s;
}
__uint128_t n_;
};
std::ostream& operator<<(std::ostream& os, const uint128_t_printer& p)
{
return p(os);
}
auto output(__uint128_t n)
{
return uint128_t_printer{n};
}
int main()
{
using rtype = result_of_multiply<std::size_t, unsigned>;
std::cout << rtype::is_signed << std::endl;
std::cout << rtype::lbits << std::endl;
std::cout << rtype::rbits << std::endl;
std::cout << rtype::result_bits << std::endl;
std::cout << std::numeric_limits<rtype::type>::digits << std::endl;
std::vector<int> v { 1, 2, 3, 4, 5, 6 };
auto z = accumulate_values(v);
std::cout << output(z) << std::endl;
auto i = safe_multiply()(std::numeric_limits<unsigned>::max(), std::numeric_limits<unsigned>::max());
std::cout << i << std::endl;
}
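If the widest built-in type is good enough, a much smaller trait in the same spirit can be built from two nested std::conditional_t selections. This is a sketch for arithmetic element types only; wide_accumulator and wide_sum are hypothetical names, and std::intmax_t can of course still overflow for large enough inputs.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>
#include <vector>

// Hypothetical trait: picks the widest standard type of the same "kind" as T.
template <typename T>
struct wide_accumulator {
    static_assert(std::is_arithmetic_v<T>, "only arithmetic types are handled");
    using type = std::conditional_t<
        std::is_floating_point_v<T>, long double,
        std::conditional_t<std::is_signed_v<T>, std::intmax_t, std::uintmax_t>>;
};

template <typename T>
using wide_accumulator_t = typename wide_accumulator<T>::type;

template <typename C>
auto wide_sum(const C& c) {
    wide_accumulator_t<typename C::value_type> r = 0;  // accumulator chosen by the trait
    for (const auto& el : c)
        r += el;
    return r;
}

static_assert(std::is_same_v<wide_accumulator_t<int>, std::intmax_t>, "signed integers widen to intmax_t");
static_assert(std::is_same_v<wide_accumulator_t<float>, long double>, "floating point widens to long double");
```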
You can use return type deduction in C++14 just like this:
template<typename C>
auto sum(const C& c) {
auto r = 0;
for(auto &el: c)
r += el;
return r;
}
In C++11, considering your C++98 code, you may use the following:
template<typename C>
auto sum(const C& c) -> typename C::value_type {
auto r = 0;
for(auto &el: c)
r += el;
return r;
}
But, as pointed out in the comments, auto r = 0; will still resolve to int at compile time. As proposed in another answer, you may want to make the initial value type (and so the return value type) a template parameter as well:
template<typename C, typename T>
T sum(const C& c, T init) {
for(auto &el: c)
init += el;
return init;
}
// usage
std::vector<std::string> v({"Hello ", "World ", "!!!"});
std::cout << sum(v, std::string{});
I am new to C++11. I am writing the following recursive lambda function, but it doesn't compile.
sum.cpp
#include <iostream>
#include <functional>
auto term = [](int a)->int {
return a*a;
};
auto next = [](int a)->int {
return ++a;
};
auto sum = [term,next,&sum](int a, int b)mutable ->int {
if(a>b)
return 0;
else
return term(a) + sum(next(a),b);
};
int main(){
std::cout<<sum(1,10)<<std::endl;
return 0;
}
compilation error:
vimal@linux-718q:~/Study/09C++/c++0x/lambda> g++ -std=c++0x sum.cpp
sum.cpp: In lambda function:
sum.cpp:18:36: error: ‘((<lambda(int, int)>*)this)-><lambda(int, int)>::sum’ cannot be used as a function
gcc version
gcc version 4.5.0 20091231 (experimental) (GCC)
But if I change the declaration of sum() as below, it works:
std::function<int(int,int)> sum = [term,next,&sum](int a, int b)->int {
if(a>b)
return 0;
else
return term(a) + sum(next(a),b);
};
Could someone please throw light on this?
Think about the difference between the auto version and the fully specified type version. The auto keyword infers its type from whatever it's initialized with, but what you're initializing it with needs to know what its type is (in this case, the lambda closure needs to know the types it's capturing). Something of a chicken-and-egg problem.
On the other hand, a fully specified function object's type doesn't need to "know" anything about what is being assigned to it, and so the lambda's closure can likewise be fully informed about the types it's capturing.
Consider this slight modification of your code and it may make more sense:
std::function<int(int, int)> sum;
sum = [term, next, &sum](int a, int b) -> int {
if (a > b)
return 0;
else
return term(a) + sum(next(a), b);
};
Obviously, this wouldn't work with auto. Recursive lambda functions work perfectly well (at least they do in MSVC, where I have experience with them), it's just that they aren't really compatible with type inference.
The trick is to feed in the lambda implementation to itself as a parameter, not by capture.
const auto sum = [term, next](int a, int b) {
auto sum_impl = [term, next](int a, int b, auto& sum_ref) mutable {
if (a > b) {
return 0;
}
return term(a) + sum_ref(next(a), b, sum_ref);
};
return sum_impl(a, b, sum_impl);
};
All problems in computer science can be solved by another level of indirection. I first found this easy trick at http://pedromelendez.com/blog/2015/07/16/recursive-lambdas-in-c14/
It does require C++14 while the question is on C++11, but perhaps interesting to most.
Here's the full example at Godbolt.
Going via std::function is also possible but can result in slower code. But not always. Have a look at the answers to std::function vs template
This is not just a peculiarity about C++,
it's directly mapping to the mathematics of lambda calculus. From Wikipedia:
Lambda calculus cannot express this as directly as some other
notations:
all functions are anonymous in lambda calculus, so we can't refer to a
value which is yet to be defined, inside the lambda term defining that
same value. However, recursion can still be achieved by arranging for a
lambda expression to receive itself as its argument value
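In C++14 terms, "receiving itself as its argument value" can look like the following sketch. The function name factorial and the parameter name self are illustrative choices only; the explicit -> int return type is needed so the recursive call does not depend on a return type that is still being deduced.

```cpp
// The lambda never names itself; it is handed its own closure as `self`.
int factorial(int n) {
    auto fact = [](const auto& self, int k) -> int {
        return k <= 1 ? 1 : k * self(self, k - 1);
    };
    return fact(fact, n);  // pass the lambda to itself
}
```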
With C++14, it is now quite easy to make an efficient recursive lambda without having to incur the additional overhead of std::function, in just a few lines of code:
template <class F>
struct y_combinator {
F f; // the lambda will be stored here
// a forwarding operator():
template <class... Args>
decltype(auto) operator()(Args&&... args) const {
// we pass ourselves to f, then the arguments.
return f(*this, std::forward<Args>(args)...);
}
};
// helper function that deduces the type of the lambda:
template <class F>
y_combinator<std::decay_t<F>> make_y_combinator(F&& f) {
return {std::forward<F>(f)};
}
with which your original sum attempt becomes:
auto sum = make_y_combinator([term,next](auto sum, int a, int b) -> int {
if (a>b) {
return 0;
}
else {
return term(a) + sum(next(a),b);
}
});
In C++17, with CTAD, we can add a deduction guide:
template <class F> y_combinator(F) -> y_combinator<F>;
Which obviates the need for the helper function. We can just write y_combinator{[](auto self, ...){...}} directly.
In C++20, with CTAD for aggregates, the deduction guide won't be necessary.
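Put together for C++17, the combinator plus deduction guide might be used like this. The sketch repeats the struct so that it is self-contained; fib is a hypothetical example function, not part of the original question.

```cpp
#include <utility>

template <class F>
struct y_combinator {
    F f;  // the lambda is stored here
    template <class... Args>
    decltype(auto) operator()(Args&&... args) const {
        return f(*this, std::forward<Args>(args)...);  // pass ourselves first
    }
};

// C++17 deduction guide, so that y_combinator{lambda} works directly.
template <class F> y_combinator(F) -> y_combinator<F>;

int fib(int n) {
    // `self` is the combinator; the explicit -> int keeps return type
    // deduction from becoming circular.
    auto impl = y_combinator{[](auto self, int k) -> int {
        return k < 2 ? k : self(k - 1) + self(k - 2);
    }};
    return impl(n);
}
```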
In C++23, with deducing this, you don't need a Y-combinator at all:
auto sum = [term,next](this auto const& sum, int a, int b) -> int {
if (a>b) {
return 0;
}
else {
return term(a) + sum(next(a),b);
}
};
I have another solution, but it works only with stateless lambdas:
void f()
{
static int (*self)(int) = [](int i)->int { return i>0 ? self(i-1)*i : 1; };
std::cout<<self(10);
}
The trick here is that lambdas can access static variables, and stateless lambdas can be converted to function pointers.
You can use it with standard lambdas:
void g()
{
int sum = 0; // initialized, because inner() reads it through +=
auto rec = [&sum](int i) -> int
{
static int (*inner)(int&, int) = [](int& _sum, int i)->int
{
_sum += i;
return i>0 ? inner(_sum, i-1)*i : 1;
};
return inner(sum, i);
};
}
It works in GCC 4.7.
To make lambda recursive without using external classes and functions (like std::function or fixed-point combinator) one can use the following construction in C++14 (live example):
#include <cstddef>
#include <utility>
#include <list>
#include <memory>
#include <string>
#include <iostream>
int main()
{
struct tree
{
int payload;
std::list< tree > children = {}; // std::list of incomplete type is allowed
};
std::size_t indent = 0;
// indication of result type here is essential
const auto print = [&] (const auto & self, const tree & node) -> void
{
std::cout << std::string(indent, ' ') << node.payload << '\n';
++indent;
for (const tree & t : node.children) {
self(self, t);
}
--indent;
};
print(print, {1, {{2, {{8}}}, {3, {{5, {{7}}}, {6}}}, {4}}});
}
prints:
1
 2
  8
 3
  5
   7
  6
 4
Note, result type of lambda should be specified explicitly.
You can make a lambda function call itself recursively. The only thing you need to do is to reference it through a function wrapper so that the compiler knows its return and argument types (you can't capture a variable -- the lambda itself -- that hasn't been defined yet).
function<int (int)> f;
f = [&f](int x) {
if (x == 0) return 0;
return x + f(x-1);
};
printf("%d\n", f(10));
Be very careful not to run out of the scope of the wrapper f.
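One way to stay on the safe side is to keep the wrapper and every call to it inside the same scope, as in this sketch (triangular is a hypothetical function name):

```cpp
#include <functional>

// f and all calls to it live in this function, so the reference captured by
// the lambda never dangles.
int triangular(int n) {
    std::function<int(int)> f;
    f = [&f](int x) -> int {
        if (x == 0) return 0;
        return x + f(x - 1);
    };
    return f(n);
}
```

Returning f, or a copy of it, from triangular would be the dangerous case: the copied lambda would still refer to the local f, which is destroyed when the function returns.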
I ran a benchmark comparing a recursive function vs a recursive lambda function using the std::function<> capture method. With full optimizations enabled on clang version 4.1, the lambda version ran significantly slower.
#include <iostream>
#include <functional>
#include <chrono>
uint64_t sum1(int n) {
return (n <= 1) ? 1 : n + sum1(n - 1);
}
std::function<uint64_t(int)> sum2 = [&] (int n) {
return (n <= 1) ? 1 : n + sum2(n - 1);
};
auto const ITERATIONS = 10000;
auto const DEPTH = 100000;
template <class Func, class Input>
void benchmark(Func&& func, Input&& input) {
auto t1 = std::chrono::high_resolution_clock::now();
for (auto i = 0; i != ITERATIONS; ++i) {
func(input);
}
auto t2 = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(t2-t1).count();
std::cout << "Duration: " << duration << std::endl;
}
int main() {
benchmark(sum1, DEPTH);
benchmark(sum2, DEPTH);
}
Produces results:
Duration: 0 // regular function
Duration: 4027 // lambda function
(Note: I also confirmed with a version that took the inputs from cin, so as to eliminate compile time evaluation)
Clang also produces a compiler warning:
main.cc:10:29: warning: variable 'sum2' is uninitialized when used within its own initialization [-Wuninitialized]
Which is expected, and safe, but should be noted.
It's great to have a solution in our toolbelts, but I think the language will need a better way to handle this case if performance is to be comparable to current methods.
Note:
As a commenter pointed out, it seems latest version of VC++ has found a way to optimize this to the point of equal performance. Maybe we don't need a better way to handle this, after all (except for syntactic sugar).
Also, as some other SO posts have outlined in recent weeks, the performance of std::function<> itself may be the cause of slowdown vs calling function directly, at least when the lambda capture is too large to fit into some library-optimized space std::function uses for small-functors (I guess kinda like the various short string optimizations?).
Here is a refined version of the Y-combinator solution based on one proposed by @Barry.
template <class F>
struct recursive {
F f;
template <class... Ts>
decltype(auto) operator()(Ts&&... ts) const { return f(std::ref(*this), std::forward<Ts>(ts)...); }
template <class... Ts>
decltype(auto) operator()(Ts&&... ts) { return f(std::ref(*this), std::forward<Ts>(ts)...); }
};
template <class F> recursive(F) -> recursive<F>;
auto const rec = [](auto f){ return recursive{std::move(f)}; };
To use this, one could do the following
auto fib = rec([&](auto&& fib, int i) {
// implementation detail omitted.
});
It is similar to the let rec keyword in OCaml, although not the same.
In C++23 deducing this (P0847) will be added:
auto f = [](this auto& self, int i) -> int
{
return i > 0 ? self(i - 1) + i : 0;
};
For now it's only available in EDG eccp and (partially) available in MSVC:
https://godbolt.org/z/f3E3xT3fY
This is a slightly simpler implementation of the fixpoint operator which makes it a little more obvious exactly what's going on.
#include <iostream>
#include <functional>
using namespace std;
template<typename T, typename... Args>
struct fixpoint
{
typedef function<T(Args...)> effective_type;
typedef function<T(const effective_type&, Args...)> function_type;
function_type f_nonr;
T operator()(Args... args) const
{
return f_nonr(*this, args...);
}
fixpoint(const function_type& p_f)
: f_nonr(p_f)
{
}
};
int main()
{
auto fib_nonr = [](const function<int(int)>& f, int n) -> int
{
return n < 2 ? n : f(n-1) + f(n-2);
};
auto fib = fixpoint<int,int>(fib_nonr);
for (int i = 0; i < 6; ++i)
{
cout << fib(i) << '\n';
}
}
C++ 14:
Here is a recursive, anonymous, stateless (capture-free) generic set of lambdas
that outputs all numbers from 1 to 20:
([](auto f, auto n, auto m) {
f(f, n, m);
})(
[](auto f, auto n, auto m) -> void
{
cout << typeid(n).name() << el;
cout << n << el;
if (n<m)
f(f, ++n, m);
},
1, 20);
If I understand correctly, this uses the Y-combinator approach.
And here is the sum(n, m) version:
auto sum = [](auto n, auto m) {
return ([](auto f, auto n, auto m) {
int res = f(f, n, m);
return res;
})(
[](auto f, auto n, auto m) -> int
{
if (n > m)
return 0;
else {
int sum = n + f(f, n + 1, m);
return sum;
}
},
n, m); };
auto result = sum(1, 10); //result == 55
Here's a benchmark showing that a recursive lambda with a small body has almost the same performance as an ordinary recursive function that can call itself directly.
#include <iostream>
#include <chrono>
#include <type_traits>
#include <functional>
#include <atomic>
#include <cmath>
using namespace std;
using namespace chrono;
unsigned recursiveFn( unsigned x )
{
if( x ) [[likely]]
return recursiveFn( x - 1 ) + recursiveFn( x - 1 );
else
return 0;
};
atomic_uint result;
int main()
{
auto perf = []( function<void ()> fn ) -> double
{
using dur_t = high_resolution_clock::duration;
using urep_t = make_unsigned_t<dur_t::rep>;
high_resolution_clock::duration durMin( (urep_t)-1 >> 1 );
for( unsigned r = 10; r--; )
{
auto start = high_resolution_clock::now();
fn();
dur_t dur = high_resolution_clock::now() - start;
if( dur < durMin )
durMin = dur;
}
return durMin.count() / 1.0e9;
};
auto recursiveLamdba = []( auto &self, unsigned x ) -> unsigned
{
if( x ) [[likely]]
return self( self, x - 1 ) + self( self, x - 1 );
else
return 0;
};
constexpr unsigned DEPTH = 28;
double
tLambda = perf( [&]() { ::result = recursiveLamdba( recursiveLamdba, DEPTH ); } ),
tFn = perf( [&]() { ::result = recursiveFn( DEPTH ); } );
cout << trunc( 1000.0 * (tLambda / tFn - 1.0) + 0.5 ) / 10.0 << "%" << endl;
}
On my AMD Zen1 CPU with current MSVC, recursiveFn is about 10% faster. On my Phenom II x4 945 with g++ 11.1.x, both functions have the same performance.
Keep in mind that this is almost the worst case, since the body of the function is very small. With a larger body, the recursive call itself accounts for a smaller share of the run time.
You're trying to capture a variable (sum) you're in the middle of defining. That can't be good.
I don't think truly self-recursive C++0x lambdas are possible. You should be able to capture other lambdas, though.
Here is the final answer for the OP. Visual Studio 2010 does not support capturing global variables, and you do not need to capture them anyway, because a global variable is accessible everywhere by definition. The following answer uses local variables instead.
#include <functional>
#include <iostream>
template<typename T>
struct t2t
{
typedef T t;
};
template<typename R, typename V1, typename V2>
struct fixpoint
{
typedef std::function<R (V1, V2)> func_t;
typedef std::function<func_t (func_t)> tfunc_t;
typedef std::function<func_t (tfunc_t)> yfunc_t;
class loopfunc_t {
public:
func_t operator()(loopfunc_t v)const {
return func(v);
}
template<typename L>
loopfunc_t(const L &l):func(l){}
typedef V1 Parameter1_t;
typedef V2 Parameter2_t;
private:
std::function<func_t (loopfunc_t)> func;
};
static yfunc_t fix;
};
template<typename R, typename V1, typename V2>
typename fixpoint<R, V1, V2>::yfunc_t fixpoint<R, V1, V2>::fix = [](tfunc_t f) -> func_t {
return [f](fixpoint<R, V1, V2>::loopfunc_t x){ return f(x(x)); }
([f](fixpoint<R, V1, V2>::loopfunc_t x) -> fixpoint<R, V1, V2>::func_t{
auto &ff = f;
return [ff, x](t2t<decltype(x)>::t::Parameter1_t v1,
t2t<decltype(x)>::t::Parameter2_t v2){
return ff(x(x))(v1, v2);
};
});
};
int _tmain(int argc, _TCHAR* argv[])
{
auto term = [](int a)->int {
return a*a;
};
auto next = [](int a)->int {
return ++a;
};
auto sum = fixpoint<int, int, int>::fix(
[term,next](std::function<int (int, int)> sum1) -> std::function<int (int, int)>{
auto &term1 = term;
auto &next1 = next;
return [term1, next1, sum1](int a, int b)mutable ->int {
if(a>b)
return 0;
else
return term1(a) + sum1(next1(a),b);
};
});
std::cout<<sum(1,10)<<std::endl; //385
return 0;
}
This answer is inferior to Yankes' one, but still, here it goes:
using dp_type = void (*)();
using fp_type = void (*)(dp_type, unsigned, unsigned);
fp_type fp = [](dp_type dp, unsigned const a, unsigned const b) {
::std::cout << a << ::std::endl;
return reinterpret_cast<fp_type>(dp)(dp, b, a + b);
};
fp(reinterpret_cast<dp_type>(fp), 0, 1);
You need a fixed point combinator. See this.
or look at the following code:
//As decltype(variable)::member_name is invalid currently,
//the following template is a workaround.
//Usage: t2t<decltype(variable)>::t::member_name
template<typename T>
struct t2t
{
typedef T t;
};
template<typename R, typename V>
struct fixpoint
{
typedef std::function<R (V)> func_t;
typedef std::function<func_t (func_t)> tfunc_t;
typedef std::function<func_t (tfunc_t)> yfunc_t;
class loopfunc_t {
public:
func_t operator()(loopfunc_t v)const {
return func(v);
}
template<typename L>
loopfunc_t(const L &l):func(l){}
typedef V Parameter_t;
private:
std::function<func_t (loopfunc_t)> func;
};
static yfunc_t fix;
};
template<typename R, typename V>
typename fixpoint<R, V>::yfunc_t fixpoint<R, V>::fix =
[](fixpoint<R, V>::tfunc_t f) -> fixpoint<R, V>::func_t {
fixpoint<R, V>::loopfunc_t l = [f](fixpoint<R, V>::loopfunc_t x) ->
fixpoint<R, V>::func_t{
//f cannot be captured since it is not a local variable
//of this scope. We need a new reference to it.
auto &ff = f;
//We need struct t2t because template parameter
//V is not accessible at this level.
return [ff, x](t2t<decltype(x)>::t::Parameter_t v){
return ff(x(x))(v);
};
};
return l(l);
};
int _tmain(int argc, _TCHAR* argv[])
{
int v = 0;
std::function<int (int)> fac =
fixpoint<int, int>::fix([](std::function<int (int)> f)
-> std::function<int (int)>{
return [f](int i) -> int{
if(i==0) return 1;
else return i * f(i-1);
};
});
int i = fac(10);
std::cout << i; //3628800
return 0;
}