Are there any parts of the standard library that would support the following use case:
You have N collections, each potentially of a different collection type (C1, C2, ..., Cn), all of which support begin(), end() and iteration (e.g. vector, deque, etc.).
Each of these collections can contain objects of a different type, i.e. the collections are C1<A>, C2<B>, C3<C>, and they can be of different sizes.
In addition, all of these types can be ordered via a timestamp, but the way that each of these items stores its timestamp is different. For example, type A has a member A.timestamp, B has a member B.TimeStamp, C has a function C.GetTimestamp().
Each of the collections is already ordered by its timestamp.
What I want to do is iterate over all items in all collections, in order, using the ordering functions, and call another function, i.e. a visit function: std::function<void(A &)> for items of type A, std::function<void(B&)> for items of type B, etc.
I then want to make the calls for each item in timestamp order. For example:
class A
{
public:
time_t timeStamp;
int length;
};
class B
{
public:
B(time_t _tm, std::string _name):timeStamp(_tm), name(_name){}
time_t GetTimestamp() { return timeStamp; }
std::string GetName() { return name; }
private:
time_t timeStamp;
std::string name;
};
std::vector<A> listA { {1, 4}, {5, 7}, {8, 9} };
std::deque<B> listB { B(0,"bob"), B(3, "Frank") };
// iterate over listA and listB in time sequential order
// for items in A call [](const A&a) { std::cout << a.length << std::endl; }
// for items in B call [](const B&b) { std::cout << b.name << std::endl; }
// Output would be:
// bob
// 4
// Frank
// 7
// 9
My thought on implementation would be to, define a base class which is templated by the type that the ordering function returns:
template<typename O>
class VisitedCollection
{
public:
virtual ~VisitedCollection() = default;
virtual bool end() = 0; // returns true if we are at the end of the collection
virtual O next_order_measure() = 0; // returns a value that can be used for ordering
virtual void visit() = 0;
};
class VisitedCollectionA: public VisitedCollection<time_t>
{
public:
VisitedCollectionA(std::vector<A> &&a): items(std::move(a))
{
next_item = items.begin();
}
bool end() override { return next_item == items.end(); }
time_t next_order_measure() override { return next_item->timeStamp; }
// visit the current item, then advance to the next one
void visit() override { std::cout << next_item->length << std::endl; ++next_item; }
private:
std::vector<A> items; // owns the moved-in vector
std::vector<A>::iterator next_item;
};
... similar for class B
... could also add a class C, D, etc
Now I can create a collection of VisitedCollection<time_t>, and add VisitedCollectionA, and VisitedCollectionB. This collection of collections would:
Start by looking at the return value of the ordering function for the first item in each collection. Whichever one has the smallest value, call its visitor function. Then find the collection whose next item has the lowest ordering value. On ties in the ordering function, iterate the collection which comes first in the "collection" of collections. Once a collection hits 'end', it's removed from the iteration.
I'm considering rolling my own, but wanted to know if there was already something like this in the standard library. I could even make visit() a lambda, which would allow the VisitedCollectionA types be templates, which take an ordering type and a collection type, that would allow the creation of the main visitor to be initialized with something like
{
VisitedCollection<time_t, std::vector<A>>
(
vecA,
[](){ return next_item->timeStamp; },
[](const A&a) { std::cout << a << std::endl; }
),
VisitedCollection<time_t, std::deque<B>>
(
deqB,
[](){ return next_item->GetTimestamp(); },
[](const B&b) { std::cout << b << std::endl; }
)
}
This feels a bit like a mixture of variant and ranges
Does something like this exist?
There's no straightforward solution in the C++ std (or any other std, I guess?) but some support for a self-made solution can be found there anyway. Even though suboptimal, it should be easy to understand and rewrite if needed.
First of all, you need const getters (omitted here) and a unified interface for getting timestamps out of the types iterated:
time_t timestamp(A const& a) { return a.timeStamp; }
time_t timestamp(B const& b) { return b.GetTimestamp(); }
template<typename... Ts> time_t timestamp(std::variant<Ts...> const& v) {
return visit([](auto&& e) { return timestamp(e); }, v);
} // see below
The basic idea is putting references to all elements into a single container (as variants), sorting and then visiting them:
void visit_sorted_timestamps(auto visitor, auto&&... ranges) {
std::vector<std::variant<
std::reference_wrapper<std::ranges::range_value_t<decltype(ranges)>>...
>> mixed;
mixed.reserve((... + size(ranges)));
(..., mixed.insert(end(mixed), begin(ranges), end(ranges)));
// stable_sort keeps elements of earlier ranges first when timestamps are equal
std::stable_sort(begin(mixed), end(mixed), [](auto&& v1, auto&& v2) {
return timestamp(v1) < timestamp(v2);
});
for (auto&& v: mixed) visit(visitor, v);
}
Usage example:
int main() {
visit_sorted_timestamps(Overload{
[](A const& a) { std::cout << a.length << '\n'; },
[](B const& b) { std::cout << b.GetName() << '\n'; }
}, listA, listB);
}
If you don't have the lambda "overloading" struct yet, here it goes:
template<typename... Fs> struct Overload: Fs... { using Fs::operator()...; };
template<typename... Fs> Overload(Fs...) -> Overload<Fs...>;
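Since every collection is already sorted, copying references into one vector and sorting them is not strictly necessary; a lazy merge can visit the elements directly. Below is a minimal sketch for exactly two ranges, assuming C++14 or later. The names visit_merged, stamp, Sample and Named are made up for this illustration and are not standard library facilities; Sample and Named merely stand in for A and B. Ties go to the first range, as the question asks.

```cpp
#include <deque>
#include <string>
#include <vector>

// Visits two already-sorted ranges in ascending key order without copying.
// `key` must be callable on elements of both ranges, and so must `visitor`.
// On equal keys the element from the first range is visited first.
template <typename R1, typename R2, typename Key, typename Visitor>
void visit_merged(R1& r1, R2& r2, Key key, Visitor visitor) {
    auto it1 = r1.begin();
    auto it2 = r2.begin();
    while (it1 != r1.end() || it2 != r2.end()) {
        // Take from r1 when r2 is exhausted, or when r1's key is not larger.
        const bool take_first =
            it2 == r2.end() ||
            (it1 != r1.end() && !(key(*it2) < key(*it1)));
        if (take_first) visitor(*it1++);
        else            visitor(*it2++);
    }
}

// Hypothetical element types standing in for A and B from the question.
struct Sample { long timeStamp; int length; };
struct Named  { long ts; std::string name; };

// Uniform access to the timestamp, one overload per type.
long stamp(const Sample& s) { return s.timeStamp; }
long stamp(const Named& n)  { return n.ts; }

// Records the visit order as text, using the data from the question.
std::string merged_order() {
    std::vector<Sample> a{{1, 4}, {5, 7}, {8, 9}};
    std::deque<Named>   b{{0, "bob"}, {3, "Frank"}};
    std::string out;
    struct Visitor {
        std::string& out;
        void operator()(const Sample& s) const { out += std::to_string(s.length) + ' '; }
        void operator()(const Named& n) const { out += n.name + ' '; }
    };
    visit_merged(a, b, [](const auto& e) { return stamp(e); }, Visitor{out});
    return out;
}
```

For more than two collections the same idea generalizes to a priority queue of "current iterator" entries, at the cost of some type erasure; the two-range version is only meant to show the principle.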
I want to compare one value against several others and check if it matches at least one of those values, I assumed it would be something like
if (x == any_of(1, 2, 3))
// do something
But the examples of it I've seen online have been
bool any_of(InputIt first, InputIt last, UnaryPredicate)
What does that mean?
New to c++ so apologies if this is a stupid question.
There is plenty of literature and video tutorials on the subject of "iterators in C++". You should do some research in that direction, because it's a fundamental concept in C++.
A quick summary on the matter: an iterator is something that points to an element in a collection (or range) of values. A few examples of such collections:
std::vector is the most common one. It's basically a resizable array.
std::list is a linked list.
std::array is a fixed size array with some nice helpers around C style arrays
int myInt[12] is a C style array of integers. This one shouldn't be used anymore.
Algorithms from the C++ standard library that operate on a collection of values (such as std::any_of) take the collection by two iterators. The first iterator InputIt first points to the beginning of said collection, while InputIt last points to the end of the collection (actually one past the end).
A UnaryPredicate is a function that takes 1 argument (unary) and returns a bool (predicate).
In order to make std::any_of do what you want, you have to put your values in a collection and x in the UnaryPredicate:
int x = 3;
std::vector values = {1, 2, 3};
if (std::any_of(values.begin(), values.end(), [x](int y) { return x == y; }))
// ...
The UnaryPredicate in this case is a lambda function.
As you can see, this is quite verbose code given your example. But once you have a dynamic amount of values that you want to compare, or you want to check for more complex things than just equality, this algorithm becomes much more useful.
Fun little experiment
Just for fun, I made a little code snippet that implements an any_of like you wanted to have it. It's quite a lot of code and pretty complicated as well (definitely not beginner level!) but it is very flexible and actually nice to use.
Here is how you would use it:
int main()
{
int x = 7;
std::vector dynamic_int_range = {1, 2, 3, 4, 5, 6, 7, 8};
if (x == any_of(1, 2, 3, 4, 5))
{
std::cout << "x is in the compile time collection!\n";
}
else if (x == any_of(dynamic_int_range))
{
std::cout << "x is in the run time collection!\n";
}
else
{
std::cout << "x is not in the collection :(\n";
}
std::string s = "abc";
std::vector<std::string> dynamic_string_range = {"xyz", "uvw", "rst", "opq"};
if (s == any_of("abc", "def", "ghi"))
{
std::cout << "s is in the compile time collection!\n";
}
else if (s == any_of(dynamic_string_range))
{
std::cout << "s is in the run time collection!\n";
}
else
{
std::cout << "s is not in the collection :(\n";
}
}
And here how it's implemented:
namespace detail
{
template <typename ...Args>
struct ct_any_of_helper
{
std::tuple<Args...> values;
constexpr ct_any_of_helper(Args... values) : values(std::move(values)...) { }
template <typename T>
[[nodiscard]] friend constexpr bool operator==(T lhs, ct_any_of_helper const& rhs) noexcept
{
return std::apply([&](auto... vals) { return ((lhs == vals) || ...); }, rhs.values);
}
};
template <typename Container>
struct rt_any_of_helper
{
Container const& values;
constexpr rt_any_of_helper(Container const& values) : values(values) { }
template <typename T>
[[nodiscard]] friend constexpr bool operator==(T&& lhs, rt_any_of_helper&& rhs) noexcept
{
return std::any_of(cbegin(rhs.values), cend(rhs.values), [&](auto val)
{
return lhs == val;
});
}
};
template <typename T>
auto is_container(int) -> decltype(cbegin(std::declval<T>()) == cend(std::declval<T>()), std::true_type{});
template <typename T>
std::false_type is_container(...);
template <typename T>
constexpr bool is_container_v = decltype(is_container<T>(0))::value;
}
template <typename ...Args>
[[nodiscard]] constexpr auto any_of(Args&&... values)
{
using namespace detail;
if constexpr (sizeof...(Args) == 1 && is_container_v<std::tuple_element_t<0, std::tuple<Args...>>>)
return rt_any_of_helper(std::forward<Args>(values)...);
else
return ct_any_of_helper(std::forward<Args>(values)...);
}
In case an expert sees this code and wants to complain about the dangling reference: come on, who would write something like this:
auto a = any_of(std::array {1, 2, 3, 4});
if (x == std::move(a)) // ...
That's not what this function is for.
Your values must already exist somewhere else, it is very likely that it will be a vector.
std::any_of operates on iterators.
Iterators in C++ are used in pairs to describe a range: two values that tell you where the range begins and where it ends.
Most C++ Standard Library containers, including std::vector, support the iterator API, so you can use std::any_of on them.
For the sake of a full example, let's check whether a vector contains 42 in an over-the-top way, just to use std::any_of.
Since we only want to check whether a value exists in the vector without changing anything (std::any_of doesn't modify the collection), we use .cbegin() and .cend(), which return constant iterators to the beginning and the end of the vector. std::any_of needs both, as it may have to iterate over the entire vector to check whether at least one value matches the given predicate.
The last parameter must be a unary predicate, which means that it is a function that accepts a single argument and returns whether that argument fits some criterion.
To put it simply, std::any_of is used to check whether there's at least one value in a collection that has some property that you care about.
Code:
#include <algorithm>
#include <iostream>
#include <vector>
bool is_42(int value) {
return value == 42;
}
int main() {
std::vector<int> vec{
1, 2, 3,
// 42 // uncomment this
};
if(std::any_of(vec.cbegin(), vec.cend(), is_42)) {
std::cout << "42 is in vec" << std::endl;
} else {
std::cout << "42 isn't in vec" << std::endl;
}
}
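std::any_of does not necessarily walk the whole range: the standard only bounds the number of predicate calls from above, and the common implementations return as soon as one element matches. A small sketch that makes this visible by counting predicate calls (find_with_count is a hypothetical helper written for this illustration):

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Returns whether `wanted` was found, plus the number of predicate calls made.
std::pair<bool, int> find_with_count(const std::vector<int>& values, int wanted) {
    int calls = 0;
    const bool found = std::any_of(values.cbegin(), values.cend(), [&](int v) {
        ++calls;             // record each predicate invocation
        return v == wanted;  // returning true lets any_of stop searching
    });
    return {found, calls};
}
```

With the values {1, 42, 3, 4} and wanted == 42, typical implementations call the predicate twice; when nothing matches, it is called once per element.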
As stated by user #a.abuzaid, you can create your own method for this. The method they provided, however, lacks in a number of areas stated in the comments of the answer. I can't really get my head around std::any_of as of right now and just decided to create this template:
template <typename Iterable, typename type>
bool any_of(const Iterable& iterable, const type& value) { // take by const reference to avoid copying the container
for (const auto& comparison : iterable) {
if (comparison == value) {
return true;
}
}
return false;
}
An example use here would be if (any_of(myVectorOfStrings, std::string("Find me!"))) { do stuff }, in which the iterable is a vector of strings and the value is the string "Find me!".
You can also just write a function that compares x with two other numbers and checks whether it equals either of them, for instance
bool anyof(int x, int y, int z) {
// true if x equals y or z, false otherwise
return (x == y) || (x == z);
}
and then within your main you can call the function like this:
if (anyof(x, 1, 2))
cout << "Matches a number";
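If a fixed number of parameters is too restrictive, the same idea can be written once for any number of candidates. This is a minimal sketch assuming C++17 (it relies on a fold expression); equals_any is a made-up name, not something from the standard library:

```cpp
// Returns true if `x` compares equal to at least one of `candidates`.
// With no candidates at all, the fold over || yields false.
template <typename T, typename... Candidates>
constexpr bool equals_any(const T& x, const Candidates&... candidates) {
    return ((x == candidates) || ...);
}
```

Usage would look like if (equals_any(x, 1, 2, 3)) { ... }, which is close to what the question originally had in mind.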
I'm facing an almost-logical problem while working on C++11.
I have a class I have to plot (aka draw a trend) and I want to exclude all the points which do not satisfy a given condition.
The points are of the class Foo and all the conditional functions are defined with the signature bool Foo::Bar(Args...) const where Args... represents a number of parameters (e.g. upper and lower limits on the returned value).
Everything went well up to the moment I wished to apply a single condition to the values to plot. Let's say I have a FooPlotter class which has something like:
template<class ...Args> GraphClass FooPlotter::Plot([...],bool (Foo::*Bar)(Args...), Args... args)
Which will iterate over my data container and apply the condition Foo::*Bar to all the elements, plotting the values which satisfy the given condition.
So far so good.
At a given point I wanted to pass a vector of conditions to the same method, in order to use several conditions to filter data.
I first created a class to contain everything I need to have later:
template<class ...Args> class FooCondition{
public:
FooCondition(bool (Foo::*Bar)(Args...) const, Args... args)
{
fCondition = Bar;
fArgs = std::make_tuple(args...);
}
bool operator()(Foo data){ return (data.*fCondition)(args); } // sketch only: fArgs still has to be unpacked into this call, e.g. via an index sequence
private:
bool (Foo::*fCondition)(Args...) const;
std::tuple<Args...> fArgs;
};
Then I got stuck on how to define a (iterable) container which can contain FooCondition objects despite them having several types for the Args... arguments pack.
The problem is that some methods have Args... = uint64_t, uint64_t while others require no arguments at all.
I dug a bit into how to handle this kind of situation and tried several approaches, but none of them worked well.
For the moment I have added ignored arguments to all the Bar methods to make their signatures uniform, which works around the issue, but I am not really satisfied!
Does anyone have an idea how to store differently typed FooCondition objects in an elegant way?
EDIT: Additional information on the result I want to obtain.
First I want to be able to create a std::vector of FooCondition items:
std::vector<FooCondition> conditions;
conditions.emplace_back(FooCondition(&Foo::IsBefore, uint64_t timestamp1));
conditions.emplace_back(FooCondition(&Foo::IsAttributeBetween, double a, double b));
conditions.emplace_back(FooCondition(&Foo::IsOk));
At this point I wish I can do something like the following, in my FooPlotter::Plot method:
GraphClass FooPlotter::Plot(vector<Foo> data, vector<FooCondition> conditions){
GraphClass graph;
for(const auto &itData : data){
bool shouldPlot = true;
for(const auto &itCondition : conditions){
shouldPlot &= itCondition(itData);
}
if(shouldPlot) graph.AddPoint(itData);
}
return graph;
}
As you can guess, the FooCondition struct should pass the right arguments to the method automatically using the overloaded operator.
Here the issue is to find the correct container to be able to create a collection of FooCondition templates regardless of the size of their argument packs.
It seems to me that, with FooCondition, you're trying to create a substitute for a std::function<bool(Foo *)> (or maybe std::function<bool(Foo const *)>) initialized with a std::bind that fixes some arguments of the Foo methods.
I mean... I think that instead of
std::vector<FooCondition> conditions;
conditions.emplace_back(FooCondition(&Foo::IsBefore, uint64_t timestamp1));
conditions.emplace_back(FooCondition(&Foo::IsAttributeBetween, double a, double b));
conditions.emplace_back(FooCondition(&Foo::IsOk));
you should write something as
std::vector<std::function<bool(Foo const *)>> vfc;
using namespace std::placeholders;
vfc.emplace_back(std::bind(&Foo::IsBefore, _1, 64U));
vfc.emplace_back(std::bind(&Foo::IsAttributeBetween, _1, 10.0, 100.0));
vfc.emplace_back(std::bind(&Foo::IsOk, _1));
The following is a simplified, full working C++11 example with a main() that simulates Plot()
#include <cstdint>
#include <vector>
#include <iostream>
#include <functional>
struct Foo
{
double value;
bool IsBefore (std::uint64_t ts) const
{ std::cout << "- IsBefore(" << ts << ')' << std::endl;
return value < ts; }
bool IsAttributeBetween (double a, double b) const
{ std::cout << "- IsAttributeBetween(" << a << ", " << b << ')'
<< std::endl; return (a < value) && (value < b); }
bool IsOk () const
{ std::cout << "- IsOk" << std::endl; return value != 0.0; }
};
int main ()
{
std::vector<std::function<bool(Foo const *)>> vfc;
using namespace std::placeholders;
vfc.emplace_back(std::bind(&Foo::IsBefore, _1, 64U));
vfc.emplace_back(std::bind(&Foo::IsAttributeBetween, _1, 10.0, 100.0));
vfc.emplace_back(std::bind(&Foo::IsOk, _1));
std::vector<Foo> vf { Foo{0.0}, Foo{10.0}, Foo{20.0}, Foo{80.0} };
for ( auto const & f : vf )
{
bool bval { true };
for ( auto const & c : vfc )
bval &= c(&f);
std::cout << "---- for " << f.value << ": " << bval << std::endl;
}
}
Another way is to avoid std::bind and use lambda functions instead.
For example
std::vector<std::function<bool(Foo const *)>> vfc;
vfc.emplace_back([](Foo const * fp)
{ return fp->IsBefore(64U); });
vfc.emplace_back([](Foo const * fp)
{ return fp->IsAttributeBetween(10.0, 100.0); });
vfc.emplace_back([](Foo const * fp)
{ return fp->IsOk(); });
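Once the conditions live in one vector of std::function objects, the filtering loop from the question can be expressed with std::all_of, which stops at the first failing condition instead of evaluating all of them. A minimal C++11 sketch follows; Foo here is a hypothetical stand-in with two of the condition methods, and passes_all and filter are made-up names. Unlike the examples above it passes Foo by const reference rather than by pointer, which is only a matter of taste.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical stand-in for the question's Foo.
struct Foo {
    double value;
    bool IsAttributeBetween(double a, double b) const { return a < value && value < b; }
    bool IsOk() const { return value != 0.0; }
};

using Condition = std::function<bool(const Foo&)>;

// True when `point` satisfies every condition; stops at the first failure.
bool passes_all(const Foo& point, const std::vector<Condition>& conditions) {
    return std::all_of(conditions.begin(), conditions.end(),
                       [&point](const Condition& c) { return c(point); });
}

// Keeps only the points that satisfy every condition.
std::vector<Foo> filter(const std::vector<Foo>& data,
                        const std::vector<Condition>& conditions) {
    std::vector<Foo> kept;
    for (const Foo& f : data)
        if (passes_all(f, conditions)) kept.push_back(f);
    return kept;
}
```

In the question's Plot method, the body of the outer loop would then reduce to something like if (passes_all(itData, conditions)) graph.AddPoint(itData);.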
All of the foo bar aside, you just need a class with a method which can be implemented to satisfy the plot.
Just add a Plot method on the class which accepts the node and performs the transformation and plotting in the same step.
You need not worry about args when plotting, because each function knows what arguments it needs.
Thus a simple args* will suffice: when it is null there are no arguments; otherwise each arg reveals its type and value, or these can be assumed from the function invocation.
I'm trying to convert some loops in my code to use the for_each functionality of the STL. Currently, I calculate and accumulate two separate values over the same set of data, requiring me to loop over the data twice. In the interest of speed, I want to loop once and accumulate both values. Using for_each was suggested as it apparently can be worked into a multithreaded or multiprocessor implementation fairly easily (I haven't learned how to do that yet).
Creating a function that only loops over the data once and calculates both values is easy, but I need to return both. To use with for_each, I need to return both calculated values at each iteration so STL can sum them. From my understanding, this isn't possible as for_each expects a single value returned.
The goal with using for_each, besides cleaner code (arguably?) is to eventually move to a multithreaded or multiprocessor implementation so that the loop over the data can be done in parallel so things run faster.
It was suggested to me that I look at using a functor instead of a function. However, that raises two issues.
How will using a functor instead allow the return accumulation of two values?
I have two methods of applying this algorithm. The current code has a virtual base class and then two classes that inherit and implement the actual working code. I can't figure out how to have a "virtual functor" so that each method class can implement its own version.
Thanks!
Here is an example of using a functor to perform two accumulations in parallel.
struct MyFunctor
{
// Initialise accumulators to zero
MyFunctor() : acc_A(0), acc_B(0) {}
// for_each calls operator() for each container element
void operator() (const T &x)
{
acc_A += x.foo();
acc_B += x.bar();
}
int acc_A;
int acc_B;
};
// Invoke for_each, and capture the result
MyFunctor func = std::for_each(container.begin(), container.end(), MyFunctor());
[Note that you could also consider using std::accumulate(), with an appropriate overload for operator+.]
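To illustrate the std::accumulate remark: the two running values can live in a small struct, and an operator+ that folds one element into that struct lets the default form of std::accumulate do the rest. This is a minimal sketch for int elements; Totals and sum_and_product are hypothetical names chosen for the example.

```cpp
#include <numeric>
#include <vector>

// Hypothetical accumulator holding two running totals.
struct Totals {
    int sum;
    int product;
};

// Folds one element into the totals; std::accumulate calls this per element.
Totals operator+(Totals t, int x) {
    t.sum += x;
    t.product *= x;
    return t;
}

// Computes both values in a single pass over the data.
Totals sum_and_product(const std::vector<int>& values) {
    return std::accumulate(values.begin(), values.end(), Totals{0, 1});
}
```

Whether overloading operator+ for something that is not really an addition is good style is debatable; passing a named binary function as the fourth argument of std::accumulate avoids that.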
As for virtual functors, you cannot do these directly, as STL functions take functors by value, not by reference (so you'd get a slicing problem). You'd need to implement a sort of "proxy" functor that in turn contains a reference to your virtual functor.* Along the lines of:
struct AbstractFunctor
{
virtual void operator() (const T &x) = 0;
};
struct MyFunctor : AbstractFunctor
{
virtual void operator() (const T &x) { ... }
};
struct Proxy
{
Proxy(AbstractFunctor &f) : f(f) {}
void operator() (const T &x) { f(x); }
AbstractFunctor &f;
};
MyFunctor func;
std::for_each(container.begin(), container.end(), Proxy(func));
* Scott Meyers gives a good example of this technique in Item 38 of his excellent Effective STL.
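With C++11 or later, the hand-written proxy can usually be replaced by std::ref, since std::reference_wrapper is itself callable and cheap to copy. A minimal sketch, assuming int elements; AbstractFunctor, SumAndCount and apply are names invented for this example:

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical abstract functor interface over int elements.
struct AbstractFunctor {
    virtual ~AbstractFunctor() = default;
    virtual void operator()(int x) = 0;
};

// One concrete implementation that accumulates two values at once.
struct SumAndCount : AbstractFunctor {
    int sum = 0;
    int count = 0;
    void operator()(int x) override { sum += x; ++count; }
};

// std::ref wraps the functor in a copyable, callable reference wrapper,
// so for_each copies only the wrapper and each call dispatches virtually.
void apply(AbstractFunctor& f, const std::vector<int>& values) {
    std::for_each(values.begin(), values.end(), std::ref(f));
}
```

Because the algorithm never copies the functor itself, the accumulated state ends up in the original object and there is no slicing.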
Three (main) approaches
Ok, I ended up doing three (main) implementations (with minor variations). I did a simple benchmark to see whether there were any efficiency differences. Check the benchmarks section at the bottom.
1. std::for_each with c++0x lambda
Taking some c++0x shortcuts: see http://ideone.com/TvJZd
#include <vector>
#include <algorithm>
#include <iostream>
int main()
{
std::vector<int> a = { 1,2,3,4,5,6,7 };
int sum=0, product=1;
std::for_each(a.begin(), a.end(), [&] (int i) { sum+=i; product*=i; });
std::cout << "sum: " << sum << ", product: " << product << std::endl;
return 0;
}
Prints
sum: 28, product: 5040
As mentioned by others, you'd normally prefer a normal loop:
for (int i: a)
{ sum+=i; product*=i; }
Which is both
shorter,
more legible,
less unexpected (ref capturing) and
likely more optimizable by the compiler
Also, very close in non-c++11/0x:
for (std::vector<int>::const_iterator it=a.begin(); it!=a.end(); ++it)
{ sum+=*it; product*=*it; }
2. std::accumulate with handwritten accumulator object
Added one based on std::accumulate: see http://ideone.com/gfi2C
struct accu_t
{
int sum, product;
static accu_t& handle(accu_t& a, int i)
{
a.sum+=i;
a.product*=i;
return a;
}
} accum = { 0, 1 };
accum = std::accumulate(a.begin(), a.end(), accum, &accu_t::handle);
3. std::accumulate with std::tuple
Ok I couldn't resist. Here is one with accumulate but operating on a std::tuple (removing the need for the functor type): see http://ideone.com/zHbUh
template <typename Tuple, typename T>
Tuple handle(Tuple t, T v)
{
std::get<0>(t) += v;
std::get<1>(t) *= v;
return t;
}
int main()
{
std::vector<int> a = { 1,2,3,4,5,6,7 };
for (auto i=1ul << 31; i;)
{
auto accum = std::make_tuple(0,1);
accum = std::accumulate(a.begin(), a.end(), accum, handle<decltype(accum), int>);
if (!--i)
std::cout << "sum: " << std::get<0>(accum) << ", product: " << std::get<1>(accum) << std::endl;
}
return 0;
}
Benchmarks:
Measured by doing the accumulation 1ul << 31 (i.e. 2^31) times (see the snippet for the std::tuple based variant). Tested with -O2 and -O3 only:
there is no measurable difference between any of the approaches shown (0.760s):
the for_each with a lambda
handcoded iterator loop or even the c++11 for (int i:a)
the handcoded accu_t struct (0.760s)
using std::tuple
all variants exhibit a speed up of more than 18x going from -O2 to -O3 (13.8s to 0.760s), again regardless of the implementation chosen
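For the tuple/accumulate variant, the performance stays exactly the same with Tuple& handle(Tuple& t, T v) (by reference). Note that since C++20, std::accumulate passes the accumulator to the operation as an rvalue, so a by-reference signature like this one (or accu_t::handle above) no longer compiles in that mode.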
C++ does not have native support for lazy evaluation (as Haskell does).
I'm wondering if it is possible to implement lazy evaluation in C++ in a reasonable manner. If yes, how would you do it?
EDIT: I like Konrad Rudolph's answer.
I'm wondering if it's possible to implement it in a more generic fashion, for example by using a parametrized class lazy<T> that essentially works for T the way matrix_add works for matrix.
Any operation on T would return lazy<T> instead. The only problem is to store the arguments and operation code inside lazy<T> itself. Can anyone see how to improve this?
I'm wondering if it is possible to implement lazy evaluation in C++ in a reasonable manner. If yes, how would you do it?
Yes, this is possible and quite often done, e.g. for matrix calculations. The main mechanism to facilitate this is operator overloading. Consider the case of matrix addition. The signature of the function would usually look something like this:
matrix operator +(matrix const& a, matrix const& b);
Now, to make this function lazy, it's enough to return a proxy instead of the actual result:
struct matrix_add;
matrix_add operator +(matrix const& a, matrix const& b) {
return matrix_add(a, b);
}
Now all that needs to be done is to write this proxy:
struct matrix_add {
matrix_add(matrix const& a, matrix const& b) : a(a), b(b) { }
operator matrix() const {
matrix result;
// Do the addition.
return result;
}
private:
matrix const& a;
matrix const& b; // declared separately: in "matrix const& a, b;" only a would be a reference
};
The magic lies in the method operator matrix() which is an implicit conversion operator from matrix_add to plain matrix. This way, you can chain multiple operations (by providing appropriate overloads of course). The evaluation takes place only when the final result is assigned to a matrix instance.
EDIT I should have been more explicit. As it is, the code makes no sense because although evaluation happens lazily, it still happens in the same expression. In particular, another addition will evaluate this code unless the matrix_add structure is changed to allow chained addition. C++0x greatly facilitates this by allowing variadic templates (i.e. template lists of variable length).
However, one very simple case where this code would actually have a real, direct benefit is the following:
int value = (A + B)(2, 3);
Here, it is assumed that A and B are two-dimensional matrices and that dereferencing is done in Fortran notation, i.e. the above calculates one element out of a matrix sum. It's of course wasteful to add the whole matrices. matrix_add to the rescue:
struct matrix_add {
// … yadda, yadda, yadda …
int operator ()(unsigned int x, unsigned int y) {
// Calculate *just one* element:
return a(x, y) + b(x, y);
}
};
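Putting the pieces together, a self-contained version of the proxy idea might look as follows. This is only a sketch: Matrix is a hypothetical minimal row-major class invented for the example (the answer leaves matrix unspecified), and MatrixAdd plays the role of matrix_add.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal matrix with row-major storage.
struct Matrix {
    std::size_t rows, cols;
    std::vector<int> data;
    Matrix(std::size_t r, std::size_t c, int fill = 0)
        : rows(r), cols(c), data(r * c, fill) {}
    int  operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
    int& operator()(std::size_t i, std::size_t j)       { return data[i * cols + j]; }
};

// Proxy returned by operator+: remembers the operands, adds nothing yet.
// Assumes both operands have the same dimensions.
struct MatrixAdd {
    Matrix const& a;
    Matrix const& b;
    // Computes a single element of the sum on request.
    int operator()(std::size_t i, std::size_t j) const { return a(i, j) + b(i, j); }
    // Computes the full sum only when a Matrix is actually needed.
    operator Matrix() const {
        Matrix result(a.rows, a.cols);
        for (std::size_t k = 0; k < result.data.size(); ++k)
            result.data[k] = a.data[k] + b.data[k];
        return result;
    }
};

inline MatrixAdd operator+(Matrix const& a, Matrix const& b) { return MatrixAdd{a, b}; }
```

With this, (A + B)(2, 3) performs one addition, while Matrix C = A + B; still produces the whole sum. The proxy holds references, so it should not outlive the expression it was created in.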
Other examples abound. I've just remembered that I have implemented something related not long ago. Basically, I had to implement a string class that should adhere to a fixed, pre-defined interface. However, my particular string class dealt with huge strings that weren't actually stored in memory. Usually, the user would just access small substrings from the original string using a function infix. I overloaded this function for my string type to return a proxy that held a reference to my string, along with the desired start and end position. Only when this substring was actually used did it query a C API to retrieve this portion of the string.
Boost.Lambda is very nice, but Boost.Proto is exactly what you are looking for. It already has overloads of all C++ operators, which by default perform their usual function when proto::eval() is called, but can be changed.
What Konrad already explained can be taken further to support nested invocations of operators, all executed lazily. In Konrad's example, he has an expression object that can store exactly two arguments, for exactly two operands of one operation. The problem is that it will only execute one subexpression lazily, which nicely explains the concept of lazy evaluation in simple terms, but doesn't improve performance substantially. The other example also shows well how one can apply operator() to add only some elements using that expression object. But to evaluate arbitrarily complex expressions, we need some mechanism that can store the structure of the expression too. We can't get around templates to do that, and the name for the technique is expression templates. The idea is that one templated expression object can store the structure of some arbitrary sub-expression recursively, like a tree, where the operations are the nodes and the operands are the child nodes. For a very good explanation I just found today (some days after I wrote the code below) see here.
template<typename Lhs, typename Rhs>
struct AddOp {
Lhs const& lhs;
Rhs const& rhs;
AddOp(Lhs const& lhs, Rhs const& rhs):lhs(lhs), rhs(rhs) {
// empty body
}
Lhs const& get_lhs() const { return lhs; }
Rhs const& get_rhs() const { return rhs; }
};
That will store any addition operation, even nested one, as can be seen by the following definition of an operator+ for a simple point type:
struct Point { int x, y; };
// add expression template with point at the right
template<typename Lhs, typename Rhs> AddOp<AddOp<Lhs, Rhs>, Point>
operator+(AddOp<Lhs, Rhs> const& lhs, Point const& p) {
return AddOp<AddOp<Lhs, Rhs>, Point>(lhs, p);
}
// add expression template with point at the left
template<typename Lhs, typename Rhs> AddOp< Point, AddOp<Lhs, Rhs> >
operator+(Point const& p, AddOp<Lhs, Rhs> const& rhs) {
return AddOp< Point, AddOp<Lhs, Rhs> >(p, rhs);
}
// add two points, yield a expression template
AddOp< Point, Point >
operator+(Point const& lhs, Point const& rhs) {
return AddOp<Point, Point>(lhs, rhs);
}
Now, if you have
Point p1 = { 1, 2 }, p2 = { 3, 4 }, p3 = { 5, 6 };
p1 + (p2 + p3); // returns AddOp< Point, AddOp<Point, Point> >
You now just need to overload operator= and add a suitable constructor for the Point type and accept AddOp. Change its definition to:
struct Point {
int x, y;
Point(int x = 0, int y = 0):x(x), y(y) { }
template<typename Lhs, typename Rhs>
Point(AddOp<Lhs, Rhs> const& op) {
x = op.get_x();
y = op.get_y();
}
template<typename Lhs, typename Rhs>
Point& operator=(AddOp<Lhs, Rhs> const& op) {
x = op.get_x();
y = op.get_y();
return *this;
}
int get_x() const { return x; }
int get_y() const { return y; }
};
And add the appropriate get_x and get_y into AddOp as member functions:
int get_x() const {
return lhs.get_x() + rhs.get_x();
}
int get_y() const {
return lhs.get_y() + rhs.get_y();
}
Note how we haven't created any temporaries of type Point. It could have been a big matrix with many fields. But at the time the result is needed, we calculate it lazily.
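For reference, here are the pieces above assembled into one translation unit, in an order that compiles. This is a sketch of the same technique rather than new functionality; the only additions are the helper sum_three, which exists purely to demonstrate a nested expression, and the use of member initializer lists in the converting constructor.

```cpp
// Expression node: stores references to both operands, computes on demand.
template <typename Lhs, typename Rhs>
struct AddOp {
    Lhs const& lhs;
    Rhs const& rhs;
    AddOp(Lhs const& l, Rhs const& r) : lhs(l), rhs(r) {}
    int get_x() const { return lhs.get_x() + rhs.get_x(); }
    int get_y() const { return lhs.get_y() + rhs.get_y(); }
};

struct Point {
    int x, y;
    Point(int x = 0, int y = 0) : x(x), y(y) {}
    // Evaluation happens here, when an expression is converted to a Point.
    template <typename Lhs, typename Rhs>
    Point(AddOp<Lhs, Rhs> const& op) : x(op.get_x()), y(op.get_y()) {}
    int get_x() const { return x; }
    int get_y() const { return y; }
};

// The three overloads from the text.
inline AddOp<Point, Point> operator+(Point const& a, Point const& b) {
    return AddOp<Point, Point>(a, b);
}
template <typename L, typename R>
AddOp<AddOp<L, R>, Point> operator+(AddOp<L, R> const& a, Point const& b) {
    return AddOp<AddOp<L, R>, Point>(a, b);
}
template <typename L, typename R>
AddOp<Point, AddOp<L, R>> operator+(Point const& a, AddOp<L, R> const& b) {
    return AddOp<Point, AddOp<L, R>>(a, b);
}

// Hypothetical helper: builds AddOp<Point, AddOp<Point, Point>>,
// then evaluates it once when the result is converted to Point.
Point sum_three(Point const& p1, Point const& p2, Point const& p3) {
    return p1 + (p2 + p3);
}
```

One caveat: the nodes hold references to temporaries, so an expression should be converted to Point within the same full expression. Storing it with auto e = p1 + (p2 + p3); and evaluating e later would leave dangling references.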
I have nothing to add to Konrad's post, but you can look at Eigen for an example of lazy evaluation done right, in a real world app. It is pretty awe inspiring.
I'm thinking about implementing a template class, that uses std::function. The class should, more or less, look like this:
template <typename Value>
class Lazy
{
public:
Lazy(std::function<Value()> function) : _function(function), _evaluated(false) {}
Value &operator*() { Evaluate(); return _value; }
Value *operator->() { Evaluate(); return &_value; }
private:
void Evaluate()
{
if (!_evaluated)
{
_value = _function();
_evaluated = true;
}
}
std::function<Value()> _function;
Value _value;
bool _evaluated;
};
For example usage:
class Noisy
{
public:
Noisy(int i = 0) : _i(i)
{
std::cout << "Noisy(" << _i << ")" << std::endl;
}
Noisy(const Noisy &that) : _i(that._i)
{
std::cout << "Noisy(const Noisy &)" << std::endl;
}
~Noisy()
{
std::cout << "~Noisy(" << _i << ")" << std::endl;
}
void MakeNoise()
{
std::cout << "MakeNoise(" << _i << ")" << std::endl;
}
private:
int _i;
};
int main()
{
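// direct-initialization: going from the lambda to Lazy takes two user-defined conversions, which "= lambda" does not allow
Lazy<Noisy> n([] () { return Noisy(10); });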
std::cout << "about to make noise" << std::endl;
n->MakeNoise();
(*n).MakeNoise();
auto &nn = *n;
nn.MakeNoise();
}
The above code should produce the following output on the console:
Noisy(0)
about to make noise
Noisy(10)
~Noisy(10)
MakeNoise(10)
MakeNoise(10)
MakeNoise(10)
~Noisy(10)
Note that the constructor printing Noisy(10) will not be called until the variable is accessed.
This class is far from perfect, though. First, the default constructor of Value has to be called on member initialization (printing Noisy(0) in this case). We could use a pointer for _value instead, but I'm not sure whether that would affect performance.
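One way around the default-construction problem, assuming C++17 is available, is to hold the value in a std::optional, which stays empty until the first access and needs no heap allocation. The following is a sketch of that variation, not a drop-in replacement; the member evaluated() and the helper count_evaluations are additions made for the illustration.

```cpp
#include <functional>
#include <optional>
#include <utility>

// Variant of the Lazy class that needs no default-constructed Value.
template <typename Value>
class Lazy {
public:
    explicit Lazy(std::function<Value()> function) : function_(std::move(function)) {}

    Value& operator*() { return get(); }
    Value* operator->() { return &get(); }
    bool evaluated() const { return value_.has_value(); }

private:
    Value& get() {
        if (!value_) value_.emplace(function_());  // runs the function at most once
        return *value_;
    }

    std::function<Value()> function_;
    std::optional<Value> value_;  // empty until first access
};

// Hypothetical usage: returns how often the initializer ran, or -1 on a mismatch.
int count_evaluations() {
    int runs = 0;
    Lazy<int> lazy([&runs] { ++runs; return 42; });
    const bool before = lazy.evaluated();  // false: nothing computed yet
    const int first = *lazy;               // triggers the computation
    const int second = *lazy;              // reuses the cached value
    return (!before && first == 42 && second == 42) ? runs : -1;
}
```

With a type like Noisy, this version would no longer print Noisy(0), because no Value exists before the first dereference.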
Johannes' answer works, but it does not behave as expected once more parentheses are involved. Here is an example:
Point p1 = { 1, 2 }, p2 = { 3, 4 }, p3 = { 5, 6 }, p4 = { 7, 8 };
(p1 + p2) + (p3 + p4) // it works, but is not lazy enough
That is because the three overloaded + operators don't cover the case
AddOp<Llhs, Lrhs> + AddOp<Rlhs, Rrhs>
So the compiler has to convert either (p1+p2) or (p3+p4) to Point, which is not lazy enough. And when the compiler has to decide which one to convert, it complains, because neither conversion is better than the other.
Here is my extension: add yet another overloaded operator+:
template <typename LLhs, typename LRhs, typename RLhs, typename RRhs>
AddOp<AddOp<LLhs, LRhs>, AddOp<RLhs, RRhs>> operator+(const AddOp<LLhs, LRhs> & leftOperand, const AddOp<RLhs, RRhs> & rightOperand)
{
return AddOp<AddOp<LLhs, LRhs>, AddOp<RLhs, RRhs>>(leftOperand, rightOperand);
}
Now the compiler can handle the case above correctly, with no implicit conversion. Voilà!
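To make the overload set concrete, here is a self-contained sketch. Johannes' AddOp is not reproduced in this answer, so the AddOp and Point definitions below are my own hypothetical reconstruction, and sum_four is a made-up helper. It only illustrates that, with the fourth overload present, the whole expression stays one unevaluated tree until it is converted to Point.

```cpp
#include <cassert>
#include <type_traits>

struct Point { int x, y; };

// Hypothetical AddOp node: it only records its operands; the sum is
// computed when the node is converted to Point.
template <typename Lhs, typename Rhs>
struct AddOp
{
    Lhs const &lhs;
    Rhs const &rhs;
    AddOp(Lhs const &l, Rhs const &r) : lhs(l), rhs(r) {}
    operator Point() const
    {
        Point a = lhs, b = rhs; // nested AddOp operands are evaluated here
        return Point{ a.x + b.x, a.y + b.y };
    }
};

// The three original overloads...
inline AddOp<Point, Point> operator+(Point const &l, Point const &r)
{ return AddOp<Point, Point>(l, r); }

template <typename L, typename R>
AddOp<AddOp<L, R>, Point> operator+(AddOp<L, R> const &l, Point const &r)
{ return AddOp<AddOp<L, R>, Point>(l, r); }

template <typename L, typename R>
AddOp<Point, AddOp<L, R>> operator+(Point const &l, AddOp<L, R> const &r)
{ return AddOp<Point, AddOp<L, R>>(l, r); }

// ...plus the extension for AddOp + AddOp.
template <typename LL, typename LR, typename RL, typename RR>
AddOp<AddOp<LL, LR>, AddOp<RL, RR>>
operator+(AddOp<LL, LR> const &l, AddOp<RL, RR> const &r)
{ return AddOp<AddOp<LL, LR>, AddOp<RL, RR>>(l, r); }

Point sum_four()
{
    Point p1 = { 1, 2 }, p2 = { 3, 4 }, p3 = { 5, 6 }, p4 = { 7, 8 };
    // No operand is converted to Point while the tree is being built.
    static_assert(std::is_same<decltype((p1 + p2) + (p3 + p4)),
                  AddOp<AddOp<Point, Point>, AddOp<Point, Point>>>::value,
                  "the expression is still an unevaluated tree");
    Point result = (p1 + p2) + (p3 + p4); // evaluated once, in this statement
    return result;
}
```

Because the nodes hold references, the tree must be converted within the same full expression that built it, as sum_four does.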
The way it is going to be done in C++0x: with lambda expressions.
Anything is possible.
It depends on exactly what you mean:
class X
{
public: static X& getObjectA()
{
static X instanceA;
return instanceA;
}
};
Here we have the effect of a global variable that is lazily evaluated at the point of first use.
This is what was newly requested in the question, stealing Konrad Rudolph's design and extending it.
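A small sketch that makes the "first use" visible: the counter g_constructed and the helper constructed_lazily_and_once are hypothetical additions for the demonstration, not part of the class above.

```cpp
#include <cassert>

// Demo-only counter so the moment of construction can be observed.
static int g_constructed = 0;

class X
{
public:
    static X &getObjectA()
    {
        static X instanceA; // constructed during the first call only
        return instanceA;
    }
private:
    X() { ++g_constructed; }
};

// Intended to be called once, before anything else touches X.
bool constructed_lazily_and_once()
{
    int before = g_constructed;  // still 0: no X has been built yet
    X &a = X::getObjectA();      // first use triggers the constructor
    X &b = X::getObjectA();      // later uses return the same object
    return before == 0 && g_constructed == 1 && &a == &b;
}
```

Since C++11 the initialization of such a function-local static is also guaranteed to happen exactly once even when several threads make the first call concurrently.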
The Lazy object:
template<typename O,typename T1,typename T2>
struct Lazy
{
Lazy(T1 const& l,T2 const& r)
:lhs(l),rhs(r) {}
typedef typename O::Result Result;
operator Result() const
{
O op;
return op(lhs,rhs);
}
private:
T1 const& lhs;
T2 const& rhs;
};
How to use it:
namespace M
{
class Matrix
{
};
struct MatrixAdd
{
typedef Matrix Result;
Result operator()(Matrix const& lhs,Matrix const& rhs) const
{
Result r;
return r;
}
};
struct MatrixSub
{
typedef Matrix Result;
Result operator()(Matrix const& lhs,Matrix const& rhs) const
{
Result r;
return r;
}
};
template<typename T1,typename T2>
Lazy<MatrixAdd,T1,T2> operator+(T1 const& lhs,T2 const& rhs)
{
return Lazy<MatrixAdd,T1,T2>(lhs,rhs);
}
template<typename T1,typename T2>
Lazy<MatrixSub,T1,T2> operator-(T1 const& lhs,T2 const& rhs)
{
return Lazy<MatrixSub,T1,T2>(lhs,rhs);
}
}
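To see the deferral in action, here is a compact, self-contained variant of the code above. It is only a sketch: the value member of Matrix, the additions counter and the function lazy_sum_demo are hypothetical additions so that the effect can be observed.

```cpp
#include <cassert>

template<typename O, typename T1, typename T2>
struct Lazy
{
    Lazy(T1 const& l, T2 const& r) : lhs(l), rhs(r) {}
    typedef typename O::Result Result;
    operator Result() const { O op; return op(lhs, rhs); }
private:
    T1 const& lhs;
    T2 const& rhs;
};

namespace M
{
    static int additions = 0;          // demo-only: counts real additions
    struct Matrix { int value = 0; };  // hypothetical one-element "matrix"

    struct MatrixAdd
    {
        typedef Matrix Result;
        Result operator()(Matrix const& lhs, Matrix const& rhs) const
        {
            ++additions;
            Result r;
            r.value = lhs.value + rhs.value;
            return r;
        }
    };

    template<typename T1, typename T2>
    Lazy<MatrixAdd, T1, T2> operator+(T1 const& lhs, T2 const& rhs)
    {
        return Lazy<MatrixAdd, T1, T2>(lhs, rhs);
    }
}

int lazy_sum_demo()
{
    int before = M::additions;
    M::Matrix a, b, c;
    a.value = 1; b.value = 2; c.value = 3;

    auto deferred = a + b;               // only records references to a and b
    assert(M::additions == before);      // nothing has been added yet

    M::Matrix ab = deferred;             // the conversion performs the addition
    assert(ab.value == 3 && M::additions == before + 1);

    M::Matrix abc = a + b + c;           // nested Lazy, evaluated on conversion
    assert(M::additions == before + 3);
    return abc.value;
}
```

Note that every conversion recomputes the result (there is no caching), and that the Lazy object stores references, so it must not outlive its operands.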
In C++11 lazy evaluation similar to hiapay's answer can be achieved using std::shared_future. You still have to encapsulate calculations in lambdas but memoization is taken care of:
std::shared_future<int> a = std::async(std::launch::deferred, [](){ return 1+1; });
Here's a full example:
#include <iostream>
#include <future>
#define LAZY(EXPR, ...) std::async(std::launch::deferred, [__VA_ARGS__](){ std::cout << "evaluating "#EXPR << std::endl; return EXPR; })
int main() {
std::shared_future<int> f1 = LAZY(8);
std::shared_future<int> f2 = LAZY(2);
std::shared_future<int> f3 = LAZY(f1.get() * f2.get(), f1, f2);
std::cout << "f3 = " << f3.get() << std::endl;
std::cout << "f2 = " << f2.get() << std::endl;
std::cout << "f1 = " << f1.get() << std::endl;
return 0;
}
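The memoization can be checked with a counter. This is a minimal sketch; the function name deferred_runs_once and the runs counter are made up for the illustration.

```cpp
#include <cassert>
#include <future>

// Returns how often the deferred lambda actually ran after two get() calls.
int deferred_runs_once()
{
    int runs = 0;
    std::shared_future<int> lazy =
        std::async(std::launch::deferred, [&runs] { ++runs; return 6 * 7; });

    assert(runs == 0);          // creating the future does not run the lambda
    int first = lazy.get();     // first get() evaluates it in this thread
    int second = lazy.get();    // later get() calls reuse the stored result
    assert(first == 42 && second == 42);
    return runs;
}
```

Depending on the toolchain, code using std::async may have to be linked against the platform's thread library.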
C++0x is nice and all... but for those of us living in the present, there are the Boost Lambda library and Boost Phoenix, both intended to bring large amounts of functional programming to C++.
Let's take Haskell as our inspiration, it being lazy to the core.
Also, let's keep in mind how Linq in C# uses Enumerators in a monadic (urgh, there is the word, sorry) way.
Last but not least, let's keep in mind what coroutines are supposed to provide to programmers, namely the decoupling of computational steps (e.g. producer and consumer) from each other.
And let's try to think about how coroutines relate to lazy evaluation.
All of the above appears to be somehow related.
Next, let's try to extract our personal definition of what "lazy" comes down to.
One interpretation is: we want to state our computation in a composable way before executing it. Some of the parts we use to compose our complete solution might very well draw upon huge (sometimes infinite) data sources, with our full computation also producing either a finite or an infinite result.
Let's get concrete and into some code. We need an example for that! Here, I choose the fizzbuzz "problem" as an example, simply because there is a nice, lazy solution to it.
In Haskell, it looks like this:
module FizzBuzz
( fb
)
where
fb n =
fmap merge fizzBuzzAndNumbers
where
fizz = cycle ["","","fizz"]
buzz = cycle ["","","","","buzz"]
fizzBuzz = zipWith (++) fizz buzz
fizzBuzzAndNumbers = zip [1..n] fizzBuzz
merge (x,s) = if length s == 0 then show x else s
The Haskell function cycle creates an infinite list (lazy, of course!) from a finite list by simply repeating the values in the finite list forever. In an eager programming style, writing something like that would ring alarm bells (memory overflow, endless loops!). Not so in a lazy language. The trick is that lazy lists are not computed right away, and maybe never; normally they are computed only as far as subsequent code requires.
The third line in the where block above creates another lazy list by combining the infinite lists fizz and buzz with the simple two-element recipe "concatenate one string element from each input list into a single string". Again, if this were evaluated immediately, we would have to wait for our computer to run out of resources.
In the fourth line, we create tuples of the members of the finite lazy list [1..n] and our infinite lazy list fizzBuzz. The result is still lazy.
Even in the main body of our fb function, there is no need to get eager. The whole function returns a list with the solution, which itself is, again, lazy. You could just as well think of the result of fb 50 as a computation which you can (partially) evaluate later, or combine with other stuff, leading to an even larger (lazy) evaluation.
So, to get started with our C++ version of "fizzbuzz", we need to think of ways to combine partial steps of our computation into larger computations, each drawing data from previous steps as required.
You can see the full story in a gist of mine.
Here are the basic ideas behind the code:
Borrowing from C# and Linq, we "invent" a stateful, generic type Enumerator, which holds
- The current value of the partial computation
- The state of a partial computation (so we can produce subsequent values)
- The worker function, which produces the next state, the next value and a bool which states if there is more data or if the enumeration has come to an end.
In order to be able to compose Enumerator<T,S> instances by means of the power of the . (dot), this class also contains functions borrowed from Haskell type classes such as Functor and Applicative.
The worker function for an enumerator is always of the form S -> std::tuple<bool,S,T>, where S is the generic type variable representing the state and T is the generic type variable representing a value, i.e. the result of a computation step.
All this is already visible in the first lines of the Enumerator class definition.
template <class T, class S>
class Enumerator
{
public:
typedef S State_t; // no 'typename' here: S and T are not dependent qualified names
typedef T Value_t;
typedef std::function<
std::tuple<bool, State_t, Value_t>
(const State_t&
)
> Worker_t;
Enumerator(Worker_t worker, State_t s0)
: m_worker(worker)
, m_state(s0)
, m_value{}
{
}
// ...
};
So, to create a specific enumerator instance, all we need is a worker function and an initial state; we then create an instance of Enumerator with those two arguments.
Here is an example: the function range(first,last) creates a finite range of values. This corresponds to a lazy list in the Haskell world.
template <class T>
Enumerator<T, T> range(const T& first, const T& last)
{
auto finiteRange =
[first, last](const T& state)
{
T v = state;
T s1 = (state < last) ? (state + 1) : state;
bool active = state != s1;
return std::make_tuple(active, s1, v);
};
return Enumerator<T,T>(finiteRange, first);
}
And we can make use of this function, for example like this: auto r1 = range(size_t{1}, size_t{10}); (both arguments need the same type so that T can be deduced). We have created ourselves a lazy list with 10 elements!
Now, all that is missing for our "wow" experience is to see how we can compose enumerators.
Let's come back to Haskell's cycle function, which is kind of cool. How would it look in our C++ world? Here it is:
template <class T, class S>
auto
cycle
( Enumerator<T, S> values
) -> Enumerator<T, S>
{
auto eternally =
[values](const S& state) -> std::tuple<bool, S, T>
{
auto[active, s1, v] = values.step(state);
if (active)
{
return std::make_tuple(active, s1, v);
}
else
{
return std::make_tuple(true, values.state(), v);
}
};
return Enumerator<T, S>(eternally, values.state());
}
It takes an enumerator as input and returns an enumerator. The local (lambda) function eternally simply resets the input enumeration to its start value whenever it runs out of values and voilà, we have an infinite, ever-repeating version of the list we gave as an argument: auto foo = cycle(range(size_t{1}, size_t{3})); And we can already shamelessly compose our lazy "computations".
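To try range and cycle without the full gist, the following sketch uses a stripped-down Enumerator. The class below is a hypothetical minimal stand-in (only step and state), and take and first_seven_of_cycle are helpers I made up for the demonstration. It assumes C++17 and the convention used above: every step yields a value, and the bool says whether more values follow.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical minimal stand-in for the gist's Enumerator: worker + start state.
template <class T, class S>
class Enumerator
{
public:
    typedef std::function<std::tuple<bool, S, T>(const S&)> Worker_t;
    Enumerator(Worker_t worker, S s0)
        : m_worker(std::move(worker)), m_state(std::move(s0)) {}
    std::tuple<bool, S, T> step(const S& s) const { return m_worker(s); }
    const S& state() const { return m_state; }
private:
    Worker_t m_worker;
    S m_state;
};

template <class T>
Enumerator<T, T> range(const T& first, const T& last)
{
    auto finiteRange = [last](const T& state)
    {
        T v = state;
        T s1 = (state < last) ? (state + 1) : state;
        bool active = state != s1;   // false once 'last' has been reached
        return std::make_tuple(active, s1, v);
    };
    return Enumerator<T, T>(finiteRange, first);
}

template <class T, class S>
Enumerator<T, S> cycle(Enumerator<T, S> values)
{
    auto eternally = [values](const S& state) -> std::tuple<bool, S, T>
    {
        auto [active, s1, v] = values.step(state);
        if (active) return std::make_tuple(active, s1, v);
        return std::make_tuple(true, values.state(), v); // restart at the beginning
    };
    return Enumerator<T, S>(eternally, values.state());
}

// Hypothetical helper: pulls at most n values; nothing is computed before this.
template <class T, class S>
std::vector<T> take(const Enumerator<T, S>& e, std::size_t n)
{
    std::vector<T> out;
    S s = e.state();
    while (out.size() < n)
    {
        auto [more, next, v] = e.step(s);
        out.push_back(v);
        if (!more) break;            // the enumeration has come to an end
        s = next;
    }
    return out;
}

std::vector<std::size_t> first_seven_of_cycle()
{
    return take(cycle(range(std::size_t{1}, std::size_t{3})), 7);
}
```

Under these assumptions, first_seven_of_cycle() yields 1, 2, 3, 1, 2, 3, 1, even though the cycled enumerator itself never ends.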
zip is a good example, showing that we can also create a new enumerator from two input enumerators. The resulting enumerator yields as many values as the smaller of the two input enumerators (tuples with 2 elements, one for each input enumerator). I have implemented zip inside class Enumerator itself. Here is what it looks like:
// member function of class Enumerator<T,S>
template <class T1, class S1>
auto
zip
( Enumerator<T1, S1> other
) -> Enumerator<std::tuple<T, T1>, std::tuple<S, S1> >
{
auto worker0 = this->m_worker;
auto worker1 = other.worker();
auto combine =
[worker0,worker1](std::tuple<S, S1> state) ->
std::tuple<bool, std::tuple<S, S1>, std::tuple<T, T1> >
{
auto[s0, s1] = state;
auto[active0, newS0, v0] = worker0(s0);
auto[active1, newS1, v1] = worker1(s1);
return std::make_tuple
( active0 && active1
, std::make_tuple(newS0, newS1)
, std::make_tuple(v0, v1)
);
};
return Enumerator<std::tuple<T, T1>, std::tuple<S, S1> >
( combine
, std::make_tuple(m_state, other.state())
);
}
Please note how the "combining" also ends up combining the states of both sources and the values of both sources.
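The same idea can be tried out as a free function on a stripped-down Enumerator. Again, this is only a sketch: the minimal Enumerator class, the free-function form of zip and the helpers take and zipped_demo are hypothetical and not the code from the gist. It assumes C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical minimal stand-in for the gist's Enumerator.
template <class T, class S>
class Enumerator
{
public:
    typedef std::function<std::tuple<bool, S, T>(const S&)> Worker_t;
    Enumerator(Worker_t worker, S s0)
        : m_worker(std::move(worker)), m_state(std::move(s0)) {}
    std::tuple<bool, S, T> step(const S& s) const { return m_worker(s); }
    const S& state() const { return m_state; }
private:
    Worker_t m_worker;
    S m_state;
};

template <class T>
Enumerator<T, T> range(const T& first, const T& last)
{
    auto finiteRange = [last](const T& state)
    {
        T s1 = (state < last) ? (state + 1) : state;
        return std::make_tuple(state != s1, s1, state);
    };
    return Enumerator<T, T>(finiteRange, first);
}

// Free-function zip: the combined state is the pair of both states,
// the combined value is the pair of both values.
template <class T0, class S0, class T1, class S1>
Enumerator<std::tuple<T0, T1>, std::tuple<S0, S1>>
zip(Enumerator<T0, S0> a, Enumerator<T1, S1> b)
{
    typedef std::tuple<S0, S1> State;
    typedef std::tuple<T0, T1> Value;
    auto combine = [a, b](const State& state) -> std::tuple<bool, State, Value>
    {
        auto [active0, newS0, v0] = a.step(std::get<0>(state));
        auto [active1, newS1, v1] = b.step(std::get<1>(state));
        return std::make_tuple(active0 && active1,   // stop with the shorter input
                               std::make_tuple(newS0, newS1),
                               std::make_tuple(v0, v1));
    };
    return Enumerator<Value, State>(combine, std::make_tuple(a.state(), b.state()));
}

// Hypothetical helper: pulls at most n values out of an enumerator.
template <class T, class S>
std::vector<T> take(const Enumerator<T, S>& e, std::size_t n)
{
    std::vector<T> out;
    S s = e.state();
    while (out.size() < n)
    {
        auto [more, next, v] = e.step(s);
        out.push_back(v);
        if (!more) break;
        s = next;
    }
    return out;
}

std::vector<std::tuple<int, int>> zipped_demo()
{
    // range(1,3) has three values, range(10,20) has eleven.
    return take(zip(range(1, 3), range(10, 20)), 100);
}
```

With these definitions zipped_demo() produces (1,10), (2,11), (3,12): the result ends together with the shorter input.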
As this post is already TL;DR for many, here is the...
Summary
Yes, lazy evaluation can be implemented in C++. Here, I did it by borrowing the function names from Haskell and the paradigm from C# enumerators and Linq. There might be similarities to Python's itertools, by the way; I think they followed a similar approach.
My implementation (see the gist mentioned above) is just a prototype, not production code, so there are no warranties whatsoever from my side. It serves well as demo code to get the general idea across, though.
And what would this answer be without the final C++ version of fizzbuzz, eh? Here it is:
std::string fizzbuzz(size_t n)
{
typedef std::vector<std::string> SVec;
// merge (x,s) = if length s == 0 then show x else s
auto merge =
[](const std::tuple<size_t, std::string> & value)
-> std::string
{
auto[x, s] = value;
if (s.length() > 0) return s;
else return std::to_string(x);
};
SVec fizzes{ "","","fizz" };
SVec buzzes{ "","","","","buzz" };
return
range(size_t{ 1 }, n)
.zip
( cycle(iterRange(fizzes.cbegin(), fizzes.cend()))
.zipWith
( std::function(concatStrings)
, cycle(iterRange(buzzes.cbegin(), buzzes.cend()))
)
)
.map<std::string>(merge)
.statefulFold<std::ostringstream&>
(
[](std::ostringstream& oss, const std::string& s)
{
if (0 == oss.tellp())
{
oss << s;
}
else
{
oss << "," << s;
}
}
, std::ostringstream()
)
.str();
}
And... to drive the point home even further, here is a variation of fizzbuzz which returns an "infinite list" to the caller:
typedef std::vector<std::string> SVec;
static const SVec fizzes{ "","","fizz" };
static const SVec buzzes{ "","","","","buzz" };
auto fizzbuzzInfinite() -> decltype(auto)
{
// merge (x,s) = if length s == 0 then show x else s
auto merge =
[](const std::tuple<size_t, std::string> & value)
-> std::string
{
auto[x, s] = value;
if (s.length() > 0) return s;
else return std::to_string(x);
};
auto result =
range(size_t{ 1 })
.zip
(cycle(iterRange(fizzes.cbegin(), fizzes.cend()))
.zipWith
(std::function(concatStrings)
, cycle(iterRange(buzzes.cbegin(), buzzes.cend()))
)
)
.map<std::string>(merge)
;
return result;
}
This is worth showing, since it demonstrates how to dodge the question of what the exact return type of that function is (it depends on the implementation of the function alone, namely on how the code combines the enumerators).
It also demonstrates that we had to move the vectors fizzes and buzzes outside the scope of the function, so that they are still around when the lazy mechanism eventually produces values on the outside. Had we not done that, the iterRange(..) code would have stored iterators into vectors that are long gone.
Using a very simple definition of lazy evaluation, namely that a value is not evaluated until it is needed, I would say that one could implement this with a pointer and macros (for syntactic sugar).
#include <stdatomic.h>
#define lazy(var_type) lazy_ ## var_type
#define def_lazy_type( var_type ) \
typedef _Atomic var_type _atomic_ ## var_type; \
typedef _atomic_ ## var_type * lazy(var_type); //pointer to atomic type
#define def_lazy_variable(var_type, var_name ) \
_atomic_ ## var_type _ ## var_name; \
lazy_ ## var_type var_name = & _ ## var_name;
#define assign_lazy( var_name, val ) atomic_store( & _ ## var_name, val )
#define eval_lazy(var_name) atomic_load( &(*var_name) )
#include <stdio.h>
def_lazy_type(int)
void print_power2 ( lazy(int) i )
{
printf( "%d\n", eval_lazy(i) * eval_lazy(i) );
}
typedef struct {
int a;
} simple;
def_lazy_type(simple)
void print_simple ( lazy(simple) s )
{
simple temp = eval_lazy(s);
printf("%d\n", temp.a );
}
#define def_lazy_array1( var_type, nElements, var_name ) \
_atomic_ ## var_type _ ## var_name [ nElements ]; \
lazy(var_type) var_name = _ ## var_name;
int main ( )
{
//declarations
def_lazy_variable( int, X )
def_lazy_variable( simple, Y)
def_lazy_array1(int,10,Z)
simple new_simple;
//first the lazy int
assign_lazy(X,111);
print_power2(X);
//second the lazy struct
new_simple.a = 555;
assign_lazy(Y,new_simple);
print_simple ( Y );
//third the array of lazy ints
for(int i=0; i < 10; i++)
{
assign_lazy( Z[i], i );
}
for(int i=0; i < 10; i++)
{
int r = eval_lazy( &Z[i] ); //must pass with &
printf("%d\n", r );
}
return 0;
}
You'll notice that the function print_power2 uses a macro called eval_lazy, which does nothing more than dereference a pointer to get the value just before it is actually needed. The lazy type is accessed atomically, so each individual load and store is thread-safe.