C++ Math Parser with user-defined function - c++

I want to implement a math parser with user-defined function.
There are several problems to be solved.
For example, int eg(int a,int b){return a+b;} is the function I want to add to the parser.
First: How to store all the functions into a container?
std::map<std::string,boost::any> func_map may be a choice (by func_map["eg"]=eg). However, it's very hard to call a function stored in this kind of map, because I have to use any_cast<T> to get the real function out of the boost::any wrapper.
Second: How to handle the overloaded function?
It's true that I can distinguish the overloaded functions by the method of typeid, but it's far from a real implementation.
Parsing expressions is not the difficult part; the hardest part is described above.
muparserx provides an interesting solution for this problem, but I'm looking for another method.
I'm not familiar with lambda expressions, but maybe they are an acceptable way.
Update:
I need something like this:
int eg(int a,int b){ return a+b;}
int eg(int a,int b, string c){return a+b+c.length();}
double eh(string a){return a.size()/double(2);}
int main(){
multimap<string,PACKED_FUNC> func_map;
func_map.insert(make_pair("eg",pack_function<int,int>(eg)));
func_map.insert(make_pair("eg",pack_function<int,int,string>(eg)));
func_map.insert(make_pair("eh",pack_function<string>(eh)));
auto p1=make_tuple(1,2);
int result1=apply("eg",PACK_TUPLE(p1));//result1=3
auto p2=tuple_cat(p1,make_tuple("test"));
int result2=apply("eg",PACK_TUPLE(p2));//result2=7
auto p3=make_tuple("testagain");
double result3=apply("eh",PACK_TUPLE(p3));//result3=4.5
return 0;
}

How to store all the functions into a container?
To store them inside a container, they must all be of the same type. The std::function wrapper is a good choice, since it allows you to use even stateful function objects. Since you probably don't want all functions to take the same number of arguments, you need to "extract" the arity of the functions from the static host type system. An easy solution is to use functions that accept a std::vector:
// Arguments type to the function "interface"
using Arguments = std::vector<int> const &;
// the interface
using Function = std::function<int (Arguments)>;
But you don't want your users to write functions that have to unpack their arguments manually, so it's sensible to automate that.
// Base case of packing a function.
// If it's taking a vector and no more
// arguments, then there's nothing left to
// pack.
template<
std::size_t N,
typename Fn>
Function pack(Fn && fn) {
return
[fn = std::forward<decltype(fn)>(fn)]
(Arguments arguments)
{
if (N != arguments.size()) {
throw
std::string{"wrong number of arguments, expected "} +
std::to_string(N) +
std::string{" but got "} +
std::to_string(arguments.size());
}
return fn(arguments);
};
}
The above code handles the easy case: a function that already accepts a vector. All other functions need to be wrapped and packed into a newly created function. Doing this one argument at a time makes it relatively easy:
// pack a function to a function that takes
// its arguments from a vector, one argument after
// the other.
template<
std::size_t N,
typename Arg,
typename... Args,
typename Fn>
Function pack(Fn && fn) {
return pack<N+1, Args...>(
[fn = std::forward<decltype(fn)>(fn)]
(Arguments arguments, Args const &... args)
{
return fn(
arguments,
arguments[N],
args...);
});
}
The above only works with (special) functions that already take a vector. For normal functions we need a function to turn them into such special functions:
// transform a function into one that takes its
// arguments from a vector
template<
typename... Args,
typename Fn>
Function pack_function(Fn && fn) {
return pack<0, Args...>(
[fn = std::forward<decltype(fn)>(fn)]
(Arguments arguments, Args const &... args)
{
return fn(args...);
});
}
Using this, you can pack any function up to be the same type:
Function fn =
pack_function<int, int>([] (auto lhs, auto rhs) {return lhs - rhs;});
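As an alternative to the recursive pack above, the same unpacking can probably be sketched with std::index_sequence (C++14). This is not the code from the answer: the names pack_with_indices and call_with are made up for illustration, the arity is passed explicitly instead of a list of argument types, and it throws std::invalid_argument rather than a std::string.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using Arguments = std::vector<int> const &;
using Function = std::function<int (Arguments)>;

// Hypothetical helper: expands arguments[0], arguments[1], ... into one call.
template<typename Fn, std::size_t... Is>
int call_with(Fn const & fn, Arguments arguments, std::index_sequence<Is...>) {
    return fn(arguments[Is]...);
}

// Hypothetical counterpart to pack_function: the arity is given explicitly.
template<std::size_t Arity, typename Fn>
Function pack_with_indices(Fn fn) {
    return [fn](Arguments arguments) {
        if (arguments.size() != Arity) {
            throw std::invalid_argument(
                "wrong number of arguments, expected " + std::to_string(Arity) +
                " but got " + std::to_string(arguments.size()));
        }
        return call_with(fn, arguments, std::make_index_sequence<Arity>{});
    };
}

// Example function to be packed.
inline int add(int lhs, int rhs) { return lhs + rhs; }
```

With this, pack_with_indices<2>(add) yields a Function that can be stored next to any other Function, and calling it with a vector of the wrong size throws instead of reading out of bounds.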
You can then have them in a map, and call them using some vector, parsed from some input:
int main(int, char**) {
std::map<std::string, Function> operations;
operations ["add"] = pack_function<int, int>(add);
operations ["sub"] = pack_function<int, int>(
[](auto lhs, auto rhs) { return lhs - rhs;});
operations ["sum"] = [] (auto summands) {
int result = 0;
for (auto e : summands) {
result += e;
}
return result;
};
std::string line;
while (std::getline(std::cin, line)) {
std::istringstream command{line};
std::string operation;
command >> operation;
std::vector<int> arguments {
std::istream_iterator<int>{command},
std::istream_iterator<int>{} };
auto function = operations.find(operation);
if (function != operations.end ()) {
std::cout << line << " = ";
try {
std::cout << function->second(arguments);
} catch (std::string const & error) {
std::cout << error;
}
std::cout << std::endl;
}
}
return 0;
}
A live demo of the above code is here.
How to handle the overloaded function? It's true that I can distinguish the overloaded functions by the method of typeid, but it's far from a real implementation.
As you see, you don't need to, if you pack the relevant information into the function. By the way, typeid shouldn't be used for anything but diagnostics, since the name it reports is not guaranteed to differ between different types.
Now, finally, to handle functions that don't only take a different number of arguments, but also differ in the types of their arguments, you need to unify those types into a single one. That's normally called a "sum type", and very easy to achieve in languages like Haskell:
data Sum = IVal Int | SVal String
-- A value of type Sum is either an Int or a String
In C++ this is a lot harder to achieve, but a simple sketch could look like this:
struct Base {
virtual ~Base() = 0;
};
inline Base::~Base() {}
template<typename Target>
struct Storage : public Base {
Target value;
};
struct Any {
std::unique_ptr<Base const> value;
template<typename Target>
Target const & as(void) const {
return
dynamic_cast<Storage<Target> const &>(*value).value;
}
};
template<typename Target>
auto make_any(Target && value) {
return Any{std::make_unique<Storage<Target>>(value)};
}
But this is only a rough sketch, since there's boost::any which should work perfectly for this case. Note that the above and also boost::any are not quite like a real sum type (they can be any type, not just one from a given selection), but that shouldn't matter in your case.
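If a C++17 compiler is available, std::variant gives something closer to a real sum type, because the set of alternatives is closed. The following is only a sketch of that idea and not part of the solution above; the alias Value and the helper describe are names I made up here.

```cpp
#include <string>
#include <variant>

// A closed sum type: a Value holds exactly one of these alternatives.
using Value = std::variant<int, double, std::string>;

// Hypothetical helper: names the alternative that is currently stored.
inline std::string describe(Value const & v) {
    struct Visitor {
        std::string operator()(int) const { return "int"; }
        std::string operator()(double) const { return "double"; }
        std::string operator()(std::string const &) const { return "string"; }
    };
    // std::visit picks the overload matching the stored alternative.
    return std::visit(Visitor{}, v);
}
```

Unlike the Any sketch, forgetting to handle one of the alternatives in the visitor is a compile-time error here rather than a failed cast at run time.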
I hope this gets you started :)
Since you had problems adding multi type support I expanded a bit on the above sketch and got it working. The code is far from being production ready, though: I'm throwing strings around and don't talk to me about perfect forwarding :D
The main change to the above Any class is the use of a shared pointer instead of a unique one. This is only because it saved me from writing copy and move constructors and assignment operators.
Apart from that I added a member function to be able to print an Any value to a stream and added the respective operator:
struct Base {
virtual ~Base() = 0;
virtual void print_to(std::ostream &) const = 0;
};
inline Base::~Base() {}
template<typename Target>
struct Storage : public Base {
Target value;
Storage (Target t) // screw perfect forwarding :D
: value(std::forward<Target>(t)) {}
void print_to(std::ostream & stream) const {
stream << value;
}
};
struct Any {
std::shared_ptr<Base const> value;
template<typename Target>
Target const & as(void) const {
return
dynamic_cast<Storage<Target> const &>(*value).value;
}
template<typename T>
operator T const &(void) const {
return as<T>();
}
friend std::ostream & operator<<(std::ostream& stream, Any const & thing) {
thing.value->print_to(stream);
return stream;
}
};
template<typename Target>
Any make_any(Target && value) {
return Any{std::make_shared<Storage<typename std::remove_reference<Target>::type> const>(std::forward<Target>(value))};
}
I also wrote a small "parsing" function which shows how to turn a raw literal into an Any value containing (in this case) either an integer, a double or a string value:
Any parse_literal(std::string const & literal) {
try {
std::size_t next;
auto integer = std::stoi(literal, & next);
if (next == literal.size()) {
return make_any (integer);
}
auto floating = std::stod(literal, & next);
if (next == literal. size()) {
return make_any (floating);
}
} catch (std::invalid_argument const &) {}
// not very sensible, string literals should better be
// enclosed in some form of quotes, but that's the
// job of the parser
return make_any<std:: string> (std::string{literal});
}
std::istream & operator>>(std::istream & stream, Any & thing) {
std::string raw;
if (stream >> raw) {
thing = parse_literal (raw);
}
return stream;
}
By also providing operator>> it's possible to keep using istream_iterators for input.
The packing functions (or more precisely the functions returned by them) are also modified: when passing an element from the arguments vector to the next function, a conversion from Any to the respective argument type is performed. This may also fail, in which case a std::bad_cast is caught and an informative message rethrown. The innermost function (the lambda created inside pack_function) wraps its result in a make_any call.
add 5 4 = 9
sub 3 2 = 1
add 1 2 3 = wrong number of arguments, expected 2 but got 3
add 4 = wrong number of arguments, expected 2 but got 1
sum 1 2 3 4 = 10
sum = 0
sub 3 1.5 = argument 1 has wrong type
addf 3 3.4 = argument 0 has wrong type
addf 3.0 3.4 = 6.4
hi Pete = Hello Pete, how are you?
An example similar to the previous one can be found here. I need to add that this Any type doesn't support implicit type conversions, so when you have an Any with an int stored, you cannot pass that to a function expecting a double. This can be implemented, though, by manually providing a lot of conversion rules.
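To give an idea of what one such conversion rule might look like, here is a minimal sketch that uses C++17's std::any in place of the Any class above (an assumption made to keep the example short). The function name to_double is hypothetical, and only the int to double rule is covered.

```cpp
#include <any>
#include <stdexcept>

// Hypothetical conversion rule: an int may be widened to double,
// a double is passed through, and anything else is rejected.
inline double to_double(std::any const & value) {
    // The pointer form of any_cast returns nullptr on a type mismatch.
    if (auto const * i = std::any_cast<int>(&value)) {
        return static_cast<double>(*i);
    }
    if (auto const * d = std::any_cast<double>(&value)) {
        return *d;
    }
    throw std::invalid_argument("argument cannot be converted to double");
}
```

A packing function could call a rule like this instead of a plain cast whenever the target parameter type is double, which would make a call such as addf 3 3.4 succeed.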
But I also saw your update, so I took that code and applied the necessary modifications to run with my presented solution:
Any apply (multimap<string, Function> const & map, string const & name, Arguments arguments) {
auto range = map.equal_range(name);
for (auto function = range.first;
function != range.second;
++function) {
try {
return (function->second)(arguments);
} catch (string const &) {}
}
throw string {" no such function "};
}
int eg(int a,int b){ return a+b;}
int eg(int a,int b, string c){return a+b+c.length();}
double eh(string a){return a.size()/double(2);}
int main(){
multimap<string, Function> func_map;
func_map.insert(make_pair(
"eg",pack_function<int,int>(
static_cast<int(*)(int, int)>(&eg))));
func_map.insert(make_pair(
"eg",pack_function<int,int,string>(
static_cast<int (*)(int, int, string)>(&eg))));
func_map.insert(make_pair(
"eh",pack_function<string>(eh)));
// auto p1=make_tuple(1,2);
// if you want tuples, just write a
// function to convert them to a vector
// of Any.
Arguments p1 =
{make_any (1), make_any (2)};
int result1 =
apply(func_map, "eg", p1).as<int>();
vector<Any> p2{p1};
p2.push_back(make_any<string> ("test"));
int result2 =
apply(func_map, "eg", p2).as<int>();
Arguments p3 = {make_any<string>("testagain")};
double result3 =
apply(func_map, "eh", p3).as<double>();
cout << result1 << endl;
cout << result2 << endl;
cout << result3 << endl;
return 0;
}
It doesn't use tuples, but you could write a (template recursive) function to access each element of a tuple, wrap it into an Any and pack it inside a vector.
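A possible shape for that tuple conversion, sketched with C++17's std::apply and std::any instead of a hand-written recursion and the Any class (both substitutions are my assumption, as is the name tuple_to_any_vector):

```cpp
#include <any>
#include <tuple>
#include <vector>

// Hypothetical helper: copies each tuple element into a std::any,
// preserving the element order.
template<typename... Ts>
std::vector<std::any> tuple_to_any_vector(std::tuple<Ts...> const & t) {
    return std::apply(
        [](auto const &... elements) {
            return std::vector<std::any>{std::any(elements)...};
        },
        t);
}
```

Note that a string literal inside make_tuple is stored as a char const *, so it would have to be wrapped in std::string first if the target function expects a string.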
Also I'm not sure why the implicit conversion from Any doesn't work when initialising the result variables.
Hm, converting it to use boost::any shouldn't be that difficult. First, the make_any would just use boost::any's constructor:
template<typename T>
boost::any make_any(T&& value) {
return boost::any{std::forward<T>(value)};
}
In the pack function, the only thing that I'd guess needs to be changed is the "extraction" of the correct type from the current element in the arguments vector. Currently this is as simple as arguments.at(N), relying on implicit conversion to the required type. Since boost::any doesn't support implicit conversion, you need to use boost::any_cast to get to the underlying value:
template<
std::size_t N,
typename Arg,
typename... Args,
typename Fn>
Function pack(Fn && fn) {
return pack<N+1, Args...>(
[fn = std::forward<decltype(fn)>(fn)]
(Arguments arguments, Args const &... args)
{
try {
return fn(
arguments,
boost::any_cast<Arg>(arguments.at(N)),
args...);
} catch (boost::bad_any_cast const &) { // throws different type of exception
throw std::string{"argument "} + std::to_string (N) +
std::string{" has wrong type "};
}
});
}
And of course, if you use it like in the example you provided you also need to use boost::any_cast to access the result value.
This should (in theory) do it. You may need to add some std::remove_reference "magic" to the template parameter of the boost::any_cast calls, but I doubt that this is necessary.
(typename std::remove_reference<T>::type instead of just T)
Though I currently cannot test any of the above.

Related

Use std::variant as class member and apply visitor

I'm trying to use std::variant as a class member variable and then use operator overloading so that the two Variants of this class can use the operator plus to produce a new variable. The problem is that std::get does not work as I thought and so I cannot retrieve the correct (hardcoded) string types so that the AddVisitor struct is used.
I get a compilation error that says: no matching function for call to ‘get<0>(std::basic_string&)’
Also is there a way that operator+ function detects the type without if-else statements?
I have already checked a lot of answers in SO including ones that answer questions about similar Boost functionality, but I cannot get it to work.
#include <iostream>
#include <variant>
#include <string>
#include "stdafx.h"
using Variant = std::variant<int, std::string>;
template<typename T>
struct AddVisitor
{
T operator()(T v1, T v2)
{
return v1 + v2;
}
};
class Var
{
Variant v;
public:
template<typename T>
Var(T value) : v(value) {}
Var operator+(Var& val)
{
// PROBLEM: This is a hard coded example that I want to use, so that concatenation of two strings happens.
return std::visit(AddVisitor<std::string>(), std::get<std::string>(v), std::get<std::string>(val.get()));
// Is there a way to get the correct type without if-else statements here?
}
Variant get()
{
return v;
}
};
int main()
{
Var x("Hello "), y("World");
// The expected output is this:
Var res = x + y;
return 0;
}
I expect to be able to use the plus operator and concatenate two strings or two integers and create a new Var variable.
Ok, so there are a few things to talk about.
First, the visitor for std::visit with more than one variant argument should accept all combinations of variant types. In your case it should accept:
(string, string)
(string, int)
(int, int)
(int, string)
If for you only string, string and int, int are valid you still need to accept the other combinations for the code to compile, but you can throw in them.
Next, the visitor shouldn't be templated. Instead the operator() should be templated or overloaded for all the above combinations.
So here is AddVisitor:
struct AddVisitor
{
auto operator()(const std::string& a, const std::string& b) const -> Variant
{
return a + b;
}
auto operator()(int a, int b) const -> Variant
{
return a + b;
}
// all other overloads invalid
template <class T, class U>
auto operator()(T, U) const -> Variant
{
throw std::invalid_argument{"invalid"};
}
};
It's not clear from the documentation what the overloads can return, but I couldn't make it compile unless they all return Variant. Fortunately the compiler errors are TREMENDOUSLY HELPFUL. (I need to check the standard.)
Next, when you call std::visit you need to pass the variants you have.
So the final code is this:
auto operator+(Var& val) -> Var
{
return std::visit(AddVisitor{}, get(), val.get());
}
And you can indeed use it like you want:
Var res = x + y;
Another issue with your code is that get makes unnecessary copies. And copies of std::variant are not cheap to make. So I suggest:
auto get() const -> const Variant& { return v; }
auto get() -> Variant& { return v; }
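Putting the pieces together, a self-contained sketch could look like the following (C++17). It deviates from the snippets above in a few places that are my own choices: the constructors are written out instead of templated, the catch-all overload takes its arguments by const reference, and operator+ is const.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>

using Variant = std::variant<int, std::string>;

struct AddVisitor {
    Variant operator()(int a, int b) const { return a + b; }
    Variant operator()(std::string const & a, std::string const & b) const {
        return a + b;
    }
    // Every other combination must still compile, so it throws at run time.
    template <class T, class U>
    Variant operator()(T const &, U const &) const {
        throw std::invalid_argument{"mismatched operand types"};
    }
};

class Var {
    Variant v;
public:
    Var(int value) : v(value) {}
    Var(std::string value) : v(std::move(value)) {}
    Var(char const * value) : v(std::string{value}) {}
    explicit Var(Variant value) : v(std::move(value)) {}

    Variant const & get() const { return v; }

    // No if-else needed: std::visit dispatches on both stored types.
    Var operator+(Var const & other) const {
        return Var{std::visit(AddVisitor{}, v, other.v)};
    }
};
```

Adding two strings concatenates them, adding two ints sums them, and mixing the two throws std::invalid_argument.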

Lambda type deduction

auto dothings = [](long position) {
auto variable;
/*do things*/
return variable;
};
float x = dothings(1l);
char y = dothings(2l);
Basically, the thing I'm curious about, is whether is it possible in any way for the variable inside the lambda to deduce the type the return value is assigned to, in this situation it's float and char. Is there an equivalent to template typename? Thanks.
This can be done, but it's a) kinda complex, and b) not really a great idea 99.9% of the time. Here's how you proceed. The only way you can do something based on the type you assign the expression to is to take advantage of implicit conversions. This requires a templated implicit conversion operator, which can't be declared locally in a lambda, so we have to start by writing a bit of support code:
template <class T>
struct identity {
T get(); // not defined
};
template <class F>
struct ReturnConverter {
F f;
template <class T>
operator T() {
return f(identity<T>{});
}
};
template <class F>
auto makeReturnConverter(F f) { return ReturnConverter<F>{f}; }
The first class is just to help the lambda with inferring types. The second class is one that itself takes a lambda (or any callable), and has an implicit conversion operator to any type. When the conversion is asked for, it calls the callable, using our identity class template as a way to feed the type. We can use this now like this:
auto doIt = [] (long l) {
return makeReturnConverter([=] (auto t) {
return l + sizeof(decltype(t.get()));
});
};
This lambda creates our special ReturnConverter class by feeding in another lambda. This lambda captures the long l argument of the outer lambda (by value), and it's prepared to accept our special identity class as the sole argument. It can then back out the "target" or destination type. I use sizeof here to quickly show an example where the result depends on both the argument to the lambda, and the target type. But note that once I get the type using decltype(t.get()), I could actually declare a variable of that type and do whatever I wanted with it.
float x = doIt(5);
double y = doIt(2);
After these calls, x will be 9 (a float is size 4, + 5) and y will be 10 (double is size 8, + 2).
Here's a more interesting example:
auto doIt2 = [] (long l) {
return makeReturnConverter([=] (auto t) {
decltype(t.get()) x;
for (std::size_t i = 0; i != l; ++i ) {
x.push_back(i*i);
}
return x;
});
};
std::vector<int> x = doIt2(5);
std::deque<int> y = doIt2(5);
Here, I'm able to generically build a standard container according to what the left side asks for, as long as the standard container has the push_back method.
Like I said at the start, this is overly complicated and hard to justify in the vast majority of cases. Usually you could just specify the type as an explicit (non-inferred) template parameter and use auto on the left. I have used it in very specific cases though. For instance, in gtest, when you declare test fixtures, any reused data are declared as member variables in the fixture. Non-static member variables have to be declared with their type (can't use auto), so I used this trick to allow quickly building certain kinds of fixture data while keeping repetition as close to zero as I could. It's OK in this use case because that code doesn't need to be very robust but really wants to minimize repetition at almost any cost; usually this isn't such a good trade-off (and with auto available on the left it isn't usually necessary).
The answer here is NO.
The type returned is based on the values used inside the function
auto dothings = [](long position) {
auto variable; // This is illegal as the compiler can not deduce the type.
// An auto declaration needs some way for it to determine the type.
// So that when you use it below it can determine the
// return type of the function.
/*do things*/
return variable;
};
The assignment operator looks at the type of the result expression to see if there are any auto conversions that can be applied to convert the function result type into the destination of the assignment type.
char x = dothings(10); // the result of the expression on the right
// is converted to char before assignment. If
// it can't be converted then it is a compiler error.
You can think of lambdas as syntactic sugar for creating a functor.
[<capture List>](<Parameter List>) {
<Code>
}
Is the same as:
struct <CompilerGeneratedName>
{
<Capture List With Types>;
Constructor(<Capture List>)
: <Capture List With Types>(<Capture List>)
{}
<Calculated Return> operator()(<Parameter List>) const {
<Code>
}
}{<Capture List>};
Example:
{
int y = 4;
auto l1 = [&y](int x){return y++ + x;};
struct MyF
{
int& y;
MyF(int& y)
: y(y)
{}
int operator()(int x) const {
return y++ + x;
}
};
auto l2 = MyF(y);
std::cout << l2(5) << " " << l1(5) << "\n";
}
The only way I know to do this is to use a template parameter:
template<typename T>
T dothings(long value)
{
T result;
...
return result;
}
auto x = dothings<float>(1l);
auto y = dothings<char>(2l);
The template also allows you to specialize for certain types. So if you wanted different logic for doubles, you could write the following:
template<>
double dothings<double>(long value)
{
double result; // T is not in scope inside a full specialization
...
return result * 2.0;
}

Variadic arguments and function pointers vector

I'm facing an almost-logical problem while working on C++11.
I have a class I have to plot (aka draw a trend) and I want to exclude all the points which do not satisfy a given condition.
The points are of the class Foo and all the conditional functions are defined with the signature bool Foo::Bar(Args...) const where Args... represents a number of parameters (e.g. upper and lower limits on the returned value).
Everything went well up to the moment I wished to apply a single condition to the values to plot. Let's say I have a FooPlotter class which has something like:
template<class ...Args> GraphClass FooPlotter::Plot([...],bool (Foo::*Bar)(Args...), Args... args)
Which will iterate over my data container and apply the condition Foo::*Bar to all the elements, plotting the values which satisfy the given condition.
So far so good.
At a given point I wanted to pass a vector of conditions to the same method, in order to use several conditions to filter data.
I first created a class to contain everything I need to have later:
template<class ...Args> class FooCondition{
public:
FooCondition(bool (Foo::*Bar)(Args...) const, Args... args)
{
fCondition = Bar;
fArgs = std::make_tuple(args);
}
bool operator()(Foo data){ return (data.*fCondition)(args); }
private:
bool (Foo::*fCondition)(Args...) const;
std::tuple<Args...> fArgs;
};
Then I got stuck on how to define an (iterable) container which can contain FooCondition objects despite them having several types for the Args... arguments pack.
The problem is that some methods have Args... = uint64_t, uint64_t while others require no arguments to be called.
I dug a bit into how to handle this kind of situation. I tried several approaches, but none of them worked well.
For the moment I have added ignored arguments to all the Bar methods, making them uniform and working around the issue, but I am not really satisfied!
Does anyone have an idea how to store differently typed FooCondition objects in an elegant way?
EDIT: Additional information on the result I want to obtain.
First I want to be able to create a std::vector of FooCondition items:
std::vector<FooCondition> conditions;
conditions.emplace_back(FooCondition(&Foo::IsBefore, uint64_t timestamp1));
conditions.emplace_back(FooCondition(&Foo::IsAttributeBetween, double a, double b));
conditions.emplace_back(FooCondition(&Foo::IsOk));
At this point I wish I can do something like the following, in my FooPlotter::Plot method:
GraphClass FooPlotter::Plot(vector<Foo> data, vector<FooCondition> conditions){
GraphClass graph;
for(const auto &itData : data){
bool shouldPlot = true;
for(const auto &itCondition : conditions){
shouldPlot &= itCondition(itData);
}
if(shouldPlot) graph.AddPoint(itData);
}
return graph;
}
As you can argue the FooCondition struct should pass the right arguments to the method automatically using the overloaded operator.
Here the issue is to find the correct container to be able to create a collection of FooCondition templates despite the size of their arguments pack.
It seems to me that, with FooCondition, you're trying to create a substitute for a std::function<bool(Foo *)> (or maybe std::function<bool(Foo const *)>) initialized with a std::bind that fixes some arguments of Foo methods.
I mean... I think that instead of
std::vector<FooCondition> conditions;
conditions.emplace_back(FooCondition(&Foo::IsBefore, uint64_t timestamp1));
conditions.emplace_back(FooCondition(&Foo::IsAttributeBetween, double a, double b));
conditions.emplace_back(FooCondition(&Foo::IsOk));
you should write something like
std::vector<std::function<bool(Foo const *)>> vfc;
using namespace std::placeholders;
vfc.emplace_back(std::bind(&Foo::IsBefore, _1, 64U));
vfc.emplace_back(std::bind(&Foo::IsAttributeBetween, _1, 10.0, 100.0));
vfc.emplace_back(std::bind(&Foo::IsOk, _1));
The following is a simplified full working C++11 example with a main() that simulates Plot()
#include <cstdint>
#include <vector>
#include <iostream>
#include <functional>
struct Foo
{
double value;
bool IsBefore (std::uint64_t ts) const
{ std::cout << "- IsBefore(" << ts << ')' << std::endl;
return value < ts; }
bool IsAttributeBetween (double a, double b) const
{ std::cout << "- IsAttributeBetween(" << a << ", " << b << ')'
<< std::endl; return (a < value) && (value < b); }
bool IsOk () const
{ std::cout << "- IsOk" << std::endl; return value != 0.0; }
};
int main ()
{
std::vector<std::function<bool(Foo const *)>> vfc;
using namespace std::placeholders;
vfc.emplace_back(std::bind(&Foo::IsBefore, _1, 64U));
vfc.emplace_back(std::bind(&Foo::IsAttributeBetween, _1, 10.0, 100.0));
vfc.emplace_back(std::bind(&Foo::IsOk, _1));
std::vector<Foo> vf { Foo{0.0}, Foo{10.0}, Foo{20.0}, Foo{80.0} };
for ( auto const & f : vf )
{
bool bval { true };
for ( auto const & c : vfc )
bval &= c(&f);
std::cout << "---- for " << f.value << ": " << bval << std::endl;
}
}
Another way is to avoid std::bind and use lambda functions instead.
For example:
std::vector<std::function<bool(Foo const *)>> vfc;
vfc.emplace_back([](Foo const * fp)
{ return fp->IsBefore(64U); });
vfc.emplace_back([](Foo const * fp)
{ return fp->IsAttributeBetween(10.0, 100.0); });
vfc.emplace_back([](Foo const * fp)
{ return fp->IsOk(); });
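If you would rather keep the FooCondition(&Foo::Method, args...) call shape from the question, a small factory that returns a std::function might do. This is only a sketch: make_condition, Condition and passes_all are names I made up, Foo is reduced to a stand-in without the printing, and the conditions take a Foo const & instead of a pointer.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Stand-in for the Foo class (assumed, simplified).
struct Foo {
    double value;
    bool IsBefore(std::uint64_t ts) const { return value < ts; }
    bool IsAttributeBetween(double a, double b) const {
        return (a < value) && (value < b);
    }
    bool IsOk() const { return value != 0.0; }
};

using Condition = std::function<bool(Foo const &)>;

// Hypothetical factory: binds the trailing arguments now, takes the Foo later.
template<typename... Params, typename... Args>
Condition make_condition(bool (Foo::*method)(Params...) const, Args... args) {
    return [method, args...](Foo const & foo) {
        return (foo.*method)(args...);
    };
}

// Returns true only if every condition accepts the point.
inline bool passes_all(Foo const & foo,
                       std::vector<Condition> const & conditions) {
    for (auto const & condition : conditions) {
        if (!condition(foo)) return false;
    }
    return true;
}
```

All conditions then share the single type Condition, so a plain std::vector<Condition> can hold them regardless of how many arguments each method takes.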
All of the foo bar aside, you just need a class with a method that can be implemented to satisfy the plot.
Just add a Plot method to the class which accepts the node and performs the transformation and plotting in the same step.
You need not worry about args when plotting, because each function knows what arguments it needs.
Thus a simple args* will suffice: when it is null there are no arguments, and otherwise each arg reveals its type and value, or these can be assumed from the function invocation.

Overload function for arguments (not) deducable at compile time

Is there a way to overload a function in a way to distinguish between the argument being evaluable at compile time or at runtime only?
Suppose I have the following function:
std::string lookup(int x) {
return table<x>::value;
}
which allows me to select a string value based on the parameter x in constant time (with space overhead). However, in some cases x cannot be provided at compile time, and I need to run a version of lookup which does the lookup with a higher time complexity.
I could use functions with a different name of course, but I would like to have an unified interface.
I accepted an answer, but I'm still interested if this distinction is possible with exactly the same function call.
I believe the closest you can get is to overload lookup on int and std::integral_constant<int>; then, if the caller knows the value at compile time, they can call the latter overload:
#include <type_traits>
#include <string>
std::string lookup(int const& x) // a
{
return "a"; // high-complexity lookup using x
}
template<int x>
std::string lookup(std::integral_constant<int, x>) // b
{
return "b"; // return table<x>::value;
}
template<typename T = void>
void lookup(int const&&) // c
{
static_assert(
!std::is_same<T, T>{},
"to pass a compile-time constant to lookup, pass"
" an instance of std::integral_constant<int>"
);
}
template<int N>
using int_ = std::integral_constant<int, N>;
int main()
{
int x = 3;
int const y = 3;
constexpr int z = 3;
lookup(x); // calls a
lookup(y); // calls a
lookup(z); // calls a
lookup(int_<3>{}); // calls b
lookup(3); // calls c, compile-time error
}
Notes:
I've provided an int_ helper here so construction of std::integral_constant<int> is less verbose for the caller; this is optional.
Overload c will have false negatives (e.g. constexpr int variables are passed to overload a, not overload c), but this will weed out any actual int literals.
One option would be to use overloading in a similar manner:
template <int x> std::string find() {
return table<x>::value;
}
std::string find(int x) {
return ...
}
There is also this trick:
std::string lookup(int x) {
switch(x) {
case 0: return table<0>::value;
case 1: return table<1>::value;
case 2: return table<2>::value;
case 3: return table<3>::value;
default: return generic_lookup(x);
}
}
This sort of thing works well when it's advantageous, but not required, for the integer to be known at compile time. For example, if it helps the optimizer. It can be hell on compile times though, if you're calling many instances of some complicated function in this way.
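The switch does not have to be written by hand. One way to generate the dispatch, sketched here under some assumptions, is a table of function pointers built from std::index_sequence (C++14). Note that this swaps the switch for a jump table, that table<X>::value is assumed to be a static function rather than a data member, and that lookup_impl and the bodies of table and generic_lookup are placeholders of mine.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <utility>

// Stand-in for the table<x> template from the question (assumed contents).
template<int X>
struct table {
    static std::string value() { return "entry " + std::to_string(X); }
};

// Stand-in for the slow path used outside the precomputed range.
inline std::string generic_lookup(int x) {
    return "generic " + std::to_string(x);
}

template<std::size_t... Is>
std::string lookup_impl(int x, std::index_sequence<Is...>) {
    using Getter = std::string (*)();
    // One function pointer per compile-time index, in index order.
    static constexpr std::array<Getter, sizeof...(Is)> getters{
        {&table<static_cast<int>(Is)>::value...}};
    if (x >= 0 && static_cast<std::size_t>(x) < sizeof...(Is)) {
        return getters[static_cast<std::size_t>(x)]();
    }
    return generic_lookup(x);
}

// Values 0..3 go through the compile-time table, everything else falls back.
inline std::string lookup(int x) {
    return lookup_impl(x, std::make_index_sequence<4>{});
}
```

The compile-time cost mentioned above still applies, since every table<X> in the range gets instantiated.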

Lazy evaluation in C++

C++ does not have native support for lazy evaluation (as Haskell does).
I'm wondering if it is possible to implement lazy evaluation in C++ in a reasonable manner. If yes, how would you do it?
EDIT: I like Konrad Rudolph's answer.
I'm wondering if it's possible to implement it in a more generic fashion, for example by using a parametrized class lazy<T> that essentially works for T the way matrix_add works for matrix.
Any operation on T would return lazy<T> instead. The only problem is to store the arguments and operation code inside lazy<T> itself. Can anyone see how to improve this?
I'm wondering if it is possible to implement lazy evaluation in C++ in a reasonable manner. If yes, how would you do it?
Yes, this is possible and quite often done, e.g. for matrix calculations. The main mechanism to facilitate this is operator overloading. Consider the case of matrix addition. The signature of the function would usually look something like this:
matrix operator +(matrix const& a, matrix const& b);
Now, to make this function lazy, it's enough to return a proxy instead of the actual result:
struct matrix_add;
matrix_add operator +(matrix const& a, matrix const& b) {
return matrix_add(a, b);
}
Now all that needs to be done is to write this proxy:
struct matrix_add {
matrix_add(matrix const& a, matrix const& b) : a(a), b(b) { }
operator matrix() const {
matrix result;
// Do the addition.
return result;
}
private:
matrix const& a;
matrix const& b; // "matrix const& a, b;" would make b a copy, not a reference
};
The magic lies in the method operator matrix() which is an implicit conversion operator from matrix_add to plain matrix. This way, you can chain multiple operations (by providing appropriate overloads of course). The evaluation takes place only when the final result is assigned to a matrix instance.
EDIT I should have been more explicit. As it is, the code makes no sense because although evaluation happens lazily, it still happens in the same expression. In particular, another addition will evaluate this code unless the matrix_add structure is changed to allow chained addition. C++0x greatly facilitates this by allowing variadic templates (i.e. template lists of variable length).
However, one very simple case where this code would actually have a real, direct benefit is the following:
int value = (A + B)(2, 3);
Here, it is assumed that A and B are two-dimensional matrices and that dereferencing is done in Fortran notation, i.e. the above calculates one element out of a matrix sum. It's of course wasteful to add the whole matrices. matrix_add to the rescue:
struct matrix_add {
// … yadda, yadda, yadda …
int operator ()(unsigned int x, unsigned int y) const {
// Calculate *just one* element:
return a(x, y) + b(x, y);
}
};
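The two uses of matrix_add described above can be put together in one sketch. The matrix class here is a hypothetical minimal stand-in (the answer never defines it), both operands are assumed to have the same dimensions, and the additions_performed counter exists only to make the laziness observable:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal matrix type; the answer above leaves `matrix` undefined.
class matrix {
public:
    matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0) {}
    int& operator()(std::size_t r, std::size_t c) { return data_[r * cols_ + c]; }
    int operator()(std::size_t r, std::size_t c) const { return data_[r * cols_ + c]; }
    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
private:
    std::size_t rows_, cols_;
    std::vector<int> data_;
};

// Illustration only: counts element additions so laziness can be observed.
static int additions_performed = 0;

// The proxy is defined before operator+ so that it is a complete type there.
class matrix_add {
public:
    matrix_add(matrix const& a, matrix const& b) : a_(a), b_(b) {}

    // Full evaluation: runs only when a real matrix is requested.
    // Assumes a_ and b_ have identical dimensions.
    operator matrix() const {
        matrix result(a_.rows(), a_.cols());
        for (std::size_t r = 0; r < a_.rows(); ++r)
            for (std::size_t c = 0; c < a_.cols(); ++c) {
                result(r, c) = a_(r, c) + b_(r, c);
                ++additions_performed;
            }
        return result;
    }

    // Partial evaluation: computes just one element of the sum.
    int operator()(std::size_t r, std::size_t c) const {
        ++additions_performed;
        return a_(r, c) + b_(r, c);
    }
private:
    matrix const& a_;
    matrix const& b_;
};

inline matrix_add operator+(matrix const& a, matrix const& b) {
    return matrix_add(a, b);
}
```

With two 100x100 matrices A and B, (A + B)(2, 3) performs a single addition, while matrix C = A + B; performs all 10000 of them at the point of conversion.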
Other examples abound. I've just remembered that I have implemented something related not long ago. Basically, I had to implement a string class that should adhere to a fixed, pre-defined interface. However, my particular string class dealt with huge strings that weren't actually stored in memory. Usually, the user would just access small substrings from the original string using a function infix. I overloaded this function for my string type to return a proxy that held a reference to my string, along with the desired start and end position. Only when this substring was actually used did it query a C API to retrieve this portion of the string.
Boost.Lambda is very nice, but Boost.Proto is exactly what you are looking for. It already has overloads of all C++ operators, which by default perform their usual function when proto::eval() is called, but can be changed.
What Konrad already explained can be taken further to support nested invocations of operators, all executed lazily. In Konrad's example, he has an expression object that can store exactly two arguments, for exactly two operands of one operation. The problem is that it will only execute one subexpression lazily, which nicely explains the concept of lazy evaluation in simple terms, but doesn't improve performance substantially. The other example also shows well how one can apply operator() to add only some elements using that expression object. But to evaluate arbitrarily complex expressions, we need some mechanism that can store the structure of the expression too. We can't get around templates to do that. The name for this technique is expression templates. The idea is that one templated expression object can store the structure of some arbitrary sub-expression recursively, like a tree, where the operations are the nodes and the operands are the child nodes. For a very good explanation that I found only today (some days after I wrote the code below), see here.
template<typename Lhs, typename Rhs>
struct AddOp {
Lhs const& lhs;
Rhs const& rhs;
AddOp(Lhs const& lhs, Rhs const& rhs):lhs(lhs), rhs(rhs) {
// empty body
}
Lhs const& get_lhs() const { return lhs; }
Rhs const& get_rhs() const { return rhs; }
};
That will store any addition operation, even a nested one, as can be seen from the following definition of an operator+ for a simple point type:
struct Point { int x, y; };
// add expression template with point at the right
template<typename Lhs, typename Rhs> AddOp<AddOp<Lhs, Rhs>, Point>
operator+(AddOp<Lhs, Rhs> const& lhs, Point const& p) {
return AddOp<AddOp<Lhs, Rhs>, Point>(lhs, p);
}
// add expression template with point at the left
template<typename Lhs, typename Rhs> AddOp< Point, AddOp<Lhs, Rhs> >
operator+(Point const& p, AddOp<Lhs, Rhs> const& rhs) {
return AddOp< Point, AddOp<Lhs, Rhs> >(p, rhs);
}
// add two points, yielding an expression template
AddOp< Point, Point >
operator+(Point const& lhs, Point const& rhs) {
return AddOp<Point, Point>(lhs, rhs);
}
Now, if you have
Point p1 = { 1, 2 }, p2 = { 3, 4 }, p3 = { 5, 6 };
p1 + (p2 + p3); // returns AddOp< Point, AddOp<Point, Point> >
You now just need to give the Point type an operator= and a suitable constructor that accept an AddOp. Change its definition to:
struct Point {
int x, y;
Point(int x = 0, int y = 0):x(x), y(y) { }
template<typename Lhs, typename Rhs>
Point(AddOp<Lhs, Rhs> const& op) {
x = op.get_x();
y = op.get_y();
}
template<typename Lhs, typename Rhs>
Point& operator=(AddOp<Lhs, Rhs> const& op) {
x = op.get_x();
y = op.get_y();
return *this;
}
int get_x() const { return x; }
int get_y() const { return y; }
};
And add the appropriate get_x and get_y into AddOp as member functions:
int get_x() const {
return lhs.get_x() + rhs.get_x();
}
int get_y() const {
return lhs.get_y() + rhs.get_y();
}
Note how we haven't created any temporaries of type Point. It could have been a big matrix with many fields. But at the time the result is needed, we calculate it lazily.
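For reference, the pieces above can be assembled into one translation unit in an order that compiles: AddOp first, then Point, then the operators. This is only a sketch; the points_built counter is a hypothetical addition used to check that no intermediate Point is created.

```cpp
// Expression node: keeps references to its operands and adds on demand.
template <typename Lhs, typename Rhs>
struct AddOp {
    Lhs const& lhs;
    Rhs const& rhs;
    AddOp(Lhs const& l, Rhs const& r) : lhs(l), rhs(r) {}
    int get_x() const { return lhs.get_x() + rhs.get_x(); }
    int get_y() const { return lhs.get_y() + rhs.get_y(); }
};

// Hypothetical counter (not in the original): counts Points built from
// coordinates or from an expression, so temporaries would show up here.
static int points_built = 0;

struct Point {
    int x, y;
    Point(int x_ = 0, int y_ = 0) : x(x_), y(y_) { ++points_built; }
    // Evaluates the whole expression tree in one pass.
    template <typename Lhs, typename Rhs>
    Point(AddOp<Lhs, Rhs> const& op) : x(op.get_x()), y(op.get_y()) { ++points_built; }
    template <typename Lhs, typename Rhs>
    Point& operator=(AddOp<Lhs, Rhs> const& op) {
        x = op.get_x();
        y = op.get_y();
        return *this;
    }
    int get_x() const { return x; }
    int get_y() const { return y; }
};

// expression + point
template <typename Lhs, typename Rhs>
AddOp<AddOp<Lhs, Rhs>, Point> operator+(AddOp<Lhs, Rhs> const& lhs, Point const& p) {
    return AddOp<AddOp<Lhs, Rhs>, Point>(lhs, p);
}

// point + expression
template <typename Lhs, typename Rhs>
AddOp<Point, AddOp<Lhs, Rhs> > operator+(Point const& p, AddOp<Lhs, Rhs> const& rhs) {
    return AddOp<Point, AddOp<Lhs, Rhs> >(p, rhs);
}

// point + point
inline AddOp<Point, Point> operator+(Point const& lhs, Point const& rhs) {
    return AddOp<Point, Point>(lhs, rhs);
}
```

Because every AddOp only holds references, the expression has to be converted to a Point within the same full expression that built it (as in Point sum = p1 + (p2 + p3);); storing the AddOp itself for later use would leave dangling references to the nested temporaries.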
I have nothing to add to Konrad's post, but you can look at Eigen for an example of lazy evaluation done right, in a real world app. It is pretty awe inspiring.
I'm thinking about implementing a template class, that uses std::function. The class should, more or less, look like this:
template <typename Value>
class Lazy
{
public:
Lazy(std::function<Value()> function) : _function(function), _evaluated(false) {}
Value &operator*() { Evaluate(); return _value; }
Value *operator->() { Evaluate(); return &_value; }
private:
void Evaluate()
{
if (!_evaluated)
{
_value = _function();
_evaluated = true;
}
}
std::function<Value()> _function;
Value _value;
bool _evaluated;
};
For example usage:
class Noisy
{
public:
Noisy(int i = 0) : _i(i)
{
std::cout << "Noisy(" << _i << ")" << std::endl;
}
Noisy(const Noisy &that) : _i(that._i)
{
std::cout << "Noisy(const Noisy &)" << std::endl;
}
~Noisy()
{
std::cout << "~Noisy(" << _i << ")" << std::endl;
}
void MakeNoise()
{
std::cout << "MakeNoise(" << _i << ")" << std::endl;
}
private:
int _i;
};
int main()
{
// Direct initialization: "Lazy<Noisy> n = lambda;" would need two user-defined conversions.
Lazy<Noisy> n([] () { return Noisy(10); });
std::cout << "about to make noise" << std::endl;
n->MakeNoise();
(*n).MakeNoise();
auto &nn = *n;
nn.MakeNoise();
}
The above code should produce the following output on the console:
Noisy(0)
about to make noise
Noisy(10)
~Noisy(10)
MakeNoise(10)
MakeNoise(10)
MakeNoise(10)
~Noisy(10)
Note that the constructor printing Noisy(10) will not be called until the variable is accessed.
This class is far from perfect, though. The first problem is that the default constructor of Value has to be called on member initialization (printing Noisy(0) in this case). We could use a pointer for _value instead, but I'm not sure whether that would affect performance.
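One way to avoid the default construction, sketched here under the assumption that C++17 is available, is to hold the value in a std::optional instead of a raw pointer. This is a variation on the class above, not a drop-in replacement; the evaluated() accessor and the Counter type are hypothetical additions for illustration.

```cpp
#include <functional>
#include <optional>
#include <utility>

template <typename Value>
class Lazy {
public:
    explicit Lazy(std::function<Value()> function)
        : function_(std::move(function)) {}

    Value& operator*() { evaluate(); return *value_; }
    Value* operator->() { evaluate(); return &*value_; }

    // Hypothetical accessor: reports whether the value has been computed yet.
    bool evaluated() const { return value_.has_value(); }

private:
    void evaluate() {
        // The optional starts empty, so Value needs no default constructor.
        if (!value_) value_.emplace(function_());
    }
    std::function<Value()> function_;
    std::optional<Value> value_;
};

// Hypothetical value type without a default constructor.
struct Counter {
    explicit Counter(int v) : value(v) {}
    int value;
};
```

A Lazy<Counter> built from a lambda does not run that lambda until the first use of operator* or operator->, and later accesses reuse the stored value.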
Johannes' answer works. But when it comes to more parentheses, it doesn't work as desired. Here is an example.
Point p1 = { 1, 2 }, p2 = { 3, 4 }, p3 = { 5, 6 }, p4 = { 7, 8 };
(p1 + p2) + (p3 + p4); // not lazy enough
This is because the three overloaded + operators don't cover the case
AddOp<LLhs,LRhs> + AddOp<RLhs,RRhs>
So the compiler has to convert either (p1 + p2) or (p3 + p4) to Point, which is not lazy enough. And when the compiler has to decide which one to convert, it complains, because neither conversion is better than the other.
Here comes my extension: add yet another overloaded operator+
template <typename LLhs, typename LRhs, typename RLhs, typename RRhs>
AddOp<AddOp<LLhs, LRhs>, AddOp<RLhs, RRhs>> operator+(const AddOp<LLhs, LRhs> & leftOperand, const AddOp<RLhs, RRhs> & rightOperand)
{
return AddOp<AddOp<LLhs, LRhs>, AddOp<RLhs, RRhs>>(leftOperand, rightOperand);
}
Now the compiler can handle the case above correctly, with no implicit conversion. Voilà!
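Every new operand combination needs its own overload with this approach. One possible alternative, which is an assumption of this sketch and not part of the answers above, is a single operator+ template restricted by a small trait (here called is_point_expr, a hypothetical name) so that it accepts any mix of Point and AddOp operands:

```cpp
#include <type_traits>

// Expression node, as in the earlier answer.
template <typename Lhs, typename Rhs>
struct AddOp {
    Lhs const& lhs;
    Rhs const& rhs;
    AddOp(Lhs const& l, Rhs const& r) : lhs(l), rhs(r) {}
    int get_x() const { return lhs.get_x() + rhs.get_x(); }
    int get_y() const { return lhs.get_y() + rhs.get_y(); }
};

struct Point {
    int x, y;
    Point(int x_ = 0, int y_ = 0) : x(x_), y(y_) {}
    template <typename Lhs, typename Rhs>
    Point(AddOp<Lhs, Rhs> const& op) : x(op.get_x()), y(op.get_y()) {}
    int get_x() const { return x; }
    int get_y() const { return y; }
};

// Hypothetical trait: true for Point and for any AddOp<...>, false otherwise.
template <typename T>
struct is_point_expr : std::false_type {};
template <>
struct is_point_expr<Point> : std::true_type {};
template <typename Lhs, typename Rhs>
struct is_point_expr<AddOp<Lhs, Rhs> > : std::true_type {};

// One overload for all four operand combinations; disabled for other types.
template <typename Lhs, typename Rhs,
          typename = typename std::enable_if<
              is_point_expr<Lhs>::value && is_point_expr<Rhs>::value>::type>
AddOp<Lhs, Rhs> operator+(Lhs const& lhs, Rhs const& rhs) {
    return AddOp<Lhs, Rhs>(lhs, rhs);
}
```

With this, (p1 + p2) + (p3 + p4) has the type AddOp<AddOp<Point, Point>, AddOp<Point, Point> > and nothing is converted to Point until the result is assigned.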
As it's going to be done in C++0x, by lambda expressions.
Anything is possible.
It depends on exactly what you mean:
class X
{
public: static X& getObjectA()
{
static X instanceA;
return instanceA;
}
};
Here we have the effect of a global variable that is lazily evaluated at the point of first use.
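A small sketch makes the timing visible. The constructions counter is a hypothetical addition; it shows that the object is built on the first call to getObjectA() and not before (since C++11 this initialization is also guaranteed to happen exactly once even with concurrent callers):

```cpp
// Illustration only: counts how often X's constructor has run.
static int constructions = 0;

class X {
public:
    static X& getObjectA() {
        static X instanceA;  // constructed on the first call only
        return instanceA;
    }
private:
    X() { ++constructions; }  // private: instances come from getObjectA()
};
```

Before the first call the counter is 0; after any number of calls it is 1, and every call returns a reference to the same object.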
As newly requested in the question, here is a version that takes Konrad Rudolph's design and extends it.
The Lazy object:
template<typename O,typename T1,typename T2>
struct Lazy
{
Lazy(T1 const& l,T2 const& r)
:lhs(l),rhs(r) {}
typedef typename O::Result Result;
operator Result() const
{
O op;
return op(lhs,rhs);
}
private:
T1 const& lhs;
T2 const& rhs;
};
How to use it:
namespace M
{
class Matrix
{
};
struct MatrixAdd
{
typedef Matrix Result;
Result operator()(Matrix const& lhs,Matrix const& rhs) const
{
Result r;
return r;
}
};
struct MatrixSub
{
typedef Matrix Result;
Result operator()(Matrix const& lhs,Matrix const& rhs) const
{
Result r;
return r;
}
};
template<typename T1,typename T2>
Lazy<MatrixAdd,T1,T2> operator+(T1 const& lhs,T2 const& rhs)
{
return Lazy<MatrixAdd,T1,T2>(lhs,rhs);
}
template<typename T1,typename T2>
Lazy<MatrixSub,T1,T2> operator-(T1 const& lhs,T2 const& rhs)
{
return Lazy<MatrixSub,T1,T2>(lhs,rhs);
}
}
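The listing above never shows a call site, so here is a reduced sketch of one. To make the result observable, Matrix is assumed to hold a single int (a hypothetical simplification), and the evaluations counter is likewise an addition for illustration:

```cpp
// Generic lazy binary operation, as in the answer above.
template <typename O, typename T1, typename T2>
struct Lazy {
    Lazy(T1 const& l, T2 const& r) : lhs(l), rhs(r) {}
    typedef typename O::Result Result;
    // Evaluation happens only when the result type is requested.
    operator Result() const {
        O op;
        return op(lhs, rhs);
    }
private:
    T1 const& lhs;
    T2 const& rhs;
};

namespace M {
    // Illustration only: counts how many additions were actually evaluated.
    static int evaluations = 0;

    // Hypothetical stand-in for a real matrix: one int instead of many cells.
    struct Matrix {
        int value;
        explicit Matrix(int v = 0) : value(v) {}
    };

    struct MatrixAdd {
        typedef Matrix Result;
        Result operator()(Matrix const& lhs, Matrix const& rhs) const {
            ++evaluations;
            return Matrix(lhs.value + rhs.value);
        }
    };

    // Found through argument-dependent lookup for Matrix and Lazy<MatrixAdd,...>.
    template <typename T1, typename T2>
    Lazy<MatrixAdd, T1, T2> operator+(T1 const& lhs, T2 const& rhs) {
        return Lazy<MatrixAdd, T1, T2>(lhs, rhs);
    }
}
```

Writing auto expr = a + b; only builds the proxy; the addition runs when expr is converted, as in M::Matrix sum = expr;. In a chain such as a + b + c, the inner proxy is converted to Matrix when the outer one is evaluated, so both additions happen at the final conversion.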
In C++11 lazy evaluation similar to hiapay's answer can be achieved using std::shared_future. You still have to encapsulate calculations in lambdas but memoization is taken care of:
std::shared_future<int> a = std::async(std::launch::deferred, [](){ return 1+1; });
Here's a full example:
#include <iostream>
#include <future>
#define LAZY(EXPR, ...) std::async(std::launch::deferred, [__VA_ARGS__](){ std::cout << "evaluating "#EXPR << std::endl; return EXPR; })
int main() {
std::shared_future<int> f1 = LAZY(8);
std::shared_future<int> f2 = LAZY(2);
std::shared_future<int> f3 = LAZY(f1.get() * f2.get(), f1, f2);
std::cout << "f3 = " << f3.get() << std::endl;
std::cout << "f2 = " << f2.get() << std::endl;
std::cout << "f1 = " << f1.get() << std::endl;
return 0;
}
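The memoization can be checked with a counter. The sketch below wraps the std::async call in a helper called make_lazy (a hypothetical name, not a standard function) and relies on std::launch::deferred running the task on the first get():

```cpp
#include <future>
#include <utility>

// Hypothetical helper: wraps a callable in a deferred task and shares the
// resulting future, so that the value can be requested more than once.
template <typename F>
auto make_lazy(F f) -> std::shared_future<decltype(f())> {
    return std::async(std::launch::deferred, std::move(f)).share();
}
```

Given int runs = 0; and auto lazy_sum = make_lazy([&runs] { ++runs; return 1 + 1; });, the lambda has not run yet. The first lazy_sum.get() runs it on the calling thread, and further calls to get() return the stored result without running it again.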
C++0x is nice and all... but for those of us living in the present, there are the Boost Lambda library and Boost Phoenix, both with the intent of bringing large amounts of functional programming to C++.
Let's take Haskell as our inspiration, it being lazy to the core.
Also, let's keep in mind how Linq in C# uses Enumerators in a monadic (urgh, here is the word, sorry) way.
Last but not least, let's keep in mind what coroutines are supposed to provide to programmers, namely the decoupling of computational steps (e.g. producer and consumer) from each other.
And let's try to think about how coroutines relate to lazy evaluation.
All of the above appears to be somehow related.
Next, let's try to extract our personal definition of what "lazy" comes down to.
One interpretation is: we want to state our computation in a composable way before executing it. Some of the parts we use to compose our complete solution might very well draw upon huge (sometimes infinite) data sources, with our full computation also producing either a finite or an infinite result.
Let's get concrete and into some code. We need an example for that! Here, I choose the fizzbuzz "problem" as an example, just for the reason that there is a nice, lazy solution to it.
In Haskell, it looks like this:
module FizzBuzz
  ( fb
  )
where
fb n =
  fmap merge fizzBuzzAndNumbers
  where
    fizz = cycle ["","","fizz"]
    buzz = cycle ["","","","","buzz"]
    fizzBuzz = zipWith (++) fizz buzz
    fizzBuzzAndNumbers = zip [1..n] fizzBuzz
    merge (x,s) = if length s == 0 then show x else s
The Haskell function cycle creates an infinite list (lazy, of course!) from a finite list by simply repeating the values in the finite list forever. In an eager programming style, writing something like that would ring alarm bells (memory overflow, endless loops!). But not so in a lazy language. The trick is, that lazy lists are not computed right away. Maybe never. Normally only as much as subsequent code requires it.
The third line in the where block above creates another lazy (!) list by combining the infinite lists fizz and buzz element by element, using the recipe "concatenate one string element from each input list into a single string". Again, if this were evaluated immediately, we would have to wait for our computer to run out of resources.
In the 4th line, we create tuples of the members of a finite lazy list [1..n] with our infinite lazy list fizzBuzz. The result is still lazy.
Even in the main body of our fb function, there is no need to get eager. The whole function returns a list with the solution, which itself is -again- lazy. You could as well think of the result of fb 50 as a computation which you can (partially) evaluate later. Or combine with other stuff, leading to an even larger (lazy) evaluation.
So, in order to get started with our C++ version of "fizzbuzz", we need to think of ways how to combine partial steps of our computation into larger bits of computations, each drawing data from previous steps as required.
You can see the full story in a gist of mine.
Here are the basic ideas behind the code:
Borrowing from C# and Linq, we "invent" a stateful, generic type Enumerator, which holds
- The current value of the partial computation
- The state of a partial computation (so we can produce subsequent values)
- The worker function, which produces the next state, the next value and a bool which states if there is more data or if the enumeration has come to an end.
In order to be able to compose Enumerator<T,S> instances by means of the power of the . (dot), this class also contains functions borrowed from Haskell type classes such as Functor and Applicative.
The worker function for an enumerator is always of the form S -> std::tuple<bool,S,T>, where S is the generic type variable representing the state and T is the generic type variable representing a value, i.e. the result of a computation step.
All this is already visible in the first lines of the Enumerator class definition.
template <class T, class S>
class Enumerator
{
public:
typedef S State_t; // no "typename" here: S is not a dependent qualified name
typedef T Value_t;
typedef std::function<
std::tuple<bool, State_t, Value_t>
(const State_t&
)
> Worker_t;
Enumerator(Worker_t worker, State_t s0)
: m_worker(worker)
, m_state(s0)
, m_value{}
{
}
// ...
};
So, to create a specific enumerator instance, all we need is a worker function and an initial state; we then create an instance of Enumerator with those two arguments.
Here is an example: the function range(first,last) creates a finite range of values. This corresponds to a lazy list in the Haskell world.
template <class T>
Enumerator<T, T> range(const T& first, const T& last)
{
auto finiteRange =
[first, last](const T& state)
{
T v = state;
T s1 = (state < last) ? (state + 1) : state;
bool active = state != s1;
return std::make_tuple(active, s1, v);
};
return Enumerator<T,T>(finiteRange, first);
}
And we can make use of this function, for example like this: auto r1 = range(size_t{1}, size_t{10}); (both arguments need the same type so that T can be deduced). We have created ourselves a lazy list with 10 elements!
Now, all that is missing for our "wow" experience is to see how we can compose enumerators.
Let's come back to Haskell's cycle function, which is kind of cool. How would it look in our C++ world? Here it is:
template <class T, class S>
auto
cycle
( Enumerator<T, S> values
) -> Enumerator<T, S>
{
auto eternally =
[values](const S& state) -> std::tuple<bool, S, T>
{
auto[active, s1, v] = values.step(state);
if (active)
{
return std::make_tuple(active, s1, v);
}
else
{
return std::make_tuple(true, values.state(), v);
}
};
return Enumerator<T, S>(eternally, values.state());
}
It takes an enumerator as input and returns an enumerator. The local (lambda) function eternally simply resets the input enumeration to its start value whenever it runs out of values and voilà, we have an infinite, ever-repeating version of the list we gave as an argument: auto foo = cycle(range(size_t{1}, size_t{3})); And we can already shamelessly compose our lazy "computations".
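Since the Enumerator class is only shown in part, here is a reduced, self-contained sketch of how range and cycle could fit together. The step() and state() members are assumptions modelled on the calls used above, and take() is a hypothetical helper added to pull a bounded number of values out of a possibly infinite enumerator; none of this is the actual code from the gist.

```cpp
#include <cstddef>
#include <functional>
#include <tuple>
#include <utility>
#include <vector>

template <class T, class S>
class Enumerator {
public:
    typedef std::function<std::tuple<bool, S, T>(const S&)> Worker_t;

    Enumerator(Worker_t worker, S s0)
        : m_worker(std::move(worker)), m_state(std::move(s0)) {}

    // Assumed accessors: run the worker once / expose the initial state.
    std::tuple<bool, S, T> step(const S& state) const { return m_worker(state); }
    const S& state() const { return m_state; }

    // Hypothetical helper: pulls at most `limit` values, starting from the
    // initial state. Nothing is computed beyond what is asked for.
    std::vector<T> take(std::size_t limit) const {
        std::vector<T> out;
        S state = m_state;
        while (out.size() < limit) {
            std::tuple<bool, S, T> r = step(state);
            out.push_back(std::get<2>(r));
            if (!std::get<0>(r)) break;  // the worker reported its last value
            state = std::get<1>(r);
        }
        return out;
    }

private:
    Worker_t m_worker;
    S m_state;
};

// Finite range first..last (inclusive), same worker logic as above.
template <class T>
Enumerator<T, T> range(const T& first, const T& last) {
    auto worker = [last](const T& state) {
        T next = (state < last) ? (state + 1) : state;
        return std::make_tuple(state != next, next, state);
    };
    return Enumerator<T, T>(worker, first);
}

// Infinite repetition: restart from the initial state when the input ends.
template <class T, class S>
Enumerator<T, S> cycle(Enumerator<T, S> values) {
    auto eternally = [values](const S& state) -> std::tuple<bool, S, T> {
        std::tuple<bool, S, T> r = values.step(state);
        if (std::get<0>(r)) return r;
        return std::make_tuple(true, values.state(), std::get<2>(r));
    };
    return Enumerator<T, S>(eternally, values.state());
}
```

Calling cycle(range(std::size_t{1}, std::size_t{3})).take(7) walks the states 1, 2, 3 and then starts over, so only seven worker calls are made even though the enumerator itself never ends.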
zip is a good example showing that we can also create a new enumerator from two input enumerators. The resulting enumerator yields as many values as the smaller of the two input enumerators (tuples with 2 elements, one for each input enumerator). I have implemented zip inside class Enumerator itself. Here is what it looks like:
// member function of class Enumerator<T,S>
template <class T1, class S1>
auto
zip
( Enumerator<T1, S1> other
) -> Enumerator<std::tuple<T, T1>, std::tuple<S, S1> >
{
auto worker0 = this->m_worker;
auto worker1 = other.worker();
auto combine =
[worker0,worker1](std::tuple<S, S1> state) ->
std::tuple<bool, std::tuple<S, S1>, std::tuple<T, T1> >
{
auto[s0, s1] = state;
auto[active0, newS0, v0] = worker0(s0);
auto[active1, newS1, v1] = worker1(s1);
return std::make_tuple
( active0 && active1
, std::make_tuple(newS0, newS1)
, std::make_tuple(v0, v1)
);
};
return Enumerator<std::tuple<T, T1>, std::tuple<S, S1> >
( combine
, std::make_tuple(m_state, other.state())
);
}
Please note how the "combining" also ends up combining the state of both sources and the values of both sources.
As this post is already TL;DR for many, here is the...
Summary
Yes, lazy evaluation can be implemented in C++. Here, I did it by borrowing the function names from Haskell and the paradigm from C# enumerators and Linq. There might be similarities to Python's itertools, by the way; I think they followed a similar approach.
My implementation (see the gist link above) is just a prototype - not production code, btw. So no warranties whatsoever from my side. It serves well as demo code to get the general idea across, though.
And what would this answer be without the final C++ version of fizzbuzz, eh? Here it is:
std::string fizzbuzz(size_t n)
{
typedef std::vector<std::string> SVec;
// merge (x,s) = if length s == 0 then show x else s
auto merge =
[](const std::tuple<size_t, std::string> & value)
-> std::string
{
auto[x, s] = value;
if (s.length() > 0) return s;
else return std::to_string(x);
};
SVec fizzes{ "","","fizz" };
SVec buzzes{ "","","","","buzz" };
return
range(size_t{ 1 }, n)
.zip
( cycle(iterRange(fizzes.cbegin(), fizzes.cend()))
.zipWith
( std::function(concatStrings)
, cycle(iterRange(buzzes.cbegin(), buzzes.cend()))
)
)
.map<std::string>(merge)
.statefulFold<std::ostringstream&>
(
[](std::ostringstream& oss, const std::string& s)
{
if (0 == oss.tellp())
{
oss << s;
}
else
{
oss << "," << s;
}
}
, std::ostringstream()
)
.str();
}
And... to drive the point home even further, here is a variation of fizzbuzz which returns an "infinite list" to the caller:
typedef std::vector<std::string> SVec;
static const SVec fizzes{ "","","fizz" };
static const SVec buzzes{ "","","","","buzz" };
auto fizzbuzzInfinite() -> decltype(auto)
{
// merge (x,s) = if length s == 0 then show x else s
auto merge =
[](const std::tuple<size_t, std::string> & value)
-> std::string
{
auto[x, s] = value;
if (s.length() > 0) return s;
else return std::to_string(x);
};
auto result =
range(size_t{ 1 })
.zip
(cycle(iterRange(fizzes.cbegin(), fizzes.cend()))
.zipWith
(std::function(concatStrings)
, cycle(iterRange(buzzes.cbegin(), buzzes.cend()))
)
)
.map<std::string>(merge)
;
return result;
}
It is worth showing, since you can learn from it how to dodge the question of what the exact return type of that function is (it depends on the implementation of the function alone, namely on how the code combines the enumerators).
It also demonstrates that we had to move the vectors fizzes and buzzes outside the scope of the function so that they are still around when the lazy mechanism eventually produces values on the outside. If we had not done that, the iterRange(..) code would have stored iterators into vectors that are long gone.
Using a very simple definition of lazy evaluation, namely that a value is not evaluated until it is needed, I would say that one could implement this through the use of a pointer and macros (for syntactic sugar). The example below is written in C11 (it relies on <stdatomic.h> and _Atomic):
#include <stdatomic.h>
#define lazy(var_type) lazy_ ## var_type
#define def_lazy_type( var_type ) \
typedef _Atomic var_type _atomic_ ## var_type; \
typedef _atomic_ ## var_type * lazy(var_type); //pointer to atomic type
#define def_lazy_variable(var_type, var_name ) \
_atomic_ ## var_type _ ## var_name; \
lazy_ ## var_type var_name = & _ ## var_name;
#define assign_lazy( var_name, val ) atomic_store( & _ ## var_name, val )
#define eval_lazy(var_name) atomic_load( &(*var_name) )
#include <stdio.h>
def_lazy_type(int)
void print_power2 ( lazy(int) i )
{
printf( "%d\n", eval_lazy(i) * eval_lazy(i) );
}
typedef struct {
int a;
} simple;
def_lazy_type(simple)
void print_simple ( lazy(simple) s )
{
simple temp = eval_lazy(s);
printf("%d\n", temp.a );
}
#define def_lazy_array1( var_type, nElements, var_name ) \
_atomic_ ## var_type _ ## var_name [ nElements ]; \
lazy(var_type) var_name = _ ## var_name;
int main ( )
{
//declarations
def_lazy_variable( int, X )
def_lazy_variable( simple, Y)
def_lazy_array1(int,10,Z)
simple new_simple;
//first the lazy int
assign_lazy(X,111);
print_power2(X);
//second the lazy struct
new_simple.a = 555;
assign_lazy(Y,new_simple);
print_simple ( Y );
//third the array of lazy ints
for(int i=0; i < 10; i++)
{
assign_lazy( Z[i], i );
}
for(int i=0; i < 10; i++)
{
int r = eval_lazy( &Z[i] ); //must pass with &
printf("%d\n", r );
}
return 0;
}
You'll notice that the function print_power2 uses a macro called eval_lazy, which does nothing more than dereference a pointer to get the value just before it's actually needed. The lazy type is accessed atomically, so each individual load and store is free of data races.