Lazy evaluation for subset of class methods - c++

I'm looking to make a general, lazy evaluation-esque procedure to streamline my code.
Right now, I have the ability to speed up the execution of mathematical functions - provided that I pre-process it by calling another method first. More concretely, given a function of the type:
const Eigen::MatrixXd<double, -1, -1> function_name(const Eigen::MatrixXd<double, -1, -1>& input)
I can pass this into another function, g, which will produce a new version of function_name g_p, which can be executed faster.
I would like to abstract all this busy-work away from the end-user. Ideally, I'd like to make a class such that when any function f matching function_name's method signature is called on any input (say, x), the following happens instead:
The class checks if f has been called before.
If it hasn't, it calls g(f), followed by g_p(x).
If it has, it just calls g_p(x)
This is tricky for two reasons. The first, is I don't know how to get a reference to the current method, or if that's even possible, and pass it to g. There might be a way around this, but passing one function to the other would be simplest/cleanest for me.
The second bigger issue is how to force the calls to g. I have read about the execute around pattern, which almost works for this purpose - except that, unless I'm understanding it wrong, it would be impossible to reference f in the surrounding function calls.
Is there any way to cleanly implement my dream class? I ideally want to eventually generalize beyond the type of function_name (perhaps with templates), but can take this one step at a time. I am also open to other solution to get the same functionality.

I don't think a "perfect" solution is possible in C++, for the following reasons.
If the calling site says:
result = object->f(x);
as compiled this will call into the unoptimized version. At this point you're pretty much hamstrung, since there's no way in C++ to change where a function call goes, that's determined at compile-time for static linkage, and at runtime via vtable lookup for virtual (dynamic) linkage. Whatever the case, it's not something you can directly alter. Other languages do allow this, e.g. Lua, and rather ironically C++'s great-grandfather BCPL also permits it. However C++ doesn't.
TL;DR to get a workable solution to this, you need to modify either the called function, or every calling site that uses one of these.
Long answer: you'll need to do one of two things. You can either offload the problem to the called class and make all functions look something like this:
const <return_type> myclass:f(x)
{
static auto unoptimized = [](x) -> <return_type>
{
// Do the optimizable heavy lifting here;
return whatever;
};
static auto optimized = g(unoptimized);
return optimized(x);
}
However I very strongly suspect this is exactly what you don't want to do, because assuming the end-user you're talking about is the author of the class, this fails your requirement to offload this from the end-user.
However, you can also solve it by using a template, but that requires modification to every place you call one of these. In essence you encapsulate the above logic in a template function, replacing unoptimized with the bare class member, and leaving most everything else alone. Then you just call the template function at the calling site, and it should work.
This does have the advantage of a relatively small change at the calling site:
result = object->f(x);
becomes either:
result = optimize(object->f, x);
or:
result = optimize(object->f)(x);
depending on how you set the optimize template up. It also has the advantage of no changes at all to the class.
So I guess it comes down to where you wan't to make the changes.
Yet another choice. Would it be an option to take the class as authored by the end user, and pass the cpp and h files through a custom pre-processor? That could go through the class and automatically make the changes outlined above, which then yields the advantage of no change needed at the calling site.

Related

Good practice in C++ function/method design

I have a confusion about C++ function/method design as below:
1.
class ArithmeticCalculation
{
private:
float num1_;
float num2_;
float sum_;
void addTwoNumbers();
};
2.
class ArithmeticCalculation
{
private:
float addTwoNumbers(float num1, float num2);
};
In 1., one can basically declare a class variable and the void addTwoNumbers() will just implement it and assign to the class variable (sum_). I found using 1. is cleaner but using 2. looks like it more intuitive for function use.
Which one is actually best option considering the function/method is not restricted to only this basic addition functionality -- I mean in general how to decide to use with return or simply void?
The major difference between the two functions is that the second one is stateless*, while the first one has a state. Other things being equal, stateless approach is preferred, because it gives the users of your class more flexibility at utilizing your class in their systems. For example, stateless functions are re-entrant, while functions that rely on state may require the code that uses them to take additional measures that prevent incorrect use.
Re-entrancy alone is a big reason to prefer stateless functions whenever possible. However, there are situations when keeping state becomes more economical - for example, when you are using Builder Design Pattern.
Another important advantage of keeping your functions stateless whenever it is possible is that the call sequence becomes more readable. A call of a method that relies on the state consists of these parts:
Set up the object before the call
Make the call
Harvest the result of the call (optional)
Human readers of your code will have much easier time reading the call that uses a function invocation with parameter passing than the three-part setup-call-get result sequence.
There are situations when you have to have state, for example, when you want to defer the action. In this case the parameters are supplied by one part of the code, while the computation is initiated by some other part of the code. In terms of your example, one function would call set_num1 and set_num2, while another function would call addTwoNumbers at some later time. In situations like this you could save the parameters on the object itself, or create a separate object with deferred parameters.
* This is only an assumption based on the signature of your member function. Your second function gets all the data that it needs as parameters, and returns the value to the caller; Obviously, implementations may choose to add some state, e.g. by saving the last result, but that is uncommon for addTwoNumbers functions, so I assume that your code does not do it.
The first function doesn't really make a lot of sense. What numbers? Where does the result go? The name doesn't describe the expected side-effects, nor the origin of the numbers in question.
The second function makes it abundantly clear what's going on, where the result is, and how that function might be used.
Your functions should strive to communicate their intent based on the function signature. If that's not sufficient you'll need to add comments or documentation, but no amount of commenting or documentation can pave over a misleading or confusing signature.
Think about what your function's responsibility is as well as whatever expectations it has when naming things. For example:
void whatever(const int);
What does that function do? Could you even guess without looking at code or documentation?
Compare with the same function given a much more meaningful name:
void detonateReactor(const int countdownTimeInSeconds);
It seems pretty clear what that does now, as well as what side-effects it will have.
You probably had in mind something like this for the first option:
struct Adder {
float sum;
float a;
float b;
void addNumbers(){ sum = a+b; }
};
that would be used like this:
Adder adder;
adder.a = 1.0;
adder.b = 2.0;
adder.addNumbers();
std::cout << adder.sum << "\n";
There is no single good argument to do this when you actually wanted this:
float addTwoNumbers(float a,float b) { return a+b; }
std::cout << addTwoNumbers(1.0,2.0) << "\n";
Not everything has to be inside a class. Actually not everything should be inside a class (C++ isnt Java). If you need a function that adds two numbers then write a function that adds two numbers and dont overthink it.

Where should the user-defined parameters of a framework be ?

I am kind of a newbie and I am creating a framework to evolve objects in C++ with an evolutionary algorithm.
An evolutionary algorithm evolves objects and tests them to get the best solution (for example, evolve the weights neural network and test it on sample data, so that in the end you get a network which has a good accuracy, without having trained it).
My problem is that there are lots of parameters for the algorithm (type of selection/crossover/mutation, probabilities for each of them...) and since it is a framework, the user should be able to easily access and modify them.
CURRENT SOLUTION
For now, I created a header file parameters.h of this form:
// DON'T CHANGE THESE PARAMETERS
//mutation type
#define FLIP 1
#define ADD_CONNECTION 2
#define RM_CONNECTION 3
// USER DEFINED
static const int TYPE_OF_MUTATION = FLIP;
The user modifies the static variables TYPE_OF_MUTATION and then my mutation function tests what the value of TYPE_OF_MUTATION is and calls the right mutation function.
This works well, but it has a few drawbacks:
when I change a parameter in this header and then call "make", no change is taken into account, I have to call "make clean" then "make". From what I saw, it is not a problem in the makefile but it is how building works. Even if it did re-build when I change a parameter, it would mean re-compile the whole project as these parameters are used everywhere; it is definitely not efficient.
if you want to run the genetic algorithm several times with different parameters, you have to run it a first time then save the results, change the parameters then run it a second time etc.
OTHER POSSIBILITIES
I thought about taking these parameters as arguments of the top-level function. The problem is that the function would then take 20 arguments or so, it doesn't seem really readable...
What I mean about the top-level function is that for now, the evolutionary algorithm is run simply by doing this:
PopulationManager myPop;
myPop.evolveIt();
If I defined the parameters as arguments, we would have something like:
PopulationManager myPop;
myPop.evolveIt(20,10,5,FLIP,9,8,2,3,TOURNAMENT,0,23,4);
You can see how hellish it may be to always define parameters in the right order !
CONCLUSION
The frameworks I know make you build your algorithm yourself from pre-defined functions, but the user shouldn't have to go through all the code to change parameters one by one.
It may be useful to indicate that this framework will be used internally, for a definite set of projects.
Any input about the best way to define these parameters is welcome !
If the options do not change I usually use a struct for this:
enum class MutationType {
Flip,
AddConnection,
RemoveConnection
};
struct Options {
// Documentation for mutation_type.
MutationType mutation_type = MutationType::Flip;
// Documentation for integer option.
int integer_option = 10;
};
And then provide a constructor that takes these options.
Options options;
options.mutation_type = MutationType::AddConnection;
PopulationManager population(options);
C++11 makes this really easy, because it allows specifying defaults for the options, so a user only needs to set the options that need to be different from the default.
Also note that I used an enum for the options, this ensures that the user can only use correct values.
This is a classic example of polymorphism. In your proposed implementation you're doing a switch on constant to decide which polymorphic mutation algorithm you will choose to decide how to mutate the parameter. In C++, the corresponding mechanisms are templates (static polymorphism) or virtual functions (dynamic polymorphism) to select the appropriate mutating algorithm to apply to the parameter.
The templates way has the advantage that everything is resolvable at compile time and the resulting mutating algorithm could be inlined entirely, depending on the implementation. What you give up is the ability to dynamically select parameter mutation algorithms at runtime.
The virtual function way has the advantage that you can defer the choice of mutation algorithm until runtime, allowing this to vary based on input from the user or whatnot. The disadvantage is that the mutation algorithm can no longer be inlined and you pay the cost of a virtual function call (an extra level of indirection) when you mutate the parameter.
If you want to see a real example of how "algorithmic mutation" can work, look at evolve.cpp in my Iterated Dynamics repository on github. This is C code converted to C++ so it is neither using templates nor using virtual functions. Instead it uses function pointers and a switch-on-constant to select the appropriate code. However, the idea is the same.
My recommendation would be to see if you can use static polymorphism (templates) first. From your initial description you were fixing the mutation at compile-time anyway, so you're not giving anything up.
If that was just a prototyping phase and you intended to support switching of mutation algorithms at runtime, then look at virtual functions. As the other answer recommended, please shun C-style coding like #define constants and instead use proper enums.
To solve the "long parameter list smell", the idea of packing all the parameters into a structure is a good one. You can achieve more readability on top of that by using the builder pattern to build up the structure of parameters in a more readable way than just assigning a bunch of values into a struct. In this blog post, I applied the builder pattern to the resource description structures in Direct3D. That allowed me to more directly express these "bags of data" with reasonable defaults and directly reveal my intent to override or replace default values with special values when necessary.

Is it idiomatically ok to put algorithm into class?

I have a complex algorithm. This uses many variables, calculates helper arrays at initialization and also calculates arrays along the way. Since the algorithm is complex, I break it down into several functions.
Now, I actually do not see how this might be a class from an idiomatic way; I mean, I am just used to have algorithms as functions. The usage would simply be:
Calculation calc(/* several parameters */);
calc.calculate();
// get the heterogenous results via getters
On the other hand, putting this into a class has the following advantages:
I do not have to pass all the variables to the other functions/methods
arrays initialized at the beginning of the algorithm are accessible throughout the class in each function
my code is shorter and (imo) clearer
A hybrid way would be to put the algorithm class into a source file and access it via a function that uses it. The user of the algorithm would not see the class.
Does anyone have valuable thoughts that might help me out?
Thank you very much in advance!
I have a complex algorithm. This uses many variables, calculates helper arrays at initialization and also calculates arrays along the way.[...]
Now, I actually do not see how this might be a class from an idiomatic way
It is not, but many people do the same thing you do (so did I a few times).
Instead of creating a class for your algorithm, consider transforming your inputs and outputs into classes/structures.
That is, instead of:
Calculation calc(a, b, c, d, e, f, g);
calc.calculate();
// use getters on calc from here on
you could write:
CalcInputs inputs(a, b, c, d, e, f, g);
CalcResult output = calculate(inputs); // calculate is now free function
// use getters on output from here on
This doesn't create any problems and performs the same (actually better) grouping of data.
I'd say it is very idiomatic to represent an algorithm (or perhaps better, a computation) as a class. One of the definitions of object class from OOP is "data and functions to operate on that data." A compex algorithm with its inputs, outputs and intermediary data matches this definition perfectly.
I've done this myself several times, and it simplifies (human) code flow analysis significantly, making the whole thing easier to reason about, to debug and to test.
If the abstraction for the client code is an algorithm, you
probably want to keep a pure functional interface, and not
introduce additional types there. It's quite common, on the
other hand, for such a function to be implemented in a source
file which defines a common data structure or class for its
internal use, so you might have:
double calculation( /* input parameters */ )
{
SupportClass calc( /* input parameters */ );
calc.part1();
calc.part2();
// etc...
return calc.results();
}
Depending on how your code is organized, SupportClass will be
in an unnamed namespace in the source file (probably the most
common case), or in a "private" header, included only by the
sources involved in the algorith.
It really depends of what kind of algorithm you want to encapsulate. Generally I agree with John Carmack : "Sometimes, the elegant implementation is just a function. Not a method. Not a class. Not a framework. Just a function."
It really boils down to: do the algorithm need access to the private area of the class that is not supposed to be public? If the answer is yes (unless you are willing to refactor your class interface, depending on the specific cases) you should go with a member function, if not, then a free function is good enough.
Take for example the standard library. Most of the algorithms are provided as free functions because they only access the public interface of the class (with iterators for standard containers, for example).
Do you need to call the exact same functions in the exact same order each time? Then you shouldn't be requiring calling code to do this. Splitting your algorithm into multiple functions is fine, but I'd still have one call the next and then the next and so on, with a struct of results/parameters being passed along the way. A class doesn't feel right for a one-off invocation of some procedure.
The only way I'd do this with a class is if the class encapsulates all the input data itself, and you then call myClass.nameOfMyAlgorithm() on it, among other potential operations. Then you have data+manipulators. But just manipulators? Yeah, I'm not so sure.
In modern C++ the distinction has been eroded quite a bit. Even from the operator overloading of the pre-ANSI language, you could create a class whose instances are syntactically like functions:
struct Multiplier
{
int factor_;
Multiplier(int f) : factor_(f) { }
int operator()(int v) const
{
return v * _factor;
}
};
Multipler doubler(2);
std::cout << doubler(3) << std::endl; // prints 6
Such a class/struct is called a functor, and can capture "contextual" values in its constructor. This allows you to effectively pass the parameters to a function in two stages: some in the constructor call, some later each time you call it for real. This is called partial function application.
To relate this to your example, your calculate member function could be turned into operator(), and then the Calculation instance would be a function! (or near enough.)
To unify these ideas, you can try thinking of a plain function as a functor of which there is only one instance (and hence no need for a constructor - although this is no guarantee that the function only depends on its formal parameters: it might depend on global variables...)
Rather than asking "Should I put this algorithm in a function or a class?" instead ask yourself "Would it be useful to be able to pass the parameters to this algorithm in two or more stages?" In your example, all the parameters go into the constructor, and none in the later call to calculate, so it makes little sense to ask users of your class make two calls.
In C++11 the distinction breaks down further (and things get a lot more convenient), in recognition of the fluidity of these ideas:
auto doubler = [] (int val) { return val * 2; };
std::cout << doubler(3) << std::endl; // prints 6
Here, doubler is a lambda, which is essentially a nifty way to declare an instance of a compiler-generated class that implements the () operator.
Reproducing the original example more exactly, we would want a function-like thing called multiplier that accepts a factor, and returns another function-like thing that accepts a value v and returns v * factor.
auto multiplier = [] (int factor)
{
return [=] (int v) { return v * factor; };
};
auto doubler = multiplier(2);
std::cout << doubler(3) << std::endl; // prints 6
Note the pattern: ultimately we're multiplying two numbers, but we specify the numbers in two steps. The functor we get back from calling multiplier acts like a "package" containing the first number.
Although lambdas are relatively new, they are likely to become a very common part of C++ style (as they have in every other language they've been added to).
But sadly at this point we've reached the "cutting edge" as the above example works in GCC but not in MSVC 12 (I haven't tried it in MSVC 13). It does pass the intellisense checking of MSVC 12 though (they use two completely different compilers)! And you can fix it by wrapping the inner lambda with std::function<int(int)>( ... ).
Even so, you can use these ideas in old-school C++ when writing functors by hand.
Looking further ahead, resumable functions may make it into some future version of the language (Microsoft is pushing hard for them as they are practically identical to async/await in C#) and that is yet another blurring of the distinction between functions and classes (a resumable function acts like a constructor for a state machine class).

Metaprogramming C/C++ using the preprocessor

So I have this huge tree that is basically a big switch/case with string keys and different function calls on one common object depending on the key and one piece of metadata.
Every entry basically looks like this
} else if ( strcmp(key, "key_string") == 0) {
((class_name*)object)->do_something();
} else if ( ...
where do_something can have different invocations, so I can't just use function pointers. Also, some keys require object to be cast to a subclass.
Now, if I were to code this in a higher level language, I would use a dictionary of lambdas to simplify this.
It occurred to me that I could use macros to simplify this to something like
case_call("key_string", class_name, do_something());
case_call( /* ... */ )
where case_call would be a macro that would expand this code to the first code snippet.
However, I am very much on the fence whether that would be considered good style. I mean, it would reduce typing work and improve the DRYness of the code, but then it really seems to abuse the macro system somewhat.
Would you go down that road, or rather type out the whole thing? And what would be your reasoning for doing so?
Edit
Some clarification:
This code is used as a glue layer between a simplified scripting API which accesses several different aspects of a C++ API as simple key-value properties. The properties are implemented in different ways in C++ though: Some have getter/setter methods, some are set in a special struct. Scripting actions reference C++ objects casted to a common base class. However, some actions are only available on certain subclasses and have to be cast down.
Further down the road, I may change the actual C++ API, but for the moment, it has to be regarded as unchangeable. Also, this has to work on an embedded compiler, so boost or C++11 are (sadly) not available.
I would suggest you slightly reverse the roles. You are saying that the object is already some class that knows how to handle a certain situation, so add a virtual void handle(const char * key) in your base class and let the object check in the implementation if it applies to it and do whatever is necessary.
This would not only eliminate the long if-else-if chain, but would also be more type safe and would give you more flexibility in handling those events.
That seems to me an appropriate use of macros. They are, after all, made for eliding syntactic repetition. However, when you have syntactic repetition, it’s not always the fault of the language—there are probably better design choices out there that would let you avoid this decision altogether.
The general wisdom is to use a table mapping keys to actions:
std::map<std::string, void(Class::*)()> table;
Then look up and invoke the action in one go:
object->*table[key]();
Or use find to check for failure:
const auto i = table.find(key);
if (i != table.end())
object->*(i->second)();
else
throw std::runtime_error(...);
But if as you say there is no common signature for the functions (i.e., you can’t use member function pointers) then what you actually should do depends on the particulars of your project, which I don’t know. It might be that a macro is the only way to elide the repetition you’re seeing, or it might be that there’s a better way of going about it.
Ask yourself: why do my functions take different arguments? Why am I using casts? If you’re dispatching on the type of an object, chances are you need to introduce a common interface.

Removing a parameter list from f(list) with preprocessor

It seems to me that I saw something weird being done in a boost library and it ended up being exactly what I'm trying to do now. Can't find it though...
I want to create a macro that takes a signature and turns it into a function pointer:
void f(int,int) {}
...
void (*x)(int,int) = WHAT( (f(int,int)) );
x(2,4); // calls f()
I especially need this to work with member function pointers so that WHAT takes two params:
WHAT(ClassType, (f(int,int)); // results in static_cast<void (ClassType::*)(int,int)>(&ClassType::f)
It's not absolutely necessary in order to solve my problem, but it would make things a touch nicer.
This question has nothing, per-se, to do with function pointers. What needs to be done is to use the preprocessor to take "f(int,int)" and turn it into two different parts:
'f'
'(int,int)'
Why:
I've solved the problem brought up here: Generating Qt Q_OBJECT classes pragmatically
I've started a series of articles explaining how to do it:
http://crazyeddiecpp.blogspot.com/2011/01/quest-for-sane-signals-in-qt-step-1.html
http://crazyeddiecpp.blogspot.com/2011/01/quest-for-sane-signals-in-qt-step-2.html
The signature must be evaluated from, and match exactly, the "signal" that the user is attempting to connect with. Qt users are used to expressing this as SIGNAL(fun(param,param)), so something like connect_static(SIGINFO(object,fun(param,param)), [](int,int){}) wouldn't feel too strange.
In order to construct the signature I need to be able to pull it out of the arguments supplied. There's enough information to get the member function address (using C++0x's decltype) and fetch the signature in order to generate the appropriate wrapper but I can't see how to get it out. The closest I can come up with is SIGINFO(object, fun, (param,param)), which is probably good enough but I figured I'd ask here before considering it impossible to get the exact syntax I'd prefer.
What are you trying to do is impossible using standard preprocessor, unfortunately. There are a couple of reasons:
It is impossible to split parameters passed to a macro using custom character. They have to be comma delimited. Otherwise that could solve your problem instantly.
You cannot use preprocessor to define something that is not an identifier. Otherwise you could use double expansion where ( and ) is defined as , and split arguments on that as if it was passed as f, int, int,, then process it as variadic arguments.
Function pointer definition in C++ does not allow you to deduce the name given to defined type, unfortunately.
Going even further, even if you manage to create a function pointer, the code won't work for methods because in order to invoke a method, you need to have two pointers - pointer to the method and to the class instance. This means you have to have some wrapper around this stuff.
That is why QT is using its own tools like moc to generate glue code.
The closes thing you might have seen in Boost is probably Signals, Bind and Lambda libraries. It is ironic that those libraries are much more powerful than what you are trying to achieve, but at the same time they won’t allow you to achieve it the way you want it. For example, even if you could do what you want with the syntax you want, you won’t be able to “connect” a slot to a “signal” if signal has a different signature. At the same time, libraries from Boost I mentioned above totally allow that. For example, if your “slot” expects more parameters than “signal” provides, you can bind other objects to be passed when “slot” is invoked. Those libraries can also suppress extra parameters if “slot” does not expect them.
I’d say the best way from C++ prospective as for today is to use Boost Signal approach to implement event handling in GUI libraries. QT doesn’t use it for a number of reasons. First, it started in like 90-s when C++ was not that fancy. Plus, they have to parse your code in order to work with “slots” and “signals” in graphic designer.
It seems for me than instead of using macros or even worse – non-standard tools on top of C++ to generate code, and using the following:
void (*x)(int,int) = WHAT( (f(int,int)) );
It would be much better to do something like this:
void f (int x, int y, int z);
boost::function<void (int, int)> x = boost::bind (&f, _1, _2, 3);
x (1, 2);
Above will work for both functions and methods.