Is it idiomatically ok to put algorithm into class? - c++

I have a complex algorithm. This uses many variables, calculates helper arrays at initialization and also calculates arrays along the way. Since the algorithm is complex, I break it down into several functions.
Now, I do not really see how this would idiomatically be a class; I am just used to having algorithms as functions. The usage would simply be:
Calculation calc(/* several parameters */);
calc.calculate();
// get the heterogeneous results via getters
On the other hand, putting this into a class has the following advantages:
I do not have to pass all the variables to the other functions/methods
arrays initialized at the beginning of the algorithm are accessible throughout the class in each function
my code is shorter and (imo) clearer
A hybrid way would be to put the algorithm class into a source file and access it via a function that uses it. The user of the algorithm would not see the class.
Does anyone have valuable thoughts that might help me out?
Thank you very much in advance!

I have a complex algorithm. This uses many variables, calculates helper arrays at initialization and also calculates arrays along the way.[...]
Now, I actually do not see how this might be a class from an idiomatic way
It is not, but many people do the same thing you do (so did I a few times).
Instead of creating a class for your algorithm, consider transforming your inputs and outputs into classes/structures.
That is, instead of:
Calculation calc(a, b, c, d, e, f, g);
calc.calculate();
// use getters on calc from here on
you could write:
CalcInputs inputs(a, b, c, d, e, f, g);
CalcResult output = calculate(inputs); // calculate is now free function
// use getters on output from here on
This doesn't create any problems and performs the same (actually better) grouping of data.
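A minimal sketch of what that could look like. The field names (samples, scale, total, scaled) are made up for illustration and public members stand in for getters; only the shape (inputs struct in, results struct out, free function in between) comes from the suggestion above.

```cpp
#include <vector>

// Hypothetical input bundle; the field names are illustrative only.
struct CalcInputs {
    std::vector<double> samples;
    double scale;
};

// Hypothetical result bundle holding heterogeneous outputs.
struct CalcResult {
    double total;
    std::vector<double> scaled;
};

// Free function: every input comes in through one object,
// every result goes out through another.
CalcResult calculate(const CalcInputs& in) {
    CalcResult out;
    out.total = 0.0;
    out.scaled.reserve(in.samples.size());
    for (double s : in.samples) {
        out.scaled.push_back(s * in.scale);
        out.total += s * in.scale;
    }
    return out;
}
```

The caller fills a CalcInputs, calls calculate(inputs) once, and reads whatever it needs from the returned CalcResult.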

I'd say it is very idiomatic to represent an algorithm (or perhaps better, a computation) as a class. One of the definitions of a class in OOP is "data and functions to operate on that data." A complex algorithm with its inputs, outputs and intermediary data matches this definition perfectly.
I've done this myself several times, and it simplifies (human) code flow analysis significantly, making the whole thing easier to reason about, to debug and to test.

If the abstraction for the client code is an algorithm, you
probably want to keep a pure functional interface, and not
introduce additional types there. It's quite common, on the
other hand, for such a function to be implemented in a source
file which defines a common data structure or class for its
internal use, so you might have:
double calculation( /* input parameters */ )
{
SupportClass calc( /* input parameters */ );
calc.part1();
calc.part2();
// etc...
return calc.results();
}
Depending on how your code is organized, SupportClass will be
in an unnamed namespace in the source file (probably the most
common case), or in a "private" header, included only by the
sources involved in the algorithm.
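For the unnamed-namespace case, a source file might be laid out roughly like this. The members and the computation (a sum of squares) are placeholders chosen for this sketch; only the names SupportClass, part1, part2, results and calculation come from the snippet above.

```cpp
#include <cstddef>
#include <vector>

namespace {  // internal linkage: invisible outside this source file

// Holds the state shared by the algorithm's steps (hypothetical members).
class SupportClass {
public:
    explicit SupportClass(const std::vector<double>& input)
        : input_(input), squares_(), sum_(0.0) {}

    // Step 1: build a helper array from the input.
    void part1() {
        squares_.clear();
        for (std::size_t i = 0; i < input_.size(); ++i)
            squares_.push_back(input_[i] * input_[i]);
    }

    // Step 2: reduce the helper array to the final value.
    void part2() {
        sum_ = 0.0;
        for (std::size_t i = 0; i < squares_.size(); ++i)
            sum_ += squares_[i];
    }

    double results() const { return sum_; }

private:
    const std::vector<double>& input_;  // valid only during calculation()
    std::vector<double> squares_;
    double sum_;
};

}  // namespace

// The only name client code sees.
double calculation(const std::vector<double>& input) {
    SupportClass calc(input);
    calc.part1();
    calc.part2();
    return calc.results();
}
```

Client code only ever includes a header declaring calculation; the class can be reshaped freely without touching any caller.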

It really depends on what kind of algorithm you want to encapsulate. Generally I agree with John Carmack: "Sometimes, the elegant implementation is just a function. Not a method. Not a class. Not a framework. Just a function."

It really boils down to: does the algorithm need access to the private area of the class that is not supposed to be public? If the answer is yes (unless you are willing to refactor your class interface, depending on the specific cases) you should go with a member function; if not, then a free function is good enough.
Take for example the standard library. Most of the algorithms are provided as free functions because they only access the public interface of the class (with iterators for standard containers, for example).
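As a small illustration of that style, here is a hypothetical helper (sum_of_largest is not a standard function) written purely in terms of std::sort and std::accumulate, which only ever touch the container through its public iterators:

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Sums the `count` largest elements. Takes the vector by value so the
// caller's data is left untouched by the sort.
int sum_of_largest(std::vector<int> values, std::size_t count) {
    if (count > values.size())
        count = values.size();
    std::sort(values.begin(), values.end(), std::greater<int>());
    return std::accumulate(values.begin(),
                           values.begin() + static_cast<std::ptrdiff_t>(count),
                           0);
}
```

Nothing here needs to be a member of std::vector, because the iterators already expose everything the algorithm requires.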

Do you need to call the exact same functions in the exact same order each time? Then you shouldn't be requiring calling code to do this. Splitting your algorithm into multiple functions is fine, but I'd still have one call the next and then the next and so on, with a struct of results/parameters being passed along the way. A class doesn't feel right for a one-off invocation of some procedure.
The only way I'd do this with a class is if the class encapsulates all the input data itself, and you then call myClass.nameOfMyAlgorithm() on it, among other potential operations. Then you have data+manipulators. But just manipulators? Yeah, I'm not so sure.

In modern C++ the distinction has been eroded quite a bit. Even from the operator overloading of the pre-ANSI language, you could create a class whose instances are syntactically like functions:
struct Multiplier
{
int factor_;
Multiplier(int f) : factor_(f) { }
int operator()(int v) const
{
return v * factor_;
}
};
Multiplier doubler(2);
std::cout << doubler(3) << std::endl; // prints 6
Such a class/struct is called a functor, and can capture "contextual" values in its constructor. This allows you to effectively pass the parameters to a function in two stages: some in the constructor call, some later each time you call it for real. This is called partial function application.
To relate this to your example, your calculate member function could be turned into operator(), and then the Calculation instance would be a function! (or near enough.)
To unify these ideas, you can try thinking of a plain function as a functor of which there is only one instance (and hence no need for a constructor - although this is no guarantee that the function only depends on its formal parameters: it might depend on global variables...)
Rather than asking "Should I put this algorithm in a function or a class?" instead ask yourself "Would it be useful to be able to pass the parameters to this algorithm in two or more stages?" In your example, all the parameters go into the constructor, and none in the later call to calculate, so it makes little sense to ask users of your class to make two calls.
In C++11 the distinction breaks down further (and things get a lot more convenient), in recognition of the fluidity of these ideas:
auto doubler = [] (int val) { return val * 2; };
std::cout << doubler(3) << std::endl; // prints 6
Here, doubler is a lambda, which is essentially a nifty way to declare an instance of a compiler-generated class that implements the () operator.
Reproducing the original example more exactly, we would want a function-like thing called multiplier that accepts a factor, and returns another function-like thing that accepts a value v and returns v * factor.
auto multiplier = [] (int factor)
{
return [=] (int v) { return v * factor; };
};
auto doubler = multiplier(2);
std::cout << doubler(3) << std::endl; // prints 6
Note the pattern: ultimately we're multiplying two numbers, but we specify the numbers in two steps. The functor we get back from calling multiplier acts like a "package" containing the first number.
Although lambdas are relatively new, they are likely to become a very common part of C++ style (as they have in every other language they've been added to).
But sadly at this point we've reached the "cutting edge" as the above example works in GCC but not in MSVC 12 (I haven't tried it in MSVC 13). It does pass the intellisense checking of MSVC 12 though (they use two completely different compilers)! And you can fix it by wrapping the inner lambda with std::function<int(int)>( ... ).
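The std::function workaround could look like the sketch below. Whether a given compiler version actually needs it is the claim made above, not something this snippet verifies; the wrapper simply gives the inner lambda a nameable return type.

```cpp
#include <functional>

// Same two-stage multiplier, but the inner lambda is wrapped in
// std::function<int(int)> so the outer lambda's return type is explicit.
const auto multiplier = [](int factor) {
    return std::function<int(int)>([=](int v) { return v * factor; });
};
```

Calling multiplier(2) yields a std::function<int(int)> that doubles its argument, at the cost of the type erasure overhead std::function brings.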
Even so, you can use these ideas in old-school C++ when writing functors by hand.
Looking further ahead, resumable functions may make it into some future version of the language (Microsoft is pushing hard for them as they are practically identical to async/await in C#) and that is yet another blurring of the distinction between functions and classes (a resumable function acts like a constructor for a state machine class).

Related

Lazy evaluation for subset of class methods

I'm looking to make a general, lazy evaluation-esque procedure to streamline my code.
Right now, I have the ability to speed up the execution of mathematical functions - provided that I pre-process it by calling another method first. More concretely, given a function of the type:
const Eigen::Matrix<double, -1, -1> function_name(const Eigen::Matrix<double, -1, -1>& input)
I can pass this into another function, g, which will produce a new version of function_name, g_p, which can be executed faster.
I would like to abstract all this busy-work away from the end-user. Ideally, I'd like to make a class such that when any function f matching function_name's method signature is called on any input (say, x), the following happens instead:
The class checks if f has been called before.
If it hasn't, it calls g(f), followed by g_p(x).
If it has, it just calls g_p(x)
This is tricky for two reasons. The first is that I don't know how to get a reference to the current method, or if that's even possible, and pass it to g. There might be a way around this, but passing one function to the other would be simplest/cleanest for me.
The second, bigger issue is how to force the calls to g. I have read about the execute around pattern, which almost works for this purpose - except that, unless I'm understanding it wrong, it would be impossible to reference f in the surrounding function calls.
Is there any way to cleanly implement my dream class? I ideally want to eventually generalize beyond the type of function_name (perhaps with templates), but can take this one step at a time. I am also open to other solutions to get the same functionality.
I don't think a "perfect" solution is possible in C++, for the following reasons.
If the calling site says:
result = object->f(x);
As compiled, this will call into the unoptimized version. At this point you're pretty much hamstrung, since there's no way in C++ to change where a function call goes: that's determined at compile time for statically bound calls, and at runtime via vtable lookup for virtual (dynamically dispatched) calls. Whatever the case, it's not something you can directly alter. Other languages do allow this, e.g. Lua, and rather ironically C++'s great-grandfather BCPL also permits it. However, C++ doesn't.
TL;DR to get a workable solution to this, you need to modify either the called function, or every calling site that uses one of these.
Long answer: you'll need to do one of two things. You can either offload the problem to the called class and make all functions look something like this:
const <return_type> myclass::f(x)
{
static auto unoptimized = [](x) -> <return_type>
{
// Do the optimizable heavy lifting here;
return whatever;
};
static auto optimized = g(unoptimized);
return optimized(x);
}
However I very strongly suspect this is exactly what you don't want to do, because assuming the end-user you're talking about is the author of the class, this fails your requirement to offload this from the end-user.
However, you can also solve it by using a template, but that requires modification to every place you call one of these. In essence you encapsulate the above logic in a template function, replacing unoptimized with the bare class member, and leaving most everything else alone. Then you just call the template function at the calling site, and it should work.
This does have the advantage of a relatively small change at the calling site:
result = object->f(x);
becomes either:
result = optimize(object->f, x);
or:
result = optimize(object->f)(x);
depending on how you set the optimize template up. It also has the advantage of no changes at all to the class.
So I guess it comes down to where you want to make the changes.
Yet another choice. Would it be an option to take the class as authored by the end user, and pass the cpp and h files through a custom pre-processor? That could go through the class and automatically make the changes outlined above, which then yields the advantage of no change needed at the calling site.
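A rough sketch of the call-site wrapper idea, simplified to free functions of type double(double) rather than Eigen member functions. Here g is a stand-in that merely wraps f and counts its own invocations; the cache, the Fn typedef and square are all assumptions of this sketch, and the cache is not thread-safe.

```cpp
#include <functional>
#include <map>
#include <utility>

typedef double (*Fn)(double);

// Stand-in for the preprocessing step g: returns a "faster" version of f.
// In this sketch it only forwards to f and counts how often it runs.
int g_calls = 0;
std::function<double(double)> g(Fn f) {
    ++g_calls;
    return [f](double x) { return f(x); };
}

// Looks up (or lazily builds) the preprocessed version of f, then calls it.
double optimize(Fn f, double x) {
    static std::map<Fn, std::function<double(double)> > cache;
    std::map<Fn, std::function<double(double)> >::iterator it = cache.find(f);
    if (it == cache.end())
        it = cache.insert(std::make_pair(f, g(f))).first;
    return it->second(x);
}

// Example function to be optimized.
double square(double x) { return x * x; }
```

With this in place, result = square(x) becomes result = optimize(square, x), and g runs only on the first call for each distinct function.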

Good practice in C++ function/method design

I have a confusion about C++ function/method design as below:
1.
class ArithmeticCalculation
{
private:
float num1_;
float num2_;
float sum_;
void addTwoNumbers();
};
2.
class ArithmeticCalculation
{
private:
float addTwoNumbers(float num1, float num2);
};
In 1., one can basically declare a class variable, and void addTwoNumbers() will just compute the result and assign it to the class variable (sum_). I find 1. cleaner, but 2. looks more intuitive as a function.
Which one is actually best option considering the function/method is not restricted to only this basic addition functionality -- I mean in general how to decide to use with return or simply void?
The major difference between the two functions is that the second one is stateless*, while the first one has state. Other things being equal, the stateless approach is preferred, because it gives the users of your class more flexibility in using it in their systems. For example, stateless functions are re-entrant, while functions that rely on state may require the code that uses them to take additional measures that prevent incorrect use.
Re-entrancy alone is a big reason to prefer stateless functions whenever possible. However, there are situations when keeping state becomes more economical - for example, when you are using Builder Design Pattern.
Another important advantage of keeping your functions stateless whenever it is possible is that the call sequence becomes more readable. A call of a method that relies on the state consists of these parts:
Set up the object before the call
Make the call
Harvest the result of the call (optional)
Human readers of your code will have a much easier time reading a call that uses function invocation with parameter passing than the three-part setup-call-get-result sequence.
There are situations when you have to have state, for example, when you want to defer the action. In this case the parameters are supplied by one part of the code, while the computation is initiated by some other part of the code. In terms of your example, one function would call set_num1 and set_num2, while another function would call addTwoNumbers at some later time. In situations like this you could save the parameters on the object itself, or create a separate object with deferred parameters.
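The "separate object with deferred parameters" option might be sketched as follows. DeferredAdd is a made-up name; the addition itself stays a stateless function, and only the small wrapper carries state.

```cpp
// Stateless core: everything it needs arrives as parameters.
float addTwoNumbers(float num1, float num2) { return num1 + num2; }

// Hypothetical deferred call: one part of the code captures the
// parameters, another part triggers the computation later.
class DeferredAdd {
public:
    DeferredAdd(float num1, float num2) : num1_(num1), num2_(num2) {}
    float run() const { return addTwoNumbers(num1_, num2_); }

private:
    float num1_;
    float num2_;
};
```

One function constructs a DeferredAdd and hands it on; whoever receives it calls run() when the result is actually needed.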
* This is only an assumption based on the signature of your member function. Your second function gets all the data that it needs as parameters, and returns the value to the caller. Obviously, implementations may choose to add some state, e.g. by saving the last result, but that is uncommon for addTwoNumbers functions, so I assume that your code does not do it.
The first function doesn't really make a lot of sense. What numbers? Where does the result go? The name doesn't describe the expected side-effects, nor the origin of the numbers in question.
The second function makes it abundantly clear what's going on, where the result is, and how that function might be used.
Your functions should strive to communicate their intent based on the function signature. If that's not sufficient you'll need to add comments or documentation, but no amount of commenting or documentation can pave over a misleading or confusing signature.
Think about what your function's responsibility is as well as whatever expectations it has when naming things. For example:
void whatever(const int);
What does that function do? Could you even guess without looking at code or documentation?
Compare with the same function given a much more meaningful name:
void detonateReactor(const int countdownTimeInSeconds);
It seems pretty clear what that does now, as well as what side-effects it will have.
You probably had in mind something like this for the first option:
struct Adder {
float sum;
float a;
float b;
void addNumbers(){ sum = a+b; }
};
that would be used like this:
Adder adder;
adder.a = 1.0;
adder.b = 2.0;
adder.addNumbers();
std::cout << adder.sum << "\n";
There is not a single good argument for doing this when what you actually wanted is this:
float addTwoNumbers(float a,float b) { return a+b; }
std::cout << addTwoNumbers(1.0,2.0) << "\n";
Not everything has to be inside a class. Actually, not everything should be inside a class (C++ isn't Java). If you need a function that adds two numbers, then write a function that adds two numbers and don't overthink it.

Pattern to share data between objects in C++

I have started a migration of a high energy physics algorithm written in FORTRAN to an object oriented approach in C++. The FORTRAN code uses a lot of global variables all across a lot of functions.
I have simplified the global variables into a set of input variables, and a set of invariants (variables calculated once at the beginning of the algorithm and then used by all the functions).
Also, I have divided the full algorithm into three logical steps, represented by three different classes. So, in a very simple way, I have something like this:
double calculateFactor(double x, double y, double z)
{
InvariantsTypeA invA;
InvariantsTypeB invB;
// they need x, y and z
invA.CalculateValues();
invB.CalculateValues();
Step1 s1;
Step2 s2;
Step3 s3;
// they need x, y, z, invA and invB
return s1.Eval() + s2.Eval() + s3.Eval();
}
My problem is:
for doing the calculations all the InvariantsTypeX and StepX objects need the input parameters (and these are not just three).
the three objects s1, s2 and s3 need the data of the invA and invB objects.
all the classes use several other classes through composition to do their job, and all those classes also need the input and the invariants (for example, s1 has a member object theta of class ThetaMatrix that needs x, z and invB to get constructed).
I cannot rewrite the algorithm to reduce the global values, because it follows several high energy physics formulas, and those formulas are just like that.
Is there a good pattern to share the input parameters and the invariants to all the objects used to calculate the result?
Should I use singletons? (but the calculateFactor function is evaluated around a million times)
Or should I pass all the required data as arguments to the objects when they are created?(but if I do that then the data will be passed everywhere in every member object of every class, creating a mess)
Thanks.
Well, in C++ the most suitable solution, given your constraints and conditions, is to use pointers. Many developers have told you to use boost::shared_ptr. It is not strictly necessary, although it may serve you better, especially when portability and robustness to system faults matter.
You do not have to tie yourself to Boost. It is true that most of it is header-only and needs no separate compilation, and that the standardization process is bringing parts of Boost directly into the C++ standard library, but if you do not want to use an external library, you obviously do not have to.
So let's try to solve your problem using just C++ and what it actually provides.
You'll probably have a main function where, as you said, you initialize all the invariant elements, so you basically have constants, and they can be of any type. You need not make them const if you don't want to; either way, in main you instantiate your invariants and hand a pointer to them to every component that needs them. First, in a separate file called "common_components.hpp", consider the following (I assume that you need some types for your invariant variables):
typedef struct {
Type1 invariant_var1;
Type2 invariant_var2;
...
TypeN invariant_varN;
} InvariantType; // Contains the variables I need, it is a type, instantiating it will generate a set of global variables.
typedef InvariantType* InvariantPtr; // Will point to a set of invariants
In your "main.cpp" file you'll have:
#include "common_components.hpp"
// Functions declaration
int main(int, char**);
MyType1 CalculateValues1(InvariantPtr); /* Your functions take the pointer to the globals as an input parameter */
MyType2 CalculateValues2(InvariantPtr); /* Your functions take the pointer to the globals as an input parameter */
...
MyTypeN CalculateValuesN(InvariantPtr); /* Your functions take the pointer to the globals as an input parameter */
// Main implementation
int main(int argc, char** argv) {
InvariantType invariants = {
value1,
value2,
...
valueN
}; // Instantiating all invariants I need.
InvariantPtr global = &invariants;
// Now I have my variable global being a pointer to global.
// Here I have to call the functions
CalculateValues1(global);
CalculateValues2(global);
...
CalculateValuesN(global);
}
If you have functions that return or use the global variables, use the pointer to the struct and modify your functions' interfaces accordingly. By doing so, all changes will be propagated to everything using those variables.
Why not pass the invariants as a function parameter, or to the constructor of the class having the calculateFactor method?
Also try to gather parameters together if you have too many params for a single function (for instance, instead of (x, y, z) pass a 3D point, you have then only 1 parameter instead of 3).
three logical steps, represented by three different classes
This may not have been the best approach.
A single class can have a large number of "global" variables, shared by all methods of the class.
What I've done when converting old codes (C or Fortran) to new OO structures is to try to create a single class which represents a more complete "thing".
In some cases, well-structured FORTRAN would use "Named COMMON Blocks" to cluster things into meaningful groups. This is a hint as to what the "thing" really was.
Also, FORTRAN will have lots of parallel arrays which aren't really separate things, they're separate attributes of a common thing.
DOUBLE X(200)
DOUBLE Y(200)
Is really a small class with two attributes that you would put into a collection.
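In C++ terms, the two parallel arrays might turn into something like this. Point and zip_points are names invented for this sketch; the point is only that X(i) and Y(i) travel together as one object.

```cpp
#include <cstddef>
#include <vector>

// One element of what used to be the parallel arrays X and Y.
struct Point {
    double x;
    double y;
};

// Hypothetical helper: builds the collection from two parallel arrays,
// e.g. while migrating data that still arrives in the old layout.
std::vector<Point> zip_points(const double* xs, const double* ys,
                              std::size_t n) {
    std::vector<Point> points;
    points.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        Point p = { xs[i], ys[i] };
        points.push_back(p);
    }
    return points;
}
```

Functions that used to take both arrays plus an index can then take a single Point or a std::vector<Point>.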
Finally, you can easily create large classes with nothing but data, separate from the class that contains the functions that do the work. This is kind of creepy, but it allows you to finesse the common issue by translating a COMMON block into a class and simply passing an instance of that class to every function that uses the COMMON.
There is a very simple template class to share data between objects in C++ and it is called shared_ptr. It is in the new STL and in boost.
If two objects both have a shared_ptr to the same object they get shared access to whatever data it holds.
In your particular case you probably don't want this but want a simple class that holds the data.
class FactorCalculator
{
InvariantsType invA;
InvariantsType invB;
public:
FactorCalculator() // calculate the invariants once per calculator
{
invA.CalculateValues();
invB.CalculateValues();
}
// call multiple times with different values of x, y, z
double calculateFactor( double x, double y, double z ) /*const*/
{
// calculate using pre-calculated values in invA and invB
}
};
Instead of passing each parameter individually, create another class to store them all and pass an instance of that class:
// Before
void f1(int a, int b, int c) {
cout << a << b << c << endl;
}
// After
void f2(const HighEnergyParams& x) {
cout << x.a << x.b << x.c << endl;
}
First point: globals aren't nearly as bad (in themselves) as many (most?) programmers claim. In fact, in themselves, they aren't really bad at all. They're primarily a symptom of other problems, primarily 1) logically separate pieces of code that have been unnecessarily intermixed, and 2) code that has unnecessary data dependencies.
In your case, it sounds like you've already eliminated (or at least minimized) the real problems (being invariants, not really variables, eliminates one major source of problems all by itself). You've already stated that you can't eliminate the data dependencies, and you've apparently un-mingled the code to the point that you have at least two distinct sets of invariants. Without seeing the code, that may be coarser granularity than really needed, and maybe upon closer inspection, some of those dependencies can be eliminated completely.
If you can reduce or eliminate the dependencies, that's a worthwhile pursuit -- but eliminating the globals, in itself, is rarely worthwhile or useful. In fact, I'd say within the last decade or so, I've seen fewer problems caused by globals, than by people who didn't really understand their problems attempting to eliminate what were (or should have been) perfectly fine as globals.
Given that they are intended to be invariant, what you probably should do is enforce that explicitly. For example, have a factory class (or function) that creates an invariant class. The invariant class makes the factory its friend, but that's the only way members of the invariant class can change. The factory class, in turn, has (for example) a static bool, and executes an assert if you attempt to run it more than once. This gives (a reasonable level of) assurance that the invariants really are invariant (yes, a reinterpret_cast will let you modify the data anyway, but not by accident).
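One possible reading of that factory arrangement, as a sketch. The member names alpha and beta and the formulas are placeholders; the friend declaration, the static bool and the assert are the parts taken from the description above.

```cpp
#include <cassert>

class InvariantsFactory;

// Readable by everyone through getters, writable only by the factory.
class Invariants {
public:
    double alpha() const { return alpha_; }
    double beta() const { return beta_; }

private:
    Invariants() : alpha_(0.0), beta_(0.0) {}
    double alpha_;
    double beta_;
    friend class InvariantsFactory;
};

class InvariantsFactory {
public:
    // Computes the invariants from the inputs; asserts on a second call.
    static const Invariants& create(double x, double y) {
        static bool created = false;
        assert(!created && "invariants may only be created once");
        created = true;
        static Invariants instance;
        instance.alpha_ = x + y;  // placeholder formulas
        instance.beta_ = x * y;
        return instance;
    }
};
```

Everything else receives a const Invariants& and can read the values but has no ordinary way to change them. Note that the assert only fires in builds without NDEBUG.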
The one real question I'd have is whether there's a real point in separating your invariants into two "chunks" if all the calculations really depend on both. If there's a clear, logical separation between the two, that's great (even if they do get used together). If you have what's logically a single block of data, however, trying to break it into pieces may be counterproductive.
Bottom line: globals are (at worst) a symptom, not a disease. Insisting that you're going to get the patient's temperature down to 98.6 degrees may be counterproductive -- especially if the patient is an animal whose normal body temperature is actually 102 degrees.
Uhm. Cpp is not necessarily object oriented. It is the GTA of programming! You are free to be an object-obsessed freak, a relaxed C programmer, a functional programmer, whatever; a mixed martial artist.
My point: if global variables worked in your Fortran code, just copy and paste them to Cpp. No need to avoid global variables. It follows the principle of "don't touch legacy code".
Let's understand why global variables may cause problems. As you know, variables are the program's state, and state is the soul of the program. Bad or invalid state causes runtime and logic errors. The problem with global variables (global state) is that any part of our code has access to it; thus, in case of invalid state, there are many bad guys or culprits to consider, meaning functions and operators. However, this only applies if you really use many functions on your global variable. I mean, you are the only one working on your lonely program. Global variables are only a real problem if you are doing a team project. In that case many people have access to it, writing different functions that may or may not be accessing that variable.

And now for a complete change of direction from C++ function pointers

I am building a part of a simulator. We are building off of a legacy simulator, but going in a different direction, incorporating live bits alongside the simulated bits. The piece I am working on has to, effectively, route commands from the central controller to the various bits.
In the legacy code, there is a const array populated with an enumerated type. A command comes in, it is looked up in the table, then shipped off to a switch statement keyed by the enumerated type.
The type enumeration has a choice VALID_BUT_NOT_SIMULATED, which is effectively a no-op from the point of the sim. I need to turn those no-ops into commands to actual other things [new simulated bits| live bits]. The new stuff and the live stuff have different interfaces than the old stuff [which makes me laugh about the shill job that it took to make it all happen, but that is a topic for a different discussion].
I like the array because it is a very apt description of the live thing this chunk is simulating [latching circuits by row and column]. I thought that I would try to replace the enumerated types in the array with pointers to functions and call them directly. This would be in lieu of the lookup+switch.
Can't be done. However, you could do something sort of like it with a functor. I'd put example code but as I was writing it I realized such a construct would necessarily be quite complicated. You might look at boost::bind for some ideas.
One way to do it, though ugly, is to use a generic table of pointers and cast your function pointers to such generic type (losing information about the arguments' types):
void (*myFunctions[]) () = {
(void (*)())myFirstFunction,
(void (*)())mySecondFunction
};
But then you'll have to know, for each of the pointers, what arguments to pass to the corresponding functions. You can extend your table of pointers and make a table of more sophisticated objects which hold some enumeration variable informing about the arguments of a function to which a particular pointer points.
Unfortunately, each time you'll want to use a function from the array, you will need to cast the pointer back to the given type (and you'll have to care not to cast it incorrectly), like so:
((void (*)(int))myFunctions[0])(1);
In order to call myFirstFunction with x = 1.
As I think about it now (after you changed the question), I come to the conclusion that if you have to call the functions differently, there really is no point complicating the whole thing (lookup table), unless there are just a few signatures and many functions available. You need a very consistent calling policy and really few possible signatures to achieve a good-looking solution with a lookup table. Needless to mention what will happen if you need to store pointers to member functions or even worse - virtuals.
Based on your updated question, I'm still not sure how you're going to invoke the functions via the pointer if the functions need different parameter lists.
However, if one parameter list is a subset of the other, could you write thunks to adapt one interface to look like the other? (i.e. discarding irrelevant parameters or synthesising fake parameters).
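A thunk along those lines could be as small as the sketch below. All function names and the (row, column) signature are assumptions made up for illustration; the idea is that every table entry ends up with the same type, so no casts are needed.

```cpp
// Uniform signature stored in the lookup table (assumed for this sketch).
typedef int (*Handler)(int row, int column);

// Hypothetical legacy function that only needs the row.
int legacy_latch(int row) { return row * 10; }

// Hypothetical newer function that uses both coordinates.
int live_latch(int row, int column) { return row * 10 + column; }

// Thunk: adapts legacy_latch to the table's signature by discarding
// the parameter it does not care about.
int legacy_latch_thunk(int row, int /*column*/) { return legacy_latch(row); }

// Every entry has the same type, so the call site is cast-free.
const Handler handlers[] = { legacy_latch_thunk, live_latch };
```

A call then looks like handlers[i](row, column) regardless of which interface sits behind entry i.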
In your original question, you were describing a scenario that is very common when working with Javascript libraries. In Javascript, libraries often provide a way for "interested parties" to be notified of events that are published by the library, and all that the interested parties need to do is register their own Function object callbacks. In one version of the library, the documentation might say that the callbacks will be passed n arguments (a, b, c, ... in that order), but a future version might want to provide n + m arguments. This change does not have to break existing code because the library can just append the extra m arguments to the argument list, and this works because Javascript uses a caller-cleans-up calling convention (essentially).
In C++, you could do something similar (provide additional arguments to callbacks) as long as you can guarantee that the calling convention that is used by the callbacks and to call the callbacks is a caller-cleans-up calling convention such as the C calling convention for the x86 architecture:
#include <cstdlib>
#include <iostream>
#include <vector>
extern "C" void old_api_callback_in_old_archive(int x) {
std::cout << "`old_api_callback_in_old_archive` was called with x = " << x << std::endl;
}
extern "C" void new_api_callback(int x, int otherInfo) {
std::cout << "`new_api_callback` was called with x = " << x
<< ", otherInfo = " << otherInfo << std::endl;
}
extern "C" {
typedef void (*callback_type)(int, int);
}
int main()
{
std::vector<callback_type> callbacks;
callbacks.push_back(&new_api_callback);
callbacks.push_back(reinterpret_cast<callback_type>(&old_api_callback_in_old_archive));
std::vector<callback_type>::iterator it;
for (it = callbacks.begin(); it != callbacks.end(); ++it) {
(*it)(7, -8);
}
return EXIT_SUCCESS;
}

Closures in C++

I've found myself in a strange place, mentally. In a C++ project, I long for closures.
Background. There's a Document-type class with a public Render method which spawns a deep call tree. There's some transient state that only makes sense during rendering. Right now it resides in the class like regular member variables. However, this is not satisfactory on some levels - this data only makes sense during a Render call, why store it all the time? Passing it around in arguments would be ugly - there are around 15 variables there. Passing around a structure would add a lot of "RenderState->..." in the lower-level methods.
So what do I want? I want the world, like we all do. Specifically, a set of variables that are:
available to some methods in a class (not all of them)
accessible by name alone (no pState->... stuff - so that refactoring is easy)
not copied around on every method call
only live during a method call and up its call tree (assuming trees grow up)
live on a stack
I know I can have some of those properties with C++ - but not all of them. Tell me I'm not turning weird.
Heck, in Pascal, of all places, nested functions give you all that...
So what is a good workaround to emulate closures in C++, getting as many of the above benefits as possible?
Since C++11, standard C++ provides native lambda expressions, and several compilers (VC10+, GCC and Clang at least) implement them.
With GCC and Clang you can enable them with "-std=c++11" (or a later standard version, if available). VC10 and later versions enable them without any flags.
By the way, you can also use boost::lambda, which is not perfect but works with C++03.
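As a minimal sketch of how lambdas could cover the wish list in the question, the transient state can live as local variables of Render, with lambdas that capture it by reference acting as the "nested functions". The `Document` class below is a hypothetical stand-in for the asker's class, and its members and output format are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Document class from the question.
class Document {
public:
    explicit Document(std::vector<std::string> lines) : lines_(std::move(lines)) {}

    // Returns every line, indented and numbered.
    std::string Render(int indent) const {
        assert(indent >= 0);

        // Transient state: lives on the stack, only for the duration of this call.
        std::string output;
        int lineNumber = 0;
        const std::string padding(static_cast<std::size_t>(indent), ' ');

        // "Nested functions": the lambdas capture the locals by reference,
        // so the state is used by name and is never copied.
        auto emit = [&](const std::string& text) {
            ++lineNumber;
            output += padding + std::to_string(lineNumber) + ": " + text + "\n";
        };
        auto emitAll = [&] {
            for (const std::string& line : lines_) {
                emit(line);
            }
        };

        emitAll();
        return output;
    }

private:
    std::vector<std::string> lines_;
};
```

The captured references are only valid while Render is running, so such lambdas must not be stored anywhere that outlives the call.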
You don't have nested functions, but you have local classes:
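void Document::Render(Param param)
{
    class RenderState
    {
    public:
        explicit RenderState(Document& doc) : doc_(doc)
        {
            //...
        }
        // Member functions of a local class must be defined inside the class body.
        void Go(Param param)
        {
            //...
        }
    private:
        // "Nested" functions
        // ....
        // Data that nested functions operate on
        // ...
        Document& doc_;
    };
    RenderState s(*this);
    s.Go(param);
}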
See this GotW article for more information
Personally, I'd go with the RenderState approach.
Alternatively, if there's a well-defined set of Render-only functions that all require access to the same data, I'd seriously investigate pulling those into their own DocumentRenderer class that contains both the appropriate methods and the appropriate member variables. (This is similar to Fowler's "method object" refactoring.)
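A rough sketch of what such a DocumentRenderer could look like follows. Everything beyond the names `Document`, `Render` and `DocumentRenderer` is an assumption made for illustration (the paragraph storage, the `width` parameter, the truncation logic):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical document: only holds persistent data.
class Document {
public:
    explicit Document(std::vector<std::string> paragraphs)
        : paragraphs_(std::move(paragraphs)) {}
    const std::vector<std::string>& paragraphs() const { return paragraphs_; }
    std::string Render(int width) const;  // defined below

private:
    std::vector<std::string> paragraphs_;
};

// Method object: the render-only state becomes member variables,
// so the helper methods use it by name without passing it around.
class DocumentRenderer {
public:
    DocumentRenderer(const Document& doc, int width) : doc_(doc), width_(width) {
        assert(width > 0);
    }

    std::string Run() {
        for (const std::string& paragraph : doc_.paragraphs()) {
            RenderParagraph(paragraph);
        }
        return output_;
    }

private:
    void RenderParagraph(const std::string& text) {
        if (paragraphCount_ > 0) {
            output_ += '\n';  // blank line between paragraphs
        }
        // Truncate each paragraph to the requested width.
        output_ += text.substr(0, static_cast<std::size_t>(width_));
        output_ += '\n';
        ++paragraphCount_;
    }

    const Document& doc_;
    int width_;
    std::string output_;
    int paragraphCount_ = 0;
};

// The renderer exists only for the duration of the call.
inline std::string Document::Render(int width) const {
    DocumentRenderer renderer(*this, width);
    return renderer.Run();
}
```

Compared with a local class, the renderer can live in its own source file and be tested separately, at the price of exposing whatever Document data it needs through accessors.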
C++ doesn't have nested functions, but local classes can serve as an imperfect solution. (Imperfect because a local class's methods cannot access the non-static local variables of the enclosing function and because, before C++11, local classes can't be used as template arguments.) A local class is simply a class that's declared, along with its methods, within the body of a function. Herb Sutter discusses local classes in more detail here.
Local classes are used to implement Boost's ScopeExit library. ScopeExit's reviewers noted that ScopeExit "suggests a method for creating a general closure mechanism as a library," so if you aren't happy with a RenderState or DocumentRenderer approach, ScopeExit's implementation may give you some ideas for closures in C++.
Currently there are no closures in C++ that would generate "ordinary" first-class functions (whether member or non-member). Moreover, there's no standard way to implement such closures.
Closure semantics are available for functors in template metaprogramming at compile time, but that's a completely different kind of beast. In order to obtain true run-time closure functionality for first-class functions you have no other choice but to use a non-standard low-level implementation like this one, for example.
A functor is basically a closure.
Why the downvotes? Take Éric's comment, change void Go(Param); to void operator () (Param); and there you have it.
There is no way to keep the stack in a native application after the function has exited, but this would be necessary to make closures like the ones in JavaScript. And there is no way to reference a function's stack without doing anything evil. A class that acts like a function (= a functor) would have to get all the relevant information passed somehow, but this is as close as you get in C++. It has state, it has code, and you can pass it around.
Please explain where I am wrong.
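To make the "state, code, and you can pass it around" point concrete, here is a small sketch of a functor. The names `Clamp` and `clamp_all` are hypothetical and not taken from the thread; the state a closure would capture is handed over explicitly through the constructor:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Functor: the "captured" state is passed in through the constructor
// and copied into members, so it does not depend on anyone's stack.
class Clamp {
public:
    Clamp(int low, int high) : low_(low), high_(high) {
        assert(low <= high);
    }

    // The code part: callable like a function.
    int operator()(int value) const {
        return std::min(std::max(value, low_), high_);
    }

private:
    int low_;
    int high_;
};

// The functor is passed around like any other value, here to std::transform.
inline std::vector<int> clamp_all(std::vector<int> values, int low, int high) {
    std::transform(values.begin(), values.end(), values.begin(), Clamp(low, high));
    return values;
}
```

Because `Clamp` stores copies of its state, an instance stays valid after the function that created it has returned, which is the part a reference to a local variable cannot offer.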
As long as the local variables you want to bind are in scope, you can try something like the following to bind them to your inner class. Note, though, that as the GotW article mentioned above explains, this is a fragile solution.
#include <iostream>
using namespace std;

int main() {
    int x = 1;
    cout << x << endl; // 1
    class Inner {
    public:
        Inner(int& x) : bound_x(x) {}
        void do_sth() { ++bound_x; }
    private:
        int& bound_x;
    };
    Inner i(x);
    i.do_sth();
    cout << x << endl; // 2
    x = 5;
    i.do_sth();
    cout << x << endl; // 6
    return 0;
}