What's everybody's opinion on using lambdas to do nested functions in C++? For example, instead of this:
static void prepare_eggs()
{
...
}
static void prepare_ham()
{
...
}
static void prepare_cheese()
{
...
}
static fry_ingredients()
{
...
}
void make_omlette()
{
prepare_eggs();
prepare_ham();
prepare_cheese();
fry_ingredients();
}
You do this:
void make_omlette()
{
auto prepare_eggs = [&]()
{
...
};
auto prepare_ham = [&]()
{
...
};
auto prepare_cheese = [&]()
{
...
};
auto fry_ingredients = [&]()
{
...
};
prepare_eggs();
prepare_ham();
prepare_cheese();
fry_ingredients();
}
Having come from the generation that learned how to code by using Pascal, nested functions make perfect sense to me. However, this usage seemed to confuse some of the less experienced developers in my group at work during a code review where I made use of lambdas in this way.
I don't see anything wrong with nested functions per se. I use lambdas for nested functions, but only if it meets some conditions:
It is called in more than once place. (otherwise just write the code directly if it's not too long)
It is really an internal function, so that that calling it in any other context would not make sense.
It's short enough (maybe 10 lines max).
So in your example I would not use lambdas for reason number one.
Conceptually nested functions can be useful for the same reason why private methods in classes are useful. They enforce encapsulation and they make it easier to see the structure of the program. If a function is an implementation detail to some other function then why not make it explicitly so?
The biggest problem I see is with readability; it's more difficult to read code that has a lot of nesting and indenting. Also, people are not very comfortable with lambdas yet so resistance is expected.
For any given piece of code, make it as visible as necessary and as hidden as possible:
If the piece of code is used in only one place, write it there.
If it is used in multiple places inside the same function, emulate nested functions through lambdas.
If it is used by multiple functions, put it in a proper function.
You can already guess that you're doing something unorthodox by the comments you received. This is one of the reasons C++ has bad reputation, people never stop abusing it. Lambdas are mainly used as inline function objects for standard library algorithms and places that require some kind of callback mechanism. I think this covers 99% of use-cases, and it should stay that way!
As Bjarne said in one of his lectures: "Not everything should be a template, and not everything should be an object."
And not everything should be a lambda :) there is nothing wrong with a free standing function.
It's a very limited use case. For starters, the functionality present in the local function must be needed at several spots inside the enclosing function such that the resulting local refactoring will be a win in readability. Otherwise I will write the functionality inline, perhaps putting it in a block if that helps.
But at the same time, the functionality must be local or specific enough that I don't have an incentive to refactor the functionality outside of the (not so) enclosing function, where I could perhaps reuse it entirely in another function at some point. It must also be short: otherwise I'm just going to move it out, perhaps putting it in an anonymous namespace (or namespace detail in a header) or some such. It doesn't take much for me to trade locality off in favour of compactness (long functions are a pain to review).
Note that the above is strictly language agnostic. I don't think C++ spins a particular spin on that. If there is one particular C++ advice I have to give on the topic however, it's that I would proscribe using a default by-reference capture ([&]). There'd be no way to tell if that particular lambda expression describe a closure or a local function without carefully reviewing the whole body. Which wouldn't be that bad (it's not that closures are 'scary') if not for the fact that that by-reference captures ([&], [&foo]) allow mutations even if the lambda is not marked mutable, and by-value captures ([=], [foo]) can make an undesirable copy, or even attempt an impossible copy for move-only types. All in all I'd rather not capture anything at all if it's possible (that's what parameters are for!), and use individual captures when needed. It's especially problematic
To sum up:
// foo is expensive to copy, but ubiquitous enough
// that capturing it rather than passing it as a parameter
// is acceptable
auto const& foo_view = foo;
auto do_quux = [&foo_view](arg_type0 arg0, arg_type1 arg1) -> quux_type
{
auto b = baz(foo_view, arg0, arg1);
b.frobnicate;
return foo_view.quux(b);
};
// use do_quux several times later
In my opinion it's useless, but you can accomplish that using only C++03
void foo() {
struct {
void operator()() {}
} bar;
bar();
}
But once again, IMHO it's useless.
Related
So, I see the usefulness of lambda functions when they are used to replace functors, but when would you want to use them in an object oriented programming (with classes) setting in general and why?
Okay, so a little more (and less) helpful response than my comment.
Closures (that is, functions declared inside other functions, which capture the outer function's variables) are an interesting back-channel way to implement classes. Watch:
auto MakeCounter(int initialCount)
{
int currentCount = initialCount;
struct {
std::function<void()> increment = [&]() { currentCount++; };
std::function<int()> getCount = [&]() { return currentCount; };
} theCounter;
return theCounter;
}
Now theCounter is a structure with two members, one which increments the count and the other which retrieves the current count. Notice that the struct doesn't itself need to store the current count; that is instead implicitly held by the two lambdas, which share currentCount between them.
There's a few problems with this.
It doesn't compile. Not even mostly. There are a ton of things wrong with that code snippet. Actually, it crashes GCC 4.9. Whee!
Even if it did compile, it wouldn't work properly, because C++ closures aren't very powerful -- they can't keep captured variables alive after the end of their scopes. Only a language with actual GC could do this.
C++ already has classes, so why bother?
Nevertheless, you see this sort of pattern in other languages which either support (proper) closures and GC but don't have native facilities for classes (e.g. some variants of LISP), or do support classes but so crappily that it's arguably better to do things this way (e.g. Matlab).
So in C++ they're just for replacing functor boilerplate. They don't offer any additional power. In some other languages, they're rather more versatile, and more widely important.
I have a complex algorithm. This uses many variables, calculates helper arrays at initialization and also calculates arrays along the way. Since the algorithm is complex, I break it down into several functions.
Now, I actually do not see how this might be a class from an idiomatic way; I mean, I am just used to have algorithms as functions. The usage would simply be:
Calculation calc(/* several parameters */);
calc.calculate();
// get the heterogenous results via getters
On the other hand, putting this into a class has the following advantages:
I do not have to pass all the variables to the other functions/methods
arrays initialized at the beginning of the algorithm are accessible throughout the class in each function
my code is shorter and (imo) clearer
A hybrid way would be to put the algorithm class into a source file and access it via a function that uses it. The user of the algorithm would not see the class.
Does anyone have valuable thoughts that might help me out?
Thank you very much in advance!
I have a complex algorithm. This uses many variables, calculates helper arrays at initialization and also calculates arrays along the way.[...]
Now, I actually do not see how this might be a class from an idiomatic way
It is not, but many people do the same thing you do (so did I a few times).
Instead of creating a class for your algorithm, consider transforming your inputs and outputs into classes/structures.
That is, instead of:
Calculation calc(a, b, c, d, e, f, g);
calc.calculate();
// use getters on calc from here on
you could write:
CalcInputs inputs(a, b, c, d, e, f, g);
CalcResult output = calculate(inputs); // calculate is now free function
// use getters on output from here on
This doesn't create any problems and performs the same (actually better) grouping of data.
I'd say it is very idiomatic to represent an algorithm (or perhaps better, a computation) as a class. One of the definitions of object class from OOP is "data and functions to operate on that data." A compex algorithm with its inputs, outputs and intermediary data matches this definition perfectly.
I've done this myself several times, and it simplifies (human) code flow analysis significantly, making the whole thing easier to reason about, to debug and to test.
If the abstraction for the client code is an algorithm, you
probably want to keep a pure functional interface, and not
introduce additional types there. It's quite common, on the
other hand, for such a function to be implemented in a source
file which defines a common data structure or class for its
internal use, so you might have:
double calculation( /* input parameters */ )
{
SupportClass calc( /* input parameters */ );
calc.part1();
calc.part2();
// etc...
return calc.results();
}
Depending on how your code is organized, SupportClass will be
in an unnamed namespace in the source file (probably the most
common case), or in a "private" header, included only by the
sources involved in the algorith.
It really depends of what kind of algorithm you want to encapsulate. Generally I agree with John Carmack : "Sometimes, the elegant implementation is just a function. Not a method. Not a class. Not a framework. Just a function."
It really boils down to: do the algorithm need access to the private area of the class that is not supposed to be public? If the answer is yes (unless you are willing to refactor your class interface, depending on the specific cases) you should go with a member function, if not, then a free function is good enough.
Take for example the standard library. Most of the algorithms are provided as free functions because they only access the public interface of the class (with iterators for standard containers, for example).
Do you need to call the exact same functions in the exact same order each time? Then you shouldn't be requiring calling code to do this. Splitting your algorithm into multiple functions is fine, but I'd still have one call the next and then the next and so on, with a struct of results/parameters being passed along the way. A class doesn't feel right for a one-off invocation of some procedure.
The only way I'd do this with a class is if the class encapsulates all the input data itself, and you then call myClass.nameOfMyAlgorithm() on it, among other potential operations. Then you have data+manipulators. But just manipulators? Yeah, I'm not so sure.
In modern C++ the distinction has been eroded quite a bit. Even from the operator overloading of the pre-ANSI language, you could create a class whose instances are syntactically like functions:
struct Multiplier
{
int factor_;
Multiplier(int f) : factor_(f) { }
int operator()(int v) const
{
return v * _factor;
}
};
Multipler doubler(2);
std::cout << doubler(3) << std::endl; // prints 6
Such a class/struct is called a functor, and can capture "contextual" values in its constructor. This allows you to effectively pass the parameters to a function in two stages: some in the constructor call, some later each time you call it for real. This is called partial function application.
To relate this to your example, your calculate member function could be turned into operator(), and then the Calculation instance would be a function! (or near enough.)
To unify these ideas, you can try thinking of a plain function as a functor of which there is only one instance (and hence no need for a constructor - although this is no guarantee that the function only depends on its formal parameters: it might depend on global variables...)
Rather than asking "Should I put this algorithm in a function or a class?" instead ask yourself "Would it be useful to be able to pass the parameters to this algorithm in two or more stages?" In your example, all the parameters go into the constructor, and none in the later call to calculate, so it makes little sense to ask users of your class make two calls.
In C++11 the distinction breaks down further (and things get a lot more convenient), in recognition of the fluidity of these ideas:
auto doubler = [] (int val) { return val * 2; };
std::cout << doubler(3) << std::endl; // prints 6
Here, doubler is a lambda, which is essentially a nifty way to declare an instance of a compiler-generated class that implements the () operator.
Reproducing the original example more exactly, we would want a function-like thing called multiplier that accepts a factor, and returns another function-like thing that accepts a value v and returns v * factor.
auto multiplier = [] (int factor)
{
return [=] (int v) { return v * factor; };
};
auto doubler = multiplier(2);
std::cout << doubler(3) << std::endl; // prints 6
Note the pattern: ultimately we're multiplying two numbers, but we specify the numbers in two steps. The functor we get back from calling multiplier acts like a "package" containing the first number.
Although lambdas are relatively new, they are likely to become a very common part of C++ style (as they have in every other language they've been added to).
But sadly at this point we've reached the "cutting edge" as the above example works in GCC but not in MSVC 12 (I haven't tried it in MSVC 13). It does pass the intellisense checking of MSVC 12 though (they use two completely different compilers)! And you can fix it by wrapping the inner lambda with std::function<int(int)>( ... ).
Even so, you can use these ideas in old-school C++ when writing functors by hand.
Looking further ahead, resumable functions may make it into some future version of the language (Microsoft is pushing hard for them as they are practically identical to async/await in C#) and that is yet another blurring of the distinction between functions and classes (a resumable function acts like a constructor for a state machine class).
My question refers to:
Using a lambda expression versus a private method
Now that lambda functors are part of C++, they could be used to unclutter a class' interface. How does lambda use vs private method use compare in C++? Do there exist better alternatives to unclutter class interfaces?
Although lambdas can definitely replace some private member functions, thinking of them as means to uncluttering the interface of a class is taking too narrow a view of both lambdas and private member functions.
Private member functions (and functions in general) are the basic units of code reuse. They let you write a piece of logic once, and then hide it behind the name of the function.
Although lambdas can replace a private member function in certain context, they could replace an object of which that function is a member along with it, which is a whole lot more. Due to lambda's ability to capture the context around them, you get a way of creating code blocks that take not only your object with it, but also the state of your local variables. Before lambdas you needed to create a special class for that; lambdas let you create such classes on the fly, for much better readability.
Do there exist better alternatives to unclutter class interfaces?
A method I use quite often is to provide a series of functions with file-scope in an unnamed namespace that act as "helpers" to my member functions.
For example:
// Foobar.cpp
#include "Foobar.h"
namespace {
void someHelper()
{
std::cout << "I'm a helper!" << std::endl;
}
}
void Foobar::someMemberFunc()
{
someHelper();
[...]
}
You can, of course, also declare lambdas and/or classes in this unnamed namespace as well. Anything declared in this unnamed namespace will only be accessible from within the same translation unit (eg, cpp file).
Lambda is a way to move logic into the flow of your code. The only purpose is improving readability, and they can help or they can hurt
If a private/lambda is short, it is often easier for a reader to comprehend it on the spot, rather than having to remember to look at a different function later.
If a private/lambda is long, and appears incongruous with the code calling it, this can distract the reader from a greater pattern in the code. Imagine if you were reading one of these StackOverflow answers and suddenly had to stop to read StackEnglish's description of null and void to make sense of the answer. You'd really rather just see a hyperlink of "optional information" than a copy/paste of the contents.
It is far easier to construct a lambda which holds onto upvalue arguments than it is to construct a private functor to do the same. This can increase readability due to lambdas having less boilerplate in accomplishing that goal.
lambdas are harder to share between sections of code. If you find yourself calling the same function from many places, private functions may be a better match.
How does lambda use vs private method use compare in C++
One may capture the this pointer, therefore having access to a class's internals without cluttering its interface. Previously, under certain circumstances one would require either a member or a friend function (use in conjunction with bind and mem_fn) in stl algorithms.
Added, one should strive to keep interfaces minimal, as changing them is more costly. For this reason idioms like pimpl are popular, and for the same reason a lambda may be preferred over a member function. As mentioned by another person, one should consider that functions are also tools of reuse, but in that case I would prefer a pimpl with private functions, which negates reuse of lambdas for this purpose.
It has a matter of personal taste in it if you ask me, the difference between the two is not huge.
Lambdas are really good for stl algorithms, while private methods, are not as comftorable for that, on the other hand, i find private methods much better if you want to reuse your code.
This question shows the best usages of lambdas imo : lambdas
Personally, i prefer private methods, espessialy in the case you presented. Lambdas should not be used instead of private methods. It clutters the function's body and diverts the eye from the main logic.
Although it unclutter the class, it clutters the function, which i find worse.
in a C++ program I need some helper constant objects that would be instantiated once, preferably when the program starts. Those objects would mostly be used within the same translation unit, so the simplest way to do this would be to make them static:
static const Helper h(params);
But then there is this static initialization order problem, so if Helper refers to some other statics (via params), this might lead to UB.
Another point is that I might eventually need to share this object between several units. If I just leave it static and put in a .h file, that would lead to multiple objects. I could avoid that by bothering with extern etc, but this can finally provoke the same initialization order issues (and not to say it looks very C-ish).
I thought about singletons, but that would be overkill due to the boilerplate code and inconvenient syntax (e.g. MySingleton::GetInstance().MyVar) - those objects are helpers, so they are supposed to simplify things, not to complicate them...
The same C++ FAQ mentions this option:
Fred& x()
{
static Fred* ans = new Fred();
return *ans;
}
Is this really used and considered a good thing? Should I do it this way, or would you suggest other alternatives? Thanks.
EDIT: I should have clarified why I actually need that helpers: they are very like normal constants, and could have been pre-calculated, but it is more convenient to do that at runtime. I would prefer to instantiate them before main, as it automatically resolves multi-threading issues (which local statics are not protected against in C++03). Also, as I said, they would often be limited to a translation unit, so it does not make sense to export them and initialize in main(). You can think of them as just constants but only known at runtime.
There are several possibilities for global state (whether mutable or not).
If you fear that you'll have an initialization issue, then you should use the local static approach to create your instance.
Note that the clunky singleton design you present is not mandatory design:
class Singleton
{
public:
static void DoSomething(int i)
{
Singleton& s = Instance();
// do something with i
}
private:
Singleton() {}
~Singleton() {}
static Singleton& Instance()
{
static Singleton S; // no dynamic allocation, it's unnecessary
return S;
}
};
// Invocation
Singleton::DoSomething(i);
Another design is somewhat similar, though I much prefer it because it makes transition to a non-global design much easier.
class Monoid
{
public:
Monoid()
{
static State S;
state = &s;
}
void doSomething(int i)
{
state->count += i;
}
private:
struct State
{
int count;
};
State* state;
};
// Use
Monoid m;
m.doSomething(1);
The net advantage here is that the "global-ness" of the state is hidden, it's an implementation details that clients need not worry about. Very useful for caches.
Let us, will you, question the design:
do you actually need to enforce the singularity ?
do you actually need the object be built before main starts ?
Singularity is generally over-emphasized. C++0x will help here, but even then, technically enforcing singularity rather than relying on programmers to behave themselves can be very annoying... for example when writing tests: do you really want to unload/reload your program between each unit test just to change the configuration between each one ? Ugh. Much more simple to instantiate it once and have faith in your fellow programmers... or the functional tests ;)
The second question is more technical, than functional. If you do need the configuration before the entry point of your program, then you can simply read it when it starts.
It may sound naive, but there is actually one issue with computing during the library load: how do you handle errors ? If you throw, the library is not loaded. If you do not throw and go on, you are in an invalid state. Not so funny, is it ? Things are much simpler once the real work has begun, because you can use the regular control-flow logic.
And if you think about testing whether the state is valid or not... why not simply building everything at the point where you'd test ?
Finally, the very issue with global is the hidden dependencies that are introduced. It's much better when dependencies are implicit to reason about the flow of execution, or the impacts of a refactoring.
EDIT:
Regarding initialization order issues: objects within a single translation unit are guaranteed to be initialized in the order they are defined.
Therefore, the following code is valid according to the standard:
static int foo() { return std::numeric_limits<int>::max() / 2; }
static int bar(int c) { return c*2; }
static int const x = foo();
static int const y = bar(x);
The initialization order is only an issue when referencing constants / variables defined in another translation unit. As such, static objects can naturally be expressed without issues as long as they only refer to static objects within the same translation unit.
Regarding the space issue: the as-if rule can do wonders here. Informally the as-if rule means that you specify a behavior and leave it up to the compiler/linker/runtime to provide it, without a care in the world for how it is provided. This is what actually enables optimizations.
Therefore, if the compiler chain can infer that the address of a constant is never taken, it may elide the constant altogether. If it can infer that several constants will always be equal, and once again that their address are never inspected, it may merge them together.
Yes, you can use Construct On First Use Idiom if it simplifies your problem. It's always better than global objects whose initialization depend on other global objects.
The other alternative is Singleton Pattern. Both can solve similar problem. But you've to decide which suits the situation better and fulfill your requirement.
To the best of my knowledge, there is nothing "better" than these two appproaches.
Singletons and global objects are often considered evil. The simplest and most flexible way is to instantiate the object in your main function and pass this object to other functions:
void doSomething(const Helper& h);
int main() {
const Parameters params(...);
const Helper h(params);
doSomething(h);
}
Another way is to make the helper functions non-members. Maybe they don't need any state at all, and if they do, you can pass a stateful object when you call them.
I think nothing speaks against the local static idiom mentioned in the FAQ. It is simple and should be thread-safe, and if the object isn't mutable, it should also be easily mockable and introduce no action at a distance.
Does Helper need to exist before main runs? If not, make a (set of?) global pointer variables initialized to 0. Then use main to populate them with the constant state in a definitive order. If you like you can even make helper functions that do the dereference for you.
I've found myself in a strange place, mentally. In a C++ project, I long for closures.
Background. There's a Document-type class with a public Render method which spawns a deep call tree. There's some transient state that only makes sense during rendering. Right now it resides in the class like regular member variables. However, this is not satisfactory on some levels - this data only makes sense during a Render call, why store it all the time? Passing it around in arguments would be ugly - there are around 15 variables there. Passing around a structure would add a lot of "RenderState->..." in the lower-level methods.
So what do I want? I want the world, like we all do. Specifically, a set of variables that are:
available to some methods in a class (not all of them)
accessible by name alone (no pState->... stuff - so that refactoring is easy)
not copied around on every method call
only live during a method call and up its call tree (assuming trees grow up)
live on a stack
I know I can have some of those properties with C++ - but not all of them. Tell me I'm not turning weird.
Heck, in Pascal, of all places, nested functions give you all that...
So what is a good workaround to emulate closures in C++, getting as many of the above benefits as possible?
Standard C++ since C++11 provides native lambda expressions and several compilers (VC10+ GCC and clang at least) implements it.
With GCC and Clang you can activate it with "--std=c++11" (or use a higher version of C++ if available). VC10 and later versions have it activated without need for flags.
By the way, you can also use boost::lambda (that is not perfect but works with C++03) also provide lambda in C++.
You don't have nested functions, but you have local classes:
void Document::Render(Param)
{
class RenderState
{
public:
RenderState(Document&)
{
//...
}
void Go(Param);
private:
// "Nested" functions
// ....
// Data that nested functions operate on
// ...
};
RenderState s(*this);
s.Go(Param);
}
See this GotW article for more information
Personally, I'd go with the RenderState approach.
Alternatively, if there's a well-defined set of Render-only functions that all require access to the same data, I'd seriously investigate pulling those into their own DocumentRenderer class that contains both the appropriate methods and the appropriate member variables. (This is similar to Fowler's "method object" refactoring.)
C++ doesn't have nested functions, but local classes can serve as an imperfect solution. (Imperfect because local classes' methods cannot access variables of the enclosing class and because they can't be used to instantiate templates.) A local class is simply a class that's declared, along with its methods, within the body of a function. Herb Sutter discusses local classes in more detail here.
Local classes are used to implement Boost's ScopeExit library. ScopeExit's reviewers noted that ScopeExit "suggests a method for creating a general closure mechanism as a library," so if you aren't happy with a RenderState or DocumentRenderer approach, ScopeExit's implementation may give you some ideas for closures in C++.
Currently there are no closures in C++ that would generate "orinary" first-class functions (whether member or non-member). Moreover, there's no standard way to implement such closures.
Closure semantics is available for functors in template metaprogramming at compile time, but that's a completely different kind of beast. In order to obtain a true run-time closure functionality for first-class functions you haver no other choice but to use a non-standard low-level implementation like this one, for example.
A functor is basically a closure.
Why the downvotes? Take Érics comment, change void Go(Param); to void operator () (Param); and there you have it.
There is no way to keep the stack in a native application after the function has exited. But this would be neccesary to make closures like the ones in Javascript. And there is no way to reference a function's stack without doing anything evil. A class that acts like a function (=a functor) would have to get all the relevant information passed somehow, but this is as close as you get in C++. It has state, it has code, and you can pass it around.
Please explain, where am I wrong?
As long as the local variables you want to bind are in scope, you can try something like the following to bind them to your inner class. Though, if you have read the above posted GotW article, it is a fragile solution.
#include <iostream>
using namespace std;
int main() {
int x = 1;
cout << x << endl; // 1
class Inner {
public:
Inner(int& x) : bound_x(x) {}
void do_sth() { ++bound_x; }
private:
int& bound_x;
};
Inner i(x);
i.do_sth();
cout << x << endl; // 2
x = 5;
i.do_sth();
cout << x << endl; // 6
return 0;
}