Is there a preferred way to return multiple values from a C++ function? For example, imagine a function that divides two integers and returns both the quotient and the remainder. One way I commonly see is to use reference parameters:
void divide(int dividend, int divisor, int& quotient, int& remainder);
A variation is to return one value and pass the other through a reference parameter:
int divide(int dividend, int divisor, int& remainder);
Another way would be to declare a struct to contain all of the results and return that:
struct divide_result {
int quotient;
int remainder;
};
divide_result divide(int dividend, int divisor);
Is one of these ways generally preferred, or are there other suggestions?
Edit: In the real-world code, there may be more than two results. They may also be of different types.
In C++11 you can:
#include <tuple>
std::tuple<int, int> divide(int dividend, int divisor) {
return std::make_tuple(dividend / divisor, dividend % divisor);
}
#include <iostream>
int main() {
using namespace std;
int quotient, remainder;
tie(quotient, remainder) = divide(14, 3);
cout << quotient << ',' << remainder << endl;
}
In C++17:
#include <tuple>
std::tuple<int, int> divide(int dividend, int divisor) {
return {dividend / divisor, dividend % divisor};
}
#include <iostream>
int main() {
using namespace std;
auto [quotient, remainder] = divide(14, 3);
cout << quotient << ',' << remainder << endl;
}
or with structs:
auto divide(int dividend, int divisor) {
struct result {int quotient; int remainder;};
return result {dividend / divisor, dividend % divisor};
}
#include <iostream>
int main() {
using namespace std;
auto result = divide(14, 3);
cout << result.quotient << ',' << result.remainder << endl;
// or
auto [quotient, remainder] = divide(14, 3);
cout << quotient << ',' << remainder << endl;
}
For returning two values I use a std::pair (usually typedef'd). You should look at boost::tuple (in C++11 and newer, there's std::tuple) for more than two return results.
With the introduction of structured bindings in C++17, returning a std::tuple should probably become the accepted standard.
Personally, I generally dislike return parameters for a number of reasons:
it is not always obvious in the invocation which parameters are ins and which are outs
you generally have to create a local variable to catch the result, while return values can be used inline (which may or may not be a good idea, but at least you have the option)
it seems cleaner to me to have an "in door" and an "out door" to a function -- all the inputs go in here, all the outputs come out there
I like to keep my argument lists as short as possible
I also have some reservations about the pair/tuple technique. Mainly, there is often no natural order to the return values. How is the reader of the code to know whether result.first is the quotient or the remainder? And the implementer could change the order, which would break existing code. This is especially insidious if the values are the same type so that no compiler error or warning would be generated. Actually, these arguments apply to return parameters as well.
Here's another code example, this one a bit less trivial:
pair<double,double> calculateResultingVelocity(double windSpeed, double windAzimuth,
double planeAirspeed, double planeCourse);
pair<double,double> result = calculateResultingVelocity(25, 320, 280, 90);
cout << result.first << endl;
cout << result.second << endl;
Does this print groundspeed and course, or course and groundspeed? It's not obvious.
Compare to this:
struct Velocity {
double speed;
double azimuth;
};
Velocity calculateResultingVelocity(double windSpeed, double windAzimuth,
double planeAirspeed, double planeCourse);
Velocity result = calculateResultingVelocity(25, 320, 280, 90);
cout << result.speed << endl;
cout << result.azimuth << endl;
I think this is clearer.
So I think my first choice, in general, is the struct technique. The pair/tuple idea is likely a great solution in certain cases. I'd like to avoid the return parameters when possible.
std::pair<int, int> divide(int dividend, int divisor)
{
// :
return std::make_pair(quotient, remainder);
}
std::pair<int, int> answer = divide(5,2);
// answer.first == quotient
// answer.second == remainder
std::pair is essentially your struct solution, but already defined for you, and ready to adapt to any two data types.
There are a bunch of ways to return multiple parameters. I'm going to be exhaustive.
Use reference parameters:
void foo( int& result, int& other_result );
Use pointer parameters:
void foo( int* result, int* other_result );
which has the advantage that you have to do a & at the call-site, possibly alerting people it is an out-parameter.
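A minimal sketch of the pointer-parameter style, applied to the divide example from the question. The null checks and the helper name quotient_plus_remainder are my own assumptions, added only to show the call site:

```cpp
// Sketch: results are written through pointers. In this sketch a null
// pointer means "the caller does not want this value" (an assumption,
// not a rule of the language).
void divide(int dividend, int divisor, int* quotient, int* remainder) {
    if (quotient)  *quotient  = dividend / divisor;
    if (remainder) *remainder = dividend % divisor;
}

// Hypothetical caller: the & at the call site hints that q and r are outputs.
int quotient_plus_remainder() {
    int q = 0, r = 0;
    divide(14, 3, &q, &r);
    return q + r;
}
```

Compared with reference parameters, the only difference the caller sees is the explicit & on each output argument.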
Write an out<?> template and use it:
template<class T>
struct out {
std::function<void(T)> target;
out(T* t):target([t](T&& in){ if (t) *t = std::move(in); }) {}
out(std::optional<T>* t):target([t](T&& in){ if (t) t->emplace(std::move(in)); }) {}
out(std::aligned_storage_t<sizeof(T), alignof(T)>* t):
target([t](T&& in){ ::new( (void*)t ) T(std::move(in)); } ) {}
template<class...Args> // TODO: SFINAE enable_if test
void emplace(Args&&...args) {
target( T(std::forward<Args>(args)...) );
}
template<class X> // TODO: SFINAE enable_if test
void operator=(X&&x){ emplace(std::forward<X>(x)); }
template<class...Args> // TODO: SFINAE enable_if test
void operator()(Args&&...args){ emplace(std::forward<Args>(args)...); }
};
then we can do:
void foo( out<int> result, out<int> other_result )
and all is good. foo is no longer able to read any value passed in as a bonus.
Other ways of defining a spot you can put data can be used to construct out. A callback to emplace things somewhere, for example.
We can return a structure:
struct foo_r { int result; int other_result; };
foo_r foo();
which works OK in every version of C++, and in C++17 this also permits:
auto&&[result, other_result]=foo();
at zero cost. Thanks to guaranteed elision, the members need not even be moved.
We could return a std::tuple:
std::tuple<int, int> foo();
which has the downside that parameters are not named. This permits the c++17:
auto&&[result, other_result]=foo();
as well. Prior to c++17 we can instead do:
int result, other_result;
std::tie(result, other_result) = foo();
which is just a bit more awkward. Guaranteed elision doesn't work here, however.
Going into stranger territory (and this is after out<>!),
We can use continuation passing style:
void foo( std::function<void(int result, int other_result)> );
and now callers do:
foo( [&](int result, int other_result) {
/* code */
} );
a benefit of this style is you can return an arbitrary number of values (with uniform type) without having to manage memory:
void get_all_values( std::function<void(int)> value )
the value callback could be called 500 times when you get_all_values( [&](int value){} ).
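To make the "callback called many times" idea concrete, here is a small sketch. The function names for_each_divisor and sum_of_divisors are hypothetical, chosen just for illustration:

```cpp
#include <functional>

// Hypothetical producer: reports each divisor of n through the callback
// instead of building and returning a container.
void for_each_divisor(int n, const std::function<void(int)>& value) {
    for (int d = 1; d <= n; ++d) {
        if (n % d == 0) value(d);
    }
}

// The caller decides what to do with each value; here it just sums them.
int sum_of_divisors(int n) {
    int sum = 0;
    for_each_divisor(n, [&](int d) { sum += d; });
    return sum;
}
```

The producer never needs to know how many values the caller wants to keep, or where they end up.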
For pure insanity, you could even use a continuation on the continuation.
void foo( std::function<void(int, std::function<void(int)>)> result );
whose use looks like:
foo( [&](int result, auto&& other){ other([&](int other){
/* code */
}) });
which would permit many-one relationships between result and other.
Again with uniform values, we can do this:
void foo( std::function< void(span<int>) > results )
here, we call the callback with a span of results. We can even do this repeatedly.
Using this, you can have a function that efficiently passes megabytes of data without doing any allocation off the stack.
void foo( std::function< void(span<int>) > results ) {
int local_buffer[1024];
std::size_t used = 0;
auto send_data=[&]{
if (!used) return;
results({ local_buffer, used });
used = 0;
};
auto add_datum=[&](int x){
local_buffer[used] = x;
++used;
if (used == 1024) send_data();
};
auto add_data=[&](gsl::span<int const> xs) {
for (auto x:xs) add_datum(x);
};
for (int i = 0; i < 7+(1<<20); ++i) {
add_datum(i);
}
send_data(); // any leftover
}
Now, std::function is a bit heavy for this, as we would be doing this in zero-overhead no-allocation environments. So we'd want a function_view that never allocates.
Another solution is:
std::function<void(std::function<void(int result, int other_result)>)> foo(int input);
where instead of taking the callback and invoking it, foo instead returns a function which takes the callback.
foo(7)([&](int result, int other_result){ /* code */ });
this separates the output parameters from the input parameters by giving them separate brackets.
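Applied to the divide example from the question, that style could be sketched like this. I use a C++14 return-type-deduced lambda instead of the nested std::function signature, purely to keep the sketch short:

```cpp
// Sketch: divide() takes the inputs and returns a callable that accepts
// the continuation receiving both outputs.
auto divide(int dividend, int divisor) {
    return [dividend, divisor](auto&& continuation) {
        continuation(dividend / divisor, dividend % divisor);
    };
}
```

A call then reads divide(14, 3)([&](int quotient, int remainder){ /* code */ });, with inputs in the first pair of brackets and outputs in the second.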
Use a Generator:
With variant and c++20 coroutines, you could make foo a generator of a variant of the return types (or just the return type). The syntax is not yet fixed, so I won't give examples.
Use signals/slot style:
In the world of signals and slots, a function that exposes a set of signals:
template<class...Args>
struct broadcaster;
broadcaster<int, int> foo();
allows you to create a foo that does work async and broadcasts the result when it is finished.
Use pipelines:
Down this line we have a variety of pipeline techniques, where a function doesn't do something but rather arranges for data to be connected in some way, and the doing is relatively independent.
foo( int_source )( int_dest1, int_dest2 );
then this code doesn't do anything until int_source has integers to provide it. When it does, int_dest1 and int_dest2 start receiving the results.
It's entirely dependent upon the actual function and the meaning of the multiple values, and their sizes:
If they're related as in your fraction example, then I'd go with a struct or class instance.
If they're not really related and can't be grouped into a class/struct then perhaps you should refactor your method into two.
Depending upon the in-memory size of the values you're returning, you may want to return a pointer to a class instance or struct, or use reference parameters.
With C++17 you can also return one or more unmovable/uncopyable values (in certain cases). The ability to return unmovable types comes from the new guaranteed return value optimization, and it composes nicely with aggregates and with what can be called templated constructors.
template<typename T1,typename T2,typename T3>
struct many {
T1 a;
T2 b;
T3 c;
};
// guide:
template<class T1, class T2, class T3>
many(T1, T2, T3) -> many<T1, T2, T3>;
auto f(){ return many{string(),5.7, unmovable()}; };
int main(){
// in place construct x,y,z with a string, 5.7 and unmovable.
auto [x,y,z] = f();
}
The pretty thing about this is that it is guaranteed to not cause any copying or moving. You can make the example many struct variadic too. More details:
Returning variadic aggregates (struct) and syntax for C++17 variadic template 'construction deduction guide'
The OO solution for this is to create a ratio class. It wouldn't take any extra code (would save some), would be significantly cleaner/clearer, and would give you some extra refactorings letting you clean up code outside this class as well.
Actually, I think someone recommended returning a structure, which is close enough but hides the intent: this needs to be a fully thought-out class with a constructor and a few methods. In fact, the "method" that you originally mentioned (as returning the pair) should most likely be a member of this class, returning an instance of itself.
I know your example was just an "Example", but the fact is that unless your function is doing way more than any function should be doing, if you want it to return multiple values you are almost certainly missing an object.
Don't be afraid to create these tiny classes to do little pieces of work--that's the magic of OO--you end up breaking it down until every method is very small and simple and every class small and understandable.
Another thing that should have been an indicator that something was wrong: in OO you have essentially no data. OO isn't about passing around data; a class needs to manage and manipulate its own data internally, and any data passing (including accessors) is a sign that you may need to rethink something.
There is precedent for returning structures in the C (and hence C++) standard with the div, ldiv (and, in C99, lldiv) functions from <stdlib.h> (or <cstdlib>).
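For reference, a short sketch of how that precedent looks in use. std::div returns a std::div_t whose members are named quot and rem; the two wrapper functions are hypothetical and exist only to show the call:

```cpp
#include <cstdlib>

// std::div computes quotient and remainder in one call and returns
// both in a small struct.
int quotient_of(int dividend, int divisor) {
    const std::div_t result = std::div(dividend, divisor);
    return result.quot;
}

int remainder_of(int dividend, int divisor) {
    const std::div_t result = std::div(dividend, divisor);
    return result.rem;
}
```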
The 'mix of return value and return parameters' is usually the least clean.
Having a function return a status and return data via return parameters is sensible in C; it is less obviously sensible in C++ where you could use exceptions to relay failure information instead.
If there are more than two return values, then a structure-like mechanism is probably best.
C++17, using std::make_tuple, structured binding and as much auto as possible:
#include <tuple>
#include <string>
#include <cstring>
auto func() {
// ...
return std::make_tuple(1, 2.2, std::string("str"), "cstr");
}
int main() {
auto [i, f, s, cs] = func();
return i + f + s.length() + strlen(cs);
}
With -O1 this optimizes out completely: https://godbolt.org/z/133rT9Pcq
-O3 needed only to optimize out std::string: https://godbolt.org/z/Mqbez73Kf
And here: https://godbolt.org/z/WWKvE3osv you can see GCC storing all the returned values packed together in a single chunk of memory (rdi+N), POD-style, proving there is no performance penalty.
Use a struct or a class for the return value. Using std::pair may work for now, but
it's inflexible if you decide later you want more info returned;
it's not very clear from the function's declaration in the header what is being returned and in what order.
Returning a structure with self-documenting member variable names will likely be less bug-prone for anyone using your function. Putting my coworker hat on for a moment, your divide_result structure is easy for me, a potential user of your function, to immediately understand after 2 seconds. Messing around with output parameters or mysterious pairs and tuples would take more time to read through and may be used incorrectly. And most likely even after using the function a few times I still won't remember the correct order of the arguments.
If your function returns a value via a reference parameter, the compiler cannot keep that value in a register across calls to other functions because, theoretically, the first function can save the address of the variable passed to it in a globally accessible variable, and any subsequently called function may change it. So the compiler has to (1) save the value from registers back to memory before calling other functions and (2) re-read it from memory when it is needed again after any such call.
If you return by reference, optimization of your program will suffer.
Here is a program that returns multiple values (more than two) in C++. It compiles as C++14 (g++ 4.9.2) and works like a simple calculator.
# include <tuple>
# include <iostream>
using namespace std;
tuple < int,int,int,int,int > cal(int n1, int n2)
{
return make_tuple(n1/n2,n1%n2,n1+n2,n1-n2,n1*n2);
}
int main()
{
int qut,rer,add,sub,mul,a,b;
cin>>a>>b;
tie(qut,rer,add,sub,mul)=cal(a,b);
cout << "quotient= "<<qut<<endl;
cout << "remainder= "<<rer<<endl;
cout << "addition= "<<add<<endl;
cout << "subtraction= "<<sub<<endl;
cout << "multiplication= "<<mul<<endl;
return 0;
}
So you can see that a function can return multiple values this way. With std::pair only two values can be returned, while std::tuple can return more than two.
I tend to use out-vals in functions like this, because I stick to the paradigm of a function returning success/error codes and I like to keep things uniform.
Alternatives include arrays, generators, and inversion of control, but none is appropriate here.
Some (e.g. Microsoft in historical Win32) tend to use reference parameters for simplicity, because it's clear who allocates and how it will look on the stack, reduces the proliferation of structures, and allows a separate return value for success.
"Pure" programmers prefer the struct, assuming it is the function value (as is the case here), rather than something that's touched incidentally by the function. If you had a more complicated procedure, or something with state, you'd probably use references (assuming you have a reason for not using a class).
I'd say there is no preferred method; it all depends on what you're going to do with the response. If the results are going to be used together in further processing then structures make sense; if not, I'd tend to pass them as individual references unless the function was going to be used in a composite statement:
x = divide( x, y, z ) + divide( a, b, c );
I often choose to pass 'out structures' by reference in the parameter list rather than having the pass by copy overhead of returning a new structure (but this is sweating the small stuff).
void divide(int dividend, int divisor, Answer &ans)
Are out parameters confusing? A parameter sent as reference suggests the value is going to change (as opposed to a const reference). Sensible naming also removes confusion.
Why do you insist on a function with multiple return values? With OOP you can use a class offering a regular function with a single return value, and any number of additional "return values" like below. The advantage is that the caller has a choice of looking at the extra data members, but is not required to do this. This is the preferred method for complicated data base or networking calls, where lots of additional return info may be needed in case errors occur.
To answer your original question, this example has a method to return the quotient, which is what most callers may need, and additionally, after the method call, you can get the remainder as a data member.
class div{
public:
int remainder;
int quotient(int dividend, int divisor){
remainder = ...;
return ...;
}
};
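A filled-in sketch of that idea, assuming the obvious integer arithmetic for the elided parts. I have renamed the class to Divider (a hypothetical name) so it cannot clash with std::div:

```cpp
// Sketch: the method returns the quotient, and the remainder stays
// available afterwards as a data member.
class Divider {  // hypothetical name, chosen to avoid clashing with std::div
public:
    int remainder = 0;  // updated by the most recent call to quotient()

    int quotient(int dividend, int divisor) {
        remainder = dividend % divisor;
        return dividend / divisor;
    }
};
```

A caller that only needs the quotient can ignore the member entirely; one that needs both reads remainder after the call.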
Boost tuple would be my preferred choice for a generalized system of returning more than one value from a function.
Possible example:
#include "boost/tuple/tuple.hpp"
boost::tuple<int, int> divide(int dividend, int divisor)
{
return boost::make_tuple(dividend / divisor, dividend % divisor);
}
Rather than returning multiple values, return just one of them and pass the others back through reference parameters of the function, e.g.:
int divide(int a, int b, int &rem)
Here is the link to the "core guidelines" (by Bjarne Stroustrup and Herb Sutter) on this topic.
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rf-out-multi
Partial Quote:
F.21: To return multiple “out” values, prefer returning a struct or tuple
Reason A return value is self-documenting as an “output-only” value. Note that C++ does have multiple return values, by convention of using a tuple (including pair), possibly with the extra convenience of tie or structured bindings (C++17) at the call site. Prefer using a named struct where there are semantics to the returned value. Otherwise, a nameless tuple is useful in generic code.
We can declare the function so that it returns a variable of a user-defined structure type, or a pointer to one. By the nature of a structure, we know that a structure in C can hold multiple values of different types (e.g. one int variable, four char variables, two float variables and so on).
I would just do it by reference if it's only a few return values but for more complex types you can also just do it like this :
static struct SomeReturnType {int a,b,c; string str;} SomeFunction()
{
return {1,2,3,string("hello world")}; // make sure you return values in the right order!
}
use "static" to limit the scope of the return type to this compilation unit if it's only meant to be a temporary return type.
SomeReturnType st = SomeFunction();
cout << "a " << st.a << endl;
cout << "b " << st.b << endl;
cout << "c " << st.c << endl;
cout << "str " << st.str << endl;
This is definitely not the prettiest way to do it but it will work.
Quick answer:
#include <iostream>
using namespace std;
// different values of [operate] can return different number.
int yourFunction(int a, int b, int operate)
{
a = 1;
b = 2;
if (operate== 1)
{
return a;
}
else
{
return b;
}
}
int main()
{
int a = 0, b = 0; // initialized so that passing them by value is well-defined
a = yourFunction(a, b, 1); // get return 1
b = yourFunction(a, b, 2); // get return 2
return 0;
}
I want to implement a simple tree with C++11 tuples, in a Python-like fashion. In Python, we can use type(obj) to check an object's run-time type and pass objects of different types to one function. I have written pseudocode for calc(); how do I do this in C++?
I tried printing typeid(child1).name() and typeid(tree).name(); they are 'St5tupleIIciiEE' and 'St5tupleIIcS_IIciiEES0_EE'.
My environment is g++ 4.8.1. Thanks!
// pseudo code
int calc(tuple tree) {
symbol = type(get<0>(tree));
l_child = type(get<1>(tree));
r_child = type(get<2>(tree));
l = (type(l_child) == tuple) ? calc(l_child) : l_child;
r = (type(r_child) == tuple) ? calc(r_child) : r_child;
return l symbol r;
}
int main()
{
auto l_child = make_tuple('*', 1, 2);
auto r_child = make_tuple('-', 5, 1);
auto tree = make_tuple('+', l_child, r_child);
cout << calc(tree) << endl;
}
Python and C++ are very different languages. C++ is statically typed, Python is not. Transplanting Python techniques to C++ may or may not work. In this case it won't work.
In Python, there is only one tuple class, able to represent any tuple; in C++ there is an infinite number of tuple types, each one able to hold specific kinds of data. They are not interchangeable, as your experiment with typeid aptly demonstrates.
In C++, you cannot hold an arbitrary tree in a tuple. Write a tree class (or better, a class template).
Edit: technically, if you combine tuples with pointers and unions, you can get away with tuples. This is however not recommended. Your tree is going to be your central abstraction, exposing such low level details as pointers and unions is counterproductive and should be avoided. The C++ way is to write a class, stick to it.
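As a rough sketch of what "write a tree class" could mean for the expression in the question: the class name Node, its members, and the helper example_tree_value are all my own assumptions. It needs C++14 for std::make_unique.

```cpp
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical expression-tree node: either a leaf holding a value or an
// operator with two children.
class Node {
public:
    explicit Node(int value) : symbol_(0), value_(value) {}

    Node(char symbol, Node left, Node right)
        : symbol_(symbol), value_(0),
          left_(std::make_unique<Node>(std::move(left))),
          right_(std::make_unique<Node>(std::move(right))) {}

    int calc() const {
        if (!left_) return value_;  // leaf: no children
        const int l = left_->calc();
        const int r = right_->calc();
        switch (symbol_) {
            case '+': return l + r;
            case '-': return l - r;
            case '*': return l * r;
            case '/': return l / r;
            default: throw std::invalid_argument("unknown operator");
        }
    }

private:
    char symbol_;
    int value_;
    std::unique_ptr<Node> left_;
    std::unique_ptr<Node> right_;
};

// Builds the tree from the question: (1 * 2) + (5 - 1).
int example_tree_value() {
    Node tree('+', Node('*', Node(1), Node(2)), Node('-', Node(5), Node(1)));
    return tree.calc();
}
```

Unlike the tuple version, every tree has the same type Node regardless of its shape, so it can be built and evaluated at run time.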
Relying on the name is not workable, since the result of typeid().name() is implementation-defined.
const char* name() const noexcept;
Returns: An implementation-defined ntbs.
However, you cannot use the ternary operator here: both branches have to compile, so if l_child is not a tuple, calc(l_child) makes compilation fail.
You can use some type-traits (or overloading), since tuple members are known at compile-time.
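#include <stdexcept>
#include <tuple>
// Leaf case: a plain int evaluates to itself.
int calc(int value)
{
return value;
}
// Node case: (operator, left subtree, right subtree).
template<typename Left, typename Right>
int calc(const std::tuple<char, Left, Right>& tuple)
{
char symbol = std::get<0>(tuple);
int l = calc(std::get<1>(tuple));
int r = calc(std::get<2>(tuple));
switch (symbol) { // apply the operator stored in the node
case '+': return l + r;
case '-': return l - r;
case '*': return l * r;
case '/': return l / r;
default: throw std::invalid_argument("unknown operator");
}
}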
I wonder when we should use a lambda expression over a functor in C++. To me, these two techniques are basically the same, and a functor is even more elegant and cleaner than a lambda. For example, if I want to reuse my predicate, I have to copy the lambda part over and over. So when does a lambda really come into play?
A lambda expression creates a nameless functor; it's syntactic sugar.
So you mainly use it if it makes your code look better. That generally would occur if either (a) you aren't going to reuse the functor, or (b) you are going to reuse it, but from code so totally unrelated to the current code that in order to share it you'd basically end up creating my_favourite_two_line_functors.h, and have disparate files depend on it.
Pretty much the same conditions under which you would type any line(s) of code, and not abstract that code block into a function.
That said, with range-for statements in C++0x, there are some places where you would have used a functor before where it might well make your code look better now to write the code as a loop body, not a functor or a lambda.
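To illustrate the range-for point, here is the same small task written both ways. The function names are hypothetical:

```cpp
#include <algorithm>
#include <vector>

// With an algorithm and a lambda as the predicate:
int count_multiples_lambda(const std::vector<int>& values, int divisor) {
    return static_cast<int>(std::count_if(
        values.begin(), values.end(),
        [divisor](int v) { return v % divisor == 0; }));
}

// With a range-for loop, the predicate simply becomes the loop body:
int count_multiples_loop(const std::vector<int>& values, int divisor) {
    int count = 0;
    for (int v : values) {
        if (v % divisor == 0) ++count;
    }
    return count;
}
```

Which one reads better is a matter of taste; the point is only that neither needs a separately defined functor.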
1) It's trivial and trying to share it is more work than benefit.
2) Defining a functor simply adds complexity (due to having to make a bunch of member variables and crap).
If neither of those things is true then maybe you should think about defining a functor.
Edit: it seems to be that you need an example of when it would be nice to use a lambda over a functor. Here you go:
typedef std::vector< std::pair<int,std::string> > whatsit_t;
int find_it(std::string value, whatsit_t const& stuff)
{
auto fit = std::find_if(stuff.begin(), stuff.end(), [value](whatsit_t::value_type const& vt) -> bool { return vt.second == value; });
if (fit == stuff.end()) throw std::wtf_error();
return fit->first;
}
Without lambdas you'd have to use something that similarly constructs a functor on the spot or write an externally linkable functor object for something that's annoyingly trivial.
BTW, I think maybe wtf_error is an extension.
Lambdas are basically just syntactic sugar that implement functors (NB: closures are not simple.) In C++0x, you can use the auto keyword to store lambdas locally, and std::function will enable you to store lambdas, or pass them around in a type-safe manner.
Check out the Wikipedia article on C++0x.
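A small sketch of the auto versus std::function point above. The names apply_all and demo are hypothetical:

```cpp
#include <functional>
#include <vector>

// std::function lets differently-typed callables share one container type.
int apply_all(const std::vector<std::function<int(int)>>& steps, int value) {
    for (const auto& step : steps) value = step(value);
    return value;
}

int demo() {
    int offset = 10;
    // auto keeps the exact (unnamed) closure type.
    auto add_offset = [offset](int x) { return x + offset; };
    // std::function erases the type so the lambda can be passed around.
    std::function<int(int)> twice = [](int x) { return 2 * x; };
    return apply_all({add_offset, twice}, 1);  // computes (1 + 10) * 2
}
```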
Small functions that are not repeated.
The main complaint about functors is that they are not in the same place where they are used. So you had to find and read the functor out of context from the place it was being used in (even if it was only being used in one place).
The other problem was that a functor required some wiring to get parameters into the functor object. Not complex, but all basic boilerplate code. And boilerplate is susceptible to cut-and-paste problems.
Lambdas try to fix both of these. But I would use functors if the function is repeated in multiple places or is larger than (can't think up an appropriate term as it will be context sensitive) small.
Lambdas and functors have context. A functor is a class and therefore can be more complex than a lambda. A function has no context.
#include <iostream>
#include <list>
#include <vector>
using namespace std;
//Functions have no context, mod is always 3
bool myFunc(int n) { return n % 3 == 0; }
//Functors have context, e.g. _v
//Functors can be more complex, e.g. additional addNum(...) method
class FunctorV
{
public:
FunctorV(int num ) : _v{num} {}
void addNum(int num) { _v.push_back(num); }
bool operator() (int num)
{
for(int i : _v) {
if( num % i == 0)
return true;
}
return false;
}
private:
vector<int> _v;
};
void print(string prefix,list<int>& l)
{
cout << prefix << "l={ ";
for(int i : l)
cout << i << " ";
cout << "}" << endl;
}
int main()
{
list<int> l={1,2,3,4,5,6,7,8,9};
print("initial for each test: ",l);
cout << endl;
//function, so no context.
l.remove_if(myFunc);
print("function mod 3: ",l);
cout << endl;
//nameless lambda, context is x
l={1,2,3,4,5,6,7,8,9};
int x = 3;
l.remove_if([x](int n){ return n % x == 0; });
print("lambda mod x=3: ",l);
x = 4;
l.remove_if([x](int n){ return n % x == 0; });
print("lambda mod x=4: ",l);
cout << endl;
//functor has context and can be more complex
l={1,2,3,4,5,6,7,8,9};
FunctorV myFunctor(3);
myFunctor.addNum(4);
l.remove_if(myFunctor);
print("functor mod v={3,4}: ",l);
return 0;
}
Output:
initial for each test: l={ 1 2 3 4 5 6 7 8 9 }
function mod 3: l={ 1 2 4 5 7 8 }
lambda mod x=3: l={ 1 2 4 5 7 8 }
lambda mod x=4: l={ 1 2 5 7 }
functor mod v={3,4}: l={ 1 2 5 7 }
First, I would like to clear up some confusion here.
There are two different things
Lambda function
Lambda expression/functor.
A lambda expression, i.e. []() -> return-type {}, yields a closure object (a kind of functor), although the compiler may optimize it away entirely. A lambda that captures nothing can also be converted to a plain function pointer, and you can force that conversion by putting a + sign before [], as in +[]() -> return-type {}.
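A short sketch of the difference the + makes. The variable names are hypothetical, and this only works for lambdas with an empty capture list:

```cpp
#include <type_traits>

// Without the +, auto deduces the unique closure type.
auto closure = [](int x) { return x * 2; };

// With the +, the captureless lambda is converted to a function pointer.
auto pointer = +[](int x) { return x * 2; };

static_assert(std::is_same<decltype(pointer), int (*)(int)>::value,
              "unary + yields a plain function pointer");
static_assert(!std::is_same<decltype(closure), int (*)(int)>::value,
              "the closure itself is a distinct class type");
```

Both can be called the same way; only their types differ.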
Now, coming to your question. You can use lambda repeatedly as follows:
int main()
{
auto print = [i=0] () mutable {return i++;};
cout<<print()<<endl;
cout<<print()<<endl;
cout<<print()<<endl;
// Call as many time as you want
return 0;
}
You should use a lambda wherever it comes to mind as improving code expressiveness and maintainability; for example, in custom deleters for smart pointers and with most of the STL algorithms.
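For the custom-deleter case, a minimal sketch might look like this. The function name and the call counter are assumptions, added only so the effect is observable:

```cpp
#include <memory>

// Returns how many times the lambda deleter ran.
int count_deleter_calls() {
    int calls = 0;
    auto deleter = [&calls](int* p) {
        ++calls;   // record that the deleter was invoked
        delete p;
    };
    {
        // The deleter's type is part of the unique_ptr type.
        std::unique_ptr<int, decltype(deleter)> owner(new int(42), deleter);
    }  // owner goes out of scope here and invokes the lambda
    return calls;
}
```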
If you combine lambdas with other features such as constexpr, variadic template parameter packs, or generic lambdas, you can achieve many things.
As you pointed out, it works best when you need a one-off and the coding overhead of writing it out as a function isn't worth it.
Conceptually, the decision of which to use is driven by the same criterion as using a named variable versus an in-place expression or constant...
size_t length = strlen(x) + sizeof(y) + z++ + strlen('\0');
...
allocate(length);
std::cout << length;
...here, creating a length variable encourages the programmer to consider its correctness and meaning in isolation from its later use. The name hopefully conveys enough that it can be understood intuitively and independently of its initial value. It then allows the value to be used several times without repeating the expression (while handling z being different). While here...
allocate(strlen(x) + sizeof(y) + z++ + strlen('\0'));
...the total code is reduced and the value is localised at the point it's needed. The only thing to "carry forwards" from a reading of this line is the side effects of allocation and increment (z), but there's no extra local variable with scope or later use to consider. The programmer has to mentally juggle less state while continuing their analysis of the code.
The same distinction applies to functions versus inline statements. For the purposes of answering your question, functors versus lambdas can be seen as just a particular case of this function versus inlining decision.
I tend to prefer Functors over Lambdas these days. Although they require more code, Functors yield cleaner algorithms. The comparison below between find_id and find_id2 showcases that result. While both yield sufficiently clean code, find_id2 is slightly easier to read as the MatchName(name) definition is extracted from (and secondary to) the primary algorithm.
I would argue, however, that the Functor code should be placed inside implementation files right above the function definition where it is used to provide direct access to the function definition. Otherwise a Lambda would be better for code-locality/organization.
#include <iostream>
#include <vector>
#include <string>
#include <algorithm> // for find_if
using namespace std;
struct Person {
int id;
string name;
};
typedef vector<Person> People;
int find_id(string const& name, People const& people) {
auto MatchName = [name](Person const& p) -> bool
{
return p.name == name;
};
auto found = find_if(people.begin(), people.end(), MatchName);
if (found == people.end()) return -1;
return found->id;
}
struct MatchName {
string const& name;
MatchName(string const& name) : name(name) {}
bool operator() (Person const& person)
{
return person.name == name;
}
};
int find_id2(string const& name, People const& people) {
auto found = find_if(people.begin(), people.end(), MatchName(name));
if (found == people.end()) return -1;
return found->id;
}
int main() {
People people { {0, "Jim"}, {1, "Pam"}, {2, "Dwight"} };
cout << "Pam's ID is " << find_id("Pam", people) << endl;
cout << "Dwight's ID is " << find_id2("Dwight", people) << endl;
}
The Functor is self-documenting by default, but Lambdas need to be stored in variables (to be self-documenting) inside more complex algorithm definitions. Hence, for code readability, it is preferable not to use Lambdas inline as many people do, in order to gain the self-documenting benefit shown above in the MatchName Lambda.
When a Lambda is stored in a variable at the call-site (or used inline), primary algorithms are slightly more difficult to read. Since Lambdas are secondary in nature to the algorithms where they are used, it is preferable to clean up the primary algorithms by using self-documenting subroutines (e.g. Functors). This might not matter as much in this example, but if one wanted to use more complex algorithms it can significantly reduce the burden of interpreting code.
Functors can be as simple (as in the example above) or as complex as they need to be. Sometimes complexity is desirable, as in cases that call for dynamic polymorphism (e.g. the strategy/decorator design patterns, or their template equivalent, policy types). This is a use case Lambdas cannot satisfy.
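As a sketch of the dynamic-polymorphism case, here is a strategy-style functor hierarchy. The interface PersonPredicate, the MaxId strategy, and count_matching are hypothetical names; Person and MatchName mirror the example above, except that this MatchName stores the name by value:

```cpp
#include <string>
#include <utility>
#include <vector>

struct Person {
    int id;
    std::string name;
};

// Hypothetical strategy interface: a functor with a virtual call operator.
struct PersonPredicate {
    virtual ~PersonPredicate() = default;
    virtual bool operator()(const Person& p) const = 0;
};

struct MatchName : PersonPredicate {
    explicit MatchName(std::string name) : name_(std::move(name)) {}
    bool operator()(const Person& p) const override { return p.name == name_; }
private:
    std::string name_;
};

struct MaxId : PersonPredicate {
    explicit MaxId(int max_id) : max_id_(max_id) {}
    bool operator()(const Person& p) const override { return p.id <= max_id_; }
private:
    int max_id_;
};

// The algorithm depends only on the interface; the concrete strategy can
// be chosen at run time.
int count_matching(const std::vector<Person>& people,
                   const PersonPredicate& matches) {
    int count = 0;
    for (const Person& p : people) {
        if (matches(p)) ++count;
    }
    return count;
}
```

Here count_matching is compiled once and works with any predicate derived from the interface.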
Functors require explicit declaration of capture variables without polluting primary algorithms. When more and more capture variables are required by Lambdas, the tendency is to use a blanket capture like [=]. But this reduces readability greatly, as one must mentally jump between the Lambda definition and all surrounding local variables, possibly member variables, and more.