Lambda Expression vs Functor in C++ - c++

I wonder when we should use a lambda expression over a functor in C++. To me, the two techniques are basically the same; if anything, a functor is more elegant and cleaner than a lambda. For example, if I want to reuse my predicate, I have to copy the lambda over and over. So when does a lambda really come into its own?

A lambda expression creates a nameless functor; it's syntactic sugar.
So you mainly use it if it makes your code look better. That generally would occur if either (a) you aren't going to reuse the functor, or (b) you are going to reuse it, but from code so totally unrelated to the current code that in order to share it you'd basically end up creating my_favourite_two_line_functors.h, and have disparate files depend on it.
Pretty much the same conditions under which you would type any line(s) of code, and not abstract that code block into a function.
That said, with range-for statements in C++0x, there are some places where you would have used a functor before where it might well make your code look better now to write the code as a loop body, not a functor or a lambda.
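To make the "syntactic sugar" point concrete, here is a minimal sketch of a lambda next to a hand-written functor doing the same job. The names GreaterThan, count_with_lambda, count_with_functor and check_equivalence are made up for illustration, and the struct is only roughly what a compiler generates for the closure, not its exact output.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hand-written functor: the captured value becomes a data member,
// and the lambda body becomes a const operator().
struct GreaterThan {
    int threshold;
    explicit GreaterThan(int t) : threshold(t) {}
    bool operator()(int x) const { return x > threshold; }
};

// Counts elements above threshold using a lambda that captures it by value.
int count_with_lambda(const std::vector<int>& v, int threshold) {
    return static_cast<int>(std::count_if(v.begin(), v.end(),
        [threshold](int x) { return x > threshold; }));
}

// Same computation, spelled out with the named functor.
int count_with_functor(const std::vector<int>& v, int threshold) {
    return static_cast<int>(std::count_if(v.begin(), v.end(), GreaterThan(threshold)));
}

// Usage sketch: both spellings give the same answer.
void check_equivalence() {
    std::vector<int> v = {1, 5, 10, 15};
    assert(count_with_lambda(v, 5) == 2);   // 10 and 15
    assert(count_with_functor(v, 5) == 2);
}
```

If the predicate is only needed in this one place, the lambda saves the struct; if it is needed in several places, the named struct saves the copy-and-paste.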

1) It's trivial and trying to share it is more work than benefit.
2) Defining a functor simply adds complexity (due to having to make a bunch of member variables and crap).
If neither of those things is true then maybe you should think about defining a functor.
Edit: it seems to me that you need an example of when it would be nice to use a lambda over a functor. Here you go:
typedef std::vector< std::pair<int,std::string> > whatsit_t;
int find_it(std::string value, whatsit_t const& stuff)
{
auto fit = std::find_if(stuff.begin(), stuff.end(), [value](whatsit_t::value_type const& vt) -> bool { return vt.second == value; });
if (fit == stuff.end()) throw std::wtf_error();
return fit->first;
}
Without lambdas you'd have to use something that similarly constructs a functor on the spot or write an externally linkable functor object for something that's annoyingly trivial.
BTW, I think maybe wtf_error is an extension.

Lambdas are basically just syntactic sugar that implement functors (NB: closures are not simple.) In C++0x, you can use the auto keyword to store lambdas locally, and std::function will enable you to store lambdas, or pass them around in a type-safe manner.
Check out the Wikipedia article on C++0x.
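As a rough illustration of the auto versus std::function point, here is a small sketch. The helper names apply_twice, make_adder and demo_storage are hypothetical; the only library facility assumed is std::function from <functional>.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Accepts any callable with signature int(int); the concrete type is erased.
int apply_twice(const std::function<int(int)>& f, int x) {
    return f(f(x));
}

// Returns a closure; std::function gives the return type a name.
std::function<int(int)> make_adder(int n) {
    return [n](int x) { return x + n; };
}

void demo_storage() {
    // auto keeps the exact (unnameable) closure type: no type erasure.
    auto doubler = [](int x) { return x * 2; };
    assert(doubler(21) == 42);

    // Different closure types can share one container via std::function.
    std::vector<std::function<int(int)>> ops;
    ops.push_back(doubler);
    ops.push_back(make_adder(3));
    assert(ops[0](4) == 8);
    assert(ops[1](4) == 7);
    assert(apply_twice(ops[1], 0) == 6);  // (0 + 3) + 3
}
```

auto is the cheaper choice when the lambda stays local; std::function is for when the type has to be written down, for example in a function signature or a container.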

Small functions that are not repeated.
The main complaint about functors is that they are not in the same place that they are used. So you have to find and read the functor out of context from the place it is being used (even if it is only being used in one place).
The other problem is that functors require some wiring to get parameters into the functor object. Not complex, but all basic boilerplate code. And boilerplate is susceptible to cut-and-paste problems.
Lambdas try to fix both of these. But I would use functors if the function is repeated in multiple places or is larger than (can't think up an appropriate term as it will be context sensitive) small.

Lambdas and functors have context. A functor is a class and can therefore be more complex than a lambda. A function has no context.
#include <iostream>
#include <list>
#include <string>
#include <vector>
using namespace std;
//Functions have no context, mod is always 3
bool myFunc(int n) { return n % 3 == 0; }
//Functors have context, e.g. _v
//Functors can be more complex, e.g. additional addNum(...) method
class FunctorV
{
public:
FunctorV(int num ) : _v{num} {}
void addNum(int num) { _v.push_back(num); }
bool operator() (int num)
{
for(int i : _v) {
if( num % i == 0)
return true;
}
return false;
}
private:
vector<int> _v;
};
void print(string prefix,list<int>& l)
{
cout << prefix << "l={ ";
for(int i : l)
cout << i << " ";
cout << "}" << endl;
}
int main()
{
list<int> l={1,2,3,4,5,6,7,8,9};
print("initial for each test: ",l);
cout << endl;
//function, so no context.
l.remove_if(myFunc);
print("function mod 3: ",l);
cout << endl;
//nameless lambda, context is x
l={1,2,3,4,5,6,7,8,9};
int x = 3;
l.remove_if([x](int n){ return n % x == 0; });
print("lambda mod x=3: ",l);
x = 4;
l.remove_if([x](int n){ return n % x == 0; });
print("lambda mod x=4: ",l);
cout << endl;
//functor has context and can be more complex
l={1,2,3,4,5,6,7,8,9};
FunctorV myFunctor(3);
myFunctor.addNum(4);
l.remove_if(myFunctor);
print("functor mod v={3,4}: ",l);
return 0;
}
Output:
initial for each test: l={ 1 2 3 4 5 6 7 8 9 }
function mod 3: l={ 1 2 4 5 7 8 }
lambda mod x=3: l={ 1 2 4 5 7 8 }
lambda mod x=4: l={ 1 2 5 7 }
functor mod v={3,4}: l={ 1 2 5 7 }

First, I would like to clear up some confusion here.
There are two different things:
Lambda function
Lambda expression/functor.
A lambda expression, i.e. [] () -> return-type {}, evaluates to a closure object (a kind of functor), although a compiler is free to optimize that object away. A lambda that captures nothing can also be converted to a plain function pointer, and you can force that conversion by putting a + sign before the [], as in +[] () -> return-type {}.
Now, coming to your question. You can use lambda repeatedly as follows:
#include <iostream>
using namespace std;
int main()
{
// the init-capture [i=0] requires C++14
auto print = [i=0] () mutable {return i++;};
cout<<print()<<endl;
cout<<print()<<endl;
cout<<print()<<endl;
// Call as many times as you want
return 0;
}
You should use a lambda wherever it improves code expressiveness and maintainability; for example, you can use one as a custom deleter for smart pointers, and with most of the STL algorithms.
If you combine lambdas with other features such as constexpr, variadic template parameter packs or generic lambdas, you can achieve many things.
You can find more about it here
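Here is a small sketch of the unary + trick mentioned above. The typedef IntOp and the functions call_through_pointer and demo_unary_plus are names I made up; the conversion itself only applies to lambdas with an empty capture list.

```cpp
#include <cassert>
#include <type_traits>

// A plain function-pointer type, as a C-style API might expect.
typedef int (*IntOp)(int);

int call_through_pointer(IntOp op, int x) { return op(x); }

void demo_unary_plus() {
    // Without '+', auto deduces the closure type.
    auto closure = [](int x) { return x + 1; };
    // With '+', the captureless closure is converted to a function pointer first.
    auto pointer = +[](int x) { return x + 1; };

    static_assert(!std::is_same<decltype(closure), IntOp>::value,
                  "a closure has its own unique type");
    static_assert(std::is_same<decltype(pointer), IntOp>::value,
                  "unary + yields a function pointer");

    assert(call_through_pointer(closure, 1) == 2);  // implicit conversion also works
    assert(call_through_pointer(pointer, 1) == 2);
}
```

As soon as the lambda captures anything, neither the implicit conversion nor the + form compiles, because the captured state has nowhere to live in a bare function pointer.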

As you pointed out, it works best when you need a one-off and the coding overhead of writing it out as a function isn't worth it.

Conceptually, the decision of which to use is driven by the same criterion as using a named variable versus an in-place expression or constant...
size_t length = strlen(x) + sizeof(y) + z++ + sizeof('\0');
...
allocate(length);
std::cout << length;
...here, creating a length variable encourages the programmer to consider its correctness and meaning in isolation from its later use. The name hopefully conveys enough that it can be understood intuitively and independently of its initial value. It then allows the value to be used several times without repeating the expression (while handling z being different). While here...
allocate(strlen(x) + sizeof(y) + z++ + sizeof('\0'));
...the total code is reduced and the value is localised at the point it's needed. The only thing to "carry forwards" from a reading of this line is the side effects of allocation and increment (z), but there's no extra local variable with scope or later use to consider. The programmer has to mentally juggle less state while continuing their analysis of the code.
The same distinction applies to functions versus inline statements. For the purposes of answering your question, functors versus lambdas can be seen as just a particular case of this function versus inlining decision.

I tend to prefer Functors over Lambdas these days. Although they require more code, Functors yield cleaner algorithms. The comparison below between find_id and find_id2 showcases that result. While both yield sufficiently clean code, find_id2 is slightly easier to read, as the MatchName(name) definition is extracted from (and secondary to) the primary algorithm.
I would argue, however, that the Functor code should be placed inside implementation files right above the function definition where it is used to provide direct access to the function definition. Otherwise a Lambda would be better for code-locality/organization.
#include <algorithm>
#include <iostream>
#include <vector>
#include <string>
using namespace std;
struct Person {
int id;
string name;
};
typedef vector<Person> People;
int find_id(string const& name, People const& people) {
auto MatchName = [name](Person const& p) -> bool
{
return p.name == name;
};
auto found = find_if(people.begin(), people.end(), MatchName);
if (found == people.end()) return -1;
return found->id;
}
struct MatchName {
string const& name;
MatchName(string const& name) : name(name) {}
bool operator() (Person const& person)
{
return person.name == name;
}
};
int find_id2(string const& name, People const& people) {
auto found = find_if(people.begin(), people.end(), MatchName(name));
if (found == people.end()) return -1;
return found->id;
}
int main() {
People people { {0, "Jim"}, {1, "Pam"}, {2, "Dwight"} };
cout << "Pam's ID is " << find_id("Pam", people) << endl;
cout << "Dwight's ID is " << find_id2("Dwight", people) << endl;
}
A Functor is self-documenting by default, whereas Lambdas need to be stored in named variables to be self-documenting inside more complex algorithm definitions. Hence, for readability, it is preferable not to use Lambdas inline as many people do, but to name them, as the MatchName Lambda above does.
When a Lambda is stored in a variable at the call-site (or used inline), primary algorithms are slightly more difficult to read. Since Lambdas are secondary in nature to the algorithms where they are used, it is preferable to clean up the primary algorithms by using self-documenting subroutines (e.g. Functors). This might not matter much in this example, but with more complex algorithms it can significantly reduce the burden of interpreting code.
Functors can be as simple (as in the example above) or as complex as they need to be. Sometimes complexity is desirable, for example in cases that call for dynamic polymorphism (e.g. the strategy/decorator design patterns, or their template equivalent, policy types). This is a use case Lambdas cannot satisfy.
Functors require explicit declaration of capture variables without polluting primary algorithms. As more and more capture variables are required by a Lambda, the tendency is to use a blanket capture like [=]. But this greatly reduces readability, as one must mentally jump between the Lambda definition and all surrounding local variables, possibly member variables, and more.
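One way to read the policy-type point is the sketch below, where a functor's type alone configures a container. ByAbsoluteValue, AbsSet and demo_policy are hypothetical names; the assumption being illustrated is that closure types are not default-constructible before C++20, so a lambda cannot be plugged in by type alone.

```cpp
#include <cassert>
#include <cstdlib>
#include <set>

// A named, default-constructible comparison functor: orders ints by absolute value.
struct ByAbsoluteValue {
    bool operator()(int a, int b) const { return std::abs(a) < std::abs(b); }
};

// The functor's type alone is enough; the set default-constructs the comparator.
typedef std::set<int, ByAbsoluteValue> AbsSet;

void demo_policy() {
    AbsSet s;
    s.insert(-3);
    s.insert(1);
    s.insert(3);   // equivalent to -3 under this ordering, so it is not inserted
    s.insert(-2);
    assert(s.size() == 3);
    assert(*s.begin() == 1);

    // With a lambda (before C++20) the closure object itself must be passed in,
    // because the set cannot default-construct the closure type.
    auto by_abs = [](int a, int b) { return std::abs(a) < std::abs(b); };
    std::set<int, decltype(by_abs)> t(by_abs);
    t.insert(-3);
    t.insert(3);
    assert(t.size() == 1);
}
```

The lambda version works, but every place that creates such a set has to have the closure object at hand, which is the kind of friction a named Functor avoids.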

Related

Is it possible / advisable to return a range?

I'm using the ranges library to help filter data in my classes, like this:
class MyClass
{
public:
MyClass(std::vector<int> v) : vec(v) {}
std::vector<int> getEvens() const
{
auto evens = vec | ranges::views::filter([](int i) { return ! (i % 2); });
return std::vector<int>(evens.begin(), evens.end());
}
private:
std::vector<int> vec;
};
In this case, a new vector is constructed in the getEvens() function. To save on this overhead, I'm wondering if it is possible / advisable to return the range directly from the function?
class MyClass
{
public:
using RangeReturnType = ???;
MyClass(std::vector<int> v) : vec(v) {}
RangeReturnType getEvens() const
{
auto evens = vec | ranges::views::filter([](int i) { return ! (i % 2); });
// ...
return evens;
}
private:
std::vector<int> vec;
};
If it is possible, are there any lifetime considerations that I need to take into account?
I am also interested to know if it is possible / advisable to pass a range in as an argument, or to store it as a member variable. Or is the ranges library more intended for use within the scope of a single function?
This was asked in the OP's comment section, but I think I will respond to it in the answer section:
The Ranges library seems promising, but I'm a little apprehensive about this returning auto.
Remember that even with the addition of auto, C++ is a strongly typed language. In your case, since you are returning evens, the return type will be the same as the type of evens. (Technically it will be the value type of evens, but evens was a value type anyway.)
In fact, you probably really don't want to type out the return type manually: std::ranges::filter_view<std::ranges::ref_view<const std::vector<int>>, MyClass::getEvens() const::<decltype([](int i) {return ! (i % 2);})>> (141 characters)
As mentioned by @Caleth in the comments, this in fact wouldn't work either: the type of evens includes a lambda defined inside the function, and two different lambdas have different types even if they are otherwise identical, so there's literally no way of spelling out the full return type here.
While there might be debates on whether to use auto or not in different cases, I believe most people would just use auto here. Plus, your evens was declared with auto too; typing the type out would just make it less readable here.
So what are my options if I want to access a subset (for instance even numbers)? Are there any other approaches I should be considering, with or without the Ranges library?
Depending on how you would access the returned data and on the type of the data, you might consider returning std::vector<T*>.
views are really supposed to be viewed from start to end. While you could use views::drop and views::take to limit to a single element, it doesn't provide a subscript operator (yet).
There will also be computational differences. A vector needs to be computed beforehand, whereas views are computed while iterating. So when you do:
for(auto i : myObject.getEvens())
{
std::cout << i;
}
Under the hood, it is basically doing:
for(auto i : myObject.vec)
{
if(!(i % 2)) std::cout << i;
}
Depending on the amount of data and the complexity of the computations, views might be a lot faster than, or about the same as, the vector method. Plus you can easily apply multiple filters on the same range without iterating through the data multiple times.
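To show the "computed while iterating" behaviour without depending on any ranges library, here is a deliberately tiny hand-rolled stand-in. LazyFilter, make_lazy_filter and demo_lazy are my own illustrative names and are not part of range-v3 or the standard library; a real filter view is considerably more general.

```cpp
#include <cassert>
#include <vector>

// A minimal stand-in for a filter view: it stores a pointer to the source
// vector and a predicate, and does no work until it is iterated.
template <typename Pred>
struct LazyFilter {
    const std::vector<int>* source;  // non-owning: the vector must outlive the view
    Pred pred;

    // Visits each element that satisfies pred; the filtering happens here.
    template <typename Visitor>
    void for_each(Visitor visit) const {
        for (int x : *source) {
            if (pred(x)) visit(x);
        }
    }
};

template <typename Pred>
LazyFilter<Pred> make_lazy_filter(const std::vector<int>& v, Pred pred) {
    return LazyFilter<Pred>{&v, pred};
}

void demo_lazy() {
    std::vector<int> data = {1, 2, 3, 4, 5, 6};
    int predicate_calls = 0;
    auto evens = make_lazy_filter(data, [&predicate_calls](int i) {
        ++predicate_calls;
        return i % 2 == 0;
    });
    assert(predicate_calls == 0);  // nothing has been computed yet

    std::vector<int> seen;
    evens.for_each([&seen](int i) { seen.push_back(i); });
    assert(predicate_calls == 6);  // the predicate ran during iteration
    assert((seen == std::vector<int>{2, 4, 6}));
}
```

The non-owning pointer is also why lifetime matters: if data were destroyed before for_each ran, the stand-in would be reading freed memory.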
In the end, you can always store the view in a vector:
std::vector<int> vec2(evens.begin(), evens.end());
So my suggestion is, if you have the ranges library, then you should use it.
If not, then vector<T>, vector<T*>, vector<index> depending on the size and copiability of T.
There are no restrictions on the usage of components of the STL in the standard. Of course, there are best practices (e.g., string_view instead of string const &).
In this case, I can foresee no problems with handling the view return type directly. That said, the best practices are yet to be decided on since the standard is so new and no compiler has a complete implementation yet.
You're fine to go with the following, in my opinion:
class MyClass
{
public:
MyClass(std::vector<int> v) : vec(std::move(v)) {}
auto getEvens() const
{
return vec | ranges::views::filter([](int i) { return ! (i % 2); });
}
private:
std::vector<int> vec;
};
As you can see here, a range is just something on which you can call begin and end. Nothing more than that.
For instance, you can use the result of begin(range), which is an iterator, to traverse the range, using the ++ operator to advance it.
In general, looking back at the concept I linked above, you can use a range whenever the surrounding code only needs to be able to call begin and end on it.
Whether this is advisable or enough depends on what you need to do with it. If your intention is to pass evens to a function which expects a std::vector (for instance it's a function you cannot change, and it calls .push_back on the entity we are talking about), you clearly have to make a std::vector out of filter's output, which I'd do via
auto evens = vec | ranges::views::filter(whatever) | ranges::to_vector;
but if all the function which you pass evens to does is to loop on it, then
return vec | ranges::views::filter(whatever);
is just fine.
As regards lifetime considerations, a view is to a range of values what a pointer is to the pointed-to entity: if the latter is destroyed, the former will be dangling, and making improper use of it will be undefined behavior. This is an erroneous program:
#include <iostream>
#include <range/v3/view/filter.hpp>
#include <string>
#include <vector>
using namespace ranges;
using namespace ranges::views;
auto f() {
// a local vector here
std::vector<std::string> vec{"zero","one","two","three","four","five"};
// return a view on the local vector
return vec | filter([](auto){ return true; });
} // vec is gone ---> the view returned is dangling
int main()
{
// the following throws std::bad_alloc for me
for (auto i : f()) {
std::cout << i << std::endl;
}
}
You can use ranges::any_view as a type erasure mechanism for any range or combination of ranges.
ranges::any_view<int> getEvens() const
{
return vec | ranges::views::filter([](int i) { return ! (i % 2); });
}
I cannot see any equivalent of this in the STL ranges library; please edit the answer if you can.
EDIT: The problem with ranges::any_view is that it is very slow and inefficient. See https://github.com/ericniebler/range-v3/issues/714.
It is desirable to declare a function returning a range in a header and define it in a cpp file
for compilation firewalls (compilation speed)
stop the language server from going crazy
for better factoring of the code
However, there are complications that make it not advisable:
How to get type of a view?
If defining it in a header is fine, use auto
If performance is not an issue, I would recommend ranges::any_view
Otherwise I'd say it is not advisable.

Lambda closure vs simple argument?

For lambda expressions, I don't quite get the usefulness of closures in C++11.
auto f = [] (int n, int m) { return n + m; };
std::cout << f(2,2);
versus.
int n = 2;
auto f = [n] (int m) { return n + m; };
std::cout << f(2);
This is a very basic and primitive example. I'm guessing that closures play an important part in other kinds of statements, but my C++ book doesn't clarify this (so far).
Why not include the closure as a parameter?
OK, a simple example, remove all the x's from a string
char x = 'x';
std::string s = "Text to remove all 'x's from";
s.erase(std::remove_if(s.begin(), s.end(), [x](char c) {return x == c;}), s.end());
Borrowed and modified from http://en.cppreference.com/w/cpp/algorithm/remove
In this example, remove_if() only takes a single parameter, but I need two values for the comparison.
Closures are not always called immediately. They are objects which can be stored and called later when the data necessary to successfully execute the lambda function may no longer be in scope or easily accessible from the call site.
It's possible to store any necessary data along with the closure but it's so much simpler for the closure to grab anything it needs when it's created and use it when it's eventually called. It provides a form of encapsulation.
This also decreases code coupling because if you were to store the data along with the code then the caller could only work with the specific objects you decided to store. Since a closure carries its own data along with it, it can work with any data it needs.
Here's a greatly oversimplified real-life example. I built a database server which needed to support fields with multiple values. The problem was that when results were displayed, it was important to highlight which values actually caused a record to match the search criteria. So, the query parser would spit out a predicate in the form of a closure which would indicate whether or not it was a matching value.
It looked something like this:
std::function< bool(int value) > parser::match_int(int search_val) {
return [=](int value) { return value == search_val; };
}
That closure got stored in a collection. When it was time to render the record, I could easily determine which values needed to be highlighted. Keep in mind that the parser and any associated data is now gone:
void render_values(std::function< bool(int value) > pred, std::vector<int> values) {
for (int value : values) {
if (pred(value))
render_highlight(value);
else
render_normal(value);
}
}
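For anyone who wants to run the idea, here is a self-contained sketch of the same pattern. It is not the original server code: make_match_int, render_values and demo_deferred are stand-ins I made up, and "rendering" is reduced to building a string with matches wrapped in asterisks.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

typedef std::function<bool(int)> IntPredicate;

// Stand-in for the parser: builds a predicate that remembers search_val.
IntPredicate make_match_int(int search_val) {
    return [search_val](int value) { return value == search_val; };
}

// Stand-in for rendering: wraps matching values in asterisks.
std::string render_values(const IntPredicate& pred, const std::vector<int>& values) {
    std::string out;
    for (int value : values) {
        if (!out.empty()) out += ' ';
        if (pred(value))
            out += "*" + std::to_string(value) + "*";
        else
            out += std::to_string(value);
    }
    return out;
}

void demo_deferred() {
    std::vector<IntPredicate> stored;
    {
        int wanted = 7;                       // local to this scope
        stored.push_back(make_match_int(wanted));
    }                                         // 'wanted' is gone; the captured copy lives on
    assert(render_values(stored[0], {3, 7, 9}) == "3 *7* 9");
}
```

The renderer never learns what the search value was; it only needs something callable with an int, which is the decoupling described above.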

Why use functors over functions?

Compare
double average = CalculateAverage(values.begin(), values.end());
with
double average = std::for_each(values.begin(), values.end(), CalculateAverage());
What are the benefits of using a functor over a function? Isn't the first a lot easier to read (even before the implementation is added)?
Assume the functor is defined like this:
class CalculateAverage
{
private:
std::size_t num;
double sum;
public:
CalculateAverage() : num (0) , sum (0)
{
}
void operator () (double elem)
{
num++;
sum += elem;
}
operator double() const
{
return sum / num;
}
};
At least four good reasons:
Separation of concerns
In your particular example, the functor-based approach has the advantage of separating the iteration logic from the average-calculation logic. So you can use your functor in other situations (think about all the other algorithms in the STL), and you can use other functors with for_each.
Parameterisation
You can parameterise a functor more easily. So for instance, you could have a CalculateAverageOfPowers functor that takes the average of the squares, or cubes, etc. of your data, which would be written thus:
class CalculateAverageOfPowers
{
public:
CalculateAverageOfPowers(float p) : acc(0), n(0), p(p) {}
void operator() (float x) { acc += pow(x, p); n++; }
float getAverage() const { return acc / n; }
private:
float acc;
int n;
float p;
};
You could of course do the same thing with a traditional function, but that makes it difficult to use with function pointers, because it has a different prototype from CalculateAverage.
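A possible way to use such a parameterised functor is sketched below. The class is repeated so the snippet stands alone, with double in place of float to keep the arithmetic simple; the wrapper mean_of_powers is a hypothetical helper, not something from the answer.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

class CalculateAverageOfPowers {
public:
    explicit CalculateAverageOfPowers(double p) : acc(0), n(0), p(p) {}
    void operator()(double x) { acc += std::pow(x, p); ++n; }
    double getAverage() const { return acc / n; }
private:
    double acc;  // running sum of x^p
    int n;       // number of elements seen
    double p;    // the exponent, fixed at construction
};

// std::for_each returns the functor it was given, after it has seen every element.
double mean_of_powers(const std::vector<double>& data, double power) {
    assert(!data.empty());  // an empty range would divide by zero
    return std::for_each(data.begin(), data.end(),
                         CalculateAverageOfPowers(power)).getAverage();
}
```

For the data {1, 2, 3}, mean_of_powers(data, 2.0) averages 1, 4 and 9. The exponent travels inside the functor, so the call signature that for_each sees stays the same for every power.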
Statefulness
And as functors can be stateful, you could do something like this:
CalculateAverage avg;
avg = std::for_each(dataA.begin(), dataA.end(), avg);
avg = std::for_each(dataB.begin(), dataB.end(), avg);
avg = std::for_each(dataC.begin(), dataC.end(), avg);
to average across a number of different data-sets.
Note that almost all STL algorithms/containers that accept functors require them to be "pure" predicates, i.e. have no observable change in state over time. for_each is a special case in this regard (see e.g. Effective Standard C++ Library - for_each vs. transform).
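The assignment back into avg matters because for_each takes the functor by value. A small sketch of that, using a renamed copy of the class (RunningAverage, with value() and count() accessors that I added for the demonstration):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

class RunningAverage {
public:
    RunningAverage() : num(0), sum(0) {}
    void operator()(double elem) { ++num; sum += elem; }
    double value() const { return sum / num; }
    std::size_t count() const { return num; }
private:
    std::size_t num;
    double sum;
};

void demo_state() {
    std::vector<double> a = {1.0, 2.0};
    std::vector<double> b = {3.0, 6.0};

    RunningAverage avg;
    // for_each works on a copy, so the updated copy must be assigned back.
    avg = std::for_each(a.begin(), a.end(), avg);
    avg = std::for_each(b.begin(), b.end(), avg);
    assert(avg.count() == 4);
    assert(avg.value() == 3.0);   // (1 + 2 + 3 + 6) / 4

    RunningAverage dropped;
    std::for_each(a.begin(), a.end(), dropped);  // result discarded
    assert(dropped.count() == 0);                // the original never saw any element
}
```

Forgetting the assignment compiles cleanly and silently loses the state, which is one reason stateful functors need some care.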
Performance
Functors can often be inlined by the compiler (the STL is a bunch of templates, after all). Whilst the same is theoretically true of functions, compilers typically won't inline through a function pointer. The canonical example is to compare std::sort vs qsort; the STL version is often 5-10x faster, assuming the comparison predicate itself is simple.
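The structural difference between the two calls can be seen in a sketch like this one. The names compare_ints, IntLess, sorted_with_qsort, sorted_with_sort and demo_sorts are illustrative; the snippet only shows how each API receives its comparison and makes no timing measurement.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// qsort reaches its comparator through a function pointer taking void*.
int compare_ints(const void* a, const void* b) {
    int lhs = *static_cast<const int*>(a);
    int rhs = *static_cast<const int*>(b);
    return (lhs > rhs) - (lhs < rhs);
}

// std::sort is a template: the functor's type is part of the instantiation,
// so the body of operator() is visible to the compiler at the call site.
struct IntLess {
    bool operator()(int a, int b) const { return a < b; }
};

std::vector<int> sorted_with_qsort(std::vector<int> v) {
    if (!v.empty()) std::qsort(&v[0], v.size(), sizeof(int), compare_ints);
    return v;
}

std::vector<int> sorted_with_sort(std::vector<int> v) {
    std::sort(v.begin(), v.end(), IntLess());
    return v;
}

void demo_sorts() {
    std::vector<int> input = {5, 3, 8, 1, 2};
    std::vector<int> expected = {1, 2, 3, 5, 8};
    assert(sorted_with_qsort(input) == expected);
    assert(sorted_with_sort(input) == expected);
}
```

Both produce the same result; any speed difference comes from whether the compiler can inline the comparison, so it is worth measuring on your own compiler and data rather than assuming a fixed factor.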
Summary
Of course, it's possible to emulate the first three with traditional functions and pointers, but it becomes a great deal simpler with functors.
Advantages of Functors:
Unlike functions, functors can have state.
Functors fit into the OOP paradigm better than functions.
Functors can often be inlined, unlike calls through function pointers.
Functors don't require a vtable or runtime dispatch, and hence are more efficient in most cases.
std::for_each is easily the most capricious and least useful of the standard algorithms. It's just a nice wrapper for a loop. However, even it has advantages.
Consider what your first version of CalculateAverage must look like. It will have a loop over the iterators, and then do stuff with each element. What happens if you write that loop incorrectly? Oops; there's a compiler or runtime error. The second version can never have such errors. Yes, it's not a lot of code, but why do we have to write loops so often? Why not just once?
Now, consider real algorithms; the ones that actually do work. Do you want to write std::sort? Or std::find? Or std::nth_element? Do you even know how to implement it in the most efficient way possible? How many times do you want to implement these complex algorithms?
As for ease of reading, that's in the eyes of the beholder. As I said, std::for_each is hardly the first choice for algorithms (especially with C++0x's range-based for syntax). But if you're talking about real algorithms, they're very readable; std::sort sorts a list. Some of the more obscure ones like std::nth_element won't be as familiar, but you can always look it up in your handy C++ reference.
And even std::for_each is perfectly readable once you use lambdas in C++0x.
•Unlike Functions Functor can have state.
This is very interesting because std::binary_function, std::less and std::equal_to have an operator() that is const. But what if you wanted to print a debug message with the current call count for that object, how would you do it?
Here is the template for std::equal_to:
template <typename _Tp>
struct equal_to : public binary_function<_Tp, _Tp, bool>
{
bool
operator()(const _Tp& __x, const _Tp& __y) const
{ return __x == __y; }
};
I can think of 3 ways to allow the operator() to be const, and yet change a member variable. But what is the best way? Take this example:
#include <iostream>
#include <string>
#include <algorithm>
#include <functional>
#include <cassert> // assert() MACRO
// functor that compares two integers by their quotient under integer division by 10.
// So 50..59 are same, and 60..69 are same.
// Used by std::sort()
struct lessThanByTen: public std::less<int>
{
private:
// data members
int count; // nr of times operator() was called
public:
// default CTOR sets count to 0
lessThanByTen() :
count(0)
{
}
// @override the bool operator() in std::less<int> which simply compares two integers
bool operator() ( const int& arg1, const int& arg2) const
{
// this won't compile, because a const method cannot change a member variable (count)
// ++count;
// Solution 1. this trick allows the const method to change a member variable
++(*(int*)&count);
// Solution 2. this trick also fools the compilers, but is a lot uglier to decipher
++(*(const_cast<int*>(&count)));
// Solution 3. a third way to do same thing:
{
// first, stack copy gets bumped count member variable
int incCount = count+1;
const int *iptr = &count;
// this is now the same as ++count
*(const_cast<int*>(iptr)) = incCount;
}
std::cout << "DEBUG: operator() called " << count << " times.\n";
return (arg1/10) < (arg2/10);
}
};
void test1();
void printArray( const std::string msg, const int nums[], const size_t ASIZE);
int main()
{
test1();
return 0;
}
void test1()
{
// unsorted numbers
int inums[] = {33, 20, 10, 21, 30, 31, 32, 22, };
printArray( "BEFORE SORT", inums, 8 );
// sort by quotient of integer division by 10
std::sort( inums, inums+8, lessThanByTen() );
printArray( "AFTER SORT", inums, 8 );
}
//! @param msg can be "this is a const string" or a std::string because of implicit string(const char *) conversion.
//! print "msg: 1,2,3,...N", where 1..8 are numbers in nums[] array
void printArray( const std::string msg, const int nums[], const size_t ASIZE)
{
std::cout << msg << ": ";
for (size_t inx = 0; inx < ASIZE; ++inx)
{
if (inx > 0)
std::cout << ",";
std::cout << nums[inx];
}
std::cout << "\n";
}
Because all 3 solutions are compiled in, it increments count by 3. Here's the output:
gcc -g -c Main9.cpp
gcc -g Main9.o -o Main9 -lstdc++
./Main9
BEFORE SORT: 33,20,10,21,30,31,32,22
DEBUG: operator() called 3 times.
DEBUG: operator() called 6 times.
DEBUG: operator() called 9 times.
DEBUG: operator() called 12 times.
DEBUG: operator() called 15 times.
DEBUG: operator() called 12 times.
DEBUG: operator() called 15 times.
DEBUG: operator() called 15 times.
DEBUG: operator() called 18 times.
DEBUG: operator() called 18 times.
DEBUG: operator() called 21 times.
DEBUG: operator() called 21 times.
DEBUG: operator() called 24 times.
DEBUG: operator() called 27 times.
DEBUG: operator() called 30 times.
DEBUG: operator() called 33 times.
DEBUG: operator() called 36 times.
AFTER SORT: 10,20,21,22,33,30,31,32
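A fourth option, and the one the language provides for exactly this situation, is to declare the counter mutable. The sketch below is mine rather than part of the program above: LessByTens and demo_counting are made-up names, and std::ref is used only so that the copies std::sort makes all count into one object.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Compares integers by their tens group, counting calls in a mutable member.
struct LessByTens {
    mutable int count;   // 'mutable' permits modification inside const member functions
    LessByTens() : count(0) {}
    bool operator()(int a, int b) const {
        ++count;         // no cast needed
        return (a / 10) < (b / 10);
    }
};

void demo_counting() {
    std::vector<int> nums = {33, 20, 10, 21, 30, 31, 32, 22};
    LessByTens cmp;
    // std::ref makes every copy taken by std::sort refer to the same cmp object.
    std::sort(nums.begin(), nums.end(), std::ref(cmp));
    assert(cmp.count > 0);
    assert(std::is_sorted(nums.begin(), nums.end(), LessByTens()));
    assert(nums.front() == 10);  // the only value in the 10..19 group
}
```

Passing the functor by value instead would leave each internal copy with its own counter, which would explain why the counts printed above do not rise monotonically.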
In the first approach, the iteration code has to be duplicated in every function that wants to do something with the collection. The second approach hides the details of iteration.
OOP is the keyword here.
http://www.newty.de/fpt/functor.html:
4.1 What are Functors ?
Functors are functions with a state. In C++ you can realize them as a class with one or more private members to store the state and with an overloaded operator () to execute the function. Functors can encapsulate C and C++ function pointers employing the concepts templates and polymorphism. You can build up a list of pointers to member functions of arbitrary classes and call them all through the same interface without bothering about their class or the need of a pointer to an instance. All the functions just have got to have the same return-type and calling parameters. Sometimes functors are also known as closures. You can also use functors to implement callbacks.
You are comparing functions at different levels of abstraction.
You can implement CalculateAverage(begin, end) either as:
template<typename Iter>
double CalculateAverage(Iter begin, Iter end)
{
return std::accumulate(begin, end, 0.0, std::plus<double>()) / std::distance(begin, end);
}
or you can do it with a for loop
template<typename Iter>
double CalculateAverage(Iter begin, Iter end)
{
double sum = 0;
int count = 0;
for(; begin != end; ++begin) {
sum += *begin;
++count;
}
return sum / count;
}
The former requires you to know more things, but once you know them, it is simpler and leaves fewer possibilities for error.
It also uses only two generic components (std::accumulate and std::plus), which is often true in more complex cases too. You can often have a simple, universal functor (or function; a plain old function can act as a functor) and simply combine it with whatever algorithm you need.

What is the motivation behind C++11 lambda expressions?

I am trying to find out if there is an actual computational benefit to using lambda expressions in C++, namely "this code compiles/runs faster/slower because we use lambda expressions" or is it just a neat development perk open for abuse by poor coders trying to look cool?
I understand this question may seem subjective, but I would much appreciate the opinion of the community on this matter.
The benefit is what's the most important thing in writing computer programs: easier to understand code. I'm not aware of any performance considerations.
C++ allows you, to a certain extent, to do Functional Programming. Consider this:
std::for_each( begin, end, doer );
The problem with this is that the function (object) doer
specifies what's done in the loop
yet somewhat hides what's actually done (you have to look up the function object's operator()'s implementation)
must be defined in a different scope than the std::for_each call
contains a certain amount of boilerplate code
is often throw-away code that's not used for anything but this one loop construct
Lambdas considerably improve on all these (and maybe some more I forgot).
I don't think it's nearly as much about the computational performance as increasing the expressive power of the language.
There's no performance benefit per se, but the need for lambda came as a consequence of the wide adoption of the STL and its design ideas.
Specifically, the STL algorithms make frequent use of functors. Without lambda, these functors need to be previously declared to be used. Lambdas make it possible to have 'anonymous', in-place functors.
This is important because there are many situations in which you need to use a functor only once, and you don't want to give a name to it for two reasons: you don't want to pollute the namespace, and in those specific cases the name you give is either vague or extremely long.
I, for instance, use STL a lot, but without C++0x I use much more for() loops than the for_each() algorithm and its cousins. That's because if I were to use for_each() instead, I'd need to get the code from inside the loop and declare a functor for it. Also all the local variables before the loop wouldn't be accessible, so I'd need to write additional code to pass them as parameters to the functor constructor, or another thing equivalent. As a consequence, I tend not to use for_each() unless there's strong motivation, otherwise the code would be longer and more difficult to read.
That's bad, because it's well known that using for_each() and similar algorithms gives much more room to the compiler & the library for optimizations, including automatic parallelism. So, indirectly, lambda will favour more efficient code.
IMO, the most important thing about lambdas is that they keep related code close together. If you have this code:
std::for_each(begin, end, unknown_function);
You need to navigate over to unknown_function to understand what the code does. But with a lambda, the logic can be kept together.
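For instance, something like the following sketch keeps the whole loop body visible at the call site. The functions sum_above and demo_sum_above are hypothetical examples of mine, written only to show locals being captured by reference.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sums the values above 'limit' and counts them; the logic sits right
// where for_each is called instead of in a separately defined function.
int sum_above(const std::vector<int>& values, int limit, int& how_many) {
    int total = 0;
    how_many = 0;
    std::for_each(values.begin(), values.end(),
        [&total, &how_many, limit](int v) {
            if (v > limit) {
                total += v;
                ++how_many;
            }
        });
    return total;
}

void demo_sum_above() {
    std::vector<int> values = {1, 5, 10, 20};
    int how_many = 0;
    int total = sum_above(values, 4, how_many);
    assert(total == 35);     // 5 + 10 + 20
    assert(how_many == 3);
}
```

Before lambdas, the same thing needed a functor holding references to total and how_many plus a constructor to wire them up, defined somewhere away from the loop.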
Lambdas are syntactic sugar for functor classes, so no, there is no computational benefit. As far as the motivation, probably any of the other dozen or so popular languages which have lambdas in them?
One could argue it aids in the readability of code (having your functor declared inline where it is used).
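To make the "syntactic sugar" point concrete, here is a minimal sketch comparing a hand-written functor with a lambda that does the same job. The names LessThanLimit, count_below_functor, and count_below_lambda are hypothetical, and the class is only roughly what a compiler generates for the lambda, not a description of any particular implementation.

```cpp
#include <algorithm>
#include <vector>

// Hand-written functor; the name LessThanLimit is hypothetical.
class LessThanLimit {
public:
    explicit LessThanLimit(int limit) : limit_(limit) {}
    bool operator()(int value) const { return value < limit_; }
private:
    int limit_;  // plays the role of the lambda's by-value capture
};

// Counts the elements below `limit` using the named functor.
inline long count_below_functor(const std::vector<int>& v, int limit) {
    return std::count_if(v.begin(), v.end(), LessThanLimit(limit));
}

// Same computation; the lambda captures `limit` by value in place.
inline long count_below_lambda(const std::vector<int>& v, int limit) {
    return std::count_if(v.begin(), v.end(),
                         [limit](int value) { return value < limit; });
}
```

Both functions return the same count for the same input; the lambda version simply avoids the separate class declaration.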
Although I think other parts of C++0x are more important, lambdas are more than just "syntactic sugar" for C++98 style function objects, because they can capture contexts, and they do so by name and then they can take those contexts elsewhere and execute. This is something new, not something that "compiles faster/slower".
#include <iostream>
#include <vector>
#include <functional>
void something_else(std::function<void()> f)
{
f(); // A closure! I wonder if we can write in CPS now...
}
int main()
{
std::vector<int> v(10,0);
std::function<void ()> f = [&](){ std::cout << v.size() << std::endl; };
something_else(f);
}
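The example above captures by reference, which is safe there because v outlives the call. As a hedged sketch of a closure that really does leave its defining scope, the following captures by value instead. make_accumulator and call_twice are hypothetical helpers introduced only for illustration.

```cpp
#include <functional>

// Returns a closure that remembers `step` and its own running total.
// make_accumulator is a hypothetical helper, not a standard facility.
inline std::function<int()> make_accumulator(int step) {
    int total = 0;
    // Capture by value: the closure outlives this function, so capturing
    // `total` or `step` by reference would leave dangling references.
    return [total, step]() mutable {
        total += step;
        return total;
    };
}

// Invokes a callback it knows nothing about; the captured context
// travels along inside the std::function object.
inline int call_twice(std::function<int()> f) {
    f();
    return f();
}
```

Each closure returned by make_accumulator carries its own copy of the state, so two accumulators do not interfere with each other.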
Well, compare this:
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <vector>
int main () {
std::vector<int> x = {2, 3, 5, 7, 11, 13, 17, 19};
int center = 10;
std::sort(x.begin(), x.end(), [=](int x, int y) {
return abs(x - center) < abs(y - center);
});
std::for_each(x.begin(), x.end(), [](int v) {
printf("%d\n", v);
});
return 0;
}
with this:
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <vector>
// why force this to be defined non-locally?
void printer(int v) {
printf("%d\n", v);
}
int main () {
std::vector<int> x = {2, 3, 5, 7, 11, 13, 17, 19};
// why force us to define a whole struct just to maintain some state?
struct {
int center;
bool operator()(int x, int y) const {
return abs(x - center) < abs(y - center);
}
} comp = {10};
std::sort(x.begin(), x.end(), comp);
std::for_each(x.begin(), x.end(), printer);
return 0;
}
"a neat development perk open for abuse by poor coders trying to look cool?"...whatever you call it, it makes code a lot more readable and maintainable. It does not increase the performance.
Most often, a programmer iterates over a range of elements (searching for an element, accumulating elements, sorting elements, etc.). With the functional style, you immediately see what the programmer intends to do, unlike with for loops, where everything "looks" the same.
Compare algorithms + lambda:
auto longest_tree = std::max_element(forest.begin(), forest.end(), [](const Forest::value_type& a, const Forest::value_type& b) { return a.height() < b.height(); });
auto first_leaf_tree = std::find_if(forest.begin(), forest.end(), [](const Forest::value_type& t) { return is_leaf(t); });
std::transform(forest.begin(), forest.end(), firewood.begin(), [](const Forest::value_type& t) { return t.trans(...); });
std::for_each(forest.begin(), forest.end(), [](Forest::value_type& t) { t.make_plywood(); });
with old-school for loops:
Forest::iterator longest_tree = forest.begin();
for (Forest::iterator it = forest.begin(); it != forest.end(); ++it) {
if (it->height() > longest_tree->height()) {
longest_tree = it;
}
}
// end() means "not found", as with find_if
Forest::iterator leaf_tree = forest.end();
for (Forest::iterator it = forest.begin(); it != forest.end(); ++it) {
if (it->type() == LEAF_TREE) {
leaf_tree = it;
break;
}
}
auto jt = firewood.begin();
for (Forest::const_iterator it = forest.begin(); it != forest.end(); ++it, ++jt) {
*jt = it->trans(...);
}
for (Forest::iterator it = forest.begin(); it != forest.end(); ++it) {
it->make_plywood();
}
(I know this piece of code is only a sketch and will not compile as-is.)

C++ STL - iterate through everything in a sequence

I have a sequence, e.g.
std::vector< Foo > someVariable;
and I want a loop which iterates through everything in it.
I could do this:
for (int i=0;i<someVariable.size();i++) {
blah(someVariable[i].x,someVariable[i].y);
woop(someVariable[i].z);
}
or I could do this:
for (std::vector< Foo >::iterator i=someVariable.begin(); i!=someVariable.end(); i++) {
blah(i->x,i->y);
woop(i->z);
}
Both these seem to involve quite a bit of repetition / excessive typing. In an ideal language I'd like to be able to do something like this:
for (i in someVariable) {
blah(i->x,i->y);
woop(i->z);
}
It seems like iterating through everything in a sequence would be an incredibly common operation. Is there a way to do it in which the code isn't twice as long as it should have to be?
You could use for_each from the standard library. You can pass either a functor or a function to it. The solution I like is BOOST_FOREACH, which is just like foreach in other languages. C++0x is gonna have one btw.
For example:
#include <iostream>
#include <vector>
#include <algorithm>
#include <boost/foreach.hpp>
#define foreach BOOST_FOREACH
void print(int v)
{
std::cout << v << std::endl;
}
int main()
{
std::vector<int> array;
for(int i = 0; i < 100; ++i)
{
array.push_back(i);
}
std::for_each(array.begin(), array.end(), print); // using STL
foreach(int v, array) // using Boost
{
std::cout << v << std::endl;
}
}
Not counting BOOST_FOREACH which AraK already suggested, you have the following two options in C++ today:
void function(Foo& arg){
blah(arg.x, arg.y);
woop(arg.z);
}
std::for_each(someVariable.begin(), someVariable.end(), function);
struct functor {
void operator()(Foo& arg){
blah(arg.x, arg.y);
woop(arg.z);
}
};
std::for_each(someVariable.begin(), someVariable.end(), functor());
Both require you to specify the "body" of the loop elsewhere, either as a function or as a functor (a class which overloads operator()). That might be a good thing (if you need to do the same thing in multiple loops, you only have to define the function once), but it can be a bit tedious too. The function version may be a bit less efficient, because the compiler is generally unable to inline the function call. (A function pointer is passed as the third argument, and the compiler has to do some more detailed analysis to determine which function it points to)
The functor version is basically zero overhead. Because an object of type functor is passed to for_each, the compiler knows exactly which function to call: functor::operator(), and so it can be trivially inlined and will be just as efficient as your original loop.
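A small sketch of why the two versions differ from the compiler's point of view: for_each is a template, and the type deduced for its third parameter is a plain function pointer in one case and a distinct class type in the other. The names add_one, AddOne, and passed_as_function_pointer are hypothetical. Whether the function pointer call actually gets inlined depends on the compiler and optimization level; many optimizers manage it in simple cases like this one.

```cpp
#include <algorithm>
#include <type_traits>
#include <vector>

inline void add_one(int& x) { ++x; }

struct AddOne {
    void operator()(int& x) const { ++x; }
};

// Takes its argument by value, so F is deduced the same way as the
// callable parameter of std::for_each.
template <typename F>
bool passed_as_function_pointer(F) {
    return std::is_pointer<F>::value;
}

inline std::vector<int> bump_with_function(std::vector<int> v) {
    // F is void (*)(int&): the target is known only through the pointer value.
    std::for_each(v.begin(), v.end(), add_one);
    return v;
}

inline std::vector<int> bump_with_functor(std::vector<int> v) {
    // F is AddOne: the call target AddOne::operator() is fixed by the type.
    std::for_each(v.begin(), v.end(), AddOne());
    return v;
}
```

Both helpers produce the same result; only the deduced type, and therefore the information available at the call site, differs.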
C++0x will introduce lambda expressions which make a third form possible.
std::for_each(someVariable.begin(), someVariable.end(), [](Foo& arg){
blah(arg.x, arg.y);
woop(arg.z);
});
Finally, it will also introduce a range-based for loop:
for(Foo& arg : someVariable)
{
blah(arg.x, arg.y);
woop(arg.z);
}
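For reference, a self-contained sketch of the range-based for loop, assuming a C++11 compiler. The layout of Foo and the bodies of blah() and woop() are made up here so the example compiles; the question does not specify them.

```cpp
#include <vector>

struct Foo { int x; int y; int z; };  // hypothetical layout

// Stand-ins for the question's blah() and woop(); the bodies are assumed.
inline int blah(int x, int y) { return x + y; }
inline int woop(int z) { return z * 2; }

// Read-only traversal: bind each element as a const reference.
inline int process_all(const std::vector<Foo>& items) {
    int sum = 0;
    for (const Foo& item : items) {
        sum += blah(item.x, item.y);
        sum += woop(item.z);
    }
    return sum;
}

// Mutating traversal: bind each element as a non-const reference.
inline void reset_z(std::vector<Foo>& items) {
    for (Foo& item : items) {
        item.z = 0;
    }
}
```

Binding by reference avoids copying each Foo; use a const reference when the loop body does not modify the element.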
So if you've got access to a compiler which supports subsets of C++0x, you might be able to use one or both of the last forms. Otherwise, the idiomatic solution (without using Boost) is to use for_each like in one of the first two examples.
By the way, MSVS 2008 has a "for each" C++ keyword (a Microsoft-specific extension, not standard C++). See "How to: Iterate Over STL Collection with for each".
int main() {
int retval = 0;
vector<int> col(3);
col[0] = 10;
col[1] = 20;
col[2] = 30;
for each( const int& c in col )
retval += c;
cout << "retval: " << retval << endl;
}
Prefer algorithm calls to hand-written loops
There are three reasons:
1) Efficiency: Algorithms are often more efficient than the loops programmers produce
2) Correctness: Writing loops is more subject to errors than is calling algorithms.
3) Maintainability: Algorithm calls often yield code that is clearer and more straightforward than the corresponding explicit loops.
Prefer almost every other algorithm to for_each()
There are two reasons:
1) for_each is extremely general, telling you nothing about what's really being done, just that you're doing something to all the items in a sequence.
2) A more specialized algorithm will often be simpler and more direct.
Consider an example from an earlier reply:
void print(int v)
{
std::cout << v << std::endl;
}
// ...
std::for_each(array.begin(), array.end(), print); // using STL
Using std::copy instead, that whole thing turns into:
std::copy(array.begin(), array.end(), std::ostream_iterator<int>(std::cout, "\n"));
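A self-contained sketch of the std::copy approach follows. The helper names print_lines and to_lines are hypothetical; writing to a generic std::ostream instead of std::cout directly is a choice made here so the output can be captured in a string.

```cpp
#include <algorithm>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Writes each element followed by a newline to `out`.
// std::ostream_iterator needs its element type spelled out explicitly.
inline void print_lines(const std::vector<int>& values, std::ostream& out) {
    std::copy(values.begin(), values.end(),
              std::ostream_iterator<int>(out, "\n"));
}

// Convenience wrapper that collects the output in a string.
inline std::string to_lines(const std::vector<int>& values) {
    std::ostringstream out;
    print_lines(values, out);
    return out.str();
}
```

Calling print_lines(array, std::cout) reproduces the one-liner above.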
"struct functor {
void operator()(Foo& arg){
blah(arg.x, arg.y);
woop(arg.z);
}
};
std::for_each(someVariable.begin(), someVariable.end(), functor());"
I think approaches like these are often needlessly baroque for a simple problem.
do i=1,N
call blah( X(i),Y(i) )
call woop( Z(i) )
end do
is perfectly clear, even if it's 40 years old (and not C++, obviously).
If the container is always a vector (STL name), I see nothing wrong with an index and nothing wrong with calling that index an integer.
In practice, often one needs to iterate over multiple containers of the same size simultaneously and peel off a datum from each, and do something with the lot of them. In that situation, especially, why not use the index?
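As a rough illustration of that situation, here is an index-based loop over three vectors of equal size. The function name combine and the formula x[i]*y[i] + z[i] are made up for the example. It uses std::size_t for the index to avoid signed/unsigned comparison warnings, though a plain int works too for containers of modest size.

```cpp
#include <cstddef>
#include <vector>

// Computes x[i] * y[i] + z[i] for each i; the formula is only an illustration.
inline std::vector<double> combine(const std::vector<double>& x,
                                   const std::vector<double>& y,
                                   const std::vector<double>& z) {
    std::vector<double> result;
    // Assumes all three inputs have the same size, as the text does;
    // mismatched sizes yield an empty result in this sketch.
    if (x.size() != y.size() || x.size() != z.size()) {
        return result;
    }
    result.reserve(x.size());
    // One index peels off the matching element from every container.
    for (std::size_t i = 0; i < x.size(); ++i) {
        result.push_back(x[i] * y[i] + z[i]);
    }
    return result;
}
```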
As for SSS's points #2 and #3 above, I'd say that may be so for complex cases, but iterating 1...N is often as simple and clear as anything else.
If you had to explain the algorithm on the whiteboard, could you do it faster with, or without, using 'i'? I think if your meatspace explanation is clearer with the index, use it in codespace.
Save the heavy C++ firepower for the hard targets.