Emulate copy-assignment operator for lambdas in C++


This question has two parts.
Firstly, can someone explain the rationale behind C++ disabling the copy-assignment operator for lambdas? If you're going to allow the copy constructor, why not the copy-assignment operator?
Secondly, how do you best overcome this limitation without forcing people to write C++03 style functors, or using std::function (the functions I'm dealing with are tiny, and I'd like the compiler to inline them wherever possible)?
Background:
I'm trying to implement a flat_map like operation in a stream processing library I'm writing, similar to flatMap in Scala or other functional languages. As a result, I need to create an iterator that iterates over a list of iterators. Each time the flat_map iterator is de-referenced a lambda associated with the inner iterator is executed. The outer iterator needs to switch the inner iterator each time the inner iterator reaches the end. Since the inner iterator contains a lambda, and therefore does not have a copy-assignment operator, it's not possible to switch it. Technically I could solve the problem using dynamic allocation, so that I always call the copy-constructor, but that doesn't seem like the right approach. Here is a snippet of code that might help highlight the problem:
template <typename Iter>
class flat_map_iterator {
public:
    flat_map_iterator& operator++() {
        ++it_inner_;
        if (it_inner_ == (*it_outer_).end()) {
            ++it_outer_;
            // ERROR: cannot be assigned because its copy assignment operator is implicitly deleted
            it_inner_ = (*it_outer_).begin();
        }
        return *this;
    }
private:
    Iter it_outer_;
    typename Iter::value_type::iterator it_inner_;
};
Edit:
Thanks for the really quick responses. Here is a use case example:
int res = ftl::range(1, 4).map([](int a){
        return ftl::range(a, 4).map([a](int b){
            return std::make_tuple(a, b);
        });
    })
    .flat_map([](std::tuple<int, int> x){ return std::get<0>(x) * std::get<1>(x); })
    .sum();
assert(res == 25);
The ftl::range(begin, end) function returns a lazy iterator over the range [begin, end).

C++ does disable it explicitly: the standard specifies that the closure type of a lambda has a deleted copy-assignment operator (and, before C++20, a deleted default constructor). Making the lambda mutable with the [...](...) mutable {...} syntax only makes its operator() non-const, so the body can modify by-copy captures; it does not make the closure assignable. Since C++20, lambdas with no captures are copy-assignable, but lambdas with captures still are not.
The other thing is that I'm not entirely sure what you get out of assigning lambdas. I mean, if you're going to re-use the lambda type (and functionality) and simply bind it to different variables, you're already working against the nice lambda capture syntax, and might as well have it be a normal function object. Assigning one type of lambda to another one is impossible. This means that you can not provide different lambda implementations when you hold the lambda itself by value.
If this is still what you're going for, I think dynamic allocation (e.g. using unique_ptr) is fair game.
And if you really want to avoid it, you could manually destruct and re-construct your lambda, as the following example illustrates:
#include <iostream>
#include <new> // placement new
template <class T>
struct LambdaContainer {
    LambdaContainer(const T& lambda)
        : lambda{lambda} {}
    void resetLambda(const T& lambda) {
        // Destroy the stored lambda, then copy-construct the new one in its place.
        this->lambda.~T();
        new (&this->lambda) T{lambda};
    }
    T lambda;
};
int main()
{
    int i = 1;
    auto l = [=]() {
        std::cout << i;
    };
    using LT = decltype(l);
    LambdaContainer<LT> lc{l};
    lc.resetLambda(l);
}
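The destroy-and-reconstruct idea above can also be packaged so that the containing class keeps an ordinary operator=. The following is only a sketch under stated assumptions: `assignable` and `make_assignable` are hypothetical names, not part of any library, and it relies on C++17 `std::optional::emplace` to do the destroy and re-construct step.

```cpp
#include <optional>
#include <utility>

// Hypothetical wrapper (not a standard facility): stores a copy-constructible
// callable F and emulates copy assignment by re-constructing it in place.
template <class F>
class assignable {
public:
    assignable(const F& f) : f_(f) {}
    assignable(const assignable&) = default;

    assignable& operator=(const assignable& other) {
        if (this != &other) {
            // emplace destroys the held callable and copy-constructs a new one.
            // If that copy throws, the wrapper is left empty.
            f_.emplace(*other.f_);
        }
        return *this;
    }

    template <class... Args>
    decltype(auto) operator()(Args&&... args) const {
        return (*f_)(std::forward<Args>(args)...);
    }

private:
    std::optional<F> f_;
};

template <class F>
assignable<F> make_assignable(F f) { return assignable<F>(f); }

// Both closures come from the same lambda expression, so they share one type.
int demo_assignable() {
    auto make_adder = [](int n) { return [n](int x) { return x + n; }; };
    auto a = make_assignable(make_adder(1));
    auto b = make_assignable(make_adder(10));
    a = b;        // would not compile for the raw closures
    return a(5);  // calls the copy of b's closure
}
```

An iterator such as flat_map_iterator could hold its lambda in a wrapper like this, which keeps the call inlinable and avoids dynamic allocation. As with the answer's version, only closures of the same type can be assigned to each other.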

Related

Difference between vector::push_back(Foo()) and vector::emplace_back(Foo())? [duplicate]

I'm a bit confused regarding the difference between push_back and emplace_back.
void emplace_back(Type&& _Val);
void push_back(const Type& _Val);
void push_back(Type&& _Val);
As there is a push_back overload taking a rvalue reference I don't quite see what the purpose of emplace_back becomes?

In addition to what visitor said :
The function void emplace_back(Type&& _Val) provided by MSVC10 is non-conforming and redundant because, as you noted, it is strictly equivalent to push_back(Type&& _Val).
But the real C++0x form of emplace_back is really useful: void emplace_back(Args&&...);
Instead of taking a value_type, it takes a variadic list of arguments, which means that you can now perfectly forward the arguments and construct an object directly in the container without any temporary at all.
That's useful because no matter how much cleverness RVO and move semantics bring to the table, there are still complicated cases where a push_back is likely to make unnecessary copies (or moves). For example, with the traditional insert() function of a std::map, you have to create a temporary, which will then be copied into a std::pair<Key, Value>, which will then be copied into the map:
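std::map<int, Complicated> m;
int anInt = 4;
double aDouble = 5.0;
std::string aString = "C++";
// cross your fingers that the optimizer is really good
m.insert(std::make_pair(4, Complicated(anInt, aDouble, aString)));
// should be easier for the optimizer; a mapped type built from several arguments
// needs std::piecewise_construct (<utility>) and std::forward_as_tuple (<tuple>)
m.emplace(std::piecewise_construct,
          std::forward_as_tuple(4),
          std::forward_as_tuple(anInt, aDouble, aString));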
So why didn't they implement the right version of emplace_back in MSVC? Actually, it bugged me too a while ago, so I asked the same question on the Visual C++ blog. Here is the answer from Stephan T Lavavej, the official maintainer of the Visual C++ standard library implementation at Microsoft.
Q: Are beta 2 emplace functions just some kind of placeholder right now?
A: As you may know, variadic templates aren't implemented in VC10. We simulate them with preprocessor machinery for things like make_shared<T>(), tuple, and the new things in <functional>. This preprocessor machinery is relatively difficult to use and maintain. Also, it significantly affects compilation speed, as we have to repeatedly include subheaders. Due to a combination of our time constraints and compilation speed concerns, we haven't simulated variadic templates in our emplace functions.
When variadic templates are implemented in the compiler, you can expect that we'll take advantage of them in the libraries, including in our emplace functions. We take conformance very seriously, but unfortunately, we can't do everything all at once.
It's an understandable decision. Anyone who has tried even once to emulate variadic templates with horrible preprocessor tricks knows how disgusting this stuff gets.

emplace_back shouldn't take an argument of type vector::value_type, but instead variadic arguments that are forwarded to the constructor of the appended item.
template <class... Args> void emplace_back(Args&&... args);
It is possible to pass a value_type, which will be forwarded to the copy (or move) constructor.
Because emplace_back forwards its arguments, passing an lvalue still means that the container stores a copy of it rather than moving from it.
std::vector<std::string> vec;
vec.emplace_back(std::string("Hello")); // moves
std::string s;
vec.emplace_back(s); //copies
But the above should be identical to what push_back does. It is probably rather meant for use cases like:
std::vector<std::pair<std::string, std::string> > vec;
vec.emplace_back(std::string("Hello"), std::string("world"));
// should end up invoking this constructor:
//template<class U, class V> pair(U&& x, V&& y);
//without making any copies of the strings

The optimization in emplace_back can be demonstrated with the next example.
For emplace_back, only the constructor A (int x_arg) is called. For push_back, A (int x_arg) is called first to build a temporary, and the move constructor A (A &&rhs) is called afterwards.
Normally a single-argument constructor like this should be marked explicit, but it is left implicit here so that a.push_back (1) compiles.
#include <iostream>
#include <vector>
class A
{
public:
A (int x_arg) : x (x_arg) { std::cout << "A (x_arg)\n"; }
A () { x = 0; std::cout << "A ()\n"; }
A (const A &rhs) noexcept { x = rhs.x; std::cout << "A (A &)\n"; }
A (A &&rhs) noexcept { x = rhs.x; std::cout << "A (A &&)\n"; }
private:
int x;
};
int main ()
{
{
std::vector<A> a;
std::cout << "call emplace_back:\n";
a.emplace_back (0);
}
{
std::vector<A> a;
std::cout << "call push_back:\n";
a.push_back (1);
}
return 0;
}
output:
call emplace_back:
A (x_arg)
call push_back:
A (x_arg)
A (A &&)

One more example for lists:
// constructs the elements in place.
emplace_back("element");
// creates a new object and then copies (or moves) that object.
push_back(ExplicitDataType{"element"});

Specific use case for emplace_back: If you need to create a temporary object which will then be pushed into a container, use emplace_back instead of push_back. It will create the object in-place within the container.
Notes:
push_back in the above case will create a temporary object and move it
into the container. However, in-place construction used for emplace_back would be more
performant than constructing and then moving the object (which generally involves some copying).
In general, you can use emplace_back instead of push_back in all the cases without much issue. (See exceptions)
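One case where the two calls are not interchangeable is explicit constructors: emplace_back direct-initializes the element, so it will call a constructor that push_back refuses to use implicitly. The sketch below shows this with std::vector<std::vector<int>>; the function name `explicit_constructor_demo` is made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Returns outer[0].size() * 100 + outer[1].size() so both sizes are visible.
std::size_t explicit_constructor_demo() {
    std::vector<std::vector<int>> outer;

    // Calls the explicit constructor vector<int>(size_type): ten zero-valued ints.
    outer.emplace_back(10);

    // outer.push_back(10);  // does not compile: no implicit int -> vector<int>

    // Braces select the initializer-list constructor: one element with value 10.
    outer.push_back({10});

    return outer[0].size() * 100 + outer[1].size();
}
```

So replacing push_back with emplace_back everywhere can silently change meaning when the argument is not already of the element type.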

A nice code for the push_back and emplace_back is shown here.
http://en.cppreference.com/w/cpp/container/vector/emplace_back
You can see the move operation on push_back and not on emplace_back.

A conforming implementation of emplace_back forwards its arguments to the vector<Object>::value_type constructor when the element is added to the vector. I recall that Visual Studio didn't support variadic templates, but variadic templates will be supported in Visual Studio 2013 RC, so I guess a conforming signature will be added.
With emplace_back, if you forward the arguments directly to vector<Object>::value_type constructor, you don't need a type to be movable or copyable for emplace_back function, strictly speaking. In the vector<NonCopyableNonMovableObject> case, this is not useful, since vector<Object>::value_type needs a copyable or movable type to grow.
But note that this could be useful for std::map<Key, NonCopyableNonMovableObject>, since once you allocate an entry in the map, it doesn't need to be moved or copied ever anymore, unlike with vector, meaning that you can use std::map effectively with a mapped type that is neither copyable nor movable.
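As a rough illustration of the std::map case, the sketch below stores a type that can be neither copied nor moved. `Pinned` and `first_pinned_id` are hypothetical names; the standard pieces are std::piecewise_construct and std::forward_as_tuple, which let emplace build the key and the mapped value in place from separate argument lists.

```cpp
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical type: deleting the copy operations also suppresses the moves.
struct Pinned {
    Pinned(int id, std::string name) : id(id), name(std::move(name)) {}
    Pinned(const Pinned&) = delete;
    Pinned& operator=(const Pinned&) = delete;

    int id;
    std::string name;
};

int first_pinned_id() {
    std::map<int, Pinned> m;
    // Key and mapped value are each constructed directly inside the map node.
    m.emplace(std::piecewise_construct,
              std::forward_as_tuple(1),
              std::forward_as_tuple(42, "answer"));
    m.emplace(std::piecewise_construct,
              std::forward_as_tuple(2),
              std::forward_as_tuple(7, "seven"));
    return m.at(1).id;
}
```

The same Pinned type would not work as the element type of a std::vector that has to grow, because reallocation needs to move or copy existing elements.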

Write overloads for const reference and rvalue reference

Recently I find myself often in the situation of having a single function that takes some object as a parameter. The function will have to copy that object.
However, the parameter for that function may also quite frequently be a temporary, and thus I also want to provide an overload of that function that takes an rvalue reference instead of a const reference.
Both overloads tend to only differ in that they have different types of references as argument types. Other than that they are functionally equivalent.
For instance consider this toy example:
void foo(const MyObject &obj) {
globalVec.push_back(obj); // Makes copy
}
void foo(MyObject &&obj) {
globalVec.push_back(std::move(obj)); // Moves
}
Now I was wondering whether there is a way to avoid this code-duplication by e.g. implementing one function in terms of the other.
For instance I was thinking of implementing the copy-version in terms of the move-one like this:
void foo(const MyObject &obj) {
MyObject copy = obj;
foo(std::move(copy));
}
void foo(MyObject &&obj) {
globalVec.push_back(std::move(obj)); // Moves
}
However this still does not seem ideal since now there is a copy AND a move operation happening when calling the const ref overload instead of a single copy operation that was required before.
Furthermore, if the object does not provide a move-constructor, then this would effectively copy the object twice (afaik) which defeats the whole purpose of providing these overloads in the first place (avoiding copies where possible).
I'm sure one could hack something together using macros and the preprocessor but I would very much like to avoid involving the preprocessor in this (for readability purposes).
Therefore my question reads: Is there a possibility to achieve what I want (effectively only implementing the functionality once and then implement the second overload in terms of the first one)?
If possible I would like to avoid using templates instead.

My opinion is that understanding (truly) how std::move and std::forward work, together with what their similarities and their differences are is the key point to solve your doubts, so I suggest that you read my answer to What's the difference between std::move and std::forward, where I give a very good explanation of the two.
In
void foo(MyObject &&obj) {
globalVec.push_back(obj); // Moves (no, it doesn't!)
}
there's no move. obj is the name of a variable, so it is an lvalue, and the overload of push_back which will be called is not the one which will steal resources out of its argument.
You would have to write
void foo(MyObject&& obj) {
globalVec.push_back(std::move(obj)); // Moves
}
if you want to make the move possible, because std::move(obj) says look, I know this obj here is a local variable, but I guarantee you that I don't need it later, so you can treat it as a temporary: steal its guts if you need.
As regards the code duplication you see in
void foo(const MyObject &obj) {
globalVec.push_back(obj); // Makes copy
}
void foo(MyObject&& /*rvalue reference -> std::move it */ obj) {
globalVec.push_back(std::move(obj)); // Moves (corrected)
}
what allows you to avoid it is std::forward, which you would use like this:
template<typename T>
void foo(T&& /* universal/forwarding reference -> std::forward it */ obj) {
globalVec.push_back(std::forward<T>(obj)); // moves conditionally
}
As regards the error messages of templates, be aware that there are ways to make things easier. For instance, you could use static_asserts at the beginning of the function to enforce that T is a specific type. That would certainly make the errors more understandable. For instance:
#include <type_traits>
#include <vector>
std::vector<int> globalVec{1,2,3};
template<typename T>
void foo(T&& obj) {
static_assert(std::is_same_v<int, std::decay_t<T>>,
"\n\n*****\nNot an int, aaarg\n*****\n\n");
globalVec.push_back(std::forward<T>(obj));
}
int main() {
int x = 0; // initialized, so copying it into the vector is well-defined
foo(x);
foo(3);
foo('c'); // errors at compile time with nice message
}
Then there's SFINAE, which is harder and I guess beyond the scope of this question and answer.
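For readers curious what the SFINAE route might look like, here is a minimal sketch, assuming C++14 for std::enable_if_t and std::decay_t. The `MyObject` definition and the extra `out` parameter (used instead of the global vector so the example is self-contained) are assumptions made for illustration.

```cpp
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

struct MyObject {  // stand-in for the question's type
    std::string name;
};

// Participates in overload resolution only when T is some flavor of MyObject.
template <typename T,
          typename = std::enable_if_t<
              std::is_same<MyObject, std::decay_t<T>>::value>>
void foo(std::vector<MyObject>& out, T&& obj) {
    out.push_back(std::forward<T>(obj));  // copies lvalues, moves rvalues
}

bool forwarding_demo() {
    std::vector<MyObject> v;
    MyObject a{"lvalue"};
    foo(v, a);                   // copy: a is still usable afterwards
    foo(v, MyObject{"rvalue"});  // move from a temporary
    foo(v, std::move(a));        // move: a is left in a valid but unspecified state
    return v.size() == 3 && v[0].name == "lvalue" &&
           v[1].name == "rvalue" && v[2].name == "lvalue";
}
```

Unlike the static_assert version, a call such as foo(v, 'c') fails because no viable overload exists, which leaves room for other overloads of foo to be selected instead.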
My suggestion
Don't be scared of templates and SFINAE! They do pay off :)
There's a beautiful library that leverages template metaprogramming and SFINAE heavily and successfully, but this is really off-topic :D

A simple solution is:
void foo(MyObject obj) {
globalVec.push_back(std::move(obj));
}
If caller passes an lvalue, then there is a copy (into the parameter) and a move (into the vector). If caller passes an rvalue, then there are two moves (one into parameter and another into vector). This can potentially be slightly less optimal compared to the two overloads because of the extra move (slightly compensated by the lack of indirection) but in cases where moves are cheap, this is often a decent compromise.
Another solution for templates is std::forward explored in depth in Enlico's answer.
If you cannot have a template and the potential cost of a move is too expensive, then you just have to be satisfied with some extra boilerplate of having two overloads.
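The copy and move counts described above can be checked with a small instrumented type. This is a sketch only: `Tracked`, `Counts`, `store` and the two counting functions are hypothetical names, and the vector is passed in and reserved up front so that reallocation does not add extra moves.

```cpp
#include <utility>
#include <vector>

// Hypothetical type that counts its copy and move constructions.
struct Tracked {
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
    static int copies;
    static int moves;
};
int Tracked::copies = 0;
int Tracked::moves = 0;

struct Counts {
    int copies;
    int moves;
};

// The single by-value overload from the answer, with the container passed in.
void store(std::vector<Tracked>& out, Tracked t) {
    out.push_back(std::move(t));
}

Counts count_for_lvalue() {
    std::vector<Tracked> v;
    v.reserve(1);
    Tracked x;
    Tracked::copies = Tracked::moves = 0;
    store(v, x);  // copy into the parameter, move into the vector
    return {Tracked::copies, Tracked::moves};
}

Counts count_for_xvalue() {
    std::vector<Tracked> v;
    v.reserve(1);
    Tracked x;
    Tracked::copies = Tracked::moves = 0;
    store(v, std::move(x));  // move into the parameter, move into the vector
    return {Tracked::copies, Tracked::moves};
}
```

When the argument is a prvalue such as Tracked{}, the move into the parameter is typically elided (and must be since C++17), so only the move into the vector remains.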

Usage of comparator in C++

Following is the c++ code to Merge k Sorted Lists. But i was confused reading the first 4 lines of code. I know what it does just confused how it does it. Could anybody explain these lines to me?
Why use struct?
What are the "()" for after "operator"?
Why use ">" rather than "<" since all the lists including the result list are in ascending order?
struct compare {
bool operator() (ListNode* &left, ListNode* &right) {
return left->val > right->val;
}
};
class Solution {
public:
ListNode *mergeKLists(vector<ListNode *> &lists) {
priority_queue<ListNode *, vector<ListNode *>, compare> heap;
for (int i = 0; i < lists.size(); i++) {
if (lists[i]) heap.push(lists[i]);
}
ListNode *dummy = new ListNode(0);
ListNode *cur = dummy;
while (!heap.empty()) {
ListNode *min = heap.top();
heap.pop();
cur->next = min;
cur = min;
if (min->next) {
heap.push(min->next);
}
}
return dummy->next;
}
};

Your struct compare is what is known as a functor or a function object.
struct compare
{
bool
operator() (const ListNode& left, const ListNode& right) const
{
return left.val > right.val;
}
};
void
example_usage(const ListNode& left, const ListNode& right, const compare cmp)
{
if (cmp(left, right))
std::cout << "left is greater" << std::endl;
else
std::cout << "right is greater" << std::endl;
}
(I have changed the signature since using references to pointers and making these non-const disturbed me too much.)
It is a convenient alternative to using function pointers in many situations. Most important, when used in templates (as in your example) the compiler is usually able to inline calls to operator (). Using function pointers this is not so easy.
It is not clear whether this is relevant in your example but generally, a functor has the advantage that it can be declared anywhere (also inside function bodies) while functions may only be declared at global scope or as class members. This allows better encapsulation using functors. Since C++11, we have lambdas as yet another alternative:
auto cmp = [](const ListNode& left, const ListNode& right)->bool{
return left.val > right.val;
};
It can be used just like the functor. (Under the hood, the compiler will most likely create a functor if you give it a lambda expression.)

Why use struct?
There is no need to use a struct here, although it doesn't hurt.
What are the "()" for after "operator"?
Look for operator overloading tutorial.
Why use ">" rather than "<" since all the lists including the result list are in ascending order?
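It's convention to use < for comparisons such as these. Here > is used because std::priority_queue puts the element that compares largest on top, so reversing the comparison makes top() return the node with the smallest val.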
Furthermore the parameters should be (const ListNode* left, const ListNode* right) and if inside a struct, the member function should be made const as well. There is no need to take them by reference here. I would also consider whether you need a data structure of pointers and dynamic allocation here.

Why use struct?
It can be a struct or a class. They are essentially the same thing other than 1. members of structs are default public and members of classes are default private, and 2. general opinions on how people should use structs for POD (plain old data) and classes for everything else.
What are the "()" for after "operator"?
This one's fun. ... operator()(...) defines an overload for the function call operator. Whenever you call a function, you will definitely use this operator, encapsulating the arguments within the brackets and having the return value be the result of the expression.
On a side note, a class or struct with an overloaded function call operator is normally referred to as a functor.
Why use ">" rather than "<" since all the lists including the result list are in ascending order?
This is also another interesting question. The idea is that the STL (Standard Template Library) is meant to be used by almost everyone that uses C++, and is meant to provide convenience in performing certain operations. As such, all the tools in the STL are built to work for any types, including user-defined types.
However, user-defined types do not have operators such as < and > overloaded by default, so users will have to overload them.
The result of a > b is essentially the same as b < a. Now imagine if some of these tools use <, while others use >. Everyone would have to overload both operators even though they are just complements of each other and can be used to substitute each other.
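Therefore, a standard was set such that the STL will only use the < operator (through std::less by default) and only require you to overload that one operator. The comparator in this code uses > because std::priority_queue keeps the element that compares largest on top, so reversing the comparison turns it into a min-heap, which is what merging ascending lists needs.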
On a side note, instead of writing your own comparison functor, you can use the std::less templated functor for simple operations like a less-than comparison, and the std::function for more complex operations.
Thank you for reading.
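The effect of the comparator on std::priority_queue can be seen with plain ints. This is a small sketch; `drain` is a hypothetical helper that pops everything from the queue in order.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Pushes all of input into a priority_queue ordered by Compare,
// then pops until empty and returns the elements in pop order.
template <class Compare>
std::vector<int> drain(const std::vector<int>& input) {
    std::priority_queue<int, std::vector<int>, Compare> heap(input.begin(),
                                                             input.end());
    std::vector<int> out;
    while (!heap.empty()) {
        out.push_back(heap.top());
        heap.pop();
    }
    return out;
}
```

With std::less<int> (the default) the largest value comes out first; with std::greater<int>, which plays the same role as the question's compare struct, the smallest comes out first, as needed when merging ascending lists.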

Lambda to std::function conversion performance

I'd like to use lambda functions to asynchronously call a method on a reference counted object:
void RunAsync(const std::function<void()>& f) { /* ... */ }
SmartPtr<T> objPtr = ...
RunAsync([objPtr] { objPtr->Method(); });
Creating the lambda expression obviously creates a copy but I now have the problem that converting the lambda expression to a std::function object also creates a bunch of copies of my smart pointer and each copy increases the reference count.
The following code should demonstrate this behavior:
#include <functional>
struct C {
C() {}
C(const C& c) { ++s_copies; }
void CallMe() const {}
static int s_copies;
};
int C::s_copies = 0;
void Apply(const std::function<void()>& fct) { fct(); }
int main() {
C c;
std::function<void()> f0 = [c] { c.CallMe(); };
Apply(f0);
// s_copies = 4
}
While the number of references goes back to normal afterwards, I'd like to prevent too many reference-counting operations for performance reasons. I'm not sure where all these copy operations come from.
Is there any way to achieve this with fewer copies of my smart pointer object?
Update: Compiler is Visual Studio 2010.

std::function probably won't be as fast as a custom functor until compilers implement some serious special treatment of the simple cases.
But the reference-counting problem is symptomatic of copying when move is appropriate. As others have noted in the comments, MSVC doesn't properly implement move. The usage you've described requires only moving, not copying, so the reference count should never be touched.
If you can, try compiling with GCC and see if the issue goes away.

Converting to a std::function should only make a move of the lambda. If this isn't what's done, then there's arguably a bug in the implementation or specification of std::function. In addition, in your above code, I can only see two copies of the original c, one to create the lambda and another to create the std::function from it. I don't see where the extra copy is coming from.
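Two ways to keep the copy count down for a synchronous call can be sketched as follows, reusing the question's C and Apply. `ApplyDirect` and the two counting functions are hypothetical names. In both variants the only copy of C is the one made by the capture itself (guaranteed under C++17 copy elision rules, and what mainstream compilers do in earlier modes).

```cpp
#include <functional>
#include <utility>

struct C {
    C() {}
    C(const C&) { ++s_copies; }
    void CallMe() const {}
    static int s_copies;
};
int C::s_copies = 0;

// Variant 1: take the callable as a template parameter, no std::function at all.
template <class F>
void ApplyDirect(F&& f) {
    std::forward<F>(f)();
}

// Variant 2: keep the std::function interface from the question.
void Apply(const std::function<void()>& fct) { fct(); }

int copies_with_template() {
    C c;
    C::s_copies = 0;
    auto lambda = [c] { c.CallMe(); };  // the capture copies c once
    ApplyDirect(lambda);                // binds by reference, no further copy
    return C::s_copies;
}

int copies_with_ref() {
    C c;
    C::s_copies = 0;
    auto lambda = [c] { c.CallMe(); };  // the capture copies c once
    Apply(std::ref(lambda));            // std::function stores a reference_wrapper
    return C::s_copies;
}
```

The std::ref variant is only safe while the lambda outlives the call. For a genuinely asynchronous RunAsync that stores the callable, the std::function would hold a dangling reference, so there the callable has to be copied or moved in.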

Is pass-by-value a reasonable default in C++11?

In traditional C++, passing by value into functions and methods is slow for large objects, and is generally frowned upon. Instead, C++ programmers tend to pass references around, which is faster, but which introduces all sorts of complicated questions around ownership and especially around memory management (in the event that the object is heap-allocated).
Now, in C++11, we have Rvalue references and move constructors, which mean that it's possible to implement a large object (like an std::vector) that's cheap to pass by value into and out of a function.
So, does this mean that the default should be to pass by value for instances of types such as std::vector and std::string? What about for custom objects? What's the new best practice?

It's a reasonable default if you need to make a copy inside the body. This is what Dave Abrahams is advocating:
Guideline: Don’t copy your function arguments. Instead, pass them by value and let the compiler do the copying.
In code this means don't do this:
void foo(T const& t)
{
auto copy = t;
// ...
}
but do this:
void foo(T t)
{
// ...
}
which has the advantage that the caller can use foo like so:
T lval;
foo(lval); // copy from lvalue
foo(T {}); // (potential) move from prvalue
foo(std::move(lval)); // (potential) move from xvalue
and only minimal work is done. You'd need two overloads to do the same with references, void foo(T const&); and void foo(T&&);.
With that in mind, I now write my value-taking constructors like this:
class T {
U u;
V v;
public:
T(U u, V v)
: u(std::move(u))
, v(std::move(v))
{}
};
Otherwise, passing by reference to const still is reasonable.

In almost all cases, your semantics should be either:
bar(foo f); // want to obtain a copy of f
bar(const foo& f); // want to read f
bar(foo& f); // want to modify f
All other signatures should be used only sparingly, and with good justification. The compiler will now pretty much always work these out in the most efficient way. You can just get on with writing your code!

Pass parameters by value if inside the function body you need a copy of the object or only need to move the object. Pass by const& if you only need non-mutating access to the object.
Object copy example:
void copy_antipattern(T const& t) { // (Don't do this.)
    auto copy = t;
    copy.some_mutating_function(); // t itself is const and cannot be mutated
}
void copy_pattern(T t) { // (Do this instead.)
    t.some_mutating_function();
}
Object move example:
std::vector<T> v;
void move_antipattern(T const& t) {
v.push_back(t);
}
void move_pattern(T t) {
v.push_back(std::move(t));
}
Non-mutating access example:
void read_pattern(T const& t) {
t.some_const_function();
}
For rationale, see these blog posts by Dave Abrahams and Xiang Fan.
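A concrete instance of the copy pattern, as a sketch with std::string: `shout` is a hypothetical function that needs its own modifiable copy, so it takes the parameter by value and mutates it directly.

```cpp
#include <cctype>
#include <string>

// Takes the string by value: the caller's object is never modified,
// and an rvalue argument is moved in rather than copied.
std::string shout(std::string s) {
    for (char& ch : s) {
        ch = static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
    }
    return s;  // returning a by-value parameter moves it out
}
```

Calling shout(name) with an lvalue leaves name untouched, while shout(std::move(name)) or shout(make_name()) avoids the copy entirely.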

The signature of a function should reflect its intended use. Readability is important, also for the optimizer.
This is the best precondition for an optimizer to create the fastest code, in theory at least, and if not in reality today, then in a few years' time.
Performance considerations are very often overrated in the context of parameter passing. Perfect forwarding is an example. Functions like emplace_back are mostly very short and inlined anyway.
