Lambda to std::function conversion performance - c++

I'd like to use lambda functions to asynchronously call a method on a reference counted object:
void RunAsync(const std::function<void()>& f) { /* ... */ }
SmartPtr<T> objPtr = ...
RunAsync([objPtr] { objPtr->Method(); });
Creating the lambda expression obviously creates a copy but I now have the problem that converting the lambda expression to a std::function object also creates a bunch of copies of my smart pointer and each copy increases the reference count.
The following code should demonstrate this behavior:
#include <functional>
struct C {
C() {}
C(const C& c) { ++s_copies; }
void CallMe() const {}
static int s_copies;
};
int C::s_copies = 0;
void Apply(const std::function<void()>& fct) { fct(); }
int main() {
C c;
std::function<void()> f0 = [c] { c.CallMe(); };
Apply(f0);
// s_copies = 4
}
While the amount of references goes back to normal afterwards, I'd like to prevent too many referencing operations for performance reasons. I'm not sure where all these copy operations come from.
Is there any way to achieve this with less copies of my smart pointer object?
Update: Compiler is Visual Studio 2010.

std::function probably won't be as fast as a custom functor until compilers implement some serious special treatment of the simple cases.
But the reference-counting problem is symptomatic of copying when move is appropriate. As others have noted in the comments, MSVC doesn't properly implement move. The usage you've described requires only moving, not copying, so the reference count should never be touched.
If you can, try compiling with GCC and see if the issue goes away.

Converting to a std::function should only make a move of the lambda. If this isn't what's done, then there's arguably a bug in the implementation or specification of std::function. In addition, in your above code, I can only see two copies of the original c, one to create the lambda and another to create the std::function from it. I don't see where the extra copy is coming from.

Related

Write overloads for const reference and rvalue reference

Recently I find myself often in the situation of having a single function that takes some object as a parameter. The function will have to copy that object.
However the parameter for that function may also quite frequently be a temporary and thus I want to also provide an overload of that function that takes an rvalue reference instead a const reference.
Both overloads tend to only differ in that they have different types of references as argument types. Other than that they are functionally equivalent.
For instance consider this toy example:
void foo(const MyObject &obj) {
globalVec.push_back(obj); // Makes copy
}
void foo(MyObject &&obj) {
globalVec.push_back(std::move(obj)); // Moves
}
Now I was wondering whether there is a way to avoid this code-duplication by e.g. implementing one function in terms of the other.
For instance I was thinking of implementing the copy-version in terms of the move-one like this:
void foo(const MyObject &obj) {
MyObj copy = obj;
foo(std::move(copy));
}
void foo(MyObject &&obj) {
globalVec.push_back(std::move(obj)); // Moves
}
However this still does not seem ideal since now there is a copy AND a move operation happening when calling the const ref overload instead of a single copy operation that was required before.
Furthermore, if the object does not provide a move-constructor, then this would effectively copy the object twice (afaik) which defeats the whole purpose of providing these overloads in the first place (avoiding copies where possible).
I'm sure one could hack something together using macros and the preprocessor but I would very much like to avoid involving the preprocessor in this (for readability purposes).
Therefore my question reads: Is there a possibility to achieve what I want (effectively only implementing the functionality once and then implement the second overload in terms of the first one)?
If possible I would like to avoid using templates instead.
My opinion is that understanding (truly) how std::move and std::forward work, together with what their similarities and their differences are is the key point to solve your doubts, so I suggest that you read my answer to What's the difference between std::move and std::forward, where I give a very good explanation of the two.
In
void foo(MyObject &&obj) {
globalVec.push_back(obj); // Moves (no, it doesn't!)
}
there's no move. obj is the name of a variable, and the overload of push_back which will be called is not the one which will steal reasources out of its argument.
You would have to write
void foo(MyObject&& obj) {
globalVec.push_back(std::move(obj)); // Moves
}
if you want to make the move possible, because std::move(obj) says look, I know this obj here is a local variable, but I guarantee you that I don't need it later, so you can treat it as a temporary: steal its guts if you need.
As regards the code duplication you see in
void foo(const MyObject &obj) {
globalVec.push_back(obj); // Makes copy
}
void foo(MyObject&& /*rvalue reference -> std::move it */ obj) {
globalVec.push_back(std::move(obj)); // Moves (corrected)
}
what allows you to avoid it is std::forward, which you would use like this:
template<typename T>
void foo(T&& /* universal/forwarding reference -> std::forward it */ obj) {
globalVec.push_back(std::forward<T>(obj)); // moves conditionally
}
As regards the error messages of templates, be aware that there are ways to make things easier. for instance, you could use static_asserts at the beginning of the function to enfornce that T is a specific type. That would certainly make the errors more understandable. For instance:
#include <type_traits>
#include <vector>
std::vector<int> globalVec{1,2,3};
template<typename T>
void foo(T&& obj) {
static_assert(std::is_same_v<int, std::decay_t<T>>,
"\n\n*****\nNot an int, aaarg\n*****\n\n");
globalVec.push_back(std::forward<T>(obj));
}
int main() {
int x;
foo(x);
foo(3);
foo('c'); // errors at compile time with nice message
}
Then there's SFINAE, which is harder and I guess beyond the scope of this question and answer.
My suggestion
Don't be scared of templates and SFINAE! They do pay off :)
There's a beautiful library that leverages template metaprogramming and SFINAE heavily and successfully, but this is really off-topic :D
A simple solution is:
void foo(MyObject obj) {
globalVec.push_back(std::move(obj));
}
If caller passes an lvalue, then there is a copy (into the parameter) and a move (into the vector). If caller passes an rvalue, then there are two moves (one into parameter and another into vector). This can potentially be slightly less optimal compared to the two overloads because of the extra move (slightly compensated by the lack of indirection) but in cases where moves are cheap, this is often a decent compromise.
Another solution for templates is std::forward explored in depth in Enlico's answer.
If you cannot have a template and the potential cost of a move is too expensive, then you just have to be satisfied with some extra boilerplate of having two overloads.

C++11 best practice to use rvalue

I am new to C++11. In fact until recently, I programmed only using dynamic allocation, in a way similar to Java, e.g.
void some_function(A *a){
a->changeInternalState();
}
A *a = new A();
some_function(a);
delete a;
// example 2
some_function( new A() ); // suppose there is **no** memory leak.
Now I want to reproduce similar code with C++11, but without pointers.
I need to be able to pass newly created class class A directly to function useA(). There seems to be a problem if I want to do so with non-const normal reference and It works if I do it with rvalue reference.
Here is the code:
#include <stdio.h>
class A{
public:
void print(){
++p; // e.g. change internal state
printf("%d\n", p);
}
int p;
};
// normal reference
void useA(A & x){
x.print();
}
// rvalue reference
void useA(A && x){
useA(x);
}
int main(int argc, char** argv)
{
useA( A{45} ); // <--- newly created class
A b{20};
useA(b);
return 0;
}
It compiles and executes correctly, but I am not sure, if this is the correct acceptable way to do the work?
Are there some best practices for this kind of operations?
Normally you would not design the code so that a temporary object gets modified. Then you would write your print function as:
void useA(A const & x){
x.print();
}
and declare A::print as const. This binds to both rvalues and lvalues. You can use mutable for class member variables which might change value but without the object logically changing state.
Another plan is to keep just A &, but write:
{ A temp{45}; useA(temp); }
If you really do want to modify a temporary object, you can write the pair of lvalue and rvalue overloads as you have done in your question. I believe this is acceptable practice for that case.
The best thing about C++11 move semantics is that most of the time, you get them "for free" without having to explicitly add any &&s or std::move()s in your code. Usually, you only need to use these things explicitly if you're writing code that does manual memory management, such as the implementation of a smart pointer or a container class, where you would have had to write a custom destructor and copy constructor anyway.
In your example, A is just an int. For ints, a move is no different from a copy, because there's no opportunity for optimization even if the int happens to be a disposable temporary. Just provide a single useA() function that takes an ordinary reference. It'll have the same behavior.

Eliminating copying of function parameter

I am writing a custom memory allocator. If possible, I want to make object creation function like this to abstract creation procedure completely.
template<typename T>
class CustomCreator
{
virtual T& createObject(T value) __attribute__((always_inline))
{
T* ptr = (T*)customAlloc();
new (ptr) T(value);
return *ptr;
}
}
But this causes copying. Is there a way to force to eliminate copying in this case?
Here's my current testing code.
#include <iostream>
struct AA
{
inline static void test(AA aa) __attribute__((always_inline))
{
AA* ptr = new AA(aa);
delete ptr;
}
AA(AA const& aa)
{
printf("COPY!\n");
}
AA(int v)
{
printf("CTOR!\n");
}
};
int main(int argc, const char * argv[])
{
AA::test(AA(77));
return 0;
}
I tried to pass the value as T&, T const&, T&&, T const&&, but it still copies.
I expected optimizer will eliminate function frame, so the function parameter can be deduced into an R-value, but I still see COPY! message.
I also tried C++11 forwarding template, but I couldn't use it because it cannot be a virtual function.
My compiler is Clang included in Xcode. (clang-425.0.28) And optimization level is set to -Os -flto.
Update (for later reference)
I wrote extra tests, and checked generated LLVM IR with clang -Os -flto -S -emit-llvm -std=c++11 -stdlib=libc++ main.cpp; options. What I could observe are: (1) function frames always can be get eliminated. (2) if large object (4096 byte) passed by value, it didn't get eliminated. (3) Passing r-value reference using std::move works well. (4) if I don't make move-constructor, compiler mostly fall back to copy-constructor.
First:
You expected the optimizer to eliminate the function frame, but it can't, because (1) that function isn't a valid case for RVO to cheat and skip the copy constructor entirely, and (2) Your function has side effects. Namely, writing copy to the screen. So the optimizer can't optimize out the copy, because the code says it must write copy out twice. What I would do is remove the printf call, and check the assembly. Most likely, if it's simple, it is being optimized out.
Second:
If the members copy constructors can have side effects (such as allocating memory), then most likely you're right that it isn't being optimized out. However, you can tell it to move members rather than copying them, which is probably what you expected. For this to work you'll need to make sure your class has a move constructor alongside your copy constructor.
inline static void test(AA aa) __attribute__((always_inline))
{
AA* ptr = new AA(std::move(aa));
delete ptr;
}
Third:
In reality, even that isn't really what you want. What you probably want is something more like perfect forwarding, which will just directly pass the parameters to the constructor without making copies of ANYTHING. This means even with the optimizer disabled entirely, your code is still avoiding copies. (Note that this may not apply to your situation, templates cannot be virtual like this, unless it's specialized)
template<class ...types>
inline static void test(types&&... values) __attribute__((always_inline))
{
AA* ptr = new AA(std::forward<types>(values)...);
delete ptr;
}
Isn't your copy-constructor explicitly called in line AA* ptr = new AA(aa);? If you want to avoid copying initializing value, you should do it like this:
#include <iostream>
#include <utility>
int main(int argc, const char * argv[])
{
struct
AA
{ inline static void test(AA aa) __attribute__((always_inline))
{
AA* ptr = new AA(std::move(aa));
delete ptr;
}
inline static void test(AA&& aa) __attribute__((always_inline))
{
AA* ptr = new AA(std::move(aa));
delete ptr;
}
AA(AA const& aa)
{
printf("COPY!\n");
}
AA(AA const&& aa) // Move constructors are not default here - you have to declare one
{
printf("MOVE!\n");
}
AA(int v)
{
printf("CTOR!\n");
xx = v;
}
int xx = 55;
};
AA::test(AA(77));
return 0;
}
Problems:
Conditions allowing default move constructor are quite tight, you are better off writing one yourself.
If you pass value along (even a temporary one) you still need std::move to to pass it even further without creating a copy.
First, let's solve your A&& move problem. You can get a move constructor by making a move constructor (or declaring it default) (or letting the compiler generate one), and making sure you call std::move on aa when you pass it through: Live example here.
The same will apply to any virtual instance function you are working with. I've also included the forwarding variadic version in the example as well, just in case you'd like a reference on how to do that as well (notice all the arguments are std::forwarded to their next calls).
In either case, this seems like a question spawned from a serious case of XY, that is: you're trying to solve a problem you've created with your current design. I don't know why you need to create objects through instance methods on a pre-existing object (factory-type creation?), but it sounds messy. In either case, the above should help you out enough to get your going. Just make sure if you're moving parameters, to always wrap the && item with a std::move (or std::forward if you're working with universal references and template parameters).

Is it possible to take a parameter by const reference, while banning conversions so that temporaries aren't passed instead?

Sometimes we like to take a large parameter by reference, and also to make the reference const if possible to advertize that it is an input parameter. But by making the reference const, the compiler then allows itself to convert data if it's of the wrong type. This means it's not as efficient, but more worrying is the fact that I think I am referring to the original data; perhaps I will take it's address, not realizing that I am, in effect, taking the address of a temporary.
The call to bar in this code fails. This is desirable, because the reference is not of the correct type. The call to bar_const is also of the wrong type, but it silently compiles. This is undesirable for me.
#include<vector>
using namespace std;
int vi;
void foo(int &) { }
void bar(long &) { }
void bar_const(const long &) { }
int main() {
foo(vi);
// bar(vi); // compiler error, as expected/desired
bar_const(vi);
}
What's the safest way to pass a lightweight, read-only reference? I'm tempted to create a new reference-like template.
(Obviously, int and long are very small types. But I have been caught out with larger structures which can be converted to each other. I don't want this to silently happen when I'm taking a const reference. Sometimes, marking the constructors as explicit helps, but that is not ideal)
Update: I imagine a system like the following: Imagine having two functions X byVal(); and X& byRef(); and the following block of code:
X x;
const_lvalue_ref<X> a = x; // I want this to compile
const_lvalue_ref<X> b = byVal(); // I want this to fail at compile time
const_lvalue_ref<X> c = byRef(); // I want this to compile
That example is based on local variables, but I want it to also work with parameters. I want to get some sort of error message if I'm accidentally passing a ref-to-temporary or a ref-to-a-copy when I think I'll passing something lightweight such as a ref-to-lvalue. This is just a 'coding standard' thing - if I actually want to allow passing a ref to a temporary, then I'll use a straightforward const X&. (I'm finding this piece on Boost's FOREACH to be quite useful.)
Well, if your "large parameter" is a class, the first thing to do is ensure that you mark any single parameter constructors explicit (apart from the copy constructor):
class BigType
{
public:
explicit BigType(int);
};
This applies to constructors which have default parameters which could potentially be called with a single argument, also.
Then it won't be automatically converted to since there are no implicit constructors for the compiler to use to do the conversion. You probably don't have any global conversion operators which make that type, but if you do, then
If that doesn't work for you, you could use some template magic, like:
template <typename T>
void func(const T &); // causes an undefined reference at link time.
template <>
void func(const BigType &v)
{
// use v.
}
If you can use C++11 (or parts thereof), this is easy:
void f(BigObject const& bo){
// ...
}
void f(BigObject&&) = delete; // or just undefined
Live example on Ideone.
This will work, because binding to an rvalue ref is preferred over binding to a reference-to-const for a temporary object.
You can also exploit the fact that only a single user-defined conversion is allowed in an implicit conversion sequence:
struct BigObjWrapper{
BigObjWrapper(BigObject const& o)
: object(o) {}
BigObject const& object;
};
void f(BigObjWrapper wrap){
BigObject const& bo = wrap.object;
// ...
}
Live example on Ideone.
This is pretty simple to solve: stop taking values by reference. If you want to ensure that a parameter is addressable, then make it an address:
void bar_const(const long *) { }
That way, the user must pass a pointer. And you can't get a pointer to a temporary (unless the user is being terribly malicious).
That being said, I think your thinking on this matter is... wrongheaded. It comes down to this point.
perhaps I will take it's address, not realizing that I am, in effect, taking the address of a temporary.
Taking the address of a const& that happens to be a temporary is actually fine. The problem is that you cannot store it long-term. Nor can you transfer ownership of it. After all, you got a const reference.
And that's part of the problem. If you take a const&, your interface is saying, "I'm allowed to use this object, but I do not own it, nor can I give ownership to someone else." Since you do not own the object, you cannot store it long-term. This is what const& means.
Taking a const* instead can be problematic. Why? Because you don't know where that pointer came from. Who owns this pointer? const& has a number of syntactic safeguards to prevent you from doing bad things (so long as you don't take its address). const* has nothing; you can copy that pointer to your heart's content. Your interface says nothing about whether you are allowed to own the object or transfer ownership to others.
This ambiguity is why C++11 has smart pointers like unique_ptr and shared_ptr. These pointers can describe real memory ownership relations.
If your function takes a unique_ptr by value, then you now own that object. If it takes a shared_ptr, then you now share ownership of that object. There are syntactic guarantees in place that ensure ownership (again, unless you take unpleasant steps).
In the event of your not using C++11, you should use Boost smart pointers to achieve similar effects.
You can't, and even if you could, it probably wouldn't help much.
Consider:
void another(long const& l)
{
bar_const(l);
}
Even if you could somehow prevent the binding to a temporary as input to
bar_const, functions like another could be called with the reference
bound to a temporary, and you'd end up in the same situation.
If you can't accept a temporary, you'll need to use a reference to a
non-const, or a pointer:
void bar_const(long const* l);
requires an lvalue to initialize it. Of course, a function like
void another(long const& l)
{
bar_const(&l);
}
will still cause problems. But if you globally adopt the convention to
use a pointer if object lifetime must extend beyond the end of the call,
then hopefully the author of another will think about why he's taking
the address, and avoid it.
I think your example with int and long is a bit of a red herring as in canonical C++ you will never pass builtin types by const reference anyway: You pass them by value or by non-const reference.
So let's assume instead that you have a large user defined class. In this case, if it's creating temporaries for you then that means you created implicit conversions for that class. All you have to do is mark all converting constructors (those that can be called with a single parameter) as explicit and the compiler will prevent those temporaries from being created automatically. For example:
class Foo
{
explicit Foo(int bar) { }
};
(Answering my own question thanks to this great answer on another question I asked. Thanks #hvd.)
In short, marking a function parameter as volatile means that it cannot be bound to an rvalue. (Can anybody nail down a standard quote for that? Temporaries can be bound to const&, but not to const volatile & apparently. This is what I get on g++-4.6.1. (Extra: see this extended comment stream for some gory details that are way over my head :-) ))
void foo( const volatile Input & input, Output & output) {
}
foo(input, output); // compiles. good
foo(get_input_as_value(), output); // compile failure, as desired.
But, you don't actually want the parameters to be volatile. So I've written a small wrapper to const_cast the volatile away. So the signature of foo becomes this instead:
void foo( const_lvalue<Input> input, Output & output) {
}
where the wrapper is:
template<typename T>
struct const_lvalue {
const T * t;
const_lvalue(const volatile T & t_) : t(const_cast<const T*>(&t_)) {}
const T* operator-> () const { return t; }
};
This can be created from an lvalue only
Any downsides? It might mean that I accidentally misuse an object that is truly volatile, but then again I've never used volatile before in my life. So this is the right solution for me, I think.
I hope to get in the habit of doing this with all suitable parameters by default.
Demo on ideone

Avoiding need for #define with expression templates

With the following code, "hello2" is not displayed as the temporary string created on Line 3 dies before Line 4 is executed. Using a #define as on Line 1 avoids this issue, but is there a way to avoid this issue without using #define? (C++11 code is okay)
#include <iostream>
#include <string>
class C
{
public:
C(const std::string& p_s) : s(p_s) {}
const std::string& s;
};
int main()
{
#define x1 C(std::string("hello1")) // Line 1
std::cout << x1.s << std::endl; // Line 2
const C& x2 = C(std::string("hello2")); // Line 3
std::cout << x2.s << std::endl; // Line 4
}
Clarification:
Note that I believe Boost uBLAS stores references, this is why I don't want to store a copy. If you suggest that I store by value, please explain why Boost uBLAS is wrong and storing by value will not affect performance.
Expression templates that do store by reference typically do so for performance, but with the caveat they only be used as temporaries
Taken from the documentation of Boost.Proto (which can be used to create expression templates):
Note An astute reader will notice that the object y defined above will be left holding a dangling reference to a temporary int. In the sorts of high-performance applications Proto addresses, it is typical to build and evaluate an expression tree before any temporary objects go out of scope, so this dangling reference situation often doesn't arise, but it is certainly something to be aware of. Proto provides utilities for deep-copying expression trees so they can be passed around as value types without concern for dangling references.
In your initial example this means that you should do:
std::cout << C(std::string("hello2")).s << std::endl;
That way the C temporary never outlives the std::string temporary. Alternatively you could make s a non reference member as others pointed out.
Since you mention C++11, in the future I expect expression trees to store by value, using move semantics to avoid expensive copying and wrappers like std::reference_wrapper to still give the option of storing by reference. This would play nicely with auto.
A possible C++11 version of your code:
class C
{
public:
explicit
C(std::string const& s_): s { s_ } {}
explicit
C(std::string&& s_): s { std::move(s_) } {}
std::string const&
get() const& // notice lvalue *this
{ return s; }
std::string
get() && // notice rvalue *this
{ return std::move(s); }
private:
std::string s; // not const to enable moving
};
This would mean that code like C("hello").get() would only allocate memory once, but still play nice with
std::string clvalue("hello");
auto c = C(clvalue);
std::cout << c.get() << '\n'; // no problem here
but is there a way to avoid this issue without using #define?
Yes.
Define your class as: (don't store the reference)
class C
{
public:
C(const std::string & p_s) : s(p_s) {}
const std::string s; //store the copy!
};
Store the copy!
Demo : http://www.ideone.com/GpSa2
The problem with your code is that std::string("hello2") creates a temporary, and it remains alive as long as you're in the constructor of C, and after that the temporary is destroyed but your object x2.s stills points to it (the dead object).
After your edit:
Storing by reference is dangerous and error prone sometimes. You should do it only when you are 100% sure that the variable reference will never go out of scope until its death.
C++ string is very optimized. Until you change a string value, all will refer to the same string only. To test it, you can overload operator new (size_t) and put a debug statement. For multiple copies of same string, you will see that the memory allocation will happen only once.
You class definition should not be storing by reference, but by value as,
class C {
const std::string s; // s is NOT a reference now
};
If this question is meant for general sense (not specific to string) then the best way is to use dynamic allocation.
class C {
MyClass *p;
C() : p (new MyClass()) {} // just an example, can be allocated from outside also
~C() { delete p; }
};
Without looking at BLAS, expression templates typically make heavy use of temporary objects of types you aren't supposed to even know exists. If Boost is storing references like this within theirs, then they would suffer the same problem you see here. But as long as those temporary objects remain temporary, and the user doesnt store them for later, everything is fine because the temporaries they reference remain alive for as long as the temporary objects do. The trick is you perform a deep copy when the intermediate object is turned into the final object that the user stores. You've skipped this last step here.
In short, it's a dangerous move, which is perfectly safe as long as the user of your library doesn't do anything foolish. I wouldn't recommend making use of it unless you have a clear need, and you're well aware of the consequences. And even then, there might be a better alternative, I've never worked with expression templates in any serious capacity.
As an aside, since you tagged this C++0x, auto x = a + b; seems like it would be one of those "foolish" things users of your code can do to make your optimization dangerous.