Why can't we trivially copy std::function


The reason I ask is that I need to store std::function in a vector, and the in-house vector we have at my company basically calls realloc when it needs more memory. (It is essentially a memcpy; no copy/move operations are involved.)
This means every element we put in our container needs to be trivially copyable.
Here is some code that demonstrates the problematic copy:
void* func1Buffer = malloc(sizeof(std::function<void(int)>));
std::function<void(int)>* func1p = new (func1Buffer) std::function<void(int)>();
std::function<void(int)>* func2p = nullptr;
*func1p = [](int) {};
char func2Buffer[sizeof(*func1p)];
memcpy(&func2Buffer, func1p, sizeof(*func1p));
func2p = (std::function<void(int)>*)(func2Buffer);
// func2p is still valid here
(*func2p)(10);
free(func1Buffer);
// func2p is now invalid, even though the std::function<void(int)> destructor was never triggered
(*func2p)(10);
I understand that the container should support copying/moving its elements in order to store std::function safely.
But I am still very curious about the direct cause of the invalid std::function copy above.
----------------------------------------------------UpdateLine----------------------------------------------------
Updated the code sample.
I have found the direct reason for this failure by debugging our in-house vector further.
The trivially copied std::function still depends on the memory of the original object, so freeing the original memory trashes the badly copied std::function even though the original object was never destroyed.
Thanks to everyone who answered this post. It's all valuable input. :)

The problem is how std::function has to be implemented: it has to manage the lifetime of whatever object it's holding onto. So when you write:
{
std::function<Sig> f = X{};
}
we must invoke the destructor of X when f goes out of scope. Moreover, std::function will [potentially] allocate memory to hold that X so the destructor of f must also [potentially] free that memory.
Now consider what happens when we try to do:
char buffer[100000]; // something big
{
std::function<void()> f = X{};
memcpy(buffer, &f, sizeof(f));
}
(*reinterpret_cast<std::function<void()>*>(buffer))();
At the point we're calling the function "stored" at buffer, the X object has already been destroyed and the memory holding it has been [potentially] freed. Regardless of whether X were TriviallyCopyable, we don't have an X anymore. We have the artist formerly known as an X.
Because it's incumbent upon std::function to manage its own objects, it cannot be TriviallyCopyable even if we added the requirement that all callables it managed were TriviallyCopyable.
To work in your realloc_vector, you either need something like function_ref (or std::function<>*), that is, a type that simply doesn't own any resources, or you need to implement your own version of function that (a) keeps its own storage as a member to avoid allocating memory and (b) is only constructible with TriviallyCopyable callables, so that it itself becomes trivially copyable. Which solution is better depends on what your program is actually doing.
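Option (b) could look roughly like the sketch below. The name TrivialFunction, the 32-byte capacity, and the AddOffset functor are all hypothetical and chosen only for illustration; this is a minimal sketch under those assumptions, not a drop-in replacement for std::function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <type_traits>

// Hypothetical wrapper: inline storage only, trivially copyable callables only.
template <typename Sig, std::size_t Capacity = 32>
class TrivialFunction;

template <typename R, typename... Args, std::size_t Capacity>
class TrivialFunction<R(Args...), Capacity> {
public:
    TrivialFunction() = default;

    template <typename F>
    TrivialFunction(F f) {
        static_assert(std::is_trivially_copyable<F>::value,
                      "callable must be trivially copyable");
        static_assert(std::is_trivially_destructible<F>::value,
                      "callable must be trivially destructible");
        static_assert(sizeof(F) <= Capacity, "callable does not fit in the buffer");
        static_assert(alignof(F) <= alignof(std::max_align_t), "callable is over-aligned");
        ::new (static_cast<void*>(storage_)) F(f);
        // A captureless lambda converts to a plain function pointer.
        invoke_ = [](const void* s, Args... args) -> R {
            return (*static_cast<const F*>(s))(args...);
        };
    }

    R operator()(Args... args) const {
        assert(invoke_ != nullptr && "calling an empty TrivialFunction");
        return invoke_(storage_, args...);
    }

private:
    alignas(std::max_align_t) unsigned char storage_[Capacity] = {};
    R (*invoke_)(const void*, Args...) = nullptr;
};

// No pointer into the object itself and no heap allocation, so memcpy is fine.
static_assert(std::is_trivially_copyable<TrivialFunction<int(int)>>::value,
              "the wrapper itself must be trivially copyable");

// Hypothetical callable used for the demonstration.
struct AddOffset {
    int offset;
    int operator()(int x) const { return x + offset; }
};

int demo_memcpy_roundtrip() {
    TrivialFunction<int(int)> original(AddOffset{5});
    TrivialFunction<int(int)> copy;
    std::memcpy(&copy, &original, sizeof(original));  // what a realloc-based vector does
    return copy(10);
}
```

Calling demo_memcpy_roundtrip() invokes the memcpy'd copy with 10, which adds the stored offset of 5. The price of this design is that anything owning a resource, such as a lambda capturing a std::string, is rejected at compile time.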

But I am still very curious about what is the direct cause of invalid
std::function copy above.
std::function cannot be TriviallyCopyable (or conditionally TriviallyCopyable) because as a generic callable object wrapper it cannot assume that the stored callable is TriviallyCopyable.
Consider implementing your own version of std::function that only supports TriviallyCopyable callable objects (using a fixed buffer for storage), or use a vector of function pointers if applicable in your situation.

To be trivially copyable is something that is inherently related to a given type, not to an object.
Consider the following example:
#include <functional>
#include <string>
#include <type_traits>
int main() {
std::string s = "captured";
auto l = [s](){}; // owns a std::string, so the closure is not trivially copyable
static_assert(not std::is_trivially_copyable<decltype(l)>::value, "!");
std::function<void(void)> f;
bool copyable = std::is_trivially_copyable<decltype(f)>::value;
f = l;
// do something based on the
// fact that f is trivially copyable
}
How could you enforce the property once you have assigned to the function a lambda that is not trivially copyable?
What you are looking for would be runtime machinery that makes a decision based on the actual object assigned to the function.
This is not how std::is_trivially_copyable works.
Therefore the compiler has to make a decision at compile time for the given specialization of std::function. Because it is a generic container for callable objects, and you can assign it trivially copyable objects as well as objects that aren't trivially copyable, the only safe answer is that std::function itself is not trivially copyable.
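To make the "property of the type, not of the object" point concrete, here is a small sketch. PlainCallable and StringCallable are hypothetical names used only for illustration; the result for std::function is what mainstream standard libraries give, since it is not spelled out as a direct guarantee in the standard.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <type_traits>

// Hypothetical callables: one is plain data, the other owns a std::string.
struct PlainCallable {
    int offset;
    int operator()(int x) const { return x + offset; }
};

struct StringCallable {
    std::string prefix;
    int operator()(int x) const { return x + static_cast<int>(prefix.size()); }
};

// The trait is answered per type, at compile time.
constexpr bool plain_is_trivial = std::is_trivially_copyable<PlainCallable>::value;
constexpr bool string_is_trivial = std::is_trivially_copyable<StringCallable>::value;
constexpr bool function_is_trivial =
    std::is_trivially_copyable<std::function<int(int)>>::value;

// One and the same std::function type can hold either callable at run time,
// so its own trait cannot depend on which one it currently holds.
int call_both() {
    std::function<int(int)> f = PlainCallable{1};
    int a = f(1);
    f = StringCallable{"abc"};
    int b = f(1);
    return a + b;
}
```

The first flag is true and the other two are false on the common implementations, regardless of what a particular std::function object happens to store.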

A std::function might allocate memory for captured variables. As with any other class which allocates memory, it's not trivially copyable.

Related

How can unique_ptr have no overhead if it needs to store the deleter?

First take a look at what C++ Primer said about unique_ptr and shared_ptr:
§16.1.6. Efficiency and Flexibility
We can be certain that shared_ptr does not hold the deleter as a direct member, because the type of the deleter isn’t known until run time.
Because the type of the deleter is part of the type of a unique_ptr, the type of the deleter member is known at compile time. The deleter can be stored directly in each unique_ptr object.
So it seems that shared_ptr does not have the deleter as a direct member, but unique_ptr does. However, the top-voted answer to another question says:
If you provide the deleter as template argument (as in unique_ptr) it is part of the type and you don't need to store anything additional in the objects of this type. If deleter is passed as constructor's argument (as in shared_ptr) you need to store it in the object. This is the cost of additional flexibility, since you can use different deleters for the objects of the same type.
The two quoted paragraphs seem to conflict completely, which confuses me. What's more, many people say unique_ptr is zero-overhead because it doesn't need to store the deleter as a member. However, as we know, unique_ptr has a constructor of the form unique_ptr<obj,del> p(new obj,fcn), which means that we can pass a deleter to it, so unique_ptr seems to store the deleter as a member. What a mess!

std::unique_ptr<T> is quite likely to be zero-overhead (with any sane standard-library implementation). std::unique_ptr<T, D>, for an arbitrary D, is not in general zero-overhead.
The reason is simple: Empty-Base Optimisation can be used to eliminate storage of the deleter in case it's an empty (and thus stateless) type (such as std::default_delete instantiations).

The key phrase which seems to confuse you is "The deleter can be stored directly". But there's no point in storing a deleter of type std::default_delete<T>. If you need one, you can just create one as std::default_delete<T>{}.
In general, stateless deleters do not need to be stored, as you can create them on demand.
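One way to see this is to compare object sizes. The sketch below assumes a mainstream standard library (libstdc++, libc++, or MSVC's); the standard does not mandate these sizes, and FreeDeleter is a hypothetical stateless deleter introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical stateless deleter: an empty class with no data members.
struct FreeDeleter {
    void operator()(void* p) const noexcept { std::free(p); }
};

constexpr std::size_t raw_size = sizeof(int*);
// Default deleter (std::default_delete<int>) is an empty type.
constexpr std::size_t default_size = sizeof(std::unique_ptr<int>);
// A user-written empty deleter typically costs nothing either.
constexpr std::size_t stateless_size = sizeof(std::unique_ptr<int, FreeDeleter>);
// A function-pointer deleter has state (the pointer value), so it must be stored.
constexpr std::size_t fnptr_size = sizeof(std::unique_ptr<int, void (*)(void*)>);
```

On the common implementations the first three sizes are equal, while the function-pointer variant is larger because the deleter's value has to live inside each unique_ptr.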

Angew's answer explained pretty thoroughly what's going on.
For those curious, here is how things could look under the covers:
template<typename T, typename D, bool Empty = std::is_empty_v<D>>
class unique_ptr
{
T* ptr;
D d;
// ...
};
template<typename T, typename D>
class unique_ptr<T, D, true> : D
{
T* ptr;
// ...
};
The partial specialization handles empty deleters and takes advantage of the empty base optimization.

Brief intro:
unique_ptr can introduce some small overhead, not because of the deleter, but because when you move from it, the source must be set to null, whereas with raw pointers you could leave the old pointer in a bug-prone but legitimate state where it still points to where it pointed before. A smart optimizer can obviously remove this, but that is not guaranteed.
Back to the deleter:
The other answers are correct, but elaborate. So here is the simplified version without mention of EBO or other complicated terms.
If the deleter is empty (has no state), you do not need to keep it inside the unique_ptr. If you need it, you can just construct it on the spot. All you need to know is the deleter type (and that is one of the template arguments of unique_ptr).
For example, consider the following code, which also demonstrates simple on-demand creation of a stateless object.
#include <iostream>
#include <string>
#include <string_view>
#include <type_traits>
template<typename Person>
struct Greeter{
void greet(){
static_assert(std::is_empty_v<Person>, "Person must be stateless");
Person p; // Stateless Person instance constructed on demand
std::cout << "Hello " << p() << std::endl;
}
// ... and not kept as a member.
};
struct Bjarne{
std::string_view operator()(){
return "Bjarne";
}
};
int main() {
Greeter<Bjarne> hello_bjarne;
hello_bjarne.greet();
}

c++ type trait to say "trivially movable" - examples of

I would define "trivially movable" by
Calling the move constructor (or the move assignment operator) is
equivalent to memcpy the bytes to the new destination and not calling
the destructor on the moved-from object.
For instance, if you know that this property holds, you can use realloc to resize a std::vector or a memory pool.
Types failing this would typically have pointers to their contents that needs to be updated by the move constructor/assignment operator.
There is no such type trait in the standard that I can find.
I am wondering whether this already has a (better) name, whether it's been discussed and whether there are some libraries making use of such a trait.
Edit 1:
From the first few comments, std::is_trivially_move_constructible and std::is_trivially_move_assignable are not equivalent to what I am looking for.
I believe they would give true for types containing pointers to themselves, since reading your own member seems to fall under "trivial" operation.
Edit 2:
When properly implemented, types which point to themselves won't be trivially_move_constructible or move_assignable because the move ctor / move assignment operator are not trivial anymore.
Though, we ought to be able to say that unique_ptr can be safely copied to a new location provided we don't call its destructor.

I think what you need is std::is_trivially_relocatable from proposal P1144. Unfortunately the proposal didn't make it into C++20, so we shouldn't expect it before 2023. Which is sad, because this type trait would enable great optimizations for std::vector and similar types.

Well, this got me thinking... It is very important to overload type traits of structs that hold a pointer to themselves.
The following code demonstrates how quickly a bug can creep into code when type traits do not reflect what a type actually needs.
#include <memory>
#include <type_traits>
struct A
{
int a;
int b;
int* p{&a};
};
int main()
{
auto p = std::make_unique<A>();
A a = std::move(*p.get()); // the implicit move copies p, so a.p points at p->a, not at a.a; it dangles once *p is destroyed
return std::is_move_constructible<A>::value; // <-- yet, this returns true.
}
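For contrast, here is a minimal sketch of the same kind of type with the copy operations written out so that the pointer always refers to the object's own member. SelfRef is a hypothetical name; once the constructor is user-provided, the standard traits correctly report that the type is neither trivially copyable nor trivially move constructible.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical self-referential type that keeps p pointing at its own member.
struct SelfRef {
    int a = 0;
    int b = 0;
    int* p = &a;

    SelfRef() = default;
    // Copy the values but re-point p at this object's own a.
    SelfRef(const SelfRef& other) : a(other.a), b(other.b), p(&a) {}
    SelfRef& operator=(const SelfRef& other) {
        a = other.a;
        b = other.b;
        return *this;  // p already points at this->a
    }
};

// The user-provided copy constructor makes these traits report false,
// so a memcpy/realloc-based container can reject the type at compile time.
static_assert(!std::is_trivially_copyable<SelfRef>::value, "must not be memcpy'd");
static_assert(!std::is_trivially_move_constructible<SelfRef>::value, "move is not trivial");

bool pointer_follows_object() {
    SelfRef source;
    source.a = 42;
    SelfRef moved = std::move(source);  // no move constructor declared: uses the copy constructor
    return moved.p == &moved.a && *moved.p == 42;
}
```

A container that relocates elements with memcpy could guard itself with a static_assert on std::is_trivially_copyable and would then refuse SelfRef instead of silently corrupting it.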

What is the lifecycle of a C++ object?

I'm a seasoned C developer who is just now getting into C++, and I must admit, I'm very confused about how many ways there are to create, retain, and destroy C++ objects. In C, life is simple: assignment with = copies on the stack, and malloc/free manage data on the heap. C++ is far from that, or so it seems to me.
In light of that, here are my questions:
What are all the ways to create a C++ object? Direct/copy constructor, assignment, etc. How do they work?
What are all the different initialization syntaxes associated with all these types of object creation? What's the difference between T f = x, T f(x);, T f{x};, etc.?
Most importantly, when is it correct to copy/assign/whatever = is in C++, and when do you want to use pointers? In C, I got very used to throwing pointers around a lot, because pointer assignment is cheap but struct copying is less so. How do C++'s copy semantics affect this?
Finally, what are all these things like shared_ptr, weak_ptr, etc.?
I'm sorry if this is a somewhat broad question, but I'm very confused about when to use what (not even mentioning my confusion about memory management in collections and the new operator), and I feel like everything I knew about C memory management breaks down in C++. Is that true, or is my mental model just wrong?
To sum things up: how are C++ objects created, initialized, and destroyed, and when should I use each method?

First of all, your memory management skills are still useful in C++; they just sit one level below the C++ way of doing things, but they are there...
About your questions, they are a bit broad, so I'll try to keep it short:
1) What are all the ways to create a C++ object?
Same as C: they can be global variables, local automatic, local static or dynamic. You may be confused by the constructor, but simply think that every time you create an object, a constructor is called. Always. Which constructor is simply a matter of what parameters are used when creating the object.
Assignment does not create a new object; it simply copies from one object to another (think of memcpy, but smarter).
2) What are all the different initialization syntaxes associated with all these types of object creation? What's the difference between T f = x, T f(x);, T f{x};, etc.?
T f(x) is the classic way, it simply creates an object of type T using the constructor that takes x as argument.
T f{x} is the new C++11 uniform syntax, which can also be used to initialize aggregate types (arrays and such). Apart from rejecting narrowing conversions and preferring initializer_list constructors, it is equivalent to the former.
T f = x depends on whether x is of type T. If it is, then it is equivalent to T f(x); if it is of a different type, then it is equivalent to T f = T(x). Not that it really matters, because the compiler is allowed to optimize away the extra copy (copy elision).
T(x). You forgot this one. A temporary object of type T is created (using the same constructor as above), it is used wherever it appears in the code, and it is destroyed at the end of the current full expression.
T f. This creates a value of type T using the default constructor, if available. That is simply a constructor that takes no parameters.
T f{}. Default constructed, but with the new uniform syntax. Note that T f() is not an object of type T, but instead the declaration of a function returning T!
T(). A temporary object using the default constructor.
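The forms listed above can be put side by side in a short sketch. Probe is a hypothetical type with a default constructor and a converting constructor from int, used only to show which constructor each syntax selects.

```cpp
#include <cassert>

// Hypothetical type used only to exercise each initialization syntax.
struct Probe {
    int value;
    Probe() : value(0) {}       // default constructor
    Probe(int v) : value(v) {}  // converting constructor from int
};

int sum_of_initialized_values() {
    Probe a(1);    // T f(x): direct-initialization
    Probe b{2};    // T f{x}: brace-initialization
    Probe c = 3;   // T f = x: x is not a Probe, so Probe(int) converts it
    Probe d;       // T f: default constructor
    Probe e{};     // T f{}: default constructor, brace syntax
    int t = Probe(4).value;  // T(x): temporary, destroyed at the end of this statement
    // Probe f();  // would declare a function named f returning Probe, not an object
    return a.value + b.value + c.value + d.value + e.value + t;
}
```

The sum is 1 + 2 + 3 + 0 + 0 + 4, which confirms that every form ended up in the expected constructor.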
3) Most importantly, when is it correct to copy/assign/whatever = is in C++, and when do you want to use pointers?
You can use the same approach as in C. Think of copy/assignment as if it were a memcpy. You can also pass references around, though you may want to wait until you feel comfortable with them. What you should do is this: do not use pointers as auxiliary local variables; use references instead.
4) Finally, what are all these things like shared_ptr, weak_ptr, etc.?
They are tools in your C++ tool belt. You will have to learn through experience and some mistakes...
shared_ptr: use it when the ownership of the object is shared.
unique_ptr: use it when the ownership of the object is unique and unambiguous.
weak_ptr: use it to break cycles in structures built from shared_ptr. Such cycles are not detected automatically.
vector. Don't forget this one! Use it to create dynamic arrays of anything.
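As an illustration of the weak_ptr point, the sketch below links two nodes in both directions. Node is a hypothetical type; the back link is a weak_ptr, so the pair does not keep itself alive once the owning handles go out of scope.

```cpp
#include <cassert>
#include <memory>

// Hypothetical doubly linked node: forward link owns, back link only observes.
struct Node {
    std::shared_ptr<Node> next;
    std::weak_ptr<Node> prev;
};

bool cycle_is_released() {
    std::weak_ptr<Node> observer;
    {
        auto first = std::make_shared<Node>();
        auto second = std::make_shared<Node>();
        first->next = second;  // owning link
        second->prev = first;  // non-owning back link, does not raise the use count
        observer = first;      // lets us check later whether first was destroyed
    }
    // Both owning handles are gone; with a shared_ptr back link the two nodes
    // would have kept each other alive and leaked.
    return observer.expired();
}
```

If prev were a shared_ptr instead, observer.expired() would stay false after the inner scope, because each node would still hold a reference to the other.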
PS: You forgot to ask about destructors. IMO, destructors are what gives C++ its personality, so be sure to use a lot of them!

This is a fairly broad question, but I'll give you a starting point.
What's known in C as a "stack variable" is also called an object with "automatic storage". The lifetime of an object with automatic storage is fairly easy to understand: it's created when control reaches the point it's defined, and then destroyed when it goes out of scope:
int main() {
int foo = 5; // creation of automatic storage
do_stuff();
foo = 1;
// end of function; foo is destroyed.
}
Now, a thing to note is that = 5 is considered part of the initialization syntax, while = 1 is considered an assignment operation. I don't want you to get confused by = being used for two different things in the language's grammar.
Anyway, C++ takes automatic storage a bit further and allows arbitrary code to be run during the creation and destruction of that object: the constructors and destructors. This gives rise to the wonderful idiom called RAII, which you should use whenever possible. With RAII, resource management becomes automatic.
what are all these things like shared_ptr, weak_ptr, etc.?
Good examples of RAII. They allow you to treat a dynamic resource (malloc/free calls) as an automatic storage object!
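For a C programmer, the closest illustration may be wrapping malloc/free in a smart pointer. This is a minimal sketch; FreeDeleter, CBuffer, and copy_and_measure are hypothetical names introduced here, not standard facilities.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>

// Hypothetical deleter that releases memory obtained from malloc.
struct FreeDeleter {
    void operator()(void* p) const noexcept { std::free(p); }
};

// A malloc'd buffer that frees itself when it goes out of scope.
using CBuffer = std::unique_ptr<char, FreeDeleter>;

std::size_t copy_and_measure(const char* text) {
    const std::size_t length = std::strlen(text);
    CBuffer buffer(static_cast<char*>(std::malloc(length + 1)));
    if (!buffer) {
        return 0;  // allocation failed; nothing to free
    }
    std::memcpy(buffer.get(), text, length + 1);  // includes the terminating '\0'
    return std::strlen(buffer.get());
}   // buffer's destructor calls std::free here, on every return path
```

There is no explicit free call anywhere in the function, yet the memory is released on both the early return and the normal one.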
Most importantly, when is it correct to copy/assign/whatever = is in C++, and when do you want to use pointers? In C, I got very used to throwing pointers around a lot, because pointer assignment is cheap but struct copying is less so. How do C++'s copy semantics affect this?
const references everywhere, especially for function parameters. const refs avoid copies and prevent modification of the object. If you can't use const ref, chances are a normal reference is suitable. If for some reason you want to reset the reference or set it to null, use a pointer.
What are all the ways to create a C++ object? Direct/copy constructor, assignment, etc. How do they work?
In short, all constructors create objects. Assignment doesn't. Read a book for this.

Besides the explicit ones, C++ has many ways of creating objects implicitly. Almost all of them use the copy constructor of the object's class. Remember: implicit copying may require the copy constructor and/or assignment operator of type T to be publicly accessible, depending on where the copying occurs. So, in order:
a) explicit creation of a brand new object on the stack:
T object(arg);
b) explicit copying of an existing object:
T original(arg);
...
T copy(original);
If class T defines no copy constructor, the compiler generates a default implementation. It copies the passed object member by member. This is not always what the programmer wants, so a custom implementation is sometimes useful.
c) explicit creation of a brand new object on the heap:
T *ptr = new T(arg);
d) implicit creation of a brand new object whose constructor takes only one parameter and has no explicit modifier, for instance:
class T
{
public:
T(int x) : i(x) {}
private:
int i;
};
...
T object = 5; // actually implicit invocation of constructor occurs here
e) implicit copying of an object passed to a function by value:
void func(T input)
{
// here `input` is a copy of an object actually passed
}
...
int main()
{
T object(arg);
func(object); // copy constructor of T class is invoked before the `func` is called
}
f) implicit copying of an exception object when it is caught by value:
void function()
{
...
throw T(arg); // suppose that exception is always raised in the `function`
...
}
...
int main()
{
...
try {
function();
} catch (T exception) { // copy constructor of T class is invoked here
// handling `exception`
}
...
}
g) Assignment to an existing object. Strictly speaking, this does not create a new object: it overwrites one that already exists, so the behaviour depends on the type's assignment operator. If this operator is not implemented, the compiler generates a default one that assigns member by member, mirroring the default copy constructor. Note that T second = first; is an initialization and calls the copy constructor, not the assignment operator.
class T
{
public:
T(int x) : i(x) {}
T& operator=(const T& other)
{
i = other.i; // same effect as the compiler-generated memberwise assignment
return *this;
}
private:
int i;
};
...
int main()
{
...
T first(5);
T second = first; // initialization: the copy constructor is invoked
second = first; // assignment: operator= is invoked
...
}
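The copy cases in the list above can be checked with a small counter. Counted is a hypothetical type that records how often its copy constructor and its assignment operator run; the sketch only illustrates which operation each statement selects.

```cpp
#include <cassert>

// Hypothetical type that counts copy constructions and assignments.
struct Counted {
    static int copies;
    static int assignments;
    Counted() = default;
    Counted(const Counted&) { ++copies; }
    Counted& operator=(const Counted&) { ++assignments; return *this; }
};
int Counted::copies = 0;
int Counted::assignments = 0;

void by_value(Counted) {}             // parameter is a copy of the argument
void by_const_ref(const Counted&) {}  // parameter refers to the argument

int count_copies() {
    Counted::copies = 0;
    Counted::assignments = 0;
    Counted original;
    Counted copy(original);    // copy constructor
    Counted init = original;   // also the copy constructor, despite the '='
    by_value(original);        // copy constructor for the parameter
    by_const_ref(original);    // no copy at all
    init = copy;               // assignment operator, no new object
    return Counted::copies;
}
```

Three copy constructions are counted and exactly one assignment, which shows that = in a declaration and = in an expression statement are different operations.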
Well, that's what I am able to remember without looking into Stroustrup's book. I may have missed something.
An answer was accepted while I was writing this, so I'll stop here. I hope the details I listed are useful.

How to convert an object instance to shared_ptr instance

Suppose I had two shared_ptr types such as
boost::shared_ptr<ObjA> sptrA;
boost::shared_ptr<ObjB> sptrB;
Now suppose that sptrA->SomeMethod() returns a plain ObjB (not a shared_ptr). Is it possible for me to store that object somehow in sptrB, so that the returned instance is automatically converted to a boost::shared_ptr? I would like to write something like this:
sptrB = sptrA->SomeMethod();
I am asking out of curiosity: is this possible or not?

The most standard way of creating boost::shared_ptr objects is to use the make_shared function provided by Boost:
#include <boost/shared_ptr.hpp>
#include <boost/make_shared.hpp>
struct A {};
A generator() {
return A();
}
int main()
{
using namespace boost;
shared_ptr<A> p = make_shared<A>(generator());
return 0;
}
Since the generator() function returns an A object by value, the syntax above implies that a new A is copy-constructed on the heap from the returned object, and that the resulting pointer is wrapped in a shared-pointer object. In other words, make_shared doesn't quite perform a conversion to a shared pointer; instead, it creates a copy of the object on the heap and provides memory management for that copy. This may or may not be what you need.
Note that this is equivalent to what std::make_shared does for std::shared_ptr in C++11.
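The "copy, not conversion" behaviour can be seen in a short sketch. It uses std::make_shared rather than the Boost version so that it needs no external library; Widget and generator are hypothetical names for illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical value type returned by value from a factory function.
struct Widget {
    int value;
};

Widget generator() { return Widget{7}; }

bool shared_copy_is_independent() {
    Widget local = generator();
    // make_shared copy-constructs a new Widget on the heap from local.
    std::shared_ptr<Widget> shared = std::make_shared<Widget>(local);
    shared->value = 99;  // changes only the heap copy
    return local.value == 7 && shared->value == 99 && shared.get() != &local;
}
```

The original object keeps its value after the shared copy is modified, so the shared_ptr manages a separate object rather than the one that was returned.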
One way to provide the convenient syntax you mentioned in your question is to define a conversion operator to shared_ptr<A> for A:
struct A {
operator boost::shared_ptr<A>() {
return boost::make_shared<A>(*this);
}
};
Then you can use it as follows:
shared_ptr<A> p = generator();
This will automatically "convert" the object returned by the function. Again, conversion here really means heap allocation, copying and wrapping in a shared pointer. Therefore, I am not really sure if I'd recommend defining such a convenience conversion operator. It makes the syntax very convenient, but it, as all implicit conversion operators, may also mean that you implicitly cause these "conversions" to happen in places you didn't expect.

Since C++11 you can use the std::make_shared<T>() function.
Example:
int a = 10;
std::shared_ptr<int> shared_a = std::make_shared<int>(a);

This depends on precisely what ObjA::SomeMethod returns - a copy, a reference or a pointer. In the first two cases it would not be feasible to wrap it into a shared_ptr (because shared_ptr needs a pointer).
The third case is possible, but you must proceed with caution. Make sure that once you wrap a pointer to an object into a shared_ptr, no one else attempts to manage the lifetime of that object.
For example, if you return a raw pointer, wrap it into a shared pointer and then, at some point later in the program, someone deletes that same pointer, you will have a problem.
