MSVC exhibits unexpected behavior while copying lambda using memcpy() [duplicate]

I am asking because I need to store std::function objects in a vector, and our company's in-house vector grows by calling realloc when it needs more memory (effectively a memcpy; no copy or move operations are involved).
This means every element we put in the container needs to be trivially copyable.
Here is some code that demonstrates the problematic copy:
void* func1Buffer = malloc(sizeof(std::function<void(int)>));
std::function<void(int)>* func1p = new (func1Buffer) std::function<void(int)>();
std::function<void(int)>* func2p = nullptr;
*func1p = [](int) {};
char func2Buffer[sizeof(*func1p)];
memcpy(&func2Buffer, func1p, sizeof(*func1p));
func2p = (std::function<void(int)>*)(func2Buffer);
// func2p is still valid here
(*func2p)(10);
free(func1Buffer);
// func2p is now invalid, even though the std::function<void(int)> destructor was never triggered
(*func2p)(10);
I understand that elements must support copy/move in order to store std::function safely.
But I am still very curious about the direct cause of the invalid std::function copy above.
----------------------------------------------------UpdateLine----------------------------------------------------
Updated the code sample.
I found the direct reason for this failure by debugging our in-house vector further.
The trivially copied std::function still depends on the original object's memory: freeing that memory trashes the badly copied std::function, even though the original object's destructor never runs.
Thanks to everyone who answered this post. It's all valuable input. :)
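One way to picture the dependency described in the update is a toy model of a small-buffer-optimized wrapper that keeps an internal pointer into its own inline storage. This is only a sketch under that assumption: toy_function and its layout are hypothetical and are not MSVC's actual std::function implementation.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Toy stand-in for a wrapper with small-buffer optimization.
// Hypothetical layout for illustration only, not a real library type.
struct toy_function {
    char inline_storage[16];  // a small callable would live here
    char* target;             // points at inline_storage when the callable is small
    toy_function() : inline_storage{}, target(inline_storage) {}
};

// The toy type is trivially copyable, so memcpy on it is legal and the
// aliasing can be observed without undefined behavior.
static_assert(std::is_trivially_copyable<toy_function>::value,
              "toy type must be memcpy-safe");

// Returns true when the byte-wise copy still points into the original object.
bool copy_aliases_original() {
    toy_function original;
    toy_function copy;
    std::memcpy(&copy, &original, sizeof(original));
    return copy.target == original.inline_storage;
}
```

If a real implementation behaves like this, the copy works only while the original's memory is intact, which would match the behavior seen after free(func1Buffer).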

The problem is how std::function has to be implemented: it has to manage the lifetime of whatever object it's holding onto. So when you write:
{
std::function<Sig> f = X{};
}
we must invoke the destructor of X when f goes out of scope. Moreover, std::function will [potentially] allocate memory to hold that X so the destructor of f must also [potentially] free that memory.
Now consider what happens when we try to do:
char buffer[100000]; // something big
{
std::function<void()> f = X{};
memcpy(buffer, &f, sizeof(f));
}
(*reinterpret_cast<std::function<void()>*>(buffer))();
At the point we're calling the function "stored" at buffer, the X object has already been destroyed and the memory holding it has been [potentially] freed. Regardless of whether X were TriviallyCopyable, we don't have an X anymore. We have the artist formerly known as an X.
Because it's incumbent upon std::function to manage its own objects, it cannot be TriviallyCopyable even if we added the requirement that all callables it managed were TriviallyCopyable.
To work in your realloc_vector, you need either something like function_ref (or std::function<>*), that is, a type that simply doesn't own any resources, or your own version of function that (a) keeps its own storage as a member to avoid allocating memory and (b) is only constructible with TriviallyCopyable callables, so that it itself becomes trivially copyable. Which solution is better depends on what your program is actually doing.
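The second option could look roughly like the sketch below. The name trivial_function, the 32-byte default capacity, and the add_k functor are all assumptions made for illustration; this is a minimal outline rather than a production-ready replacement for std::function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <type_traits>
#include <utility>

template <class Sig, std::size_t Cap = 32>
class trivial_function;  // hypothetical name

template <class R, class... Args, std::size_t Cap>
class trivial_function<R(Args...), Cap> {
    alignas(std::max_align_t) unsigned char storage_[Cap];  // (a) callable lives inline
    R (*invoke_)(const void*, Args...);                     // type-erased call thunk

    template <class F>
    static R call(const void* p, Args... args) {
        return (*static_cast<const F*>(p))(std::forward<Args>(args)...);
    }

public:
    template <class F>
    trivial_function(F f) : storage_{}, invoke_(&call<F>) {
        // (b) only accept callables that are themselves safe to memcpy
        static_assert(std::is_trivially_copyable<F>::value,
                      "callable must be trivially copyable");
        static_assert(std::is_trivially_destructible<F>::value,
                      "callable must not need a destructor");
        static_assert(sizeof(F) <= Cap, "callable does not fit the inline buffer");
        static_assert(alignof(F) <= alignof(std::max_align_t),
                      "callable is over-aligned");
        ::new (static_cast<void*>(storage_)) F(f);
    }

    R operator()(Args... args) const {
        return invoke_(storage_, std::forward<Args>(args)...);
    }
};

// Plain functor used for the demonstration (hypothetical).
struct add_k {
    int k;
    int operator()(int x) const { return x + k; }
};

static_assert(std::is_trivially_copyable<trivial_function<int(int)>>::value,
              "wrapper itself is trivially copyable");

// Overwrites one wrapper with the bytes of another, then calls the copy.
int call_memcpy_copy() {
    trivial_function<int(int)> original(add_k{5});
    trivial_function<int(int)> copy(add_k{0});
    std::memcpy(&copy, &original, sizeof(original));
    return copy(10);
}
```

Because the wrapper never allocates and stores no pointer into itself, the memcpy'ed copy does not depend on the original. The cost is a fixed capacity and the restriction on which callables are accepted.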

But I am still very curious about what is the direct cause of invalid
std::function copy above.
std::function cannot be TriviallyCopyable (or conditionally TriviallyCopyable) because as a generic callable object wrapper it cannot assume that the stored callable is TriviallyCopyable.
Consider implementing your own version of std::function that only supports TriviallyCopyable callable objects (using a fixed buffer for storage), or use a vector of function pointers if applicable in your situation.
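The function-pointer alternative can be sketched as follows. The alias handler and the functions twice and call_after_realloc are hypothetical names; the point is only that a plain function pointer survives realloc, because it owns nothing.

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <type_traits>

using handler = int (*)(int);  // hypothetical alias for the element type

static_assert(std::is_trivially_copyable<handler>::value,
              "function pointers are trivially copyable");
static_assert(!std::is_trivially_copyable<std::function<int(int)>>::value,
              "std::function is not");

int twice(int x) { return 2 * x; }

// Grows a malloc'ed array of handlers with realloc, then calls two elements.
// Returns -1 if an allocation fails.
int call_after_realloc() {
    handler* data = static_cast<handler*>(std::malloc(sizeof(handler)));
    if (!data) return -1;
    data[0] = &twice;

    handler* grown = static_cast<handler*>(std::realloc(data, 64 * sizeof(handler)));
    if (!grown) {
        std::free(data);
        return -1;
    }
    grown[1] = [](int x) { return x + 1; };  // captureless lambda converts to a pointer

    int result = grown[0](21) + grown[1](0);
    std::free(grown);
    return result;
}
```

This only works for free functions and captureless lambdas; anything that captures state needs a wrapper that stores that state somewhere.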

To be trivially copyable is something that is inherently related to a given type, not to an object.
Consider the following example:
#include<type_traits>
#include<functional>
int main() {
auto l = [](){};
static_assert(not std::is_trivially_copyable<decltype(l)>::value, "!");
std::function<void(void)> f;
bool copyable = std::is_trivially_copyable<decltype(f)>::value;
f = l;
// do something based on the
// fact that f is trivially copyable
}
How could you enforce the property once you have assigned to the function a lambda that is not trivially copyable?
What you are looking for would be runtime machinery that makes a decision based on the actual object assigned to the function.
This is not how std::is_trivially_copyable works.
Therefore the compiler has to make the decision at compile time for the given specialization of std::function. Since std::function is a generic container for callable objects, and you can assign it trivially copyable objects as well as objects that aren't, it cannot be trivially copyable itself.

A std::function might allocate memory for captured variables. As with any other class which allocates memory, it's not trivially copyable.

Related

Is memcpy with this pointer safe?

I'm currently writing my own string implementation in C++ (just as an exercise).
At the moment I have this copy constructor:
// "obj" has the same type of *this, it's just another string object
string_base<T>(const string_base<T> &obj)
: len(obj.length()), cap(obj.capacity()) {
raw_data = new T[cap];
for (unsigned i = 0; i < cap; i++)
raw_data[i] = obj.data()[i];
raw_data[len] = 0x00;
}
and I wanted to improve performance a little. So I came up with the idea of using memcpy() to just copy obj into *this.
Like this:
// "obj" has the same type of *this, it's just another string object
string_base<T>(const string_base<T> &obj) {
memcpy(this, &obj, sizeof(string_base<T>));
}
Is it safe to overwrite the data of *this like that, or could this cause problems?
Thanks in advance!

No, it's not safe. From cppreference.com:
If the objects are not TriviallyCopyable, the behavior of memcpy is not specified and may be undefined.
Your class is not TriviallyCopyable, since its copy constructor is user-provided.
Moreover, your memcpy-based copy constructor would make only shallow copies (which might be fine if you wanted, e.g., a copy-on-write mechanism for your strings).

It will produce problems. All references and pointers will simply be copied, including the raw_data pointer, which will then point to the same buffer as in the source object.
To be usable with memcpy, your class should:
Be Trivially Copyable
Have no references or pointers, unless they are static or do not own the pointed-to data. The exception is when you know what you are doing, such as implementing a copy-on-write mechanism, or when this behaviour is otherwise intended and managed.

As others have said, in order for memcpy to work correctly, the objects being copied must be trivially copyable. For an arbitrary type like T in a template, you can't be sure of that. Sure, you can check for it, but it's much easier to let somebody else do the checking. Instead of writing that loop and tweaking it, use std::copy_n. It will use memcpy when that's appropriate, and element-by-element copying when it isn't. So change
raw_data = new T[cap];
for (unsigned i = 0; i < cap; i++)
raw_data[i] = obj.data()[i];
to
raw_data = new T[cap];
std::copy_n(obj.data(), cap, raw_data);
This also has the slight advantage of not evaluating obj.data() on every pass through the loop, which is an optimization that your compiler might or might not apply.

It seems pretty obvious from the limited fragment that raw_data is a member, and a pointer to a new[]'ed array of T. If you memcpy the object, you memcpy this pointer. You don't copy the array.
Look at your destructor. It probably calls delete[], unconditionally. It has no idea how many copies exist. That means it's calling delete[] too often. This is fixable: you'd need something similar to shared_ptr. That's not at all trivial; you have to worry about thread safety of that share count. And obviously you can't just memcpy the object, as that would not update the share count.
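For comparison, a deep-copying version that avoids the double delete[] might look like the sketch below. simple_string is a hypothetical, non-template stand-in for the string_base<T> in the question, with member names chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for string_base<char>: every object owns its own buffer.
class simple_string {
    std::size_t len_;
    char* raw_data_;

public:
    explicit simple_string(const char* s)
        : len_(std::strlen(s)), raw_data_(new char[len_ + 1]) {
        std::copy_n(s, len_ + 1, raw_data_);  // includes the terminating 0
    }

    // Deep copy: allocate a fresh buffer, then copy the characters into it.
    simple_string(const simple_string& obj)
        : len_(obj.len_), raw_data_(new char[obj.len_ + 1]) {
        std::copy_n(obj.raw_data_, len_ + 1, raw_data_);
    }

    simple_string& operator=(const simple_string& obj) {
        if (this != &obj) {
            char* fresh = new char[obj.len_ + 1];
            std::copy_n(obj.raw_data_, obj.len_ + 1, fresh);
            delete[] raw_data_;
            raw_data_ = fresh;
            len_ = obj.len_;
        }
        return *this;
    }

    // Safe to delete unconditionally, because no two objects share a buffer.
    ~simple_string() { delete[] raw_data_; }

    const char* data() const { return raw_data_; }
    std::size_t length() const { return len_; }
};
```

Here memcpy-like speed is still available for the character data (std::copy_n may lower to it for char), while the object itself is copied through its constructor so the pointer is never shared.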

c++ type trait to say "trivially movable" - examples of

I would define "trivially movable" by
Calling the move constructor (or the move assignment operator) is
equivalent to memcpy-ing the bytes to the new destination and not calling
the destructor on the moved-from object.
For instance, if you know that this property holds, you can use realloc to resize a std::vector or a memory pool.
Types failing this would typically have pointers to their own contents that need to be updated by the move constructor/assignment operator.
I cannot find any such type trait in the standard.
I am wondering whether this already has a (better) name, whether it has been discussed, and whether there are libraries making use of such a trait.
Edit 1:
From the first few comments, std::is_trivially_move_constructible and std::is_trivially_move_assignable are not equivalent to what I am looking for.
I believe they would give true for types containing pointers to themselves, since reading your own member seems to fall under "trivial" operation.
Edit 2:
When properly implemented, types which point to themselves won't be trivially_move_constructible or move_assignable because the move ctor / move assignment operator are not trivial anymore.
Though, we ought to be able to say that unique_ptr can be safely copied to a new location provided we don't call its destructor.

I think what you need is std::is_trivially_relocatable from proposal P1144. Unfortunately the proposal didn't make it into C++20, so we shouldn't expect it before 2023. Which is sad, because this type trait would enable great optimizations for std::vector and similar types.
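Until such a trait is standardized, a container can use its own opt-in trait. The sketch below is an assumption-laden illustration: is_realloc_safe and self_pointing are hypothetical names, and relocating a std::unique_ptr with memcpy is not sanctioned by the current standard even though the opt-in expresses the intent described in the question.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>

// Hypothetical user-defined trait; not part of the standard library.
// Default: a type may be moved with realloc if it is trivially copyable.
template <class T>
struct is_realloc_safe : std::is_trivially_copyable<T> {};

// Opt-in (assumption): unique_ptr with the default deleter holds no pointer
// to itself, so moving its bytes without running the old destructor is
// treated as acceptable by this trait.
template <class T>
struct is_realloc_safe<std::unique_ptr<T>> : std::true_type {};

// A type that points into itself. The language still calls it trivially
// copyable, so the author has to opt out explicitly.
struct self_pointing {
    int a = 0;
    int* p = &a;
};
static_assert(std::is_trivially_copyable<self_pointing>::value,
              "the built-in trait does not notice the self-pointer");

template <>
struct is_realloc_safe<self_pointing> : std::false_type {};
```

A realloc-based vector could then static_assert on is_realloc_safe<T> instead of std::is_trivially_copyable<T>.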

Well, this got me thinking... It is very important to get the type traits right for structs that hold a pointer to themselves.
The following code demonstrates how quickly a bug can creep into code when the type's copy/move operations, and therefore its type traits, are not defined properly.
#include <memory>
#include <type_traits>
struct A
{
int a;
int b;
int* p{&a};
};
int main()
{
auto p = std::make_unique<A>();
A a = std::move(*p.get()); // implicit move copies p verbatim: a.p points at p->a, not a.a, and dangles once *p is destroyed
return std::is_move_assignable<A>::value; // <-- yet, this returns true.
}
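One possible repair, sketched under the assumption that p should always refer to the object's own a, is to write the copy operations by hand. fixed_a is a hypothetical name for the repaired struct.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical repaired version of the struct above.
struct fixed_a {
    int a = 0;
    int b = 0;
    int* p = &a;

    fixed_a() = default;

    // Copy the values, but re-point p at this object's own a.
    fixed_a(const fixed_a& other) : a(other.a), b(other.b), p(&a) {}

    fixed_a& operator=(const fixed_a& other) {
        a = other.a;
        b = other.b;
        return *this;  // p already points at this->a
    }
};

// With user-provided copy operations the traits now report the truth.
static_assert(!std::is_trivially_copyable<fixed_a>::value,
              "memcpy is no longer advertised as safe");
static_assert(!std::is_trivially_move_constructible<fixed_a>::value,
              "moving falls back to the non-trivial copy constructor");

// Returns true when the new object's pointer refers to its own member.
bool copy_points_at_itself() {
    fixed_a source;
    source.a = 7;
    fixed_a copy = std::move(source);  // no move ctor declared, so this copies
    return copy.p == &copy.a && *copy.p == 7;
}
```

A container that checks std::is_trivially_copyable before using memcpy or realloc would now correctly reject this type.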

What is the lifecycle of a C++ object?

I'm a seasoned C developer who is just now getting into C++, and I must admit, I'm very confused about how many ways there are to create, retain, and destroy C++ objects. In C, life is simple: assignment with = copies on the stack, and malloc/free manage data on the heap. C++ is far from that, or so it seems to me.
In light of that, here are my questions:
What are all the ways to create a C++ object? Direct/copy constructor, assignment, etc. How do they work?
What are all the different initialization syntaxes associated with all these types of object creation? What's the difference between T f = x, T f(x);, T f{x};, etc.?
Most importantly, when is it correct to copy/assign/whatever = is in C++, and when do you want to use pointers? In C, I got very used to throwing pointers around a lot, because pointer assignment is cheap but struct copying is less so. How do C++'s copy semantics affect this?
Finally, what are all these things like shared_ptr, weak_ptr, etc.?
I'm sorry if this is a somewhat broad question, but I'm very confused about when to use what (not even mentioning my confusion about memory management in collections and the new operator), and I feel like everything I knew about C memory management breaks down in C++. Is that true, or is my mental model just wrong?
To sum things up: how are C++ objects created, initialized, and destroyed, and when should I use each method?

First of all, your memory management skills are still useful in C++; they just sit a level below the C++ way of doing things.
About your questions, they are a bit broad, so I'll try to keep it short:
1) What are all the ways to create a C++ object?
Same as C: they can be global variables, local automatic, local static or dynamic. You may be confused by the constructor, but simply think that every time you create an object, a constructor is called. Always. Which constructor is simply a matter of what parameters are used when creating the object.
Assignment does not create a new object; it simply copies from one object to another (think of memcpy, but smarter).
2) What are all the different initialization syntaxes associated with all these types of object creation? What's the difference between T f = x, T f(x);, T f{x};, etc.?
T f(x) is the classic way, it simply creates an object of type T using the constructor that takes x as argument.
T f{x} is the C++11 uniform initialization syntax; it can also be used to initialize aggregate types (arrays and such). Other than that it is mostly equivalent to the former, except that it rejects narrowing conversions and prefers a std::initializer_list constructor if T has one.
T f = x depends on whether x is of type T. If it is, then it is equivalent to the former; if it is of a different type, then it is equivalent to T f = T(x), except that only non-explicit constructors are considered. Not that it really matters, because the compiler is allowed to optimize away the extra copy (copy elision), and since C++17 it is required to.
T(x). You forgot this one. A temporary object of type T is created (using the same constructor as above), it is used wherever it appears in the code, and at the end of the current full expression it is destroyed.
T f. This creates a value of type T using the default constructor, if available. That is simply a constructor that takes no parameters.
T f{}. Default constructed, but with the uniform syntax. Note that T f() does not declare an object of type T, but a function returning T!
T(). A temporary object using the default constructor.
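The forms listed above can be checked with a small type that records how it was built. This is only an illustrative sketch; traced and check_initialization_forms are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper that remembers which operation produced its value.
struct traced {
    std::string how;
    traced() : how("default") {}
    traced(int) : how("from int") {}
    traced(const traced&) : how("copy") {}
    traced& operator=(const traced&) {
        how = "assigned";
        return *this;
    }
};

void check_initialization_forms() {
    traced a;      assert(a.how == "default");   // T f
    traced b(1);   assert(b.how == "from int");  // T f(x)
    traced c{1};   assert(c.how == "from int");  // T f{x}
    traced d = b;  assert(d.how == "copy");      // T f = x, with x of type T
    traced e{};    assert(e.how == "default");   // T f{}
    d = a;         assert(d.how == "assigned");  // assignment, no new object
    assert(traced(2).how == "from int");         // T(x), a temporary
}
```

Only the last-but-one line goes through operator=; every other line runs a constructor.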
3) Most importantly, when is it correct to copy/assign/whatever = is in C++, and when do you want to use pointers?
You can use the same approach as in C. Think of copy/assignment as a smarter memcpy that runs the type's copy constructor or assignment operator. You can also pass references around, though you may want to wait until you feel comfortable with them. What you should do is this: do not use pointers as auxiliary local variables; use references instead.
4) Finally, what are all these things like shared_ptr, weak_ptr, etc.?
They are tools in your C++ tool belt. You will have to learn through experience and some mistakes...
shared_ptr use when the ownership of the object is shared.
unique_ptr use when the ownership of the object is unique and unambiguous.
weak_ptr use to break cycles in graphs of shared_ptr. Such cycles are not detected automatically.
vector. Don't forget this one! Use it to create dynamic arrays of anything.
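As a rough illustration of the weak_ptr point, the sketch below links two nodes in both directions; the back link is a weak_ptr, so the pair is still released at the end of the scope. node and nodes_alive_after_scope are hypothetical names.

```cpp
#include <cassert>
#include <memory>

// Hypothetical doubly linked node that counts live instances.
struct node {
    static int alive;
    std::shared_ptr<node> next;  // owning link
    std::weak_ptr<node> prev;    // non-owning back link breaks the cycle
    node() { ++alive; }
    ~node() { --alive; }
};
int node::alive = 0;

// Builds a two-node cycle and reports how many nodes survive the scope.
int nodes_alive_after_scope() {
    {
        auto first = std::make_shared<node>();
        auto second = std::make_shared<node>();
        first->next = second;
        second->prev = first;
        assert(first.use_count() == 1);   // the weak back link does not count
        assert(second.use_count() == 2);  // held by `second` and first->next
    }
    return node::alive;
}
```

If prev were a shared_ptr as well, both use counts would stay above zero and neither destructor would run.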
PS: You forgot to ask about destructors. IMO, destructors are what gives C++ its personality, so be sure to use a lot of them!

This is a fairly broad question, but I'll give you a starting point.
What's known in C as a "stack variable" is also called an object with "automatic storage". The lifetime of an object with automatic storage is fairly easy to understand: it's created when control reaches the point it's defined, and then destroyed when it goes out of scope:
int main() {
int foo = 5; // creation of automatic storage
do_stuff();
foo = 1;
// end of function; foo is destroyed.
}
Now, a thing to note is that = 5 is considered part of the initialization syntax, while = 1 is considered an assignment operation. I don't want you to get confused by = being used for two different things in the language's grammar.
Anyway, C++ takes automatic storage a bit further and allows arbitrary code to be run during the creation and destruction of that object: the constructors and destructors. This gives rise to the wonderful idiom called RAII, which you should use whenever possible. With RAII, resource management becomes automatic.
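For a C developer, the closest RAII analogy is probably a wrapper that calls malloc in its constructor and free in its destructor. The class below is a minimal sketch with a hypothetical name, scoped_buffer, not a replacement for std::vector or std::unique_ptr.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <type_traits>

// Hypothetical RAII wrapper: the buffer is released when the object goes
// out of scope, on every path out of the block.
class scoped_buffer {
    void* data_;
    std::size_t size_;

public:
    explicit scoped_buffer(std::size_t n)
        : data_(std::malloc(n)), size_(data_ ? n : 0) {}
    ~scoped_buffer() { std::free(data_); }  // free(nullptr) is a no-op

    // Copying would lead to a double free, so it is disabled.
    scoped_buffer(const scoped_buffer&) = delete;
    scoped_buffer& operator=(const scoped_buffer&) = delete;

    void* data() const { return data_; }
    std::size_t size() const { return size_; }
};

static_assert(!std::is_copy_constructible<scoped_buffer>::value,
              "ownership is unique");
```

Used as a local variable, scoped_buffer buf(16); needs no matching free call: the destructor runs when buf leaves scope, including on early returns and exceptions.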
what are all these things like shared_ptr, weak_ptr, etc.?
Good examples of RAII. They allow you to treat a dynamic resource (malloc/free calls) as an automatic storage object!
Most importantly, when is it correct to copy/assign/whatever = is in C++, and when do you want to use pointers? In C, I got very used to throwing pointers around a lot, because pointer assignment is cheap but struct copying is less so. How do C++'s copy semantics affect this?
const references everywhere, especially for function parameters. const refs avoid copies and prevent modification of the object. If you can't use const ref, chances are a normal reference is suitable. If for some reason you want to reset the reference or set it to null, use a pointer.
What are all the ways to create a C++ object? Direct/copy constructor, assignment, etc. How do they work?
In short, all constructors create objects. Assignment doesn't. Read a book for this.

Besides explicit creation, there are many ways in which objects are created implicitly in C++. Almost all of them use the copy constructor of the object's class. Remember: implicit copying may require the copy constructor and/or assignment operator of type T to be publicly accessible, depending on where the copying occurs. So, in order:
a) explicit creation of a brand new object on the stack:
T object(arg);
b) explicit copying of an existing object:
T original(arg);
...
T copy(original);
If class T has no copy constructor defined, a default implementation is generated by the compiler. It performs a member-wise copy of the passed object. This is not always what the programmer wants, so a custom implementation is sometimes useful.
c) explicit creation of a brand new object on the heap:
T *ptr = new T(arg);
d) implicit creation of a brand new object whose constructor takes only one parameter and is not marked explicit, for instance:
class T
{
public:
T(int x) : i(x) {}
private:
int i;
};
...
T object = 5; // actually implicit invocation of constructor occurs here
e) implicit copying of an object passed to a function by value:
void func(T input)
{
// here `input` is a copy of an object actually passed
}
...
int main()
{
T object(arg);
func(object); // copy constructor of T class is invoked before the `func` is called
}
f) implicit copying of an exception object when it is caught by value:
void function()
{
...
throw T(arg); // suppose that exception is always raised in the `function`
...
}
...
int main()
{
...
try {
function();
} catch (T exception) { // copy constructor of T class is invoked here
// handling `exception`
}
...
}
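g) Assignment using the assignment operator. Strictly speaking, this does not create a new object: it overwrites an existing one, and what happens depends on the assignment operator implementation of the particular type. If this operator is not implemented, a default implementation is generated by the compiler; it performs the same member-wise copy as the default copy constructor. Note that T second = first; is an initialization, so it invokes the copy constructor, not the assignment operator.
class T
{
public:
T(int x) : i(x) {}
T& operator=(const T& other) // copy assignment operator
{
i = other.i; // overwrite the state of an already existing object
return *this;
}
private:
int i;
};
...
int main()
{
...
T first(5);
T second = first; // initialization: the copy constructor is invoked, not operator=
second = first; // assignment to an existing object: operator= is invoked
...
}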
Well, that's what I am able to remember without looking into Stroustrup's book. Maybe something is missing.
While I was writing this, another answer was accepted, so I'll stop at this point. Maybe the details I listed will be useful.

How to convert an object instance to shared_ptr instance

Suppose I had two shared_ptr types such as
boost::shared_ptr<ObjA> sptrA;
boost::shared_ptr<ObjB> sptrB;
Now suppose that sptrA->SomeMethod() returned a plain ObjB (not a shared pointer). Is it possible to store that object somehow in sptrB, so that the returned instance is automatically converted to a boost::shared_ptr, like this?
sptrB = sptrA->SomeMethod();
I am asking just out of curiosity whether this is possible or not.

The most standard way of creating boost::shared_ptr objects is to use the make_shared function provided by Boost:
#include <boost/shared_ptr.hpp>
#include <boost/make_shared.hpp>
struct A {};
A generator() {
return A();
}
int main()
{
using namespace boost;
shared_ptr<A> p = make_shared<A>(generator());
return 0;
}
Since the generator() function returns an A object by value, the syntax above implies that new is invoked with the copy constructor of A, and the resulting pointer is wrapped in a shared-pointer object. In other words, make_shared doesn't quite perform a conversion to shared pointer; instead, it creates a copy of the object on the heap and provides memory management for that. This may or may not be what you need.
Note that this is equivalent to what std::make_shared does for std::shared_ptr in C++11.
One way to provide the convenient syntax you mentioned in your question is to define a conversion operator to shared_ptr<A> for A:
struct A {
operator boost::shared_ptr<A>() {
return boost::make_shared<A>(*this);
}
};
Then you can use it as follows:
shared_ptr<A> p = generator();
This will automatically "convert" the object returned by the function. Again, conversion here really means heap allocation, copying and wrapping in a shared pointer. Therefore, I am not really sure if I'd recommend defining such a convenience conversion operator. It makes the syntax very convenient, but it, as all implicit conversion operators, may also mean that you implicitly cause these "conversions" to happen in places you didn't expect.

Since C++11 you can use the std::make_shared<T>() function.
Example:
int a = 10;
std::shared_ptr<int> shared_a = std::make_shared<int>(a);

This depends on precisely what ObjA::SomeMethod returns - a copy, a reference or a pointer. In the first two cases it would not be feasible to wrap it into a shared_ptr (because shared_ptr needs a pointer).
The third case is possible, but you must proceed with caution. Make sure that once you wrap a pointer to an object into a shared_ptr, no one else attempts to manage the lifetime of that object.
For example, if you return a raw pointer, wrap it into a shared pointer and then, at some point later in the program, someone deletes that same pointer, you will have a problem.
