std::optional - construct empty with {} or std::nullopt? - c++

I thought that initializing a std::optional with std::nullopt would be the same as default construction.
They are described as equivalent at cppreference, as form (1).
However, both Clang and GCC seem to treat these toy example functions differently.
#include <optional>
struct Data { char large_data[0x10000]; };
std::optional<Data> nullopt_init()
{
return std::nullopt;
}
std::optional<Data> default_init()
{
return {};
}
Compiler Explorer seems to imply that using std::nullopt will simply set the one byte "contains" flag,
nullopt_init():
mov BYTE PTR [rdi+65536], 0
mov rax, rdi
ret
whereas returning {} zeroes every byte of the object. This is functionally equivalent, but almost always costlier.
default_init():
sub rsp, 8
mov edx, 65537
xor esi, esi
call memset
add rsp, 8
ret
Is this intentional behavior? When should one form be preferred over the other?
Update: GCC (since v11.1) and Clang (since v12.0.1) now treat both forms efficiently.

In this case, {} invokes value-initialization. If optional's default constructor is not user-provided (where "not user-provided" means roughly "is implicitly declared or explicitly defaulted within the class definition"), that incurs zero-initialization of the entire object.
Whether it does so depends on the implementation details of that particular std::optional implementation. It looks like libstdc++'s optional's default constructor is not user-provided, but libc++'s is.
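To make the distinction concrete, here is a minimal sketch of the two strategies. DefaultedBox, UserProvidedBox and EngagedFlag are hypothetical stand-ins invented for illustration; they are not the actual libstdc++ or libc++ classes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical miniature of the "defaulted constructor" strategy:
// a base class initializes the flag, the storage has no initializer.
struct EngagedFlag {
    bool engaged = false;
};

class DefaultedBox : private EngagedFlag {
public:
    DefaultedBox() = default;  // declared, but NOT user-provided
    bool has_value() const { return engaged; }
    // True if every storage byte is zero (only meaningful after value-initialization).
    bool storage_all_zero() const {
        for (std::size_t i = 0; i < sizeof storage_; ++i)
            if (storage_[i] != 0) return false;
        return true;
    }
private:
    unsigned char storage_[256];
};

// Hypothetical miniature of the "user-provided constructor" strategy.
class UserProvidedBox {
public:
    UserProvidedBox() noexcept : engaged_(false) {}  // user-provided: sets only the flag
    bool has_value() const { return engaged_; }
    unsigned char* storage() { return storage_; }
private:
    unsigned char storage_[256];
    bool engaged_;
};

inline void value_init_demo() {
    DefaultedBox a{};     // value-initialization: zero the whole object, then run the constructor
    assert(!a.has_value());
    assert(a.storage_all_zero());

    UserProvidedBox b{};  // value-initialization: only the constructor runs
    assert(!b.has_value());
    // b's storage is indeterminate here and must not be read.
}
```

Both objects end up empty; the difference is only whether the storage bytes get written, which is the memset seen in the question.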

For gcc, the unnecessary zeroing with default initialization
std::optional<Data> default_init() {
std::optional<Data> o;
return o;
}
is bug 86173 and needs to be fixed in the compiler itself. Using the same libstdc++, clang does not perform any memset here.
In your code, you are actually value-initializing the object (through list-initialization). It appears that library implementations of std::optional have 2 main options: either they default the default constructor (write =default;, one base class takes care of initializing the flag saying that there is no value), like libstdc++, or they define the default constructor, like libc++.
Now in most cases, defaulting the constructor is the right thing to do: it is trivial, constexpr, or noexcept when possible, avoids initializing unnecessary things in default initialization, etc. This happens to be an odd case, where the user-provided constructor has an advantage, thanks to a quirk in the language in [decl.init], and none of the usual advantages of defaulting apply (we can explicitly specify constexpr and noexcept). Value-initialization of an object of class type starts by zero-initializing the whole object, before running the constructor if it is non-trivial, unless the default constructor is user-provided (or some other technical cases). This seems like an unfortunate specification, but fixing it (to look at subobjects to decide what to zero-initialize?) at this point in time may be risky.
Starting from gcc-11, libstdc++ switched to the user-provided constructor version, which generates the same code as std::nullopt. In the meantime, pragmatically, using the constructor from std::nullopt where it does not complicate code seems to be a good idea.

The standard doesn't say anything about the implementation of those two constructors. According to [optional.ctor]:
constexpr optional() noexcept;
constexpr optional(nullopt_t) noexcept;
Ensures: *this does not contain a value.
Remarks: No contained value is initialized. For every object type T these constructors shall be constexpr constructors (9.1.5).
It just specifies the signature of those two constructors and their "Ensures" (aka effects): after any of those constructions the optional doesn't contain any value. No other guarantees are given.
Whether the first constructor is user-provided is left to the implementation (i.e. it depends on the standard library in use).
If the first constructor is user-provided, it can of course be implemented as just setting the contains flag. But a constructor that is not user-provided is also compliant with the standard (as implemented by libstdc++ before GCC 11), because value-initialization then also zero-initializes the flag to false. Although this results in costly zero-initialization, it doesn't violate the "Ensures" specified by the standard.
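As a quick check of that "Ensures" clause, the sketch below constructs an empty optional in each of the usual ways and confirms that none of them holds a value. Data is shrunk from the question's 64 KiB struct to keep the stack usage small; empty_forms_demo is a name made up for this example.

```cpp
#include <cassert>
#include <optional>

struct Data { char large_data[0x100]; };  // smaller than the question's struct

inline void empty_forms_demo() {
    std::optional<Data> a;                 // default-initialization
    std::optional<Data> b{};               // value-initialization
    std::optional<Data> c{std::nullopt};   // optional(nullopt_t)
    std::optional<Data> d = std::nullopt;  // same constructor, copy-initialization syntax

    // The standard only promises this observable result, not how it is reached.
    assert(!a.has_value());
    assert(!b.has_value());
    assert(!c.has_value());
    assert(!d.has_value());
}
```

The forms differ only in how much memory the generated code may touch, which the standard leaves unspecified.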
As it comes to real-life usage, well, it is nice that you have dug into the implementations so as to write optimal code.
Just as a side note, the standard should probably specify the complexity of those two constructors (i.e. O(1) or O(sizeof(T))).

Motivational example
When I write:
std::optional<X*> opt{};
(*opt)->f();//expects error here, not UB or heap corruption
I would expect the optional to be initialized and not to contain uninitialized memory. I also wouldn't expect heap corruption as a consequence, since I'm expecting everything to be initialized properly. This lines up with the pointer semantics of std::optional:
X* ptr{};//ptr is now zero
ptr->f();//deterministic error here, not UB or heap corruption
If I write std::optional<X*>(std::nullopt) I would have hoped the same but at least here it looks more of an ambiguous situation.
The reason is Uninitialized Memory
It is very likely that this behavior is intentional.
(I'm not part of any committee, so in the end I cannot say for sure.)
This is the primary reason: an empty brace init (zero-init) shouldn't lead to uninitialized memory (although the language doesn't enforce this as a rule). How else would you guarantee there's no uninitialized memory in your program?
For this task we often turn to use static analysis tools: prominently cpp core check that is based on enforcing the cpp core guidelines; in particular there's a few guidelines concerning exactly this issue. Had this not been possible our static analysis would fail for this otherwise seemingly simple case; or worse be misleading. In contrast, heap based containers do not have the same issue naturally.
Unchecked access
Remember that accessing std::optional through operator* is unchecked, so you could access that uninitialized memory by mistake.
Just to showcase this, if that weren't the case then this could be heap corruption:
std::optional<X*> opt{};//lets assume brace-init doesn't zero-initialize the underlying object for a moment (in practice it does)
(*opt)->f();//<- possible heap corruption
With current implementation however, this becomes deterministic (seg fault/access violation on main platforms).
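Strictly speaking, operator* on an empty optional is undefined behaviour whether or not the storage happens to be zeroed. When a guaranteed, deterministic failure is wanted, the checked accessor value() or an explicit test is the portable route. A minimal sketch follows; X and the two helper functions are hypothetical.

```cpp
#include <cassert>
#include <optional>

struct X { int f() const { return 1; } };

// Returns true if value() reported the empty state by throwing.
inline bool value_throws_when_empty() {
    std::optional<X*> opt{};
    try {
        X* p = opt.value();  // checked access: throws std::bad_optional_access when empty
        (void)p;
        return false;
    } catch (const std::bad_optional_access&) {
        return true;
    }
}

// Calls f() only when the optional is engaged and the stored pointer is non-null.
inline int call_if_present(const std::optional<X*>& opt) {
    if (opt && *opt) return (*opt)->f();
    return -1;  // sentinel chosen for this sketch
}
```

Neither helper relies on what the bytes of a disengaged optional contain.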
Then you might ask: why doesn't the std::nullopt 'specialized' constructor initialize the memory?
I'm not really sure why it doesn't, and I guess it wouldn't be an issue if it did. In this case, as opposed to the brace-init one, it doesn't come with the same kind of expectations. Subtly, you now have a choice.
For those interested MSVC does the same.

Related

Can we detect "trivial relocatability" in C++17?

In future standards of C++, we will have the concept of "trivial relocatability", which means we can simply copy bytes from one object to an uninitialized chunk of memory, and simply ignore/zero out the bytes of the original object.
This way, we imitate the C-style way of copying/moving objects around.
In future standards, we will probably have something like std::is_trivially_relocatable<type> as a type trait. Currently, the closest thing we have is std::is_pod<type>, which will be deprecated in C++20.
My question is, do we have a way in the current standard (C++17) to figure out if the object is trivially relocatable?
For example, std::unique_ptr<type> can be moved around by copying its bytes to a new memory address and zeroing out the original bytes, but std::is_pod_v<std::unique_ptr<int>> is false.
Also, the standard currently mandates that every uninitialized chunk of memory must pass through a constructor in order to be considered a valid C++ object. Even if we can somehow figure out whether the object is trivially relocatable, just moving the bytes is still UB according to the standard.
So another question is: even if we can detect trivial relocatability, how can we implement trivial relocation without causing UB? Simply calling memcpy + memset(src,0,...) and casting the memory address to the right type is UB.
Thanks!
The whole point of trivial-relocatability would seem to be to enable byte-wise moving of objects even in the presence of a non-trivial move constructor or move assignment operator. Even in the current proposal P1144R3, this ultimately requires that a user manually mark types for which this is possible. For a compiler to figure out whether a given type is trivially-relocatable in general is most-likely equivalent to solving the halting problem (it would have to understand and reason about what an arbitrary, potentially user-defined move constructor or move assignment operator does)…
It is, of course, possible that you define your own is_trivially_relocatable trait that defaults to std::is_trivially_copyable_v and have the user specialize for types that should specifically be considered trivially-relocatable. Even this is problematic, however, because there's gonna be no way to automatically propagate this property to types that are composed of trivially-relocatable types…
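A minimal sketch of such a user-defined trait is shown here. The sketch namespace, the Holder type and the unique_ptr opt-in are illustrative assumptions; the opt-in is a promise made by whoever writes it, and the compiler does not verify it.

```cpp
#include <memory>
#include <type_traits>

namespace sketch {

// Default: fall back to the trait the standard already provides.
template <class T>
struct is_trivially_relocatable : std::is_trivially_copyable<T> {};

// Manual opt-in: an unverified promise by whoever writes the specialization.
template <class T>
struct is_trivially_relocatable<std::unique_ptr<T>> : std::true_type {};

template <class T>
inline constexpr bool is_trivially_relocatable_v = is_trivially_relocatable<T>::value;

}  // namespace sketch

// A type composed solely of an opted-in member.
struct Holder { std::unique_ptr<int> p; };

static_assert(sketch::is_trivially_relocatable_v<int>);
static_assert(sketch::is_trivially_relocatable_v<std::unique_ptr<int>>);
// The property is not propagated: Holder would need its own specialization.
static_assert(!sketch::is_trivially_relocatable_v<Holder>);
```

The last assertion shows the propagation problem: every aggregate of opted-in types has to be annotated again by hand.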
Even for trivially-copyable types, you can't just copy the bytes of the object representation to some random memory location and cast the address to a pointer to the type of the original object. Since an object was never created, that pointer will not point to an object. And attempting to access the object that pointer doesn't point to will result in undefined behavior. Trivial-copyability means you can copy the bytes of the object representation from one existing object to another existing object and rely on that making the value of the one object equal to the value of the other [basic.types]/3.
To do this for trivially-relocating some object would mean that you have to first construct an object of the given type at your target location, then copy the bytes of the original object into that, and then modify the original object in a way equivalent to what would have happened if you had moved from that object. Which is essentially a complicated way of just moving the object…
There's a reason a proposal to add the concept of trivial-relocatability to the language exists: because you currently just can't do it from within the language itself…
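What can be written portably in C++17 is a relocation spelled as "move-construct at the destination, then destroy the source". The sketch below assumes a hypothetical helper named relocate_at; it is not a standard library function.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <utility>

// Hypothetical helper: move-construct a T into raw storage at dest,
// then end the lifetime of the source object.
template <class T>
T* relocate_at(T* src, void* dest) {
    T* result = ::new (dest) T(std::move(*src));
    src->~T();
    return result;
}

inline int relocate_demo() {
    using Ptr = std::unique_ptr<int>;
    alignas(Ptr) unsigned char from[sizeof(Ptr)];
    alignas(Ptr) unsigned char to[sizeof(Ptr)];

    Ptr* src = ::new (static_cast<void*>(from)) Ptr(new int(42));
    Ptr* dst = relocate_at(src, to);  // src must not be used after this point
    assert(dst->get() != nullptr);

    int value = **dst;
    std::destroy_at(dst);             // releases the owned int
    return value;
}
```

For types like std::unique_ptr an optimizer will often reduce this to a few loads and stores, but that is an optimization outcome rather than a language guarantee.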
Note that, despite all this, just because the compiler frontend cannot avoid generating constructor calls doesn't mean the optimizer cannot eliminate unnecessary loads and stores. Let's have a look at what code the compiler generates for your example of moving a std::vector or std::unique_ptr:
auto test1(void* dest, std::vector<int>& src)
{
return new (dest) std::vector<int>(std::move(src));
}
auto test2(void* dest, std::unique_ptr<int>& src)
{
return new (dest) std::unique_ptr<int>(std::move(src));
}
As you can see, just doing an actual move often already boils down to just copying and overwriting some bytes, even for non-trivial types…
Author of P1144 here; somehow I'm just seeing this SO question now!
std::is_trivially_relocatable<T> is proposed for some-future-version-of-C++, but I don't predict it'll get in anytime soon (definitely not C++23, I bet not C++26, quite possibly not ever). The paper (P1144R6, June 2022) ought to answer a lot of your questions, especially the ones where people are correctly answering that if you could already implement this in present-day C++, we wouldn't need a proposal. See also my 2019 C++Now talk.
Michael Kenzel's answer says that P1144 "ultimately requires that a user manually mark types for which [trivial relocation] is possible"; I want to point out that that's kind of the opposite of the point. The state of the art for trivial relocatability is manual marking ("warranting") of each and every such type; for example, in Folly, you'd say
struct Widget {
std::string s;
std::vector<int> v;
};
FOLLY_ASSUME_FBVECTOR_COMPATIBLE(Widget);
And this is a problem, because the average industry programmer shouldn't be bothered with trying to figure out if std::string is trivially relocatable on their library of choice. (The annotation above is wrong on 1.5 of the big 3 vendors!) Even Folly's own maintainers can't get these manual annotations right 100% of the time.
So the idea of P1144 is that the compiler can just take care of it for you. Your job changes from dangerously warranting things-you-don't-necessarily-know, to merely (and optionally) verifying things-you-want-to-be-true via static_assert (Godbolt):
struct Widget {
std::string s;
std::vector<int> v;
};
static_assert(std::is_trivially_relocatable_v<Widget>);
struct Gadget {
std::string s;
std::list<int> v;
};
static_assert(!std::is_trivially_relocatable_v<Gadget>);
In your (OP's) specific use-case, it sounds like you need to find out whether a given lambda type is trivially relocatable (Godbolt):
void f(std::list<int> v) {
auto widget = [&]() { return v; };
auto gadget = [=]() { return v; };
static_assert(std::is_trivially_relocatable_v<decltype(widget)>);
static_assert(!std::is_trivially_relocatable_v<decltype(gadget)>);
}
This is something you can't really do at all with Folly/BSL/EASTL, because their warranting mechanisms work only on named types at the global scope. You can't exactly FOLLY_ASSUME_FBVECTOR_COMPATIBLE(decltype(widget)).
Inside a std::function-like type, you're correct that it would be useful to know whether the captured type is trivially relocatable or not. But since you can't know that, the next best thing (and what you should do in practice) is to check std::is_trivially_copyable. That's the currently blessed type trait that literally means "This type is safe to memcpy, safe to skip the destructor of" — basically all the things you're going to be doing with it. Even if you knew that the type was exactly std::unique_ptr<int>, or whatever, it would still be undefined behavior to memcpy it in present-day C++, because the current standard says that you're not allowed to memcpy types that aren't trivially copyable.
(Btw, technically, P1144 doesn't change that fact. P1144 merely says that the implementation is allowed to elide the effects of relocation, which is a huge wink-and-nod to implementors that they should just use memcpy. But even P1144R6 doesn't make it legal for ordinary non-implementor programmers to memcpy non-trivially-copyable types: it leaves the door open for some compiler to implement, and some library implementation to use, a __builtin_trivial_relocate function that is in some magical sense distinguishable from a plain old memcpy.)
Finally, your last paragraph refers to memcpy + memset(src,0,...). That's wrong. Trivial relocation is tantamount to just memcpy. If you care about the state of the source object afterward — if you care that it's all-zero-bytes, for example — then that must mean you're going to look at it again, which means you aren't actually treating it as destroyed, which means you aren't actually doing the semantics of a relocate here. "Copy and null out the source" is more often the semantics of a move. The point of relocation is to avoid that extra work.

Initialisation of vector of atomics

Consider:
void foo() {
std::vector<std::atomic<int>> foo(10);
...
}
Are the contents of foo now valid? Or do I need to explicitly loop through and initialise them? I have checked on Godbolt and it seems fine, however the standard seems to be very confused on this point.
The std::vector constructor says it inserts default-inserted instances of std::atomic<int>, which are value initialised via placement new.
I think this effect of value initialisation applies:
2) if T is a class type with a default constructor that is neither user-provided nor deleted (that is, it may be a class with an implicitly-defined or defaulted default constructor), the object is zero-initialized and then it is default-initialized if it has a non-trivial default constructor;
So it seems to me that the atomics are zero-initialised. So the question is, does zero-initialisation of a std::atomic<int> result in a valid object?
I'm going to guess that the answer is "yes in practice but it's not really defined"?
Note: This answer agrees that it is zero-initialised, but doesn't really say if that means that the object is valid.
You are correct to be worried. According to the standard, the atomics have their default constructor called, but they are not initialized by it, because the default constructor doesn't initialize the atomic:
The default-initialized std::atomic<T> does not contain a T object,
and its only valid uses are destruction and initialization by
std::atomic_init
This is somewhat in violation of the normal language rules, and some implementations initialize anyway (as you have noted).
That being said, I would recommend taking the extra step to make 100% sure they are initialized correctly according to standard - after all you are dealing with concurrency where bugs can be extremely hard to track down.
There are many ways to dodge the issue, including using wrapper:
struct int_atomic {
std::atomic<int> atomic_{0};//use 'initializing' constructor
};
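A short sketch of how the wrapper can be used with std::vector. make_counters and bump_all_and_sum are hypothetical helper names introduced for this example.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

struct int_atomic {
    std::atomic<int> atomic_{0};  // default member initializer: every element starts at 0
};

// The sized constructor only needs the element type to be default-constructible,
// so non-copyable, non-movable elements are fine here.
inline std::vector<int_atomic> make_counters(std::size_t n) {
    return std::vector<int_atomic>(n);
}

inline int bump_all_and_sum(std::vector<int_atomic>& counters) {
    int sum = 0;
    for (auto& c : counters)
        sum += c.atomic_.fetch_add(1) + 1;  // fetch_add returns the previous value
    return sum;
}
```

Operations that would need to move elements, such as push_back or resize, will still not compile for this vector.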
Even if the default constructor were called (it isn't, because it's trivial) it doesn't really do anything.
Zero-initialisation obviously cannot be guaranteed to produce a valid atomic; this'll only work if by chance a valid atomic is created by zero-initialising all its members.
And, since atomics aren't copyable, you can't provide an initialisation value in the vector constructor.
You should now loop over the container and std::atomic_init each element. If you need to lock around this, that's fine because you're already synchronising the vector's creation for the same reason.

memset and a dynamic array of std::complex<double>

Since std::complex is a non-trivial type, compiling the following with GCC 8.1.1
complex<double>* z = new complex<double>[6];
memset(z,0,6*sizeof*z);
delete [] (z);
produces a warning
clearing an object of non-trivial type
My question is, is there actually any potential harm in doing so?
The behavior of std::memset is only defined if the pointer it is modifying is a pointer to a TriviallyCopyable type. std::complex is guaranteed to be a LiteralType, but, as far as I can tell, it isn't guaranteed to be TriviallyCopyable, meaning that std::memset(z, 0, ...) is not portable.
That said, std::complex has an array-compatibility guarantee, which states that the storage of a std::complex<T> is exactly two consecutive Ts and can be reinterpreted as such. This seems to suggest that std::memset is actually fine, since it would be accessing through this array-oriented access. It may also imply that std::complex<double> is TriviallyCopyable, but I am unable to determine that.
If you wish to do this, I would suggest being on the safe side and static_asserting that std::complex<double> is TriviallyCopyable:
static_assert(std::is_trivially_copyable<std::complex<double>>::value);
If that assertion holds, then you are guaranteed that the memset is safe.
In either case, it would be safe to use std::fill:
std::fill(z, z + 6, std::complex<double>{});
It optimizes down to a call to memset, albeit with a few more instructions before it. I would recommend using std::fill unless your benchmarking and profiling showed that those few extra instructions are causing problems.
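A self-contained sketch of the std::fill route, with no assumption about whether std::complex<double> is trivially copyable. refill_with_zero is a hypothetical name, and the non-zero fill only simulates an array that has already been used.

```cpp
#include <algorithm>
#include <cassert>
#include <complex>
#include <cstddef>
#include <memory>

inline bool refill_with_zero(std::size_t n) {
    std::unique_ptr<std::complex<double>[]> z(new std::complex<double>[n]);

    // Simulate an array that already holds data.
    std::fill(z.get(), z.get() + n, std::complex<double>{3.0, 4.0});

    // Reset every element to (0, 0) through the type's own assignment.
    std::fill(z.get(), z.get() + n, std::complex<double>{});

    return std::all_of(z.get(), z.get() + n, [](const std::complex<double>& c) {
        return c.real() == 0.0 && c.imag() == 0.0;
    });
}
```

Because std::fill goes through assignment, it stays correct even on an implementation where the memset would not be.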
Never, never, ever memset non-POD types. They have constructors for a reason. Just writing a bunch of bytes on top of them is highly unlikely to give the desired result (and if it does, the types themselves are badly designed as they should clearly then just be POD in the first place - or you are simply being unlucky that Undefined Behaviour seems to work in this case - have fun debugging it when it doesn't after you change optimization level, compiler or platform (or moon phase)).
Just don't do this.
The answer to this question is that for a standard-compliant std::complex there is no need for memset after new.
new complex<double>[6] will initialize the complex to (0, 0) because it calls a default (non-trivial) constructor that initializes to zero.
(I think this is a mistake unfortunately.)
https://en.cppreference.com/w/cpp/numeric/complex/complex
If the code posted was just an example with missing code between new and memset, then std::fill will do the right thing.
(In part because the specific standard library implementation knows internally how std::complex is implemented.)

Class object creation in C++

I have a basic C++ question which I really should know the answer to.
Say we have some class A with constructor A(int a). What is the difference between:
A test_obj(4);
and
A test_obj = A(4);
?
I generally use the latter syntax, but after looking up something unrelated in my trusty C++ primer I realized that they generally use the former. The difference between these two is often discussed in the context of built-in types (e.g. int a(6) vs int a = 6), and my understanding is that in this case they are equivalent.
However, in the case of user-defined classes, are the two approaches to defining an object equivalent? Or is the latter option first default constructing test_obj, and then using the copy constructor of A to assign the return value of A(4) to test_obj? If it's this second possibility, I imagine there could be some performance differences between the two approaches for large classes.
I'm sure this question is answered somewhere on the internet, even here, but I couldn't search for it effectively without finding questions asking the difference between the first option and using new, which is unrelated.
A test_obj = A(4); conceptually does indeed construct a temporary A object, then copy/move-construct test_obj from the temporary, and then destruct the temporary.
However this process is a candidate for copy elision which means the compiler is allowed to treat it as A test_obj(4); after verifying that the copy/move-constructor exists and is accessible.
From C++17 it will be mandatory for compilers to do this; prior to that it was optional but typically compilers did do it.
Performance-wise these are equivalent, even if you have a non-standard copy constructor, as mandated by copy elision. This is guaranteed since C++17 but permitted and widely present even in compilers conforming to earlier standards.
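One way to observe this without reading assembly is to count copy and move constructions. The sketch below adds a hypothetical counter to A; under C++17 rules the result is guaranteed to be zero, and earlier standards permit but do not require it.

```cpp
#include <cassert>

struct A {
    static int copies;  // counts copy and move constructions
    int value;
    A(int a) : value(a) {}
    A(const A& other) : value(other.value) { ++copies; }
    A(A&& other) noexcept : value(other.value) { ++copies; }
};
int A::copies = 0;

inline int copies_made_by_both_forms() {
    A::copies = 0;
    A direct(4);              // direct-initialization
    A from_temporary = A(4);  // copy-initialization from a prvalue
    assert(direct.value == from_temporary.value);
    return A::copies;
}
```

With GCC or Clang in a pre-C++17 mode, passing -fno-elide-constructors makes the conceptual temporary visible in the count.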
Try for yourself, with all optimizations turned off and the standard forced into C++11 (or C++03, change the command line in the top right):
https://godbolt.org/g/GAq7fi

Can I use memcpy in C++ to copy classes that have no pointers or virtual functions

Say I have a class, something like the following;
class MyClass
{
public:
MyClass();
int a,b,c;
double x,y,z;
};
#define PageSize 1000000
MyClass Array1[PageSize],Array2[PageSize];
If my class has no pointers or virtual methods, is it safe to use the following?
memcpy(Array1,Array2,PageSize*sizeof(MyClass));
The reason I ask is that I'm dealing with very large collections of paged data, as described here, where performance is critical, and memcpy offers significant performance advantages over iterative assignment. I suspect it should be ok, as the 'this' pointer is an implicit parameter rather than anything stored, but are there any other hidden nasties I should be aware of?
Edit:
As per sharptooth's comments, the data does not include any handles or similar reference information.
As per Paul R's comment, I've profiled the code, and avoiding the copy constructor is about 4.5 times faster in this case. Part of the reason here is that my templated array class is somewhat more complex than the simplistic example given, and calls a placement 'new' when allocating memory for types that don't allow shallow copying. This effectively means that the default constructor is called as well as the copy constructor.
Second edit
It is perhaps worth pointing out that I fully accept that use of memcpy in this way is bad practice and should be avoided in general cases. The specific case in which it is being used is as part of a high performance templated array class, which includes a parameter 'AllowShallowCopying', which will invoke memcpy rather than a copy constructor. This has big performance implications for operations such as removing an element near the start of an array, and paging data in and out of secondary storage. The better theoretical solution would be to convert the class to a simple structure, but given this involves a lot of refactoring of a large code base, it is not something I'm keen to do.
According to the Standard, if no copy constructor is provided by the programmer for a class, the compiler will synthesize a constructor which exhibits default memberwise initialization. (12.8.8) However, in 12.8.1, the Standard also says,
A class object can be copied in two
ways, by initialization (12.1, 8.5),
including for function argument
passing (5.2.2) and for function value
return (6.6.3), and by assignment
(5.17). Conceptually, these two
operations are implemented by a copy
constructor (12.1) and copy assignment
operator (13.5.3).
The operative word here is "conceptually," which, according to Lippman gives compiler designers an 'out' to actually doing memberwise initialization in "trivial" (12.8.6) implicitly defined copy constructors.
In practice, then, compilers have to synthesize copy constructors for these classes that exhibit behavior as if they were doing memberwise initialization. But if the class exhibits "Bitwise Copy Semantics" (Lippman, p. 43) then the compiler does not have to synthesize a copy constructor (which would result in a function call, possibly inlined) and do bitwise copy instead. This claim is apparently backed up in the ARM, but I haven't looked this up yet.
Using a compiler to validate that something is Standard-compliant is always a bad idea, but compiling your code and viewing the resulting assembly seems to verify that the compiler is not doing memberwise initialization in a synthesized copy constructor, but doing a memcpy instead:
#include <cstdlib>
class MyClass
{
public:
MyClass(){};
int a,b,c;
double x,y,z;
};
int main()
{
MyClass c;
MyClass d = c;
return 0;
}
The assembly generated for MyClass d = c; is:
000000013F441048 lea rdi,[d]
000000013F44104D lea rsi,[c]
000000013F441052 mov ecx,28h
000000013F441057 rep movs byte ptr [rdi],byte ptr [rsi]
...where 28h is the sizeof(MyClass).
This was compiled under MSVC9 in Debug mode.
EDIT:
The long and the short of this post is that:
1) So long as doing a bitwise copy will exhibit the same side effects as memberwise copy would, the Standard allows trivial implicit copy constructors to do a memcpy instead of memberwise copies.
2) Some compilers actually do memcpys instead of synthesizing a trivial copy constructor which does memberwise copies.
Let me give you an empirical answer: in our realtime app, we do this all the time, and it works just fine. This is the case in MSVC for Wintel and PowerPC and GCC for Linux and Mac, even for classes that have constructors.
I can't quote chapter and verse of the C++ standard for this, just experimental evidence.
Your class has a constructor, and so is not POD in the sense that a C struct is. It is therefore not safe to copy it with memcpy(). If you want POD data, remove the constructor. If you want non-POD data, where controlled construction is essential, don't use memcpy() - you can't have both.
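Under the C++11 and later rules, the property memcpy depends on is trivial copyability rather than POD-ness, and a user-provided default constructor does not remove it. The sketch below checks this for the question's class. The constructor body and the copy_page helper are assumptions, since the question only declares the constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

class MyClass {
public:
    MyClass() : a(0), b(0), c(0), x(0), y(0), z(0) {}  // assumed body
    int a, b, c;
    double x, y, z;
};

// Not a trivial type (and therefore not POD), because of the user-provided constructor...
static_assert(!std::is_trivial<MyClass>::value, "default constructor is user-provided");
static_assert(std::is_standard_layout<MyClass>::value, "single access level, no virtuals");
// ...but still trivially copyable: copy operations and destructor are implicit and trivial.
static_assert(std::is_trivially_copyable<MyClass>::value, "safe to memcpy between objects");

// Hypothetical helper: copies count existing objects from src over existing objects at dst.
inline void copy_page(MyClass* dst, const MyClass* src, std::size_t count) {
    std::memcpy(dst, src, count * sizeof(MyClass));
}
```

If someone later adds a member such as std::string, the third static_assert stops the build instead of letting the memcpy silently break.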
You could. But first ask yourself:
Why not just use the copy-constructor that is provided by your compiler to do a member-wise copy?
Are you having specific performance problems for which you need to optimise?
The current implementation contains all POD-types: what happens when somebody changes it?
[...] but are there any other hidden nasties
I should be aware of?
Yes: your code makes certain assumptions that are neither suggested nor documented (unless you specifically document them). This is a nightmare for maintenance.
Also, your implementation is basically hacking (if it's necessary it's not a bad thing) and it may depend (not sure on this) on how your current compiler implements things.
This means that if you upgrade compiler / toolchain one year (or five) from now (or just change optimization settings in your current compiler) nobody will remember this hack (unless you make a big effort to keep it visible) and you may end up with undefined behavior on your hands, and developers cursing "whoever did this" a few years down the road.
It's not that the decision is unsound, it's that it is (or will be) unexpected to maintainers.
To minimize this (unexpectedness?) I would move the class into a structure within a namespace based on the current name of the class, with no internal functions in the structure at all. Then you are making it clear you're looking at a memory block and treating it as a memory block.
Instead of:
class MyClass
{
public:
MyClass();
int a,b,c;
double x,y,z;
};
#define PageSize 1000000
MyClass Array1[PageSize],Array2[PageSize];
memcpy(Array1,Array2,PageSize*sizeof(MyClass));
You should have:
namespace MyClass // obviously not a class,
// name should be changed to something meaningful
{
struct Data
{
int a,b,c;
double x,y,z;
};
static const size_t PageSize = 1000000; // use static const instead of #define
void Copy(Data* a1, Data* a2, const size_t count)
{
memcpy( a1, a2, count * sizeof(Data) );
}
// any other operations that you'd have declared within
// MyClass should be put here
}
MyClass::Data Array1[MyClass::PageSize],Array2[MyClass::PageSize];
MyClass::Copy( Array1, Array2, MyClass::PageSize );
This way you:
make it clear that MyClass::Data is a POD structure, not a class (binary they will be the same or very close - the same if I remember correctly) but this way it is also visible to programmers reading the code.
centralize the usage of memcpy (if you have to change to a std::copy or something else) in two years you do it in a single point.
keep the usage of memcpy near the implementation of the POD structure.
You could use memcpy for copying an array of POD types. It would also be a good idea to add a static assert that boost::is_pod is true. Your class is not a POD type now.
Arithmetic types, enumeration types, pointer types, and pointer to member types are POD.
A cv-qualified version of a POD type is itself a POD type.
An array of POD is itself POD.
A struct or union, all of whose non-static data members are POD, is itself POD if it has:
No user-declared constructors.
No private or protected non-static data members.
No base classes.
No virtual functions.
No non-static data members of reference type.
No user-defined copy assignment operator.
No user-defined destructor.
I note that you admit there is an issue here, and that you are aware of the potential drawbacks.
My question is one of maintenance. Do you feel confident that nobody will ever include a field in this class that would botch up your great optimization ? I don't, I'm an engineer not a prophet.
So instead of trying to improve the copy operation.... why not try to avoid it altogether ?
Would it be possible to change the data structure used for storage to stop moving elements around... or at least not that much.
For example, do you know of blist (the Python module). B+Tree can allow index access with performances quite similar to vectors (a bit slower, admittedly) for example, while minimizing the number of elements to shuffle around when inserting / removing.
Instead of going for the quick and dirty, perhaps you should focus on finding a better collection?
Calling memcpy on non-POD classes is undefined behaviour. I suggest following Kirill's tip about the assertion. Using memcpy can be faster, but if the copy operation is not performance critical in your code then just use memberwise copy (the copy constructor).
When talking about the case you're referring to I suggest you declare struct's instead of class'es. It makes it a lot easier to read (and less debatable :) ) and the default access specifier is public.
Of course you can use memcpy in this case, but beware that adding other kinds of elements in the struct (like C++ classes) is not recommended (due to obvious reasons - you don't know how memcpy will influence them).
As pointed out by John Dibling, you should not use memcpy manually. Instead, use std::copy. If your class is memcpy-able, std::copy will automatically do a memcpy. It may be even faster than a manual memcpy.
If you use std::copy, your code is readable and it always uses the fastest way to copy. And if you change the layout of your class at a later point so that it is not memcpy-able any more, the code that uses std::copy will not break, while your manual calls to memcpy will.
Now, how do you know whether your class is memcpy-able? In the same way, std::copy detects that. It uses: std::is_trivially_copyable. You can use a static_assert to make sure that this property is upheld.
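A minimal sketch of that combination: std::copy for the copy itself, plus an optional static_assert as a guard. Record and copy_elements are hypothetical names standing in for the real element type and array code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for the paged element type.
struct Record {
    int a, b, c;
    double x, y, z;
};

template <class T>
void copy_elements(const T* src, std::size_t count, T* dst) {
    // Optional guard: fail the build if T stops being memcpy-able.
    // Remove it to fall back silently to element-wise copy assignment.
    static_assert(std::is_trivially_copyable<T>::value,
                  "T is expected to stay trivially copyable");
    std::copy(src, src + count, dst);  // implementations typically lower this to memmove
}
```

The call site stays the same whether or not the fast path applies, so the layout of Record can change without touching the copying code.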
Note that std::is_trivially_copyable can only check the type information. It does not understand the semantics. The following class is a trivially copyable type, but a bitwise copy would be a bug:
#include <type_traits>
struct A {
int* p = new int[32];
};
static_assert(std::is_trivially_copyable<A>::value, "");
After a bitwise copy, the pointer p of the copy will still point to the original memory. Also see the Rule of Three.
It will work, because in C++ a (POD) class is the same as a struct (apart from details such as default access). And you may copy a POD struct with memcpy.
The definition of POD was: no virtual functions, no constructor, no destructor, no virtual inheritance, etc.