What is the behavior of C++'s default assignment operator? - c++

For example, this puzzles me:
struct A {
// some fields...
char buf[SIZE];
};
A a;
a = a;
Judging by A's field buf, it looks as though the default assignment operator will call something like memcpy to assign an object X to Y. So what happens when an object is assigned to itself and no explicit assignment operator is defined, as in a = a; above?
memcpy manual page:
DESCRIPTION
The memcpy() function copies n bytes from memory area src to memory area dest. The memory areas must not overlap. Use memmove(3) if the memory areas do overlap.
If memcpy is used, undefined behavior may occur.
So, what is the behavior of the default assignment operator for a C++ object?

The assignment operator is not defined in terms of memcpy (§12.8/28).
The implicitly-defined copy/move assignment operator for a non-union
class X performs memberwise copy/move assignment of its subobjects.
The direct base classes of X are assigned first, in the order of their
declaration in the base-specifier-list, and then the immediate
non-static data members of X are assigned, in the order in which they
were declared in the class definition. Let x be either the parameter
of the function or, for the move operator, an xvalue referring to the
parameter. Each subobject is assigned in the manner appropriate to its
type:
[...]
— if the subobject is an array, each element is assigned, in the
manner appropriate to the element type;
[...]
As you see, each char element will be assigned individually. That is always safe.
However, under the as-if rule, a compiler may replace this with a memmove because it has identical behaviour for a char array. It could also replace it with a memcpy if it can guarantee that memcpy will result in this same behaviour, even if theoretically such a thing is undefined. Compilers can rely on theoretically undefined behaviour; one of the reasons undefined behaviour exists is so that compilers can define it to whatever is more appropriate for their operation.
Actually, in this case a compiler could take the as-if rule even further and not do anything with the array at all, since that also results in the same behaviour.
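To make the self-assignment case concrete, here is a minimal sketch. The value of SIZE, the extra id field, and the helper names make_sample and self_assign_preserves are assumptions for illustration; the question does not specify them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr std::size_t SIZE = 16;  // assumed value; the question leaves SIZE unspecified

struct A {
    int id;          // stand-in for "some fields"
    char buf[SIZE];
};

// Builds an A with known contents (hypothetical helper).
A make_sample() {
    A a{};                        // value-initialization zeroes all members
    a.id = 7;
    std::strcpy(a.buf, "hello");  // fits comfortably in SIZE bytes
    return a;
}

// Assigns `a` to itself through an alias and reports whether its contents
// are unchanged afterwards (hypothetical helper).
bool self_assign_preserves(A& a) {
    A before = a;   // implicitly-defined copy constructor: memberwise copy
    A& alias = a;   // alias avoids a compiler warning about literal `a = a`
    a = alias;      // implicitly-defined copy assignment: buf is assigned element by element
    return a.id == before.id && std::memcmp(a.buf, before.buf, SIZE) == 0;
}
```

Calling self_assign_preserves on the result of make_sample should report true, whatever the compiler chooses to emit for the assignment.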

Default assignment (and copy) does not memcpy the whole class, which would break things. Each member is copied using its copy constructor or assignment operator (depending on the operation). This is applied recursively to members and their members. When a basic data type is reached, the data is simply copied directly, similar to memcpy. So an array of basic data types may be copied in a memcpy-like way, but the whole class is not. If you add a std::string to your class, its operator= is called alongside the copy of the array. If you use an array of std::string, operator= is called for each string in the array; they are not memcpy'd.
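A small sketch of the memberwise behaviour described above. Counted, Holder, and assign_and_count are hypothetical names introduced only to count how often a member's operator= runs.

```cpp
#include <cassert>
#include <string>

// A member type that counts calls to its copy assignment operator.
struct Counted {
    static int assignments;
    Counted& operator=(const Counted&) {
        ++assignments;
        return *this;
    }
};
int Counted::assignments = 0;

// Holder has no user-declared assignment operator, so the compiler
// generates one that assigns each member in declaration order.
struct Holder {
    int raw[4];          // basic type: elements are copied directly
    std::string name;    // std::string::operator= is called
    Counted items[3];    // Counted::operator= is called once per element
};

// Assigns src to dst and returns how many Counted assignments happened.
int assign_and_count(Holder& dst, const Holder& src) {
    Counted::assignments = 0;
    dst = src;
    return Counted::assignments;
}
```

With three Counted elements in the array, one Holder assignment should produce three calls to Counted::operator=.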

Some limited experimentation tells me that g++ completely removes any attempt to copy a = a; [assuming it is obvious - I'm sure with sufficient messing about with pointers, it will eventually be possible to copy the same object over itself, and get undefined behaviour].

If memcpy is used, undefined behavior may occur.
How a given class is copied is an implementation detail. Both a memcpy() call and a copy constructor are compiled to machine code. However, your objects should not overlap in memory, because default assignment does not guarantee a correct result if they do.
So, what is the behavior of the default assignment operator for a C++ object?
As in the other responses, the behaviour is that it calls assignment on all class/struct members recursively. Technically, however, as in your case, it may just copy the whole block of memory, especially if your structure is POD (plain old data).

Related

Is it legal to construct data members of a struct separately?

class A;
class B;
//we have void *p pointing to enough free memory initially
std::pair<A,B> *pp=static_cast<std::pair<A,B> *>(p);
new (&pp->first) A(/*...*/);
new (&pp->second) B(/*...*/);
After the code above is executed, is *pp guaranteed to be in a valid state? I know the answer is true for every compiler I have tested, but the question is whether this is legal according to the standard and hence. In addition, is there any other way to obtain such a pair if A or B is not movable in C++98/03? (Thanks to #StoryTeller-UnslanderMonica, there is a piecewise constructor for std::pair since C++11.)
“Accessing” the members of the non-existent pair object is undefined behavior per [basic.life]/5; pair is never a POD-class (having user-declared constructors), so a pointer to its storage out of lifetime may not be used for its members. It’s not clear whether forming a pointer to the member is already undefined, or if the new is.
Neither is there a way to construct a pair of non-copyable (not movable, of course) types in C++98—that’s why the piecewise constructor was added along with move semantics.
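For reference, a minimal sketch of the C++11 piecewise constructor mentioned above. The member layout of A and B and the function name build_pair_sum are assumptions for illustration; the question only declares the two classes.

```cpp
#include <cassert>
#include <tuple>
#include <utility>

// Two types that can be neither copied nor moved (deleting the copy
// operations also suppresses the implicit move operations).
struct A {
    explicit A(int v) : value(v) {}
    A(const A&) = delete;
    A& operator=(const A&) = delete;
    int value;
};

struct B {
    B(int x, int y) : sum(x + y) {}
    B(const B&) = delete;
    B& operator=(const B&) = delete;
    int sum;
};

// Constructs both members in place: the first tuple holds the constructor
// arguments for A, the second tuple those for B.
int build_pair_sum() {
    std::pair<A, B> p(std::piecewise_construct,
                      std::forward_as_tuple(1),
                      std::forward_as_tuple(2, 3));
    return p.first.value + p.second.sum;
}
```

No copy or move of A or B is needed here, which is what makes this work for such types.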
A simpler question: is using a string literal well defined?
Not even that is, as its lifetime is not defined. You can't use a string literal in conforming code.
So the committee that never took the time to make string literals well defined obviously did not bother with specifying which objects of class type can be made to exist by placement new of its subobjects - polymorphic objects obviously cannot be created that way!
That standard did not even bother describing the semantic of union.
About lifetime the standard is all over the place, and that isn't just editorial: it reflects a deep disagreement between serious people about what begins a lifetime, what an object is, what an lvalue is, etc.
Notably people have all sorts of false or contradicting intuitions:
an infinite number of objects cannot be created by one call to malloc
an lvalue refers to an object
overlapping objects are against the object model
an unnamed object can only be created by new or by the compiler (temporaries)
...

Why not a consolidated Copy constructor and Assignment operator available in C++?

I understand the scenarios in which the respective functions (copy constructor and assignment operator) are called. Both functions do literally the same thing: they properly allocate memory for dynamic data members and copy the data from the passed argument object, so that both objects hold identical data. In that case, why doesn't C++ provide one consolidated function that is called in both scenarios, instead of complicating things with two variants?
They are not the same, and it would be a pain in the neck if someone forced them to be.
Copy construction is a way of creating an object. Amongst other things, base member initialisers can be used. In multi-threaded code, you don't need to worry so much about mutual exclusion units in a constructor since you cannot create the same object simultaneously!
An assignment operator does a rather different thing. It operates on an object that already exists and should return a reference to self. An implementation can do subtly different things here cf. copy construction. For example, a string class might not release resources if the new assigned string is smaller.
In simple cases they may well do the same thing and the return value of an assignment discarded. But in such cases you can rely on the ones that the compiler generates automatically.
They are so not the same. They might be the same in special cases, but generally, no.
when you have something like this:
std::vector myVec = myOtherVec;
It looks like assignment, but actually the copy constructor is being called.
The copy constructor starts an object from nothing.
This falls back to the basic question of what the difference is between malloc (the old C way to reserve memory) and new. The difference is that new calls the constructor of your object, which is very important in C++; otherwise we'd be talking about garbage memory that is not initialized unless you do so explicitly.
For example, in the internal implementation of std::vector, there is a size variable that tracks the number of elements actively acknowledged by the user with push_back() or resize (we're not talking about reserve).
Now imagine how it's implemented:
template <typename T>
class vector
{
    int size;
    T* theArray;
    void reserveMyMemory(); // ignoring allocators for simplicity
};
What's the difference between the copy constructor and assignment operator?
Assignment operator: Just copies size and the array content.
Copy constructor: Must reserve memory and initialize variables, then copy.
Now imagine that reserving memory requires checking size and whether theArray is nullptr. What happens if it internally uses the assignment operator? A catastrophe, because the values are not initialized. So you need a constructor to start.
In this case, the copy constructor is more general, because it should initialize variables, then copy the elements it has to copy. Of course, this whole example is just a demonstration. Don't take it literally for std::vector, the STL doesn't work that way.
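To see which function each syntax actually calls, here is a small sketch. Tracer and run_demo are hypothetical names; the counters exist only to record the calls.

```cpp
#include <cassert>

// Records how often each copy operation runs.
struct Tracer {
    static int copy_ctor_calls;
    static int copy_assign_calls;

    Tracer() = default;
    Tracer(const Tracer&) { ++copy_ctor_calls; }
    Tracer& operator=(const Tracer&) {
        ++copy_assign_calls;
        return *this;
    }
};
int Tracer::copy_ctor_calls = 0;
int Tracer::copy_assign_calls = 0;

void run_demo() {
    Tracer::copy_ctor_calls = 0;
    Tracer::copy_assign_calls = 0;

    Tracer original;
    Tracer a = original;  // looks like assignment, but creates `a`: copy constructor
    Tracer b;             // `b` already exists after this line
    b = original;         // existing object: copy assignment operator
    (void)a;
}
```

After run_demo, each counter should read exactly one.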

Managing trivial types

I have found the intricacies of trivial types in C++ non-trivial to understand and hope someone can enlighten me on the following.
Given type T, storage for T allocated using ::operator new(std::size_t) or ::operator new[](std::size_t) or std::aligned_storage, and a void * p pointing to a location in that storage suitably aligned for T so that it may be constructed at p:
If std::is_trivially_default_constructible<T>::value holds, is the code invoking undefined behavior when code skips initialization of T at p (i.e. by using T * tPtr = new (p) T();) before otherwise accessing *p as T? Can one just use T * tPtr = static_cast<T *>(p); instead without fear of undefined behavior in this case?
If std::is_trivially_destructible<T>::value holds, does skipping destruction of T at *p (i.e by calling tPtr->~T();) cause undefined behavior?
For any type U for which std::is_trivially_assignable<T, U>::value holds, is std::memcpy(&t, &u, sizeof(U)); equivalent to t = std::forward<U>(u); (for any t of type T and u of type U) or will it cause undefined behavior?
No, you can't. There is no object of type T in that storage, and accessing the storage as if there was is undefined. See also T.C.'s answer here.
Just to clarify on the wording in [basic.life]/1, which says that objects with vacuous initialization are alive from the storage allocation onward: that wording obviously refers to an object's initialization. There is no object whose initialization is vacuous when allocating raw storage with operator new or malloc, hence we cannot consider "it" alive, because "it" does not exist. In fact, only objects created by a definition with vacuous initialization can be accessed after storage has been allocated but before the vacuous initialization occurs (i.e. their definition is encountered).
Omitting destructor calls never per se leads to undefined behavior. However, it's pointless to attempt any optimizations in this area in e.g. templates, since a trivial destructor is just optimized away.
Right now, the requirement is being trivially copyable, and the types have to match. However, this may be too strict. Dos Reis's N3751 at least proposes distinct types to work as well, and I could imagine this rule being extended to trivial copy assignment across one type in the future.
However, what you've specifically shown does not make a lot of sense (not least because you're asking for assignment to a scalar xvalue, which is ill-formed), since trivial assignment can hold between types whose assignment is not actually "trivial", that is, has the same semantics as memcpy. E.g. is_trivially_assignable<int&, double> does not imply that one can be "assigned" to the other by copying the object representation.
Technically, reinterpreting storage is not enough to introduce a new object. The note on trivial default constructors states:
A trivial default constructor is a constructor that performs no action. All data types compatible with the C language (POD types) are trivially default-constructible. Unlike in C, however, objects with trivial default constructors cannot be created by simply reinterpreting suitably aligned storage, such as memory allocated with std::malloc: placement-new is required to formally introduce a new object and avoid potential undefined behavior.
But the note says it's a formal limitation, so probably it is safe in many cases. Not guaranteed though.
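A minimal sketch of the placement-new route that the note describes. Point and make_and_use_point are hypothetical names, and std::malloc stands in for whichever raw storage source you use.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <type_traits>

struct Point {
    int x;
    int y;
};

static_assert(std::is_trivially_default_constructible<Point>::value,
              "Point has a trivial default constructor");
static_assert(std::is_trivially_destructible<Point>::value,
              "Point has a trivial destructor");

// Creates a Point in raw storage, uses it, and releases the storage.
// Returns -1 if the allocation fails.
int make_and_use_point() {
    void* p = std::malloc(sizeof(Point));  // raw storage, suitably aligned for Point
    if (p == nullptr) {
        return -1;
    }
    Point* pt = new (p) Point;  // formally starts the object's lifetime; members are uninitialized
    pt->x = 3;
    pt->y = 4;
    int result = pt->x + pt->y;
    // The destructor is trivial, so no explicit pt->~Point() call is made here.
    std::free(p);
    return result;
}
```

The placement-new on a trivially default-constructible type typically compiles to nothing, so being formally correct costs nothing at run time.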
No. is_assignable does not even guarantee the assignment will be legal under certain conditions:
This trait does not check anything outside the immediate context of the assignment expression: if the use of T or U would trigger template specializations, generation of implicitly-defined special member functions etc, and those have errors, the actual assignment may not compile even if std::is_assignable::value compiles and evaluates to true.
What you describe looks more like is_trivially_copyable, which says:
Objects of trivially-copyable types are the only C++ objects that may be safely copied with std::memcpy or serialized to/from binary files with std::ofstream::write()/std::ifstream::read().
I don't really know. I would trust KerrekSB's comments.

Are the following inlined functions guaranteed to have the same implementation?

Are the following functions guaranteed to have the same implementation (i.e. object code)?
Does this change if Foo below is a primitive type instead (e.g. int)?
Does this change with the size of Foo?
Returning by value:
inline Foo getMyFooValue() { return myFoo; }
Foo foo = getMyFooValue();
Returning by reference:
inline const Foo &getMyFooReference() { return myFoo; }
Foo foo = getMyFooReference();
Modifying in place:
inline void getMyFooInPlace(Foo &theirFoo) { theirFoo = myFoo; }
Foo foo;
getMyFooInPlace(foo);
Are the following functions guaranteed to have the same implementation (i.e. object code)?
No, the language only specifies behaviour, not code generation, so it's up to the compiler whether two pieces of code with equivalent behaviour produce the same object code.
Does this change if Foo below is a primitive type instead (e.g. int)?
If it is (or, more generally, if it's trivially copyable), then all three have the same behaviour, so can be expected to produce similar code.
If it's a non-trivial class type, then it depends on what the class's special functions do. Each calls these functions in slightly different ways:
The first might copy-initialise a temporary object (calling the copy constructor), copy-initialise foo with that, then destroy the temporary (calling the destructor); but more likely it will elide the temporary, becoming equivalent to the second.
The second will copy-initialise foo (calling the copy constructor)
The third will default initialise foo (calling the default constructor), then copy-assign to it (calling the assignment operator).
So whether or not they are equivalent depends on whether default-initialisation and copy-assignment has equivalent behaviour to copy-initialisation, and (perhaps) whether creating and destroying a temporary has side effects. If they are equivalent, then you'll probably get similar code.
Does this change with the size of Foo?
No, the size is irrelevant. What matters is whether it's trivial (so that both copy initialisation and copy assignment simply copy bytes) or non-trivial (so that they call user-defined functions, which might or might not be equivalent to each other).
The standard draft N3337 contains the following rules in 1.9.5: "A conforming [C++] implementation [...] shall produce the same observable behaviour as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input." And in 1.9.9 it defines the observable behaviour basically as I/O and volatile's values. Which means that as long as the I/O and volatiles of your program stay the same the implementation can do what it wants. If you have no I/O or volatiles the program doesn't need to do anything (which makes benchmarks hard to get right with high optimizations).
Note that the standard specifically is totally silent about what code a compiler should emit. Hell, it could probably interpret the sources.
This answers your question: No.
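The difference in which special member functions each variant calls can be observed with a counting class. This is only a sketch: the counters and reset_counts are hypothetical additions, and myFoo is assumed to be a global object, which the question does not state.

```cpp
#include <cassert>

// A non-trivial Foo that counts calls to its special member functions.
struct Foo {
    static int default_ctor;
    static int copy_ctor;
    static int copy_assign;

    Foo() { ++default_ctor; }
    Foo(const Foo&) { ++copy_ctor; }
    Foo& operator=(const Foo&) {
        ++copy_assign;
        return *this;
    }
};
int Foo::default_ctor = 0;
int Foo::copy_ctor = 0;
int Foo::copy_assign = 0;

Foo myFoo;  // assumed to be a global here

inline Foo getMyFooValue() { return myFoo; }
inline const Foo& getMyFooReference() { return myFoo; }
inline void getMyFooInPlace(Foo& theirFoo) { theirFoo = myFoo; }

// Clears all counters before a measurement.
void reset_counts() {
    Foo::default_ctor = 0;
    Foo::copy_ctor = 0;
    Foo::copy_assign = 0;
}
```

Returning by reference should show exactly one copy construction. Returning by value shows one or two, depending on whether the temporary is elided. The in-place version shows one default construction followed by one copy assignment.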

When is a type in c++11 allowed to be memcpyed?

My question is the following:
If I want to copy a class type, memcpy can do it very fast. This is allowed in some situations.
We have some type traits:
is_standard_layout.
is_trivially_copyable.
What I would like to know is the exact requirements when a type will be "bitwise copyable".
My conclusion is that a type is bitwise copyable if both the is_trivially_copyable and is_standard_layout traits are true:
Is this exactly what I need for a bitwise copy?
Is it overconstrained?
Is it underconstrained?
P.S.: of course, the result of memcpy must be correct. I know I could memcpy in any situation but incorrectly.
You can copy an object of type T using memcpy when is_trivially_copyable<T>::value is true. There is no particular need for the type to be a standard layout type. The definition of 'trivially copyable' is essentially that it's safe to do this.
An example of a class that is safe to copy with memcpy but which is not standard layout:
struct T {
int i;
private:
int j;
};
Because this class uses different access control for different non-static data members it is not standard layout, but it is still trivially copyable.
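The claim can be checked at compile time. In this sketch, Mixed is a hypothetical variant of the struct above with a constructor and an accessor added so that the private member can be set and read; neither addition affects the two traits.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

class Mixed {
public:
    Mixed(int a, int b) : i(a), j(b) {}
    int second() const { return j; }
    int i;
private:
    int j;  // different access control than i
};

static_assert(std::is_trivially_copyable<Mixed>::value,
              "Mixed may be copied with memcpy");
static_assert(!std::is_standard_layout<Mixed>::value,
              "mixed access control makes Mixed non-standard-layout");

// Copies the object representation of src into a fresh Mixed.
Mixed copy_bytes(const Mixed& src) {
    Mixed dst(0, 0);
    std::memcpy(&dst, &src, sizeof(Mixed));
    return dst;
}
```

Both static_asserts should pass, and the memcpy'd object holds the same values as the source.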
If std::is_trivially_copyable<T>::value (or, in C++14, std::is_trivially_copyable<T>(), or, in C++17, std::is_trivially_copyable_v<T>) is true, the type can be copied using memcpy.
Per the C++ standard, a type being trivially copyable means:
the underlying bytes making up the object can be copied into an array
of char or unsigned char. If the content of the array of char or unsigned char is copied back into the
object, the object shall subsequently hold its original value.
However, it is important to realise that pointers are trivially copyable types, too. Whenever there are pointers inside the data structures you will be copying, you have to brainually make sure that copying them around is proper.
Examples where hazard may be caused by just relying on the object being trivially copyable:
A tree-structure implementation where your data is placed in a contiguous region of memory, but with nodes storing absolute addresses to child nodes
Creating multiple instances of some data for sake of multithreading performance (in order to reduce cache crashes), with owning pointers inside, pointing anywhere
You have a flat object without pointers, but with an embedded third party structure inside. The third party structure at some point in the future includes a pointer that should not exist twice or more.
So whenever memcopying, keep in mind to check whether pointers could be copied in that specific case, and if that would be okay.
Realise that is_trivially_copyable is only the "Syntax Check", not the "Semantic Test", in compiler parlance.
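A minimal sketch of the pointer hazard described above. Node and copy_points_into_original are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Trivially copyable, yet it stores an absolute address into itself.
struct Node {
    int value;
    int* current;  // intended to point at this object's own `value`
};

static_assert(std::is_trivially_copyable<Node>::value,
              "the trait is satisfied even though a bitwise copy is semantically wrong");

// Returns true if the memcpy'd copy still points into the original object.
bool copy_points_into_original() {
    Node original{42, nullptr};
    original.current = &original.value;

    Node copy{0, nullptr};
    std::memcpy(&copy, &original, sizeof(Node));

    // The bytes were copied faithfully, so the pointer still targets `original`.
    return copy.current == &original.value && copy.current != &copy.value;
}
```

The memcpy itself is well defined here; the problem is that the copy's invariant (pointing at its own member) no longer holds.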
From http://en.cppreference.com/w/cpp/types/is_trivially_copyable:
Objects of trivially-copyable types are the only C++ objects that may be safely copied with std::memcpy or serialized to/from binary files with std::ofstream::write()/std::ifstream::read(). In general, a trivially copyable type is any type for which the underlying bytes can be copied to an array of char or unsigned char and into a new object of the same type, and the resulting object would have the same value as the original.
Objects with trivial copy constructors, trivial copy assignment operators and
trivial destructors can be copied with memcpy or memmove
The requirements for a special member function of a class T to be trivial are
Copy constructors (cc) and copy assignment operators (ca)
Not being user-provided (meaning, it is implicitly-defined or defaulted), and if it is defaulted, its signature is the same as implicitly-defined
T has no virtual member functions
T has no virtual base classes
The cc/ca selected for every direct base of T is trivial
The cc/ca selected for every non-static class type (or array of class type) member of T is trivial
T has no non-static data members of volatile-qualified type (since C++14)
Destructors
Not being user-provided (meaning, it is implicitly-defined or defaulted)
Not being virtual (that is, the base class destructor is not virtual)
All direct base classes have trivial destructors
All non-static data members of class type (or array of class type) have trivial destructors
Just declaring the function as = default doesn’t make it trivial (it will only be trivial if
the class also supports all the other criteria for the corresponding function to be trivial)
but explicitly writing the function in user code does prevent it from being trivial. Also all data types compatible with the C language (POD types) are trivially copyable.
Source : C++ Concurrency in action and cppreference.com
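The effect of = default versus a hand-written copy constructor can be checked with the type trait. The three struct names below are hypothetical and exist only for this sketch.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Copy constructor defaulted on its first declaration: still trivial.
struct Defaulted {
    int x;
    Defaulted() = default;
    Defaulted(const Defaulted&) = default;
};

// Hand-written copy constructor: user-provided, hence not trivial,
// even though it does the same thing as the defaulted one.
struct UserProvided {
    int x;
    UserProvided() = default;
    UserProvided(const UserProvided& other) : x(other.x) {}
};

// Defaulted, but a member (std::string) has a non-trivial copy constructor,
// so the defaulted one is not trivial either.
struct DefaultedWithString {
    std::string s;
    DefaultedWithString() = default;
    DefaultedWithString(const DefaultedWithString&) = default;
};

static_assert(std::is_trivially_copyable<Defaulted>::value, "");
static_assert(!std::is_trivially_copyable<UserProvided>::value, "");
static_assert(!std::is_trivially_copyable<DefaultedWithString>::value, "");
```

All three static_asserts should hold, matching the rule that = default is necessary but not sufficient for triviality.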
What I understood is
An object should have a default constructor and destructor.
Default copy and move operations.
No virtual functions, no multiple access specifiers for non-static data members (which prevent important layout optimizations), and no non-static member or base that is not standard layout.
You can test whether a given type is a POD (Plain Old Data) type by using the standard type trait std::is_pod<T>::value.
Reference: The C++ programming Language 4th edition