I am trying to understand the meaning of the type properties and supported operations listed at https://en.cppreference.com/w/cpp/types .
I have a library that implements things similar to std::copy_n in terms of low-level functions. Of course I would have to implement uninitialized_copy as well, but for many types that are in some sense trivial, I don't want to repeat code; I would rather delegate one to the other.
What property makes uninitialized_copy(_n) semantically substitutable by copy(_n)?
My guess is that it is std::is_trivially_default_constructible<T>::value, but I am not sure.
The justification is that if something is trivially default constructible I can skip the initialization before the assignment.
(Initially I thought it could be std::is_trivially_assignable<TSource, T>::value)
Of course I could play safe and ask for std::is_trivial<T>::value, but I want to be more targeted.
I know there could be Machiavellian or misleading definitions of these traits, but suppose I want to trust them.
Example code:
template<class... As, class... Bs, class Size>
auto uninitialized_copy_n(
    my_iterator<As...> first, Size count,
    my_iterator<Bs...> d_first
) -> my_iterator<Bs...> {
    using T1 = typename my_iterator<As...>::value_type;
    using T2 = typename my_iterator<Bs...>::value_type;
    if constexpr (std::is_trivially_constructible<T2>::value) {
        return copy_n(first, count, d_first);
    }
    // ... another implementation or abort
}
To replace uninitialized_copy with copy, the two actions must be equivalent:
uninitialized_copy: construct a new object of type Destination from an object of type Source
copy: assume an object of type Destination exists and assign to it from an object of type Source
One must put sufficient requirements on the type such that the destination object exists and that construction and assignment give the same value.
is_trivially_default_constructible is neither necessary nor sufficient to assume an object exists at a memory location. Instead, the destination type must be an implicit lifetime type. That is not a recursive condition, so we must require the type be an implicit lifetime type, and recursively all members or subobjects are implicit lifetime types.
For example, scalars or classes with trivial default constructor and trivial destructor will work.
Secondly, we need construction and assignment to give the same result. This is impossible if these operations aren't trivial. But in C++, the value of an object is only guaranteed to be unchanged by such operations for trivially copyable types, which tie the value to the representation. Therefore, we probably want the destination to be trivially copyable.
This gives us:
std::is_trivially_default_constructible_v<Destination>
std::is_trivially_destructible_v<Destination>
std::is_trivially_constructible_v<Destination, const Source&>
std::is_trivially_assignable_v<Destination&, const Source&>
std::is_trivially_copyable_v<Destination>
This simplifies to:
std::is_trivial_v<Destination>
std::is_trivially_constructible_v<Destination, const Source&>
std::is_trivially_assignable_v<Destination&, const Source&>
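As a rough sketch of how the conditions above could be packaged, the five traits can be combined into one variable template and used to pick the implementation. The names copy_can_replace_uninitialized_copy_v, uninitialized_copy_n_sketch and with_ctor are hypothetical, and raw pointers stand in for the question's my_iterator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>

// Hypothetical trait name; it only spells out the five conditions listed above.
template <class Destination, class Source>
inline constexpr bool copy_can_replace_uninitialized_copy_v =
    std::is_trivially_default_constructible_v<Destination> &&
    std::is_trivially_destructible_v<Destination> &&
    std::is_trivially_constructible_v<Destination, const Source&> &&
    std::is_trivially_assignable_v<Destination&, const Source&> &&
    std::is_trivially_copyable_v<Destination>;

// Pointer-based stand-in for the iterator version in the question.
template <class Source, class Destination>
Destination* uninitialized_copy_n_sketch(const Source* first, std::size_t count,
                                         Destination* d_first) {
    if constexpr (copy_can_replace_uninitialized_copy_v<Destination, Source>) {
        return std::copy_n(first, count, d_first);                // assignment path
    } else {
        return std::uninitialized_copy_n(first, count, d_first);  // construction path
    }
}

struct with_ctor {
    int a;
    with_ctor() : a(0) {}  // user-provided, so not trivially default constructible
};

static_assert(copy_can_replace_uninitialized_copy_v<int, int>);
static_assert(!copy_can_replace_uninitialized_copy_v<with_ctor, with_ctor>);

inline void usage_example() {
    const int source[3] = {1, 2, 3};
    int destination[3] = {};
    int* end = uninitialized_copy_n_sketch(source, 3, destination);
    assert(end == destination + 3);
    assert(destination[0] == 1 && destination[2] == 3);
}
```

Types that fail the check simply fall through to the constructing path, so the trait only ever enables the shortcut.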
Note that libstdc++ requires both types to be trivial and the Destination assignable and constructible from the Source. The comment references a slightly different condition on the ranges algorithm instead.
The libstdc++ requirement has observable differences. See this example. value should be 1 (from construction) but is being populated with 2 (from assignment). I cannot find any license given to implementations to differ in this way.
The libstdc++ logic was recently found to be buggy and likely still is.
I made a class for a function's argument to delegate its validation and also for function overloading purposes.
Throwing from constructor guarantees that the object will either be constructed in a valid state or will not be constructed at all. Hence, there is no need to introduce any checking member functions like explicit operator bool() const.
// just for exposition
const auto &certString = open_cert();
add_certificate(cert_pem{certString.cbegin(), certString.cend()}); // this will either throw
// or add a valid certificate.
// cert_pem is a temporary
However, there are issues for which I don't see an appealing solution:
Argument-validation class might itself be made non-persistent - to be used only for validation as a temporary object. But what about classes that are allowed to be persistent? That is living after function invocation:
// just for exposition
const auto &certString = open_cert();
cert_pem cert{certString.cbegin(), certString.cend()}; // allowed to throw
cert_pem moved = std::move(cert); // cert invalidated
cert_pem cert_invalid = std::move(cert); // is not allowed to throw
add_certificate(cert_invalid); // we lost the whole purpose
I can see several ways to treat this without introducing state-checking (thus declaring a class stateful) functions:
1) Declare the object "unusable" after a move. - A really simple recipe for disaster.
2) Declare the move constructor and move assignment operator deleted and allow only copying. - Resources might be very expensive to copy, or copying might not even be possible when using the PIMPL idiom.
3) Use heap allocation when an object needs to be persistent. - This looks like the most obvious choice, but it has an unnecessary performance penalty, especially when a class has several such objects as members: there will be several memory allocations upon construction.
Here is a code example for 2):
/**
* Class that contains PEM certificate byte array.
* To be used as an argument. Ensures that input certificate is valid, otherwise throws on construction.
*/
class cert_pem final
{
public:
template <typename IterT>
cert_pem(IterT begin, IterT end)
: value_(begin, end)
{
Validate(value_);
}
const std::vector<uint8_t>& Value() const noexcept(false)
{
return value_;
}
cert_pem (const cert_pem &) = default;
cert_pem & operator=(const cert_pem &) = default;
cert_pem (cert_pem &&) = delete;
cert_pem & operator=(cert_pem &&) = delete;
private:
/**
* \throws std::invalid_argument
*/
static void Validate(const std::vector<uint8_t>& value) noexcept(false);
static void ValidateNotEmpty(const std::vector<uint8_t>& value) noexcept(false);
private:
std::vector<uint8_t> value_;
};
Is there another way to handle this problem without these shortcomings? Or will I have to choose one of the above?
I think that with argument-validating classes a good way would be to not allow it to be persistent - only temporary object is allowed. But I am not sure if it is possible in C++.
You are trying to maintain two invariants at once, and their semantics are in conflict. The first invariant is the validity of the certificate. The second is for memory management.
For the first invariant, you decided that there can be no invalid constructed object, but for the second, you decided that the object can be either valid or unspecified†. This is only possible because the deallocation has a check somewhere.
There is no way around this: you either add a check for the first or you decouple the invariants. One way of decoupling them is to follow the design of std::lock_guard
cert c = open_cert(); // c is guaranteed to not have memory leaks and is movable
{
cert_guard cg{c}; // cg is guaranteed to be valid, but cg is non-movable
}
But wait, you might ask, how do you transfer the validity to another cert_guard?
Well, you can't.
That is the semantics you chose for the first invariant: it is valid exactly during the lifetime of the object. That is the entire point.
† Unspecified and invalid as far as the certificate is concerned.
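A minimal sketch of that decoupling might look as follows. The cert and cert_guard classes here are hypothetical stand-ins, and the non-empty check is only a placeholder for real certificate validation.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical movable owner of the raw bytes; it makes no validity promise.
class cert {
public:
    explicit cert(std::string pem) : pem_(std::move(pem)) {}
    const std::string& bytes() const noexcept { return pem_; }
private:
    std::string pem_;
};

// Hypothetical guard: it exists only if validation succeeded, and it cannot
// be copied or moved, so its validity cannot be transferred anywhere.
class cert_guard {
public:
    explicit cert_guard(const cert& c) : cert_(c) {
        // Placeholder check standing in for real PEM validation.
        if (c.bytes().empty()) throw std::invalid_argument("invalid certificate");
    }
    cert_guard(const cert_guard&) = delete;
    cert_guard& operator=(const cert_guard&) = delete;
    const std::string& value() const noexcept { return cert_.bytes(); }
private:
    const cert& cert_;
};

static_assert(std::is_move_constructible_v<cert>);
static_assert(!std::is_copy_constructible_v<cert_guard>);
static_assert(!std::is_move_constructible_v<cert_guard>);

inline void usage_example() {
    cert c{"-----BEGIN CERTIFICATE-----"};
    {
        cert_guard cg{c};  // valid for exactly this scope
        assert(!cg.value().empty());
    }
    cert empty_cert{""};
    bool threw = false;
    try {
        cert_guard cg{empty_cert};
    } catch (const std::invalid_argument&) {
        threw = true;
    }
    assert(threw);
}
```

A function such as add_certificate would then take a const cert_guard& rather than a cert, so callers have to pass through the check.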
The question aims to design a type such that:
(1) an object of the type always satisfies a given invariant
(2) an object of the type is "usable" as a non-temporary
The question then makes a leap from (2) to ask that the type be movable. But it need not be: copy and move operations could be defined as deleted. The question fails to motivate why the move operations are necessary. If that is a need, it comes from an unstated requirement. A non-movable class can be emplaced in a map, returned from a function, and used in many other ways. It admittedly can be more painful to use, but it can be used.
So that's one option that's not listed: define copy and move operations as deleted.
Otherwise, let's assume we do want:
(1) an object of the type always satisfies a given invariant
(2) the type is movable
This is not in conflict. Every copyable class is movable, and copying is a valid strategy here. Remember that move operations allow a "potentially smarter" copy, by allowing the source to be mutated. There are still two C++ objects, and it is still a logical copy, but with an assumption that the source won't be needed anymore in its current state (so you can steal from it!). There is no difference in the C++ interface, only in the totally unchecked documented behavior of the type after a move operation.
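Declaring only the copy operations gives you a copyable class. This is close to your second option, with one correction: the move operations must be left undeclared rather than defined as deleted. An explicitly deleted move constructor still takes part in overload resolution, so with the class as posted, initializing from an xvalue (cert_pem moved = std::move(cert)) does not compile. If the move operations are simply not declared, the same statement compiles, calls the copy constructor, and does not invalidate the source, and the language still considers the type movable. The trade-off is as you note: copies can be expensive. Note that PIMPL authors can give their types copy operations; that's a choice they make about what the interface of the type should be, and the idiom doesn't prevent it.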
The third choice is a version of the second. By putting values behind a shared_ptr, one can make an expensive-to-copy type cheap to copy. But we still rely on copy as the strategy for move.
The first choice amounts to weakening the invariant in (1). A moved-from object satisfying a different set of invariants than a normal object is very typical in C++. It is annoying, but in many cases it is the best we can do. When only one object can exist satisfying the invariant (think: non-null unique_ptr) the moved-from object must violate it.
The accepted answer amounts to my first option combined with delayed construction: define copy and move operations as deleted. Creating the guard can throw if the object was moved-from. The guard is just the type maintaining the invariant, and it is non-movable. We can delay its construction because such types are difficult to manage. We do that by keeping an object that knows enough about how to construct it. This strategy exists in other forms (emplace functions and piecewise_construct constructors to construct objects in their eventual place, factory functions to construct the object at will, etc.).
However, the description in the accepted answer leaves a bit to be desired, in my opinion. The desire is to maintain the invariant while being movable (this is assumed). Being movable doesn't require that the moved-from object satisfy an invariant or be unspecified. That's a choice the author of the type makes, and what choices are available is exactly the question, by my reading of it. Although the example given only implicated memory, and the first answer mentioned memory, my reading of the question was more general: maintaining invariants in movable classes.
Knowing that all copyable classes are movable, that move is a "smart" copy, and that there are two objects in and after a move operation will help in understanding why there are such limited options here. One has to leave that source object in some state.
My advice is to embrace the radioactive moved-from object. That's the approach in the standard library, and defaulted move operations will obey that more often than not. For such types, there must be some "empty" state for moved-from objects, so all types are effectively optional and a default constructor can also be defined to get an object in that empty state.
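To make the "empty state" idea concrete, here is a minimal sketch. The class name pem_bytes is hypothetical, the non-empty check is a placeholder for real validation, and the move operations are written out by hand only so that the moved-from state is explicit.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical type: either holds validated bytes or is in the empty state.
class pem_bytes {
public:
    pem_bytes() = default;  // the empty state, also used for moved-from objects

    explicit pem_bytes(std::vector<std::uint8_t> bytes) : bytes_(std::move(bytes)) {
        // Placeholder for real validation.
        if (bytes_.empty()) throw std::invalid_argument("invalid certificate");
    }

    pem_bytes(const pem_bytes&) = default;
    pem_bytes& operator=(const pem_bytes&) = default;

    // Moves steal the buffer and leave the source in the documented empty state.
    pem_bytes(pem_bytes&& other) noexcept
        : bytes_(std::exchange(other.bytes_, {})) {}
    pem_bytes& operator=(pem_bytes&& other) noexcept {
        bytes_ = std::exchange(other.bytes_, {});
        return *this;
    }

    bool empty() const noexcept { return bytes_.empty(); }
    const std::vector<std::uint8_t>& value() const noexcept { return bytes_; }

private:
    std::vector<std::uint8_t> bytes_;
};

inline void usage_example() {
    pem_bytes a{std::vector<std::uint8_t>{1, 2, 3}};
    pem_bytes b = std::move(a);
    assert(a.empty());              // moved-from object is in the empty state
    assert(b.value().size() == 3);  // the destination holds the bytes
    assert(pem_bytes{}.empty());    // default construction gives the same state
}
```

The invariant becomes "empty or valid", which is weaker than "always valid" but can be stated and checked.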
I am not sure I understand the difference between a trivial and a non-trivial object. For instance, it is said here that a trivial object:
occupies a contiguous memory area
does not contain user-provided constructors, operators, or a destructor.
But do objects automatically align their data in memory? What if a type meets both points but has methods? Is this related to POD?
"Trivial" sounds to me like something that can be used in much the same way as a simple type, but I guess it is more complicated than that.
The official definition of a trivial type can be found here.
In simpler terms, a trivial type is either a fundamental type (int, float, etc.) or a type composed of only other trivial types, and without any of the special member functions listed here. Other member functions don't play a role.
The point of triviality is that the type can be treated exactly like a fundamental type, in that objects of the type can be copied and moved with memcpy, and constructed and destructed without doing anything. Hence, triviality requires a type be essentially made only of fundamental types. This is what makes the copy, move, construction, and destruction operations relevant to the definition of trivial types. Other member functions don't play a role in triviality, just as you can write void fn( int*, OtherArgs... ) without affecting whether or not an int is trivial, because you can think of member functions of T as essentially being free functions with the signature ReturnType member_function( T*, OtherArgs... ) that the compiler lets you call with the syntax a.member_function( other_args... ).
As for alignment, it simply isn't all that relevant because it's all taken care of for you. The compiler knows the alignment of the types it's working with, thanks to the strong static type system.
As you can see here, all POD types are trivial.
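The claims above can be checked with the standard type traits. In this small sketch the types point, counted and owner are made up for illustration: an ordinary member function leaves a type trivial, while a user-provided constructor or destructor removes the corresponding property.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// An ordinary member function does not affect triviality.
struct point {
    int x;
    int y;
    int sum() const { return x + y; }
};

// A user-provided default constructor: no longer trivially default constructible.
struct counted {
    int n;
    counted() : n(0) {}
};

// A user-provided destructor: no longer trivially destructible or copyable.
struct owner {
    int* p;
    ~owner() {}
};

static_assert(std::is_trivially_default_constructible_v<point>);
static_assert(std::is_trivially_copyable_v<point>);
static_assert(!std::is_trivially_default_constructible_v<counted>);
static_assert(std::is_trivially_copyable_v<counted>);
static_assert(!std::is_trivially_destructible_v<owner>);
static_assert(!std::is_trivially_copyable_v<owner>);

// Because point is trivially copyable, memcpy produces a proper copy.
inline point copy_with_memcpy(const point& source) {
    point destination;
    std::memcpy(&destination, &source, sizeof(point));
    return destination;
}

inline void usage_example() {
    const point p{1, 2};
    const point q = copy_with_memcpy(p);
    assert(q.x == 1 && q.y == 2);
    assert(q.sum() == 3);
}
```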
Assume I wrote a type that is something like a static_string (just a pair of size_t and N chars where N is template int parameter). So my type can be safely memcopied and there is no need to run a destructor. But it has user provided copy constructor so it is not detected as trivially copyable by C++ language.
I would like to tell the users of my type they can memcopy my type and that there is no need to run a destructor.
I always assumed that I can just specialize type_traits but I recently learned it is UB to do so.
If there is no way to do this with type traits:
is there a named concept in C++20 that my type satisfies so at least in comment I can use that instead of words?
P.S. I know it is a bad idea to write types like this, but some use cases exist: optimization, shared memory(where you do not want strings to heap alloc).
So my type can be safely memcopied and there is no need to run a destructor. But it has user provided copy constructor so it is not detected as trivially copyable by C++ language.
This is a contradiction. As far as the C++ language is concerned, if your type is not TriviallyCopyable, then it is not "safely memcopied". If memcopying is equivalent to a copy, then that must mean that copying the object is equivalent to a memcpy. If two operations are the same, then they must be the same.
There is no way to resolve this contradiction without sacrificing one of these. Either make all copying equivalent (memcpy and copy constructor/assignment) or memcpy must not be allowed.
The question is this: how important is trivial copyability compared to the optimization of only copying the characters that actually have a value in a copy constructor/assignment operator? You cannot have both. Personally, I would say that static strings should never be large enough that the cost of just copying the whole thing should matter all that much. And in the rare cases where that's actually important, provide a specialized function to do the copying.
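One way to follow that suggestion, sketched under assumptions: leave the copy operations implicit so the type really is trivially copyable, and offer the "copy only the used characters" optimization as a separate function. The names static_string, make_static_string and copy_used_chars are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical fixed-capacity string with no user-provided copy operations.
template <std::size_t N>
struct static_string {
    std::size_t size = 0;
    char chars[N] = {};
};

static_assert(std::is_trivially_copyable_v<static_string<16>>);
static_assert(std::is_trivially_destructible_v<static_string<16>>);

// Builds a string from a C string, truncating to the capacity N.
template <std::size_t N>
static_string<N> make_static_string(const char* text) {
    static_string<N> result;
    const std::size_t length = std::strlen(text);
    result.size = length < N ? length : N;
    std::memcpy(result.chars, text, result.size);
    return result;
}

// The specialized copy: touches only the characters that are in use.
template <std::size_t N>
void copy_used_chars(static_string<N>& destination,
                     const static_string<N>& source) noexcept {
    std::memcpy(destination.chars, source.chars, source.size);
    destination.size = source.size;
}

inline void usage_example() {
    const auto a = make_static_string<16>("hello");
    static_string<16> b;
    copy_used_chars(b, a);
    assert(b.size == 5);
    assert(std::memcmp(b.chars, "hello", 5) == 0);
    const static_string<16> c = a;  // implicit copy, equivalent to a memcpy
    assert(c.size == 5);
}
```

With this layout the trait itself documents the property, so no comment or trait specialization is needed.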
The answer is No. There is no "atomic" named concept in C++20 that satisfies your type. Of course you could define a user defined concept which satisfies your type. But this is not what you want. Since the question is about documentation I advise you to use words (not code) as a comment.
In C, we have the functions memcpy and memmove to efficiently copy data around. The former yields undefined behavior if the source and destination regions overlap, but the latter is guaranteed to deal with that "as expected," presumably by noticing the direction of overlap and (if necessary) choosing a different algorithm.
The above functions are available in C++ (as std::memcpy and std::memmove), of course, but they don't really work with non-trivial classes. Instead, we get std::copy and std::copy_backward. Each of these works if the source and destination ranges don't overlap; moreover, each is guaranteed to work for one "direction" of overlap.
What can we use if we want to copy from one region to another and we don't know at compile-time if the ranges may overlap or in what direction that overlap may occur? It doesn't seem that we have an option. For general iterators it may be difficult to determine if ranges overlap, so I understand why no solution is provided in that case, but what about when we're dealing with pointers? Ideally, there'd be a function like:
template<class T>
T * copy_either_direction(const T * inputBegin, const T * inputEnd, T * outputBegin) {
if ("outputBegin ∈ [inputBegin, inputEnd)") {
outputBegin += (inputEnd - inputBegin);
std::copy_backward(inputBegin, inputEnd, outputBegin);
return outputBegin;
} else {
return std::copy(inputBegin, inputEnd, outputBegin);
}
}
(A similar function with T * replaced with std::vector<T>::iterator would also be nice. Even better would be if this were guaranteed to work if inputBegin == outputBegin, but that's a separate gripe of mine.)
Unfortunately, I don't see a sensible way to write the condition in the if statement, as comparing pointers into separate blocks of memory often yields undefined behavior. On the other hand, the implementation clearly has its own way to do this, as std::memmove inherently requires one. Thus, any implementation could provide such a function, thereby filling a need that the programmer simply can't. Since std::memmove was considered useful, why not copy_either_direction? Is there a solution I'm missing?
memmove works because it traffics in pointers to contiguous bytes, so the ranges of the two blocks to be copied are well defined. copy and move take iterators that don't necessarily point at contiguous ranges. For example, a list iterator can jump around in memory; there's no range that the code can look at, and no meaningful notion of overlap.
I recently learned that std::less is specialized for pointers in a way that provides a total order, presumably to allow one to store pointers in std::sets and related classes. Assuming that this must agree with the standard ordering whenever the latter is defined, I think the following will work:
#include <algorithm>
#include <functional>
template<class T>
T * copy_either_direction(const T * inputBegin, const T * inputEnd, T * outputBegin) {
if (std::less<const T *>()(inputBegin, outputBegin)) {
outputBegin += (inputEnd - inputBegin);
std::copy_backward(inputBegin, inputEnd, outputBegin);
return outputBegin;
} else {
return std::copy(inputBegin, inputEnd, outputBegin);
}
}
What can we use if we want to copy from one region to another and we don't know at compile-time if the ranges may overlap or in what direction that overlap may occur?
This is not a logically consistent concept.
After a copy operation, you will have two objects. And each object is defined by a separate and distinct region of memory. You cannot have objects which overlap in this way (you can have subobjects, but an object type cannot be its own subobject). And therefore, it is impossible to copy an object on top of part of itself.
To move an object on top of itself is also not logically consistent. Why? Because moving is a fiction in C++; after the move, you still have two perfectly functional objects. A move operation is merely a destructive copy, one that steals resources owned by the other object. It is still there, and it is still a viable object.
And since an object cannot overlap with another object, it is again impossible.
Trivially copyable types get around this because they are just blocks of bits, with no destructors or specialized copy operations. So their lifetime is not as rigid as that of others. A non-trivially copyable type cannot do this because:
The experience with memmove suggests that there could be a solution in this case (and perhaps also for iterators into contiguous containers).
This is neither possible nor generally desirable for types which are not trivially copyable in C++.
The rules for trivial copyability are that the type has no non-trivial copy/move constructors/assignment operators, as well as no non-trivial destructor. A trivial copy/move constructor/assignment is nothing more than a memcpy, and a trivial destructor does nothing. And therefore, these rules effectively ensure that the type is nothing more than a "block of bits". And one "block of bits" is no different from another, so copying it via memmove is a legal construct.
If a type has a real destructor, then the type is maintaining some sort of invariant that requires actual effort to maintain. It may free up a pointer or release a file handle or whatever. Given that, it makes no sense to copy the bits, because now you have two objects that reference the same pointer/file handle. That's a bad thing to do, because the class will generally want to control how that gets handled.
There is no way to solve this problem without the class itself being involved in the copy operation. Different classes have different behavior with respect to managing their internals. Indeed, that is the entire purpose of objects having copy constructors and assignment operators: so that a class can decide for itself how to maintain the sanity of its own state.
And it doesn't even have to be a pointer or file handle. Maybe each class instance has a unique identifier; such a value is generated at construction time, and it never gets copied (new copies get new values). For you to violate such a restriction with memmove leaves your program in an indeterminate state, because you will have code that expects such identifiers to be unique.
That's why using memmove on non-trivially copyable types yields undefined behavior.
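For the trivially copyable case, a direction-agnostic copy can be sketched directly on top of std::memmove. The function name move_overlapping is hypothetical, and the static_assert restricts it to the types for which the reasoning above holds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Copies [first, last) to d_first; the ranges may overlap in either direction.
// Only valid for trivially copyable T, which the static_assert enforces.
template <class T>
T* move_overlapping(const T* first, const T* last, T* d_first) {
    static_assert(std::is_trivially_copyable_v<T>,
                  "memmove is only defined for trivially copyable types");
    const std::size_t count = static_cast<std::size_t>(last - first);
    if (count != 0) {
        std::memmove(d_first, first, count * sizeof(T));
    }
    return d_first + count;
}

inline void usage_example() {
    int forward[6] = {1, 2, 3, 4, 5, 6};
    move_overlapping(forward, forward + 4, forward + 2);  // destination after source
    assert(forward[2] == 1 && forward[5] == 4);

    int backward[6] = {1, 2, 3, 4, 5, 6};
    move_overlapping(backward + 2, backward + 6, backward);  // destination before source
    assert(backward[0] == 3 && backward[3] == 6);
}
```

Instantiating it with a type such as std::string fails to compile, which is the intended behavior.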
Alex Stepanov defined Regular Types as types satisfying certain properties around copying and equality. Now that C++11 has added move semantics to the realm of generic programming, Stepanov's definition is no longer complete. I'm looking for a good reference on regular types that includes their interaction with move semantics.
Summary:
For C++11 I would include:
move-ctor (noexcept)
move-assign (noexcept)
total ordering (operator<() for natural total order and std::less<> if a natural total order does not exist).
hash<>
And would remove:
swap() (non-throwing) - replaced by move operations.
Commentary
Alex revisits the concept of a regular type in Elements of Programming. In fact, much of the book is devoted to regular types.
There is a set of procedures whose inclusion in the computational
basis of a type lets us place objects in data structures and use
algorithms to copy objects from one data structure to another. We call
types having such a basis regular, since their use guarantees
regularity of behavior and, therefore, interoperability. -- Section 1.5 of EoP
In EoP, Alex introduces the notion of an underlying_type which gives us a non-throwing swap algorithm that can be used to move. An underlying_type template isn't implementable in C++ in any particularly useful manner, but you can use non-throwing (noexcept) move-ctor and move-assign as reasonable approximations (an underlying type allows moving to/from a temporary without an additional destruction for the temporary). In C++03, providing a non-throwing swap() was the recommended way to approximate a move operation, if you provide move-ctor and move-assign then the default std::swap() will suffice (though you could still implement a more efficient one).
[ I'm on record as recommending that you use a single assignment operator, passing by value, to cover both move-assign and copy-assign. Unfortunately the current language rules for when a type gets a default move-ctor causes this to break with composite types. Until that is fixed in the language you will need to write two assignment operators. However, you can still use pass by value for other sink arguments to avoid combinatorics in handling move/copy for all arguments. ]
Alex also adds the requirement of total ordering (though there may not be a natural total order and the ordering may be purely representational). operator<() should be reserved for the natural total ordering. My suggestion is to specialize std::less<>() if a natural total ordering is not available (there is some precedent for that in the standard).
In EoP, Alex relaxes the requirements on equality to allow for representational-equality as being sufficient. A useful refinement.
A regular type should also be equationally complete (that is, operator==() should be implementable as a non-friend, non-member, function). A type that is equationally complete is also serializable (though without a canonical serialization format, implementing the stream operators is of little use except for debugging). A type that is equationally complete can also be hashed. In C++11 (or with TR1) you should provide a specialization of std::hash.
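As an illustration of these requirements, here is a small made-up type, employee, with non-member equality, a total order, and a std::hash specialization. The hash-combining formula is just one common choice, not something prescribed by the text.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <tuple>
#include <unordered_set>

// Hypothetical type; all state is public, so equality needs no friend access.
struct employee {
    std::string name;
    int id;
};

inline bool operator==(const employee& a, const employee& b) {
    return a.id == b.id && a.name == b.name;
}

// Total order: compare by id first, then by name.
inline bool operator<(const employee& a, const employee& b) {
    return std::tie(a.id, a.name) < std::tie(b.id, b.name);
}

namespace std {
template <>
struct hash<employee> {
    size_t operator()(const employee& e) const noexcept {
        const size_t h1 = hash<string>{}(e.name);
        const size_t h2 = hash<int>{}(e.id);
        // One common way to combine two hashes; any reasonable mix would do.
        return h1 ^ (h2 + 0x9e3779b9u + (h1 << 6) + (h1 >> 2));
    }
};
}  // namespace std

inline void usage_example() {
    std::unordered_set<employee> staff;
    staff.insert(employee{"ann", 1});
    staff.insert(employee{"ann", 1});  // equal value, so not inserted again
    staff.insert(employee{"bob", 2});
    assert(staff.size() == 2);
    assert((employee{"ann", 1} < employee{"bob", 2}));
}
```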
Another property of regular types is area() for which there is not yet any standard syntax - and likely little reason to actually implement except for testing. It is a useful concept for specifying complexity - and I frequently implement it (or an approximation) for testing complexity. For example, we define the complexity of copy as bounded by the time to copy the area of the object.
The concept of a regular type is not language-specific. One of the first things I do when presented with a new language is work out how regular types manifest in that language.
Constraints of generic programming are best stated in terms of expressions. A more modern rendition of the same constraint on copiability would be that both statements should be valid:
T b = a;
and
T b = ra;
where a is an lvalue with type T or const T and ra is an rvalue with type T or const T. (With similar post-conditions.)
This formulation is in the spirit of the paper, I believe. Do note that C++03 already makes use of notions like lvalues and rvalues, such that the constraint we've expressed requires that something like T source(); T b = source(); be valid -- certainly something that seems sensible.
Under those constraints, then not much changes with C++11. Of particular note is that such a (pathological) type is irregular:
struct irregular {
irregular() = default;
irregular(irregular const&) = default;
irregular& operator=(irregular const&) = default;
irregular(irregular&&) = delete;
irregular& operator=(irregular&&) = delete;
};
because something like irregular a; irregular b = a; is valid while irregular source(); irregular b = source(); isn't. It's a type that is somewhat copyable (resp. copy assignable), but not quite enough. [ This has been considered somewhat of a defect and is slated to be changed for C++1y, where such a type will in fact be copyable. ]
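The difference can be observed with the standard traits, at least on the compilers I am aware of. In this sketch, copy_only is a hypothetical contrast type that declares copy operations and simply leaves the move operations undeclared.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Same shape as the type above: copy defaulted, move explicitly deleted.
struct irregular {
    irregular() = default;
    irregular(irregular const&) = default;
    irregular& operator=(irregular const&) = default;
    irregular(irregular&&) = delete;
    irregular& operator=(irregular&&) = delete;
};

// Hypothetical contrast: no move operations declared at all.
struct copy_only {
    std::string text;
    copy_only() = default;
    copy_only(copy_only const&) = default;
    copy_only& operator=(copy_only const&) = default;
};

// Copying from an lvalue works, but an rvalue source selects the deleted move.
static_assert(std::is_copy_constructible_v<irregular>);
static_assert(!std::is_move_constructible_v<irregular>);

// With no move declared, an rvalue source binds to the copy constructor.
static_assert(std::is_copy_constructible_v<copy_only>);
static_assert(std::is_move_constructible_v<copy_only>);

inline void usage_example() {
    copy_only a;
    a.text = "pem";
    copy_only b = std::move(a);  // compiles and performs a copy
    assert(a.text == "pem");     // the source is left untouched
    assert(b.text == "pem");
}
```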
Going further, for the post-condition that a copy must be equivalent in some sense to the original (or, for rvalues, to the original before the copy) to hold, a move special member can only ever be an 'optimization' of the respective copy special member. Another way to put it is that copy semantics are a refinement of move semantics. This means that the assertion must hold in the following:
T a;
T b = a;
T c = std::move(a);
assert( b == c );
I.e. whether we arrived there via a copy 'request' (that is, an expression involving an lvalue source) or via a move request (an expression involving an rvalue source), we must have the same result regardless of what 'actually' happened (whether a copy special member or move special member was involved, if at all).
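The assertion can be wrapped in a small generic check. This is only a sketch: move_agrees_with_copy and record are hypothetical names, and the check assumes T is copyable and equality comparable.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Returns true if constructing from an rvalue gives the same value as copying.
template <class T>
bool move_agrees_with_copy(T a) {
    T b = a;             // copy request (lvalue source)
    T c = std::move(a);  // move request (rvalue source)
    return b == c;
}

// Hypothetical aggregate with member-wise equality.
struct record {
    std::string name;
    std::vector<int> values;

    friend bool operator==(const record& lhs, const record& rhs) {
        return lhs.name == rhs.name && lhs.values == rhs.values;
    }
};

inline void usage_example() {
    assert(move_agrees_with_copy(42));
    assert(move_agrees_with_copy(std::string("hello")));
    assert(move_agrees_with_copy(record{"x", {1, 2, 3}}));
}
```

Such a check says nothing about the state of the moved-from object; it only tests the post-condition on the destination.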
Of interest is the fact that the traits such as std::is_copy_constructible used to be called std::has_copy_constructor, but were renamed to put the emphasis on expressions rather than intrinsic properties: something like std::is_copy_constructible<int>::value && std::is_move_assignable<int>::value is true regardless of the fact that int has no constructors or assignment operators.
I advise you to really do generic programming by expressing constraints on the expression level because e.g. the presence or absence of a move constructor is neither sufficient nor necessary for a type to be copy constructible.
Add move assignment and a move constructor, along with all the other operators of built-in types, and I'd say you have it, according to Stepanov's paper.