My understanding is that std::copy copies the elements one at a time. This seems to be necessary in order to trigger the constructor on each element. But when no such constructor exists (e.g. PODs), I would think a memcpy would be much more efficient.
So, does the STL require/allow for specializations of, for instance, vector<int> copying that would just do a memcpy?
I would appreciate answers to the following questions for both GCC and MSVC, since those are the compilers I use.
If it is allowed but not required, do the above compilers actually do it?
If they do, for which containers would this trigger? Obviously it makes no sense for list, but what about string or deque?
Again, if they do, which contained types would trigger this? Only built-in types, or also my own POD types (e.g. struct Point {int x, y;} )?
If they don't, would it be faster to use my own wrapper around new / delete / pointers that uses memcpy for things like integer/char/my own struct arrays?
First off, std::copy doesn't copy-construct anything. (That would be the job of the algorithm std::uninitialized_copy.) Instead, it assigns to each element of the destination range the corresponding value from the source range.
Secondly, yes indeed, an implementation may optimize the assignments into a single bulk memcpy as long as the result is the same "as if" it had performed element-wise assignment. GCC does this, for example, with compiler support for recognizing such trivially copyable types, and C++11 adds a type trait, std::is_trivially_copyable, which is true precisely for the types that can be copied this way.
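As a rough illustration of the idea (a simplified sketch, not the actual libstdc++ or MSVC source), a copy routine can dispatch on that trait; I use C++17's if constexpr here for brevity:

#include <cstring>
#include <type_traits>

// Sketch of how a library can dispatch element-wise copies to a bulk
// byte copy for trivially copyable types. Real implementations differ.
template <typename T>
T* copy_range(const T* first, const T* last, T* out) {
    if constexpr (std::is_trivially_copyable_v<T>) {
        // memmove rather than memcpy so that overlapping ranges
        // (which std::copy permits in one direction) stay well-defined.
        std::memmove(out, first, (last - first) * sizeof(T));
        return out + (last - first);
    } else {
        // Non-trivial types get ordinary element-wise assignment.
        for (; first != last; ++first, ++out)
            *out = *first;
        return out;
    }
}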
Related
Assume I wrote a type that is something like a static_string (just a pair of a size_t and N chars, where N is a template int parameter). So my type can be safely memcopied and there is no need to run a destructor. But it has a user-provided copy constructor, so it is not detected as trivially copyable by the C++ language.
I would like to tell the users of my type they can memcopy my type and that there is no need to run a destructor.
I always assumed that I could just specialize the type traits, but I recently learned it is UB to do so.
If there is no way to do this with type traits:
is there a named concept in C++20 that my type satisfies, so that at least in a comment I can use that instead of words?
P.S. I know it is a bad idea to write types like this, but some use cases exist: optimization, and shared memory (where you do not want strings to heap-allocate).
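For concreteness, here is a minimal sketch of the kind of type being described (the name and details are mine, not taken from any actual library):

#include <cstddef>
#include <cstring>
#include <type_traits>

template <std::size_t N>
class static_string {
public:
    static_string() = default;

    // User-provided copy constructor: it copies only the bytes in use.
    // A full memcpy of the object would be just as correct, but this
    // constructor is what disqualifies the type from being trivially
    // copyable as far as the language is concerned.
    static_string(const static_string& other) : size_(other.size_) {
        std::memcpy(data_, other.data_, other.size_);
    }

private:
    std::size_t size_ = 0;
    char data_[N] = {};
};

static_assert(!std::is_trivially_copyable<static_string<16>>::value,
              "user-provided copy constructor makes it non-trivial");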
So my type can be safely memcopied and there is no need to run a destructor. But it has a user-provided copy constructor, so it is not detected as trivially copyable by the C++ language.
This is a contradiction. As far as the C++ language is concerned, if your type is not TriviallyCopyable, then it is not "safely memcopied". If memcopying the object must produce the same result as copying it, then copying it must be equivalent to a memcpy; the copy constructor cannot do something observably different from a byte-wise copy.
There is no way to resolve this contradiction without sacrificing one of the two. Either all ways of copying (memcpy and the copy constructor/assignment) are equivalent, or memcpy must not be allowed.
The question is this: how important is trivial copyability compared to the optimization of only copying the characters that actually have a value in a copy constructor/assignment operator? You cannot have both. Personally, I would say that static strings should never be large enough that the cost of just copying the whole thing should matter all that much. And in the rare cases where that's actually important, provide a specialized function to do the copying.
The answer is no. There is no single named concept in C++20 that your type satisfies. Of course, you could define a user-defined concept that your type satisfies, but that is not what you want. Since the question is about documentation, I advise you to use words (not code) in a comment.
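If you do decide to encode the intent in code anyway, a user-defined trait plus concept is about the best you can do. Everything below is an invented, project-level convention with no meaning to the language (similar in spirit to the various "trivially relocatable" proposals, which are not part of C++20):

#include <type_traits>

// Hypothetical opt-in trait: a type author specializes it to promise
// that objects may be copied with memcpy and abandoned without running
// the destructor. The language attaches no meaning to this promise.
template <typename T>
struct is_bitwise_duplicable : std::is_trivially_copyable<T> {};

// The static_string author would opt in explicitly:
//   template <std::size_t N>
//   struct is_bitwise_duplicable<static_string<N>> : std::true_type {};

template <typename T>
concept bitwise_duplicable = is_bitwise_duplicable<T>::value;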
Since std::complex is a non-trivial type, compiling the following with GCC 8.1.1
complex<double>* z = new complex<double>[6];
memset(z, 0, 6 * sizeof *z);
delete [] z;
produces a warning
clearing an object of non-trivial type
My question is, is there actually any potential harm in doing so?
The behavior of std::memset is only defined if the pointer it is modifying is a pointer to a TriviallyCopyable type. std::complex is guaranteed to be a LiteralType, but, as far as I can tell, it isn't guaranteed to be TriviallyCopyable, meaning that std::memset(z, 0, ...) is not portable.
That said, std::complex has an array-compatibility guarantee, which states that the storage of a std::complex<T> is exactly two consecutive Ts and can be reinterpreted as such. This seems to suggest that std::memset is actually fine, since it would be accessing through this array-oriented access. It may also imply that std::complex<double> is TriviallyCopyable, but I am unable to determine that.
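If all you need is to zero the storage, that array-compatibility guarantee can be used directly instead of arguing about memset. A small sketch of the guaranteed access pattern:

#include <complex>
#include <cstddef>

// [complex.numbers] guarantees that for an array a of std::complex<T>,
// reinterpret_cast<T*>(a)[2*i] and [2*i + 1] designate the real and
// imaginary parts of a[i].
void zero_complex_array(std::complex<double>* z, std::size_t n) {
    double* d = reinterpret_cast<double*>(z);
    for (std::size_t i = 0; i < 2 * n; ++i)
        d[i] = 0.0;
}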
If you wish to do this, I would suggest being on the safe side and static_asserting that std::complex<double> is TriviallyCopyable:
static_assert(std::is_trivially_copyable<std::complex<double>>::value);
If that assertion holds, then you are guaranteed that the memset is safe.
In either case, it would be safe to use std::fill:
std::fill(z, z + 6, std::complex<double>{});
It optimizes down to a call to memset, albeit with a few more instructions before it. I would recommend using std::fill unless your benchmarking and profiling showed that those few extra instructions are causing problems.
Never, never, ever memset non-POD types. They have constructors for a reason. Just writing a bunch of bytes on top of them is highly unlikely to give the desired result. And if it does, either the types themselves are badly designed (they should clearly just have been POD in the first place), or you are simply lucky that undefined behaviour appears to work in this case; have fun debugging it when it stops working after you change optimization level, compiler, or platform (or moon phase).
Just don't do this.
The answer to this question is that for a standard-compliant std::complex there is no need for memset after new.
new complex<double>[6] will initialize each complex to (0, 0), because it calls the default (non-trivial) constructor, which initializes both parts to zero.
(I think this is a mistake unfortunately.)
https://en.cppreference.com/w/cpp/numeric/complex/complex
If the code posted was just an example, with missing code between the new and the memset, then std::fill will do the right thing.
(In part because the specific standard library implementation knows internally how std::complex is implemented.)
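A minimal illustration of both points: the elements already come out as (0, 0) from new, and std::fill is the clean way to re-zero them later:

#include <algorithm>
#include <cassert>
#include <complex>

int main() {
    std::complex<double>* z = new std::complex<double>[6];  // each element is (0, 0)
    assert(z[0] == std::complex<double>(0.0, 0.0));

    // ... later, to reset the array without memset:
    std::fill(z, z + 6, std::complex<double>{});

    delete[] z;
}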
An oft-made claim is that std::valarray was intended to eliminate some forms of aliasing in order to enable better optimization (e.g. see valarray vs. vector: Why was valarray introduced?)
Can anyone elaborate on this claim? It seems to me that aliasing is always possible as long as you can obtain a pointer to an element (which you can, because operator[] returns a reference).
The "no aliasing" thing refers to the global functions like cos that accept valarray as a parameter. cos (or whatever function) gets applied to the entire array, and the compiler and standard library implementation can assume that the array does not alias and can perform the operation on each element independently.
It also refers to things like valarray's operator+, which does element-wise addition, and so on.
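A toy example of the whole-array operations in question; both the std::cos overload and operator+ are specified element-wise, and the implementation is allowed to assume the operands do not alias one another:

#include <valarray>

int main() {
    std::valarray<double> a = {0.0, 0.5, 1.0};
    std::valarray<double> b = {1.0, 2.0, 3.0};

    // Element-wise cos and addition over whole arrays; no element of the
    // result is assumed to alias an element of a or b.
    std::valarray<double> c = std::cos(a) + b;

    (void)c;  // silence unused-variable warnings
}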
C++11 vectors have the new function emplace_back. Unlike push_back, which relies on compiler optimizations to avoid copies, emplace_back uses perfect forwarding to send the arguments directly to the constructor to create an object in-place. It seems to me that emplace_back does everything push_back can do, but some of the time it will do it better (but never worse).
What reason do I have to use push_back?
I have thought about this question quite a bit over the past four years. I have come to the conclusion that most explanations about push_back vs. emplace_back miss the full picture.
Last year, I gave a presentation at C++Now on Type Deduction in C++14. I start talking about push_back vs. emplace_back at 13:49, but there is useful information that provides some supporting evidence prior to that.
The real primary difference has to do with implicit vs. explicit constructors. Consider the case where we have a single argument that we want to pass to push_back or emplace_back.
std::vector<T> v;
v.push_back(x);
v.emplace_back(x);
After your optimizing compiler gets its hands on this, there is no difference between these two statements in terms of generated code. The traditional wisdom is that push_back will construct a temporary object, which will then get moved into v whereas emplace_back will forward the argument along and construct it directly in place with no copies or moves. This may be true based on the code as written in standard libraries, but it makes the mistaken assumption that the optimizing compiler's job is to generate the code you wrote. The optimizing compiler's job is actually to generate the code you would have written if you were an expert on platform-specific optimizations and did not care about maintainability, just performance.
The actual difference between these two statements is that the more powerful emplace_back will call any type of constructor out there, whereas the more cautious push_back will call only constructors that are implicit. Implicit constructors are supposed to be safe. If you can implicitly construct a U from a T, you are saying that U can hold all of the information in T with no loss. It is safe in pretty much any situation to pass a T and no one will mind if you make it a U instead. A good example of an implicit constructor is the conversion from std::uint32_t to std::uint64_t. A bad example of an implicit conversion is double to std::uint8_t.
We want to be cautious in our programming. We do not want to use powerful features because the more powerful the feature, the easier it is to accidentally do something incorrect or unexpected. If you intend to call explicit constructors, then you need the power of emplace_back. If you want to call only implicit constructors, stick with the safety of push_back.
An example
std::vector<std::unique_ptr<T>> v;
T a;
v.emplace_back(std::addressof(a)); // compiles
v.push_back(std::addressof(a)); // fails to compile
std::unique_ptr<T> has an explicit constructor from T *. Because emplace_back can call explicit constructors, passing a non-owning pointer compiles just fine. However, when v goes out of scope, the destructor will attempt to call delete on that pointer, which was not allocated by new because it is just a stack object. This leads to undefined behavior.
This is not just invented code. This was a real production bug I encountered. The code was std::vector<T *>, but it owned the contents. As part of the migration to C++11, I correctly changed T * to std::unique_ptr<T> to indicate that the vector owned its memory. However, I was basing these changes off my understanding in 2012, during which I thought "emplace_back does everything push_back can do and more, so why would I ever use push_back?", so I also changed the push_back to emplace_back.
Had I instead left the code as using the safer push_back, I would have instantly caught this long-standing bug and it would have been viewed as a success of upgrading to C++11. Instead, I masked the bug and didn't find it until months later.
push_back always allows the use of uniform initialization, which I'm very fond of. For instance:
struct aggregate {
    int foo;
    int bar;
};
std::vector<aggregate> v;
v.push_back({ 42, 121 });
On the other hand, v.emplace_back({ 42, 121 }); will not work.
Backwards compatibility with pre-C++11 compilers.
Some library implementations of emplace_back do not behave as specified in the C++ standard including the version that ship with Visual Studio 2012, 2013 and 2015.
In order to accommodate known compiler bugs, prefer using std::vector::push_back() if the parameters reference iterators or other objects which will be invalid after the call.
std::vector<int> v;
v.emplace_back(123);
v.emplace_back(v[0]); // Produces incorrect results in some compilers
On one compiler, v contains the values 123 and 21 instead of the expected 123 and 123. This is because the second call to emplace_back results in a resize, at which point the reference v[0] is invalidated before the new element is constructed from it.
A working implementation of the above code would use push_back() instead of emplace_back() as follows:
std::vector<int> v;
v.emplace_back(123);
v.push_back(v[0]);
Note: The use of a vector of ints is for demonstration purposes. I discovered this issue with a much more complex class which included dynamically allocated member variables and the call to emplace_back() resulted in a hard crash.
Use push_back only for primitive/built-in types or raw pointers. Otherwise use emplace_back.
Consider what happens in Visual Studio 2019 with the C++17 compiler. We have an emplace_back in a function with proper arguments set up. Then someone changes the parameters of the constructor called by emplace_back. There is no warning whatsoever in VS, the code still compiles fine, and then it crashes at runtime. I removed all emplace_back calls from the codebase after this.
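To illustrate the kind of silent breakage being described (a hypothetical Widget, not the poster's code): emplace_back forwards whatever arguments it is given, so if the constructor's parameters are later reordered but the arguments still convert, the call keeps compiling with a different meaning.

#include <vector>

struct Widget {
    // Original signature: Widget(int id, double scale);
    // Later changed to:
    Widget(double scale, int id) : scale_(scale), id_(id) {}
    double scale_;
    int id_;
};

int main() {
    std::vector<Widget> v;
    // Written against the old signature. It still compiles after the
    // change, because int and double convert both ways, but id and
    // scale are now silently swapped.
    v.emplace_back(7, 2.5);
}

By contrast, v.push_back({7, 2.5}); would at least fail to compile after the change, because the braced list rejects the narrowing conversion from 2.5 to int.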
std::sort swaps elements by using std::swap, which in turn uses the type's copy (or, since C++11, move) constructor and assignment operator, guaranteeing that you get correct semantics when exchanging the values.
qsort swaps elements by simply swapping the elements' underlying bits, ignoring any semantics associated with the types you are swapping.
Even though qsort is ignorant of the semantics of the types you are sorting, it still works remarkably well with non-trivial types. If I'm not mistaken, it will work with all standard containers, despite them not being POD types.
I suppose that the prerequisite for qsort working correctly on a type T is that T is trivially movable. Off the top of my head, the only types that are not trivially movable are those that have inner pointers. For example:
struct NotTriviallyMovable
{
    NotTriviallyMovable() : m_someElement(&m_array[5]) {}
    int m_array[10];
    int* m_someElement;   // inner pointer into this object's own m_array
};
If you sorted an array of NotTriviallyMovable, the m_someElement pointers would end up pointing into the wrong objects.
My question is: what other kinds of types do not work with qsort?
Any type that is not a POD type is not usable with qsort(). There might be more types that are usable with qsort() if you consider C++0x, as it changes the definition of POD. If you are going to use non-POD types with qsort(), then you are in the land of UB and daemons will fly out of your nose.
This doesn't work either for types that have pointers to "related" objects. Such pointers have many of the issues associated with "inner" pointers, but it's a lot harder to prove precisely what a "related" object is.
A specific kind of "related" objects are objects with backpointers. If object A and B are bit-swapped, and A and C pointed to each other, then afterwards B will point to C but C will point to A.
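A sketch of the backpointer case (invented types, just to make the failure concrete):

// Two objects that point at each other. A purely bitwise swap of a and b
// updates a's and b's own bytes but never tells their partners.
struct Linked {
    Linked* partner = nullptr;
};

void link(Linked& x, Linked& y) {
    x.partner = &y;
    y.partner = &x;
}

// Suppose link(a, c) has been called and a sort then bit-swaps a and b:
//   b.partner == &c   (b received a's bytes)
//   c.partner == &a   (c was never updated)
// A proper swap or move assignment could fix up c.partner; a raw byte
// copy cannot.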
You are completely mistaken. Any non-POD type working with qsort is complete and utter luck. Just because it happens to work for you on your platform with your compiler on a blue moon if you sacrifice the blood of a virgin to the Gods and do a little dance first doesn't mean that it actually works.
Oh, and here's another one for not trivially movable: types whose instances are externally observed. You move an instance, but you don't notify the observer, because you never called the swap or copy-construction functions.
"If I'm not mistaken, it will work with all standard containers"
The whole question boils down to, in what implementation? Do you want to code to the standard, or do you want to code to implementation details of the compiler you have in front of you today? If the latter, then if all your tests pass I guess it works.
If you're asking about the C++ programming language, then qsort is required to work only for POD types. If you're asking about a specific implementation, which one? If you're asking about all implementations, then you've sort of missed your chance, since the best place for that kind of straw poll was C++0x working group meetings, since they gathered together representatives of pretty much every organization with an actively-maintained C++ implementation.
For what it's worth, I can pretty easily imagine an implementation of std::list in which a list node is embedded in the list object itself, and used as a head/tail sentinel. I don't know what implementations (if any) actually do that, since it's also common to use a null pointer as a head/tail sentinel, but certainly there are some advantages to implementing a doubly-linked list with a dummy node at each end. An instance of such a std::list would of course not be trivially movable, since the nodes for its first and last elements would no longer point to the sentinel. Its swap implementation and (in C++0x) its move constructor would account for this by updating those first and last nodes.
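A sketch of the layout being described (not any particular vendor's std::list):

// A doubly-linked list whose head/tail sentinel is embedded in the list
// object itself. If the list object's bytes are moved to a new address,
// the first and last element nodes still point at the old sentinel
// address, so the list is not trivially movable.
struct node {
    node* prev;
    node* next;
};

struct list {
    node sentinel;   // first->prev and last->next point back at this
    // ... size, allocator, etc.
};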
There is nothing to stop your compiler switching to this implementation of std::list in its next release, although that would break binary compatibility so given how most compilers are managed it would have to be a major release.
Similarly, the map/set/multimap/multiset quartet could have nodes that point to their parents. Debugging iterators for any container might conceivably contain a pointer to the container. To do what you want, you'd have to (at least) rule out the existence of any pointer into the container in any part of its implementation, and a sweeping statement like "no implementation uses any of these tricks" is pretty unwise. The whole point of having a standard is to make statements about all conforming implementations, so if you haven't deduced your conclusion from the standard, then even if your statement is true today it could become untrue tomorrow.