Where to use std::variant over union? - c++

Please explain the difference between union and std::variant, and why std::variant was introduced into the standard. In what situations should we use std::variant over the old-school union?

Generally speaking, you should prefer variant unless one of the following comes up:
You're cheating. You're doing type-punning or other things that are UB but you're hoping your compiler won't break your code.
You're doing some of the pseudo-punnery that C++ unions are allowed to do: conversion between layout-compatible types or between common initial sequences.
You explicitly need layout compatibility. variant<Ts...> is not required to have any particular layout; unions of standard layout types are standard layout.
You need low-level support for in-place switching of objects. Using a memory buffer for such things doesn't provide the trivial copying guarantees that you could get out of a union.
The basic difference between the two is that variant knows which type it stores, while union expects you to keep track of that externally. So if you try to access the wrong item in a variant, you get an exception or nullptr. By contrast, doing so with a union is merely undefined behavior.
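To make the two failure modes concrete, here is a minimal sketch. The names Tagged, tagged_int_or, int_or and get_int_throws are made up for illustration; only std::get, std::get_if and std::bad_variant_access come from the standard library. The hand-tagged union shows the external bookkeeping that variant does for you.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hand-rolled tagged union: the 'kind' field is the external bookkeeping.
// Nothing stops code from reading the wrong member; that would be UB.
struct Tagged {
    enum class Kind { Int, Float } kind;
    union {
        int i;
        float f;
    };
};

// Hypothetical helper: only touches 'i' after checking the tag by hand.
int tagged_int_or(const Tagged& t, int fallback) {
    return t.kind == Tagged::Kind::Int ? t.i : fallback;
}

// Hypothetical helper: std::get_if returns nullptr when the variant
// holds a different alternative, so no tag field is needed.
int int_or(const std::variant<int, std::string>& v, int fallback) {
    if (const int* p = std::get_if<int>(&v)) {
        return *p;
    }
    return fallback;
}

// Hypothetical helper: reports whether std::get<int> throws for v.
bool get_int_throws(const std::variant<int, std::string>& v) {
    try {
        (void)std::get<int>(v);
        return false;
    } catch (const std::bad_variant_access&) {
        return true;
    }
}
```

With a variant holding a std::string, int_or(v, -1) yields the fallback and get_int_throws(v) is true; nothing undefined happens in either case.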
union is a lower-level tool, and thus should only be used when you absolutely need that lower-level control.
variant also has machinery for doing visitation, which means you get to avoid having a bunch of if statements where you ask "if it is type X, do this. If it is type Y, do that, etc".
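A small sketch of what visitation looks like, assuming a made-up Value alias and describe function. The Overloaded helper is the commonly used "overload set of lambdas" idiom, not something the standard library ships.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Common idiom (not part of the standard library): merge several lambdas
// into one callable with an overloaded operator().
template <class... Fs>
struct Overloaded : Fs... {
    using Fs::operator()...;
};
template <class... Fs>
Overloaded(Fs...) -> Overloaded<Fs...>;

// Hypothetical variant type for this example.
using Value = std::variant<int, double, std::string>;

// std::visit calls the overload matching the alternative currently held.
// Leaving out a case is a compile error rather than a silent fall-through.
std::string describe(const Value& v) {
    return std::visit(Overloaded{
        [](int i) { return "int: " + std::to_string(i); },
        [](double) { return std::string("double"); },
        [](const std::string& s) { return "string: " + s; },
    }, v);
}
```

Deleting one of the three lambdas makes the std::visit call fail to compile, which is the main practical gain over a chain of ifs.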

Related

How evil would it be to use type punning between trivially copyable structs?

I have a library with a Swift interface that hides a C++ layer. In the C++, I have struct A { ...}. I want the Swift to pass around by-value copies of this struct (for various complicated reasons). Swift understands C declarations but not C++, so I'd need to declare some dummy C struct for it with the same size, e.g. struct FakeA { char data[/* size of A */]; }. Then, I could use type punning to go back and forth. Since A is trivially copyable, I would think it's OK. However, at cppreference.com it states, "Unlike in C, however, objects with trivial default constructors cannot be created by simply reinterpreting suitably aligned storage, such as memory allocated with std::malloc: placement-new is required to formally introduce a new object and avoid potential undefined behavior."
How undefined are we talking? Could it realistically cause problems, say, when compiling with Clang for arm64 and x86_64?
C++ has an abstract notion of object lifetime. Even for PODs with no constructors, the standard defines in specific terms when an object's lifetime starts and ends, which is why you can't just reinterpret bytes in memory even if you know their layouts match. It is undefined behavior because nothing has started the lifetime of an object of that type there.
In practice, this is the kind of UB that people still rely on, because there's no equivalent non-UB option for reinterpreting storage in place.
std::start_lifetime_as<T> and new (p) std::byte[n] (formerly std::bless) would be the perfect remedy for this (http://wg21.link/p0593), but sadly they are not available for now.
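If a by-value copy is acceptable, which is what the question describes, copying the bytes with std::memcpy into a real A sidesteps the lifetime problem, since that is well-defined for trivially copyable types. A rough sketch follows; the fields of A are invented here because the question elides them, to_fake and from_fake are hypothetical names, and the alignas on the buffer is an assumption on my part (the C-side declaration would need a matching alignment).

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Stand-in for the C++ struct from the question; its real fields are not shown there.
struct A {
    int id;
    double weight;
};

// Stand-in for the opaque struct exposed to the C/Swift side.
// alignas(A) is an assumption so both types agree on alignment.
struct FakeA {
    alignas(A) unsigned char data[sizeof(A)];
};

static_assert(std::is_trivially_copyable_v<A>, "memcpy round-trips need a trivially copyable A");
static_assert(sizeof(FakeA) == sizeof(A) && alignof(FakeA) == alignof(A));

// Copies the object representation of a into an opaque buffer.
FakeA to_fake(const A& a) {
    FakeA out;
    std::memcpy(out.data, &a, sizeof(A));
    return out;
}

// Copies the bytes back into a real A object, whose lifetime began normally.
A from_fake(const FakeA& f) {
    A out;
    std::memcpy(&out, f.data, sizeof(A));
    return out;
}
```

Compilers typically optimize such fixed-size memcpy calls into plain register moves, so the copy is usually not a performance concern.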

C++ std::variant vs std::any

C++17 introduces std::variant and std::any, both able to store values of different types in a single object. To me, they seem somewhat similar (are they?).
Also, std::variant restricts the types it can hold. Apart from that, why should we prefer std::variant over std::any, which is simpler to use?
The more things you check at compile time the fewer runtime bugs you have.
variant guarantees that it contains one of a list of types (plus valueless by exception). It provides a way for you to guarantee that code operating on it considers every case in the variant with std::visit; even every case for a pair of variants (or more).
any does not. With any the best you can do is "if the type isn't exactly what I ask for, some code won't run".
variant stores its value inside itself, so it can live entirely in automatic storage. any may use the free store; this means any has performance and noexcept(false) issues that variant does not.
Checking for which of N types is in it is O(N) for an any -- for variant it is O(1).
any is a dressed-up void*. variant is a dressed-up union.
any cannot store non-copyable types (its contents must be copy-constructible). variant can hold move-only and even non-movable types.
The type of variant is documentation for the reader of your code.
Passing a variant<Msg1, Msg2, Msg3> through an API makes the operation obvious; passing an any there means understanding the API requires reliable documentation or reading the implementation source.
Anyone who has been frustrated by statically typeless languages will understand the dangers of any.
Now this doesn't mean any is bad; it just doesn't solve the same problems as variant. As a copyable object for type erasure purposes, it can be great. Runtime dynamic typing has its place; but that place is not "everywhere" but rather "where you cannot avoid it".
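A short sketch of the contrast drawn above, with made-up names (describe_any, Describer, describe_variant). With any, each type has to be probed separately and a forgotten one just falls through; with variant, the visitor must cover every alternative or the code does not compile.

```cpp
#include <any>
#include <cassert>
#include <string>
#include <variant>

// Probing an any: the pointer form of std::any_cast returns nullptr on a
// type mismatch. Types nobody thought to check end up as "unknown".
std::string describe_any(const std::any& a) {
    if (std::any_cast<int>(&a)) {
        return "int";
    }
    if (std::any_cast<std::string>(&a)) {
        return "string";
    }
    return "unknown";
}

// Hypothetical visitor: one overload per alternative of the variant below.
struct Describer {
    std::string operator()(int) const { return "int"; }
    std::string operator()(const std::string&) const { return "string"; }
};

// The set of possible types is part of the signature, and std::visit
// refuses to compile if Describer misses one of them.
std::string describe_variant(const std::variant<int, std::string>& v) {
    return std::visit(Describer{}, v);
}
```

Passing a double wrapped in std::any to describe_any compiles fine and yields "unknown"; adding double to the variant's type list without extending Describer is a compile error instead.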
The difference is that the objects are stored within the memory of the std::variant itself. From cppreference.com (std::variant):
"As with unions, if a variant holds a value of some object type T, the object representation of T is allocated directly within the object representation of the variant itself. Variant is not allowed to allocate additional (dynamic) memory."
std::any gives no such guarantee: since it can hold a value of any size, it may have to allocate dynamically.
Because of that, a std::variant needs only the storage for the std::variant object itself, and it can stay on the stack.
In addition to never using additional heap memory, variant has one other advantage:
You can std::visit a variant, but not any.

What's the purpose of layout-compatible types?

The standard defines when two types are layout-compatible. But, I don't see anywhere in the standard what the consequences are when two types are layout-compatible. It seems that layout-compatible is a definition which is not used anywhere.
What is the purpose of layout-compatible?
Note: Supposedly, it could mean that the types have the same layout (offsetof is the same for each corresponding member), so for example, for trivially copyable types, underlying bytes can be copied between them. But I don't see something like this in the standard.
The standard does define one specific case where layout compatibility matters: in unions. If two members are layout-compatible, and one of them is the active union member, then you may access that object through pointers/references to any layout-compatible member of that union. This is a consequence of the "common initial sequence" rule.
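As an illustration of the common initial sequence rule, here is a minimal sketch with invented types (IntEvent, FloatEvent, Event) and an invented helper kind_of. Both structs are standard-layout and start with the same int member, so that member may be read through either union member no matter which one is active.

```cpp
#include <cassert>
#include <type_traits>

// Two hypothetical standard-layout structs sharing a common initial
// sequence: the leading 'int kind' member.
struct IntEvent {
    int kind;
    int payload;
};
struct FloatEvent {
    int kind;
    float payload;
};

union Event {
    IntEvent as_int;
    FloatEvent as_float;
};

static_assert(std::is_standard_layout_v<IntEvent> && std::is_standard_layout_v<FloatEvent>);

// Reads 'kind' through as_int even when as_float is the active member.
// This is allowed only because 'kind' is in the common initial sequence;
// reading as_int.payload while as_float is active would not be.
int kind_of(const Event& e) {
    return e.as_int.kind;
}
```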
The Standard makes no attempt to mandate that all implementations be suitable for all purposes. Consequently, quality implementations intended to be suitable for purposes beyond those for which the Standard requires support will generally need to extend the semantics of the language. One of the simplest and most useful ways they can do this is by saying that in some circumstances where portions of the Standard define or imply the behavior of some action but another part says an overlapping category of actions invokes UB, they will process the behavior as defined or implied by the former parts. On many compilers, for example, there is an option (typically enabled with a -fno-strict-aliasing flag) to say that any program whose behavior would be defined in the absence of type-access rules will be processed in that fashion, even if those rules would say the program invokes UB.
While there are relatively few situations where the fact that two structures are layout-compatible would cause behavior to be defined by the Standard when it otherwise wouldn't, there are many situations where it would imply how an implementation must behave in the absence of those type-access rules (by making it essentially impossible for an implementation to do anything else). For example, if structure types T1 and T2 are layout compatible, that would suggest that if a pointer to a T1 is converted to a T2*, any operation upon a member of the structure using the latter pointer will access the corresponding member of the T1 object.
Because not all programs need such abilities, the Standard does not require that all implementations provide them. On the other hand, implementations that are suitable for low-level programming will provide means by which parts of the code that are designed to handle one type can be used to handle layout-compatible types interchangeably, whether the Standard requires them to or not (implementations that don't would simply be limited to uses other than low-level programming).
I think the Standard would be enormously improved by officially recognizing categories of implementations that are suitable for low-level programming and others that make no claim to be, rather than trying to define a single set of behavior for all implementations. Nonetheless, defining concepts like "layout compatibility" greatly improves the range of constructs that will be portable among implementations that are suitable for low-level programming.

What are the differences between std::variant and boost::variant?

In an answer to this SO question:
What is the equivalent of boost::variant in the C++ standard library?
it is mentioned that boost::variant and std::variant differ somewhat.
What are the differences, as far as someone using these classes is concerned?
What motivation did the committee express to adopt std::variant with these differences?
What should I watch out for when coding with either of these, to maintain maximum compatibility with switching to the other one?
(the motivation is using boost::variant in pre-C++17 code)
Assignment/emplacement behavior:
boost::variant may allocate memory when performing assignment into a live variant. There are a number of rules that govern when this can happen, so whether a boost::variant will allocate memory depends on the Ts it is instantiated with.
std::variant will never dynamically allocate memory. However, as a concession to the complex rules of C++ objects, if an assignment/emplacement throws, then the variant may enter the "valueless_by_exception" state. In this state, the variant cannot be visited, nor will any of the other functions for accessing a specific member work.
You can only enter this state if assignment/emplacement throws.
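Here is a rough sketch of how that state can come about. ThrowingCtor, V, try_switch and safe_to_visit are made-up names. The standard only says the variant "might not hold a value" after a throwing emplacement, and implementations differ on whether the old value survives, so the code checks rather than assumes.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <variant>

// Hypothetical type whose default constructor always throws.
struct ThrowingCtor {
    ThrowingCtor() { throw std::runtime_error("construction failed"); }
};

using V = std::variant<std::string, ThrowingCtor>;

// Tries to switch v over to ThrowingCtor; returns true if emplace threw.
// Afterwards v may be valueless, or may still hold its old string,
// depending on the standard library implementation.
bool try_switch(V& v) {
    try {
        v.emplace<ThrowingCtor>();
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}

// std::visit and std::get throw std::bad_variant_access on a valueless
// variant, so code that must not throw should check this first.
bool safe_to_visit(const V& v) {
    return !v.valueless_by_exception();
}
```

When the variant is valueless, index() returns std::variant_npos, and a later successful assignment or emplace brings it back to a normal state.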
Boost.Variant includes recursive_wrapper and make_recursive_variant, which allow a variant to contain itself. They're essentially special wrappers around a pointer to a boost::variant, but they are tied into the visitation machinery.
std::variant has no such helper type.
std::variant offers more use of post-C++11 features. For example:
It forwards the noexcept status of the special member functions of its constituent types.
It has variadic template-based in-place constructors and emplacement functions.
Defect resolutions applied to C++17 may mean that it will also forward trivial copyability of its types. That is, if all of the types are trivially copyable, then so too will be variant<Ts...>.
It seems the main point of contention regarding the design of a variant class has been what should happen when an assignment to the variant, which should destroy the old value upon completion, throws an exception:
variant<std::string, MyClassWithThrowingDefaultCtor> v = "ABC";
v = MyClassWithThrowingDefaultCtor();
The options seem to be:
Prevent this by restricting the possible representable types to nothrow-move-constructible ones.
Keep the old value - but this requires double-buffers.
Construct the new value on the heap, store a pointer to it in the variant (so the variant itself is not garbled even on exception). This is, apparently, what boost::variant does.
Have a 'disengaged' state with no value for each variant, and go to that state on such failures.
Undefined behavior
Make the variant throw when trying to read its value after something like that happens
and if I'm not mistaken, the last of these is what's been accepted.
This is summarized from the ISO C++ blog post by Axel Naumann from Nov 2015.
std::variant differs slightly from boost::variant:
std::variant is declared in the header <variant> rather than in <boost/variant.hpp>
std::variant never ever allocates memory
std::variant is usable with constexpr
Instead of writing boost::get<T>(&variable), you have to write std::get_if<T>(&variable) for std::variant
std::variant can not recursively hold itself and misses some other advanced techniques
std::variant can in-place construct objects
std::variant has index() instead of which()
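A few of the points in this list, sketched in code. The names kHalf, make_dashes and length_or_zero are invented for the example; everything else is standard C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <variant>

// constexpr use: the variant is built and inspected at compile time.
constexpr std::variant<int, double> kHalf{std::in_place_type<double>, 0.5};
static_assert(kHalf.index() == 1);  // index() where boost::variant has which()
static_assert(std::get<double>(kHalf) == 0.5);

// In-place construction: the string is built directly inside the variant
// from the (count, char) arguments, with no temporary std::string.
std::variant<int, std::string> make_dashes(std::size_t n) {
    return std::variant<int, std::string>{std::in_place_type<std::string>, n, '-'};
}

// std::get_if<T>(&v) plays the role of boost::get<T>(&v): it returns a
// pointer to the value, or nullptr if another alternative is held.
std::size_t length_or_zero(const std::variant<int, std::string>& v) {
    const std::string* s = std::get_if<std::string>(&v);
    return s ? s->size() : 0;
}
```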

Can I safely memmove around a boost variant?

I have a class wrapping a boost variant that only contains memmovable types (QList, QString, int etc).
May I declare that wrapper class memmovable to Qt containers?
A boost::variant contains only an integral index and an aligned_storage, which is guaranteed by the standard to be a POD. It has no virtual members, but has user-defined constructors and a destructor. As a consequence, boost::variant is not a POD and trying to memmove it is UB (well, I think it is UB; I can't find a definitive reference in the standard).
However, the same can be said for QList, QString, etc. Apparently, Qt assumes that some non-POD types can be safely memmoved, and makes a distinction between POD (so-called "primitive types") and "movable types".
Consequently, if you think it is safe to memmove a QList, you can consider it safe to memmove a boost::variant containing memmovable types.
You probably know that memmoving non-POD types is technically undefined behaviour. That aside, variant doesn't contain anything that would be problematic if memmoved. Since you mention QList and QString as being memmovable, and I have difficulty believing that they are PODs (although I haven't seen them), boost::variant is no worse.