What are the differences between std::variant and boost::variant? - c++

In an answer to this SO question:
What is the equivalent of boost::variant in the C++ standard library?
it is mentioned that boost::variant and std::variant differ somewhat.
What are the differences, as far as someone using these classes is concerned?
What motivation did the committee express to adopt std::variant with these differences?
What should I watch out for when coding with either of these, to maintain maximum compatibility with switching to the other one?
(the motivation is using boost::variant in pre-C++17 code)

Assignment/emplacement behavior:
boost::variant may allocate memory when performing assignment into a live variant. There are a number of rules that govern when this can happen, so whether a boost::variant will allocate memory depends on the Ts it is instantiated with.
std::variant will never dynamically allocate memory. However, as a concession to the complex rules of C++ objects, if an assignment/emplacement throws, then the variant may enter the "valueless_by_exception" state. In this state, the variant cannot be visited, nor will any of the other functions for accessing a specific member work.
You can only enter this state if assignment/emplacement throws.
Boost.Variant includes recursive_wrapper (and make_recursive_variant), which allow a variant to contain itself. recursive_wrapper<T> is essentially a special wrapper around a heap-allocated T, but it is tied into the visitation machinery so that visitation sees the wrapped value transparently.
std::variant has no such helper type.
std::variant offers more use of post-C++11 features. For example:
It forwards the noexcept status of the special member functions of its constituent types.
It has variadic template-based in-place constructors and emplacement functions (see the sketch after this list).
Defect resolutions applied to C++17 mean that it also forwards trivial copyability of its types. That is, if all of the alternative types are trivially copyable, then so is variant<Ts...>.
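A minimal sketch of these features (C++17; the alternative types here are chosen only for illustration):

#include <string>
#include <type_traits>
#include <variant>
#include <vector>

int main() {
    // In-place construction: build a std::string of three 'x's directly inside the variant.
    std::variant<std::string, std::vector<int>> v(std::in_place_type<std::string>, 3, 'x');

    // Emplacement: replace the contained value, constructing the vector in place.
    v.emplace<std::vector<int>>(5, 42);

    // noexcept is forwarded: both alternatives are nothrow move constructible,
    // so the variant's move constructor is noexcept as well.
    static_assert(std::is_nothrow_move_constructible_v<decltype(v)>);

    // With the defect resolutions applied (as in current standard libraries),
    // trivial copyability is forwarded too.
    static_assert(std::is_trivially_copyable_v<std::variant<int, double>>);
}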

It seems the main point of contention regarding the design of a variant class has been what should happen when an assignment to the variant, which upon completion must destroy the old value, throws an exception:
variant<std::string, MyClassWithThrowingDefaultCtor> v = "ABC";
v = MyClassWithThrowingDefaultCtor();
The options seem to be:
Prevent this by restricting the possible representable types to nothrow-move-constructible ones.
Keep the old value - but this requires double-buffers.
Construct the new value on the heap, store a pointer to it in the variant (so the variant itself is not garbled even on exception). This is, apparently, what boost::variant does.
Have a 'disengaged' state with no value for each variant, and go to that state on such failures.
Undefined behavior
Make the variant throw when trying to read its value after something like that happens
and if I'm not mistaken, what was accepted is essentially a combination of the last two: the variant may enter a valueless state after such a failure, and reading its value in that state throws (std::bad_variant_access).
This is summarized from the ISO C++ blog post by Axel Naumann from Nov 2015.
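A minimal sketch of that accepted behavior (ThrowOnConstruction is a hypothetical type used only for illustration; the standard only guarantees that the variant may become valueless after a throwing emplace, but the major implementations do become valueless here):

#include <stdexcept>
#include <string>
#include <variant>

struct ThrowOnConstruction {
    ThrowOnConstruction() { throw std::runtime_error("construction failed"); }
};

int main() {
    std::variant<std::string, ThrowOnConstruction> v = "ABC";
    try {
        v.emplace<ThrowOnConstruction>();     // destroys the old string, then throws
    } catch (const std::runtime_error&) {
        if (v.valueless_by_exception()) {
            // v.index() is std::variant_npos here; std::get and std::visit
            // would throw std::bad_variant_access in this state.
        }
    }
}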

std::variant differs slightly from the boost::variant
std::variant is declared in the <variant> header rather than in <boost/variant.hpp>
std::variant never ever allocates memory
std::variant is usable with constexpr
Instead of writing boost::get<T>(&variable), you have to write std::get_if<T>(&variable) for std::variant
std::variant cannot recursively hold itself and misses some other advanced techniques
std::variant can in-place construct objects
std::variant has index() instead of which() (a short example of these last points follows below)
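A short side-by-side sketch of those last points (the Boost lines assume <boost/variant.hpp> is available):

#include <cstddef>
#include <string>
#include <variant>
#include <boost/variant.hpp>

int main() {
    std::variant<int, std::string>   sv = std::string("hello");
    boost::variant<int, std::string> bv = std::string("hello");

    // Index of the active alternative: index() vs which().
    std::size_t si = sv.index();                             // 1
    int         bi = bv.which();                             // 1

    // Pointer-based access: std::get_if<T>(&v) vs boost::get<T>(&v).
    const std::string* sp = std::get_if<std::string>(&sv);   // nullptr if the type is wrong
    const std::string* bp = boost::get<std::string>(&bv);    // nullptr if the type is wrong

    return (si == static_cast<std::size_t>(bi) && sp && bp) ? 0 : 1;
}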

Related

Why are there two variant class implementations in Boost?

Boost seems to have two implementations of a variant class template:
boost::variant
boost::variant2
It is rare (though not unheard of) for Boost to include two takes on the same concept. Why has this happened for variants? How do these variants differ?
The first kind of variant class template, boost::variant, predates C++17's std::variant. See this question for its comparison to std::variant. Mainly, the difference regards what to do when an exception is thrown on construction of a value within the variant.
The std::variant choice is to allow a valueless state; the boost::variant choice is to construct the new object on the heap, not in-place, and store a pointer to that location.
boost::variant2 is a later addition, which, on the one hand, wishes to adhere to the C++17 API, but on the other hand is an expression of dissatisfaction with its choice on this matter.
boost::variant2 chooses a third option, different from both previous implementations: double buffering. It takes up twice the storage required for the problematic types; it constructs the new value in the unused half, and only once that construction succeeds does it destroy the old value in the other half. If all the alternative types are nothrow move-constructible, this is not necessary and boost::variant2 will not actually carry the double buffer.
This choice means that boost::variant2 can never be valueless; and indeed, its documentation title emphasizes this fact.
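A minimal sketch of that difference (assumes Boost.Variant2 is available via <boost/variant2/variant.hpp>; ThrowOnMove is a hypothetical type used only for illustration):

#include <boost/variant2/variant.hpp>
#include <stdexcept>
#include <string>

struct ThrowOnMove {
    ThrowOnMove() = default;
    ThrowOnMove(ThrowOnMove&&) { throw std::runtime_error("move failed"); }
    ThrowOnMove& operator=(ThrowOnMove&&) = default;
};

int main() {
    boost::variant2::variant<std::string, ThrowOnMove> v{std::string("old value")};
    try {
        v = ThrowOnMove{};   // constructing the new value throws
    } catch (const std::runtime_error&) {
        // Unlike std::variant, v is guaranteed not to be valueless here: the
        // double buffer lets variant2 destroy the old value only after the
        // replacement has been constructed successfully.
    }
    return v.index() == 0 ? 0 : 1;   // the std::string alternative is still active
}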

Why is it a good idea for (C++) types to be regular?

(This question stems from this more specific question about stream iterators.)
A type is said [Stepanov, McJones] to be Regular if:
it is equality-comparable
it is assignable (from other values of the type)
it is destructible
it is default-constructible (i.e. constructible with no arguments)
it has a (default) total ordering of its values
(and there's some wording about "underlying type" which I didn't quite get.)
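For concreteness, here is a minimal sketch of such a type, using C++20's std::regular and std::totally_ordered concepts as an approximate check (std::regular itself does not include the total-ordering requirement):

#include <compare>
#include <concepts>

// A value type modelled on "do as the ints do".
struct Point {
    int x = 0, y = 0;                                               // default-constructible
    friend auto operator<=>(const Point&, const Point&) = default;  // equality + total order
};
// Copying, assignment and destruction are the compiler-generated ones.
static_assert(std::regular<Point>);            // equality-comparable, copyable, default-constructible
static_assert(std::totally_ordered<Point>);    // the (default) total ordering of its values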
Some/many people claim that, when designing types - e.g. for the standard library of C++ - it is worthwhile or even important to make an effort to make these types regular, possibly ignoring the total-order requirement. Some mention the maxim:
Do as the ints do.
and indeed, the int type satisfies all these requirements. However, default-constructed objects of such types hold some kind of null, invalid or junk value, just as ints do. A different approach is requiring initialization on construction (and de-initialization on destruction), so that the object's lifetime corresponds to its time of viability.
To contrast these two approaches, one can perhaps think about a T-pointer type, and a T-reference or T-reference-wrapper type. The pointer is basically regular (the ordering assumes a linear address space), and has nullptr which one needs to look out for; the reference is, under the hood, a pointer - but you can't just construct a reference-to-nothing or a junk-reference. Now, pointers may have their place, but we mostly like to work with references (or possibly reference wrappers which are assignable).
So, why should we prefer designing (as library authors) and using regular types? Or at least, why should we prefer them in so many contexts?
I doubt this is the most significant answer, but you could make the argument that "no uninitialized state" / "no junk value" is a principle that doesn't sit well with move-assignment: After you move from your value, and take its resources away - what state does it remain in? It is not very far from default-constructing it (unless the move-assignment is based on some sort of swapping; but then - you could look at move construction).
The rebuttal would be: "Fine, so let's not have such types move-assignable, only move-destructible"; unfortunately, C++ decided sometime in the 2000s to go with non-destructive moves. See also this question & answer by @HowardHinnant:
Why does C++ move semantics leave the source constructed?

C++ std::variant vs std::any

C++17 introduces std::variant and std::any, both able to store values of different types in a single object. To me they seem somewhat similar (are they?).
Also, std::variant restricts which types it can hold, while std::any does not. Beside this, why should we prefer std::variant over std::any, which is simpler to use?
The more things you check at compile time the fewer runtime bugs you have.
variant guarantees that it contains one of a list of types (plus valueless by exception). It provides a way for you to guarantee that code operating on it considers every case in the variant with std::visit; even every case for a pair of variants (or more).
any does not. With any the best you can do is "if the type isn't exactly what I ask for, some code won't run".
variant exists in automatic storage. any may use the free store; this means any has performance and noexcept(false) issues that variant does not.
Checking for which of N types is in it is O(N) for an any -- for variant it is O(1).
any is a dressed-up void*. variant is a dressed-up union.
any cannot store types that are not copyable; variant can hold move-only and even immovable types.
The type of variant is documentation for the reader of your code.
Passing a variant<Msg1, Msg2, Msg3> through an API makes the operation obvious; passing an any there means understanding the API requires reliable documentation or reading the implementation source.
Anyone who has been frustrated by statically typeless languages will understand the dangers of any.
Now this doesn't mean any is bad; it just doesn't solve the same problems as variant. As a copyable object for type erasure purposes, it can be great. Runtime dynamic typing has its place; but that place is not "everywhere" but rather "where you cannot avoid it".
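A short sketch of that compile-time-checking difference (the Setting alternatives here are hypothetical):

#include <any>
#include <iostream>
#include <string>
#include <variant>

using Setting = std::variant<int, double, std::string>;

void print_setting(const Setting& s) {
    // std::visit requires the visitor to handle every alternative;
    // a visitor that misses one fails to compile.
    std::visit([](const auto& value) { std::cout << value << '\n'; }, s);
}

void print_any(const std::any& a) {
    // With std::any the best we can do is probe for the exact types we expect;
    // anything else silently falls through to the last branch at runtime.
    if (auto p = std::any_cast<int>(&a))              std::cout << *p << '\n';
    else if (auto q = std::any_cast<double>(&a))      std::cout << *q << '\n';
    else if (auto r = std::any_cast<std::string>(&a)) std::cout << *r << '\n';
    else                                              std::cout << "unknown type\n";
}

int main() {
    print_setting(Setting{std::string("hello")});
    print_any(std::any{42});
}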
The difference is that the objects are stored within the memory allocated by std::variant:
cppreference.com - std::variant
As with unions, if a variant holds a value of some object type T, the object representation of T is allocated directly within the object representation of the variant itself. Variant is not allowed to allocate additional (dynamic) memory.
whereas std::any gives no such guarantee (implementations are merely encouraged to avoid dynamic allocation for small objects).
Because of that, a std::variant needs no memory beyond the std::variant object itself, and it can stay on the stack.
In addition to never using additional heap memory, variant has one other advantage:
You can std::visit a variant, but not any.

Where to use std::variant over union?

Please explain what the difference is between union and std::variant, and why std::variant was introduced into the standard. In what situations should we use std::variant over the old-school union?
Generally speaking, you should prefer variant unless one of the following comes up:
You're cheating. You're doing type-punning or other things that are UB but you're hoping your compiler won't break your code.
You're doing some of the pseudo-punnery that C++ unions are allowed to do: conversion between layout-compatible types or between common initial sequences.
You explicitly need layout compatibility. variant<Ts...> is not required to have any particular layout; unions of standard-layout types are standard-layout.
You need low-level support for in-place switching of objects. Using a memory buffer for such things doesn't provide the trivial copying guarantees that you could get out of a union.
The basic difference between the two is that variant knows which type it stores, while union expects you to keep track of that externally. So if you try to access the wrong item in a variant, you get an exception or nullptr. By contrast, doing so with a union is merely undefined behavior.
union is a lower-level tool, and thus should only be used when you absolutely need that lower-level.
variant also has machinery for doing visitation, which means you get to avoid having a bunch of if statements where you ask "if it is type X, do this. If it is type Y, do that, etc".
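A small sketch contrasting the two approaches (the types here are hypothetical):

#include <iostream>
#include <string>
#include <variant>

// Old-school tagged union: the tag is maintained by hand, and reading the wrong
// member is undefined behavior. A member with a non-trivial type (such as
// std::string) would also need manual construction and destruction, so it is
// left out here.
struct LegacyValue {
    enum class Kind { Int, Double } kind;
    union {
        int    i;
        double d;
    };
};

// std::variant: the "tag" is built in, wrong access throws or yields nullptr,
// and visitation replaces the chain of if/switch statements.
using Value = std::variant<int, double, std::string>;

int main() {
    LegacyValue lv;
    lv.kind = LegacyValue::Kind::Int;
    lv.i    = 42;                                   // the programmer must keep kind and i in sync
    if (lv.kind == LegacyValue::Kind::Int) std::cout << lv.i << '\n';

    Value v = std::string("hello");
    std::visit([](const auto& x) { std::cout << x << '\n'; }, v);
    if (auto p = std::get_if<int>(&v)) std::cout << *p << '\n';   // not taken: p is nullptr
}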

Which C++ idioms are deprecated in C++11?

With the new standard, there are new ways of doing things, and many are nicer than the old ways, but the old way is still fine. It's also clear that the new standard doesn't officially deprecate very much, for backward compatibility reasons. So the question that remains is:
What old ways of coding are definitely inferior to C++11 styles, and what can we now do instead?
In answering this, you may skip the obvious things like "use auto variables".
Final Class: C++11 provides the final specifier to prevent class derivation
C++11 lambdas substantially reduce the need for named function object (functor) classes.
Move Constructor: The magical ways in which std::auto_ptr works are no longer needed due to first-class support for rvalue references.
Safe bool: This was mentioned earlier. Explicit conversion operators (explicit operator bool()) in C++11 obviate this very common C++03 idiom.
Shrink-to-fit: Many C++11 STL containers provide a shrink_to_fit() member function, which should eliminate the need for the swap-with-a-temporary trick.
Temporary Base Class: Some old C++ libraries use this rather complex idiom. With move semantics it's no longer needed.
Type Safe Enum: enumerations are type-safe in C++11 thanks to enum class.
Prohibiting heap allocation: The = delete syntax is a much more direct way of saying that a particular functionality is explicitly denied. This is applicable to preventing heap allocation (i.e., =delete for member operator new), preventing copies, assignment, etc.
Templated typedef: Alias templates in C++11 reduce the need for simple templated typedefs. However, complex type generators still need meta functions.
Some numerical compile-time computations, such as Fibonacci, can easily be replaced with generalized constant expressions (constexpr functions; see the sketch after this list)
result_of: Uses of class template result_of should be replaced with decltype. I think result_of uses decltype when it is available.
In-class member initializers save typing for default initialization of non-static members with default values.
In new C++11 code NULL should be replaced with nullptr; see STL's talk to learn why they decided against simply redefining NULL as nullptr.
Expression template fanatics are delighted to have the trailing return type function syntax in C++11. No more 30-line long return types!
I think I'll stop there!
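To illustrate the constexpr point above, here is a minimal sketch comparing the two styles:

// C++03 style: compile-time Fibonacci via template metaprogramming.
template <unsigned N> struct Fib    { static const unsigned value = Fib<N-1>::value + Fib<N-2>::value; };
template <>           struct Fib<0> { static const unsigned value = 0; };
template <>           struct Fib<1> { static const unsigned value = 1; };

// C++11 style: a generalized constant expression does the same job.
constexpr unsigned fib(unsigned n) { return n < 2 ? n : fib(n - 1) + fib(n - 2); }

static_assert(Fib<10>::value == 55, "template version");
static_assert(fib(10) == 55,        "constexpr version");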
At one point in time it was argued that one should return by const value instead of just by value:
const A foo();
^^^^^
This was mostly harmless in C++98/03, and may have even caught a few bugs that looked like:
foo() = a;
But returning by const is contraindicated in C++11 because it inhibits move semantics:
A a = foo(); // foo will copy into a instead of move into it
So just relax and code:
A foo(); // return by non-const value
As soon as you can abandon 0 and NULL in favor of nullptr, do so!
In non-generic code the use of 0 or NULL is not such a big deal. But as soon as you start passing around null pointer constants in generic code the situation quickly changes. When you pass 0 to a template<class T> void func(T), T gets deduced as int and not as a null pointer constant, and it cannot be converted back to a null pointer constant after that. This cascades into a quagmire of problems that simply do not exist if the universe uses only nullptr.
C++11 does not deprecate 0 and NULL as null pointer constants. But you should code as if it did.
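A small sketch of the deduction problem (func is a hypothetical function template):

template <class T>
void func(T p) {
    // With func(0):       T is deduced as int            -> no longer a null pointer constant.
    // With func(NULL):    T is some integer type         -> same problem, implementation-defined which.
    // With func(nullptr): T is deduced as std::nullptr_t -> still convertible to any pointer type.
    int* q = p;   // OK for std::nullptr_t; a compile error when T is an integer type
    (void)q;
}

int main() {
    func(nullptr);   // fine
    // func(0);      // would fail to compile inside func: int is not convertible to int*
}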
Safe bool idiom → explicit operator bool().
Private copy constructors (boost::noncopyable) → X(const X&) = delete
Simulating final class with private destructor and virtual inheritance → class X final
One of the things that makes you stop writing basic algorithms by hand in C++11 is the availability of lambdas in combination with the algorithms provided by the standard library.
I'm using those now and it's incredible how often you just tell what you want to do by using count_if(), for_each() or other algorithms instead of having to write the damn loops again.
Once you're using a C++11 compiler with a complete C++11 standard library, you have no good excuse anymore not to use the standard algorithms to build your own. Lambdas just kill the remaining excuses.
Why?
In practice (after having written algorithms this way myself) it feels far easier to read something built from straightforward words saying what is done than loops you have to decipher to recover the meaning. That said, having lambda parameter types deduced automatically would help a lot in making the syntax more comparable to a raw loop.
Basically, code built from standard algorithms is far easier to read, because the algorithm names are words that hide the implementation details of the loops.
I'm guessing only higher-level algorithms have to be thought about now that we have these lower-level algorithms to build on.
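For example, a hand-written loop versus the algorithm-plus-lambda version (a minimal sketch):

#include <algorithm>
#include <cstddef>
#include <vector>

// C++03: a loop the reader has to decode.
std::size_t count_even_loop(const std::vector<int>& v) {
    std::size_t n = 0;
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it)
        if (*it % 2 == 0) ++n;
    return n;
}

// C++11: the algorithm's name says what is being done.
std::size_t count_even(const std::vector<int>& v) {
    return static_cast<std::size_t>(
        std::count_if(v.begin(), v.end(), [](int x) { return x % 2 == 0; }));
}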
You'll need to implement custom versions of swap less often. In C++03, an efficient non-throwing swap is often necessary to avoid costly and throwing copies, and since std::swap is implemented with copies, swap often has to be customized. In C++11, std::swap uses moves, so the focus shifts to implementing efficient and non-throwing move constructors and move assignment operators. Since the defaults are often just fine for these, this will be much less work than in C++03.
Generally it's hard to predict which idioms will be used since they are created through experience. We can expect an "Effective C++11" maybe next year, and a "C++11 Coding Standards" only in three years because the necessary experience isn't there yet.
I do not know the name for it, but C++03 code often used the following construct as a replacement for missing move assignment:
std::map<Big, Bigger> createBigMap(); // returns by value
void example ()
{
std::map<Big, Bigger> map;
// ... some code using map
createBigMap().swap(map); // cheap swap
}
This avoided copying the returned map into map: the return value is constructed directly thanks to copy elision and then cheaply swapped into place.
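In C++11 the same effect falls out of move assignment, so the idiom is no longer needed; a minimal sketch, reusing the hypothetical createBigMap from above:

void example_cpp11()
{
    std::map<Big, Bigger> map;
    // ... some code using map
    map = createBigMap();   // move assignment: no copy, no swap trick required
}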
When I noticed that a compiler using the C++11 standard no longer faults the following code:
std::vector<std::vector<int>> a;
for supposedly containing operator>>, I began to dance. In the earlier versions one would have to do
std::vector<std::vector<int> > a;
To make matters worse, if you ever had to debug this, you know how horrendous are the error messages that come out of this.
I, however, do not know if this was "obvious" to you.
Return by value is no longer a problem. With move semantics and/or return value optimization (compiler dependent), writing functions that return by value is natural, with no overhead or cost (most of the time).