Why GCC rejects std::optional for references? - c++

std::optional<int&> xx; just doesn't compile for the latest gcc-7.0.0 snapshot. Does the C++17 standard include std::optional for references? And why if it doesn't? (The implementation with pointers in a dedicated specialization whould cause no problems i guess.)

Because optional, as standardized in C++17, does not permit reference types. This was excluded by design.
There are two reasons for this. The first is that, structurally speaking, an optional<T&> is equivalent to a T*. They may have different interfaces, but they do the same thing.
The second thing is that there was effectively no consensus by the standards committee on questions of exactly how optional<T&> should behave.
Consider the following:
optional<T&> ot = ...;
T t = ...;
ot = t;
What should that last line do? Is it taking the object being referenced by ot and copy-assign to it, such that *ot == t? Or should it rebind the stored reference itself, such that ot.get() == &t? Worse, will it do different things based on whether ot was engaged or not before the assignment?
Some people will expect it to do one thing, and some people will expect it to do the other. So no matter which side you pick, somebody is going to be confused.
If you had used a T* instead, it would be quite clear which happens:
T* pt = ...;
T t = ...;
pt = t; //Compile error. Be more specific.
*pt = t; //Assign to pointed-to object.
pt = &t; //Change pointer.

In [optional]:
A program that necessitates the instantiation of template optional for a reference type, or for possibly cv-qualified types in_place_t or nullopt_t is ill-formed.
There is no std::optional<T&>. For now, you'll have to use std::optional<std::reference_wrapper<T>>.

Related

Rationale for const ref binding to a different type?

I recently learned that it's possible to assign a value to a reference of a different type. Concrete example:
const std::optional<float>& ref0 = 5.0f;
const std::optional<float>& ref1 = get_float();
That's surprising to me. I would certainly expect this to work with a non-reference, but assumed that references only bind to the same type.
I found a pretty good chunk of the c++ standard which talks about all kinds of ways this works: https://eel.is/c++draft/dcl.init.ref#5. But I would appreciate some insight: When is this ever desirable?
A particular occasion where this hurt me recently was this:
auto get_value() -> std::optional<float>{ /* ... */ }
const std::optional<float>& value = get_value();
// check and use value...
I later then changed the return value of the function to a raw float, expecting all uses with a reference type to fail. They did not. Without paying attention, all the useless checking code would have stayed in place.
The basic reason is one of consistency. Since const-reference parameters are very widely used not for reference semantics but merely to avoid copying, one would expect each of
void y(X);
void z(const X&);
to accept anything, rvalue or otherwise, that can be converted to an X. Initializing a local variable has the same semantics.
This syntax also once had a practical value: in C++03, the results of functions (including conversions) were notionally copied:
struct A {A(int);};
struct B {operator A() const;};
void g() {
A o=B(); // return value copied into o
const A &r=3; // refers to (lifetime-extended) temporary
}
There was already permission to elide these copies, and in this sort of trivial case it was common to do so, but the reference guaranteed it.

When is it safe to re-use memory from a trivially destructible object without laundering

Regarding the following code:
class One {
public:
double number{};
};
class Two {
public:
int integer{};
}
class Mixture {
public:
double& foo() {
new (&storage) One{1.0};
return reinterpret_cast<One*>(&storage)->number;
}
int& bar() {
new (&storage) Two{2};
return reinterpret_cast<Two*>(&storage)->integer;
}
std::aligned_storage_t<8> storage;
};
int main() {
auto mixture = Mixture{};
cout << mixture.foo() << endl;
cout << mixture.bar() << endl;
}
I haven't called the destructor for the types because they are trivially destructible. My understanding of the standard is that for this to be safe, we would need to launder the pointer to storage before passing it to the reinterpret_cast. However, std::optional's implementation in libstdc++ does not seem to use std::launder() and simply constructs the object right into the union storage. https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/std/optional.
Is my example above well defined behavior? What do I need to do to make it work? Would a union make this work?
In your code, you do need std::launder in order to make your reinterpret_cast do what you want it to do. This is a separate issue from that of re-using memory. According to the standard ([expr.reinterpret].cast]7), your expression
reinterpret_cast<One*>(&storage)
is equivalent to:
static_cast<One*>(static_cast<void*>(&storage))
However, the outer static_cast does not succeed in producing a pointer to the newly created One object because according to [expr.static.cast]/13,
if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible (6.9.2)
with a, the result is a pointer to b. Otherwise, the pointer value is unchanged by the conversion.
That is, the resulting pointer still points to the storage object, not to the One object nested within it, and using it as a pointer to a One object would violate the strict aliasing rule. You must use std::launder to force the resulting pointer to point to the One object. Or, as pointed out in the comments, you could simply use the pointer returned by placement new directly, rather than the one obtained from reinterpret_cast.
If, as suggested in the comments, you used a union instead of aligned_storage,
union {
One one;
Two two;
};
you would sidestep the pointer-interconvertibility issue, so std::launder would not be needed on account of non-pointer-interconvertibility. However, there is still the issue of re-use of memory. In this particular case, std::launder is not needed on account of re-use because your One and Two classes do not contain any non-static data members of const-qualified or reference type ([basic.life]/8).
Finally, there was the question of why libstdc++'s implementation of std::optional does not use std::launder, even though std::optional may contain classes that contain non-static data members of const-qualified or reference type. As pointed out in comments, libstdc++ is part of the implementation, and may simply elide std::launder when the implementers know that GCC will still compile the code properly without it. The discussion that led up to the introduction of std::launder (see CWG 1776 and the linked thread, N4303, P0137) seems to indicate that, in the opinion of people who understand the standard much better than I do, std::launder is indeed required in order to make the union-based implementation of std::optional well-defined in the presence of members of const-qualified or reference type. However, I am not sure that the standard text is clear enough to make this obvious, and it might be worth having a discussion about how it might be clarified.

Why are references forbidden in std::variant?

I use boost::variant a lot and am quite familiar with it. boost::variant does not restrict the bounded types in any way, in particular, they may be references:
#include <boost/variant.hpp>
#include <cassert>
int main() {
int x = 3;
boost::variant<int&, char&> v(x); // v can hold references
boost::get<int>(v) = 4; // manipulate x through v
assert(x == 4);
}
I have a real use-case for using a variant of references as a view of some other data.
I was then surprised to find, that std::variant does not allow references as bounded types, std::variant<int&, char&> does not compile and it says here explicitly:
A variant is not permitted to hold references, arrays, or the type void.
I wonder why this is not allowed, I don't see a technical reason. I know that the implementations of std::variant and boost::variant are different, so maybe it has to do with that? Or did the authors think it is unsafe?
PS: I cannot really work around the limitation of std::variant using std::reference_wrapper, because the reference wrapper does not allow assignment from the base type.
#include <variant>
#include <cassert>
#include <functional>
int main() {
using int_ref = std::reference_wrapper<int>;
int x = 3;
std::variant<int_ref> v(std::ref(x)); // v can hold references
static_cast<int&>(std::get<int_ref>(v)) = 4; // manipulate x through v, extra cast needed
assert(x == 4);
}
Fundamentally, the reason that optional and variant don't allow reference types is that there's disagreement on what assignment (and, to a lesser extent, comparison) should do for such cases. optional is easier than variant to show in examples, so I'll stick with that:
int i = 4, j = 5;
std::optional<int&> o = i;
o = j; // (*)
The marked line can be interpreted to either:
Rebind o, such that &*o == &j. As a result of this line, the values of i and j themselves remain changed.
Assign through o, such &*o == &i is still true but now i == 5.
Disallow assignment entirely.
Assign-through is the behavior you get by just pushing = through to T's =, rebind is a more sound implementation and is what you really want (see also this question, as well as a Matt Calabrese talk on Reference Types).
A different way of explaining the difference between (1) and (2) is how we might implement both externally:
// rebind
o.emplace(j);
// assign through
if (o) {
*o = j;
} else {
o.emplace(j);
}
The Boost.Optional documentation provides this rationale:
Rebinding semantics for the assignment of initialized optional references has been chosen to provide consistency among initialization states even at the expense of lack of consistency with the semantics of bare C++ references. It is true that optional<U> strives to behave as much as possible as U does whenever it is initialized; but in the case when U is T&, doing so would result in inconsistent behavior w.r.t to the lvalue initialization state.
Imagine optional<T&> forwarding assignment to the referenced object (thus changing the referenced object value but not rebinding), and consider the following code:
optional<int&> a = get();
int x = 1 ;
int& rx = x ;
optional<int&> b(rx);
a = b ;
What does the assignment do?
If a is uninitialized, the answer is clear: it binds to x (we now have another reference to x). But what if a is already initialized? it would change the value of the referenced object (whatever that is); which is inconsistent with the other possible case.
If optional<T&> would assign just like T& does, you would never be able to use Optional's assignment without explicitly handling the previous initialization state unless your code is capable of functioning whether after the assignment, a aliases the same object as b or not.
That is, you would have to discriminate in order to be consistent.
If in your code rebinding to another object is not an option, then it is very likely that binding for the first time isn't either. In such case, assignment to an uninitialized optional<T&> shall be prohibited. It is quite possible that in such a scenario it is a precondition that the lvalue must be already initialized. If it isn't, then binding for the first time is OK while rebinding is not which is IMO very unlikely. In such a scenario, you can assign the value itself directly, as in:
assert(!!opt);
*opt=value;
Lack of agreement on what that line should do meant it was easier to just disallow references entirely, so that most of the value of optional and variant can at least make it for C++17 and start being useful. References could always be added later - or so the argument went.
The fundamental reason is that a reference must be assigned to something.
Unions naturally do not - can not, even - set all their fields simultaneously and therefore simply cannot contain references, from the C++ standard:
If a union contains a non-static data member of reference type the
program is ill-formed.
std::variant is a union with extra data denoting the type currently assigned to the union, so the above statement implicitly holds true for std:variant as well. Even if it were to be implemented as a straight class rather than a union, we'd be back to square one and have an uninitialised reference when a different field was in use.
Of course we can get around this by faking references using pointers, but this is what std::reference_wrapper takes care of.

const casting an int in a class vs outside a class

I read on the wikipedia page for Null_pointer that Bjarne Stroustrup suggested defining NULL as
const int NULL = 0;
if "you feel you must define NULL." I instantly thought, hey.. wait a minute, what about const_cast?
After some experimenting, I found that
int main() {
const int MyNull = 0;
const int* ToNull = &MyNull;
int* myptr = const_cast<int*>(ToNull);
*myptr = 5;
printf("MyNull is %d\n", MyNull);
return 0;
}
would print "MyNull is 0", but if I make the const int belong to a class:
class test {
public:
test() : p(0) { }
const int p;
};
int main() {
test t;
const int* pptr = &(t.p);
int* myptr = const_cast<int*>(pptr);
*myptr = 5;
printf("t.p is %d\n", t.p);
return 0;
}
then it prints "t.p is 5"!
Why is there a difference between the two? Why is "*myptr = 5;" silently failing in my first example, and what action is it performing, if any?
First of all, you're invoking undefined behavior in both cases by trying to modify a constant variable.
In the first case the compiler sees that MyNull is declared as a constant and replaces all references to it within main() with a 0.
In the second case, since p is within a class the compiler is unable to determine that it can just replace all classInstance.p with 0, so you see the result of the modification.
Firstly, what happens in the first case is that the compiler most likely translates your
printf("MyNull is %d\n", MyNull);
into the immediate
printf("MyNull is %d\n", 0);
because it knows that const objects never change in a valid program. Your attempts to change a const object leads to undefined behavior, which is exactly what you observe. So, ignoring the undefined behavior for a second, from the practical point of view it is quite possible that your *myptr = 5 successfully modified your Null. It is just that your program doesn't really care what you have in your Null now. It knows that Null is zero and will always be zero and acts accordingly.
Secondly, in order to define NULL per recommendation you were referring to, you have to define it specifically as an Integral Constant Expression (ICE). Your first variant is indeed an ICE. You second variant is not. Class member access is not allowed in ICE, meaning that your second variant is significantly different from the first. The second variant does not produce a viable definition for NULL, and you will not be able to initialize pointers with your test::p even though it is declared as const int and set to zero
SomeType *ptr1 = Null; // OK
test t;
SomeType *ptr2 = t.p; // ERROR: cannot use an `int` value to initialize a pointer
As for the different output in the second case... undefined behavior is undefined behavior. It is unpredictable. From the practical point of view, your second context is more complicated, so the compiler was unable to prefrom the above optimization. i.e. you are indeed succeeded in breaking through the language-level restrictions and modifying a const-qualified variable. Language specification does not make it easy (or possible) for the compilers to optimize out const members of the class, so at the physical level that p is just another member of the class that resides in memory, in each object of that class. Your hack simply modifies that memory. It doesn't make it legal though. The behavior si still undefined.
This all, of course, is a rather pointless exercise. It looks like it all began from the "what about const_cast" question. So, what about it? const_cast has never been intended to be used for that purpose. You are not allowed to modify const objects. With const_cast, or without const_cast - doesn't matter.
Your code is modifying a variable declared constant so anything can happen. Discussing why a certain thing happens instead of another one is completely pointless unless you are discussing about unportable compiler internals issues... from a C++ point of view that code simply doesn't have any sense.
About const_cast one important thing to understand is that const cast is not for messing about variables declared constant but about references and pointers declared constant.
In C++ a const int * is often understood to be a "pointer to a constant integer" while this description is completely wrong. For the compiler it's instead something quite different: a "pointer that cannot be used for writing to an integer object".
This may apparently seem a minor difference but indeed is a huge one because
The "constness" is a property of the pointer, not of the pointed-to object.
Nothing is said about the fact that the pointed to object is constant or not.
The word "constant" has nothing to do with the meaning (this is why I think that using const it was a bad naming choice). const int * is not talking about constness of anything but only about "read only" or "read/write".
const_cast allows you to convert between pointers and references that can be used for writing and pointer or references that cannot because they are "read only". The pointed to object is never part of this process and the standard simply says that it's legal to take a const pointer and using it for writing after "casting away" const-ness but only if the pointed to object has not been declared constant.
Constness of a pointer and a reference never affects the machine code that will be generated by a compiler (another common misconception is that a compiler can produce better code if const references and pointers are used, but this is total bogus... for the optimizer a const reference and a const pointer are just a reference and a pointer).
Constness of pointers and references has been introduced to help programmers, not optmizers (btw I think that this alleged help for programmers is also quite questionable, but that's another story).
const_cast is a weapon that helps programmers fighting with broken const-ness declarations of pointers and references (e.g. in libraries) and with the broken very concept of constness of references and pointers (before mutable for example casting away constness was the only reasonable solution in many real life programs).
Misunderstanding of what is a const reference is also at the base of a very common C++ antipattern (used even in the standard library) that says that passing a const reference is a smart way to pass a value. See this answer for more details.

Passing non-const references to rvalues in C++

In the following line of code:
bootrec_reset(File(path, size, off), blksize);
Calling a function with prototype:
static void bootrec_reset(File &file, ssize_t blksize);
I receive this error:
libcpfs/mkfs.cc:99:53: error: invalid initialization of non-const reference of type 'File&' from an rvalue of type 'File'
libcpfs/mkfs.cc:30:13: error: in passing argument 1 of 'void bootrec_reset(File&, ssize_t)'
I'm aware that you can not pass non-const references (const &) to rvalues according to the standard. MSVC however allows you to do this (see this question). This question attempts to explain why but the answer makes no sense as he is using references to literals, which are a corner case and should obviously be disallowed.
In the given example it's clear to see that following order of events will occur (as it does in MSVC):
File's constructor will be called.
A reference to the File, and blksize, are pushed on the stack.
bootrec_reset makes use of file.
After returning from bootrec_reset, the temporary File is destroyed.
It's necessary to point out that the File reference needs to be non-const, as it's a temporary handle to a file, on which non-const methods are invoked. Furthermore I don't want to pass the File's constructor arguments to bootrec_reset to be constructed there, nor do I see any reason to manually construct and destroy a File object in the caller.
So my questions are:
What justifies the C++ standard disallowing non-const references in this manner?
How can I force GCC to permit this code?
Does the upcoming C++0x standard change this in anyway, or is there something the new standard gives me that is more appropriate here, for example all that jibberish about rvalue references?
Yes, the fact that plain functions cannot bind non-const references to temporaries -- but methods can -- has always bugged me. TTBOMK the rationale goes something like this (sourced from this comp.lang.c++.moderated thread):
Suppose you have:
void inc( long &x ) { ++x; }
void test() {
int y = 0;
inc( y );
std::cout << y;
}
If you allowed the long &x parameter of inc() to bind to a temporary long copy made from y, this code obviously wouldn't do what you expect -- the compiler would just silently produce code that leaves y unchanged. Apparently this was a common source of bugs in the early C++ days.
Had I designed C++, my preference would have been to allow non-const references to bind to temporaries, but to forbid automatic conversions from lvalues to temporaries when binding to references. But who knows, that might well have opened up a different can of worms...
"What justifies the C++ standard disallowing non-const references in this manner?"
Practical experience with the opposite convention, which was how things worked originally. C++ is to a large degree an evolved language, not a designed one. Largely, the rules that are still there are those that turned out to work (although some BIG exceptions to that occurred with the 1998 standardization, e.g. the infamous export, where the committee invented rather than standardizing existing practice).
For the binding rule one had not only the experience in C++, but also similar experience with other languages such as Fortran.
As #j_random_hacker notes in his answer (which as I wrote this was scored 0, showing that the scoring in SO really doesn't work as a measure of quality), the most serious problems have to do with implicit conversions and overload resolution.
"How can I force GCC to permit this code?"
You can't.
Instead of ...
bootrec_reset(File(path, size, off), blksize);
... write ...
File f(path, size, off);
bootrec_reset(f, blksize);
Or define an appropriate overload of bootrec_reset. Or, if "clever" code appeals, you can in principle write bootrec_reset(tempref(File(path, size, off)), blksize);, where you simply define tempref to return its argument reference appropriately const-casted. But even though that's a technical solution, don't.
"Does the upcoming C++0x standard change this in anyway, or is there something the new standard gives me that is more appropriate here, for example all that jibberish about rvalue references?"
Nope, nothing that changes things for the given code.
If you're willing to rewrite, however, then you can use e.g. C++0x rvalue references, or the C++98 workarounds shown above.
Cheers & hth.,
Does the upcoming C++0x standard change this in anyway, or is there something the new standard gives me that is more appropriate here, for example all that jibberish about rvalue references?
Yes. Since every name is an lvalue, it is almost trivial to treat any expression as if it was an lvalue:
template <typename T>
T& as_lvalue(T&& x)
{
return x;
}
// ...
bootrec_reset(as_lvalue(File(path, size, off)), blksize);
Is a fairly arbitrary decision - non-const references to temporaries are allowed when the temporary is the subject of a method call, for example (e.g. the "swap trick" to free the memory allocated by a vector, std::vector<type>().swap(some_vector);)
Short of giving the temporary a name, I don't think you can.
As far as I'm aware this rule exists in C++0x too (for regular references), but rvalue references specifically exist so you can bind references to temporaries - so changing bootrec_reset to take a File && should make the code legal.
Please note that calling C++0x "jibberish" is not presenting a very favorable picture of your coding ability or desire to understand the language.
1) Is actually not so arbitrary. Allowing non-const references to bind to r-values leads to extremely confusing code. I recently filed a bug against MSVC which relates to this, where the non-standard behavior caused standard-compliant code to fail to compile and/or compile with a deviant behavior.
In your case, consider:
#include <iostream>
template<typename T>
void func(T& t)
{
int& r = t;
++r;
}
int main(void)
{
int i = 4;
long n = 5;
const int& r = n;
const int ci = 6;
const long cn = 7;
//int& r1 = ci;
//int& r2 = cn;
func(i);
//func(n);
std::cout << r << std::endl;
}
Which of the commented lines to you want to compile? Do you want func(i) to change its argument and func(n) to NOT do so?
2) You can't make that code compile. You don't want to have that code. A future version of MSVC is probably going to remove the non-standard extension and fail to compile that code. Instead, use a local variable. You can always use a extra pair of braces to control the lifetime of that local variable and cause it to be destroyed prior to the next line of code, just like the temporary would be. Or r-value references.
{
File ftemp(path, size, off);
bootrec_reset(ftemp, blksize);
}
3) Yes, you can use C++0x r-value references in this scenario.
Alternatively, simply overload.
static void bootrec_reset(File &&file, ssize_t blksize) {
return bootrec_reset(file, blksize);
}
This is the easiest solution.
How can I force GCC to permit this code?
If you own the definition of File then you can try playing tricks such as this one:
class File /* ... */ {
public:
File* operator&() { return this; }
/* ... */
};
/* ... */
bootrec_reset(*&File(path, size, off), blksize);
This compiles for me in c++98 mode.
Does the upcoming C++0x standard change this in anyway, or is there something the new standard gives me that is more appropriate here, for example all that jibberish about rvalue references?
Obviously this the way to go if at all possible.