Why does as_const forbid rvalue arguments, as documented on cppreference.com? (That is, why did the standards committee design it that way; I am not asking why cppreference.com quotes them on it, nor where in the spec the committee's intent is codified, just to be clear :).) This artificial example yields an error (the user wants a const object to keep copy-on-write quiet):
QChar c = as_const(getQString())[0];
Another question's answer notes that if we just remove the deletion of the rvalue reference overload, it would silently transform rvalues to lvalues. Right, but why not handle rvalues gracefully and return const rvalues for rvalue input and const lvalues for lvalue input?
The problem is handling lifetime extension:
const auto& s = as_const(getQString()); // s would be a dangling reference
QChar c = s[0]; // UB :-/
A possibility would be the following overload (instead of the deleted one)
template< typename T >
const T as_const(T&& t) noexcept(noexcept(T(std::forward<T>(t))))
{
return std::forward<T>(t);
}
which involves extra construction, and maybe other pitfalls.
One reason might be that it could be dangerous on rvalues due to lack of ownership transfer
for (auto const &&value : as_const(getQString())) // whoops!
{
}
and that there might not be a compelling use case to justify disregarding this possibility.
(I accidentally answered the wrong question to a related question of this Q&A, after mis-reading it; I'm moving my answer to this question instead, the question which my answer actually addressed)
P0007R1 introduced std::as_const as part of C++17. The accepted proposal did not mention rvalues at all, but the previous revision of it, P0007R0, contained a closing discussion on rvalues [emphasis mine]:
IX. Further Discussion
The above implementation only supports safely re-casting an l-value as
const (even if it may have already been const). It is probably
desirable to have xvalues and prvalues also be usable with as_const,
but there are some issues to consider.
[...]
An alternative implementation which would support all of the forms
used above, would be:
template< typename T >
inline const T &
as_const( const T& t ) noexcept
{
return t;
}
template< typename T >
inline const T
as_const( T &&t ) noexcept( noexcept( T( t ) ) )
{
return t;
}
We believe that such an implementation helps to deal with lifetime
extension issues for temporaries which are captured by as_const, but
we have not fully examined all of the implications of these forms. We
are open to expanding the scope of this proposal, but we feel that
the utility of a simple-to-use as_const is sufficient even without the
expanded semantics.
So std::as_const was basically added only for lvalues as the implications of implementing it for rvalues were not fully examined by the original proposal, even if the return by value overload for rvalue arguments was at least visited. The final proposal, on the other hand, focused on getting the utility in for the common use case of lvalues.
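As a rough illustration of the alternative discussed in P0007R0, here is a minimal sketch of an lvalue overload plus a by-value rvalue overload. The namespace `sketch`, the helper `make_string`, and the `enable_if` constraint are assumptions made for this example; they are not taken from the proposal, and this is not how std::as_const is specified.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {  // hypothetical namespace, not part of the standard library

// Lvalue overload: same shape as std::as_const.
template <typename T>
constexpr const T& as_const(T& t) noexcept {
    return t;
}

// Rvalue overload (assumption): returns a new const object by value, so the
// result does not refer to the argument. Constrained so that it only accepts
// rvalues and never competes with the lvalue overload.
template <typename T,
          typename = std::enable_if_t<!std::is_lvalue_reference<T>::value>>
const T as_const(T&& t) noexcept(std::is_nothrow_constructible<T, T&&>::value) {
    return std::move(t);
}

}  // namespace sketch

// Hypothetical stand-in for a function such as getQString().
std::string make_string() { return "hello"; }

// Binding the by-value result to a reference extends its lifetime as usual.
std::size_t length_of_temporary() {
    const auto& s = sketch::as_const(make_string());
    assert(s == "hello");
    return s.size();
}
```

The price is the extra move (or copy, for const rvalues) that the question already mentions, which is one of the implications the proposal left unexamined.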
P2012R0 aims to address the hidden dangers of range-based for loops:
Fix the range‐based for loop, Rev0
The range-based for loop became the most important control structure
of modern C++. It is the loop to deal with all elements of a
container/collection/range.
However, due to the way it is currently defined, it can easily
introduce lifetime problems in non-trivial but simple applications
implemented by ordinary application programmers.
[...]
The symptom
Consider the following code examples when iterating over elements of
an element of a collection:
std::vector<std::string> createStrings(); // forward declaration
…
for (std::string s : createStrings()) … // OK
for (char c : createStrings().at(0)) … // UB (fatal runtime error)
While iterating over a temporary return value works fine, iterating
over a reference to a temporary return value is undefined behavior.
[...]
The Root Cause for the problem
The reason for the undefined behavior above is that according to the
current specification, the range-base for loop internally is expanded
to multiple statements: [...]
And the following call of the loop:
for (int i : createOptInts().value()) … // UB (fatal runtime error)
is defined as equivalent to the following:
auto&& rg = createOptInts().value(); // doesn’t extend lifetime of returned optional
auto pos = rg.begin();
auto end = rg.end();
for ( ; pos != end; ++pos ) {
int i = *pos;
…
}
By rule, all temporary values created during the initialization of the
reference rg that are not directly bound to it are destroyed before
the raw for loop starts.
[...]
Severity of the problem
[...]
As another example for restrictions caused by this problem consider
using std::as_const() in a range-based for loop:
std::vector vec;
for (auto&& val : std::as_const(getVector())) {
…
}
Both std::ranges with operator | and std::as_const() have a
deleted overload for rvalues to disable this and similar uses. With
the proposed fix things like that could be possible. We can definitely
discuss the usability of such examples, but it seems that there are
more example than we thought where the problem causes to =delete
function calls for rvalues.
These gotchas are one argument against allowing a std::as_const() overload for rvalues, but if P2012R0 gets accepted, such an overload could arguably be added (if someone makes a proposal and shows a valid use case for it).
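The lifetime rule quoted above can be observed without touching a dangling reference. The following is a minimal sketch; `Owner`, `make_owner`, and the global flag are hypothetical names invented for the illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical flag recording whether the Owner temporary is currently alive.
static bool g_owner_alive = false;

struct Owner {
    std::vector<int> items{1, 2, 3};
    Owner() { g_owner_alive = true; }
    ~Owner() { g_owner_alive = false; }
    const std::vector<int>& get() const { return items; }
};

Owner make_owner() { return Owner(); }

// Same shape as `auto&& rg = createOptInts().value();`: the reference binds to
// the result of a member function, so the temporary Owner is destroyed at the
// end of the declaration.
bool alive_after_indirect_binding() {
    auto&& rg = make_owner().get();  // rg dangles; it is never dereferenced here
    (void)rg;
    return g_owner_alive;
}

// Here the temporary is bound directly, so its lifetime is extended to that
// of rg.
bool alive_after_direct_binding() {
    auto&& rg = make_owner();
    (void)rg;
    return g_owner_alive;
}

void check_lifetimes() {
    assert(!alive_after_indirect_binding());
    assert(alive_after_direct_binding());
}
```

A function such as as_const that returns a reference to its argument falls into the first category, which is why a range-based for loop over its result would iterate over a destroyed object.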
Because as_const doesn't take its argument by const reference.
Non-const lvalue references can't bind to temporaries.
The rvalue overload (taking const T&&) is explicitly deleted.
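The effect of the deleted overload can be checked at compile time. This is a small sketch assuming C++17 (for std::as_const and std::void_t); the trait name `can_as_const` is made up for the example.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical trait: true if std::as_const(std::declval<Arg>()) is a
// well-formed call. Selecting a deleted overload counts as a failure.
template <typename Arg, typename = void>
struct can_as_const : std::false_type {};

template <typename Arg>
struct can_as_const<Arg,
                    std::void_t<decltype(std::as_const(std::declval<Arg>()))>>
    : std::true_type {};

// Lvalues, const or not, are accepted.
static_assert(can_as_const<std::string&>::value, "non-const lvalue");
static_assert(can_as_const<const std::string&>::value, "const lvalue");

// Rvalues are rejected: a non-const rvalue cannot bind to T&, and a const
// rvalue prefers the deleted const T&& overload.
static_assert(!can_as_const<std::string>::value, "non-const rvalue");
static_assert(!can_as_const<const std::string>::value, "const rvalue");
```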
Related
Many times I saw code like this:
template<typename Collection>
void Foo(Collection&& c)
{
for (auto&& i : std::forward<Collection>(c))
// do something with i
}
For all STL containers (except vector<bool>), i is an lvalue reference. Is there any practical point in writing auto&& in this case?
As you said, the perfect example of where you don't have an lvalue is std::vector<bool>. Using auto& will not compile, as dereferencing the iterator returns a prvalue.
Also, I have on occasion written ranges whose iterators did not return an lvalue.
Also, the upside of using auto&& is that there are no cases where it won't work. Even if you have a bizarre case where your iterator yields a const rvalue reference, auto&& will bind to it.
For teaching, it's also easier to say "use auto&& in your for loops", because it never causes a copy and works everywhere.
There was also a proposal to allow an implicit auto&& and enable the syntax for (x : range). (I cannot remember which one it is. If you know, please tell me in the comments.)
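To make the vector<bool> point concrete, here is a minimal sketch of a generic loop that only compiles for every container because it uses auto&&. The function name `set_all` is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Assigns true to every element of any range.
template <typename Range>
void set_all(Range& range) {
    // With std::vector<int> this binds to int&. With std::vector<bool> it
    // binds to a temporary proxy object, which `auto&` could not do.
    for (auto&& element : range) {
        element = true;
    }
}

void check_set_all() {
    std::vector<bool> flags(3, false);
    set_all(flags);
    assert(flags[0] && flags[1] && flags[2]);

    std::vector<int> numbers(2, 0);
    set_all(numbers);  // true converts to 1
    assert(numbers[0] == 1 && numbers[1] == 1);
}
```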
What happens to a movable object if I call
std::set<>::insert on it, and the insertion doesn't take place
because there is already an object with the same key present in
the set? In particular, is the following valid:
struct Object
{
std::string key;
// ...
struct OrderByKey
{
bool operator()( Object const& lhs, Object const& rhs) const
{
return lhs.key < rhs.key;
}
};
Object( Object&& other )
: key(std::move(other.key))
// ...
{
}
};
and:
std::set<Object, Object::OrderByKey> registry;
void
registerObject(Object&& o)
{
auto status = registry.insert(std::move(o));
if (!status.second)
throw std::runtime_error("Duplicate entry for " + o.key);
}
After the registry.insert, when no insertion takes place, has
o been moved or not? (If it might have been moved, I need to
save a copy of the string beforehand, in order to use it in the
error message.) And yes, I know that I can always write:
throw std::runtime_error( "Duplicate entry for " + status.first->key );
Which is what I'll probably do. But I would still like to know
what the standard says about this.
A std::move()ed object passed to a standard library function should be considered to be moved from: the library is free to consider the object to be its only reference, i.e., it may move from it even if it doesn't use it. The relevant clause is 17.6.4.9 [res.on.arguments] paragraph 1, third bullet (it seems identical in C++11 and C++14):
If a function argument binds to an rvalue reference, the implementation may assume that this parameter is a unique reference to this argument. ...
This is (very similar to) LWG active issue 2362, which deals with emplace. IIRC the sentiment is that it should be guaranteed that the object is moved if and only if it is inserted/emplaced. With emplace this seems to be non-trivial to achieve. I do not remember whether the situation is easier for insert.
The third bullet of [res.on.arguments]/1 from N3936:
If a function argument binds to an rvalue reference parameter, the implementation may assume that
this parameter is a unique reference to this argument. [ Note: If the parameter is a generic parameter of the form T&& and an lvalue of type A is bound, the argument binds to an lvalue reference (14.8.2.1) and thus is not covered by the previous sentence. —end note ] [ Note: If a program casts an lvalue to an xvalue while passing that lvalue to a library function (e.g. by calling the function with the argument move(x)), the program is effectively asking that function to treat that lvalue as a temporary. The implementation is free to optimize away aliasing checks which might be needed if the argument was an lvalue. —end note ]
despite on its face being about aliasing, can be interpreted as allowing the implementation to do virtually anything to an argument that is passed to a standard library function bound to an rvalue reference. As Fabio says, there are situations in which tightening this specification could be useful, and the committee is looking at some of them.
Well, to answer your last question
I would still like to know what the standard says about this
the standard specifies what happens with insert(t) in Table 102 - Associative container requirements, p717 of N3376 (emphasis mine):
Requires: If t is a non-const rvalue expression, value_type shall be MoveInsertable into X; otherwise, value_type shall be CopyInsertable into X.
Effects: Inserts t if and only if there is no element in the container with key equivalent to the key of t. The bool component of the returned pair is true if and only if the insertion takes place, and the iterator component of the pair points to the element with key equivalent to the key of t.
so I'd say that nothing should happen and that your Object is still valid and not in an unspecified state.
EDIT: As someone pointed out in the comments, this may actually depend on the implementation: the standard explicitly requires that the set is not modified, but states no explicit requirement on the argument or on whether it has been moved from.
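Since the standard gives no clear guarantee about the argument, the defensive approach mentioned in the question is to read the key from the element already in the set. A minimal sketch follows; `Item`, `ByKey`, and `add_item` are hypothetical names standing in for the question's Object and register function.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

struct Item {
    std::string key;
};

struct ByKey {
    bool operator()(const Item& lhs, const Item& rhs) const {
        return lhs.key < rhs.key;
    }
};

void add_item(std::set<Item, ByKey>& registry, Item&& item) {
    auto status = registry.insert(std::move(item));
    if (!status.second) {
        // Do not read item.key here: item may have been moved from.
        // status.first points at the existing element with the same key.
        throw std::runtime_error("Duplicate entry for " + status.first->key);
    }
}

void check_add_item() {
    std::set<Item, ByKey> registry;
    add_item(registry, Item{"a"});
    bool reported = false;
    try {
        add_item(registry, Item{"a"});
    } catch (const std::runtime_error& e) {
        reported = std::string(e.what()) == "Duplicate entry for a";
    }
    assert(reported);
}
```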
In this article it says the following code is valid C++11 and works with GNU's libstdc++:
int n;
std::vector<int> v;
...
std::function<bool(int)> f(std::cref([n](int i) {return i%n == 0));
std::count_if(v.begin(), v.end(), f);
The thing is that I always believed the lambda object to be created at the call site, which would make it a temporary object in this snippet, since it is not stored in any variable; instead, a const reference to it is created and passed to the std::function. If that is so, the lambda object should be destroyed right away, leaving a dangling reference inside f, which would lead to undefined behavior when used by std::count_if.
Assuming the article is not wrong, what is wrong with my mental model? When is the lambda object destroyed?
OK, let's start with the basics: the above code is certainly not legal because it is ill-formed in some rather basic ways. The line
std::function<bool(int)> f(std::cref([n](int i) {return i%n == 0));
would at bare minimum need to be written as
std::function<bool(int)> f(std::cref([n](int i) {return i%n == 0;}));
Note that the code was written in the Dr.Dobb's article as it was in the question, i.e., any statement of the code being legal is already quite questionable.
Once the simple syntax errors are resolved the next question is whether std::cref() can actually be used to bind to an rvalue. The lambda expression is clearly a temporary according to 5.1.2 [expr.prim.lambda] paragraph 2 (thanks to DyP for the reference). Since it would generally be a rather bad idea to bind a reference to a temporary and is prohibited elsewhere, std::cref() would be a way to circumvent this restriction. It turns out that according to 20.10 [function.objects] paragraph 2 std::cref() is declared as
template <class T> reference_wrapper<const T> cref(const T&) noexcept;
template <class T> void cref(const T&&) = delete;
template <class T> reference_wrapper<const T> cref(reference_wrapper<T>) noexcept;
That is, the statement is incorrect even after correcting the syntax errors. Neither gcc nor clang compile this code (I have used fairly recent versions of both compilers with their respective standard C++ libraries). That is, based on the above declaration this code is clearly illegal!
Finally, there is nothing which would extend the lifetime of the temporary in the above expression. The only case in which the lifetime of a temporary is extended is when it or one of its data members is immediately bound to a [const] reference. Wrapping a function call around the temporary inhibits this lifetime extension.
In summary: the code quoted in the article is not legal on many different levels!
I'm the author of the aforementioned article and I apologise for my mistake. No one else is to blame in this case. I've just asked the editor to add an erratum:
1) Replace
std::count_if(v.begin(), v.end(), std::cref(is_multiple_of(n)));
with
is_multiple_of f(n);
std::count_if(v.begin(), v.end(), std::cref(f));
2) Replace
std::count_if(v.begin(), v.end(), std::cref([n](int i){return i%n == 0;}));
with
auto f([n](int i){return i%n == 0;});
std::count_if(v.begin(), v.end(), std::cref(f));
3) Replace
std::function<bool(int)> f(std::cref([n](int i) {return i%n == 0));
with
auto f1([n](int i){return i%n == 0;});
std::function<bool(int)> f(std::cref(f1));
In all cases the problem is the same (as Dietmar Kühl has nicely explained, +1 to him): we are calling std::cref on a temporary. This function returns a std::reference_wrapper storing a pointer to the temporary and this pointer will dangle if the std::reference_wrapper outlives the temporary. Basically this is what happens in case 3 above (which also contains a typo).
In cases 1 and 2, the std::reference_wrapper would not outlive the temporary. However, since the overloads of std::cref accepting temporaries (rvalues) are deleted the code should not compile (including case 3). At the time of publication, the implementations were not up to date with the Standard as they are today. The code used to compile but it doesn't when used with newer implementations of the standard library. This is not an excuse for my mistake though.
In any case, I believe the main point of the article, that is, the use of std::reference_wrapper, std::cref and std::ref to avoid expensive copies and dynamic allocations is still valid provided, of course, that the lifetime of the referred object is long enough.
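As an illustration of that point, here is a minimal sketch in which a named, stateful function object is passed through std::ref, so the algorithm works on the original object instead of a copy. The type `CountingMultipleOf` and the function `count_multiples` are hypothetical and not taken from the article.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical predicate that also counts how often it is invoked.
struct CountingMultipleOf {
    int n;
    int calls = 0;
    bool operator()(int i) {
        ++calls;
        return i % n == 0;
    }
};

// pred is a named object owned by the caller, so it outlives the
// reference_wrapper that std::ref creates for the duration of the call.
int count_multiples(const std::vector<int>& values, CountingMultipleOf& pred) {
    return static_cast<int>(
        std::count_if(values.begin(), values.end(), std::ref(pred)));
}

void check_count_multiples() {
    std::vector<int> values{1, 2, 3, 6, 7};
    CountingMultipleOf pred{3};
    assert(count_multiples(values, pred) == 2);
    // The call count is visible in the original object: no copy was made.
    assert(pred.calls == 5);
}
```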
Again, I apologise for the inconvenience.
Update: The article has been fixed. Thanks uk4321, DyP and, especially, lvella and Dietmar Kühl for raising and discussing the issue.
My class' interface includes an accessor to an object which may not exist. Currently, it returns a pointer which may be null. I would like to replace the pointer with std::optional as suggested here. The accessor has a const overload which uses Meyers' const_cast trick to avoid repeating the same code twice.
In short, I want to replace this:
T const * MyClass::get() const {
/* non-trivial */
}
T * MyClass::get() {
return const_cast<T *>(const_cast<MyClass const *>(this)->get());
}
with this:
std::optional<T const &> MyClass::get() const {
/* non-trivial */
}
std::optional<T &> MyClass::get() {
auto t = const_cast<MyClass const *>(this)->get();
return t ? std::optional<T &>(const_cast<T &>(* t)) : std::nullopt;
}
The replacement seems unsatisfactory because:
it introduces a branch;
the additional complexity somewhat defeats the goal of making the overload be lightweight (and trivially optimized away by the compiler).
I am assuming that the std::optional specialization for a reference can basically boil down to little more than a pointer with added safety and wonder therefore if there's some way to preserve the simplicity of the pointer solution. Is there a more satisfactory way to write the accessor overload to use std::optional?
As mentioned by others, instantiating std::optional with a reference type is ill-formed in the C++14 standard. (See N3690 20.6.2.) Thus using std::optional as a drop-in replacement for a pointer (that points to a single object whose absence is represented by a value of nullptr) is not viable unless you are willing to copy the object by value, rather than by reference.
However, the specification leaves the door open to adding such functionality in the future. Additionally, section 7.15 of N3672 suggests a workaround using std::reference_wrapper.
Update: Additionally, @HowardHinnant informs me that inclusion in the standard has been punted out of C++14 altogether.
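The std::reference_wrapper workaround could look roughly like the sketch below. It assumes the std::optional that later shipped in C++17; the class `Holder` and its members are hypothetical stand-ins for MyClass. Note that it does not remove the branch the question objects to.

```cpp
#include <cassert>
#include <functional>
#include <optional>

class Holder {
public:
    explicit Holder(bool has_value) : has_value_(has_value) {}

    // Const accessor: holds the real logic.
    std::optional<std::reference_wrapper<const int>> get() const {
        if (!has_value_) return std::nullopt;
        return std::cref(value_);
    }

    // Non-const accessor: delegates to the const one and casts constness away,
    // which is safe because *this is known to be non-const here.
    std::optional<std::reference_wrapper<int>> get() {
        auto result = static_cast<const Holder&>(*this).get();
        if (!result) return std::nullopt;
        return std::ref(const_cast<int&>(result->get()));
    }

private:
    bool has_value_;
    int value_ = 42;
};

void check_holder() {
    Holder full(true);
    full.get()->get() = 7;  // writes through the reference_wrapper
    const Holder& view = full;
    assert(view.get()->get() == 7);

    Holder empty(false);
    assert(!empty.get());
}
```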
I have read Effective C++ 3rd Edition written by Scott Meyers.
Item 3 of the book, "Use const whenever possible", says that if we want to prevent a function's return value from being accidentally assigned to, the return type should be const.
For example, the increment function for iterator:
const iterator iterator::operator++(int) {
...
}
Then some accidents are prevented.
iterator it;
// error in the following, same as primitive pointer
// I wanted to compare iterators
if (it++ = iterator()) {
...
}
However, iterators such as std::vector::iterator in GCC don't return const values.
vector<int> v;
v.begin()++ = v.begin(); // pass compiler check
Are there some reasons for this?
I'm pretty sure that this is because it would play havoc with rvalue references and any sort of decltype. Even though these features were not in C++03, they have been known to be coming.
More importantly, I don't believe that any Standard function returns const rvalues, it's probably something that wasn't considered until after the Standard was published. In addition, const rvalues are generally not considered to be the Right Thing To Do™. Not all uses of non-const member functions are invalid, and returning const rvalues is blanketly preventing them.
For example,
auto it = ++vec.begin();
is perfectly valid, and indeed, valid semantics, if not exactly desirable. Consider my class that offers method chains.
class ILikeMethodChains {
public:
int i;
ILikeMethodChains& SetSomeInt(int param) {
i = param;
return *this;
}
};
ILikeMethodChains func() { ... }
ILikeMethodChains var = func().SetSomeInt(1);
Should that be disallowed just because maybe, sometimes, we might call a function that doesn't make sense? No, of course not. Or how about "swaptimization"?
std::string func() { return "Hello World!"; }
std::string s;
func().swap(s);
This would be illegal if func() produced a const expression - but it's perfectly valid and indeed, assuming that std::string's implementation does not allocate any memory in the default constructor, both fast and legible/readable.
What you should realize is that the C++03 rvalue/lvalue rules frankly just don't make sense. They are, effectively, only part-baked, and the minimum required to disallow some blatant wrongs whilst allowing some possible rights. The C++0x rvalue rules are much saner and much more complete.
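The interaction with rvalue references mentioned at the start of this answer can be shown directly: a const return value cannot bind to a move operation, so it silently falls back to copying. A minimal sketch, with the hypothetical type `Probe` recording which assignment operator ran:

```cpp
#include <cassert>

struct Probe {
    bool moved_into = false;
    bool copied_into = false;

    Probe() = default;
    Probe(const Probe&) = default;
    Probe(Probe&&) = default;

    Probe& operator=(const Probe&) {
        copied_into = true;
        return *this;
    }
    Probe& operator=(Probe&&) noexcept {
        moved_into = true;
        return *this;
    }
};

Probe make_plain() { return Probe(); }
const Probe make_const() { return Probe(); }  // the style Item 3 recommends

void check_probe() {
    Probe a;
    a = make_plain();  // non-const rvalue: move assignment is selected
    assert(a.moved_into && !a.copied_into);

    Probe b;
    b = make_const();  // const rvalue: only copy assignment can bind
    assert(b.copied_into && !b.moved_into);
}
```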
If it is non-const, I expect *(++it) to give me mutable access to the thing it represents.
However, dereferencing a const iterator yields only non-mutable access to the thing it represents. [edit: no, this is wrong too. I really give up now!]
This is the only reason I can think of.
As you rightly point out, the following is ill-formed because ++ on a primitive yields an rvalue (which can't be on the LHS):
int* p = 0;
(p++)++;
So there does seem to be something of an inconsistency in the language here.
EDIT: This does not really answer the question, as pointed out in the comments. I'll just leave the post here in case it's useful anyhow...
I think this is pretty much a matter of unifying syntax for a more usable interface. By providing such member functions under a single name and letting overload resolution alone pick the correct version, you spare (or at least try to spare) the programmer const-related worries.
I know this might seem contradictory, particularly given your example. But if you think about most of the use cases, it makes sense. Take an STL algorithm like std::equal. No matter whether your container is constant or not, you can always write something like bool e = std::equal(c.begin(), c.end(), c2.begin()) without having to think about the right version of begin and end.
This is the general approach in the STL. Remember operator[]... Bearing in mind that the containers are meant to be used with the algorithms, this is plausible. It is also worth noting, though, that in some cases you might still need to declare an iterator of the matching type (iterator or const_iterator).
Well, this is just what comes up to my mind right now. I'm not sure how convincing it is...
Side note: The correct way to use constant iterators is through the const_iterator typedef.
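A short sketch of the overload behaviour described in this answer: begin() yields iterator or const_iterator depending on the constness of the container, and the same std::equal call works for both. The function name `same_contents` is made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

using Vec = std::vector<int>;

// begin() on a non-const container yields iterator ...
static_assert(std::is_same<decltype(std::declval<Vec&>().begin()),
                           Vec::iterator>::value, "non-const begin");
// ... on a const container it yields const_iterator ...
static_assert(std::is_same<decltype(std::declval<const Vec&>().begin()),
                           Vec::const_iterator>::value, "const begin");
// ... and cbegin() always yields const_iterator.
static_assert(std::is_same<decltype(std::declval<Vec&>().cbegin()),
                           Vec::const_iterator>::value, "cbegin");

// The caller does not need to care which begin() overload is picked.
bool same_contents(const Vec& lhs, Vec& rhs) {
    return lhs.size() == rhs.size() &&
           std::equal(lhs.begin(), lhs.end(), rhs.begin());
}

void check_same_contents() {
    Vec a{1, 2, 3};
    Vec b{1, 2, 3};
    Vec c{1, 2, 4};
    assert(same_contents(a, b));
    assert(!same_contents(a, c));
}
```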