I noticed that std::ranges::sort cannot sort std::vector<bool>:
<source>:6:51: error: no match for call to '(const std::ranges::__sort_fn) (std::vector<bool, std::allocator<bool> >)'
6 | std::ranges::sort(std::vector{false, true, true});
|
Is this allowed? Should we need a specialization of std::ranges::sort for std::vector<bool>? Is there any information about how the committee consider this?
As an update, now that zip was adopted for c++23, part of that paper added const-assignment to vector<bool>::reference, which allows that that type to satisfy indirectly_writable, and thus std::ranges::sort on a vector<bool> works in C++23.
Correct.
More generally, std::ranges::sort cannot sort proxy references. The direct reason is that sort requires sortable (surprising, right) which if we follow that chain up requires permutable which requires indirectly_movable_storable which requires indirectly_movable which requires indirectly_writable.
And indirectly_writeable is a very peculiar looking concept.
template<class Out, class T>
concept indirectly_writable =
requires(Out&& o, T&& t) {
*o = std::forward<T>(t); // not required to be equality-preserving
*std::forward<Out>(o) = std::forward<T>(t); // not required to be equality-preserving
const_cast<const iter_reference_t<Out>&&>(*o) =
std::forward<T>(t); // not required to be equality-preserving
const_cast<const iter_reference_t<Out>&&>(*std::forward<Out>(o)) =
std::forward<T>(t); // not required to be equality-preserving
};
I want to specifically draw your attention to:
const_cast<const iter_reference_t<Out>&&>(*o) = std::forward<T>(t);
Wait, we require const assignability?
This particular issue has a long history. You can start with #573, in which a user demonstrated this problem:
struct C
{
explicit C(std::string a) : bar(a) {}
std::string bar;
};
int main()
{
std::vector<C> cs = { C("z"), C("d"), C("b"), C("c") };
ranges::sort(cs | ranges::view::transform([](const C& x) {return x.bar;}));
for (const auto& c : cs) {
std::cout << c.bar << std::endl;
}
}
The expectation of course was that it would print b, c, d, z in that order. But it didn't. It printed z, d, b, c. The order didn't change. The reason here is that because this is a range of prvalues, the elements we're swapping as part of the sort. Well, they're temporaries. This has no effect on cs whatsoever.
This obviously can't work. The user has a bug - they intended to sort the Cs by the bars (i.e. use a projection) but instead they're just sorting the bars (even if the lambda returned a reference, they'd be sorting just the bars and not the Cs anyway -- in this case there is only one member of C anyway but in the general case this is clearly not the intended behavior).
But the goal then is really: how do we make this bug not compile? That's the dream. The problem is that C++ added ref-qualifications in C++11, but implicit assignment has always existed. And implicit operator= has no ref-qualifier, you can assign to an rvalue just fine, even if that makes no sense whatsoever:
std::string("hello") = "goodbye"; // fine, but pointless, probably indicative of a bug
Assigning to an rvalue is really only okay if the ravlue itself handles this correctly. Ideally, we could just check to make sure a type has an rvalue-qualified operator=. Proxy types (such as vector<bool>::reference) would then qualify their assignment operators, that's what we would check, and everybody's happy.
But we can't do that - because basically every type is rvalue-assignable, even if very few types actually meaningfully are. So what Eric and Casey came up with is morally equivalent to adding a type trait to a type that says "I am, legitimately, for real, rvalue-assignable." And unlike most type traits where you would do something like:
template <>
inline constexpr bool for_real_rvalue_assignable<T> = true;
This one is just spelled:
T& operator=(Whatever) const;
Even though the const equality operator will not actually be invoked as part of the algorithm. It just has to be there.
You might ask at this point - wait, what about references? For "normal" ranges (say, vector<int>, the iter_reference_t<Out> gives you int&, and const iter_reference_t<Out>&& is... still just int&. That's why this just works. For ranges that yield glvalues, these const-assignment requirements basically duplicate the normal assignment requirements. The const-assignability issue is _only_for prvalues.
This issue was also the driver of why views::zip isn't in C++20. Because zip also yields a prvalue range and a tuple<T&...> is precisely the kind of proxy reference that we would need to handle here. And to handle that, we would have to make a change to std::tuple to allow this kind of const-assignability.
As far as I'm aware, this is still the direction that it's intended (given that we have already enshrined that requirement into a concept, a requirement that no standard library proxy types actually satisfy). So when views::zip is added, tuple<T&...> will be made const-assignable as well as vector<bool>::reference.
The end result of that work is that:
std::ranges::sort(std::vector{false, true, true});
will actually both compile and work correctly.
Related
There's a similar question: check if elements of a range can be moved?
I don't think the answer in it is a nice solution. Actually, it requires partial specialization for all containers.
I made an attempt, but I'm not sure whether checking operator*() is enough.
// RangeType
using IteratorType = std::iterator_t<RangeType>;
using Type = decltype(*(std::declval<IteratorType>()));
constexpr bool canMove = std::is_rvalue_reference_v<Type>;
Update
The question may could be split into 2 parts:
Could algorithms in STL like std::copy/std::uninitialized_copy actually avoid unnecessary deep copy when receiving elements of r-value?
When receiving a range of r-value, how to check if it's a range adapter like std::ranges::subrange, or a container which holds the ownership of its elements like std::vector?
template <typename InRange, typename OutRange>
void func(InRange&& inRange, OutRange&& outRange) {
using std::begin;
using std::end;
std::copy(begin(inRange), end(inRange), begin(outRange));
// Q1: if `*begin(inRange)` returns a r-value,
// would move-assignment of element be called instead of a deep copy?
}
std::vector<int> vi;
std::list<int> li;
/* ... */
func(std::move(vi), li2);
// Q2: Would elements be shallow copy from vi?
// And if not, how could I implement just limited count of overloads, without overload for every containers?
// (define a concept (C++20) to describe those who take ownership of its elements)
Q1 is not a problem as #Nicol Bolas , #eerorika and #Davis Herring pointed out, and it's not what I puzzled about.
(But I indeed think the API is confusing, std::assign/std::uninitialized_construct may be more ideal names)
#alfC has made a great answer about my question (Q2), and gives a pristine perspective. (move idiom for ranges with ownership of elements)
To sum up, for most of the current containers (especially those from STL), (and also every range adapter...), partial specialization/overload function for all of them is the only solution, e.g.:
template <typename Range>
void func(Range&& range) { /*...*/ }
template <typename T>
void func(std::vector<T>&& movableRange) {
auto movedRange = std::ranges::subrange{
std::make_move_iterator(movableRange.begin()),
std::make_move_iterator(movableRange.end())
};
func(movedRange);
}
// and also for `std::list`, `std::array`, etc...
I understand your point.
I do think that this is a real problem.
My answer is that the community has to agree exactly what it means to move nested objected (such as containers).
In any case this needs the cooperation of the container implementors.
And, in the case of standard containers, good specifications.
I am pessimistic that standard containers can be changed to "generalize" the meaning of "move", but that can't prevent new user defined containers from taking advantage of move-idioms.
The problem is that nobody has studied this in depth as far as I know.
As it is now, std::move seem to imply "shallow" move (one level of moving of the top "value type").
In the sense that you can move the whole thing but not necessarily individual parts.
This, in turn, makes useless to try to "std::move" non-owning ranges or ranges that offer pointer/iterator stability.
Some libraries, e.g. related to std::ranges simply reject r-value of references ranges which I think it is only kicking the can.
Suppose you have a container Bag.
What should std::move(bag)[0] and std::move(bag).begin() return? It is really up to the implementation of the container decide what to return.
It is hard to think of general data structures, bit if the data structure is simple (e.g. dynamic arrays) for consistency with structs (std::move(s).field) std::move(bag)[0] should be the same as std::move(bag[0]) however the standard strongly disagrees with me already here: https://en.cppreference.com/w/cpp/container/vector/operator_at
And it is possible that it is too late to change.
Same goes for std::move(bag).begin() which, using my logic, should return a move_iterator (or something of the like that).
To make things worst, std::array<T, N> works how I would expect (std::move(arr[0]) equivalent to std::move(arr)[0]).
However std::move(arr).begin() is a simple pointer so it looses the "forwarding/move" information! It is a mess.
So, yes, to answer your question, you can check if using Type = decltype(*std::forward<Bag>(bag).begin()); is an r-value but more often than not it will not implemented as r-value.
That is, you have to hope for the best and trust that .begin and * are implemented in a very specific way.
You are in better shape by inspecting (somehow) the category of the range itself.
That is, currently you are left to your own devices: if you know that bag is bound to an r-value and the type is conceptually an "owning" value, you currently have to do the dance of using std::make_move_iterator.
I am currently experimenting a lot with custom containers that I have. https://gitlab.com/correaa/boost-multi
However, by trying to allow for this, I break behavior expected for standard containers regarding move.
Also once you are in the realm of non-owning ranges, you have to make iterators movable by "hand".
I found empirically useful to distinguish top-level move(std::move) and element wise move (e.g. bag.mbegin() or bag.moved().begin()).
Otherwise I find my self overloading std::move which should be last resort if anything at all.
In other words, in
template<class MyRange>
void f(MyRange&& r) {
std::copy(std::forward<MyRange>(r).begin(), ..., ...);
}
the fact that r is bound to an r-value doesn't necessarily mean that the elements can be moved, because MyRange can simply be a non-owning view of a larger container that was "just" generated.
Therefore in general you need an external mechanism to detect if MyRange owns the values or not, and not just detecting the "value category" of *std::forward<MyRange>(r).begin() as you propose.
I guess with ranges one can hope in the future to indicate deep moves with some kind of adaptor-like thing "std::ranges::moved_range" or use the 3-argument std::move.
If the question is whether to use std::move or std::copy (or the ranges:: equivalents), the answer is simple: always use copy. If the range given to you has rvalue elements (i.e., its ranges::range_reference_t is either kind(!) of rvalue), you will move from them anyway (so long as the destination supports move assignment).
move is a convenience for when you own the range and decide to move from its elements.
The answer of the question is: IMPOSSIBLE. At least for the current containers of STL.
Assume if we could add some limitations for Container Requirements?
Add a static constant isContainer, and make a RangeTraits. This may work well, but not an elegant solution I want.
Inspired by #alfC , I'm considering the proper behaviour of a r-value container itself, which may help for making a concept (C++20).
There is an approach to distinguish the difference between a container and range adapter, actually, though it cannot be detected due to the defect in current implementation, but not of the syntax design.
First of all, lifetime of elements cannot exceed its container, and is unrelated with a range adapter.
That means, retrieving an element's address (by iterator or reference) from a r-value container, is a wrong behaviour.
One thing is often neglected in post-11 epoch, ref-qualifier.
Lots of existing member functions, like std::vector::swap, should be marked as l-value qualified:
auto getVec() -> std::vector<int>;
//
std::vector<int> vi1;
//getVec().swap(vi1); // pre-11 grammar, should be deprecated now
vi1 = getVec(); // move-assignment since C++11
For the reasons of compatibility, however, it hasn't been adopted. (It's much more confusing the ref-qualifier hasn't been widely applied to newly-built ones like std::array and std::forward_list..)
e.g., it's easy to implement the subscript operator as we expected:
template <typename T>
class MyArray {
T* _items;
size_t _size;
/* ... */
public:
T& operator [](size_t index) & {
return _items[index];
}
const T& operator [](size_t index) const& {
return _items[index];
}
T operator [](size_t index) && {
// not return by `T&&` !!!
return std::move(_items[index]);
}
// or use `deducing this` since C++23
};
Ok, then std::move(container)[index] would return the same result as std::move(container[index]) (not exactly, may increase an additional move operation overhead), which is convenient when we try to forward a container.
However, how about begin and end?
template <typename T>
class MyArray {
T* _items;
size_t _size;
/* ... */
class iterator;
class const_iterator;
using move_iterator = std::move_iterator<iterator>;
public:
iterator begin() & { /*...*/ }
const_iterator begin() const& { /*...*/ }
// may works well with x-value, but pr-value?
move_iterator begin() && {
return std::make_move_iterator(begin());
}
// or more directly, using ADL
};
So simple, like that?
No! Iterator will be invalidated after destruction of container. So deferencing an iterator from a temporary (pr-value) is undefined behaviour!!
auto getVec() -> std::vector<int>;
///
auto it = getVec().begin(); // Noooo
auto item = *it; // undefined behaviour
Since there's no way (for programmer) to recognize whether an object is pr-value or x-value (both will be duduced into T), retrieving iterator from a r-value container should be forbidden.
If we could regulate behaviours of Container, explicitly delete the function that obtain iterator from a r-value container, then it's possible to detect it out.
A simple demo is here:
https://godbolt.org/z/4zeMG745f
From my perspective, banning such an obviously wrong behaviour may not be so destructive that lead well-implemented old projects failing to compile.
Actually, it just requires some lines of modification for each container, and add proper constraints or overloads for range access utilities like std::begin/std::ranges::begin.
I can't seem to get around this specific problem for some time now.
For example if I have the following code:
void foo(std::vector<int>::iterator &it) {
// ...
}
int main(){
std::vector<int> v{1,2,3};
foo(v.begin());
}
I would get compile error:
initial value of reference to non-const must be an lvalue.
And my guess would be that I get the error because a.begin() returns a rvalue.
If so how is it possible that the following expression works:
v.begin()=v.begin()++;
if v.begin() is a rvalue?
The reason is historical. In the initial days of the language, there was simply no way for user code to express that a type's copy-assignment operator should only work on l-values. This was only true for user-defined types of course; for in-built types assignment to an r-value has always been prohibited.
int{} = 42; // error
Consequently, for all types in the standard library, copy-assignment just "works" on r-values. I don't believe this ever does anything useful, so it's almost certainly a bug if you write this, but it does compile.
std::string{} = "hello"s; // ok, oops
The same is true for the iterator type returned from v.begin().
From C++11, the ability to express this was added in the language. So now one can write a more sensible type like this:
struct S
{
S& operator=(S const &) && = delete;
// ... etc
};
and now assignment to r-values is prohibited.
S{} = S{}; // error, as it should be
One could argue that all standard library types should be updated to do the sensible thing. This might require a fair amount of rewording, as well as break existing code, so this might not be changed.
I can see why the auto type in C++11 improves correctness and maintainability. I've read that it can also improve performance (Almost Always Auto by Herb Sutter), but I miss a good explanation.
How can auto improve performance?
Can anyone give an example?
auto can aid performance by avoiding silent implicit conversions. An example I find compelling is the following.
std::map<Key, Val> m;
// ...
for (std::pair<Key, Val> const& item : m) {
// do stuff
}
See the bug? Here we are, thinking we're elegantly taking every item in the map by const reference and using the new range-for expression to make our intent clear, but actually we're copying every element. This is because std::map<Key, Val>::value_type is std::pair<const Key, Val>, not std::pair<Key, Val>. Thus, when we (implicitly) have:
std::pair<Key, Val> const& item = *iter;
Instead of taking a reference to an existing object and leaving it at that, we have to do a type conversion. You are allowed to take a const reference to an object (or temporary) of a different type as long as there is an implicit conversion available, e.g.:
int const& i = 2.0; // perfectly OK
The type conversion is an allowed implicit conversion for the same reason you can convert a const Key to a Key, but we have to construct a temporary of the new type in order to allow for that. Thus, effectively our loop does:
std::pair<Key, Val> __tmp = *iter; // construct a temporary of the correct type
std::pair<Key, Val> const& item = __tmp; // then, take a reference to it
(Of course, there isn't actually a __tmp object, it's just there for illustration, in reality the unnamed temporary is just bound to item for its lifetime).
Just changing to:
for (auto const& item : m) {
// do stuff
}
just saved us a ton of copies - now the referenced type matches the initializer type, so no temporary or conversion is necessary, we can just do a direct reference.
Because auto deduces the type of the initializing expression, there is no type conversion involved. Combined with templated algorithms, this means that you can get a more direct computation than if you were to make up a type yourself – especially when you are dealing with expressions whose type you cannot name!
A typical example comes from (ab)using std::function:
std::function<bool(T, T)> cmp1 = std::bind(f, _2, 10, _1); // bad
auto cmp2 = std::bind(f, _2, 10, _1); // good
auto cmp3 = [](T a, T b){ return f(b, 10, a); }; // also good
std::stable_partition(begin(x), end(x), cmp?);
With cmp2 and cmp3, the entire algorithm can inline the comparison call, whereas if you construct a std::function object, not only can the call not be inlined, but you also have to go through the polymorphic lookup in the type-erased interior of the function wrapper.
Another variant on this theme is that you can say:
auto && f = MakeAThing();
This is always a reference, bound to the value of the function call expression, and never constructs any additional objects. If you didn't know the returned value's type, you might be forced to construct a new object (perhaps as a temporary) via something like T && f = MakeAThing(). (Moreover, auto && even works when the return type is not movable and the return value is a prvalue.)
There are two categories.
auto can avoid type erasure. There are unnamable types (like lambdas), and almost unnamable types (like the result of std::bind or other expression-template like things).
Without auto, you end up having to type erase the data down to something like std::function. Type erasure has costs.
std::function<void()> task1 = []{std::cout << "hello";};
auto task2 = []{std::cout << " world\n";};
task1 has type erasure overhead -- a possible heap allocation, difficulty inlining it, and virtual function table invocation overhead. task2 has none. Lambdas need auto or other forms of type deduction to store without type erasure; other types can be so complex that they only need it in practice.
Second, you can get types wrong. In some cases, the wrong type will work seemingly perfectly, but will cause a copy.
Foo const& f = expression();
will compile if expression() returns Bar const& or Bar or even Bar&, where Foo can be constructed from Bar. A temporary Foo will be created, then bound to f, and its lifetime will be extended until f goes away.
The programmer may have meant Bar const& f and not intended to make a copy there, but a copy is made regardless.
The most common example is the type of *std::map<A,B>::const_iterator, which is std::pair<A const, B> const& not std::pair<A,B> const&, but the error is a category of errors that silently cost performance. You can construct a std::pair<A, B> from a std::pair<const A, B>. (The key on a map is const, because editing it is a bad idea)
Both #Barry and #KerrekSB first illustrated these two principles in their answers. This is simply an attempt to highlight the two issues in one answer, with wording that aims at the problem rather than being example-centric.
The existing three answers give examples where using auto helps “makes it less likely to unintentionally pessimize” effectively making it "improve performance".
There is a flip side to the the coin. Using auto with objects that have operators that don't return the basic object can result in incorrect (still compilable and runable) code. For example, this question asks how using auto gave different (incorrect) results using the Eigen library, i.e. the following lines
const auto resAuto = Ha + Vector3(0.,0.,j * 2.567);
const Vector3 resVector3 = Ha + Vector3(0.,0.,j * 2.567);
std::cout << "resAuto = " << resAuto <<std::endl;
std::cout << "resVector3 = " << resVector3 <<std::endl;
resulted in different output. Admittedly, this is mostly due to Eigens lazy evaluation, but that code is/should be transparent to the (library) user.
While performance hasn't been greatly affected here, using auto to avoid unintentional pessimization might be classified as premature optimization, or at least wrong ;).
Inside an algorithm, I want to create a lambda that accepts an element by reference-to-const:
template<typename Iterator>
void solve_world_hunger(Iterator it)
{
auto lambda = [](const decltype(*it)& x){
auto y = x; // this should work
x = x; // this should fail
};
}
The compiler does not like this code:
Error: »const«-qualifier cannot be applied to »int&« (translated manually from German)
Then I realized that decltype(*it) is already a reference, and of course those cannot be made const. If I remove the const, the code compiles, but I want x = x to fail.
Let us trust the programmer (which is me) for a minute and get rid of the const and the explicit &, which gets dropped due to reference collapsing rules, anyways. But wait, is decltype(*it) actually guaranteed to be a reference, or should I add the explicit & to be on the safe side?
If we do not trust the programmer, I can think two solutions to solve the problem:
(const typename std::remove_reference<decltype(*it)>::type& x)
(const typename std::iterator_traits<Iterator>::value_type& x)
You can decide for yourself which one is uglier. Ideally, I would want a solution that does not involve any template meta-programming, because my target audience has never heard of that before. So:
Question 1: Is decltype(*it)& always the same as decltype(*it)?
Question 2: How can I pass an element by reference-to-const without template meta-programming?
Question 1: no, the requirement on InputIterator is merely that *it is convertible to T (table 72, in "Iterator requirements").
So decltype(*it) could for example be const char& for an iterator whose value_type is int. Or it could be int. Or double.
Using iterator_traits is not equivalent to using decltype, decide which you want.
For the same reason, auto value = *it; does not necessarily give you a variable with the value type of the iterator.
Question 2: might depend what you mean by template meta-programming.
If using a traits type is TMP, then there's no way of specifying "const reference to the value type of an iterator" without TMP, because iterator_traits is the only means to access the value type of an arbitrary iterator.
If you want to const-ify the decltype then how about this?
template<typename Iterator>
void solve_world_hunger(Iterator it)
{
const auto ret_type = *it;
auto lambda = [](decltype(ret_type)& x){
auto y = x; // this should work
x = x; // this should fail
};
}
You might have to capture ret_type in order to use its type, I can't easily check at the moment.
Unfortunately it dereferences the iterator an extra time. You could probably write some clever code to avoid that, but the clever code would end up being an alternative version of remove_reference, hence TMP.
I have read Effective C++ 3rd Edition written by Scott Meyers.
Item 3 of the book, "Use const whenever possible", says if we want to prevent rvalues from being assigned to function's return value accidentally, the return type should be const.
For example, the increment function for iterator:
const iterator iterator::operator++(int) {
...
}
Then, some accidents is prevented.
iterator it;
// error in the following, same as primitive pointer
// I wanted to compare iterators
if (it++ = iterator()) {
...
}
However, iterators such as std::vector::iterator in GCC don't return const values.
vector<int> v;
v.begin()++ = v.begin(); // pass compiler check
Are there some reasons for this?
I'm pretty sure that this is because it would play havoc with rvalue references and any sort of decltype. Even though these features were not in C++03, they have been known to be coming.
More importantly, I don't believe that any Standard function returns const rvalues, it's probably something that wasn't considered until after the Standard was published. In addition, const rvalues are generally not considered to be the Right Thing To Do™. Not all uses of non-const member functions are invalid, and returning const rvalues is blanketly preventing them.
For example,
auto it = ++vec.begin();
is perfectly valid, and indeed, valid semantics, if not exactly desirable. Consider my class that offers method chains.
class ILikeMethodChains {
public:
int i;
ILikeMethodChains& SetSomeInt(int param) {
i = param;
return *this;
}
};
ILikeMethodChains func() { ... }
ILikeMethodChains var = func().SetSomeInt(1);
Should that be disallowed just because maybe, sometimes, we might call a function that doesn't make sense? No, of course not. Or how about "swaptimization"?
std::string func() { return "Hello World!"; }
std::string s;
func().swap(s);
This would be illegal if func() produced a const expression - but it's perfectly valid and indeed, assuming that std::string's implementation does not allocate any memory in the default constructor, both fast and legible/readable.
What you should realize is that the C++03 rvalue/lvalue rules frankly just don't make sense. They are, effectively, only part-baked, and the minimum required to disallow some blatant wrongs whilst allowing some possible rights. The C++0x rvalue rules are much saner and much more complete.
If it is non-const, I expect *(++it) to give me mutable access to the thing it represents.
However, dereferencing a const iterator yields only non-mutable access to the thing it represents. [edit: no, this is wrong too. I really give up now!]
This is the only reason I can think of.
As you rightly point out, the following is ill-formed because ++ on a primitive yields an rvalue (which can't be on the LHS):
int* p = 0;
(p++)++;
So there does seem to be something of an inconsistency in the language here.
EDIT: This is not really answering the question as pointed in the comments. I'll just leave the post here in the case it's useful anyhow...
I think this is pretty much a matter of syntax unification towards a better usable interface. When providing such member functions without differentiating the name and letting only the overload resolution mechanism determine the correct version you prevent (or at least try to) the programmer from making const related worries.
I know this might seem contradictory, in particular given your example. But if you think on most of the use cases it makes sense. Take an STL algorithm like std::equal. No matter whether your container is constant or not, you can always code something like bool e = std::equal(c.begin(), c.end(), c2.begin()) without having to think on the right version of begin and end.
This is the general approach in the STL. Remember of operator[]... Having in the mind that the containers are to be used with the algorithms, this is plausible. Although it's also noticeable that in some cases you might still need to define an iterator with a matching version (iterator or const_iterator).
Well, this is just what comes up to my mind right now. I'm not sure how convincing it is...
Side note: The correct way to use constant iterators is through the const_iterator typedef.