C++20 introduces std::common_reference. What is its purpose? Can someone give an example of using it?
common_reference came out of my efforts to come up with a conceptualization of STL's iterators that accommodates proxy iterators.
In the STL, iterators have two associated types of particular interest: reference and value_type. The former is the return type of the iterator's operator*, and the value_type is the (non-const, non-reference) type of the elements of the sequence.
Generic algorithms often have a need to do things like this:
value_type tmp = *it;
... so we know that there must be some relationship between these two types. For non-proxy iterators the relationship is simple: reference is always value_type, optionally const and reference qualified. Early attempts at defining the InputIterator concept required that the expression *it was convertible to const value_type &, and for most interesting iterators that is sufficient.
I wanted iterators in C++20 to be more powerful than this. For example, consider the needs of a zip_iterator that iterates two sequences in lock-step. When you dereference a zip_iterator, you get a temporary pair of the two iterators' reference types. So, zip'ing a vector<int> and a vector<double> would have these associated types:
zip iterator's reference : pair<int &, double &>
zip iterator's value_type: pair<int, double>
As you can see, these two types are not related to each other simply by adding top-level cv- and ref qualification. And yet letting the two types be arbitrarily different feels wrong. Clearly there is some relationship here. But what is the relationship, and what can generic algorithms that operate on iterators safely assume about the two types?
The answer in C++20 is that for any valid iterator type, proxy or not, the types reference && and value_type & share a common reference. In other words, for some iterator it there is some type CR which makes the following well-formed:
void foo(CR) // CR is the common reference for iterator I
{}
void algo( I it, iter_value_t<I> val )
{
foo(val); // OK, lvalue to value_type convertible to CR
foo(*it); // OK, reference convertible to CR
}
CR is the common reference. All algorithms can rely on the fact that this type exists, and can use std::common_reference to compute it.
So, that is the role that common_reference plays in the STL in C++20. Generally, unless you are writing generic algorithms or proxy iterators, you can safely ignore it. It's there under the covers ensuring that your iterators are meeting their contractual obligations.
EDIT: The OP also asked for an example. This is a little contrived, but imagine it's C++20 and you are given a random-access range r of type R about which you know nothing, and you want to sort the range.
Further imagine that for some reason, you want to use a monomorphic comparison function, like std::less<T>. (Maybe you've type-erased the range, and you need to also type-erase the comparison function and pass it through a virtual? Again, a stretch.) What should T be in std::less<T>? For that you would use common_reference, or the helper iter_common_reference_t which is implemented in terms of it.
using CR = std::iter_common_reference_t<std::ranges::iterator_t<R>>;
std::ranges::sort(r, std::less<CR>{});
That is guaranteed to work, even if range r has proxy iterators.
Consider the following simple range:
struct my_range {
int* begin();
int* end();
const int* data();
};
Although this class has a data() member, according to the definition of contiguous_range in [range.refinements]:
template<class T>
concept contiguous_range =
random_access_range<T> && contiguous_iterator<iterator_t<T>> &&
requires(T& t) {
{ ranges::data(t) } -> same_as<add_pointer_t<range_reference_t<T>>>;
};
ranges::data(t) directly calls my_range's member function data() and returns const int*, but since my_range::begin() returns int*, add_pointer_t<range_reference_t<my_range>> is int*. The last requirement is therefore not satisfied, and my_range is not a contiguous_range.
However, when I apply some range adaptors to my_range, the result can be a contiguous_range (godbolt):
random_access_range auto r1 = my_range{};
static_assert(!contiguous_range<my_range>);
contiguous_range auto r2 = r1 | std::views::take(1);
This is because take_view inherits from view_interface, and view_interface::data() only requires the derived type's iterator to model contiguous_iterator.
Since my_range::begin() returns int*, which models contiguous_iterator, view_interface::data() is instantiated and returns to_address(ranges::begin(derived)), which is int*. Both take_view::data() and begin() therefore return int*, so r2 satisfies the last requirement and models contiguous_range.
Here, the range adaptors seem to refine the range concept of the underlying range, converting a random_access_range into a contiguous_range. That seems dangerous, since ranges::data(r2) can now return a modifiable int* pointer:
std::same_as<const int*> auto d1 = r1.data();
std::same_as<int*> auto d2 = r2.data();
I don't know if this refinement is allowed? Can this be considered a defect of the standard? Or is there something wrong with the definition of my_range?
I would not consider this a defect. The iterator/range model treats iterators as the truth. Several kinds of ranges deliberately have more functionality than just being a pair of iterators of some kind. This is because said functionality is materially useful and users of a range of a particular kind should expect to be able to use it. But this also leaves open the possibility of defining an incoherent range: where the iterator concept is stronger than the range concept because the range lacks certain functionality expected of the iterator concept.
If someone creates an incoherent range type, any functionality that operates on such a range (views, but also algorithms) has 3 options:
Believe the iterators.
Believe the range.
Error out.
Now for algorithms, if the algorithm wants to use the data member, it makes sense that it will believe the range. That is where the member is after all.
But for a view, does it make sense to believe the range? Views don't store copies of ranges. After all, a range can be a container. They instead store and operate on iterators and sentinels. They therefore treat ranges as just a way to get an iterator/sentinel pair.
When a view defines itself as a particular range type, it therefore manufactures the ancillary functionality of the range from what it stores: the iterator/sentinel pair. And most such things are pretty simple to manufacture. The data member of a contiguous_range can be manufactured by using std::to_address (a requirement of being a contiguous_iterator) on the result of begin().
So when given an incoherent range, it would actually be harder to filter such things out based on the original range type. Particularly in light of view_interface, which only sees the new view type, not the range it is built from.
After all, not all views are built from other ranges. iota_view is a view, but it isn't built from anything. But its iterators are random access; so too is its range. single_view is likewise not built from a "range"; it treats a single object as a single-element contiguous range. And subrange is built from an iterator/sentinel pair, not a range.
So either views built from other ranges would have to have their own special view_interface... or you create circumstances where views are just better than their original ranges. Or you error out.
It should also be noted that the current behavior is 100% safe. No code will be functionally broken by having a view be stronger than the range it was built from. After all, your not-quite-contiguous_range type still provides non-const access to the elements. The user just has to work a bit harder for it.
I have trouble finding the semantics of the reference type trait of an iterator. Let's say I want to implement a chunk iterator, that, given a position into a range, will give me chunks of that range:
template<class T, int N>
class chunk_iterator {
public:
using reference = std::span<T,N>;
chunk_iterator(T* ptr): ptr(ptr) {}
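chunk_iterator& operator++() { ptr += N; return *this; } // pre-increment returns a reference to *this
reference operator*() const { return reference(ptr, N); } // span's (pointer, count) constructor is explicit for a fixed extent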
private:
T* ptr;
};
The problem that I see here is that std::span is a view-like thing, but it does not behave like a reference (say a std::array<T,N>& in this case). In particular, if I assign to a span, the assignment is shallow; it will not copy the value.
Is std::span a valid iterator::reference type? Are view and reference semantics explained in detail somewhere?
What should I do to solve my problem? Implement a span_ref with proper reference semantics? Is it already implemented in some library? Is a non-native reference type even allowed?
(note: solving the problem by storing a std::array<T,N> and returning a std::array<T,N>& in operator* is doable, but ugly, and if N is not known at compile time, storing instead a std::vector<T> with dynamic memory allocation is just plain wrong)
When talking about standard-compliant iterators, it depends on several things.
For conforming Iterators, it almost doesn't matter what the reference type is because the standard does not require any usage semantics for the reference type. But that also means nobody except you knows how to use your iterator.
For conforming Input Iterators, the reference type must meet the semantics specified. Notice that for LegacyInputIterator, the expression *it must be a reference that is usable as a reference with all the normal semantics, otherwise code that uses your iterator will not behave as expected. This means reading from a reference is akin to reading from a built-in reference. In particular, the following should do "normal" things:
auto value = *itr; // this should read a value
In this situation, a view type like span wouldn't work because span is more like a pointer than a reference: in the above snippet value would be a span, not whatever the span refers to.
For conforming Output Iterators, the reference type has no requirements. In fact, standard LegacyOutputIterators like std::back_insert_iterator have void as a reference type.
For conforming Forward Iterators and above, the standard actually requires the reference be a built-in reference. This is to support uses like below:
auto& ref = *itr;
auto ptr = &ref; // this must create a pointer pointing to the original object
auto& ref2 = *ptr; // this must create a second, equivalent reference
auto other = std::move( ref ); // this must do a "move", which may be the same as a copy
ref = other; // this must assign "other"'s value back into the referred-to object
If the above didn't work correctly, many of the standard algorithms wouldn't be possible to write generically.
Speaking to span specifically, it acts more like a pointer than a reference logically. It can be re-assigned to point to something else. Taking its address creates a pointer to the span, not a pointer to the container being spanned over. Calling std::move on a span copies the span, and doesn't move the contents of the spanned range. A built-in reference T& will only refer to one thing ever once it's been created.
Creating a non-conforming reference that actually works with standard algorithms would involve a family of types overloading operator*, operator->, and operator&, operator=, and std::move, and modeling pointers, lvalue references, and rvalue references.
The meaning of an iterator's reference type cannot be understood without comprehending its relationship to the iterator's value_type. An iterator is a construct that represents a position within a sequence of value_types. A reference is a mediator within this paradigm; it is a thing that acts like a value_type (const) &. Until you figure out what your value_type is going to be, you can't decide what your reference will need to look like.
What "acts like" means depends on what kind of iterator we're talking about.
For C++11, the InputIterator category requires that reference be a type which is implicitly convertible to a value_type. For the OutputIterator category, reference is required to be a type which is assignable from a value_type.
For all of the more restricted iterator categories (ForwardIterator and above), reference is required to be exactly one of value_type & (if you can write to the sequence) or value_type const & (if you can only read from the sequence).
Iterators where reference is not a value_type (const) & are often called proxy iterators, as the reference type typically acts as a "proxy" for the actual data stored in the sequence (assuming the iterator isn't just inventing values to begin with). Proxy iterators are often used for cases where the iterator doesn't iterate over a range of actual value_types, but simply pretends to. This could be the bitwise iterators of vector<bool> or an iterator that iterates over the sequence of integers on some half-open range [0, N).
But proxy iterator references have to act like language references to one degree or another. InputIterator references have to be implicitly convertible to the value_type. span<T, N> is not implicitly convertible to array<T, N> or any other container type that would be appropriate for a value_type. OutputIterator references have to be assignable from value_type. And while span<T, N> may be assignable from an array<T, N>, the assignment operation doesn't have the same meaning. To assign to an OutputIterator's reference ought to change the values stored within the sequence. And this doesn't.
In any case, you first need to invent a value_type that does what you need it to do. Then you need to build a proper reference type that acts like a reference. Lastly... well, you can't make your iterator a ForwardIterator or higher, because C++11 doesn't support proxy iterators of the most useful iterator categories. C++20's new formulation of iterators allows proxy iterators for anything that isn't a contiguous_iterator.
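As a rough sketch of what "a proper reference type that acts like a reference" could look like for the chunk case, assume the value_type is std::array<T, N>. The class chunk_ref below is hypothetical and covers only the two properties named above (convertible to the value_type, assignable from it with write-through semantics); a real C++20 proxy iterator would likely need more than this.

```cpp
#include <array>
#include <cstddef>

// Hypothetical proxy reference for a chunk of N elements.
// Assumes T is default-constructible and copy-assignable.
template <class T, std::size_t N>
class chunk_ref {
public:
    constexpr explicit chunk_ref(T* first) : first_(first) {}
    chunk_ref(const chunk_ref&) = default;

    // Reading: implicit conversion to the value_type copies the elements out.
    constexpr operator std::array<T, N>() const {
        std::array<T, N> out{};
        for (std::size_t i = 0; i < N; ++i) out[i] = first_[i];
        return out;
    }

    // Writing: assigning a value changes the elements stored in the sequence.
    constexpr chunk_ref& operator=(const std::array<T, N>& value) {
        for (std::size_t i = 0; i < N; ++i) first_[i] = value[i];
        return *this;
    }

    // Assigning from another proxy also copies elements instead of rebinding.
    constexpr chunk_ref& operator=(const chunk_ref& other) {
        for (std::size_t i = 0; i < N; ++i) first_[i] = other.first_[i];
        return *this;
    }

private:
    T* first_;
};

// Hypothetical demo: read the back chunk as a value, then write it to the front.
constexpr bool chunk_ref_writes_through() {
    int data[4] = {1, 2, 3, 4};
    chunk_ref<int, 2> front{data};
    chunk_ref<int, 2> back{data + 2};
    std::array<int, 2> copy = back;  // reads {3, 4}
    front = copy;                    // writes {3, 4} into data[0] and data[1]
    return data[0] == 3 && data[1] == 4 && data[2] == 3 && data[3] == 4;
}
```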
cppreference says that the iterators for the vector<bool> specialization are implementation defined and may not support traits like ForwardIterator (and therefore RandomAccessIterator).
cplusplus adds a mysterious "most":
The pointer and iterator types used by the container are not
necessarily neither pointers nor conforming iterators, although they
shall simulate most of their expected behavior.
I don't have access to the official specification. Are there any iterator behaviors guaranteed for the vector<bool> iterators?
More concretely, how would one write standards-compliant code to insert an item in the middle of a vector<bool>? The following works on several compilers that I tried:
std::vector<bool> v(4);
int k = 2;
v.insert(v.begin() + k, true);
Will it always?
The fundamental problem with vector<bool>'s iterators is that they are not ForwardIterators. C++14 [forward.iterators]/1 requires that ForwardIterators' reference type be T& or const T&, as appropriate.
Any function which takes a forward iterator over a range of Ts is allowed to do this:
T &t = *it;
t = //Some value.
However, vector<bool>'s reference types are not bool&; they're a proxy object that is convertible to and assignable from a bool. They act like a bool, but they are not a bool. As such, this code is illegal:
bool &b = *it;
It would be attempting to get an lvalue reference to a temporary created from the proxy object. That's not allowed.
Therefore, you cannot use vector<bool>'s iterators in any function that takes ForwardIterators or higher.
However, your code doesn't necessarily have to care about that. As long as you control what code you pass those vector<bool> iterators to, and you don't do anything that violates how they behave, then you're fine.
As far as their interface is concerned, they act like RandomAccessIterators, except for when they don't (see above). So you can offset them with integers with constant time complexity and so forth.
vector<bool> is fine, so long as you don't treat it like a vector that contains bools. Your code will work because it uses vector<bool>'s own interface, which it obviously accepts.
It would not work if you passed a pair of vector<bool> iterators to std::sort.
C++14 [vector.bool]/2:
Unless described below, all operations have the same requirements and
semantics as the primary vector template, except that operations
dealing with the bool value type map to bit values in the container
storage and allocator_traits::construct (20.7.8.2) is not used to
construct these values.
Since C++14 (N3657), the member function templates find, count, lower_bound, upper_bound, and equal_range of associative containers support heterogeneous comparison lookup, but at and operator[] have no equivalent member function templates. Why is that so?
Example :
std::map<std::string, int, std::less<>> m;
// ...
auto it = m.find("foo"); // does not construct an std::string
auto& v = m.at("foo"); // constructs a std::string
There are no logical reasons in principle for it. For example for operator[] a reasonable semantic could be
If the passed value is comparable with key_type do the search using it and convert to key_type only if needed (i.e. if the element is not found and the container is neither const nor accessed using a const reference).
If, in the previous case, the passed type is not convertible to key_type, the use of operator[] should simply not compile (as happens now).
If the passed type cannot be compared with key_type but can be converted to key_type then a temporary should be created immediately to do the search and possibly the insertion (like it's now).
Of course, there would have to be a requirement that, for a value x of type T and a key_type value y, x < y holds if and only if key_type(x) < y; otherwise the semantics would be nonsense (just as it would be nonsense for operator< to return a value based on a random source).
Unfortunately C++ template machinery is at the same time extremely complex and extremely weak and implementing the conversion to key_type for operator[] only when really necessary is probably more complex than it seems.
This machinery is, however, what the C++ community decided to condemn itself to use for metaprogramming, and until someone manages to get a decent implementation using only that, this reasonable requirement is probably not going to be in the standard (in the past the standard has mandated things that were fuzzily defined and/or basically impossible to implement, like template export, and it wasn't funny).
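Until such overloads exist, an at-like lookup that avoids constructing a key can be written on top of the heterogeneous find. The helper at_transparent below is a hypothetical free function, not a standard facility, and it assumes the map uses a transparent comparator such as std::less<>.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// Hypothetical helper: like at(), but the lookup goes through the
// heterogeneous find(), so no key_type temporary is created.
template <class Map, class Key>
auto& at_transparent(Map& m, const Key& key) {
    auto it = m.find(key);
    if (it == m.end()) {
        throw std::out_of_range("at_transparent: key not found");
    }
    return it->second;
}

using Dict = std::map<std::string, int, std::less<>>;

// The helper returns a reference to the mapped value, as at() does.
static_assert(std::is_same_v<
    decltype(at_transparent(std::declval<Dict&>(), std::string_view{})), int&>);

// Hypothetical usage: the string_view is compared against the stored keys
// directly; no std::string is built for the lookup.
inline int lookup_demo() {
    Dict m{{"foo", 1}, {"bar", 2}};
    return at_transparent(m, std::string_view{"foo"});
}
```

This covers only the at side of the question; the insert-if-missing behaviour of operator[] would still need a key_type to be constructed when the element is absent.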
I've recently begun to prefer the free functions std::next and std::prev to explicitly copying and incrementing/decrementing iterators. Now, I am seeing weird behavior in a pretty specific case, and I would appreciate any help demystifying it.
I have an interpolation/extrapolation function operating on a boost::any_range of some X_type. The full definition of the range type is:
boost::any_range <
const X_type,
boost::random_access_traversal_tag,
const X_type,
std::ptrdiff_t
>
The any_range, in this particular case, is assigned from an iterator_range holding two pointers to const X_type, which serves as an X_type view of about half of the data() area of a vector<char>.
Compiling my application in MSVC 2010, everything works just fine.
Compiling the same code in MinGW g++ 4.7.0, it seemed to hang in one particular location, which I've then narrowed down to this (slightly abbreviated):
// Previously ensured conditions:
// 1) xrange is nonempty;
// 2) yrange is the same size as xrange.
auto x_equal_or_greater =
std::lower_bound(std::begin(xrange),std::end(xrange),xval);
if (x_equal_or_greater == std::end(xrange))
{
return *yit_from_xit(std::prev(x_equal_or_greater),xrange,yrange);
}
Stepping through the code in gdb, I found out it wasn't getting stuck, just taking a very long time to return from the single std::prev call - which in libstdc++ is implemented in terms of std::advance and ultimately the += operator.
By merely replacing the return line with:
auto xprev=x_equal_or_greater;
--xprev;
return *yit_from_xit(xprev,xrange,yrange);
Performance is great again, and there's virtually no delay.
I am aware of the overhead of using type-erased iterators (those of any_range), but even so, are the two cases above really supposed to carry such different costs? Or am I doing something wrong?
Okay, after responding to SplinterOfChaos's comment, I realized something. The problem is in your use of any_range, in particular the third template argument, which makes the Reference type const X_type, a value rather than a reference. In the Boost iterator facade, when the reference is not a real reference, the iterator will either use std::input_iterator_tag or not provide an STL-equivalent tag.
This has to do with the fact that, strictly speaking, all forward, bidirectional, and random access STL iterators must use a real reference for their reference type. From 24.2.5 of the C++11 standard:
A class or a built-in type X satisfies the requirements of a forward iterator if
— X satisfies the requirements of an input iterator (24.2.3),
— X satisfies the DefaultConstructible requirements (17.6.3.1),
— if X is a mutable iterator, reference is a reference to T; if X is a const iterator, reference is a reference to const T,
— the expressions in Table 109 are valid and have the indicated semantics, and
— objects of type X offer the multi-pass guarantee, described below.
In this case, it's returning std::input_iterator_tag when queried for its iterator_category, which causes the call to std::prev() to veer into undefined behavior.
Either way, the solution is to change (if possible) your use of boost::any_range to the following:
boost::any_range <
const X_type,
boost::random_access_traversal_tag,
const X_type&,
std::ptrdiff_t
>
This will cause it to have an iterator_category of std::random_access_iterator_tag, and will perform the operation as you originally expected.
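One way to catch this kind of mismatch at compile time, rather than at run time, is to check the reported iterator_category before stepping backwards. The wrapper checked_prev below is a hypothetical helper, sketched on the assumption that the offending iterator reports std::input_iterator_tag as described above.

```cpp
#include <iterator>
#include <type_traits>

// Hypothetical wrapper around std::prev that refuses to compile for iterators
// whose iterator_category is weaker than bidirectional.
template <class It>
constexpr It checked_prev(It it) {
    using category = typename std::iterator_traits<It>::iterator_category;
    static_assert(std::is_base_of_v<std::bidirectional_iterator_tag, category>,
                  "std::prev needs at least a bidirectional iterator_category");
    return std::prev(it);
}

// Pointers report random_access_iterator_tag, so this compiles and steps back
// from one-past-the-end to the last element.
constexpr bool steps_back_once() {
    int values[3] = {10, 20, 30};
    return *checked_prev(values + 3) == 30;
}
```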