Const correctness in generic functions using iterators

Const correctness in generic functions using iterators - c++

I want to write a generic functions that takes in a sequence, while guaranteeing to not alter said sequence.
template<typename ConstInputIter, typename OutputIter>
OutputIter f(ConstInputIter begin, ConstInputIter end, OutputIter out)
{
InputIter iter = begin;
do
{
*out++ = some_operation(*iter);
}while(iter!=end);
return out;
}
Yet the above example still would take any type as ConstInputIterator, not just const ones. So far, the notion towards being const in it is nominal.
How do I declare the sequence given will not be altered by this function?

Even in C++20, there is no generic way to coerce an iterator over a non-const T into an iterator over a T const. Particular iterators may have a mechanism to do that, and you can use std::cbegin/cend for ranges to get const iterators. But given only an iterator, you are at the mercy of what the user provides.
Applying a C++20 constraint (requiring iter_value_t to be const) is the wrong thing, as your function should be able to operate on a non-const range.

Since c++17, you can use std::as_const every time you deference your iterator:
// ...
*out++ = some_operation(std::as_const(*iter));
// ...
This makes sure you never access the elements of the underlying container through a non-const l-value reference.
Note: if you don't have access to c++17, it's pretty trivial to implement your own version of std::as_const. Just make sure you declare it outside of namespace std.

Related

Why can't I use istream_view and std::accumulate to sum up my input?

This program:
#include <ranges>
#include <numeric>
#include <iostream>
int main() {
auto rng = std::ranges::istream_view<int>(std::cin);
std::cout << std::accumulate(std::ranges::begin(rng), std::ranges::end(rng), 0);
}
is supposed to sum up all integers appearing as text on the standard input stream. But - it doesn't compile. I know std::ranges::begin() and std::ranges::end() exist, so what's going on? The compiler tells me it can't find a suitable candidate; why?

From inception up through C++17, everything in <algorithm> is based on iterator pairs: you have one iterator referring to the beginning of a range and one iterator referring to the end of the range, always having the same type.
In C++20, this was generalized. A range is now denoted by an iterator and a sentinel for that iterator - where the sentinel itself need not actually be an iterator of any kind, it just needs to be a type that can compare equal to its corresponding iterator (this is the sentinel_for concept).
C++17 ranges tend to be† valid C++20 ranges, but not necessarily in the opposite direction. One reason is the ability to have a distinct sentinel type, but there are others, which also play into this question (see below).
To go along with the new model, C++20 added a large amount of algorithms into the std::ranges namespace that take an iterator and a sentinel, rather than two iterators. So for instance, while we've always had:
template<class InputIterator, class T>
constexpr InputIterator find(InputIterator first, InputIterator last,
const T& value);
we now also have:
namespace ranges {
template<input_iterator I, sentinel_for<I> S, class T, class Proj = identity>
requires indirect_binary_predicate<ranges::equal_to, projected<I, Proj>, const T*>
constexpr I find(I first, S last, const T& value, Proj proj = {});
template<input_range R, class T, class Proj = identity>
requires indirect_binary_predicate<ranges::equal_to,
projected<iterator_t<R>, Proj>, const T*>
constexpr borrowed_iterator_t<R>
find(R&& r, const T& value, Proj proj = {});
}
The first overload here takes an iterator/sentinel pair and the second takes a range instead.
While a lot of algorithms added corresponding overloads into std::ranges, the ones in <numeric> were left out. There is a std::accumulate but there is no std::ranges::accumulate. As such, the only version we have available at the moment is one that takes an iterator-pair. Otherwise, you could just write:
auto rng = std::ranges::istream_view<int>(std::cin);
std::cout << std::ranges::accumulate(rng, 0);
Unfortunately, std::ranges::istream_view is one of the new, C++20 ranges whose sentinel type differs from its iterator type, so you cannot pass rng.begin() and rng.end() into std::accumulate either.
This leaves you with two options generally (three, if you include waiting for C++23, which will hopefully have a std::ranges::fold):
Write your own range-based and iterator-sentinel-based algorithms. Which for fold is very easy to do.
Or
There is a utility to wrap a C++20 range into a C++17-compatible one: views::common. So you could this:
auto rng = std::ranges::istream_view<int>(ints) | std::views::common;
std::cout << std::accumulate(rng.begin(), rng.end(), 0);
Except not in this specific case.
istream_view's iterators aren't copyable, and in C++17 all iterators must be. So there isn't really a way to provide C++17-compatible iterators based on istream_view. You need proper C++20-range support. The future std::ranges::fold will support move-only views and move-only iterators, but std::accumulate never can.
Which in this case, just leaves option 1.
†A C++20 iterator needs to be default-constructible, which was not a requirement of C++17 iterators. So a C++17 range with non-default-constructible iterators would not be a valid C++20 range.

The problem is that the end of a C++ range is not, in the general case, an iterator, but rather, a sentinel. A sentinel can have a different type than an iterator and admit fewer operations - as, generally speaking, you mostly need to compare against it to know you've reached the end of the range, and may not be allowed to just work with it like any iterator. For more about this distinction, read:
What's the difference between a sentinel and an end iterator?
Now, standard-library algorithms (including the ones in <numeric>) take pairs of iterators of the same type. In your example:
template< class InputIt, class T >
constexpr T accumulate( InputIt first, InputIt last, T init );
see? InputIt must be the type of both the beginning and end of the range. This can (probably) not even be "fixed" for the istream_view range, because the end of the standard input really isn't an iterator per se. (Although maybe you could bludgeon it into being an iterator and throw exceptions when doing irrelevant things with it.)
So, we would need either a new variant of std::accumulate or an std::ranges::accumulate, which we currently don't have. Or, of course, you could write that yourself, which should not be too difficult.
Edit: One last option, suggested by #RemyLebeau, is to use an std::istream_iterator instead:
std::cout << std::accumulate(
std::istream_iterator<int>(std::cin),
std::istream_iterator<int>(),
0);

How to use `boost::range` iterators with standard iterators

I have functions that take in std::vector iterators, as in
typedef std::vector<Point> Points;
Points ConvexHull(Points::const_iterator first, Points::const_iterator last);
I usually pass the std iterators to them, but occasionally I need to work with boost iterators, such as boost::join's range iterator. How should I change the parametrizations of my functions, ideally without templates, so that they accept both iterators? Moreover, how do I indicate in each type which iterator concepts I need?
I tried looking at the boost::range documentation but it's overwhelmingly confusing for me and I don't know where to start.
For example, I couldn't find the difference between boost::range_details::any_forward_iterator_interface and boost::range_details::any_forward_iterator_wrapper, and whether I should use either of those to specify that I need a forward iterator.
Edit:
If I use boost::any_range, how can I pass non-const lvalue references?
For example:
template<typename T>
using Range = boost::any_range<T, boost::random_access_traversal_tag,
T, std::ptrdiff_t>;
f(Range<Point> &points); // defined elsewhere
// -------------
vector<Point> vec;
f(vec); // error; cannot bind non-const lvalue reference to unrelated type

boost-range has the any_range for this purpose and it suits both purposes for your case.
https://www.boost.org/doc/libs/1_60_0/libs/range/doc/html/range/reference/ranges/any_range.html
From your example it would look like this:
#include <boost/range/any_range.hpp>
typedef boost::any_range<Point,
boost::bidirectional_traversal_tag,
Point,
std::ptrdiff_t
> PointRange;

You should strongly consider using a template. Doing so let's the compiler keep useful information about what operations are actually occurring, which greatly helps it generate optimised output. The std:: convention is to name the type parameter for the concept required. E.g.
template< class BidirIt, class UnaryPredicate > // anything bidirectional (which includes random access)
BidirIt std::partition( BidirIt first, BidirIt last, UnaryPredicate p );
If you really don't want a template, you still shouldn't name anything in a detail namespace. Something like
#include <boost/range/any_range.hpp>
using PointRange = boost::any_range<Point, boost::random_access_traversal_tag>; // or another traversal tag.
using PointIterator = PointRange::iterator;
You will likely need to pass PointRange & less frequently than, say, int *&. Almost always passing by value is the correct behaviour. It is cheap to copy, as it holds a begin and end iterator from the Range that it was constructed from, nothing more.

std::begin and R-values

Recently I was trying to fix a pretty difficult const-correctness compiler error. It initially manifested as a multi-paragraph template vomit error deep within Boost.Python.
But that's irrelevant: it all boiled down to the following fact: the C++11 std::begin and std::end iterator functions are not overloaded to take R-values.
The definition(s) of std::begin are:
template< class C >
auto begin( C& c ) -> decltype(c.begin());
template< class C >
auto begin( const C& c ) -> decltype(c.begin());
So since there is no R-value/Universal Reference overload, if you pass it an R-value you get a const iterator.
So why do I care? Well, if you ever have some kind of "range" container type, i.e. like a "view", "proxy" or a "slice" or some container type that presents a sub iterator range of another container, it is often very convenient to use R-value semantics and get non-const iterators from temporary slice/range objects. But with std::begin, you're out of luck because std::begin will always return a const-iterator for R-values. This is an old problem which C++03 programmers were often frustrated with back in the day before C++11 gave us R-values - i.e. the problem of temporaries always binding as const.
So, why isn't std::begin defined as:
template <class C>
auto begin(C&& c) -> decltype(c.begin());
This way, if c is constant we get a C::const_iterator and a C::iterator otherwise.
At first, I thought the reason was for safety. If you passed a temporary to std::begin, like so:
auto it = std::begin(std::string("temporary string")); // never do this
...you'd get an invalid iterator. But then I realized this problem still exists with the current implementation. The above code would simply return an invalid const-iterator, which would probably segfault when dereferenced.
So, why is std::begin not defined to take an R-value (or more accurately, a Universal Reference)? Why have two overloads (one for const and one for non-const)?

The above code would simply return an invalid const-iterator
Not quite. The iterator will be valid until the end of the full-expression that the temporary the iterator refers to was lexically created in. So something like
std::copy_n( std::begin(std::string("Hallo")), 2,
std::ostreambuf_iterator<char>(std::cout) );
is still valid code. Of course, in your example, it is invalidated at the end of the statement.
What point would there be in modifying a temporary or xvalue? That is probably one of the questions the designers of the range accessors had in mind when proposing the declarations. They didn't consider "proxy" ranges for which the iterators returned by .begin() and .end() are valid past its lifetime; Perhaps for the very reason that, in template code, they cannot be distinguished from normal ranges - and we certainly don't want to modify temporary non-proxy ranges, since that is pointless and might lead to confusion.
However, you don't need to use std::begin in the first place but could rather declare them with a using-declaration:
using std::begin;
using std::end;
and use ADL. This way you declare a namespace-scope begin and end overload for the types that Boost.Python (o.s.) uses and circumvent the restrictions of std::begin. E.g.
iterator begin(boost_slice&& s) { return s.begin(); }
iterator end (boost_slice&& s) { return s.end() ; }
// […]
begin(some_slice) // Calls the global overload, returns non-const iterator
Why have two overloads (one for const and one for non-const)?
Because we still want rvalues objects to be supported (and they cannot be taken by a function parameter of the form T&).

Is it possible to test whether two iterators point to the same object?

Say I'm making a function to copy a value:
template<class ItInput, class ItOutput>
void copy(ItInput i, ItOutput o) { *o = *i; }
and I would like to avoid the assignment if i and o point to the same object, since then the assignment is pointless.
Obviously, I can't say if (i != o) { ... }, both because i and o might be of different types and because they might point into different containers (and would thus be incomparable). Less obviously, I can't use overloaded function templates either, because the iterators might belong to different containers even though they have the same type.
My initial solution to this was:
template<class ItInput, class ItOutput>
void copy(ItInput i, ItOutput o)
{
if (&*o != static_cast<void const *>(&*i))
*o = *i;
}
but I'm not sure if this works. What if *o or *i actually returns an object instead of a reference?
Is there a way to do this generally?

I don't think that this is really necessary: if assignment is expensive, the type should define an assignment operator that performs the (relatively cheap) self assignment check to prevent doing unnecessary work. But, it's an interesting question, with many pitfalls, so I'll take a stab at answering it.
If we are to assemble a general solution that works for input and output iterators, there are several pitfalls that we must watch out for:
An input iterator is a single-pass iterator: you can only perform indirection via the iterator once per element, so, we can't perform indirection via the iterator once to get the address of the pointed-to value and a second time to perform the copy.
An input iterator may be a proxy iterator. A proxy iterator is an iterator whose operator* returns an object, not a reference. With a proxy iterator, the expression &*it is ill-formed, because *it is an rvalue (it's possible to overload the unary-&, but doing so is usually considered evil and horrible, and most types do not do this).
An output iterator can only be used for output; you cannot perform indirection via it and use the result as an rvalue. You can write to the "pointed to element" but you can't read from it.
So, if we're going to make your "optimization," we'll need to make it only for the case where both iterators are forward iterators (this includes bidirectional iterators and random access iterators: they're forward iterators too).
Because we're nice, we also need to be mindful of the fact that, despite the fact that it violates the concept requirements, many proxy iterators misrepresent their category because it is very useful to have a proxy iterator that supports random access over a sequence of proxied objects. (I'm not even sure how one could implement an efficient iterator for std::vector<bool> without doing this.)
We'll use the following Standard Library headers:
#include <iterator>
#include <type_traits>
#include <utility>
We define a metafunction, is_forward_iterator, that tests whether a type is a "real" forward iterator (i.e., is not a proxy iterator):
template <typename T>
struct is_forward_iterator :
std::integral_constant<
bool,
std::is_base_of<
std::forward_iterator_tag,
typename std::iterator_traits<T>::iterator_category
>::value &&
std::is_lvalue_reference<
decltype(*std::declval<T>())
>::value>
{ };
For brevity, we also define a metafunction, can_compare, that tests whether two types are both forward iterators:
template <typename T, typename U>
struct can_compare :
std::integral_constant<
bool,
is_forward_iterator<T>::value &&
is_forward_iterator<U>::value
>
{ };
Then, we'll write two overloads of the copy function and use SFINAE to select the right overload based on the iterator types: if both iterators are forward iterators, we'll include the check, otherwise we'll exclude the check and always perform the assignment:
template <typename InputIt, typename OutputIt>
auto copy(InputIt const in, OutputIt const out)
-> typename std::enable_if<can_compare<InputIt, OutputIt>::value>::type
{
if (static_cast<void const volatile*>(std::addressof(*in)) !=
static_cast<void const volatile*>(std::addressof(*out)))
*out = *in;
}
template <typename InputIt, typename OutputIt>
auto copy(InputIt const in, OutputIt const out)
-> typename std::enable_if<!can_compare<InputIt, OutputIt>::value>::type
{
*out = *in;
}
As easy as pie!

I think this may be a case where you may have to document some assumptions about the types you expect in the function and be content with not being completely generic.
Like operator*, operator& could be overloaded to do all sorts of things. If you're guarding against operator*, then you should consider operator& and operator!=, etc.
I would say that a good prerequisite to enforce (either through comments in the code or a concept/static_assert) is that operator* returns a reference to the object pointed to by the iterator and that it doesn't (or shouldn't) perform a copy. In that case, your code as it stands seems fine.

Your code, as is, is definitly not okay, or atleast not okay for all iterator categories.
Input iterators and output iterators are not required to be dereferenceable after the first time (they're expected to be single-pass) and input iterators are allowed to dereference to anything "convertible to T" (§24.2.3/2).
So, if you want to handle all kinds of iterators, I don't think you can enforce this "optimization", i.e. you can't generically check if two iterators point to the same object. If you're willing to forego input and output iterators, what you have should be fine. Otherwise, I'd stick with doing the copy in any case (I really don't think you have another option on this).

Write a helper template function equals that automatically returns false if the iterators are different types. Either that or do a specialization or overload of your copy function itself.
If they're the same type then you can use your trick of comparing the pointers of the objects they resolve to, no casting required:
if (&*i != &*o)
*o = *i;
If *i or *o doesn't return a reference, no problem - the copy will occur even if it didn't have to, but no harm will be done.

How do I escape the const_iterator trap when passing a const container reference as a parameter

I generally prefer constness, but recently came across a conundrum with const iterators that shakes my const attitude annoys me about them:
MyList::const_iterator find( const MyList & list, int identifier )
{
// do some stuff to find identifier
return retConstItor; // has to be const_iterator because list is const
}
The idea that I'm trying to express here, of course, is that the passed in list cannot/willnot be changed, but once I make the list reference const I then have to use 'const_iterator's which then prevent me from doing anything with modifing the result (which makes sense).
Is the solution, then, to give up on making the passed in container reference const, or am I missing another possibility?
This has always been my secret reservation about const: That even if you use it correctly, it can create issues that it shouldn't where there is no good/clean solution, though I recognize that this is more specifically an issue between const and the iterator concept.
Edit: I am very aware of why you cannot and should not return a non-const iterator for a const container. My issue is that while I want a compile-time check for my container which is passed in by reference, I still want to find a way to pass back the position of something, and use it to modify the non-const version of the list. As mentioned in one of the answers it's possible to extract this concept of position via "advance", but messy/inefficient.

If I understand what you're saying correctly, you're trying to use const to indicate to the caller that your function will not modify the collection, but you want the caller (who may have a non-const reference to the collection) to be able to modify the collection using the iterator you return. If so, I don't think there's a clean solution for that, unless the container provides a mechanism for turning a const interator into a non-const one (I'm unaware of a container that does this). Your best bet is probably to have your function take a non-const reference. You may also be able to have 2 overloads of your function, one const and one non-const, so that in the case of a caller who has only a const reference, they will still be able to use your function.

It's not a trap; it's a feature. (:-)
In general, you can't return a non-const "handle" to your data from a const method. For example, the following code is illegal.
class Foo
{
public:
int& getX() const {return x;}
private:
int x;
};
If it were legal, then you could do something like this....
int main()
{
const Foo f;
f.getX() = 3; // changed value of const object
}
The designers of STL followed this convention with const-iterators.
In your case, what the const would buy you is the ability to call it on const collections. In which case, you wouldn't want the iterator returned to be modifiable. But you do want to allow it to be modifiable if the collection is non-const. So, you may want two interfaces:
MyList::const_iterator find( const MyList & list, int identifier )
{
// do some stuff to find identifier
return retConstItor; // has to be const_iterator because list is const
}
MyList::iterator find( MyList & list, int identifier )
{
// do some stuff to find identifier
return retItor;
}
Or, you can do it all with one template function
template<typename T>
T find(T start, T end, int identifier);
Then it will return a non-const iterator if the input iterators are non-const, and a const_iterator if they are const.

What I've done with wrapping standard algorithms, is have a metaobject for determining the type of container:
namespace detail
{
template <class Range>
struct Iterator
{
typedef typename Range::iterator type;
};
template <class Range>
struct Iterator<const Range>
{
typedef typename Range::const_iterator type;
};
}
This allows to provide a single implementation, e.g of find:
template <class Range, class Type>
typename detail::Iterator<Range>::type find(Range& range, const Type& value)
{
return std::find(range.begin(), range.end(), value);
}
However, this doesn't allow calling this with temporaries (I suppose I can live with it).
In any case, to return a modifiable reference to the container, apparently you can't make any guarantees what your function does or doesn't do with the container. So this noble principle indeed breaks down: don't get dogmatic about it.
I suppose const correctness is more of a service for the caller of your functions, rather that some baby-sitting measure that is supposed to make sure you get your simple find function right.
Another question is: how would you feel if I defined a following predicate and then abused the standard find_if algorithm to increment all the values up to the first value >= 3:
bool inc_if_less_than_3(int& a)
{
return a++ < 3;
}
(GCC doesn't stop me, but I couldn't tell if there's some undefined behaviour involved pedantically speaking.)
1) The container belongs to the user. Since allowing modification through the predicate in no way harms the algorithm, it should be up to the caller to decide how they use it.
2) This is hideous!!! Better implement find_if like this, to avoid this nightmare (best thing to do, since, apparently, you can't choose whether the iterator is const or not):
template <class Iter, class Pred>
Iter my_find_if(Iter first, Iter last, Pred fun)
{
while (first != last
&& !fun( const_cast<const typename std::iterator_traits<Iter>::value_type &>(*first)))
++first;
return first;
}

Although I think your design is a little confusing (as others have pointed iterators allow changes in the container, so I don't see your function really as const), there's a way to get an iterator out of a const_iterator. The efficiency depends on the kind of iterators.
#include <list>
int main()
{
typedef std::list<int> IntList;
typedef IntList::iterator Iter;
typedef IntList::const_iterator ConstIter;
IntList theList;
ConstIter cit = function_returning_const_iter(theList);
//Make non-const iter point to the same as the const iter.
Iter it(theList.begin());
std::advance(it, std::distance<ConstIter>(it, cit));
return 0;
}

Rather than trying to guarantee that the list won't be changed using the const keyword, it is better in this case to guarantee it using a postcondition. In other words, tell the user via comments that the list won't be changed.
Even better would be using a template that could be instantiated for iterators or const_iterators:
template <typename II> // II models InputIterator
II find(II first, int identifier) {
// do stuff
return iterator;
}
Of course, if you're going to go to that much trouble, you might as well expose the iterators of MyList to the user and use std::find.

If you're changing the data directed by the iterator, you're changing the list.
The idea that I'm trying to express here, of course, is that the passed in list cannot/willnot be changed, but once I make the list reference const I then have to use 'cons_iterator's which then prevent me from doing anything with the result.
What is "dong anything"? Modifying the data? That's changing the list, which is contradictory to your original intentions. If a list is const, it (and "it" includes its data) is constant.
If your function were to return a non-const iterator, it would create a way of modifying the list, hence the list wouldn't be const.

You are thinking about your design in the wrong way. Don't use const arguments to indicate what the function does - use them to describe the argument. In this case, it doesn't matter that find() doesn't change the list. What matters is that find() is intended to be used with modifiable lists.
If you want your find() function to return a non-const iterator, then it enables the caller to modify the container. It would be wrong for it to accept a const container, because that would provide the caller with a hidden way of removing the const-ness of the container.
Consider:
// Your function
MyList::iterator find(const MyList& list, int identifier);
// Caller has a const list.
const MyList list = ...
// but your function lets them modify it.
*( find(list,3) ) = 5;
So, your function should take a non-const argument:
MyList::iterator find(MyList& list, int identifier);
Now, when the caller tries to use your function to modify her const list, she'll get a compilation error. That's a much better outcome.

If you're going to return a non-const accessor to the container, make the function non-const as well. You're admitting the possibility of the container being changed by a side effect.
This is a good reason the standard algorithms take iterators rather than containers, so they can avoid this problem.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Const correctness in generic functions using iterators - c++

Related

Why can't I use istream_view and std::accumulate to sum up my input?

How to use `boost::range` iterators with standard iterators

std::begin and R-values

Is it possible to test whether two iterators point to the same object?

How do I escape the const_iterator trap when passing a const container reference as a parameter

Categories

Resources