In C++, how to check whether a pointer lies within a range? - c++

Intuitively, to check whether pointer p lies in [a,b), one would write
a<=p && p<b
However, comparing pointers from two arrays results in unspecified behavior and thus we cannot safely say p is in [a,b) from this comparison.
Is there any way one can check for this with certainty?
(It would be better if it can be done for std::vector<T>::const_iterator, but I don't think it's feasible.)

Here's a partial solution. You can leverage the fact that the comparison would invoke unspecified behavior, and the fact that a core-constant-expression can't perform this operation:
template<typename T>
constexpr bool check(T *p, T *a, T *b)
{
return a <= p and p < b;
}
Now this function can be used like this:
int main()
{
int arr[5];
int arr_2[5];
constexpr bool b1 = check(arr + 1, arr, arr + 3); // ok
constexpr bool b2 = check(arr_2 + 1, arr, arr + 3); // error
}
Here's a demo.
This obviously works only if the pointer values are known at compile time. At run-time, there is no efficient way of doing this check.

The solution for pointers is to use the comparison objects defined in <functional>, like less/less_equal, etc.
From §20.8.5/8 of the c++17 standard [1]:
For templates greater, less, greater_equal, and less_equal, the specializations for any pointer type yield a total order, even if the built-in operators <, >, <=, >= do not.
So the solution for pointers would be:
template<typename T>
bool check(T *p, T *a, T *b)
{
return std::less_equal<T*>{}(a,p) && std::less<T*>{}(p,b);
}
Here's a working example using pointers.
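As a self-contained illustration of the std::less approach, here is a minimal sketch. The names in_pointer_range and in_pointer_range_example are hypothetical and chosen only for this example. The asserts only exercise pointers into a single array, because how pointers into unrelated arrays are ordered is up to the implementation.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper name. Returns true when p is ordered within
// [first, last) according to the total order std::less provides for pointers.
template <typename T>
bool in_pointer_range(const T* p, const T* first, const T* last)
{
    return std::less_equal<const T*>{}(first, p) && std::less<const T*>{}(p, last);
}

// Usage sketch: every pointer below refers into the same array.
inline void in_pointer_range_example()
{
    int arr[5] = {};
    assert(in_pointer_range(arr + 1, arr, arr + 3));   // inside [arr, arr + 3)
    assert(!in_pointer_range(arr + 3, arr, arr + 3));  // the end pointer is excluded
    assert(!in_pointer_range(arr + 4, arr, arr + 3));  // past the range, same array
}
```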
There is no such strict guarantee for iterators; however this can be worked around in c++20, since it provides std::to_address which can convert pointable objects to pointers. Note, however, that the behavior of doing this for the purpose of comparisons is only really well defined for contiguous iterators.
Since we know that std::vector iterators cover a contiguous range, we can use this to retrieve the underlying pointer (note: not dereference it, as this would be undefined behavior for the past-the-end pointer).
So for a std::vector<T>::const_iterator, a solution might look like:
template <typename T>
bool check(typename std::vector<T>::const_iterator p,
typename std::vector<T>::const_iterator a,
typename std::vector<T>::const_iterator b)
{
// T appears only in non-deduced contexts, so call this as check<int>(p, a, b).
// Delegate to the pointer check version defined above, for brevity
return check(std::to_address(p), std::to_address(a), std::to_address(b));
}
Here's a working example using iterators.
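Where std::to_address is not available (before C++20), a similar check can be sketched against the vector's storage through data(). This is only an illustration: points_into_vector is a hypothetical name, and it takes a raw pointer rather than an iterator. It does not apply to std::vector<bool>, which has no data().

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: does p point at an element currently stored in v?
template <typename T>
bool points_into_vector(const std::vector<T>& v, const T* p)
{
    const T* first = v.data();
    const T* last = first + v.size();  // one-past-the-end pointer, never dereferenced
    return std::less_equal<const T*>{}(first, p) && std::less<const T*>{}(p, last);
}

// Usage sketch.
inline void points_into_vector_example()
{
    std::vector<int> v{10, 20, 30};
    assert(points_into_vector(v, &v[1]));                 // an element of v
    assert(!points_into_vector(v, v.data() + v.size()));  // past-the-end is excluded
}
```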
[1] This same note exists all the way back to c++11 under §23.14.7/2, with similar wording.

If I understand you correctly, you want to check whether a vector iterator lies between two other vector iterators.
In that case you can use std::distance to compute the distances from vector.begin() to a, p and b, and then simply compare the integers that std::distance returns.

std::distance(first, last) from C++17 can be used for both, but the behavior is undefined if last is unreachable from first (e.g. a different range or an invalid iterator).
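The std::distance idea could be sketched as follows. iterator_in_range is a hypothetical name, and the sketch assumes, without checking, that p, a and b are all valid iterators into the same vector v. If that assumption does not hold, the behavior is undefined, as noted above.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Precondition (assumed, not checked): p, a and b are valid iterators into v.
template <typename T>
bool iterator_in_range(const std::vector<T>& v,
                       typename std::vector<T>::const_iterator p,
                       typename std::vector<T>::const_iterator a,
                       typename std::vector<T>::const_iterator b)
{
    // Turn each iterator into its index relative to the start of v.
    const auto ip = std::distance(v.cbegin(), p);
    const auto ia = std::distance(v.cbegin(), a);
    const auto ib = std::distance(v.cbegin(), b);
    return ia <= ip && ip < ib;  // compare plain integers
}

// Usage sketch.
inline void iterator_in_range_example()
{
    std::vector<int> v{1, 2, 3, 4, 5};
    assert(iterator_in_range(v, v.begin() + 1, v.begin(), v.begin() + 3));
    assert(!iterator_in_range(v, v.begin() + 3, v.begin(), v.begin() + 3));
}
```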

Related

Tuple wrapper that works with get, tie, and other tuple operations

I have written a fancy "zip iterator" that already fulfils many roles (can be used in for_each, copy loops, container iterator range constructors etc...).
Under all the template code to work around the pairs/tuples involved, it comes down to the dereference operator of the iterator returning a tuple/pair of references and not a reference to a tuple/pair.
I want my iterator to work with std::sort, so I need to be able to do swap(*iter1, *iter2) and have the underlying values switched in the original containers being iterated over.
The code and a small demo can be viewed here (it's quite a bit to get through): http://coliru.stacked-crooked.com/a/4fe23b4458d2e692
libstdc++'s sort uses std::iter_swap, which calls swap, but libc++'s, for example, does not and calls swap directly, so I would like a solution that uses swap as the customization point.
What I have tried (and gotten oooooh so close to working) is, instead of returning std::pair/std::tuple from operator* as I am doing now, returning a simple wrapper type. The intent is to have the wrapper behave as if it were a std::pair/std::tuple, and allow me to write a swap function for it.
It looked like this:
template<typename... ValueTypes>
struct TupleWrapper : public PairOrTuple_t<ValueTypes...>
{
using PairOrTuple_t<ValueTypes...>::operator=;
template<typename... TupleValueTypes>
operator PairOrTuple_t<TupleValueTypes...>() const
{
return static_cast<PairOrTuple_t<ValueTypes...>>(*this);
}
};
template<std::size_t Index, typename... ValueTypes>
decltype(auto) get(TupleWrapper<ValueTypes...>& tupleWrapper)
{
return std::get<Index>(tupleWrapper);
}
template<std::size_t Index, typename... ValueTypes>
decltype(auto) get(TupleWrapper<ValueTypes...>&& tupleWrapper)
{
return std::get<Index>(std::forward<TupleWrapper<ValueTypes...>>(tupleWrapper));
}
template<typename... ValueTypes,
std::size_t... Indices>
void swap(TupleWrapper<ValueTypes...> left,
TupleWrapper<ValueTypes...> right,
const std::index_sequence<Indices...>&)
{
(std::swap(std::get<Indices>(left), std::get<Indices>(right)), ...);
}
template<typename... ValueTypes>
void swap(TupleWrapper<ValueTypes...> left,
TupleWrapper<ValueTypes...> right)
{
swap(left, right, std::make_index_sequence<sizeof...(ValueTypes)>());
}
namespace std
{
template<typename... ValueTypes>
class tuple_size<utility::implementation::TupleWrapper<ValueTypes...>> : public tuple_size<utility::implementation::PairOrTuple_t<ValueTypes...>> {};
template<std::size_t Index, typename... ValueTypes>
class tuple_element<Index, utility::implementation::TupleWrapper<ValueTypes...>> : public tuple_element<Index, utility::implementation::PairOrTuple_t<ValueTypes...>> {};
}
Full code here: http://coliru.stacked-crooked.com/a/951cd639d95af130.
Returning this wrapper in operator* seems to compile (at least on GCC) but produces garbage.
On Clang's libc++, the std::tie fails to compile.
Two questions:
How can I get this to compile with libc++ (the magic seems to lie in the conversion operator of TupleWrapper?)
Why is the result wrong and what did I do wrong?
I know it's a lot of code, but well, I can't get it any shorter as all the tiny examples of swapping tuple wrappers worked fine for me.
1st problem
One of the issues is that the ZipIterator class does not satisfy the requirements of RandomAccessIterator.
std::sort requires RandomAccessIterators as its parameters
RandomAccessIterators must be BidirectionalIterators
BidirectionalIterators must be ForwardIterators
ForwardIterators have the condition that ::reference must be value_type& / const value_type&:
The type std::iterator_traits<It>::reference must be exactly
T& if It satisfies OutputIterator (It is mutable)
const T& otherwise (It is constant)
(where T is the type denoted by std::iterator_traits<It>::value_type)
which ZipIterator currently doesn't implement.
It works fine with std::for_each and similar functions that only require the iterator to satisfy the requirements of InputIterator / OutputIterator.
The reference type for an input iterator that is not also a LegacyForwardIterator does not have to be a reference type: dereferencing an input iterator may return a proxy object or value_type itself by value (as in the case of std::istreambuf_iterator).
tl;dr: ZipIterator can be used as an InputIterator / OutputIterator, but not as a ForwardIterator, which std::sort requires.
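The reference requirement quoted above can be checked at compile time. The trait below is a minimal sketch with a hypothetical name (has_true_reference). It tests only this one condition, not the full ForwardIterator requirements.

```cpp
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical trait: true when iterator_traits<It>::reference is exactly
// value_type& or const value_type&, as the quoted requirement demands.
template <typename It>
struct has_true_reference
{
    using value_type = typename std::iterator_traits<It>::value_type;
    using reference = typename std::iterator_traits<It>::reference;
    static constexpr bool value =
        std::is_same<reference, value_type&>::value ||
        std::is_same<reference, const value_type&>::value;
};

// Ordinary container iterators and raw pointers satisfy the condition.
static_assert(has_true_reference<std::vector<int>::iterator>::value, "");
static_assert(has_true_reference<std::vector<int>::const_iterator>::value, "");
static_assert(has_true_reference<int*>::value, "");
// Iterators that return a proxy or a value do not.
static_assert(!has_true_reference<std::vector<bool>::iterator>::value, "");
static_assert(!has_true_reference<std::istreambuf_iterator<char>>::value, "");
```

An iterator whose operator* returns a tuple of references by value would fall into the second group.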
2nd problem
As @T.C. pointed out in their comment, std::sort is allowed to move values out of the container and then later move them back in.
The type of dereferenced RandomIt must meet the requirements of MoveAssignable and MoveConstructible.
which ZipIterator currently can't handle (it never copies / moves the referenced objects), so something like this doesn't work as expected:
std::vector<std::string> vector_of_strings{"one", "two", "three", "four"};
std::vector<int> vector_of_ints{1, 2, 3, 4};
auto first = zipBegin(vector_of_strings, vector_of_ints);
auto second = first + 1;
// swap two values via a temporary
auto temp = std::move(*first);
*first = std::move(*second);
*second = std::move(temp);
// Result:
/*
two, 2
two, 2
three, 3
four, 4
*/
(test on Godbolt)
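The same effect can be reproduced without the ZipIterator by using std::tie, which also yields a tuple of references. This is a minimal sketch with hypothetical function names. The second function shows that the swap works once the temporary holds values instead of references.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <utility>

// The temporary is itself a tuple of references, so it aliases s1 and i1.
inline void swap_via_reference_tuple(std::string& s1, int& i1, std::string& s2, int& i2)
{
    auto first = std::tie(s1, i1);   // std::tuple<std::string&, int&>
    auto second = std::tie(s2, i2);
    auto temp = std::move(first);    // copies the references, not the values
    first = std::move(second);       // assigns through the references: s1 = s2, i1 = i2
    second = std::move(temp);        // temp refers to s1 and i1, which were just overwritten
}

// The temporary owns copies of the values, so the swap behaves as expected.
inline void swap_via_value_tuple(std::string& s1, int& i1, std::string& s2, int& i2)
{
    auto first = std::tie(s1, i1);
    auto second = std::tie(s2, i2);
    std::tuple<std::string, int> temp(first);  // copies the referenced values
    first = std::move(second);
    second = std::move(temp);
}

inline void reference_tuple_example()
{
    std::string a = "one", b = "two";
    int x = 1, y = 2;
    swap_via_reference_tuple(a, x, b, y);
    assert(a == "two" && b == "two" && x == 2 && y == 2);  // the first pair is lost

    std::string c = "one", d = "two";
    int u = 1, v = 2;
    swap_via_value_tuple(c, u, d, v);
    assert(c == "two" && d == "one" && u == 2 && v == 1);  // properly swapped
}
```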
Result
Unfortunately, it is not possible with the current standard to create an iterator that produces elements on the fly and can be used as a ForwardIterator (see, for example, this question).
You could of course write your own algorithms that only require InputIterators / OutputIterators (or handle your ZipIterator differently)
For example a simple bubble sort: (Godbolt)
template<class It>
void bubble_sort(It begin, It end) {
using std::swap;
int n = std::distance(begin, end);
for (int i = 0; i < n-1; i++) {
for (int j = 0; j < n-i-1; j++) {
if (*(begin+j) > *(begin+j+1))
swap(*(begin+j), *(begin+j+1));
}
}
}
Or change the ZipIterator class to satisfy RandomAccessIterator.
I unfortunately can't think of a way that would be possible without putting the tuples into a dynamically allocated structure like an array (which you're probably trying to avoid)
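If a dynamically allocated helper structure is acceptable, one way to sort two containers together without a zip iterator is to sort a permutation of indices and then apply it. This is a different technique from the ZipIterator in the question, sketched here under the assumption that both inputs are std::vector objects of equal size. sort_parallel is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Sorts keys in ascending order and reorders values the same way.
// Assumption: keys.size() == values.size().
template <typename K, typename V>
void sort_parallel(std::vector<K>& keys, std::vector<V>& values)
{
    assert(keys.size() == values.size());

    // order[i] is the index of the element that belongs at position i.
    std::vector<std::size_t> order(keys.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::sort(order.begin(), order.end(),
              [&keys](std::size_t l, std::size_t r) { return keys[l] < keys[r]; });

    // Apply the permutation by moving the elements into fresh vectors.
    std::vector<K> sorted_keys;
    std::vector<V> sorted_values;
    sorted_keys.reserve(keys.size());
    sorted_values.reserve(values.size());
    for (std::size_t i : order) {
        sorted_keys.push_back(std::move(keys[i]));
        sorted_values.push_back(std::move(values[i]));
    }
    keys = std::move(sorted_keys);
    values = std::move(sorted_values);
}
```

The index vector is an ordinary random-access range of std::size_t, so std::sort's iterator requirements are met without any proxy types.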

How to make idempotent taking a reference to a dereference of an iterator

The code below (-std=c++11) should work according to a "naive" view.
Instead it doesn't (and it should be known and understood why it doesn't).
What is the shortest way of modifying the code (overloading &) to make it behave according to the "naive" view?
Shouldn't that be available as an option during STL object creation (without writing too much)?
#include <iostream>
#include <vector>
int main(int argc, char **argv)
{ std::vector<int> A{10,20,30};
auto i=A.begin();
auto j=&*i;
std::cout<<"i==j gives "<<(i==j)<<std::endl;
return 0;
}
The problem cannot be solved. There are three reasons it cannot be solved.
First problem
The operator & you need to overload is the operator & for the element type of the vector. You cannot overload operator & for arbitrary types, and in particular you can't overload it for built-in types (like int in your example).
Second problem
Presumably you want this to work for std::vector, std::array, and built-in arrays? Also probably std::list, std::deque, etc? You can't. The iterators for each of those containers will be different (in practice; in theory, some of them could share iterators, but I am not aware of any standard library where they do).
Third problem
If you were prepared to accept that this would only work for std::vector<MyType>, then you could overload MyType::operator & - but you still couldn't work out which std::vector<MyType> the MyType object lives in (and you need that to obtain the iterator).
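To illustrate the last point: once the owning vector is known, turning an element pointer back into an iterator is simple pointer arithmetic. A minimal sketch follows. iterator_from_pointer is a hypothetical name, and the function assumes, without checking, that p points at an element of v or one past its last element.

```cpp
#include <cassert>
#include <vector>

// Precondition (assumed, not checked): p points into v's storage.
template <typename T>
typename std::vector<T>::iterator iterator_from_pointer(std::vector<T>& v, T* p)
{
    return v.begin() + (p - v.data());  // offset of p from the first element
}

// Usage sketch: the vector itself has to be supplied by the caller.
inline void iterator_from_pointer_example()
{
    std::vector<int> a{10, 20, 30};
    int* j = &a[1];
    assert(iterator_from_pointer(a, j) == a.begin() + 1);
}
```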
First off, in your code snippet i is deduced as std::vector<int>::iterator and j is deduced as int*. The compiler doesn't know how to compare std::vector<int>::iterator against int*.
For this to work out, you could provide an overloaded operator== that would compare vector iterators against vector value type pointers in the following manner:
template<typename T>
bool operator==(typename std::vector<T>::iterator it, T *i) {
return &(*it) == i;
}
template<typename T>
bool operator==(T *i, typename std::vector<T>::iterator it) {
return it == i;
}
Live Demo
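A named helper is an alternative to overloading operator== for standard library types. It performs the same address comparison and is only a sketch. refers_to is a hypothetical name, and the iterator passed in must be dereferenceable.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical helper: true when the (dereferenceable) iterator and the
// pointer designate the same object.
template <typename It, typename T>
bool refers_to(It it, const T* p)
{
    return std::addressof(*it) == p;
}

// Usage sketch based on the snippet from the question.
inline void refers_to_example()
{
    std::vector<int> a{10, 20, 30};
    auto i = a.begin();
    auto j = &*i;
    assert(refers_to(i, j));
    assert(!refers_to(i + 1, j));
}
```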
This shouldn't work, not even according to a "naive" view. Even though every pointer is an iterator, the reverse is not necessarily true. Why would you expect that to work?
It would work under two scenarios:
The iterator of the std::vector<T> implementation is actually a T*. Then your code would work since decltype(i) == int* and decltype(j) == int*. This MAY be the case for some compilers, but you shouldn't rely on it even if it is true for your compiler.
The dereference operator does not return an object of type T but rather something that is convertible to T and has an overloaded operator& which gives the iterator back. This is not the case for very good reasons.
You could, as others have suggested, overload operator== to check whether both indirections (pointer and iterator) reference the same object, but I suspect that you want the address-of operator to give you back the iterator, which cannot be accomplished if the iterator is not a pointer, because the object type stored in the vector has no notion of the vector or its iterators.
The problem isn't in the equality operator, what I need is to define the dereference operator to give an iterator
You can't. The dereference operator in question belongs to std::vector<int>::iterator, which is part of the standard library, and you cannot (and should not) manipulate it.
Note that since C++11 in a std::vector<T, A>,
value_type is T and
reference is T&.
Furthermore, the following is true:
All input iterators i support *i which gives a value of type T which is the value type of that iterator.
The iterator of std::vector<T> is required to have T as its value type.
An iterator of std::vector<T> is an input iterator.

Work around for vector<bool>, use basic_string<bool>?

Is this a safe workaround? I want to use vector bool but need to pass a pointer to old code expecting C-style array.
typedef std::basic_string<bool> vector_bool;
int main()
{
vector_bool ab;
ab.push_back(true);
ab.push_back(true);
ab.push_back(true);
ab.push_back(false);
bool *b = &ab[0];
b[1] = false;
}
Edit:
Thanks for suggestions of other solutions, but I would really like a definite answer on my above solution. Thanks.
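For context, the reason std::vector<bool> itself cannot hand out a bool* can be shown with a short sketch: its reference type is a proxy object rather than bool&. The function name below is hypothetical.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// For ordinary element types, reference is a real T&.
static_assert(std::is_same<std::vector<int>::reference, int&>::value, "");
// For bool it is a proxy class, so &v[0] does not yield a bool*.
static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value, "");

inline void vector_bool_proxy_example()
{
    std::vector<bool> v(4, true);
    auto r = v[1];  // r is a proxy object, not a bool
    r = false;      // assigning to the proxy writes through to the vector
    assert(!v[1]);
}
```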
I'm not sure about std::basic_string<bool> because that will instantiate std::char_traits<bool> and I'm not sure if the standard requires that to be defined, or if the char_traits primary template can be left undefined, with only explicit specializations such as char_traits<char> being defined. You're not allowed to provide your own specialization of char_traits<bool> because you can only specialize standard templates if the specialization depends on a user-defined type, which bool obviously isn't. That said, it might work if your stdlib does have a default char_traits definition, and you don't try to use any string operations that require members of char_traits to do anything useful.
Alternatively, this is hacky but might work:
struct boolish { bool value; };
inline boolish make_boolish(bool b) { boolish bish = { b }; return bish; }
std::vector<boolish> b;
b.push_back( make_boolish(true) );
bool* ptr = &b.front().value;
boolish is a trivial type, so as long as an array of boolish has the same representation as an array of bool (which you'd need to check for your compiler, I used a static_assert to check there is no padding) then you might get away with it, although it probably violates the aliasing rules because *ptr and *++ptr are not part of the same array, so incrementing the pointer doesn't point to the next boolish::value it points "past the end" of the previous one (even if those two locations actually have the same address, although [basic.compound]/3 does seem to say that ++ptr does "point to" the next bool).
The syntax gets a bit easier with C++11, you don't need make_boolish ...
#include <vector>
#include <assert.h>
struct boolish { bool value; };
int main()
{
std::vector<boolish> vec(10);
vec.push_back( boolish{true} );
bool* ptr = &vec.front().value;
assert( ptr[10] == true );
ptr[3] = true;
assert( vec[3].value == true );
static_assert( sizeof(boolish) == sizeof(bool), "" );
boolish test[10];
static_assert( sizeof(test) == (sizeof(bool)*10), "" );
}
From "Working Draft C++, 2012-11-02"
21.1 General [strings.general]
1 This Clause describes components for manipulating sequences of any non-array POD (3.9) type.
21.4.1 basic_string general requirements [string.require]
5 The char-like objects in a basic_string object shall be stored contiguously. That is, for any basic_string object s, the identity &*(s.begin() + n) == &*s.begin() + n shall hold for all values of n such that 0 <= n < s.size().
but
6 References, pointers, and iterators referring to the elements of a basic_string sequence may be invalidated by the following uses of that basic_string object:
- as an argument to any standard library function taking a reference to non-const basic_string as an argument.
- Calling non-const member functions, except operator[], at, front, back, begin, rbegin, end, and rend.
So you should be safe as long as you take care not to call these functions while the raw array is in use somewhere else.
Update:
Character traits and requirements are described in 21.2 Character traits [char.traits] and 21.2.1 Character traits requirements [char.traits.require]. Additionally, typedefs and specializations are described in 21.2.2 traits typedefs [char.traits.typedefs] and 21.2.3 char_traits specializations [char.traits.specializations] respectively.
These traits are used in the Input/output library as well. So there are requirements, like eof() or pos_type and off_type, which don't make sense in the context of basic_string.
I don't see any requirement for these traits to be actually defined by an implementation, besides the four specializations for char, char16_t, char32_t and wchar_t.
Although, it worked out of the box with gcc 4.7 with your example, I defined a minimal bool_traits with just
struct bool_traits {
typedef bool char_type;
static void assign(char_type &r, char_type d);
static char_type *copy(char_type *s, const char_type *p, std::size_t n);
static char_type *move(char_type *s, const char_type *p, std::size_t n);
};
took the default implementation provided (gcc 4.7), and used that like
std::basic_string<bool, bool_traits> ab;
Your environment might already provide a working implementation. If not, you can implement a simple bool_traits or a template specialization std::char_traits<bool> yourself.
You can see the complete interface for character traits in the Working Draft, PDF or at cppreference.com - std::char_traits.
You can also use boost::container::vector. It is exactly like std::vector but it's not specialized for bool.

How to correctly (yet efficiently) implement something like "vector::insert"? (Pointer aliasing)

Consider this hypothetical implementation of vector:
template<class T> // ignore the allocator
struct vector
{
typedef T* iterator;
typedef const T* const_iterator;
template<class It>
void insert(iterator where, It begin, It end)
{
...
}
...
}
Problem
There is a subtle problem we face here:
There is the possibility that begin and end refer to items in the same vector, after where.
For example, if the user says:
vector<int> items;
for (int i = 0; i < 1000; i++)
items.push_back(i);
items.insert(items.begin(), items.end() - 2, items.end() - 1);
If It is not a pointer type, then we're fine.
But we don't know, so we must check that [begin, end) does not refer to a range already inside the vector.
But how do we do this? According to C++, if they don't refer to the same array, then pointer comparisons would be undefined!
So the compiler could falsely tell us that the items don't alias, when in fact they do, giving us unnecessary O(n) slowdown.
Potential solution & caveat
One solution is to copy the entire vector every time, to include the new items, and then throw away the old copy.
But that's very slow in scenarios such as in the example above, where we'd be copying 1000 items just to insert 1 item, even though we might clearly already have enough capacity.
Is there a generic way to (correctly) solve this problem efficiently, i.e. without suffering from O(n) slowdown in cases where nothing is aliasing?
You can use the predicates std::less etc, which are guaranteed to give a total order, even when the raw pointer comparisons do not.
From the standard [comparisons]/8:
For templates greater, less, greater_equal, and less_equal, the specializations for any pointer type yield a total order, even if the built-in operators <, >, <=, >= do not.
But how do we do this? According to C++, if they don't refer to the same array, then pointer comparisons would be undefined!
Wrong. The pointer comparisons are unspecified, not undefined. From C++03 §5.9/2 [expr.rel]:
[...] Pointers to objects or functions of the same type (after pointer conversions) can be compared, with a result defined as follows:
[...]
-Other pointer comparisons are unspecified.
So it's safe to test if there is an overlap before doing the expensive-but-correct copy.
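Such an overlap test might look like the sketch below. It uses std::less rather than the built-in operators so that it also relies on the total-order guarantee quoted earlier. ranges_overlap is a hypothetical name, and both ranges are assumed to be non-empty and well-formed.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: reports whether the half-open ranges [first, last) and
// [storage_begin, storage_end) overlap. Both are assumed to be non-empty.
template <typename T>
bool ranges_overlap(const T* first, const T* last,
                    const T* storage_begin, const T* storage_end)
{
    std::less<const T*> before;
    // Two half-open ranges overlap when each one starts before the other ends.
    return before(first, storage_end) && before(storage_begin, last);
}

// Usage sketch: take the slow copying path only when this returns true.
inline void ranges_overlap_example()
{
    int storage[10] = {};
    // The source range lies inside the storage, as in items.insert(...) above.
    assert(ranges_overlap(storage + 8, storage + 9, storage, storage + 10));
    // Adjacent ranges that merely touch do not overlap.
    assert(!ranges_overlap(storage, storage + 3, storage + 3, storage + 10));
}
```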
Interestingly, C99 differs from C++ in this, in that pointer comparisons between unrelated objects are undefined behavior. From C99 §6.5.8/5:
When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. [...] In all other cases, the behavior is undefined.
Actually, this would be true even if they were regular iterators. There's nothing stopping anyone doing
std::vector<int> v;
// fill v
v.insert(v.end() - 3, v.begin(), v.end());
Determining if they alias is a problem for any implementation of iterators.
However, the thing you're missing is that you're the implementation, you don't have to use portable code. As the implementation, you can do whatever you want. You could say "Well, in my implementation, I follow x86 and < and > are fine to use for any pointers.". And that would be fine.

checking if pointer points within an array

Can I check whether or not a given pointer points to an object within an array, specified by its bounds?
template <typename T>
bool points_within_array(T* p, T* begin, T* end)
{
return begin <= p && p < end;
}
Or do the pointer comparisons invoke undefined behavior if p points outside the bounds of the array? In that case, how do I solve the problem? Does it work with void pointers? Or is it impossible to solve?
Although the comparison is valid only for pointers within the array and "one past the end", it is valid to use a set or map with a pointer as the key, which uses std::less<T*>
There was a big discussion on this way back in 1996 on comp.std.c++
Straight from the MSDN documentation:
Two pointers of different types cannot be compared unless:
One type is a class type derived from the other type.
At least one of the pointers is explicitly converted (cast) to type void *. (The other pointer is implicitly converted to type void * for the conversion.)
So a void* can be compared to anything else (including another void*). But will the comparison produce meaningful results?
If two pointers point to elements of the same array or to the element one beyond the end of the array, the pointer to the object with the higher subscript compares higher. Comparison of pointers is guaranteed valid only when the pointers refer to objects in the same array or to the location one past the end of the array.
Looks like not. If you don't already know that you are comparing items inside the array (or just past it), then the comparison is not guaranteed to be meaningful.
There is, however, a solution: The STL provides std::less<> and std::greater<>, which will work with any pointer type and will produce valid results in all cases:
if (std::less<T*>()(p, begin)) {
// p is out of bounds
}
Update:
The answer to this question gives the same suggestion (std::less) and also quotes the standard.
The only correct way to do this is an approach like this.
template <typename T>
bool points_within_array(T* p, T* begin, T* end)
{
for (; begin != end; ++begin)
{
if (p == begin)
return true;
}
return false;
}
Fairly obviously, this doesn't work if T == void. I'm not sure whether two void* technically define a range or not. Certainly if you had Derived[n], it would be incorrect to say that (Base*)Derived, (Base*)(Derived + n) defined a valid range so I can't see it being valid to define a range with anything other than a pointer to the actual array element type.
The method below fails because it is unspecified what < returns if the two operands don't point to members of the same object or elements of the same array. (5.9 [expr.rel] / 2)
template <typename T>
bool points_within_array(T* p, T* begin, T* end)
{
return !(p < begin) && (p < end);
}
The method below fails because it is also unspecified what std::less<T*>::operator() returns if the two operands don't point to members of the same object or elements of the same array.
It is true that a std::less must be specialized for any pointer type to yield a total order if the built in < does not but this is only useful for uses such as providing a key for a set or map. It is not guaranteed that the total order won't interleave distinct arrays or objects together.
For example, on a segmented memory architecture the object offset could be used for < and as the most significant differentiator for std::less<T*> with the segment index being used to break ties. In such a system an element of one array could be ordered between the bounds of a second distinct array.
template <typename T>
bool points_within_array(T* p, T* begin, T* end)
{
return !(std::less<T*>()(p, begin)) && (std::less<T*>()(p, end));
}
The C++ standard does not specify what happens when you compare pointers to objects that do not reside in the same array, hence undefined behaviour. However, the C++ standard is not the only standard your platform must conform to. Other standards, like POSIX, specify things that the C++ standard leaves as undefined behaviour.
On platforms with virtual address space like Linux and Win32/64 you can compare any pointers without causing any undefined behaviour.
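On such flat-address-space platforms, one commonly seen way to express this is to compare the addresses as integers. The sketch below assumes that std::uintptr_t exists and that the pointer-to-integer mapping is the plain linear address. Both are implementation-defined, so this is not portable C++. address_within is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the platform has a flat address space, so converting a pointer
// to std::uintptr_t yields its linear address.
inline bool address_within(const void* p, const void* begin, const void* end)
{
    const auto ip = reinterpret_cast<std::uintptr_t>(p);
    const auto ib = reinterpret_cast<std::uintptr_t>(begin);
    const auto ie = reinterpret_cast<std::uintptr_t>(end);
    return ib <= ip && ip < ie;  // ordinary integer comparison
}

// Usage sketch.
inline void address_within_example()
{
    char buffer[16] = {};
    assert(address_within(buffer + 4, buffer, buffer + 16));
    assert(!address_within(buffer + 16, buffer, buffer + 16));
}
```

Taking const void* parameters also covers the void pointer case raised in the question.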
Comparisons on pointer types don't necessarily result in a total order. std::less/std::greater_equal do, however. So ...
template <typename T>
bool points_within_array(T* p, T* begin, T* end)
{
return std::greater_equal<T*>()(p, begin) && std::less<T*>()(p, end);
}
will work.
Could you not do this with std::distance, i.e. your problem effectively boils down to:
return distance(begin, p) >= 0 && distance(begin, p) < distance(begin, end);
Given this random access iterator (pointer) is being passed in, it should boil down to some pointer arithmetic rather than pointer comparisons? (I'm assuming end really is end and not the last item in the array, if the last then change the less than to <=).
I could be way off the mark...