I'd like to pass a single lvalue to a function which expects a pair of iterators, and for it to act as if I'd passed a pair of iterators to a range containing just this value.
My approach is as follows:
#include <iostream>
#include <iterator> // std::next
#include <vector>
template<typename Iter>
void iterate_over(Iter begin, Iter end){
for(auto i = begin; i != end; ++i){
std::cout << *i << std::endl;
}
}
int main(){
std::vector<int> a{1,2,3,4};
iterate_over(a.cbegin(), a.cend());
int b = 5;
iterate_over(&b, std::next(&b));
}
This appears to work correctly in g++5.2, but I'm wondering if this is actually defined behaviour and if there are any potential issues?
Yes, this is defined behavior. First, from [expr.add]/4 we have:
For the purposes of these operators, a pointer to a nonarray object behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.
So a single object is treated as an array of length 1. Then we have [expr.add]/5
[...]Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
Emphasis mine
So since the first array element is also the last array element, and adding 1 to a pointer to the last element gives you the one-past-the-end pointer, it is legal.
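To make that concrete, here is a minimal sketch of wrapping the idea in a helper. The names single_range, join_range and join_single are made up for illustration and are not from the question; join_range has the same loop shape as iterate_over but collects the output in a string so it can be checked.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: returns the pointer pair [&value, &value + 1),
// i.e. a range covering exactly one object.
template <typename T>
std::pair<T*, T*> single_range(T& value) {
    // &value + 1 is the one-past-the-end pointer; it is only compared, never dereferenced.
    return {&value, &value + 1};
}

// Same loop shape as iterate_over, but writes into a string instead of std::cout.
template <typename Iter>
std::string join_range(Iter begin, Iter end) {
    std::ostringstream out;
    for (auto i = begin; i != end; ++i) {
        out << *i << '\n';
    }
    return out.str();
}

// Formats a single lvalue through the iterator-pair interface.
template <typename T>
std::string join_single(T& value) {
    std::pair<T*, T*> range = single_range(value);
    assert(range.second - range.first == 1);  // the range holds exactly one element
    return join_range(range.first, range.second);
}
```

Calling join_single(b) with int b = 5 goes through the same path as iterate_over(&b, std::next(&b)).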
Related
In C++ Primer 5th edition it is mentioned that you can take the address of the non-existent element one past the last element of an array (so long as you don't de-reference it).
int arr[] = {0,1,2,3,4,5,6,7,8,9};
int *e = &arr[10]; // 10 is 1 past the end.
For std::vector, does indexing into the vector with the [] operator and taking a reference have the same guarantee as array that it won't crash?
vector<int> vec(10, 0);
int *e = &vec[10];
I do understand there's little use-case for this since we have end() iterators and all that.
It's undefined behaviour. The definition of vec[10] is
*(vec.begin() + 10)
which dereferences a past-the-end iterator. Furthermore:
Values of an iterator i for which the expression *i is defined are called dereferenceable. The library never assumes that past-the-end values are dereferenceable.
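If you do need a raw one-past-the-end pointer for a vector, a sketch of the alternatives that avoid evaluating vec[10] might look like this. The function names are hypothetical; the library calls used are data(), size() and at(), where at() throws std::out_of_range for an invalid index instead of invoking undefined behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// One-past-the-end pointer obtained by pointer arithmetic on the underlying
// array, without ever writing vec[vec.size()].
inline const int* end_pointer(const std::vector<int>& vec) {
    return vec.data() + vec.size();
}

// Returns true when the checked accessor at() rejects index i.
inline bool index_is_rejected(const std::vector<int>& vec, std::size_t i) {
    try {
        (void)vec.at(i);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Sums the elements through raw pointers; the end pointer is only compared.
inline long sum_via_pointers(const std::vector<int>& vec) {
    const int* first = vec.data();
    const int* last = end_pointer(vec);
    assert(last - first == static_cast<std::ptrdiff_t>(vec.size()));
    long total = 0;
    for (const int* p = first; p != last; ++p) {
        total += *p;
    }
    return total;
}
```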
Preamble: It is well-known that taking the pointer one past the end of an array is legal and well-defined:
int main()
{
int na [1] = {};
const int* naBegin = na;
const int* naEnd = na + 1; // one-past-end, OK
}
This pointer can be used in comparisons, which contributes to C-style arrays (or, more accurately, pointers therein) being compatible with Standard Library routines which take iterators, such as copy (Live Demo):
template <typename Field, typename Iter>
void foo(Iter begin, Iter end)
{
std::copy (begin, end, std::ostream_iterator <Field> (std::cout, "\n"));
}
int main()
{
int na [1] = {};
foo <int> (na, na + 1);
}
The legality and definedness of this is supported by the Standard (C++03 reference):
5.7 Additive operators
5/When an expression that has integral type is added to or subtracted
from a pointer, the result has the type of the pointer operand. If the
pointer operand points to an element of an array object, and the array
is large enough, the result points to an element offset from the
original element such that the difference of the subscripts of the
resulting and original array elements equals the integral expression.
In other words, if the expression P points to the i-th element of an
array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N
(where N has the value n) point to, respectively, the i+n-th and
i–n-th elements of the array object, provided they exist. Moreover, if
the expression P points to the last element of an array object, the
expression (P)+1 points one past the last element of the array object,
and if the expression Q points one past the last element of an array
object, the expression (Q)-1 points to the last element of the array
object. If both the pointer operand and the result point to elements
of the same array object, or one past the last element of the array
object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined.
When I've looked in the Standard for references to the validity of past-the-end pointers, every reference I've found is discussing arrays. What if we were to try to take the past-the-end address of an object, not an array?
Question: Is it possible to treat a single object, not allocated as an array, as if it were an array and take a valid one-past-the-end address of said object?
For instance (Live Demo):
#include <algorithm> // std::copy
#include <iostream>
#include <iterator>  // std::ostream_iterator
template <typename Field, typename Iter>
void foo(Iter begin, Iter end)
{
std::copy (begin, end, std::ostream_iterator <Field> (std::cout, "\n"));
}
int main()
{
int na = 42;
foo <int> (&na, &na + 1);
}
Is this code legal and well-defined by the Standard?
The answer is in the paragraph before the one you quote:
4/ For the purposes of these operators, a pointer to a nonarray object behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.
(Note: I'm quoting C++11 as I don't have C++03 to hand. I'm fairly sure nothing has changed.)
So yes, &na + 1 is a valid past-the-end pointer.
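As a small illustration, the same pointer pair can be handed to anything that takes an iterator range, for example the iterator-pair constructor of std::vector. This is only a sketch, and copy_single is a made-up name:

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Copies the one-element range [&value, &value + 1) into a vector.
template <typename T>
std::vector<T> copy_single(const T& value) {
    const T* first = &value;
    const T* last = &value + 1;               // past-the-end pointer, used only as a bound
    assert(std::distance(first, last) == 1);  // the object acts as an array of length one
    return std::vector<T>(first, last);       // iterator-pair constructor
}
```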
When I'm making a procedure with pointer arithmetic and !=, such as
template <typename T> void reverse_array ( T * arr, size_t n )
{
T * end = arr + n;
while (arr != end && arr != --end)
{
std::swap(*arr, *end); // swap the pointed-to values, not the pointers
++arr;
}
}
I always take a lot of caution because if I write my procedure wrong then in a corner case the first pointer might "jump over" the second one. But, if arrays are such that
&arr[0] < &arr[1] < ... < &arr[n]
for any array arr of length n, then can't I just do something like
template <typename T> void reverse_array ( T * arr, size_t n )
{
T * end = arr + n;
if (arr == end) return; // empty array: nothing to do
--end;
while (arr < end)
{
std::swap(*arr, *end); // swap the pointed-to values, not the pointers
++arr; --end;
}
}
since it's more readable? Or is there a danger looming? Aren't memory addresses just integral types and thus comparable with < ?
The relational operators are defined to work correctly when comparing addresses within the same array, including the one-past-the-end pointer (and also within the same object of class type, where there are some guarantees about memory layout).
However, if you "jump over" the end-of-array pointer, you are no longer comparing two addresses within the same array, and the behavior is undefined. (One reason is that pointer arithmetic outside an object may in fact wrap around, but undefined behavior is not restricted to that.)
Your case is perfectly fine concerning jump-over, because your end pointer isn't the one-past-the-end of the array, since you always do at least one --end. An empty array, where --end moves outside the array, would be an issue, but you test for that separately.
Conclusion: your second code is perfectly valid.
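For reference, a compilable sketch of that second version might look like the following. It differs slightly from the code in the question: it tests n == 0 up front rather than arr == end, and it assumes arr is non-null whenever n > 0.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

template <typename T>
void reverse_array(T* arr, std::size_t n) {
    if (n == 0) return;       // empty range: never form a pointer before the array
    assert(arr != nullptr);   // assumption: a non-empty range has a valid start
    T* end = arr + n - 1;     // points at the last element, inside the array
    while (arr < end) {       // both pointers stay within the same array
        std::swap(*arr, *end);
        ++arr;
        --end;
    }
}
```

Since end is only decremented after a swap, that is when arr < end held, it never moves before the original start of the array.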
For C (since you tagged both), yes, they can be compared, within the same array:
When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. If two pointers to object types both point to the same object, or both point one past the last element of the same array object, they compare equal. If the objects pointed to are members of the same aggregate object, pointers to structure members declared later compare greater than pointers to members declared earlier in the structure, and pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript values. All pointers to members of the same union object compare equal. If the expression P points to an element of an array object and the expression Q points to the last element of the same array object, the pointer expression Q+1 compares greater than P. In all other cases, the behavior is undefined.
-- C11 6.5.8, "Relational operators".
But it's not because they're "just integral types", which they aren't (and they aren't guaranteed to be represented as such in memory); it's because the language defines this behaviour for them.
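On the C++ side, if you ever need to order pointers that do not belong to the same array, std::less specialised for pointer types is specified to yield a strict total order even where the built-in < does not. A minimal sketch, with sort_addresses as a made-up name:

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Sorts pointers that may refer to unrelated objects. std::less<const int*>
// is used instead of the built-in <, whose result for unrelated objects
// is not something portable code can rely on.
inline std::vector<const int*> sort_addresses(std::vector<const int*> ptrs) {
    std::sort(ptrs.begin(), ptrs.end(), std::less<const int*>());
    assert(std::is_sorted(ptrs.begin(), ptrs.end(), std::less<const int*>()));
    return ptrs;
}
```

The resulting order is implementation-defined; the only guarantee relied on here is that it is a consistent total order.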
Let's say I have an array of QStrings and a QString pointer. I want to use the pointer to iterate through the entire array; could I do this?
QString * strPointer;
QString data[100];
strPointer = & data[0]; //address to first element
strPointer ++; //address to second element
Would this be valid or am I doing something wrong?
You're on the right lines. Here's one way
QString data[100];
for (QString* strPointer = &data[0]; strPointer != &data[100]; ++strPointer)
{
...
}
Yes, this is fine so long as the type of the pointer matches what's actually being pointed to in the array. By incrementing a pointer you are performing pointer arithmetic.
It may be interesting to note that because iterators in the Standard Library are written to look & feel like pointers in many ways, and all the Standard Library algorithms take iterators specified as template parameters, it is legal and well-defined to use these algorithms with raw pointers as well. For example, this is perfectly legitimate, even with your pointers:
const size_t num_data = sizeof(data)/sizeof(data[0]);
std::copy( &data[0], &data[num_data], ostream_iterator<QString>(cout,"\n") );
...assuming of course you have implemented operator<< for a QString object.
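Here is a self-contained sketch of the same idea. It uses std::string in place of QString so that it compiles without Qt, and it writes to a string stream rather than cout so the result can be checked; join_lines is a made-up name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Writes every element of a built-in array to a string, one per line,
// passing raw pointers where the algorithm expects iterators.
template <std::size_t N>
std::string join_lines(const std::string (&data)[N]) {
    std::ostringstream out;
    // data decays to a pointer to the first element; data + N is one past the end.
    std::copy(data, data + N, std::ostream_iterator<std::string>(out, "\n"));
    assert(out.good());  // the in-memory stream is not expected to fail
    return out.str();
}
```

Deducing N from the array reference also avoids the sizeof(data)/sizeof(data[0]) computation.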
Now, all this being said, take a look at this:
QString data[100];
The 100 here is what's called a Magic Number. The use of Magic Numbers is widely considered to be an anti-pattern, or a bad practice. Ask yourself a couple questions:
How do you know that 100 elements will be enough?
If you don't need 100 elements, are you being wasteful?
If you need more than 100 elements, will your program crash?
It's best to avoid using magic numbers wherever you can. Your choice of 100 here is arbitrary. It would be better to use a collection type that grows and shrinks as you add and remove objects. std::vector is a good place to start:
std::vector<QString> data;
Now you can add items:
data.push_back( ... );
...remove them, and iterate easily, using iterators:
std::copy( data.begin(), data.end(), ostream_iterator<QString>(cout,"\n") );
Yes, it is. Remember to check the bounds: operator++ can go "too far", beyond the end of the array.
Yes, incrementing a pointer to an element of an array will produce a pointer to the next element or to a position one past the end of the array.
When an expression that has integral type is added to or subtracted
from a pointer, the result has the type of the pointer operand. If the
pointer operand points to an element of an array object, and the array
is large enough, the result points to an element offset from the
original element such that the difference of the subscripts of the
resulting and original array elements equals the integral expression.
In other words, if the expression P points to the i-th element of an
array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N
(where N has the value n) point to, respectively, the i + n-th and i −
n-th elements of the array object, provided they exist. Moreover, if
the expression P points to the last element of an array object, the
expression (P)+1 points one past the last element of the array object,
and if the expression Q points one past the last element of an array
object, the expression (Q)-1 points to the last element of the array
object. If both the pointer operand and the result point to elements
of the same array object, or one past the last element of the array
object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined. — [expr.add] 5.7 /5