Why must I use a reference in range-based for loops - c++

I was stuck on a piece of code in C++ Primer and thought about it for more than an hour. The code is
for (auto row : ia)//should use &row here
for (auto col : row)
The explanation for it is
We do so in order to avoid the normal array to pointer conversion. Because row is not a reference, when the compiler
initializes row it will convert each array element (like any other object of array type)
to a pointer to that array’s first element. As a result, in this loop the type of row is
int*. The inner for loop is illegal. Despite our intentions, that loop attempts to
iterate over an int*.
I know it has something to do with what happens on each iteration of for(auto col:row). What I can't understand about
"We do so in order to avoid the normal array to pointer conversion"
is: what form does "ia" take when it is passed to the outer loop? Is it a pointer to its first element rather than a "concrete" array? If so, what's wrong with the inner loop? I thought it would use the same mechanism as the outer loop. I couldn't follow the existing Q&A posts. Could someone please explain what's wrong with my understanding? A good link is also welcome. Many thanks in advance!
The declaration for ia is
constexpr size_t rowCnt = 3, colCnt = 4;
int ia[rowCnt][colCnt]; // 12 uninitialized elements
// for each row
for (size_t i = 0; i != rowCnt; ++i) {
// for each column within the row
for (size_t j = 0; j != colCnt; ++j) {
// assign the element's positional index as its value
ia[i][j] = i * colCnt + j;
}
}

In general, a range-based for loop declared as:
for ( decl : coll ) {
statement }
is equivalent to the following, if coll provides begin() and end() members:
for (auto _pos=coll.begin(), _end=coll.end(); _pos!=_end; ++_pos )
{
decl = *_pos;
statement
}
or, if that doesn’t match, to the following by using a global begin() and end() taking coll as argument:
for (auto _pos=begin(coll), _end=end(coll); _pos!=_end; ++_pos )
{
decl = *_pos;
statement
}
Now look at the line decl = *_pos;. With decl being auto row, the declared type is not a reference, so each time the compiler initializes row it converts the array element (like any other object of array type) to a pointer to that array’s first element.
In your case, the type of row comes out as int*, which the inner for loop can't iterate over, so the code is ill-formed.
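To make the deduced types visible, here is a minimal sketch. The function names deduced_by_value and sum_rows are mine, not from the book, and the static_asserts are only there so the compiler checks the claims. It assumes C++14 or later for the loops inside a constexpr function.

```cpp
#include <cstddef>
#include <type_traits>

constexpr std::size_t rowCnt = 3, colCnt = 4;

// Plain auto: the row decays, so row_ptr is int*, not int[4].
inline void deduced_by_value() {
    int ia[rowCnt][colCnt] = {};
    auto row_ptr = ia[0];
    static_assert(std::is_same<decltype(row_ptr), int*>::value,
                  "plain auto deduces a pointer");
    (void)row_ptr;
}

// auto&: row stays a reference to int[4], so the inner loop is valid.
constexpr int sum_rows() {
    int ia[rowCnt][colCnt] = {};
    for (std::size_t i = 0; i != rowCnt; ++i)
        for (std::size_t j = 0; j != colCnt; ++j)
            ia[i][j] = static_cast<int>(i * colCnt + j);

    int total = 0;
    for (auto& row : ia) {
        static_assert(std::is_same<decltype(row), int (&)[colCnt]>::value,
                      "auto& keeps the array type");
        for (auto col : row)
            total += col;
    }
    return total;  // 0 + 1 + ... + 11
}
```

Changing auto& row to auto row in sum_rows should make the inner loop fail to compile, which is the situation the book describes.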

Range-based for loops are based on the begin and end functions which don't work on T* but work on T[N].
The implementation for:
for ( range_declaration : range_expression ) loop_statement
is along the lines of:
{
auto && __range = range_expression ;
for (auto __begin = begin_expr,
__end = end_expr;
__begin != __end; ++__begin) {
range_declaration = *__begin;
loop_statement
}
}
Since in the outer for loop the array would decay to a pointer with auto row, the inner for loop would become illegal.
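As a rough illustration, the expansion above can be written out by hand for a 3x4 array. This is only a sketch: the names range, first, last and sum_expanded are placeholders of mine (the real expansion uses names you cannot spell), and it assumes C++14 or later, where std::begin and std::end are constexpr for arrays.

```cpp
#include <iterator>

// Hand-written equivalent of:
//   for (auto& row : ia) for (auto col : row) total += col;
constexpr int sum_expanded() {
    int ia[3][4] = {{0, 1, 2, 3}, {4, 5, 6, 7}, {8, 9, 10, 11}};
    int total = 0;
    auto&& range = ia;  // int (&)[3][4], no decay here
    for (auto first = std::begin(range), last = std::end(range);
         first != last; ++first) {
        // first is int (*)[4], so *first is an lvalue of type int[4].
        auto& row = *first;  // int (&)[4]; with plain auto this would be int*
        for (auto col : row)
            total += col;
    }
    return total;
}
```

The only line where the decay can happen is the declaration of row, which is why the fix goes on the loop variable and nowhere else.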

A range-based for loop works with either an array type or a user-defined type that has begin and end member functions.
By the array-to-pointer decay rule, when you pass an array to a function that takes a pointer, the array decays to a pointer.
But when a template takes its parameter by **reference**, the template type is deduced as an array type:
template<typename T>
void foo(T&);
int main() {
int ia[2][2] = {{0,1}, {2,3}};
auto row_val = ia[0]; //int*
auto& row_ref = ia[0]; //int [2]
foo(row_val); //void foo<int*>(int*&)
foo(row_ref); //void foo<int [2]>(int (&) [2])
}
Here you can see (from the errors it produces) that the types deduced for row_val and row_ref are not the same:
row_val ==> int*
row_ref ==> int [2]
You have to use for(auto& row : ia) so that row is a reference to an array rather than a pointer. Then you can use the inner loop for (auto col : row).
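The same by-reference deduction can be used to recover an array's size. The helper below is a hypothetical illustration (length and grid are names I made up), not something from the question:

```cpp
#include <cstddef>

// T (&)[N] can only bind to an array, so N is deduced from the array type.
template <typename T, std::size_t N>
constexpr std::size_t length(T (&)[N]) {
    return N;
}

constexpr int grid[2][3] = {{0, 1, 2}, {3, 4, 5}};

static_assert(length(grid) == 2, "grid has 2 rows");
static_assert(length(grid[0]) == 3, "each row has 3 ints");
// length(+grid[0]) would not compile: unary plus forces the decay
// to const int*, and a pointer does not match T (&)[N].
```

Once the row has decayed to a pointer there is no N left to deduce, which is the same reason the inner range-based for cannot find its end.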

Related

Nested for loop in a two-dimension array

I have learned how to print out each element in a two-dimensional array
int arr[3][3] = {....};
for ( auto &row : arr){
for ( auto col : row)
cout<<col<<endl;
}
I understand that the &row in the outer for loop has to be a reference. Otherwise, row will become a pointer pointing to array arr's first element which is an array of 3 ints.
Based on this, I thought the following code could work but it didn't
for( auto row : arr ){
for ( auto col:*row)
cout<<col<<endl;
}
It gives me the error about the inner for loop
no callable 'begin' function found for type 'int'
Did I miss something here?
Each element of arr has type int[3].
When row is a reference, it gets type int (&) [3], which can be iterated over. But when it isn't a reference, the int[3] array decays to a pointer to its first element, so row has type int*, which can't be used in a range-for loop.
Your code is attempting to iterate over *row, which has type int, leading to the error.
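If you really want to keep auto row without the reference, one option is to index through the pointer and supply the column count yourself. A minimal sketch (the function name is hypothetical, and it assumes C++14 or later so the loops can live in a constexpr function):

```cpp
// row decays to int*, so the column count has to be supplied by hand.
constexpr int sum_with_decayed_rows() {
    int arr[3][3] = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
    int total = 0;
    for (auto row : arr)              // row is int*
        for (int j = 0; j != 3; ++j)  // the size 3 is no longer part of row's type
            total += row[j];
    return total;
}
```

This works, but the hard-coded 3 is exactly the information that auto& row would have kept in the type.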

What is the accurate description of the use in "when use an array"

When we use an array, it is converted automatically to a pointer to its
first element. (c++ primer 5th ed. pp129)
int ia[3][4];
for (auto p = ia; p != ia + 3; ++p){
for (auto q = *p; q != *p + 4; ++q)
cout << *q << ' ';
cout << endl;
}
The code snippet above is a good example of the quotation: p is a pointer to an array of four ints and q is a pointer to int.
However, the range-based for loop tells a different story:
for (auto row: ia) // the code won't compile in fact
for (auto col: row)
Here, the type of row is pointer to int (which is why the second loop won't compile). Why is that? Isn't this also a case of using the array?
"use an array" is a very handwavy expression.
To understand how the array is used, you must first understand what the range based for loop does. Let's expand your outer loop to use an equivalent regular for loop (I've simplified a little):
{
for (auto __begin = std::begin(ia), __end = std::end(ia);
__begin != __end; ++__begin) {
auto row = *__begin;
for (auto col: row); // oops. Cannot use range-for with a pointer
}
}
The question here is, what will be the deduced type of auto row?
The result of *__begin is an l-value of type "array of 4 ints". auto follows the rules of template argument deduction. A function parameter passed by value cannot have array type, so plain auto can never be deduced to be an array. The array type decays to pointer to first element, i.e. pointer to int in this case.
An argument can be deduced as "reference to array of 4 ints", so this will work:
for (auto& row: ia)
for (auto col: row)
Your problem here is that for (auto row: ia) causes each element of ia to decay to a pointer, so row becomes a pointer type. This means you can't use for (auto col: row), since there is no begin function defined for pointers.
What you need to do is take a reference, so that you refer to the 1d array itself instead of holding a pointer to its first element. That looks like
for (auto& row: ia) // reference to each row in the array
for (auto col: row) // copy of each element in the row
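To make "use an array" less handwavy, here are a few contexts checked at compile time. This is just a sketch; matrix is a stand-in name with the same shape as ia in the question.

```cpp
#include <type_traits>

int matrix[3][4];  // hypothetical name, same shape as ia in the question

// No decay: sizeof, decltype and unary & all see the full array type.
static_assert(sizeof(matrix) == 12 * sizeof(int), "whole array");
static_assert(std::is_same<decltype(matrix), int[3][4]>::value, "");
static_assert(std::is_same<decltype(&matrix), int (*)[3][4]>::value, "");

// Decay: unary plus needs a prvalue operand, so the array converts
// to a pointer to its first element (a pointer to a row).
static_assert(std::is_same<decltype(+matrix), int (*)[4]>::value, "");

// Dereferencing gives back an lvalue of array type (what *__begin yields).
static_assert(std::is_same<decltype(*matrix), int (&)[4]>::value, "");
```

So the array is not always converted. Whether it decays depends on what the surrounding expression or declaration asks for, and a by-value auto asks for a copy, which arrays cannot provide.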

Nested Range-based for loop Not working in C++17

I have made nested Range-based for loop program in C++17.
#include <iostream>
#include <vector>
int main()
{
std::vector<int> v = {0, 1, 2, 3, 4, 5};
for (int i : v)
{
for (int a : i)
std::cout << a << ' ';
}
}
GCC generated an error:
main.cpp: In function 'int main()':
main.cpp:10:22: error: 'begin' was not declared in this scope
for (int a : i)
So,
Why does GCC generate an error for nested range based for loop?
What is the scope of range based for loop?
This problem has nothing to do with nested loops.
The following code snippet is nonsense and the compiler ties itself into knots trying to understand it:
int main() {
std::vector<int> v = {0, 1, 2, 3, 4, 5};
for (int i : v) {
for (int a : i)
std::cout << a << ' ';
}
}
The error message thus is also nonsense. It is like feeding the compiler random characters, and the compiler coming back with "missing ;".
In particular:
for (int i : v)
for (int a : i)
The first line declares i as of type int. How could the second line, then, iterate over an int? int is not an array nor is it user-/library-defined.
Types that can be iterated over are arrays, and user/library defined types with a member begin()/end(), and such with a non-member free function begin()/end() in their namespace, whose begin()/end() return something iterator (or pointer) like.
gcc tried to treat it as an iterable object. It isn't an array. It doesn't have a member begin(). It isn't defined in a namespace containing a non-member begin(). This makes gcc give up at that point, and output a message that it could not find the non-member begin() in a nonsense location (as int has no point of definition).
This will generate the same error:
int i;
for( int a:i );
This line
for (int a : i)
makes no sense. If you read the link on range-based loop you provided, you find that the inner loop would be equivalent to the following code,
{
auto && __range = i ;
auto __begin = begin(__range) ;
auto __end = end(__range) ;
for ( ; __begin != __end; ++__begin) {
int a = *__begin;
std::cout << a << ' ';
}
}
The begin and end functions are useful for vectors, maps, ranges etc. because they give iterators. They are also defined by the language for arrays, where they point to the beginning and past the end of the array, so the iterating syntax is the same. They are not defined for a plain int variable.
With this information the error produced by the compiler is completely clear: it refers to the absence of begin(i) in the third line of the transformed code. That it is not declared in the scope where the inner loop appears (which is the outer loop) is just an irrelevant detail at this point; it's not defined anywhere else in the program either.
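If you want the compiler to tell you up front whether a type can be iterated, a small detection trait is one way to do it. The sketch below is an assumption of mine rather than anything from the standard library: can_begin is a made-up name, and it only checks std::begin (not ADL-found begin functions), so it is narrower than what the range-based for actually accepts.

```cpp
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>

// Primary template: assume std::begin cannot be called on a T.
template <typename T, typename = void>
struct can_begin : std::false_type {};

// Chosen only when std::begin(t) is a valid expression for an lvalue t.
template <typename T>
struct can_begin<T, decltype(void(std::begin(std::declval<T&>())))>
    : std::true_type {};

static_assert(can_begin<std::vector<int>>::value, "containers have begin()");
static_assert(can_begin<int[4]>::value, "arrays are handled by std::begin");
static_assert(!can_begin<int>::value, "a plain int has no begin");
static_assert(!can_begin<int*>::value, "neither does a pointer");
```

The last two lines are the situation in the question: i is an int, so there is nothing for the inner loop to call begin on.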

Why isn't this 'for' loop valid?

From C++ Primer 5th Edition by Lippman, page 182, consider:
int ia[3][4];
for (auto row : ia)
for (auto col : row)
The first for iterates through ia, whose elements are arrays of size 4.
Because row is not a reference, when the compiler initializes row it will convert each array element (like any other object of array
type) to a pointer to that array’s first element. As a result, in this
loop the type of row is int*.
I am not really sure I understand how auto works here. I assume it automatically gives row a type based on the type of ia's elements, but I don't understand why this kind of for, where row is not a reference, is not valid. Why does this happen? Why does row become a "pointer to that array’s first element"?
The problem is that row is an int * and not an int[4] as one would expect, because arrays decay to pointers, and there is no automatic way to know how many elements a pointer points to.
To get around that problem std::array has been added where everything works as expected:
#include <array>
int main() {
std::array<std::array<int, 4>, 3> ia;
for (auto &row : ia){
for (auto &col : row){
col = 0;
}
}
}
Note the & before row and col which indicate that you want a reference and not a copy of the rows and columns, otherwise setting col to 0 would have no effect on ia.
To prevent the decay of int[4] to int*, you can also use auto&&, which here deduces a reference to the array:
int main() {
int ia[3][4];
for (auto && row : ia)
for (auto && col : row)
;
}
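For what it's worth, here is a small sketch showing that auto&& really does deduce an lvalue reference to the row and that writes go through to ia. The function name fill_and_sum is hypothetical, and C++14 or later is assumed for the constexpr loops.

```cpp
#include <type_traits>

constexpr int fill_and_sum(int value) {
    int ia[3][4] = {};
    for (auto&& row : ia) {
        // *__begin is an lvalue, so auto&& collapses to an lvalue reference.
        static_assert(std::is_same<decltype(row), int (&)[4]>::value,
                      "auto&& keeps the array type");
        for (auto&& col : row)
            col = value;  // writes through to ia
    }
    int total = 0;
    for (const auto& row : ia)
        for (int col : row)
            total += col;
    return total;  // 12 elements, each equal to value
}
```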

How does the range-based for work for plain arrays?

In C++11 you can use a range-based for, which acts as the foreach of other languages. It works even with plain C arrays:
int numbers[] = { 1, 2, 3, 4, 5 };
for (int& n : numbers) {
n *= 2;
}
How does it know when to stop? Does it only work with static arrays that have been declared in the same scope the for is used in? How would you use this for with dynamic arrays?
It works for any expression whose type is an array. For example:
int (*arraypointer)[4] = new int[1][4]{{1, 2, 3, 4}};
for(int &n : *arraypointer)
n *= 2;
delete [] arraypointer;
For a more detailed explanation, if the type of the expression passed to the right of : is an array type, then the loop iterates from ptr to ptr + size (ptr pointing to the first element of the array, size being the element count of the array).
This is in contrast to user-defined types, which work by looking up begin and end as members if you pass a class object or (if there are no members with those names) as non-member functions. Those functions yield the begin and end iterators (pointing to the beginning of the sequence and to directly after the last element, respectively).
This question clears up why that difference exists.
I think that the most important part of this question is, how C++ knows what the size of an array is (at least I wanted to know it when I found this question).
C++ knows the size of an array, because it's a part of the array's definition - it's the type of the variable. A compiler has to know the type.
Since C++11 std::extent can be used to obtain the size of an array:
int size1{ std::extent< char[5] >::value };
std::cout << "Array size: " << size1 << std::endl;
Of course, this doesn't make much sense, because you have to explicitly provide the size in the first line, which you then obtain in the second line. But you can also use decltype and then it gets more interesting:
char v[] { 'A', 'B', 'C', 'D' };
int size2{ std::extent< decltype(v) >::value };
std::cout << "Array size: " << size2 << std::endl;
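std::extent also takes an optional dimension index, which is handy for nested arrays. A short sketch (Grid is just an alias I picked for illustration):

```cpp
#include <cstddef>
#include <type_traits>

using Grid = int[3][4];  // hypothetical alias

static_assert(std::rank<Grid>::value == 2, "two dimensions");
static_assert(std::extent<Grid>::value == 3, "dimension 0: rows");
static_assert(std::extent<Grid, 1>::value == 4, "dimension 1: columns");

// A pointer carries no extent at all, which is why it cannot be iterated.
static_assert(std::extent<int*>::value == 0, "not an array type");
static_assert(std::extent<std::remove_extent<Grid>::type>::value == 4,
              "a row is int[4]");
```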
According to the latest C++ Working Draft (n3376) the ranged for statement is equivalent to the following:
{
auto && __range = range-init;
for (auto __begin = begin-expr,
__end = end-expr;
__begin != __end;
++__begin) {
for-range-declaration = *__begin;
statement
}
}
So it knows how to stop the same way a regular for loop using iterators does.
I think you may be looking for something like the following to provide a way to use the above syntax with arrays which consist of only a pointer and size (dynamic arrays):
template <typename T>
class Range
{
public:
Range(T* collection, size_t size) :
mCollection(collection), mSize(size)
{
}
T* begin() { return &mCollection[0]; }
T* end () { return &mCollection[mSize]; }
private:
T* mCollection;
size_t mSize;
};
This class template can then be used to create a range, over which you can iterate using the new ranged for syntax. I am using this to run through all animation objects in a scene which is imported using a library that only returns a pointer to an array and a size as separate values.
for ( auto pAnimation : Range<aiAnimation*>(pScene->mAnimations, pScene->mNumAnimations) )
{
// Do something with each pAnimation instance here
}
This syntax is, in my opinion, much clearer than what you would get using std::for_each or a plain for loop.
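Since aiAnimation comes from an external library, here is a self-contained sketch of the same idea using plain ints. PtrRange and sum_pointer_range are hypothetical names, and I made the members constexpr (which the class above does not do) only so the result can be checked at compile time with C++14 or later.

```cpp
#include <cstddef>

// Same idea as Range above: a pointer plus a size, exposed via begin()/end().
template <typename T>
class PtrRange {
public:
    constexpr PtrRange(T* first, std::size_t count)
        : first_(first), count_(count) {}
    constexpr T* begin() const { return first_; }
    constexpr T* end() const { return first_ + count_; }

private:
    T* first_;
    std::size_t count_;
};

// The caller only has a pointer and a count, as with a dynamic array.
constexpr int sum_pointer_range(const int* data, std::size_t count) {
    int total = 0;
    for (int value : PtrRange<const int>(data, count))
        total += value;
    return total;
}

constexpr int samples[] = {1, 2, 3, 4, 5};
static_assert(sum_pointer_range(samples, 5) == 15, "1 + 2 + 3 + 4 + 5");
```

The wrapper does not own the memory, so it is only valid for as long as the underlying buffer is.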
It knows when to stop because it knows the bounds of static arrays.
I'm not sure what you mean by "dynamic arrays". In any case, when not iterating over static arrays, informally, the compiler looks up the names begin and end in the scope of the class of the object you iterate over, or looks up begin(range) and end(range) using argument-dependent lookup, and uses the results as iterators.
For more information, in the C++11 standard (or public draft thereof), "6.5.4 The range-based for statement", pg.145
How does the range-based for work for plain arrays?
Is that to read as, "Tell me what a ranged-for does (with arrays)?"
I'll answer assuming that - Take the following example using nested arrays:
int ia[3][4] = {{1,2,3,4},{5,6,7,8},{9,10,11,12}};
for (auto &pl : ia)
Text version:
ia is an array of arrays ("nested array"), containing [3] arrays, with each containing [4] values. The above example loops through ia by its primary 'range' ([3]), and therefore loops [3] times. Each loop produces one of ia's [3] primary values starting from the first and ending with the last - An array containing [4] values.
First loop: pl equals {1,2,3,4} - An array
Second loop: pl equals {5,6,7,8} - An array
Third loop: pl equals {9,10,11,12} - An array
Before we explain the process, here are some friendly reminders about arrays:
In most expressions an array is converted to a pointer to its first element - using the array's name without indexing yields the address of the first element
pl must be a reference because we cannot copy arrays
With arrays, when you add a number to the array object itself, it advances forward that many entries and 'points' to the equivalent entry - If n is the number in question, then ia[n] is the same as *(ia+n) (we're dereferencing the address that's n entries forward), and ia+n is the same as &ia[n] (we're getting the address of that entry in the array).
Here's what's going on:
On each loop, pl is set as a reference to ia[n], with n equaling the current loop count starting from 0. So, pl is ia[0] on the first round, on the second it's ia[1], and so on. It retrieves the value via iteration.
The loop goes on as long as ia+n is not equal to end(ia).
...And that's about it.
It's really just a simplified way to write this:
int ia[3][4] = {{1,2,3,4},{5,6,7,8},{9,10,11,12}};
for (int n = 0; n != 3; ++n)
auto &pl = ia[n];
If your array isn't nested, then this process becomes a bit simpler in that a reference is not needed, because the iterated value isn't an array but rather a 'normal' value:
int ib[3] = {1,2,3};
// short
for (auto pl : ib)
cout << pl;
// long
for (int n = 0; n != 3; ++n)
cout << ib[n];
Some additional information
What if we didn't want to use the auto keyword when creating pl? What would that look like?
In the following example, pl refers to an array of four integers. On each loop pl is given the value ia[n]:
int ia[3][4] = {{1,2,3,4},{5,6,7,8},{9,10,11,12}};
for (int (&pl)[4] : ia)
And... That's how it works, with additional information to brush away any confusion. It's just a 'shorthand' for loop that automatically counts for you, but lacks a way to retrieve the current loop index without tracking it manually.
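If you do need the index, the usual workaround is a counter kept next to the loop. A minimal sketch, assuming C++14 or later (rowIndex and weighted_sum are names I chose for illustration):

```cpp
// Multiplies every value by the index of the row it sits in and adds them up.
constexpr int weighted_sum() {
    int ia[3][4] = {{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12}};
    int total = 0;
    int rowIndex = 0;  // the loop itself does not expose this
    for (int (&row)[4] : ia) {
        for (int value : row)
            total += rowIndex * value;
        ++rowIndex;  // counted manually, once per row
    }
    return total;  // 0*10 + 1*26 + 2*42
}
```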
Some sample code to demonstrate the difference between arrays on Stack vs arrays on Heap
/**
* Question: Can we use range based for built-in arrays
* Answer: Maybe
* 1) Yes, when array is on the Stack
* 2) No, when array is on the Heap
* 3) Yes, When the array is on the Stack,
* but the array elements are on the HEAP
*/
void testStackHeapArrays() {
constexpr int Size = 5; // constant expression, so the array sizes below are known at compile time
Square StackSquares[Size]; // 5 Square's on Stack
int StackInts[Size]; // 5 int's on Stack
// auto is Square, passed as constant reference
for (const auto &Sq : StackSquares)
cout << "StackSquare has length " << Sq.getLength() << endl;
// auto is int, passed as constant reference
// the int values are whatever is in memory!!!
for (const auto &I : StackInts)
cout << "StackInts value is " << I << endl;
// Better version would be: auto HeapSquares = new Square[Size];
Square *HeapSquares = new Square[Size]; // 5 Square's on Heap
int *HeapInts = new int[Size]; // 5 int's on Heap
// does not compile,
// HeapSquares is only a pointer to the start of a memory location,
// compiler cannot know how many Square's it has
// for (auto &Sq : HeapSquares)
// cout << "HeapSquare has length " << Sq.getLength() << endl;
// does not compile, same reason as above
// for (const auto &I : HeapInts)
// cout << "HeapInts value is " << I << endl;
// Create 3 Square objects on the Heap
// Create an array of size-3 on the Stack with Square pointers
// size of array is known to compiler
Square *HeapSquares2[]{new Square(23), new Square(57), new Square(99)};
// auto is Square*, passed as constant reference
for (const auto &Sq : HeapSquares2)
cout << "HeapSquare2 has length " << Sq->getLength() << endl;
// Create 3 int objects on the Heap
// Create an array of size-3 on the Stack with int pointers
// size of array is known to compiler
int *HeapInts2[]{new int(23), new int(57), new int(99)};
// auto is int*, passed as constant reference
for (const auto &I : HeapInts2)
cout << "HeapInts2 has value " << *I << endl;
delete[] HeapSquares;
delete[] HeapInts;
for (const auto &Sq : HeapSquares2) delete Sq;
for (const auto &I : HeapInts2) delete I;
// cannot delete HeapSquares2 or HeapInts2 since those arrays are on Stack
}