In 'for (auto c : str)' what exactly is c? - c++

If I declare:
string s = "ARZ";
And then run the following code:
for (auto& c : s) {
cout << (void*)&c << endl;
}
the results will correspond to the addresses of s[0], s[1] and s[2] respectively.
If I remove the & and run:
for (auto c : s) {
cout << (void*)&c << endl;
}
the address of c is always the same.
Presumably c is just a pointer into the vector and its value advances by sizeof(char) with each loop, but I'm finding it hard to get my head round why I'm not required to write *c to access the string's char values.
And finally if I run:
for (auto c: s) {
c='?';
cout << c << endl;
}
It prints out 3 question marks.
I'm finding it hard to fathom what c actually is?

In 'for (auto c : str)' what exactly is c?
It's a local variable whose scope is the entire for block and has char type.
for (auto c : str) { loop_statement }
is equivalent to
{
for (auto __begin = str.begin(), __end = str.end(); __begin != __end; ++__begin) {
auto c = *__begin;
loop_statement
}
}
On some implementations, under some conditions, since the lifetime of c ends before the lifetime of next-iteration's c begins, it gets allocated at the same place and gets the same address. You cannot rely on that.

If you don't know the type, you can let the compiler tell you:
#include <string>
template <typename T>
struct tell_type;
int main(){
std::string s = "asdf";
for (auto c : s) {
tell_type<decltype(c)>();
}
}
Note that there is no definition for tell_type, hence this will result in an error along the lines of:
error: implicit instantiation of undefined template 'tell_type<char>'
And similarly
error: implicit instantiation of undefined template 'tell_type<char &>'
for the for (auto& ... loop.
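If you would rather have a check that compiles than one that fails, the same question can be asked with static_assert. This is only a sketch: the function name loop_variable_types_are_as_expected is made up for illustration, and the third loop (over a const string) is an extra case that the answer above does not cover.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper: it only compiles if the deduced types are as claimed.
bool loop_variable_types_are_as_expected() {
    std::string s = "asdf";
    const std::string cs = "asdf";

    for (auto c : s) {
        // Plain auto: c is a char, a copy of the current element.
        static_assert(std::is_same<decltype(c), char>::value,
                      "auto c deduces char");
        (void)c;
    }
    for (auto& c : s) {
        // auto&: c is a reference to the element inside s.
        static_assert(std::is_same<decltype(c), char&>::value,
                      "auto& c deduces char&");
        (void)c;
    }
    for (auto& c : cs) {
        // Over a const string, auto& picks up the const as well.
        static_assert(std::is_same<decltype(c), const char&>::value,
                      "auto& c over a const string deduces const char&");
        (void)c;
    }
    return true;
}
```

If any of the three claims were wrong, the translation unit would fail to compile and the message would name the offending loop.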

c is char.
The syntax can be misleading until you get it under your skin (but it makes sense).
for (auto c : s) //*distinct object* (think: a copy usually)
for (auto& c : s) //reference into the string (can modify string)
Short: use auto& when you need to modify the contents.
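To make the difference concrete, here is a small sketch with two made-up helper functions (overwrite_copies and overwrite_elements are names chosen for this example only). Both take the string by value, so the caller's string is never touched; what matters is what each one returns.

```cpp
#include <string>

// Assigns to a copy of each character; the string itself is unchanged.
std::string overwrite_copies(std::string s) {
    for (auto c : s) {
        c = '?';  // modifies only the local char c
    }
    return s;
}

// Assigns through a reference to each character; the string is changed.
std::string overwrite_elements(std::string s) {
    for (auto& c : s) {
        c = '?';  // c refers to the element inside s
    }
    return s;
}
```

Called with "ARZ", the first function should hand back "ARZ" and the second "???".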

In 'for (auto c : str)' what exactly is c?
c is a local variable with automatic storage within the scope of the range-for statement. Its type will be deduced because you used auto. In case of string str="ARZ";, the deduced type will be char.
Presumably c is just a pointer into the vector
There is no vector, and c is not a pointer. It is a char.
Understanding what range-for does may help. It is equivalent to doing following (the __ prefixed variables are not accessible to the programmer; they are conceptual for the behaviour of the loop):
{
auto && __range = range_expression;
auto __begin = begin_expr;
auto __end = end_expr;
for (; __begin != __end; ++__begin) {
range_declaration = *__begin;
loop_statement
}
}
Or, in this particular case:
{
auto && __range = str;
auto __begin = __range.begin();
auto __end = __range.end();
for ( ; __begin != __end; ++__begin) {
auto c = *__begin; // note here
cout << (void*)&c << endl;
}
}
Note that if you use auto&, then c will be deduced to be a reference to char. Applying the addressof operator to a reference will not produce the address of the reference variable, but instead the address of the referred object. In this case the referred object would be the character within the string.
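The address behaviour can be checked without printing anything. The sketch below uses two hypothetical helpers (the names are invented here): one confirms that with auto& the address of c is the address of the corresponding string element, the other confirms that with plain auto the address of c never coincides with any element of the string.

```cpp
#include <cstddef>
#include <string>

// With auto&, &c is the address of the element c refers to.
bool references_alias_elements(std::string s) {
    std::size_t i = 0;
    for (auto& c : s) {
        if (&c != &s[i]) return false;
        ++i;
    }
    return true;
}

// With auto, c is a separate char object that lives outside the string.
bool copies_are_distinct_objects(std::string s) {
    for (auto c : s) {
        for (std::size_t i = 0; i < s.size(); ++i) {
            if (&c == &s[i]) return false;
        }
    }
    return true;
}
```

Whether the copy ends up at the same address on every iteration is not tested here, since that is up to the implementation.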

In this range-based for loop
for (auto c: s) {c='?'; cout << c << endl;}
there are three iterations because the size of the string s is equal to 3.
Within the loop, the value that c was initialized with is ignored and c is overwritten with the character '?'. So three '?' characters are output.
The type of the local variable c is char, which is the value type of the class std::string.
In this range-based for loop
for (auto& c : s) cout << (void*)&c << endl;
the variable c has a reference type, more precisely the type char &. So this loop outputs the addresses of the referenced objects, that is, the addresses of the elements of the string s.
In this range-based for loop
for (auto c : s) cout << (void*)&c << endl;
the address of the local variable c itself is output on each iteration.

When you use references, the reference c is a reference to a character inside the string.
When you don't use references, c is a plain char variable, which contains a copy of the character in the string.
The reason the non-reference variant gives the same pointer for all iterations is simply an implementation detail, where the compiler reuses the space for the variable c inside each iteration.

for (auto& c : s)
c is a reference to a character (char&)
This loop is roughly equivalent to C:
for (char* c=str; *c; ++c)
for (auto c : s)
c is a character (char)
This loop is roughly equivalent to C:
int i=0;
for (char c=str[i]; i<strlen(str); c=str[++i])

Related

Range based for-loop with &

I haven't found a question that answers the part I'm confused on, and I apologize if someone did answer it.
I'm confused about what's going on in this for-loop: how is it looping through the addresses?
int arr[] = { 1, 2, 3, 4, 5 };
for(const int &arrEntry : arr) {
cout << arrEntry << " ";
}
Perhaps the placement of & is causing confusion. Remember that C++ doesn't care where you put spaces. Since this: for (const int &arrEntry : arr) is a declaration of a new variable arrEntry for use inside the loop, the use of & on the left-hand side of its name means we are defining an object that has a reference type, specifically arrEntry is a reference to a const int. This means that within the loop, arrEntry is not a copy of the data you're looping over, only a reference to it. The const means that you can't change its value.
If this were not a declaration, and if arrEntry were defined previously, then the expression &arrEntry would indeed be taking the address of arrEntry. Within the body of the loop, arrEntry is already defined, and so you can take its address with &arrEntry
int arr[] = { 1, 2, 3, 4, 5} ;
for(const int &arrEntry : arr){
cout << arrEntry << " "; // prints a const int
cout << &arrEntry << " "; // prints a pointer to a const int
}
The range-based for loop in C++ is actually just syntactic sugar that is equivalent to the following (provided by cppreference):
for (range_declaration : range_expression) loop_statement;
// is equivalent to:
{
auto && __range = range_expression ;
for (auto __begin = begin_expr, __end = end_expr; __begin != __end; ++__begin) {
range_declaration = *__begin;
loop_statement
}
}
In the above code block, begin_expr and end_expr are equivalent to std::begin(__range) and std::end(__range) respectively.
So in the case of using const int &arrEntry, arrEntry is actually declared inside the "real" (normal) for loop and thus in each iteration it refers to a different object in the range, as if by using the raw iterators directly.
Note that this would not be possible if arrEntry was declared outside the normal for loop, as references cannot be repointed to refer to a different object.
Another important (side) fact to consider is that range_expression is kept alive for the entire duration of the loop, which means you can use a prvalue there (e.g. calling a function that returns a std::vector<int> by value).
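As a quick illustration of that last point, here is a sketch that loops directly over a vector returned by value. make_values and sum_of_temporary are hypothetical names used only for this example.

```cpp
#include <vector>

// Returns a fresh vector by value, so the call expression is a prvalue.
std::vector<int> make_values() {
    return {1, 2, 3, 4, 5};
}

// The temporary is bound to the hidden __range reference and
// therefore lives until the loop has finished.
int sum_of_temporary() {
    int total = 0;
    for (int v : make_values()) {
        total += v;
    }
    return total;
}
```

Note that this covers the temporary that the whole range expression evaluates to. A reference obtained from a member function of some other temporary inside the expression is a different situation and was a known pitfall before C++23.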
In your code, arrEntry is a reference to an element of arr, not to arr itself. The range-based for loop binds it to each element in turn.
for(const int &arrEntry : arr){
cout << arrEntry << " ";
}
You could do it without the reference; the output is the same.
But notice that each element of arr is then copied into arrEntry.
for(const int arrEntry : arr){
cout << arrEntry << " ";
}
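For int the copy is harmless, but it can be made visible with an element type that counts its own copies. The sketch below is an assumption-laden illustration: Tracked and copies_made_while_iterating are invented for this example and are not part of the question.

```cpp
#include <vector>

// Hypothetical element type that records every copy construction.
struct Tracked {
    int value;
    int* copy_counter;

    Tracked(int v, int* counter) : value(v), copy_counter(counter) {}
    Tracked(const Tracked& other)
        : value(other.value), copy_counter(other.copy_counter) {
        ++*copy_counter;
    }
};

// Returns how many copies the loop itself made over three elements.
int copies_made_while_iterating(bool by_reference) {
    int copies = 0;
    std::vector<Tracked> items;
    items.reserve(3);  // avoid reallocation, so filling makes no copies
    for (int v = 1; v <= 3; ++v) {
        items.emplace_back(v, &copies);  // constructed in place
    }

    int sum = 0;
    if (by_reference) {
        for (const Tracked& t : items) sum += t.value;  // no copies
    } else {
        for (const Tracked t : items) sum += t.value;   // one copy each
    }
    (void)sum;
    return copies;
}
```

With the reference the count should stay at zero; without it, one copy per element is made.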

What is the accurate description of the use in "when use an array"

When we use an array, it is converted automatically to a pointer to its
first element. (c++ primer 5th ed. pp129)
int ia[3][4];
for (auto p = ia; p != ia + 3; ++p){
for (auto q = *p; q != *p + 4; ++q)
cout << *q << ' ';
cout << endl;
}
The code snippet above is a good example of the quotation: p is a pointer to an array of four ints and q is a pointer to int.
However, the range-based for loop tells a different story:
for (auto row: ia) // the code won't compile in fact
for (auto col: row)
Here, the type of row is pointer to int (which is why the second loop won't compile). Why is that? Is this not a case of using the array?
"use an array" is a very handwavy expression.
To understand how the array is used, you must first understand what the range based for loop does. Let's expand your outer loop to use an equivalent regular for loop (I've simplified a little):
{
for (auto __begin = std::begin(ia), __end = std::end(ia);
__begin != __end; ++__begin) {
auto row = *__begin;
for (auto col: row); // oops. Cannot use range-for with a pointer
}
}
The question here is, what will be the deduced type of auto row?
The result of *__begin is an l-value of type "array of 4 ints". auto follows the rules of template argument deduction. An argument cannot be an array object, so auto can never be deduced to be an array. The array type decays to pointer to first element i.e. pointer to int in this case.
An argument can be deduced as "reference to array of 4 ints", so this will work:
for (auto& row: ia)
for (auto col: row)
Your problem here is that for (auto row: ia) causes each element of ia to decay to a pointer, so row becomes a pointer type. This means you can't use for (auto col: row), since there is no begin function defined for pointers.
What you need to do is take a reference, so that you refer to the 1D array rather than holding a pointer to its first element. That looks like
for (auto& row: ia) // reference to each row in the array
for (auto col: row) // copy of each element in the row
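Here is a complete sketch of that fix, with the deduced types checked by static_assert. The function name fill_and_sum is made up for illustration; the array dimensions follow the question.

```cpp
#include <type_traits>

// Fills a 3x4 array with 0..11 and returns the sum of its elements.
int fill_and_sum() {
    int ia[3][4];

    int next = 0;
    for (auto& row : ia) {        // row: reference to an array of 4 ints
        for (auto& cell : row) {  // cell: reference to an int in that row
            cell = next++;
        }
    }

    // Without the reference, auto decays the row to a pointer.
    auto decayed = ia[0];
    static_assert(std::is_same<decltype(decayed), int*>::value,
                  "auto decays an array to a pointer to its first element");
    (void)decayed;

    int total = 0;
    for (const auto& row : ia) {
        static_assert(std::is_same<decltype(row), const int (&)[4]>::value,
                      "auto& keeps the array type");
        for (int col : row) {     // copying an int is cheap
            total += col;
        }
    }
    return total;
}
```

The values 0 through 11 add up to 66, which is what the function should return.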

Nested Range-based for loop Not working in C++17

I have made nested Range-based for loop program in C++17.
#include <iostream>
#include <vector>
int main()
{
std::vector<int> v = {0, 1, 2, 3, 4, 5};
for (int i : v)
{
for (int a : i)
std::cout << a << ' ';
}
}
GCC generated an error:
main.cpp: In function 'int main()':
main.cpp:10:22: error: 'begin' was not declared in this scope
for (int a : i)
So,
Why does GCC generate an error for nested range based for loop?
What is the scope of range based for loop?
This problem has nothing to do with nested loops.
The following code snippet is nonsense and the compiler ties itself into knots trying to understand it:
int main() {
std::vector<int> v = {0, 1, 2, 3, 4, 5};
for (int i : v) {
for (int a : i)
std::cout << a << ' ';
}
}
The error message thus is also nonsense. It is like feeding the compiler random characters, and the compiler coming back with "missing ;".
In particular:
for (int i : v)
for (int a : i)
The first line declares i as of type int. How could the second line, then, iterate over an int? int is not an array nor is it user-/library-defined.
Types that can be iterated over are arrays, and user/library defined types with a member begin()/end(), and such with a non-member free function begin()/end() in their namespace, whose begin()/end() return something iterator (or pointer) like.
gcc tried to treat it as an iterable object. It isn't an array. It doesn't have a member begin(). It isn't defined in a namespace containing a non-member begin(). This makes gcc give up at that point, and output a message that it could not find the non-member begin() in a nonsense location (as int has no point of definition).
This will generate the same error:
int i;
for( int a:i );
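If the intent of the inner loop was to count up to i, one option is to wrap the int in a type that does have begin() and end(). The sketch below is only a guess at that intent: CountTo and sum_below are hypothetical names, and the code assumes the limit is not negative.

```cpp
// Hypothetical range type: iterating over CountTo{n} yields 0, 1, ..., n-1.
// Assumes limit >= 0; a negative limit would never reach the end iterator.
struct CountTo {
    int limit;

    struct Iterator {
        int value;
        int operator*() const { return value; }
        Iterator& operator++() { ++value; return *this; }
        bool operator!=(const Iterator& other) const {
            return value != other.value;
        }
    };

    Iterator begin() const { return Iterator{0}; }
    Iterator end() const { return Iterator{limit}; }
};

// Range-for works here because CountTo has member begin() and end().
int sum_below(int n) {
    int total = 0;
    for (int a : CountTo{n}) {
        total += a;
    }
    return total;
}
```

The iterator needs only the three operations that the expanded loop uses: operator*, prefix operator++ and operator!=.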
This line
for (int a : i)
makes no sense. If you read the link on range-based loop you provided, you find that the inner loop would be equivalent to the following code,
{
auto && __range = i ;
auto __begin = begin(__range) ;
auto __end = end(__range) ;
for ( ; __begin != __end; ++__begin) {
int a = *__begin;
std::cout << a << ' ';
}
}
The begin and end functions are useful for vectors, maps, ranges etc. because they give iterators. They are also defined by the language for arrays, where they point to the beginning and past the end of the array, so the iterating syntax is the same. They are not defined for a plain int variable.
With this information the error produced by the compiler is completely clear: it refers to the absence of begin(i) in the third line of the transformed code. That it is not declared in the scope where the inner loop appears (which is the outer loop) is just an irrelevant detail at this point; it's not defined anywhere else in the program either.

Why must use reference in ranged-based for loops

I was stuck in a piece of code in C++ Primer and had thought about it more than 1 hour. The code is
for (auto row : ia)//should use &row here
for (auto col : row)
The explanation for it is
We do so in order to avoid the normal array to pointer conversion. Because row is not a reference, when the compiler
initializes row it will convert each array element (like any other object of array type)
to a pointer to that array’s first element. As a result, in this loop the type of row is
int*. The inner for loop is illegal. Despite our intentions, that loop attempts to
iterate over an int*.
I know it has something to do with iterations each time doing for(auto col:row). What I can't understand about
"We do so in order to avoid the normal array to pointer conversion"
is what's the form of "ia" we passed in for the outer loop? Should it be a pointer that points to the address of its first element rather than a "concrete" array? Then what's wrong for the inner loop? I thought it should be the same mechanism with the outer loop.. I can't understand the posts on Q&A. Someone please enlighten me please...What's wrong with my understanding...A good link is also welcomed! Many thanks in advance!
The declaration for ia is
constexpr size_t rowCnt = 3, colCnt = 4;
int ia[rowCnt][colCnt]; // 12 uninitialized elements
// for each row
for (size_t i = 0; i != rowCnt; ++i) {
// for each column within the row
for (size_t j = 0; j != colCnt; ++j) {
// assign the element's positional index as its value
ia[i][j] = i * colCnt + j;
}
}
In general, a range-based for loop declared as:
for ( decl : coll ) {
statement }
is equivalent to the following, if coll provides begin() and end() members:
for (auto _pos=coll.begin(), _end=coll.end(); _pos!=_end; ++_pos )
{
decl = *_pos;
statement
}
or, if that doesn’t match, to the following by using a global begin() and end() taking coll as argument:
for (auto _pos=begin(coll), _end=end(coll); _pos!=_end; ++_pos )
{
decl = *_pos;
statement
}
Now look at the line decl = *_pos; : each time the compiler initializes decl, it converts the array element (like any other object of array type) to a pointer to that array's first element.
In your case, the type of row comes out to be int*, on which the next for loop can't work, and hence it is illegal.
Range-based for loops are based on the begin and end functions which don't work on T* but work on T[N].
The implementation for:
for ( range_declaration : range_expression ) loop_statement
is along the lines of:
{
auto && __range = range_expression ;
for (auto __begin = begin_expr,
__end = end_expr;
__begin != __end; ++__begin) {
range_declaration = *__begin;
loop_statement
}
}
Since in the outer for loop the array would decay to a pointer with auto row, the inner for loop would become illegal.
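The difference between T[N] and T* can be seen by calling std::begin and std::end directly. This is a sketch with invented helper names (element_count, row_count, column_count); the 3x4 dimensions are the ones from the question.

```cpp
#include <cstddef>
#include <iterator>

// Takes the array by reference, so its size N is still part of the type
// and std::begin / std::end can be applied to it.
template <typename T, std::size_t N>
std::size_t element_count(T (&array)[N]) {
    return static_cast<std::size_t>(std::end(array) - std::begin(array));
}

std::size_t row_count() {
    int ia[3][4] = {};
    return element_count(ia);     // ia is an array of 3 rows
}

std::size_t column_count() {
    int ia[3][4] = {};
    return element_count(ia[0]);  // ia[0] is still an array of 4 ints here
}

// By contrast, after `int* p = ia[0];` neither element_count(p) nor
// std::begin(p) would compile: a pointer carries no size.
```

This is the same information the range-based for loop needs, which is why it accepts the array but not the decayed pointer.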
A range-based for loop works with an array type, or with a user-defined type that has begin and end member functions (or for which free begin and end functions can be found).
By the array-to-pointer decay rule, when you pass an array to a function that takes a pointer, the array decays to a pointer.
But when a template takes its parameter by reference, the template type is deduced as an array type:
template<typename T>
void foo(T&);
int main() {
int ia[2][2] = {{0,1}, {2,3}};
auto row_val = ia[0]; //int*
auto& row_ref = ia[0]; //int [2]
foo(row_val); //void foo<int*>(int*&)
foo(row_ref); //void foo<int [2]>(int (&) [2])
}
Here you can see (from the error it creates!) that the types deduced for row_val and row_ref are not the same:
row_val ==> int*
row_ref ==> int [2]
You have to use for(auto& row: ia), as row will then have an array type, not a pointer type. Hence you can use the inner loop for (auto col : row).

Using a C++11 Range for loop to Change the Characters in a string

I read in C++ Primer :
If we want to change the value of the characters in a string, we must define the
loop variable as a reference type (§ 2.3.1, p. 50). Remember that a
reference is just another name for a given object. When we use a
reference as our control variable, that variable is bound to each
element in the sequence in turn. Using the reference, we can change
the character to which the reference is bound.
Further they give this code :
string s("Hello World!!!");
// convert s to uppercase
for (auto &c : s) // for every char in s (note: c is a reference)
c = toupper(c); // c is a reference, so the assignment changes the char in s
cout << s << endl;
The output of this code is HELLO WORLD!!!
I also read :
There is no way to rebind a reference to refer to a different object.
Because there is no way to rebind a reference, references must be
initialized.
Question: Won't this code cause rebinding each time the reference variable c is bound to the next character of string s?
for (auto &c : s)
c = toupper(c);
There's no rebinding of an existing variable, at each iteration the "old" c dies and the "new" c is created again, initialized to the next character. That for loop is equivalent to:
{
auto it = begin(s);
auto e = end(s);
// until C++17: auto it = begin(s), e = end(s);
for(; it!=e; ++it) {
auto &c = *it;
c=toupper((unsigned char)c);
}
}
where you see that, at each iteration, c is re-created and re-initialized.
In other words, a variable declared inside the round parentheses of a range-based for loop has the body of the loop as its scope.
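To check that the two forms really do the same thing, here is a sketch with both written out as functions. The names to_upper_range_for and to_upper_iterators are made up for this example; each declares a fresh reference c inside the loop body on every iteration.

```cpp
#include <cctype>
#include <string>

// Range-based form: a new reference c is created for each element.
std::string to_upper_range_for(std::string s) {
    for (auto& c : s) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return s;
}

// Hand-expanded form: c is declared inside the body, so it is
// initialized once per iteration and never rebound.
std::string to_upper_iterators(std::string s) {
    for (auto it = s.begin(), e = s.end(); it != e; ++it) {
        char& c = *it;
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return s;
}
```

The cast to unsigned char before calling std::toupper avoids passing a negative value when char is signed.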
No. A new reference is initialized for each iteration in the for loop.
for (auto &c : s)
c = toupper(c);
is equivalent to:
for (auto it = s.begin(); it != s.end(); ++it)
{
auto &c = *it;
c = toupper(c);
}
Consider
char s[5] = {'h','e','l','l','o'};
for (int secret_index=0; secret_index<5; ++secret_index) {
char &c = s[secret_index];
c = toupper(c);
}
A new reference (with the same variable name) is initialized on every iteration. That is, the for loop enters and leaves the scope on every iteration.