Arrays and pointers arithmetic confusion - c++

#include <iostream>
using namespace std;
int main ()
{
int numbers[5];
int * p;
p = numbers; *p = 10;
p++; *p = 20;
p = &numbers[2]; *p = 30;
p = numbers + 3; *p = 40;
p = numbers; *(p+4) = 50;
for (int n=0; n<5; n++)
cout << numbers[n] << ", ";
return 0;
}
Fairly new coder here. This is an example taken from the pointers page on cplusplus.com. They talk about dereferencing, the address of operator, and then they talk about the relationship between arrays and pointers. I noticed if I simply printed out a declared array it spits out the address of the array. I'm really trying to understand exactly what some of these lines of code are doing and am having a hard time.
Everything makes sense up until *p=10. I read this as, "the value being pointed to by pointer p is equal to 10" but an array isn't just one value. In this case it's 5... So, whats the deal? Will the compiler just assume we are talking about the first element of the array?
Even more confusing for me is p++;. p right now is the address of the array. How do you do p=p+1 when p is an address?
I understand p= &numbers[2];. As well as the following *p = 30;.
p = numbers + 3; confuses me. Why are we able to take an address and just add 3 to it? And, what does that even mean?
Everything within the for loop makes sense. It's just printing out the array correct?
Thanks for your help in advance!

As Pointed out this due to pointer arithmetic.
So both array numbers and pointer p are int, so after assigning numbers's base address to pointer p and the *p=10 does numbers[0]=10
Now when you do p++ the address of p is incremented but not to the next numeric value as in normal arithmetic operation of ++ but to the address of the next index of numbers.
Let's assume a int is 4byte of memory and numbers's initial address is 1000.
So p++ makes address it to be incremented to 1004 which is numbers[1]
Basically it points to the next int variable position in memory.
So when *p = 20 is encountered the value 20 will be assigned to the next adjacent memory i.e. 1004 or numbers[1]=20.
After that the p directly assigned address of 3rd index numbers[2] i.e 1008 (next int var in memory). Now p is pointing to the 3rd index of numbers so *p=30 is again numbers[2]=30
p = numbers + 3; *p = 40; does similar thing that is assigns base address of numbers incremented by 3 indexes (1000 + 3*sizeof(int) =1012 ==> 1000 + 3*4 =1012) to p and assigns 40 to that. so it becomes equivalent to numbers[3]=40.
Then p = numbers; *(p+4) = 50; is assigning 1000 to p and the adding 4 to the address of p by using the () and assigns 50 to the value by using *() which is value of the address(1000+ 4*4 = 1016) inside the braces.

Using a pointer like this in C++: p = numbers makes p point to the first element of the array. So when you do p++ you are making p point to the second element.
Now, doing *p = 10 assigns 10 to whatever p is pointing to, as in your case p is pointing to the first element of the array, then *p = 10 is the same as numbers[0] = 10.
In case you see this later: *(p + i) = 20, this is pointer arithmetic and is the same as p[i] = 20. That's why p[0] = 10 is equivalent to *(p + 0) = 10 and equivalent to *p = 10.
And about your question: How do you do p=p+1 when p is an address?: as arrays in C++ are stored sequentially in memory, if the element p is pointing at memory address 6400010 then p++ points at 6400014 assuming each element pointed by p occupies 4 bytes. If you wonder why p++ is 6400014 instead of 6400011, the reason is a little magic C++ does: when increasing a pointer, it doesn't increase by 1 byte but by 1 element so if you are pointing to integers, each one occupying 4 bytes, then p will be increased by 4 instead of 1.

Let's break it down:
int numbers[5];
int * p;
Assigning the array variable to a pointer means that the pointer will be pointing to the first element:
p = numbers; // p points to the 1st element
*p = 10; // 1st element will be 10
p++; // go to next location i.e. 2nd element
*p = 20; // 2nd element will be 20
You can do pointer arithmetic and it works according to the type. Incrementing an int pointer on a 32-bit machine will do a jump of 4 bytes in memory. On a 64-bit machine, that will be 8 bytes. Keep in mind that you cannot do this with a void pointer.
p = &numbers[2]; // assign the address of 3rd element
*p = 30; // 3rd element will be 30
p = numbers + 3; // numbers + 3 means the address of 4th element
*p = 40; // 4th element will be 40
p = numbers; // p points to the 1st element
*(p + 4) = 50; // p + 4 means the address of 5th element
// dereference 5th location and assign 50
// 5th element will be 50

There are a lot of pages on the web on how pointers work. But there are a couple things to quickly point out.
int* p;
The above line mean p is a pointer to an int. Assigning something to it doesn't change what it is. It is a pointer to an int, not an array of int.
int number[5];
p = number;
This assignment takes the address where the array is stored, and assigns it to p. p now points to the beginning of that array.
p++;
p now points to the 2nd int in the array. Pointers are incremented by the size of the object they point to. So the actual address may increase by 4 or 8, but it goes to the next address of an int.
*p = 30;
This is an assignment, not a comparison. It doesn't say that what p points to is 30, it copies 30 into the address pointed to by p. Saying "equals" is generally confusing in programming terms, so use either "assign" or "compare" to be clear.

1> p=numbers;
pointer p now points to the start of the array, i.e first element.
2> *p=10;
first element now becomes 10, as p is pointing to the first element.
3> p++;
this moves the pointer to the next element of the array, i.e second element
So, again *p=20; will make second element=20
4> p = &numbers[2];
this statement means that p now points to 3rd element of the array.
So, *p = 30; will make 3rd element=30
5> p = numbers + 3;
Wen we do p=numbers+1; it points to second element, so p=numbers+3; will now point to 4th element of array. *p = 40; make the fourth element=40
6> p = numbers;
Again p points to first element of array
*(p+4) = 50; will now make 5th element=50

Related

C++ references and pointers int∗ p = &a [ 2 ] ;

I am studying C++ and came across this Code
int a[] = {9,8,−5};
int∗ p = &a[2] ;
and was wondering what exactly that means? So a is an Array and p is a pointer. But what does the & before a[] mean?
and what does mean then?
std::cout << (p − a) << "\n";
edit:
so i have read some answeres but what i still dont understand is what the & is really for. Is there a difference between
int∗ p = &a[2] ;
and
int∗ p = a[2] ;
?
Thanks for your help.
This code
int∗ p = &a[2];
is equivalent to this:
int∗ p = a + 2;
so even mathematically you can see that p - a is equal to a + 2 - a so I think result should be obvious that it is equal to 2.
Note name of array can decay to a pointer to the first element but is not that pointer, as many may mistakenly say so. There is significant difference.
About your note, yes there is difference, first expression assigns address of third element to p, second is syntax error as you try to assign int to int * which are different types:
int a[] = { 1, 2, 3 };
int var = a[2]; // type of a[2] is int and value is 3
int *pointer = &a[2]; // type of &a[2] is int * and value is address of element that holds value 3
Subject of your question says that you think & means reference in this context. It only means reference when applied to types, when used in expression it means different things:
int i, j;
int &r = i; // refence when applied to types
&i; // address of in this context
i&j; // binary AND in this
In your case & is used to get the address to a value. So what will happen there is that p will contain the address of a[2], thus p will be a pointer to the third element of the array.
EDIT: Demo (with p-a included)

How to use pointers in for loops to count characters?

I was looking through the forums and saw a question about counting the numbers of each letter in a string. I am teaching myself and have done some research and am now starting to do projects. Here I have printed the elements of the array. But I do so without pointers. I know I can use a pointer to an array and have it increment for each value, but I need some help doing so.
Here is my code without the pointer:
code main() {
char alph [] = {'a', 'b', 'c'};
int i, o;
o = 0;
for(i=0; i < 3; i++)
{ cout << alph[i] << ' '; };
};
Here is my bad code that doesn't work trying to get the pointer to work.
main() {
char alph [] = {'a', 'b', 'c'};
char *p;
p = alph;
for (; p<=3; p++);
cout << *p;
return 0;
};
I hope that it's not too obvious of an answer; I don't mean to waste anyone's time. Also this is my first post so if anyone wants to give me advice, thank you.
Very good try. There's just one tiny thing wrong, which is this:
p <= 3
Pointers are just some number which represents a memory address. When you do p = alph, you're not setting p to 0, you're setting it to point to the same address as alph. When looping over an array with pointers, you have to compare the current pointer with a pointer that is one past the end of the array. To get a pointer to one element past the end of the array, you add the number of elements to the array:
alph + 3 // is a pointer to one past the end of the array
Then your loop becomes
for (; p < alph + 3; ++p)
You may think that getting a pointer to one past the end of the array is going out of bounds of the array. However, you're free to get a pointer to anywhere in memory, as long as you don't dereference it. Since the pointer alph + 3 is never dereferenced - you only use it as a marker for the end of the array - and everything is fine.
Here are some rough correlations for the different versions:
/-----------------------------------\
| Pointer Version | Index Version |
-------------------------------------
| p | i |
| p = alph | i = 0 |
| *p | alph[i] |
| alph + 3 | 3 |
| p < alph + 3 | i < 3 |
\-----------------------------------/
Also note that instead of doing alph + 3, you may want to use sizeof. sizeof gives you the number of bytes that an object occupies in memory. For arrays, it gives you the number of bytes the whole array takes up (but it doesn't work with pointers, so you can do it with alph but not with p, for example). The advantage of using sizeof to get the size of the array is that if you change the size later, you do not have to go and find all the places where you wrote alph + 3 and change them. You can do that like this:
for (; p < alph + sizeof(alph); ++p)
Additional note: because the size of char is defined to be 1 byte, this works. If you change the array to an array of int, for example (or any other type that is bigger than 1 byte) it will not work any more. To remedy this, you divide the total size in bytes of the array with the size of a single element:
for (; p < alph + sizeof(alph) / sizeof(*alph); ++p)
This may be a little complicated to understand, but all you're doing is taking the total number of bytes of the array and dividing it by the size of a single element to find the number of elements in the array. Note that you are adding the number of elements in the array, not the size in bytes of the array. This is a consequence of how pointer arithmetic works (C++ automatically multiplies the number you add to a pointer by the size of the type that the pointer points to).
For example, if you have
int alph[3];
Then sizeof(alph) == 12 because each int is 4 bytes big. sizeof(*alph) == 4 because *alph is the first element of the array. Then sizeof(alph) / sizeof(alph) is equal to 12 / 4 which is 3, which is the number of elements in the array. Then by doing
for (; p < alph + sizeof(alph) / sizeof(*alph); ++p)
that is equivalent to
for (; p < alph + 12 / 4; ++p)
which is the same as
for (; p < alph + 3; ++p)
Which is correct.
This has the good advantage that if you change the array size to 50 and change the type to short (or any other combination of type and array size), the code will still work correctly.
When you get more advanced (and hopefully understand arrays enough to stop using them...) then you will learn how to use std::end to do all this work for you:
for (; p < std::end(alph); ++p)
Or you can just use the range-based for loop introduced in C++11:
for (char c : alph)
Which is even easier. I recommend understanding pointers and pointer arithmetic well before reclining on the convenient tools of the Standard Library though.
Also good job SO, you properly syntax-highlighted my ASCII-Art-Chart.
The pointer loop should be:
for (char * p = alph; p != alph + 3; ++p)
{
std::cout << *p << std::endl;
}
You can get a bit more fancy by hoisting the end of the array out of the loop, and by inferring the array size automatically:
for (char * p = alph, * end = alph + sizeof(alph)/sizeof(alph[0]); p != end; ++p)
In C++11, you can do even better:
for (char c : alph) { std::cout << c << std::endl; }
Your problem (apart from compilation issues noted in my comment to the question) is that pointers are not as small as 3. You need:
for (p = alph; p < alph + sizeof(alph); p++)
for the loop. Note that sizeof() generates a compile-time constant. (In C99 or C2011, that is not always the case; in C++, it is always the case, AFAIK). In this context, sizeof() is a fancy way of writing 3, but if you add new characters to the array, it adjusts automatically, whereas if you write 3 and change things, you have to remember to change the 3 to the new value too.
Ruminations on the use of sizeof()
As Kerrek SB points out, sizeof() returns the size of an array in bytes. By definition, sizeof(char) == 1, so in this context, it was safe to use sizeof() on the array. There are also times when it is not safe - or you have to do some extra work. If you had an array of some other type, you can use:
SomeType array[] = { 2, 3, 5, 7, 11, 13 };
then the number of elements in the array is (sizeof(array)/sizeof(array[0])). That is, the number of elements is the total size of the array, in bytes, divided by the size of one element (also in bytes).
The other big gotcha is that if you 'pass an array' as a function argument, you can't use sizeof() on it to get the correct size - you get the size of a pointer instead. This is a good reason for not using C-style strings or C-style arrays: use std::string and std::vector<SomeType> instead, not least because you can find their actual size reliably with a member function call.
I recommand you the following revised code.
char alph [] = {'a', 'b', 'c', 0};
char *p;
p = alph;
while(*p)
{
std::cout << *p++;
}
0 marks the end of a string in both c and c++.

Pointer arithmetic and dereferencing

In the following code, can anyone please explain to my what the line in bold is doing.
struct southParkRec {
int stan[4];
int *kyle[4];
int **kenny;
string cartman;
};
int main()
{
southParkRec cartoon;
cartoon.stan[1] = 4;
cartoon.kyle[0] = cartoon.stan + 1;
cartoon.kenny = &cartoon.kyle[2];
*(cartoon.kenny + 1) = cartoon.stan; //What does this line do?
return 0;
}
Think of it as
cartoon.kenny[1] = cartoon.stan;
They are basically the same thing
If we bring the whole thing to one common style of using the subscript operator [] (possibly with &) instead of a * and + combination, it will look as follows
cartoon.stan[1] = 4;
cartoon.kyle[0] = &cartoon.stan[1];
cartoon.kenny = &cartoon.kyle[2];
cartoon.kenny[1] = &cartoon.stan[0];
After the
cartoon.kenny = &cartoon.kyle[2];
you can think of kenny as an "array" of int * elements embedded into the kyle array with 2 element offset: kenny[0] is equivalent to kyle[2], kenny[1] is equivalent to kyle[3], kenny[2] is equivalent to kyle[4] and so on.
So, when we do
cartoon.kenny[1] = &cartoon.stan[0];
it is equivalent to doing
cartoon.kyle[3] = &cartoon.stan[0];
That's basically what that last line does.
In other words, if we eliminate kenny from the consideration ("kill Kenny"), assuming that the rest of the code (if any) doesn't depend on it, your entire code will be equivalent to
cartoon.stan[1] = 4;
cartoon.kyle[0] = &cartoon.stan[1];
cartoon.kyle[3] = &cartoon.stan[0];
As for what is the point of all this... I have no idea.
In cartoon you have:
- stan, an array of 4 ints.
- kyle, an array of 4 pointers to int.
- kenny, a pointer to a pointer to int, that is, let's say, a pointer to an "array of ints".
cartoon.stan[1] = 4; sets second element of stan array (an int) to 1.
cartoon.kyle[0] = cartoon.stan + 1; sets first element of kyle array (a pointer to int) to point to the second element of stan array (which we have just set to 4).
cartoon.kenny = &cartoon.kyle[2]; sets kenny pointer to point to the third element of kyle array.
*(cartoon.kenny + 1) = cartoon.stan; sets fourth element of kyle array (a pointer to int) to point to the first element of stan array (which hasn't been initialised yet). More in detail:
cartoon.kenny gets kenny pointer's address (third element of kyle array),
cartoon.kenny+1 gets the next int after that address (fourth element of kyle array, which happens to be a pointer to int),
*(cartoon.kenny + 1) dereferences that pointer, so we can set it, and
= cartoon.stan sets it to point to the first element of stan array.
It is incrementing the pointer at *cartoon.kenny. 'kenny' is a pointer to a pointer, so the first dereference returns a pointer, which is incremented, and a value assigned. So, *(kenny + 1) now points to the beginning of the array 'stan'.
It sets the last element of Kyle to point to the first element of Stan.

Help me understand this Strange C++ code

This was a question in our old C++ exam. This code is driving me crazy, could anyone explain what it does and - especially - why?
int arr[3]={10,20,30};
int *arrp = new int;
(*(arr+1)+=3)+=5;
(arrp=&arr[0])++;
std::cout<<*arrp;
This statement writes to the object *(arr+1) twice without an intervening sequence point so has undefined behavior.
(*(arr+1)+=3)+=5;
This statement writes to the object arrp twice without an intervening sequence point so has undefined behavior.
(arrp=&arr[0])++;
The code could result in anything happening.
Reference: ISO/IEC 14882:2003 5 [expr]/4: "Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression."
(*(arr+1)+=3)+=5;
arr + 1 - element with index 1
*(arr + 1) - value of this element
(arr + 1) += 3 - increase by 3
((arr+1)+=3)+=5 - increase by 5;
so arr[1] == 28
(arrp=&arr[0])++;
arr[0] - value of element 0
&arr[0] - address of element 0
arrp=&arr[0] - setting arrp to point to elem 0
(arrp=&arr[0])++ - set arr to point to elem 1
result: 28
This line:
(*(arr+1)+=3)+=5;
produces the same result as this (see footnote):
arr[1] += 3;
arr[1] += 5;
This line:
(arrp=&arr[0])++;
produces the same result as this (see footnote):
int* arrp = arr+1;
So this line:
std::cout<<*arrp
prints out 28.
But this code leaks memory because int *arrp = new int; allocates a new int on the heap which will be lost on assignment by (arrp=&arr[0])++;
Footnote: Of course I'm assuming an absence of weirdness.
Edit: Apparently some of the lines in fact lead to undefined behavior, due to C++ Standard 5/4. So this really is a crappy exam question.
int arr[3]={10,20,30}; // obvious?
int *arrp = new int; // allocated memory for an int
(*(arr+1)+=3)+=5; // (1)
(arrp=&arr[0])++; // (2)
std::cout<<*arrp; // (3)
(1)
*(arr+1) is the same as arr[1], which means that *(arr+1)+=3 will increase arr[1] by 3, so arr[1] == 23 now.
(*(arr+1)+=3)+=5 means arr[1] is increased by another 5, so it will be 28 now.
(2)
arrp will pont to the address of the first element of arr (arr[0]). The pointer arrp will then be incremented, thus it will point to the second element after the entire statement is executed.
(3)
Prints what arrp points to: the second element of arr, meaning 28.
Well, remember that arrays can be interpreted as pointers
int arr[3]={10,20,30};
int *arrp = new int;
creates an array arr of three integers and an int pointer that gets assigned with a freshly allocated value.
Since assignment operators return a reference to the value that has been assigned in order to allow multi-assignment,
(*(arr+1)+=3)+=5;
is equivalent to
*(arr+1)+=3;
*(arr+1)+=5;
*(arr + 1) refers to the first element of the array arr, therefore arr[1] is effectively increased by eight.
(arrp=&arr[0])++; assigns the address of the first array element to arrp and afterward increments this pointer which now points to the second element (arr[1] again).
By dereferencing it in std::cout<<*arrp, you output arr[1] which now holds the value 20 + 3 + 5 = 28.
So the code prints 28 (and furthermore creates a memory-leak since the new int initially assigned to arrp never gets deleted)
I'll try to answer you by rewriting the code in a simpler way.
int arr[3]={10,20,30};
int *arrp = new int;
(*(arr+1)+=3)+=5;
(arrp=&arr[0])++;
std::cout<<*arrp;
=== equals ===
int arr[3]={10,20,30};//create array of 3 elements and assign them
int *arrp = new int;//create an integer pointer and allocate an int to it(useless)
//(*(arr+1)+=3)+=5;
arr[1] = arr[1] + 3;//arr[1] == arr+1 because it is incrementing the arr[0] pointer
arr[1] = arr[1] + 5;
//(arrp=&arr[0])++;
arrp = &arr[0];//point the integer pointer to the first element in arr[]
arrp++;//increment the array pointer, so this really is now pointing to arr[1]
std::cout<<*arrp;//just print out the value, which is arr[1]
I am assuming you understand pointers and basic c.

Change pointer to an array to get a specific array element

I understand the overall meaning of pointers and references(or at least I think i do), I also understand that when I use new I am dynamically allocating memory.
My question is the following:
If i were to use cout << &p, it would display the "virtual memory location" of p.
Is there a way in which I could manipulate this "virtual memory location?"
For example, the following code shows an array of ints.
If I wanted to show the value of p[1] and I knew the "virtual memory location" of p, could I somehow do "&p + 1" and obtain the value of p[1] with cout << *p, which will now point to the second element in the array?
int *p;
p = new int[3];
p[0] = 13;
p[1] = 54;
p[2] = 42;
Sure, you can manipulate the pointer to access the different elements in the array, but you will need to manipulate the content of the pointer (i.e. the address of what p is pointing to), rather than the address of the pointer itself.
int *p = new int[3];
p[0] = 13;
p[1] = 54;
p[2] = 42;
cout << *p << ' ' << *(p+1) << ' ' << *(p+2);
Each addition (or subtraction) mean the subsequent (prior) element in the array. If p points to a 4 byte variable (e.g. int on typical 32-bits PCs) at address say 12345, p+1 will point to 12349, and not 12346. Note you want to change the value of what p contains before dereferencing it to access what it points to.
Not quite. &p is the address of the pointer p. &p+1 will refer to an address which is one int* further along. What you want to do is
p=p+1; /* or ++p or p++ */
Now when you do
cout << *p;
You will get 54. The difference is, p contains the address of the start of the array of ints, while &p is the address of p. To move one item along, you need to point further into the int array, not further along your stack, which is where p lives.
If you only had &p then you would need to do the following:
int **q = &p; /* q now points to p */
*q = *q+1;
cout << *p;
That will also output 54 if I am not mistaken.
It's been a while (many years) since I worked with pointers but I know that if p is pointing at the beginning of the array (i.e. p[0]) and you incremented it (i.e. p++) then p will now be pointing at p[1].
I think that you have to de-reference p to get to the value. You dereference a pointer by putting a * in front of it.
So *p = 33 with change p[0] to 33.
I'm guessing that to get the second element you would use *(p+1) so the syntax you'd need would be:
cout << *(p+1)
or
cout << *(++p)
I like to do this:
&p[1]
To me it looks neater.
Think of "pointer types" in C and C++ as laying down a very long, logical row of cells superimposed on the bytes in the memory space of the CPU, starting at byte 0. The width of each cell, in bytes, depends on the "type" of the pointer. Each pointer type lays downs a row with differing cell widths. A "int *" pointer lays down a row of 4-byte cells, since the storage width of an int is 4 bytes. A "double *" lays down a 8-byte per-cell row; a "struct foo *" pointer lays down a row with each cell the width of a single "struct foo", whatever that is. The "address" of any "thing" is the byte offset, starting at 0, of the cell in the row holding the "thing".Pointer arithmetic is based on cells in the row, not bytes. "*(p+10)" is a reference to the 10th cell past "p", where the cell size is determined by the type of p. If the type of "p" is "int", the address of "p+10" is 40 bytes past p; if p is a pointer to a struct 1000 bytes long, "p+10" is 10,000 bytes past p. (Note that the compiler gets to choose an optimal size for a struct that may be larger than what you'd think; this is due to "padding" and "alignment". The 1000 byte struct discussed might actually take 1024 bytes per cell, for example, so "p+10" would actually be 10,240 bytes past p.)