So, I am learning about pointers via http://cplusplus.com/doc/tutorial/pointers/ and I do not understand anything about the pointer arithmetic section. Could someone clear things up or point me to a tutorial that I might understand better?
I am especially confused by the parenthesized forms, such as the difference between *p++, (*p)++, *(p++), etc.
*p++
For this one, ++ has higher precedence than *, so it is parsed as *(p++): the pointer is incremented by one element, but the value at the original location is retrieved, since post-increment yields the pointer's original value and only then increments it.
(*p)++
This forces the grouping in the other direction, so the pointer is dereferenced first and then the value at that location is incremented by one (the expression yields the value from before the increment, and the pointer itself does not move).
*(p++)
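The parentheses only spell out the grouping that precedence already gives, so it acts the same as the first one: the pointer is incremented, and the original location is dereferenced.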
An important thing to note, is that the amount the pointer is incremented is affected by the pointer type. From the link you provided:
char *mychar;
short *myshort;
long *mylong;
char is one byte long, so ++ increases the pointer by 1 (since pointers point to the beginning of each byte).
short is typically two bytes long, so ++ increases the pointer by 2 in order to point at the start of the next short rather than the start of the next byte.
long is four bytes long in the tutorial's example (it is eight on many 64-bit platforms), so ++ increases the pointer by sizeof(long).
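Some years ago I found an explanation of strcpy from Kernighan/Ritchie useful (I don't have the text available now, so I hope the code is accurate). cpy_0, cpy_1 and cpy_2 all copy the string the way strcpy does; unlike strcpy, they return a pointer just past the copied terminator instead of the destination: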
char *cpy_0(char *t, const char *s)
{
int i = 0;
for ( ; s[i]; i++) // stop at the end of the source, not of the destination
t[i] = s[i];
t[i] = s[i];
i++;
return t + i;
}
char *cpy_1(char *t, const char *s)
{
for ( ; *s; ++s, ++t)
*t = *s;
*t = *s;
++t;
return t;
}
char *cpy_2(char *t, const char *s)
{
while (*t++ = *s++)
;
return t;
}
First you have to understand what post-increment does.
Post-increment increases the variable by one, BUT the expression (p++) yields the original value of the variable for use in the rest of the expression.
char data[] = "AX12";
char* p;
p = data;
char* a = p++;
// a -> 'A' (the original value of p was returned from p++ and assigned to a)
// p -> 'X'
p = data; // reset;
char l = *(p++);
// l = 'A'. The (p++) increments the value of p but yields the original
// value to be used in the remaining expression. Thus it is the
// original value that gets dereferenced by *, making l 'A'.
// p -> 'X'
Now because of operator precedence:
*p++ is equivalent to *(p++)
Finally we have the complicated one:
p = data;
char m = (*p)++;
// m is 'A'. First we dereference 'p', which gives us a reference to 'A'.
// Then we apply the post-increment, which applies to the value 'A' and makes it a 'B'.
// But the original value ('A') is what gets used in the assignment to 'm'.
// Note 1: The increment was done on the original array.
// data[] is now "BX12";
// Note 2: Because it was the value that was post-incremented, p is unchanged.
// p -> 'B' (Not 'X')
*p++
Yields the content, *p, and then increases the pointer's value (post-increment). For example:
int numbers[2];
int *p;
p = &numbers[0];
*p = 4; //numbers[0] = 4;
*(p + 1) = 8; //numbers[1] = 8;
int a = *p++; //a = 4 (the increment takes place after the evaluation)
//*++p would have returned 8 (a = 8)
int b = *p; //b = 8 (p is now pointing to the next integer, not the initial one)
And about:
(*p)++
It increases the value of the content, like *p = *p + 1; (the pointer itself is unchanged).
(p++); //same as p++
Increases the pointer so it points to the next element (which may not exist); the step is the size of the type named when you declared the pointer.
I have a variable that is a pointer to a constant pointer to a constant char.
char const * const * words;
I then add the word "dog" to that variable.
words = (char const * const *)"dog";
However, when I debug the code, it states this about words:
{0x616d7251 Error reading characters of string.}
My question is: how do I properly access the characters of that variable so that I can record each individual character of the string?
Here is some example code below:
char const * const *words;
words = (char const * const *)"dog";
for (int i = 0; i < 5; ++i)
{
char c = (char)words[i]; // Gives me, -52'i symbol', 'd', and then '\0'
// How do I access the 'o' and 'g'?
}
Thanks for the help in advance.
maybe you mean this
char const * const words = "dog";
for (int i = 0; i < strlen(words); ++i)
{
char c = words[i];
}
Of course, in C++ code you should really be using std::string.
Short answer:
Your program has undefined behavior. To remove the undefined behavior use:
char const * word = "dog";
for (int i = 0; i < std::strlen(word); ++i)
{
char c = word[i];
}
or
char const * word = "dog";
char const * const *words = &word;
for (int i = 0; i < std::strlen(*words); ++i)
{
char c = (*words)[i];
}
Long answer:
You are forcing a cast from char const* to char const* const* and treating the location of memory that was holding chars as though it is holding char const*s. Not only that, you are accessing memory using an out of bounds index.
Let's say the string "dog" is held at some memory location (it takes four bytes, including the null character), and call its address a1.
  a1
  |
  v
+---+---+---+----+
| d | o | g | \0 |
+---+---+---+----+
You can treat the address a1 as the value of a pointer of type char const*. That won't be a problem at all. However, by using:
words = (char const * const *)"dog";
You are treating the memory at a1 as though it holds objects of type char const*.
Let's assume for a moment that you have a machine that uses 4 bytes for a pointer and uses little endian for pointers.
words[0] evaluates to a pointer. Its value will be:
'd' in decimal +
256 * 'o' in decimal +
256*256 * 'g' in decimal +
256*256*256 * '\0' in decimal.
After that, you truncate that value to char, which will give you back the character d, which is not too bad. The fun part (undefined behavior) begins when you access words[1].
words[1] is the same as *(words+1). words+1 evaluates to a pointer whose value is the address a2, one pointer size (4 bytes here) past a1.
                  a2
                  |
                  v
+---+---+---+----+
| d | o | g | \0 |
+---+---+---+----+
As you can see, it points to memory that is beyond what the compiler allocated for you. Dereferencing that pointer causes undefined behavior.
You are consistently missing the second *.
Ignoring the const stuff, you are declaring a char** words, which is a pointer to a pointer to a char. You won't get a word or many words into that, and casting just hides the problem.
To get "dog" accessible through such a pointer, you need to take an extra step: make a variable that points to "dog", and put its address into words.
I am trying to grasp pointers and I have this simple code for which I need some explanation.
I need to copy one char array to another. In my main function I have this code:
const int MAX_SIZE = 100;
char x[MAX_SIZE] = "1234565";
char* y = new char[MAX_SIZE];
copyArray(x, y);
std::cout << y;
delete [] y;
Now comes the question: how does this code (which works just fine):
while ((*dest = *source) != '\0')
{
dest += 1;
source += 1;
}
Differ from this (gives strange characters at the end):
while (*source != '\0')
{
*dest = *source;
dest += 1;
source += 1;
}
Looking at this it seems those two functions are pretty similar.
It makes sense that we are copying until we reach a null-terminator in the source string, right (2nd function)?
But it's not working correctly - I get some strange characters at the end of the copied array. However, the first function works just fine.
void copyArray(const char* source, char* dest);
The form
while ((*dest = *source) != '\0')
{
dest += 1;
source += 1;
}
guarantees that the character is assigned ((*dest = *source)) before the condition is tested, so the terminating '\0' is copied to dest before the comparison with '\0' ends the loop.
The second version doesn't copy the terminating '\0' character, because the loop ends before the
*dest = *source;
statement is reached for that character.
(*dest = *source) is an expression with a value, just like the 1+1 part of int i = 1+1; so, after it is evaluated, its value can be used in another expression.
The difference is that in ((*dest = *source) != '\0'), the value of *source is first assigned to *dest, and then the whole expression is compared with '\0' (the assignment expression has the value that was just stored, i.e. *source). In the second version, *source is only read to evaluate *source != '\0'; nothing is assigned while the condition is evaluated.
EDIT
user0042 makes a really acute observation: written this way, the following code
while ((*dest = *source) != '\0')
{
dest += 1;
source += 1;
}
ensures that the copied string ends with '\0'.
In the example you are incrementing the pointers x and y themselves, so by the end of the loop they point at the null terminator rather than at the first character. You therefore have to keep the starting addresses in temporary variables:
const char* x = "1234565"; // a string literal must not be modified
char* y = new char[MAX_SIZE];
// Temporary pointers to hold the first element's address
const char* tmp1 = x;
char* tmp2 = y;
while( (*y = *x) != '\0'){
x += 1; // X no longer points to the first element
y += 1; // Y no longer points to the first element
}
std::cout << tmp2;
You can use a do while loop instead of while:
const char* x = "1234565";
const int size = strlen(x);
char* y = new char[size + 1]; // +1 leaves room for the terminating '\0'
const char* tmp1 = x;
char* tmp2 = y;
do{
*y = *x;
x += 1;
y += 1;
}while( *(x - 1) != '\0');
// No need to add the null terminator afterwards: the loop already copied it
// tmp2[size] = '\0';
std::cout << tmp2;
A plain while loop that tests *x first would stop at the '\0' before copying it to y. So I made the loop test not the current character but the previous one (*(x - 1)), i.e. the character that was just copied, which ensures that the '\0' is added to y.
In my book, it says: Pointers are addresses and have a numerical value. You can print out the value of a pointer as cout << (unsigned long)(p).
Write code to compare p, p+1, q, and q+1. Explain the results. I'm not sure what the book wants me to do, so here's what I have. Does anyone know if I am doing this right?
int num = 20;
double dbl = 20.0;
int *p = &num;
double *q = &dbl;
cout << (unsigned long)(q) << endl;
q = q + 1;
cout << (unsigned long)(q) << endl;
cout << (unsigned long)(p) << endl;
p = p + 1 ;
cout << (unsigned long)(p) << endl;
Assuming it's the pointer arithmetic you have problems with, let me try to show how it's done in a more "graphical" way:
Let's say we have a pointer variable ptr which points to an array of integers, something like
int array[4] = { 1234, 5678, 9012, 3456 };
int* ptr = array; // Makes `ptr` point to the first element of `array`
In memory it looks something like
+------+------+------+------+
| 1234 | 5678 | 9012 | 3456 |
+------+------+------+------+
^      ^      ^      ^
|      |      |      |
ptr    ptr+1  ptr+2  ptr+3
The first is technically ptr+0
When adding one to a pointer, you go to the next element in the "array".
Perhaps now you start to see some similarities between pointer arithmetic and array indexing. And that is because there is a common thread here: For any pointer or array p and valid index i, the expression p[i] is exactly the same as *(p + i).
Using the knowledge that p[i] is equal to *(p + i) makes it easier to understand how an array can be used as a pointer to its first element. We start with a pointer to the first element of array (as defined above): &array[0]. This is equal to the expression &*(array + 0). The address-of (&) and dereference (*) operators cancel each other out, leaving us with (array + 0). Adding zero changes nothing and can be dropped as well, so now we have (array). And finally we can remove the parentheses, leaving us with array. That means that &array[0] is equal to array.
You are doing it right if you want to print the decimal representation of the addresses held in your pointers.
If you wonder why the results are what they are, you need to learn pointer arithmetic. If you add 1 to any pointer, its address will be increased by sizeof(<type>), where type is the type of the variable your pointer points to.
So if you have a pointer to int and increment it, the address will be increased by sizeof(int), which is, most likely, four.
If I add 1 to a pointer, the actual value added will be the size of the type that the pointer points to right? For example:
int* num[5];
cout << *num << ", " << *(num + 2) << endl;
This will print the value stored at num[0] and at num[2],
so num + 2 is actually num + 2*sizeof(int) if I'm not wrong.
Now, if I initialize an array of pointers to char to string literals, like this one:
char* ch[5] =
{
"Hi",
"There",
"I,m a string literal"
};
This can be done because a string literal like "hi" represents the address of its first character, in this case 'h'. Now my question is how can I write something like:
cout << *(ch + 2);
and get "I,m a string literal" as the output?
Since the pointer points to char, shouldn't adding 2 to the pointer actually be (ch + 2*sizeof(char)) ? giving me the output 'There' ?
Does it have something to do with cout? Does cout search the memory of the pointed-to values to see if it finds '\0's, recognizing the contents as strings and then modifying pointer arithmetic? But then adding 1 to a pointer to char pointing to strings would mean adding a different number of bytes (instead of the size of a char) every time, since a string can be any size. Or am I totally wrong? I'm sorry, I am new to C++, and programming in general.
The array isn't storing chars, it's storing char *s. Hence ch + 2 advances the address by 2*sizeof(char *) bytes, not 2*sizeof(char). You then dereference that, which gives the pointer to "I,m a string literal".
Your initial example shows the confusion:
int* num[5];
cout << *num << ", " << *(num + 2) << endl;
This is an array of pointers-to-int. Hence, *(num + 2) reads the element 2*sizeof(int *) bytes past the start of the array, not 2*sizeof(int) bytes. Let's demonstrate this with a small program:
#include <iostream>
int main()
{
int *num[3];
int x, y, z;
x = 1;
y = 2;
z = 3;
num[0] = &x;
num[1] = &y;
num[2] = &z;
std::cout << *(num + 2) << "\n";
}
This will print out a memory address (like 0x22ff28) because the array holds pointers, not values.
In C and C++, arrays and pointers are very similar (many books claim they are exactly the same; this is not quite true, but it is true in a lot of situations).
Your first example should be int num[5]. Then *(num + 2) (which is equivalent to num[2]) reads the int located 2*sizeof(int) bytes past the start of the array. Hopefully this clears up your confusion somewhat.
"If I add 1 to a pointer, the actual value added will be the size of the type that the pointer points to right?"
There's no guarantee in the C++ Standard that the value of a pointer is the number of the byte in memory that it points to. If you add an integer n to a pointer, the result is a pointer to the nth next element in that array:
int iarr[10];
int* pi = iarr; // pi points to iarr[0]
int* pi2 = pi+2; // pi2 points to iarr[2]
What you get when you look at, e.g. int repr = (int)pi; is not defined by the C++ Standard.
What will happen on the most popular platforms/implementations, is that
(int)pi2 == ((int)pi) + 2*sizeof(int)
When you have arrays of pointers, the exact same thing happens:
int* piarr[10];
int** ppi = piarr; // ppi points to piarr[0]
int** ppi2 = piarr+2; // ppi2 points to piarr[2]
Note that the type of piarr is array of 10 pointer to int, therefore the elements of that array have the type pointer to int. A pointer to an element of that array consequently has the type pointer to pointer to int.
char* ch[5] is an array of 5 pointers to char.
"Hi" etc. are (narrow) string literals. A (narrow) string literal is an array of n const char, where n is the length of the string plus 1 (for the terminating \0 character). Arrays can be implicitly converted to pointers to the first element of the array; this is what happens here:
char* ch[5] =
{
"Hi",
"There",
"I,m a string literal"
};
The array ch contains five pointers to char; the three initializers set the first three, and the remaining two are null. As the first three have been obtained by converting arrays to pointers, each of them points to the first element of an array of char: the pointer ch[0] (the first element of the array ch) points to the first element of the array "Hi", ch[1] points to the first element of "There" and so on.
Note there's also a conversion involved from const char to char (the literals are arrays of const char, but the pointers are to plain char). It was deprecated in C++03 and is ill-formed since C++11, so it should be avoided. The better form would be:
char const* ch[5] =
{
"Hi",
"There",
"I,m a string literal"
};
The expression *(ch + 2) is interpreted as follows:
ch names the array (see above).
ch + 2 first implicitly converts ch from array of 5 pointers to char to pointer to pointer to char, a pointer to the first element of the array ch. The type of this expression is therefore pointer to pointer to char.
ch + 2 then makes the pointer from the last step point to the second next element; it pointed to the first element of ch, so it now points to the third element of the array ch.
*(ch + 2): finally, the * dereferences the pointer and "fetches" the object pointed to. The pointer created by ch + 2 points to the 3rd element of the array ch, therefore this expression resolves to the third element of the array ch. The type of the expression is now pointer to char.
The result of the expression is passed to operator<< for std::ostream. As the type of the expression is pointer to char, cout prints the string it points to: the one referred to by the third element of the array ch.
In C, a character is represented by the datatype char. It occupies a single byte and can hold any ASCII character; depending on the platform, its range is typically -128 to 127 or 0 to 255.
A string, however, is usually handled through a char*, which points to the first element of an array of chars. There's a difference: a char is not the same as a char*. The former stores a single character, and the latter stores a memory address, the location where the string starts.
Now, in your example, ch is not a char* but an array of char*, which converts to a char** when used in an expression. That is, it is an array of strings. If we dereference ch once, as in *ch, we get the first string: Hi. If we dereference it twice, as in **ch, we get the first character of the first string: H. So, we can start working with pointer arithmetic!
cout << *(ch + 2) will output I,m a string literal
cout << **(ch + 1) will output T (first character of second string)
cout << *(*ch + 1) will output i (second character of first string)
Keep working with these examples to better understand how characters and strings are output! It's all about pointer arithmetic!
Does it have something to do with cout?
No:
const char* cstrings[5] =
{
"Hi",
"There",
"I,m a string literal"
};
const char** x = cstrings + 2;
cout << *x << endl;
But what can be confusing is that the << operator works differently when given a pointer to a cstring--instead of outputting the address, it outputs the string. Here is an example:
int x = 10;
int* pint = &x;
const char* pstr = "hello";
cout << pint << endl << pstr << endl;
--output:--
0x7fff5fbff85c //hexadecimal string representation of the address
hello
Since the pointer points to char,
1) The literal strings are stored in your array as pointers. That's why the type of the array is pointer.
2) Pointers are addresses in memory, which on typical platforms are represented as integers.
3) So your array of pointers is stored much like an array of integers.
This is probably a stupid question, but I don't understand why this works:
int** test = new int*[7];
int x = 7;
*(test+1) = &x;
cout << (**(test+1));
test is a pointer to a pointer right? The second pointer points to the array, right?
In my understanding, I would need to dereference the "test" pointer first to get to the pointer that has the array.
(*test) // Now I have int*
*((*test) + 1) // to access the first element.
Where is my faulty thinking?
int** test = new int*[7];
+------++------++------++------++------++------++------+
| int* || int* || int* || int* || int* || int* || int* |
+------++------++------++------++------++------++------+
is the equivalent of an array with int pointers:
int* test[0]
int* test[1]
...
int* test[6]
this
int x = 7;
*(test+1) = &x;
+------++------++------++------++------++------++------+
| int* || &x || int* || int* || int* || int* || int* |
+------++------++------++------++------++------++------+
is the same as
int x = 7;
test[1] = &x;
so now one of the pointers in your original array is pointing to the memory location of x
cout << (**(test+1));
is the same as
cout << *test[1];
which is the value of x (== 7), the object that both test[1] and &x point to.
Is your misunderstanding that you think you have created a pointer to an array of 7 int? You haven't. You actually have created an array of 7 pointers to int. So there is no "second pointer" here that would point to an array. There is just one pointer that points to the first of the 7 pointers (test).
And with *test you get that first pointer, which you haven't initialized yet. If you added 1 to that, you would be adding 1 to some indeterminate address. But if you add 1 to test, you get a pointer that points to the second pointer of the array. And dereferencing that gives you that second pointer, which you did initialize.
What you describe would be achieved by a different syntax
typedef int array[7];
array* test = new int[1][7];
// Note: "test" is a pointer to an array of int.
// There are already 7 integers! You cannot make it
// point to an int somehow.
*(*test + 1) = 7;
int *p1 = *test;
int i1 = *(p1 + 1); // i1 is 7, second element of the int[7]
delete[] test;
Without using the typedef, this looks like the following
int(*test)[7] = new int[1][7];
That is, you have created a one-element array whose element type is a 7-element array of int. new gives you back a pointer to the first element of that array. Note that the parentheses are important: the * has lower precedence than the [7], so otherwise this would be taken as an array of 7 pointers to int.
Suppose that
test[0] = 0x12345678; // some pointer value
test[1] = 0x23456789; // some pointer value
*test = 0x12345678;
*test + 1 is now 0x12345678 advanced by one int (0x1234567C with a 4-byte int);
(The * or dereference operator has higher precedence than binary +, so the expression is evaluated in that order.)
However, what you wanted was to get to test[1] = 0x23456789;
So the correct expression to get to test[1] is (*(test + 1)).
In general arr[i] is *(arr + i).
EDIT 2:
given
int buf[10] = {0, 1, 2};
int *p = buf;
buf[0] == p[0] == *(p + 0) equal to 0.
Note that it is perfectly fine to use array access syntax with the lvalue expression p even if it is not an array type. In fact the expression buf[0] is internally translated by the compiler to *(buf + 0) as well.
The expression *(test + 1) is equivalent to test[1], so your code could be rewritten thus:
int** test = new int*[7];
int x = 7;
test[1] = &x;
cout << *test[1];
Since test[1] obviously points to x, *test[1] is 7.
Just to be clear, the expression **(test + 1) is simply equivalent to *(*(test + 1)), which is in turn equivalent to *test[1].