C++ pointer to char arithmetic - c++

If I add 1 to a pointer, the actual value added will be the size of the type that the pointer points to right? For example:
int* num[5];
cout << *num << ", " << *(num + 2) << endl;
This will print the value stored at num[0] and at num[2],
so num + 2 is actually num + 2*sizeof(int) if I'm not wrong.
Now, if I initialize an array of pointers to char to string literals, like this one:
char* ch[5] =
{
"Hi",
"There",
"I,m a string literal"
};
This can be done because a string literal like "hi" represents the address of its first character, in this case 'h'. Now my question is how can I write something like:
cout << *(ch + 2);
and get "I,m a string literal" as the output?
Since the pointer points to char, shouldn't adding 2 to the pointer actually be (ch + 2*sizeof(char)) ? giving me the output 'There' ?
Does it have something to do with cout? Does cout search the memory of the pointed-to values for '\0's, recognize the contents as strings, and then modify the pointer arithmetic? But then adding 1 to a pointer to char pointing to strings would mean adding a different number of bytes (instead of the size of a char) every time, since a string can be any size. Or am I totally wrong? I'm sorry, I am new to C++, and programming in general.

The array isn't storing chars, it's storing char *s. Hence ch + 2 advances the address by 2*sizeof(char *) bytes, not 2*sizeof(char). You then dereference that, which gives you the pointer to "I,m a string literal".
Your initial example shows the confusion:
int* num[5];
cout << *num << ", " << *(num + 2) << endl;
This is an array of pointers-to-int. Hence, *(num + 2) reads the element located 2*sizeof(int *) bytes past the start of the array, not 2*sizeof(int) bytes. Let's demonstrate this with a small program:
#include <iostream>
int main()
{
int *num[3];
int x, y, z;
x = 1;
y = 2;
z = 3;
num[0] = &x;
num[1] = &y;
num[2] = &z;
std::cout << *(num + 2) << "\n";
}
This will print out a memory address (like 0x22ff28) because it is holding pointers, not values.
In C and C++, arrays and pointers are very similar (many books claim they are exactly the same. This is not -quite- true, but it is true in a lot of situations).
Your first example should be int num[5]. Then *(num + 2) (which is equivalent to num[2]) reads the int located 2*sizeof(int) bytes past the start of the array. Hopefully this clears up your confusion somewhat.

"If I add 1 to a pointer, the actual value added will be the size of the type that the pointer points to right?"
There's no guarantee in the C++ Standard that the value of a pointer is the number of the byte in memory that it points to. If you add an integer n to a pointer into an array, the result is a pointer to the element n positions further along in that array:
int iarr[10];
int* pi = iarr; // pi points to iarr[0]
int* pi2 = pi+2; // pi2 points to iarr[2]
What you get when you look at, e.g. int repr = (int)pi; is not defined by the C++ Standard.
What will happen on the most popular platforms/implementations, is that
(int)pi2 == ((int)pi) + 2*sizeof(int)
When you have arrays of pointers, the exact same thing happens:
int* piarr[10];
int** ppi = piarr; // ppi points to piarr[0]
int** ppi2 = piarr+2; // ppi2 points to piarr[2]
Note that the type of piarr is array of 10 pointer to int, therefore the elements of that array have the type pointer to int. A pointer to an element of that array consequently has the type pointer to pointer to int.
char* ch[5] is an array of 5 pointers to char.
"Hi" etc. are (narrow) string literals. A (narrow) string literal is an array of n const char, where n is the length of the string plus 1 (for the terminating \0 character). Arrays can be implicitly converted to pointers to the first element of the array; this is what happens here:
char* ch[5] =
{
"Hi",
"There",
"I,m a string literal"
};
The array ch contains five pointers to char. The first three are initialized from the string literals; the remaining two are value-initialized to null pointers. As the first three have been obtained by converting arrays to pointers, each of them points to the first element of an array of char: The pointer ch[0] (the first element of the array ch) points to the first element of the array "Hi", ch[1] points to the first element of "There" and so on.
Note there's also a conversion involved from const char to char, which was deprecated in C++03 and is ill-formed since C++11, so it should be avoided. The better form would be:
char const* ch[5] =
{
"Hi",
"There",
"I,m a string literal"
};
The expression *(ch + 2) is interpreted as follows:
ch names that array (see above)
ch + 2 implicitly converts ch from array of 5 pointers to char to pointer to pointer to char, a pointer pointing to the first element of the array ch. The type of this expression therefore is pointer to pointer to char.
ch + 2 makes the pointer from the last step now point to the second next element; it pointed to the first element of ch, so it now points to the third element of the array ch.
*(ch + 2) finally, the * dereferences the pointer and "fetches" the object pointed to. The pointer created by ch + 2 points to the 3rd element of the array ch, therefore, this expression resolves into the third element of the array ch. The type of the expression now is pointer to char.
The result of the expression is passed to operator<< of std::cout. As the type of the expression is pointer to char, cout will print the string it points to: the one referenced by the third element of the array ch.

In C, a character is represented by the datatype char. It can hold any ASCII character; whether it is signed or unsigned is implementation-defined, so its range is typically -128 to 127 or 0 to 255. Moreover, its size is exactly one byte.
A string, however, is commonly handled through a char*, which points to the first element of an array of chars. There's a difference. A char is not the same as a char*. The former stores a single character, and the latter stores a memory address: the location where the string starts.
Now, in your example, ch is not a char* but an array of char*, which decays to a char** in expressions. That is, it is an array of pointers to strings, or loosely speaking, an array of strings. If we dereference ch once, as in *ch, we will get the first string: Hi. If we dereference it twice, as in **ch, we will get the first character of the first string: H. So, we can start working with pointer arithmetic!
cout << *(ch + 2) will output I,m a string literal
cout << **(ch + 1) will output T (first character of second string)
cout << *(*ch + 1) will output i (second character of first string)
Keep on working with these examples to understand better how characters and strings are output! It's all about pointer arithmetic!

Does it have something to do with cout?
No:
const char* cstrings[5] =
{
"Hi",
"There",
"I,m a string literal"
};
const char** x = cstrings + 2;
cout << *x << endl;
But what can be confusing is that the << operator works differently when given a pointer to a cstring--instead of outputting the address, it outputs the string. Here is an example:
int x = 10;
int* pint = &x;
const char* pstr = "hello";
cout << pint << endl << pstr << endl;
--output:--
0x7fff5fbff85c //hexadecimal representation of the address stored in pint
hello
Since the pointer points to char,
1) The string literals are stored in your array as pointers. That's why the element type of the array is a pointer.
2) Pointers are addresses in memory, which on common platforms are represented as integers.
3) So your array of pointers is, in effect, an array of addresses, and ch + 2 steps over two pointer-sized elements, not two chars.

Related

Pointer Arithmetic (adding ints to arrays)

So I read online that if you have an array like
int arr[3] = {1,2,3};
I read that if you take (arr+n) any number n it will just add sizeof(n) to the address so if n is an integer(takes up 4 bytes) it will just add 4 right? But then I also experimented on my own and read some more stuff and found that
arr[i] = *(arr+i) for any i, which means it's not just adding the sizeof(i) so how exactly is this working?
Because obviously arr[0] == *(arr+0) and arr[1] == *(arr+1) so it's not just adding sizeof(the number) what is it doing?
I read that if you take (arr+n) any number n it will just add sizeof(n) to the address
This is wrong. When n is an int (or an integer literal of type int) then sizeof(n) is the same as sizeof(int), and that is a compile-time constant unrelated to the value of n. What actually happens is that arr first decays to a pointer to the first element of the array. Then sizeof(int) * n is added to the pointer's address (because the element type is int):
1   2   3
^       ^
|       |
arr     arr+2
This is because each element in the array occupies sizeof(int) bytes, and to get to the memory address of the next element you have to add sizeof(int).
[...] and read some more stuff and found that arr[i] = *(arr+i)
This is correct. For C-style arrays, arr[i] is just a shorthand way of writing *(arr+i).
When you write some_pointer + x then how much the pointer value is incremented depends on the type of the pointer. Consider this example:
#include <iostream>
int main(void) {
int * x = 0;
double * y = 0;
std::cout << x + 2 << "\n";
std::cout << y + 2 << "\n";
}
Possible output is
0x8
0x10
because x is incremented by 2 * sizeof(int) while y is incremented by 2 * sizeof(double). That's also the reason why you get different results here:
#include <iostream>
int main(void) {
int x[] = {1,2,3};
std::cout << &x + 1 <<"\n";
std::cout << &x[0] + 1;
}
However, note that you get different output with int* x = new int[3]{1,2,3}; because then x is just an int* that points to the first element of a dynamically allocated array; it is not an array itself. This distinction between arrays and pointers causes much confusion. It is important to understand that arrays are not pointers, but they often do decay to pointers to their first element.

What is the difference between adding these strings?

Program 1:
#include <iostream>
using namespace std;
int main() {
string str;
char temp = 'a';
str += temp + "bc";
cout << str;
return 0;
}
Output:
Unknown characters
Program 2:
#include <iostream>
using namespace std;
int main() {
string str;
char temp = 'a';
str += temp;
str += "bc";
cout << str;
return 0;
}
Output:
abc
Why are both the outputs different? Shouldn't both outputs be the same?
This statement
str += temp + "bc";
can be represented like
str = str + ( temp + "bc" );
In the sub-expression temp + "bc", the string literal "bc" is implicitly converted to a pointer to its first character, of type const char *. The value of the variable temp is converted to int by the integer promotions; in ASCII, 'a' has the value 97.
So the sub-expression uses pointer arithmetic. The expression temp + "bc" points to memory far outside the string literal, so the behavior is undefined.
If you would write for example
char temp = 1;
then the expression temp + "bc" points to the second character of the string literal. As a result str would have the value "c".
Or to get the same result as in the second program you could write
str += temp + std::string( "bc" );
As for the second program then in this statement
str += temp;
str += "bc";
the overloaded operators += of the class std::string are used, for operands of type char and const char *. So these statements are well-defined.
Pay attention to the fact that you should explicitly include the header <string>.
#include <string>
In Program 1, in this line
str += temp + "bc";
the arguments of the addition are evaluated first:
on the right there is an array of char: const char[3]
on the left there is a char, which is implicitly converted to int
const char[3] decays to const char *
adding int (97) to const char * compiles, but the result points beyond the bounds of the array.
Then basic_string::operator+=( const CharT* s ); is called, but its argument is invalid. The result is an out-of-bounds read, which is undefined behavior.
Program 2.
Doesn't have this undefined behavior, and operators from std::string are used.
Another reasonable version is:
str = str + temp + "bc";
Now each addition will create a std::string as a result.
std::string class has + and += operators overloaded, allowing you to use those to concatenate std::strings with each other, with individual characters and arrays of characters.
But just like in C, "bc" is not a std::string but a const char [3] (a constant array of 3 characters). Arrays are often automatically converted (decay) to pointers to their first elements. It happens here too.
str += "foobar"; appends the characters pointed to by the pointer one by one, until it hits the null byte that terminates the string.
You can add integers to pointers: str += "foobar" + 3; will append "bar" to the string.
In C++, chars are simply small integers. So 'a' + "bc" actually means 97 + "bc" (assuming your compiler uses ASCII, all common ones do).
This forms a pointer that's out of bounds of the array and causes undefined behavior.
The random characters you're seeing are the contents of memory located 97 bytes after the "bc" array, terminated by a random null byte that was in that memory.

C++ - Inserting and Extracting Characters from an Integer Array

For example:
char mem[100000];
int reg[8];
mem[36] = 'p'; // add char p to our 36th index of our char array
reg[3] = (int)mem[36]; // store value of mem[36] into reg[3]
Now I want to print the char value at index 3 of that int array.
So far my thought process has led me to code such as this:
char *c = (char*)reg[3];
cout << *c << endl;
But I am still getting weird values and characters when trying to print it out.
From my understanding, an integer is equal to 4 characters. Since a character is technically a byte and an integer is 4 bytes.
So I am storing a character into my integer array as 4 bytes, but when I pull it out, there is garbage data since the character I inserted is only one byte compared to the index being 4 bytes in size.
Have you tried this:
char mem[100000];
int reg[8];
mem[36] = 'p'; // add char p to our 36th index of our char array
reg[3] = (int)mem[36]; // store value of mem[36] into reg[3]
char txt[16];
sprintf(txt, "%c", reg[3]); // assigns the value as a char to txt array
cout<<txt<<endl;
This prints out the value 'p'
You shouldn't be using pointers here; it's sufficient to work with chars:
char c = reg[3];
cout << c << endl;
Note, however, that you could lose information when trying to stuff an int into a char variable.
I do not see what your problem is. You store the char in an int variable. To print it back, just cast the value to char and print it:
#include <iostream>
int main()
{
char mem[100];
int reg[8];
mem[36] = 'p'; // add char p to our 36th index of our char array
// store value of mem[36] into reg[3]
reg[3] = mem[36];
// store value of mem[36] into reg[4] with cast
reg[4] = static_cast<int>(mem[36]);
std::cout << static_cast<char>(reg[3]) << '\n';
std::cout << static_cast<char>(reg[4]) << '\n';
}
/****************
* Output
$ ./test
p
p
*/

Value Of Pointers

In my book, it says pointers are addresses and have a numerical value. You can print out the value of a pointer as cout << (unsigned long)(p).
Write code to compare p, p+1, q, and q+1. Explain the results. I'm not sure what the book wants me to do, so here's what I have. Does anyone know if I am doing this right?
int num = 20;
double dbl = 20.0;
int *p = &num;
double *q = &dbl;
cout << (unsigned long)(q) << endl;
q = q + 1;
cout << (unsigned long)(q) << endl;
cout << (unsigned long)(p) << endl;
p = p + 1 ;
cout << (unsigned long)(p) << endl;
Assuming it's the pointer arithmetic you have problems with, let me try to show how it's done in a more "graphical" way:
Let's say we have a pointer variable ptr which points to an array of integers, something like
int array[4] = { 1234, 5678, 9012, 3456 };
int* ptr = array; // Makes `ptr` point to the first element of `array`
In memory it looks something like
+------+------+------+------+
| 1234 | 5678 | 9012 | 3456 |
+------+------+------+------+
^      ^      ^      ^
|      |      |      |
ptr    ptr+1  ptr+2  ptr+3
The first is technically ptr+0
When adding one to a pointer, you go to the next element in the "array".
Perhaps now you start to see some similarities between pointer arithmetic and array indexing. And that is because there is a common thread here: For any pointer or array p and valid index i, the expression p[i] is exactly the same as *(p + i).
Using the knowledge that p[i] is equal to *(p + i) makes it easier to understand how an array can be used as a pointer to its first element. We start with a pointer to the first element of array (as defined above): &array[0]. This is equal to the expression &*(array + 0). The address-of (&) and dereference (*) operators cancel each other out, leaving us with (array + 0). Adding zero to anything can be removed as well, so now we have (array). And finally we can remove the parentheses, leaving us with array. That means that &array[0] is equal to array.
You do it right, if you want to print the decimal representation of the addresses your pointers point to.
If you wonder why the results are such, you need to learn pointer arithmetic. If you add 1 to any pointer, its address will be increased by sizeof(<type>), where type is the type of the variable your pointer points to.
So, if you have a pointer to int and increment it, the address will be increased by sizeof(int), which is, most likely, four.

c++ navigate through array by moving pointer

In the following I expected 13 to be printed.
I wanted to move arr (which is a pointer to the memory, where int values from array are stored, if i understand everything right) by the size of one array member, which is int.
Instead 45 is printed. So instead making one array-member-wide jump the 5th Array member is retrieved. Why?
int arr[] = {1,13,25,37,45,56};
int val = *( arr + 4 ); //moving the pointer by the sizeof(int)=4
std::cout << "Array Val: " << val << std::endl;
Your assumption is wrong. It moves the pointer 4 elements ahead, not 4 bytes ahead.
In terms of byte addresses, arr + 4 is the address of arr plus 4 * sizeof (arr [0]) bytes.
The statement *(arr + 4) is equivalent to arr [4]. It does make for some neat syntax, though, as *(4 + arr) is equally valid, meaning so is 4 [arr].
Your behaviour could be achieved through the following example:
#include <iostream>
int main()
{
int a[3] = {65,66,67};
char *b = reinterpret_cast<char *>(a);
std::cout << *(b + sizeof (int)); //prints 'B' on a little-endian machine using ASCII
}
I wouldn't recommend using reinterpret_cast for this purpose though.
arr + 4 will give you the address 4 items on from the start of the array, not 4 bytes. That's why you get 45, which is the zeroth item plus 4 elements.
It is performing pointer arithmetic. See this: http://www.eskimo.com/~scs/cclass/notes/sx10b.html
arr is your array, and in this expression it decays to a pointer to its first element, arr[0], which holds 1. Adding 4 moves the pointer ahead by 4 elements, to arr[4], so it now points to 45.