Pointer-Array Interaction w/ null terminator - c++

I was just experimenting with the use of pointers when dealing with arrays and I've become a bit confused with how C++ is handling the arrays. Here are the relevant bits of code I wrote:
//declare a string (as a pointer)
const char* szString = "Randy"; // a string literal is const in C++
cout << "Display string using a pointer: ";
const char* pszString = szString;
while (*pszString)
    cout << *pszString++;
First off, when I tried using cout to write what was in "pszString" (without dereferencing) I was a bit surprised to see it gave me the string. I just assumed it was because I gave the pointer a string and not a variable.
What really caught my attention though is that when I removed the asterisk from the line cout << *pszString++; it printed "Randyandyndydyy". I'm not sure why it writes the array AND then writes it again with 1 letter less. My reasoning is that after writing the char string the increment operator immediately brings the index to the next letter before it can reach the null terminator. I don't see why the null terminator wouldn't cause the loop to return false after the string is output for the first time otherwise. Is this the right reasoning? Could someone explain if I'm getting this relationship between arrays and pointers?

cout has an operator<< overload for char* to print the entire string (that is, print each character until it encounters a 0). By contrast, the char overload for cout's operator<< prints just that one character. That's essentially the difference here. If you need more explanation, read on.
When you dereference the pointer, you're sending cout a char, not a char*, so it prints one character. Because the increment is a post-increment, the character printed is the one the pointer pointed to before it was advanced.
So cout << *pszString++; is like doing
cout << *pszString;
pszString = pszString + 1;
When you don't dereference the pointer, you're sending it a char* so cout prints the entire string, and you're moving the start of the string up by one character in each iteration through the loop.
So cout << pszString++; is like doing
cout << pszString;
pszString = pszString + 1;
Illustration with a little loop unrolling:
For cout << *pszString++;
Randy\0
^ pszString points here
// this means increment pszString and send cout the character at which pszString *used* to be pointing
cout << *pszString++;
// so cout prints R and pszString now points
Randy\0
 ^ here
// this means increment pszString and send cout the character at which pszString *used* to be pointing
cout << *pszString++;
// so cout prints a and pszString now points
Randy\0
  ^ here
// and so on
For cout << pszString++;
Randy\0
^ pszString points here
// this means increment pszString and pass the old pointer to cout's operator<<
cout << pszString++;
// so cout prints Randy, and now pszString points
Randy\0
 ^ here
cout << pszString++;
// cout prints andy, and now pszString points
Randy\0
  ^ here
// and so on
I am glad you are experimenting with pointers this way. It will help you understand what is actually going on, unlike the many programmers who will do anything to avoid dealing with pointers.

Related

Why does printing the 'address of index n' of c style strings lead to output of substring

I'm rather new to C++ and while working with a pointer to a char array (C style string) I was confused by its behavior with the ostream object.
const char* items {"sox"};
cout << items << endl;
cout << items[0] << endl;
cout << *items << endl;
cout << &items << endl;
cout << &items[1] << endl;
Running this leads to:
sox
s
s
0x7fff2e832870
ox
In contrast to pointers to other data types, printing the variable doesn't output the address, but the string as a whole. From what I understand, this is due to the << operator being overloaded for char arrays to treat them as strings.
What I don't understand is that cout << &items[1] prints the string from index 1 onward (ox), instead of the address of the char at index 1. Is this also due to the << operator being overloaded, or what is the reason for this behavior?
The type of &items[1] is const char *. Therefore the const char * overload of operator << is used, which prints the string from index 1 onwards.
OTOH, the type of &items is const char **, for which no specific overload exists, so the address of items is printed (via the const void * overload).
Back in the olden days, when C ran the world, there was no std::string, and programmers had to make do with arrays of char to manage text. When C++ brought enlightenment (and std::string), old habits persevered, and arrays of char are still used to manage text. Because of this heritage, you'll find many places where arrays of char act differently from arrays of any other type.
So,
const int integers[] = { 1, 2, 3, 4 };
std::cout << integers << '\n';
prints the address of the first element in the array.
But,
const char text[] = { 'a', 'b', 'c', '\0' };
std::cout << text << '\n';
prints the text in the array text, up to the final 0: abc
Similarly, if you try to print addresses inside the array, you get different behavior:
std::cout << &integers[1] << '\n';
prints the address of the second element in the array, but
std::cout << &text[1] << '\n';
prints the text starting at the second character of the array: bc
And, as you suspected, that's because operator<< has an overload that takes const char* and copies text beginning at the location pointed to by the pointer, and continuing up to the first 0 that it sees. That's how C strings work, and that behavior carries over into C++.
items[1] is the second character of the array and its address, i.e. &items[1], is a pointer to the second character (with index 1) as well. So, with the same rule that you have mentioned for operator <<, the second character of the string till the end is printed.

Why strstr() - char pointer = number?

I have this program:
#include <iostream>
#include <cstring>
using namespace std;
int main()
{
    char char1[30] = "ExtraCharacter", char2[30] = "Character", *p;
    p = strstr(char1, char2); // pointer to the first match inside char1
    cout << "p: " << p << endl;
    cout << "char1: " << char1 << endl;
    cout << "(p-char1): " << (p-char1) << endl;
    return 0;
}
When I run it, I get:
p: Character
char1: ExtraCharacter
(p-char1): 5
as expected.
But this is not the problem, I'm not sure why "Character" - "ExtraCharacter" is an integer (5)? Perhaps not an integer, but a number/digit anyways.
Actually I don't understand why is "Character" stored in p, and not the memory address.
If I understood well from a book, strstr() returns a memory address, shouldn't it be more like a strange value, like a hex (0x0045fe00) or something like that? I mean, it's cout << p not cout << *p to display the actual value of that memory address.
Can someone explain to me how it works?
P.S.: I apologize if the title is not that coherent.
But this is not the problem, I'm not sure why "Character" - "ExtraCharacter" is an integer (5)?
You are subtracting one pointer from another, and the result is a number: the distance from the char that char1 points to, to the char that p points to. This is how pointer arithmetic works.
Note: this subtraction is only valid when both pointers point into the same array (or one past its last element), which is the case in your code, but you need to be careful. For example, if strstr() does not find the substring, it returns a null pointer and your subtraction has undefined behavior (UB). So at least check p before subtracting (passing a null char pointer to std::cout would be UB as well).
If I understood well from a book, strstr() returns a memory address, shouldn't it be more like a strange value, like a hex (0x0045fe00) or something like that? I mean, it's cout << p not cout << *p to display the actual value of that memory address.
Yes, p is a pointer, i.e. a memory address. std::ostream has a special rule for printing pointers to char: it prints them as strings, because that is how strings are stored in C. If you want to see it as a pointer, just cast it:
std::cout << static_cast<void *>( p );
then you will see it as an address.
To display address, you have to cast char* to void*:
std::cout << "p: " << static_cast<const void*>(p) << std::endl;
For std::basic_ostream (the type of cout), arguments of character type (e.g., char) and character-string type (e.g., const char*) are handled by non-member overloads of operator<<, and the pointer forms are treated as strings. A char[30] argument decays to a pointer to its first element, which binds to the const char* overload, so basic_ostream outputs the null-terminated string at that address.
As for (p-char1), the result of subtracting two pointers has type std::ptrdiff_t, an implementation-defined signed integer type. Its value is the number of elements between the two pointers, which is why the output is 5.

Printing out the value of pointer to the first index of an char array

I'm new to C++ and am trying to learn the concept of pointers. When I tried to print out the value of pStart, I was expecting its value to be the address of text[0] in hexadecimal (e.g. something like 0x7fff509c5a88). However, the actual value printed out is abcdef.
Could someone explain it to me why this is the case? What parts am I missing?
char text[] = "abcdef";
char *pStart = &text[0];
cout << "value of pStart: " << pStart << endl;
Iostreams provide an overload that assumes a pointer to char points to a NUL-terminated (C-style) string, and prints out the string it points to.
To get the address itself to print out, cast it to a pointer to void instead:
cout << "value of pStart: " << (void *)pStart << "\n";
Note that you don't really need pStart here at all though. The name of an array (usually, including this case) evaluates to the address of the beginning of the array, so you can just print it directly:
cout << "address of text: " << (void *)text << "\n";
Get out of the habit of using endl as well. Besides writing a newline it also flushes the stream, which you almost certainly don't realize and almost never want.

Why this C++ code works the way it works?

#include <iostream>
using namespace std;
int main(void)
{
    const char *name = "Stack overflow"; // a string literal is const in C++
    cout << *&name << endl;
    cout << &*name << endl; // I don't understand why this works
    return 0;
}
I understand how the first "cout" statement works but unable to understand why and how the second one works.
& and * are opposite operations. The first one takes the address of the pointer name (adding one level of indirection) and then dereferences it (removing one level of indirection). The second dereferences the pointer (removing one level of indirection) and then takes the address of the result (adding one). Either way, you get back to the same value.
Just like 4 / 2 * 2 is the same as 4 * 2 / 2, or just like taking a step back and then forward leaves you at the same place as taking a step forward and then backward.
To understand how the second statement works, rewrite
cout << &*name << endl;
as
cout << &name[0] << endl;
because *name and name[0] are equivalent: both yield an lvalue referring to the first character of the string literal that name points to.
The last statement is equivalent to
cout << name << endl;

C++: Pointers outputs confusing me

So I have the following code:
cout << _userLoginName << endl;
cout << *_userLoginName << endl;
cout << (_userLoginName+1) << endl;
cout << *(_userLoginName+1) << endl;
the variable char * _userLoginName has been set equal to "smith". My question is simple: Why in the last lines of code do I get the following output?
smith // as from cout << _userLoginName << endl;
s // as from cout << *_userLoginName << endl;
mith // cout << (_userLoginName+1) << endl;
m // cout << *(_userLoginName+1) << endl;
I really did try reasoning the result but I cannot figure it out.
Thank you.
If you give cout [1] a char *, it will try to print a string. If you give it a char, then it will print that single character.
_userLoginName and (_userLoginName+1) are of type char *; *_userLoginName and *(_userLoginName+1) are of type char.
[1] Technically, "give std::operator<<(std::ostream &, T)".
Pull out a sheet of paper and draw a box with six cells with "smith" written into them:
+-+-+-+-+-+--+
|s|m|i|t|h|\0|
+-+-+-+-+-+--+
 ^ ^
 | +- _userLoginName + 1
 +- _userLoginName
Use your pen as your pointer '_userLoginName' and point it at the first cell. Dereferencing the pointer (i.e. using *ptr for a pointer ptr) means looking at the content of the cell it points at. That is, '*_userLoginName' shows the content of that cell. Writing a pointer of type char* or char const* does something funny: it follows the pointer and writes the content of each cell it finds until it reaches a cell having the value \0.
This should explain the first two outputs. Now, ptr + 1 looks at the cell next to ptr, i.e. ptr + 1 is another pointer (pull out another pen if necessary) placed on the next cell. It does just the same as above.
Consider the type of *_userLoginName—it is char.
Maybe you overlooked that * in this context dereferences the pointer?
Dereferencing a pointer (such as *_userLoginName) always returns the element the pointer is pointing at, in the case of a normal string the first character therein, which is then printed.
Adding n to a pointer (such as _userLoginName+1) increments the pointer by n steps, so if it pointed to the 0th element it will afterwards point to the nth element.
Combine the two to explain the fourth line.
The first cout is looking at the pointer userLoginName (a char[] decays to a char* in most expressions, so the two behave alike here). The cout will print all values in memory, treating them as chars, until it comes across a '\0' character, which terminates the string.
The second cout is looking at one memory element, that pointed to by userLoginName, or userLoginName[0].
The third cout is doing the same as the first, but the memory address starts 1 char later than userLoginName, because the pointer points to char and adding 1 advances it by one char.
The final cout is the same as the second, but is userLoginName[1].
There are two separate overloads for operator<< at work here: One for char-pointers, and one for chars. The second one, for single characters, simply prints that one character. The first one, for char pointers, treats the pointer as the pointer to the first character in a null-terminated array of characters (a "string") and prints all those.
Combine this with the language syntax that a[i] is the same as *(a + i) for an array a, and you have:
cout << s; // prints all characters, starting at the first
cout << *s; // prints only the first character, equal to "cout << s[0];"
cout << s + 1; // prints all characters, starting at the second
cout << *(s+1); // prints only the second character, equal to "cout << s[1];"