char* - why is there no address in the pointer? - c++

I have a basic question about char* I don't understand
char* aString = "Hello Stackoverflow";
The Pointer points at the first character of the character chain.
cout << *aString; // H
but why is the whole string saved in the pointer?
cout << aString //Hello Stackoverflow
I would expect an address, aren't addresses saved in pointers? Where is the address of "Hello Stackoverflow"?
Any help much appreciated

There is an overload for operator<<(ostream&, char const*) which output the null-terminated string starting at that pointer and which is preferred to the operator ostream::operator<<(void*) which would have output the address.
If you want the address, cast the pointer to void*.

The string is saved sequentially, starting from that position. The rules of C, inherited by C++ simply state that when you try to use a char * as a string, it will keep reading characters until encountering a 0 byte.
If you do want to get an address, tell cout to not interpret it as a "string":
std::cout << (void *)aString << std::endl;
EDIT
Where does the standard state that 0 == '\0'?
From a C++11 draft, section 2.3-3:
The basic execution character set and the basic execution wide-character set shall each contain all the members of the basic source character set, plus control characters representing alert, backspace, and carriage return, plus a null character (respectively, null wide character), whose representation has all zero bits.

aString does indeed hold an address, but the overload of operator<< that your code selects (operator<<(ostream&, const char*)) happens to print not the address that it is given, but the null terminated string starting at that address.
If you want to print the address, you can use the ostream::operator<<(void*) overload by casting aString to void*.

Related

C++ char pointer

Why does the following happen?
char str[10]="Pointers";
char *ptr=str;
cout << str << "\n"; // Output : Pointers
int abc[2] = {0,1 };
int *ptr1 = abc;
cout <<ptr1 << "\n"; // But here the output is an address.
// Why are the two outputs different?
As others have said, the reason for the empty space is because you asked it to print out str[3], which contains a space character.
Your second question seems to be asking why there's a difference between printing a char* (it prints the string) and int* (it just prints the address). char* is treated as a special case, it's assumed to represent a C-style string; it prints all the characters starting at that address until a trailing null byte.
Other types of pointers might not be part of an array, and even if they were there's no way to know how long the array is, because there's no standard terminator. Since there's nothing better to do for them, printing them just prints the address value.
1) because str[3] is a space so char * ptr = str+3 points to a space character
2) The << operator is overloaded, the implementation is called depending on argument type:
a pointer to an int (int*) uses the default pointer implementation and outputs the formatted address
a pointer to a char (char*) is specialized, output is formated as a null terminated string from the value it points to. If you want to output the adress, you must cast it to void*
The empty space is actually Space character after "LAB". You print the space character between "LAB" and "No 5".
Your second question: You see address, because ptr1 is actually address (pointer):
int *ptr1;
If you want to see it's first member (0), you should print *ptr1

Why does cout << &r give different output than cout << (void*)&r?

This might be a stupid question, but I'm new to C++ so I'm still fooling around with the basics. Testing pointers, I bumped into something that didn't produce the output I expected.
When I ran the following:
char r ('m');
cout << r << endl;
cout << &r << endl;
cout << (void*)&r << endl;
I expected this:
m
0042FC0F
0042FC0F
..but I got this:
m
m╠╠╠╠ôNh│hⁿB
0042FC0F
I was thinking that perhaps since r is of type char, cout would interpret &r as a char* and [for some reason] output the pointer value - the bytes comprising the address of r - as a series of chars, but then why would the first one would be m, the content of the address pointed to, rather than the char representation of the first byte of the pointer address.. It was as if cout interprets &r as r but instead of just outputting 'm', it goes on to output more chars - interpreted from the byte values of the subsequent 11 memory addresses.. Why? And why 11?
I'm using MSVC++ (Visual Studio 2013) on 64 bit Win7.
Postscript: I got a lot of correct answers here (as expected, given the trivial nature of the question). Since I can only accept one, I made it the first one I saw. But thanks, everyone.
So to summarize and expand on the instinctive theories mentioned in my question:
Yes, cout does interpret &r as char*, but since char* is a 'special thing' in C++ that essentially means a null terminated string (rather than a pointer [to a single char]), cout will attempt to print out that string by outputting chars (interpreted from the byte contents of the memory address of r onwards) until it encounters '\0'. Which explains the 11 extra characters (it just coincidentally took 11 more bytes to hit that NUL).
And for completeness - the same code, but with int instead of char, performs as expected:
int s (3);
cout << s << endl;
cout << &s << endl;
cout << (void*)&s << endl;
Produces:
3
002AF940
002AF940
A char * is a special thing in C++, inherited from C. It is, in most circumstances, a C-style string. It is supposed to point to an array of chars, terminated with a 0 (a NUL character, '\0').
So it tries to print this, following on in to the memory after the 'm', looking for a terminating '\0'. This makes it print some random garbage. This is known as Undefined Behaviour.
There is an operator<< overload specifically for char* strings. This outputs the null-terminated string, not the address. Since the pointer you're passing this overload isn't a null-terminated string, you also get Undefined Behavior when operator<< runs past the end of the buffer.
Conversely, the void* overload will print the address.
Because operator<< is overloaded based on the data type.
If you give it a char, it assumes you want that character.
If you give it a void*, it assumes you want an address.
However, if you give it a char*, it takes that as a C-style string and attempts to output it as such. Since the original intent of C++ was "C with classes", handling of C-style strings was a necessity.
The reason you get all the rubbish at the end is simply because, despite your assertion to the compiler, it isn't actually a C-style string. Specifically, it is not guaranteed to have a string-terminating NUL character at the end so the output routines will just output whatever happens to be in memory after it.
This may work (if there's a NUL there), it may print gibberish (if there's a NUL nearby), or it may fall over spectacularly (if there's no NUL before it gets to memory it cannot read). It's not something you should rely on.
Because there's an overload of operator<< which takes a const char pointer as it's second argument and prints out a string. The overload that takes a void pointer prints only the address.
A char * is often - usually even - a pointer to a C-style null-terminated string (or a string literal) and is treated as such by ostreams. A void * by contrast unambiguously indicates a pointer value is required.
The output operator (operator<<()) is overloaded for char const* and void const*. When passing a char* the overload for char const* is a better match and chosen. This overload expects a pointer to the start of a null terminated string. You give it a pointer to an individual char, i.e., you get undefined behavior.
If you want to try with a well-defined example you can use
char s[] = { 'm', 0 };
std::cout << s[0] << '\n';
std::cout << &s[0] << '\n';
std::cout << static_cast<void*>(&s[0]) << '\n';

Making strings by pointers

I have learned that a pointer points to a memory address so i can use it to alter the value set at that address. So like this:
int *pPointer = &iTuna;
pPointer here has the memory address of iTuna. So we can use pPointer to alter the value at iTuna. If I print pPointer the memory address gets printed and if I print *pPointer then the value at iTuna gets printed
Now see this program
char* pStr= "Hello !";
cout<< pStr << endl;
cout<< *pStr << endl;
system("PAUSE");
return 0;
There are a lot of stuff I don't understand here:
In "Hello !" Each letter is stored separately, and a pointer holds one memory address. So how does pStr point to all the letters.
Also When I print out pStr it prints Hello !, not a memory address.
And when I print out *pStr it prints out H only not all what pStr is pointing too.
I really can't understand and these are my concerns. I hope someone can explain to me how this works ad help me understand
"Hello !" is an array of type char const[8] and value { 'H', 'e', 'l', 'l', 'o', ' ', '!', 0 }. pStr is a pointer to its first element; its last element has the value 0.
There is an overload in the iostreams library for a char const * argument, which treats the ar­gu­ment as a pointer to the first element of an array and prints every element until it encounters a zero. ("Null-ter­mi­nat­ed string" in colloquial parlance.)
Dereferencing the pointer to the first element of an array gives you the first element of the array, i.e. 'H'. This identical to pStr[0].
1-) Since pStr points to a char, it actually points to the beginning of an array of a null terminated string
2-) cout is overloaded with a char * argument. It will print out ever character in the string until it reaches the null character
3-) You are dereferencing the pointer to the first element of the character array.
1-) In "Hello !" Each letter is stored seperatly, and a pointer holds
one memory adress. So how does pStr point to all the letters.
The letters are stored in that order in each memory cell with an extra final cell
holding 0 to mark the end.
2-)Also When i print out pStr it prints Hello ! not a memory adress.
The cout << understands that you are pointing at a string and so prints the string.
3-)And when i print out *pStr it prints out H only not all what pStr
is pointng too.
The * means you are askign for the value at that address. cout << knows that the address holds a char and so prints the char.
Your understanding of pointers is correct in all respects.
Your problem is that the << operator has been overridden for various datatypes on a stream. So the standard library writers have MADE the operator << do something specific on any variable of the type char * (in this case the something specific means output the characters at that address until you get to the end of string marker) as opposed to what you expect it to do (print an address in decimal or hex)
Similarly the << operator has been overridden for char to just output a single character, if you think about it for a bit you will realise that *pStr is a dereferenced pointer to a character - that is it is a char - thus it just prints a single char.
You need to understand concept of strings as well. In C and C++, string is a few characters (char's) located one after another in a memory, basically, 'H','e','l','l','o','\0'. Your pointer holds memory address of the first symbol, and your C++ library knows that a string is everything starting this address and ending with '\0'.
When you pass char* to cout, it knows you output a string, and prints it as a string.
Construction *pStr means "give me whatever is located at address stored in pStr". That would be char - a single character - 'H', which is then passed to cout, and you get only one character printed.
A pointer, *pStr points to a specific memory address, but that memory address can be used not only as a single element, i.e. a char, but also as the beginning of an array of such elements, or a block of memory.
char arrays are a special type of array in C in that some operations handle them in a specific way: as a string. Hence, printf("%s", ... and cout know that when given a char * they should look for a string, and print all the characters until the terminating null character. Furthermore, C provides a string library with functions designed to manipulate such char * as strings.
This behavior is just as you'd expect from your own analysis: de-referencing pStr simply gives you the value at the memory address, in this case the first element of an array of chars in memory.
A pointer to an array (a C style string is an array of char) is just a pointer to the first element of the array.
The << operator is overloaded for the type char* to treat it as a C-style string, so when you pass it a char* it starts at the pointer you give it and keeps adding 1 to it to find the next character until it finds the null character which signals the end of the string.
When you dereference the pointer you the type you get is char because the pointer only actually points to the first item in the array. The overload of << for char doesn't treat it as a string, just as a single character.
Using strings like this is C-style code. When using C++ you should instead use std::string. It is much easier to use.

How does printf display a string from a char*?

Consider a simple example
Eg: const char* letter = "hi how r u";
letter is a const character pointer, which points to the string "hi how r u". Now when i want to print the data or to access the data I should use *letter correct?
But in this situation, shouldn't I only have to use the address in the call to printf?
printf("%s",letter);
So why is this?
*letter is actually a character; it's the first character that letter points to. If you're operating on a whole string of characters, then by convention, functions will look at that character, and the next one, etc, until they see a zero ('\0') byte.
In general, if you have a pointer to a bunch of elements (i.e., an array), then the pointer points to the first element, and somehow any code operating on that bunch of elements needs to know how many there are. For char*, there's the zero convention; for other kinds of arrays, you often have to pass the length as another parameter.
Simply because printf has a signature like this: int printf(const char* format, ...); which means it is expecting pointer(s) to a char table, which it will internally dereference.
letter does not point to the string as a whole, but to the first character of the string, hence a char pointer.
When you dereference the pointer (with *) then you are referring to the first character of the string.
however a single character is much use to prinf (when print a string) so it instead takes the pointer to the first element and increments it's value printing out the dereference values until the null character is found '\0'.
As this is a C++ question it is also important to note that you should really store strings as the safe encapulated type std::string and you the type safe iostreams where possible:
std::string line="hi how r u";
std::cout << line << std::endl;
%s prints up to the first \0 see: http://msdn.microsoft.com/en-us/library/hf4y5e3w.aspx, %s is a character string format field, there is nothing strange going on here.
printf("%s") expect the address in order to go through the memory searching for NULL (\0) = end of string. In this case you say only letter. To printf("%c") would expect the value not the address: printf("%c", *letter);
printf takes pointers to data arrays as arguments. So, if you're displaying a string (a type of array) with %s or a number with %d, %e, %f, etc, always pass the variable name without the *. The variable name is the pointer to the first element of the array, and printf will print each element of the array by using simple pointer arithmetic according to the type (char is 1 or 2 bytes, ints are 4, etc) until it reaches an EOL or zero value.
Of course, if you make a pointer to the array variable, then you'd want to dereference that pointer with *. But that's more the exception than the rule. :)

address value of character pointers

//if the code is
char str[]="hello";
char *sptr=&str[2];
cout<<sptr;
//the output is not a hexadecimal value but llo
....why?
//also consider the following code
char array[]="hello world";
char array1[11];
int i=0;
while(array[i]!='\0')
{array1[i]=array[i];
i++;}
//if we print array1,it is printed as hello world, but it should have stopped when the space between hello and world is encountered.
The operator<< overload used by output streams is overloaded for const char* so that you can output string literals and C strings more naturally.
If you want to print the address, you can cast the pointer to const void*:
std::cout << static_cast<const void*>(sptr);
In your second problem, \0 is the null terminator character: it is used to terminate the string. array actually contains "hello world\0", so the loop doesn't stop until it reaches the end of the string. The character between hello and world is the space character (presumably): ' '.
When you print a char*, it's printed as a string.
'\0' is not ' '
Regarding your first question:
By default, a char is interpreted as its textual representation trough a call to std::cout.
If you want to output the hexadecimal representation of your character, use the following code:
std::cout << std::hex << static_cast<int>(c);
Where c is your character.
In your code:
cout<<sptr;
sptr is a pointer to a char, not a char so std::cout displays it as a C-string.
Regarding your second question:
A space is the ASCII character 32, not 0. Thus your test just checks for the ending null character for its copy.
With regards to cout, there is a special overload for writing const char * (and char *) to streams that will print the contents rather than the pointer.
With regards to array1, actually there is nothing wrong with making it size 11 as you do not copy the null byte into it. The issue comes though when you try printing it, as it will try to print until it finds a 0 byte and that means reading off the end of the array.
As it stands the compiler may well pad out a few bytes because the next thing that follows is an int so it is likely to align it. What is actually in that one byte of padding could be anything though, not necessarily a zero character.
The next few characters it is likely to meet will be the int itself, and depending on the endian-ness the zero byte will be either the first or second.
It is technically undefined behaviour though.