Initialization of pointers in c++ - c++

I need to clarify my concepts regarding the basics of pointer initialization in C++. As per my understanding, a pointer must be assigned an address before putting some value using the pointer.
int *p;
*p=10; //inappropriate
cout << *p <<"\n";
This would probably show the correct output (10) but this may cause issue in larger programs since p initially had garbage address which can be anything & may later be used somewhere else in the program as well.So , I believe this is incorrrect, the correct way is:
int *p;
int x=10;
p=&x; //appropriate
cout << *p <<"\n";
My question is, if the above understanding is correct, then does the same apply on char* as well?:
const char *str="hello"; // inappropriate
cout << str << "\n";
//OR
const string str1= "hello";
const char str2[6] ="world";
const char *str=str1; //appropriate
const char *st=str2; //appropriate
cout << str << st << "\n";
Please advice

Your understanding of strings is incorrect.
Lets take for example the very first line:
const char *str="hello";
This is actually correct. A string literal like "hello" is turned into a constant array by the compiler, and like all arrays it can decay to a pointer to its first element. So what you are doing is making str point to the first character of the array.
Then lets continue with
const string str1= "hello";
const char *str=str1;
This is actually wrong. A std::string object have no casting operator defined to cast to a const char *. The compiler will give you an error for this. You need to use the c_str function go get a pointer to the contained string.
Lastly:
const char str2[6] ="world";
const char *st=str2; //appropriate
This is really no different than the first line when you declare and initialize str. This is, as you say, "appropriate".
About that first example with the "inappropriate" pointer:
int *p;
*p=10; //inappropriate
cout << *p <<"\n";
This is not only "inappropriate", this leads to undefined behavior and may actually crash your program. Also, the correct term is that the value of p is indeterminate.

When I declare a pointer
int *p;
I get an object p whose values are addresses. No ints are created anywhere. The thing you need to do is think of p as being an address rather than being an int.
At this point, this isn't particularly useful since you have no addresses you could assign to it other than nullptr. Well, technically that's not true: p itself has an address which you can get with &p and store it in an int**, or even do something horrible like p = reinterpret_cast<int*>(&p);, but let's ignore that.
To do something with ints, you need to create one. e.g. if you go on to declare
int x;
you now have an int object whose values are integers, and we could then assign its address to p with p = &x;, and then recover the object from p via *p.
Now, C style strings have weird semantics — the weirdest aspect being that C doesn't actually have strings at all: it's always working with arrays of char.
String literals, like "Hello!", are guaranteed to (act1 like they) exist as an array of const char located at some address, and by C's odd conversion rules, this array automatically converts to a pointer to its first element. Thus,
const char *str = "hello";
stores the address of the h character in that character array. The declaration
const char str2[6] ="world";
works differently; this (acts1 like it) creates a brand new array, and copies the contents of the string literal "world" into the new array.
As an aside, there is an obsolete and deprecated feature here for compatibility with legacy programs, but for some misguided reason people still use it in new programs these days so you should be aware of it and that it's 'wrong': you're allowed to break the type system and actually write
char *str = "hello";
This shouldn't work because "hello" is an array of const char, but the standard permits this specific usage. You're still not actually allowed to modify the contents of the array, however.
1: By the "as if" rule, the program only has to behave as if things happen as I describe, but if you peeked at the assembly code, the actual way things happen can be very different.

Related

Is it possible for separately initialized string variables to overlap?

If I initialize several string(character array) variables in the following ways:
const char* myString1 = "string content 1";
const char* myString2 = "string content 2";
Since const char* is simply a pointer a specific char object, it does not contain any size or range information of the character array it is pointing to.
So, is it possible for two string literals to overlap each other? (The newly allocated overlap the old one)
By overlap, I mean the following behaviour;
// Continue from the code block above
std::cout << myString1 << std::endl;
std::cout << myString2 << std::endl;
It outputs
string costring content 2
string content 2
So the start of myString2 is somewhere in the middle of myString1. Because const char* does not "protect"("possess") a range of memory locations but only that one it points to, I do not see how C++ can prevent other string literals from "landing" on the memory locations of the older ones.
How does C++/compiler avoid such problem?
If I change const char* to const char[], is it still the same?
Yes, string literals are allowed to overlap in general. From lex.string#9
... Whether all string-literals are distinct (that is, are stored in nonoverlapping objects) and whether successive evaluations of a string-literal yield the same or a different object is unspecified.
So it's up to the compiler to make a decision as to whether any string literals overlap in memory. You can write a program to check whether the string literals overlap, but since it's unspecified whether this happens, you may get different results every time you run the program.
A string is required to end with a null character having a value of 0, and can't have such a character in the middle. So the only case where this is even possible is when two strings are equal from the start of one to the end of both. That is not the case in the example you gave, so those two particular strings would never overlap.
Edit: sorry, I didn't mean to mislead anybody. It's actually easy to put a null character in the middle of a string with \0. But most string handling functions, particularly those in the standard library, will treat that as the end of a string - so your strings will get truncated. Not very practical. Because of that the compiler won't try to construct such a string unless you explicitly ask it to.
The compiler knows the size of each string, because it can "see" it in your code.
Additionally, they are not allocated the same way, that you would allocate them at run-time. Instead, if the strings are constant and defined globally, they are most likely located in the .text section of the object file, not on the heap.
And since the compiler knows the size of a constant string at compile-time, it can simply put its value in the free space of the .text section. The specifics depend on the compiler you use, but be assured the people who wrote are smart enough to avoid this issue.
If you define these strings inside some function instead, the compiler can choose between the first option and allocating space on the stack.
As for the const char[], most compilers will treat it the same way as const char*.
Two string literals will not likely overlap unless they are the same. In that case though the pointers will be pointing to the same thing. (This isn't guaranteed by the standard though, but I believe any modern compiler should make this happen.)
const char *a = "Hello there."
const char *b = "Hello there."
cout << (a == b);
// prints "1" which means they point to the same thing
The const char * can share a string though.
const char *a = "Hello there.";
const char *b = a + 6;
cout << a;
// prints "Hello there."
cout << b;
// prints "there."
I think to answer your second question an explanation of c-style strings is useful.
A const char * is just a pointer to a string of characters. The const means that the characters themselves are immutable. (They are stored as part of the executable itself and you wouldn't want your program to change itself like this. You can use the strings command on unix to see all the strings in an executable easily i.e. strings a.out. You will see many more strings than what you coded as many exist as part of the standard library other required things for an executable.)
So how does it know to just print the string and then stop at the end? Well a c-style string is required to end with a null byte (\0). The complier implicitly puts it there when you declare a string. So "string content 1" is actually "string content 1\0".
const char *a = "Hello\0 there.";
cout << a;
// prints "Hello"
For the most part const char *a and const char a[] are the same.
// These are valid and equivalent
const char *a = "Hello";
const char b[] = "there."
// This is valid
const char *c = b + 3; // *c = "re."
// This, however, is not valid
const char d[] = b + 3;

How to: Typecasting Pointers on c++

I am learning typecasting.
Here is my basic code, i dont know why after typecasting, when printing p0, it is not showing the same address of a
I know this is very basic.
#include <iostream>
using namespace std;
int main()
{
int a=1025;
cout<<a<<endl;
cout<<"Size of a in bytes is "<<sizeof(a)<<endl;
int *p;//pointer to an integer
p=&a; //p stores an address of a
cout<<p<<endl;//display address of a
cout<<&a<<endl;//displays address of a
cout<<*p<<endl;//display value where p points to. p stores an address of a and so it points to the value of a
char *p0;//pointer to character
p0=(char*)p;//typecasting
cout<<p0<<endl;
cout<<*p0;
return 0;
}
When you pass a char * pointer to the << operator of std::cout, it prints the string that the pointer points to, not the address. It's the same behavior as the following code:
const char *str = "Hello!";
cout << str; // Prints the string "Hello!", not the address of the string
In your case, p0 doesn't point to a string, which is why you're getting unexpected behavior.
The overload of operator<<, used with std::cout and char* as arguments, is expecting a null-terminated string. What you are feeding it with, instead, is a pointer to what was an int* instead. This leads to undefined behavior when trying to output the char* in cout<<p0<<endl;.
In C++, is often a bad idea to use C-style casts. If you had used static_cast for example, you would have been warned that the conversion your are trying to make does not make much sense. It is true that you could use reinterpret_cast instead, but what you should be asking yourself is: why am I doing this? Why am I trying to shoot myself in the foot?
If what you want is to convert the number to string, you should be using other techniques instead. If you just want to print out the address of the char* you should be using std::addressof:
std::cout << std::addressof(p0) << std::endl;
As others have said cout is interpreting the char* as a string, and not a pointer
If you wanted to prove that the address is the same whatever type of pointer it is then you can cast it to a void pointer
cout<<(void*)p0<<endl;
In fact you get the address for pretty much any type other than char&
cout<<(float*)p0<<endl;
To prove to yourself that a char* pointer would have the same value use printf
printf("%x", p0);

C++ map::find char * vs. char []

I'm using C++ map to implemented a dictionary in my program. My function gets a structure as an argument and should return the associated value based on structure.name member which is char named[32]. The following code demonstrates my problem:
map <const char *, const char *> myMap;
myMap.insert(pair<const char *, const char *>("test", "myTest"));
char *p = "test";
char buf[5] = {'\0'};
strcpy(buf, "test");
cout << myMap.find(p)->second << endl; // WORKS
cout << myMap.find("test")->second << endl; // WORKS
cout << myMap.find(buf)->second << endl; // DOES NOT WORK
I am not sure why the third case doesn't work and what should I do to make it work.
I debugged the above code to watch the values passed and I still cannot figure the problem.
Thanks!
Pointer comparison, not string comparison, will be performed by map to locate elements. The first two work because "test" is a string literal and will have the same address. The last does not work because buf will not have the same address as "test".
To fix, either use a std::string or define a comparator for char*.
The map key is a pointer, not a value. All your literal "test" strings share storage, because the compiler is clever that way, so their pointers are the same, but buf is a different memory address.
You need to use a map key that has value equality semantics, such as std::string, instead of char*.
Like was mentioned you are comparing on the address not the value. I wanted to link this article:
Is a string literal in c++ created in static memory?
Since all the literals had the same address this explains why your comparison of string literals worked even though the underlying type is still a const char * (but depending on the compiler it may not ALWAYS be so)
Its because by buf[5] you are allocating the memory pointed by buf but when u use 'p' pointer it points to the same memory location as used by map. So always use std::string in key instead of pointer variable.

C++ Swap string

I am trying to create a non-recursive method to swap a c-style string. It throws an exception in the Swap method. Could not figure out the problem.
void Swap(char *a, char* b)
{
char temp;
temp = *a;
*a = *b;
*b = temp;
}
void Reverse_String(char * str, int length)
{
for(int i=0 ; i <= length/2; i++) //do till the middle
{
Swap(str+i, str+length - i);
}
}
EDIT: I know there are fancier ways to do this. But since I'm learning, would like to know the problem with the code.
It throws an exception in the Swap method. Could not figure out the problem.
No it doesn't. Creating a temporary character and assigning characters can not possibly throw an exception. You might have an access violation, though, if your pointers don't point to blocks of memory you own.
The Reverse_String() function looks OK, assuming str points to at least length bytes of writable memory. There's not enough context in your question to extrapolate past that. I suspect you are passing invalid parameters. You'll need to show how you call Reverse_String() for us to determine if the call is valid or not.
If you are writing something like this:
char * str = "Foo";
Reverse_String(str, 3);
printf("Reversed: '%s'.\n", str);
Then you will definitely get an access violation, because str points to read-only memory. Try the following syntax instead:
char str[] = "Foo";
Reverse_String(str, 3);
printf("Reversed: '%s'.\n", str);
This will actually make a copy of the "Foo" string into a local buffer you can overwrite.
This answer refers to the comment by #user963018 made under #André Caron's answer (it's too long to be a comment).
char *str = "Foo";
The above declares a pointer to the first element of an array of char. The array is 4 characters long, 3 for F, o & o and 1 for a terminating NULL character. The array itself is stored in memory marked as read-only; which is why you were getting the access violation. In fact, in C++, your declaration is deprecated (it is allowed for backward compatibility to C) and your compiler should be warning you as such. If it isn't, try turning up the warning level. You should be using the following declaration:
const char *str = "Foo";
Now, the declaration indicates that str should not be used to modify whatever it is pointing to, and the compiler will complain if you attempt to do so.
char str[] = "Foo";
This declaration states that str is a array of 4 characters (including the NULL character). The difference here is that str is of type char[N] (where N == 4), not char *. However, str can decay to a pointer type if the context demands it, so you can pass it to the Swap function which expects a char *. Also, the memory containing Foo is no longer marked read-only, so you can modify it.
std::string str( "Foo" );
This declares an object of type std::string that contains the string "Foo". The memory that contains the string is dynamically allocated by the string object as required (some implementations may contain a small private buffer for small string optimization, but forget that for now). If you have string whose size may vary, or whose size you do not know at compile time, it is best to use std::string.

C++ strings: [] vs. *

Been thinking, what's the difference between declaring a variable with [] or * ? The way I see it:
char *str = new char[100];
char str2[] = "Hi world!";
.. should be the main difference, though Im unsure if you can do something like
char *str = "Hi all";
.. since the pointer should the reference to a static member, which I don't know if it can?
Anyways, what's really bugging me is knowing the difference between:
void upperCaseString(char *_str) {};
void upperCaseString(char _str[]) {};
So, would be much appreciated if anyone could tell me the difference? I have a hunch that both might be compiled down the same, except in some special cases?
Ty
Let's look into it (for the following, note char const and const char are the same in C++):
String literals and char *
"hello" is an array of 6 const characters: char const[6]. As every array, it can convert implicitly to a pointer to its first element: char const * s = "hello"; For compatibility with C code, C++ allows one other conversion, which would be otherwise ill-formed: char * s = "hello"; it removes the const!. This is an exception, to allow that C-ish code to compile, but it is deprecated to make a char * point to a string literal. So what do we have for char * s = "foo"; ?
"foo" -> array-to-pointer -> char const* -> qualification-conversion -> char *. A string literal is read-only, and won't be allocated on the stack. You can freely make a pointer point to them, and return that one from a function, without crashing :).
Initialization of an array using a String literal
Now, what is char s[] = "hello"; ? It's a whole other thing. That will create an array of characters, and fill it with the String "hello". The literal isn't pointed to. Instead it is copied to the character-array. And the array is created on the stack. You cannot validly return a pointer to it from a function.
Array Parameter types.
How can you make your function accept an array as parameter? You just declare your parameter to be an array:
void accept_array(char foo[]);
but you omit the size. Actually, any size would do it, as it is just ignored: The Standard says that parameters declared in that way will be transformed to be the same as
void accept_array(char * foo);
Excursion: Multi Dimensional Arrays
Substitute char by any type, including arrays itself:
void accept_array(char foo[][10]);
accepts a two-dimensional array, whose last dimension has size 10. The first element of a multi-dimensional array is its first sub-array of the next dimension! Now, let's transform it. It will be a pointer to its first element again. So, actually it will accept a pointer to an array of 10 chars: (remove the [] in head, and then just make a pointer to the type you see in your head then):
void accept_array(char (*foo)[10]);
As arrays implicitly convert to a pointer to their first element, you can just pass an two-dimensional array in it (whose last dimension size is 10), and it will work. Indeed, that's the case for any n-dimensional array, including the special-case of n = 1;
Conclusion
void upperCaseString(char *_str) {};
and
void upperCaseString(char _str[]) {};
are the same, as the first is just a pointer to char. But note if you want to pass a String-literal to that (say it doesn't change its argument), then you should change the parameter to char const* _str so you don't do deprecated things.
The three different declarations let the pointer point to different memory segments:
char* str = new char[100];
lets str point to the heap.
char str2[] = "Hi world!";
puts the string on the stack.
char* str3 = "Hi world!";
points to the data segment.
The two declarations
void upperCaseString(char *_str) {};
void upperCaseString(char _str[]) {};
are equal, the compiler complains about the function already having a body when you try to declare them in the same scope.
Okay, I had left two negative comments. That's not really useful; I've removed them.
The following code initializes a char pointer, pointing to the start of a dynamically allocated memory portion (in the heap.)
char *str = new char[100];
This block can be freed using delete [].
The following code creates a char array in the stack, initialized to the value specified by a string literal.
char [] str2 = "Hi world!";
This array can be modified without problems, which is nice. So
str2[0] = 'N';
cout << str2;
should print Ni world! to the standard output, making certain knights feel very uncomfortable.
The following code creates a char pointer in the stack, pointing to a string literal... The pointer can be reassigned without problems, but the pointed block cannot be modified (this is undefined behavior; it segfaults under Linux, for example.)
char *str = "Hi all";
str[0] = 'N'; // ERROR!
The following two declarations
void upperCaseString(char *_str) {};
void upperCaseString(char [] _str) {};
look the same to me, and in your case (you want to uppercase a string in place) it really doesn't matters.
However, all this begs the question: why are you using char * to express strings in C++?
As a supplement to the answers already given, you should read through the C FAQ regarding arrays vs. pointers. Yes it's a C FAQ and not a C++ FAQ, but there's little substantial difference between the two languages in this area.
Also, as a side note, avoid naming your variables with a leading underscore. That's reserved for symbols defined by the compiler and standard library.
Please also take a look at the http://c-faq.com/aryptr/aryptr2.html The C-FAQ might prove to be an interesting read in itself.
The first option dynamically allocates 100 bytes.
The second option statically allocates 10 bytes (9 for the string + nul character).
Your third example shouldn't work - you're trying to statically-fill a dynamic item.
As to the upperCaseString() question, once the C-string has been allocated and defined, you can iterate through it either by array indexing or by pointer notation, because an array is really just a convenient way to wrap pointer arithmetic in C.
(That's the simple answer - I expect someone else will have the authoritative, complicated answer out of the spec :))