C++ map::find char * vs. char [] - c++

I'm using a C++ map to implement a dictionary in my program. My function gets a structure as an argument and should return the associated value based on the structure.name member, which is char name[32]. The following code demonstrates my problem:
map <const char *, const char *> myMap;
myMap.insert(pair<const char *, const char *>("test", "myTest"));
const char *p = "test";
char buf[5] = {'\0'};
strcpy(buf, "test");
cout << myMap.find(p)->second << endl; // WORKS
cout << myMap.find("test")->second << endl; // WORKS
cout << myMap.find(buf)->second << endl; // DOES NOT WORK
I am not sure why the third case doesn't work or what I should do to make it work.
I debugged the above code to watch the values passed and I still cannot figure out the problem.
Thanks!

Pointer comparison, not string comparison, will be performed by map to locate elements. The first two work because "test" is a string literal and will have the same address. The last does not work because buf will not have the same address as "test".
To fix, either use a std::string or define a comparator for char*.

The map key is a pointer, not a value. All your literal "test" strings share storage, because the compiler is clever that way, so their pointers are the same, but buf is a different memory address.
You need to use a map key that has value equality semantics, such as std::string, instead of char*.

As was mentioned, you are comparing addresses, not values. I wanted to link this article:
Is a string literal in c++ created in static memory?
Since all the literals had the same address, this explains why your comparison of string literals worked even though the underlying type is still a const char * (but depending on the compiler this may not ALWAYS be so).

It's because buf[5] allocates its own memory, separate from the literal, whereas the pointer p points to the same memory location as the key stored in the map. So always use std::string as the key instead of a pointer.

Related

Is it possible for separately initialized string variables to overlap?

If I initialize several string(character array) variables in the following ways:
const char* myString1 = "string content 1";
const char* myString2 = "string content 2";
Since const char* is simply a pointer to a specific char object, it does not contain any size or range information about the character array it is pointing to.
So, is it possible for two string literals to overlap each other? (The newly allocated overlap the old one)
By overlap, I mean the following behaviour;
// Continue from the code block above
std::cout << myString1 << std::endl;
std::cout << myString2 << std::endl;
It outputs
string costring content 2
string content 2
So the start of myString2 is somewhere in the middle of myString1. Because const char* does not "protect"("possess") a range of memory locations but only that one it points to, I do not see how C++ can prevent other string literals from "landing" on the memory locations of the older ones.
How does C++/compiler avoid such problem?
If I change const char* to const char[], is it still the same?
Yes, string literals are allowed to overlap in general. From lex.string#9
... Whether all string-literals are distinct (that is, are stored in nonoverlapping objects) and whether successive evaluations of a string-literal yield the same or a different object is unspecified.
So it's up to the compiler to make a decision as to whether any string literals overlap in memory. You can write a program to check whether the string literals overlap, but since it's unspecified whether this happens, you may get different results every time you run the program.
A string is required to end with a null character having a value of 0, and can't have such a character in the middle. So the only case where this is even possible is when two strings are equal from the start of one to the end of both. That is not the case in the example you gave, so those two particular strings would never overlap.
Edit: sorry, I didn't mean to mislead anybody. It's actually easy to put a null character in the middle of a string with \0. But most string handling functions, particularly those in the standard library, will treat that as the end of a string - so your strings will get truncated. Not very practical. Because of that the compiler won't try to construct such a string unless you explicitly ask it to.
The compiler knows the size of each string, because it can "see" it in your code.
Additionally, they are not allocated the way you would allocate them at run-time. Instead, if the strings are constant and defined globally, they are most likely located in a read-only section of the object file (such as .rodata or .text, depending on the toolchain), not on the heap.
And since the compiler knows the size of a constant string at compile-time, it can simply put its value in the free space of that section. The specifics depend on the compiler you use, but be assured the people who wrote it are smart enough to avoid this issue.
If you define these strings inside some function instead, the compiler can choose between the first option and allocating space on the stack.
As for the const char[], most compilers will treat it the same way as const char*.
Two string literals will not likely overlap unless they are the same. In that case though the pointers will be pointing to the same thing. (This isn't guaranteed by the standard though, but I believe any modern compiler should make this happen.)
const char *a = "Hello there.";
const char *b = "Hello there.";
cout << (a == b);
// typically prints "1", which means they point to the same thing (not guaranteed)
The const char * can share a string though.
const char *a = "Hello there.";
const char *b = a + 6;
cout << a;
// prints "Hello there."
cout << b;
// prints "there."
I think to answer your second question an explanation of c-style strings is useful.
A const char * is just a pointer to a string of characters. The const means that the characters themselves are immutable. (They are stored as part of the executable itself and you wouldn't want your program to change itself like this. You can use the strings command on unix to see all the strings in an executable easily, i.e. strings a.out. You will see many more strings than what you coded, as many exist as part of the standard library and other things required for an executable.)
So how does it know to just print the string and then stop at the end? Well, a c-style string is required to end with a null byte (\0). The compiler implicitly puts it there when you declare a string. So "string content 1" is actually "string content 1\0".
const char *a = "Hello\0 there.";
cout << a;
// prints "Hello"
For the most part const char *a and const char a[] are the same.
// These are valid and equivalent
const char *a = "Hello";
const char b[] = "there.";
// This is valid
const char *c = b + 3; // c points to "re."
// This, however, is not valid
const char d[] = b + 3;

Initialization of pointers in c++

I need to clarify my concepts regarding the basics of pointer initialization in C++. As per my understanding, a pointer must be assigned an address before putting some value using the pointer.
int *p;
*p=10; //inappropriate
cout << *p <<"\n";
This would probably show the correct output (10), but it may cause issues in larger programs, since p initially holds a garbage address which can be anything and may later be used somewhere else in the program as well. So I believe this is incorrect, and the correct way is:
int *p;
int x=10;
p=&x; //appropriate
cout << *p <<"\n";
My question is, if the above understanding is correct, then does the same apply on char* as well?:
const char *str="hello"; // inappropriate
cout << str << "\n";
//OR
const string str1= "hello";
const char str2[6] ="world";
const char *str=str1; //appropriate
const char *st=str2; //appropriate
cout << str << st << "\n";
Please advise.
Your understanding of strings is incorrect.
Let's take for example the very first line:
const char *str="hello";
This is actually correct. A string literal like "hello" is turned into a constant array by the compiler, and like all arrays it can decay to a pointer to its first element. So what you are doing is making str point to the first character of the array.
Then let's continue with
const string str1= "hello";
const char *str=str1;
This is actually wrong. A std::string object has no conversion operator defined to convert to a const char *. The compiler will give you an error for this. You need to use the c_str function to get a pointer to the contained string.
Lastly:
const char str2[6] ="world";
const char *st=str2; //appropriate
This is really no different than the first line when you declare and initialize str. This is, as you say, "appropriate".
About that first example with the "inappropriate" pointer:
int *p;
*p=10; //inappropriate
cout << *p <<"\n";
This is not only "inappropriate", this leads to undefined behavior and may actually crash your program. Also, the correct term is that the value of p is indeterminate.
When I declare a pointer
int *p;
I get an object p whose values are addresses. No ints are created anywhere. The thing you need to do is think of p as being an address rather than being an int.
At this point, this isn't particularly useful since you have no addresses you could assign to it other than nullptr. Well, technically that's not true: p itself has an address which you can get with &p and store it in an int**, or even do something horrible like p = reinterpret_cast<int*>(&p);, but let's ignore that.
To do something with ints, you need to create one. e.g. if you go on to declare
int x;
you now have an int object whose values are integers, and we could then assign its address to p with p = &x;, and then recover the object from p via *p.
Now, C style strings have weird semantics — the weirdest aspect being that C doesn't actually have strings at all: it's always working with arrays of char.
String literals, like "Hello!", are guaranteed to (act [1] like they) exist as an array of const char located at some address, and by C's odd conversion rules, this array automatically converts to a pointer to its first element. Thus,
const char *str = "hello";
stores the address of the h character in that character array. The declaration
const char str2[6] ="world";
works differently; this (acts [1] like it) creates a brand new array, and copies the contents of the string literal "world" into the new array.
As an aside, there is an obsolete and deprecated feature here for compatibility with legacy programs, but for some misguided reason people still use it in new programs these days so you should be aware of it and that it's 'wrong': you're allowed to break the type system and actually write
char *str = "hello";
This shouldn't work because "hello" is an array of const char, but the standard permits this specific usage. You're still not actually allowed to modify the contents of the array, however.
[1]: By the "as if" rule, the program only has to behave as if things happen as I describe, but if you peeked at the assembly code, the actual way things happen can be very different.

Passing a string to a char array, then modify the char array at position x in C

So I have a string which contains "RGGB" and I need it to be in a char array to perform some operations. Then I need to replace certain characters for a blank space, for example the first 'G', so that my char array remains "R GB".
How can I do this? So far I tried this solution:
int main()
{
string problem="RGGB";
const char *p=problem.c_str();
p[1]=' ';
return p;
}
I get the error:
assignment of read only location *(p + ((sizetype)i))
To access the "internal string" (I mean a const char*) of a std::string, there are two member functions provided: std::string::c_str and std::string::data. Until C++11, the difference was that std::string::data was not required to return a pointer to a null-terminated character sequence, while std::string::c_str was. Now they are equivalent. Both return a const char*, even before C++11.
There are several approaches to your problem:
Use strdup (a POSIX function, not part of std) or std::strcpy / std::strncpy to duplicate the string and write to the duplicate.
Use a const_cast. Pretty drastic, but, if it doesn't hurt any rules (FYI, it does), works.
Don't use std::string at all. Do what you want with a char* and then optionally convert it to a std::string later.
Just use the functionality of std::string.
string, a C++ class, does not provide directly alterable access to its innards via a char *. While you could cast away the const, this is dangerous because compilers may use the const as an optimization path.
If you absolutely need to do this just use a char array, not a string or alter the contents of the string using string methods.
p[1]=' ';
is not valid because const char * is a pointer to read-only data.
In other words, it is a pointer to const (not a const pointer).
remember:
When const appears to the left of the *, what's pointed to is constant, and if const appears to the right of the *, the pointer itself is constant. If const appears on both sides, both are constant.
this code might work:
char problem[]="RGGB";
char* p = problem;
p[1]=' ';
cout<<problem;
The answer is in your question:
I need it to be in a char array
Then put it in a char array:
char problem[] = "RGGB";
problem[1] = ' ';
Problem solved.
If, on the other hand, you want to solve the problem using actual C++:
std::string problem = "RGGB";
problem.at(1) = ' ';

Setting a char array with c_str()?

char el[3] = myvector[1].c_str();
myvector[1] is a string with three letters in it. Why does this give an error?
It returns type const char*, which is a pointer to a string. You can't assign this directly to an array like that, as that array already has memory assigned to it. Try:
const char* el = myvector[1].c_str();
But be very careful: if the string itself is destroyed or changed, the pointer will no longer be valid.
Because a const char * is not a valid initializer for an array. What's more, I believe c_str returns a pointer to internal memory so it's not safe to store.
You probably want to copy the value in some way (memcpy or std::copy or something else).
In addition to what others have said, keep in mind that a string with a length of three characters requires four bytes when converted to a c_str. This is because an extra byte has to be reserved for the null at the end of the string.
Arrays in C++ must know their size, and be provided with initialisers, at compile-time. The value returned by c_str() is only known at run-time. If el were a std::string, as it probably should be, there would be no problem. If it must be a char[], then use strcpy to populate it.
char el[3];
strcpy( el, myvector[1].c_str() );
This assumes that the string myvector[1] contains at most two characters.
Just create a copy of the string. Then, if you ever need to access it as a char*, just do so.
string el = myvector[1];
cout << &el[0] << endl;
Make the string const if you don't need to modify it. Use c_str() on 'el' instead if you want.
Or, just access it right from the vector with:
cout << &myvector[1][0] << endl;
if possible for your situation.

using strings as a pointer to its first character

int main()
{
char name[]="avinash";
const char* nameano="a";
strtok(name,"n");
cout<<"the size of name is"<< sizeof(name);
cout<< name;
}
strtok takes in arguments (char*, const char*); name is an array, and hence a pointer to its first element. But if we make a declaration like
string name="avinash";
and pass name as first argument to strtok, then the program doesn't work, but it should, because name, a string, is a pointer to its first character.
Also, if we write
const string n = "n";
and pass it as second argument it doesn't work; this was my first problem.
Now also the sizeof(name) output is 8, but it should be 4, as avinash has been tokenized. Why does this happen?
You are confusing several things.
strtok takes in arguments (char*, const char*)....name is an array and hence a pointer to its first element...
name is an array, and it's not a pointer to its first element. An array decays to a pointer to its first element in several contexts, but in principle it's a completely different thing. You notice this e.g. when you apply the sizeof operator to a pointer and to an array: on an array you get the array size (i.e. the cumulative size of its elements), on a pointer you get the size of a pointer (which is fixed).
but if we made a declaration like string name="avinash" and passed name as argument
then the prog doesnt work but it should because name of string is a pointer to its first character...
If you make a declaration like
string name="avinash";
you are saying a completely different thing; string here is not a C-string (i.e. a char[]), but the C++ std::string type, which is a class that manages a dynamic string; those two things are completely different.
If you want to obtain a constant C-string (const char *) from a std::string you have to use its c_str() method. Still, you can't use the pointer obtained in this way with strtok, since c_str() returns a pointer to a const C-string, i.e. it cannot be modified. Notice that strtok is not intended to work with C++ strings, since it's part of the legacy C library.
also if we write const string n = "n"; and pass it as second argument it doesnt work...this was my first problem...
This doesn't work for the exact same reason, but in this case you can simply use the c_str() method, since the second argument of strtok is a const char *.
now also the sizeof(name) output is 8 but it should be 4 as avinash has been tokenised..
sizeof returns the "static" size of its operand (i.e. how much memory is allocated for it), it knows nothing about the content of name. To get the length of a C-string you have to use the strlen function; for C++ std::string just use its size() method.
I think you have to pass your name[] array to strtok like this:
strtok(&name[0],"n");
First of all, strtok is C and not C++, and is not made to work with C++ strings. Plus, it can't work if you don't use the return value of the function.
char name[]="avinash";
char * tok_name = strtok(name,"n");
std::cout<<"the length of the token is "<< strlen(tok_name); // sizeof(tok_name) would only give the pointer size
std::cout<< tok_name;
You should consider using std::string::find, std::string::substr and other facilities from the C++ standard library instead of C functions.
As you said, strtok takes arguments of type char * and const char *. A string is not a char * so passing it as the first argument will not compile. For the second argument, you can convert a string to a const char * using the c_str() member function.
What problem are you trying to solve? If you're just trying to learn how strtok works, it would be much better to stick to raw character arrays. That function is inherited from C and thus wasn't designed with C++ strings in mind.
I don't believe the actual size of the name array is ever changed when you call strtok. The call remembers the last location it was called if you pass it a null pointer, and it continues to tokenize the string. The return value of the strtok call is the token that has been found in the string provided. Your code is calling sizeof(name) which is never actually adjusted by the strtok function.
int main()
{
char name[] = "avinash";
const char* nameano = "a";
char* token;
token = strtok(name, "n");
cout << "the length of the TOKEN is" << strlen(token) << endl;
cout << token << endl;
cout << "the length of the string is" << strlen(name) << endl;
cout << name << endl;
}
Try this maybe? I am not in front of a compiler so it's likely I made a mistake, but it should get you on the right track to solve this problem.
This might also help:
http://msdn.microsoft.com/en-us/library/2c8d19sb(v=vs.71).aspx