I have a function that is returning a string. However, when I call it and do c_str() on it to convert it into a const char*, it only works when I store it into another string first. If I directly call c_str() off of the function, it stores garbage value in the const char*.
Why is this happening? Feel like I'm missing something very fundamental here...
string str = SomeFunction();
const char* strConverted = str.c_str(); // strConverted stores the value of the string properly
const char* charArray= SomeFunction().c_str(); // charArray stores garbage value
static string SomeFunction()
{
string str;
// does some string stuff
return str;
}
SomeFunction().c_str() gives you a pointer into a temporary: the string object returned by SomeFunction(), not the automatic variable str itself, which is gone once the function returns. Unlike a temporary bound to a const reference, this temporary's lifetime isn't extended, so it is destroyed at the end of the full expression and charArray is left as a dangling pointer. That explains the garbage value you see later on when you try to use charArray.
On the other hand, when you do
string str_copy = SomeFunction();
str_copy is a copy of the return value of SomeFunction(). Calling c_str() on it now gives you a pointer to valid data.
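A minimal sketch of the two ways to keep the returned string alive, a named string and a const reference. The body of SomeFunction here is a hypothetical stand-in that just returns a fixed text, and the two helper names are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the function in the question.
static std::string SomeFunction()
{
    std::string str = "hello from SomeFunction";
    return str;
}

// Keeps the result in a named string, so the pointer from c_str() stays
// valid for as long as `owned` is alive and unmodified.
std::size_t LengthViaNamedString()
{
    std::string owned = SomeFunction();
    const char* p = owned.c_str();
    return std::strlen(p);
}

// Binding the temporary to a const reference extends its lifetime to that
// of the reference, so the pointer is valid until `ref` goes out of scope.
std::size_t LengthViaConstReference()
{
    const std::string& ref = SomeFunction();
    const char* p = ref.c_str();
    return std::strlen(p);
}
```

In both helpers the pointer is only used while the string that owns the characters still exists, which is the property the one-line version in the question lacks.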
The object returned by value from a function is a temporary. The result of c_str() is valid only for the lifetime of that temporary, which in most cases lasts until the end of the full expression, often the semicolon.
const char *p = SomeFunction().c_str();
printf("%s\n", p); // p points to invalid memory here.
The workaround is to make sure that you use the result of c_str() before the end of the full expression.
#include <cstring>
char *strdup(const char *src_str) noexcept {
char *new_str = new char[std::strlen(src_str) + 1];
std::strcpy(new_str, src_str);
return new_str;
}
const char *p = strdup(SomeFunction().c_str()); // the copy is made before the temporary is destroyed
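Note that strdup is a POSIX function, so if you are on a platform that supports POSIX, it's already there. The POSIX version allocates with malloc, so release its result with free() rather than delete[].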
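As a rough sketch of "use the result before the end of the full expression", the helpers below either consume the pointer immediately or copy the characters into a buffer that owns them. SomeFunction's body, DuplicateCString, and the other helper names are assumptions made for this example, and std::unique_ptr is used here in place of a raw new[] so the copy is released automatically:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical stand-in for the function in the question.
static std::string SomeFunction() { return "temporary text"; }

// Safe: the temporary string lives until the end of the full expression,
// so strlen finishes reading the characters before it is destroyed.
std::size_t LengthInSameExpression()
{
    return std::strlen(SomeFunction().c_str());
}

// Copies a C string, including its terminating null, into an owning buffer.
std::unique_ptr<char[]> DuplicateCString(const char* src)
{
    assert(src != nullptr);
    const std::size_t size = std::strlen(src) + 1;
    std::unique_ptr<char[]> copy(new char[size]);
    std::memcpy(copy.get(), src, size);
    return copy;
}

// Safe: the copy is made while the temporary still exists, and the
// returned buffer no longer depends on it.
std::unique_ptr<char[]> OwnedCopyOfResult()
{
    return DuplicateCString(SomeFunction().c_str());
}
```

The caller can keep the unique_ptr for as long as it needs the text and pass copy.get() to C APIs that expect a const char*.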
The "string str" in SomeFunction() is a local variable, and it exists only inside the scope of SomeFunction().
Because the return type of SomeFunction() is string, not a reference to string, "return str;" hands the caller a temporary string holding the value of str (conceptually a copy, although the compiler may move or elide it). That temporary is destroyed at the end of the full expression that contains the call to SomeFunction().
"string str = SomeFunction();" initializes str with the returned value, so str owns its own buffer and lives longer than the returned temporary. After the ";" the temporary is gone and its memory may be reclaimed, but the value is still stored in str. That is why "const char* strConverted = str.c_str();" gets the right value: c_str() returns a pointer to the first character of the buffer owned by str, not to the returned temporary.
"const char* charArray= SomeFunction().c_str();" is different. Here "SomeFunction().c_str()" returns a pointer to the first character of the buffer owned by the returned temporary. Once the statement ends, the temporary is destroyed and that memory may be reused, so charArray still holds the address but no longer points to the value you expected.
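Use strcpy to copy the characters into a locally defined array in the same statement as the call, making sure the array is large enough for the string plus its terminating null, and your code will work fine.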
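A hedged sketch of copying into a local array: it uses a length-checked memcpy rather than a bare strcpy, so an undersized array truncates the text instead of overflowing. SomeFunction's body and the CopyToBuffer helper are hypothetical names introduced for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the function in the question.
static std::string SomeFunction() { return "copied text"; }

// Copies at most size - 1 characters of `src` into `dest` and always
// null-terminates. Returns false if the text had to be truncated.
bool CopyToBuffer(const std::string& src, char* dest, std::size_t size)
{
    assert(dest != nullptr && size > 0);
    const std::size_t count = src.size() < size ? src.size() : size - 1;
    std::memcpy(dest, src.data(), count);
    dest[count] = '\0';
    return count == src.size();
}
```

A call such as `char buffer[32]; CopyToBuffer(SomeFunction(), buffer, sizeof buffer);` is safe because the temporary is bound to the const reference parameter for the whole call, and afterwards buffer holds its own copy of the characters.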
I'm using a C library in C++ and wrote a wrapper. At one point I need to convert an std::string to a c-style string. There is a class with a function, which returns a string. Casting the returned string works if the string is short, otherwise not. Here is a simple and reduced example illustrating the issue:
#include <iostream>
#include <string>
class StringBox {
public:
std::string getString() const { return text_; }
StringBox(std::string text) : text_(text){};
private:
std::string text_;
};
int main(int argc, char **argv) {
const unsigned char *castString = NULL;
std::string someString = "I am a loooooooooooooooooong string"; // Won't work
// std::string someString = "hello"; // This one works
StringBox box(someString);
castString = (const unsigned char *)box.getString().c_str();
std::cout << "castString: " << castString << std::endl;
return 0;
}
Executing the file above prints this to the console:
castString:
whereas if I swap the commenting on someString, it correctly prints
castString: hello
How is this possible?
You are invoking c_str() on a temporary string object returned by the getString() member function. The pointer returned by c_str() is only valid as long as the original string object exists, so at the end of the line where you assign castString it ends up being a dangling pointer. Officially, this leads to undefined behavior.
So why does this work for short strings? I suspect that you're seeing the effects of the Short String Optimization, an optimization where for strings less than a certain length the character data is stored inside the bytes of the string object itself rather than in the heap. It's possible that the temporary string that was returned was stored on the stack, so when it was cleaned up no deallocations occurred and the pointer to the expired string object still holds your old string bytes. This seems consistent with what you're seeing, but it still doesn't mean what you're doing is a good idea. :-)
box.getString() is an anonymous temporary. The pointer returned by c_str() is only valid for the lifetime of that temporary.
So in your case, c_str() is invalidated by the time you get to the std::cout. The behaviour of reading the pointer contents is undefined.
(Interestingly the behaviour of your short string is possibly different due to std::string storing short strings in a different way.)
Because you return by value, box.getString() is a temporary, so box.getString().c_str() is valid only during the full expression; after that it is a dangling pointer.
You may fix that with
const std::string& getString() const { return text_; }
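A small sketch of that fix applied to the class from the question. The AsUnsignedBytes helper is a hypothetical name added for illustration; the rest follows the original StringBox:

```cpp
#include <cassert>
#include <string>
#include <utility>

class StringBox {
public:
    explicit StringBox(std::string text) : text_(std::move(text)) {}

    // Returns a reference to the member instead of a temporary copy.
    const std::string& getString() const { return text_; }

private:
    std::string text_;
};

// The returned pointer refers to the buffer owned by box's member, so it
// stays valid only while `box` is alive and its text is not modified.
// Passing a temporary StringBox here would bring the dangling pointer back.
const unsigned char* AsUnsignedBytes(const StringBox& box)
{
    return reinterpret_cast<const unsigned char*>(box.getString().c_str());
}
```

With this version the pointer is tied to the lifetime of box rather than to a temporary, so printing it on the next line is fine as long as box is still in scope.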
box.getString() produces a temporary. Calling c_str() on that gives you a pointer to a temporary. After the temporary ceases to exist, which is immediately, the pointer is invalid, a dangling pointer.
Using a dangling pointer is Undefined Behavior.
First of all, your code has UB independent of the length of the string: At the end of
castString = (const unsigned char *)box.getString().c_str();
the string returned by getString is destroyed and castString is a dangling pointer to the internal buffer of the destroyed string object.
The reason your code "works" for small strings is probably Small String Optimization: short strings are (commonly) stored in the string object itself instead of in a dynamically allocated array, and apparently that memory is still accessible and unmodified in your case.
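To see whether a particular standard library stores a given string inline, one can check whether the character buffer lies inside the bytes of the string object. This is only an illustrative probe under the assumption that the implementation uses such an optimization at all; the StoresCharactersInline name is made up, and the result for short strings depends on the library:

```cpp
#include <cassert>
#include <functional>
#include <string>

// Returns true if the character buffer of `s` is located inside the bytes
// of the string object itself, which is what a small string optimization
// does. std::less is used because it gives a total order on pointers.
bool StoresCharactersInline(const std::string& s)
{
    const char* object_begin = reinterpret_cast<const char*>(&s);
    const char* object_end = object_begin + sizeof(std::string);
    const char* buffer = s.data();
    std::less<const char*> before;
    return !before(buffer, object_begin) && before(buffer, object_end);
}
```

A string much longer than sizeof(std::string) cannot be stored inline, while the answer for something like "hello" varies between implementations. Either way, reading through a pointer into a destroyed string remains undefined behavior.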
I have a function
void ValArgument(char* ptr){
char str[] = "hello world";
ptr = &str[0];
}
In this function, I want to initialize a char array and assign its address to the char pointer ptr. I call the function like this:
char* ptr= NULL;
ValArgument(ptr);
The pointer returned still has the value NULL. Why? I expected that the pointer will point onto the char array str[].
The pointer returned still has the value NULL. Why?
Because you passed the pointer by value. That means that the function is given a separate copy of the pointer, and any changes it makes to the pointer will not affect the caller's copy.
You can either pass by reference:
void ValArgument(char *& ptr)
// ^
or return a value:
char * ValArgument();
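The difference between these options can be sketched as follows. The function names are hypothetical, and a string literal is used as the target because it has static storage duration, so the pointer does not dangle after the call:

```cpp
#include <cassert>

// Receives a copy of the caller's pointer, so the assignment changes only
// the local copy and the caller's pointer keeps its old value.
void SetByValue(const char* ptr)
{
    ptr = "hello world";
    (void)ptr; // silence the unused-value warning
}

// Receives a reference to the caller's pointer, so the assignment is
// visible to the caller.
void SetByReference(const char*& ptr)
{
    ptr = "hello world";
}

// Returns the pointer instead of writing through a parameter.
const char* ReturnPointer()
{
    return "hello world";
}
```

After `const char* p = nullptr; SetByValue(p);` the pointer p is still null, whereas SetByReference(p) or `p = ReturnPointer();` leaves it pointing at the literal.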
I expected that the pointer will point onto the char array str[].
No; once you've fixed that problem, it will point to the undead husk of the local variable that was destroyed when the function returned. Any attempt to use the pointer will cause undefined behaviour.
Depending on what you need to do with the string, you might want:
a pointer to a string literal, char const * str = "hello world";. Note that this should be const, since string literals can't be modified.
a pointer to a static array, static char str[] = "hello world";. This means that there is only one string shared by everyone, so any modification will affect everyone.
a pointer to a dynamically allocated array. Don't go there.
a string object, std::string str = "hello world";. This is the least error-prone, since it can be passed around like a simple value.
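A minimal sketch of the last option, with hypothetical function names. Because the string is returned or filled by value, the caller ends up owning the characters and nothing points at a destroyed local:

```cpp
#include <cassert>
#include <string>

// Returns the string by value; the caller receives its own string object.
std::string MakeGreeting()
{
    std::string str = "hello world";
    return str;
}

// Alternative: fill a string that the caller already owns.
void FillGreeting(std::string& out)
{
    out = "hello world";
}
```

If a C-style pointer is needed later, calling c_str() on the caller's own named string, for example `std::string kept = MakeGreeting(); const char* p = kept.c_str();`, gives a pointer that stays valid while kept is alive and unmodified.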