string and const char* and .c_str()? - c++

I'm getting a weird problem and I want to know why it behaves like that. I have a class with a member function that returns std::string. My goal is to convert this string to const char*, so I did the following
const char* c;
c = robot.pose_Str().c_str(); // is this safe??????
udp_slave.sendData(c);
The problem is that I'm getting weird characters on the master side. However, if I do the following
const char* c;
std::string data(robot.pose_Str());
c = data.c_str();
udp_slave.sendData(c);
I'm getting what I'm expecting. My question is what is the difference between the two aforementioned methods?

It's a matter of pointing to a temporary.
If you return by value but don't store the string, the temporary is destroyed at the end of the full expression (in practice, at the semicolon).
If you store it in a variable, then the pointer is pointing to something that actually exists for the duration of your udp send.
Consider the following:
int f() { return 2; }
int*p = &f();
Now that seems silly on its face, doesn't it? You are pointing at a value that is being copied back from f. You have no idea how long it's going to live.
Your string is the same way.
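As a rough sketch of the two patterns that keep the pointer valid, assuming a hypothetical pose_str() that returns by value and a hypothetical send_data() that consumes the bytes before it returns (neither is the asker's real API):

```cpp
#include <string>

// Hypothetical stand-in for robot.pose_Str(): returns a std::string by value.
std::string pose_str() { return "x=1.0 y=2.0 theta=0.5"; }

// Hypothetical stand-in for udp_slave.sendData(): it copies the bytes it is
// given before returning, so it never keeps the pointer around.
std::string send_data(const char* bytes) { return std::string(bytes); }

// Pattern 1: a named std::string owns the characters for the whole scope.
std::string send_via_named_copy() {
    std::string data = pose_str();
    const char* c = data.c_str();  // valid while data is alive and unmodified
    return send_data(c);
}

// Pattern 2: the temporary lives until the end of the full expression,
// so passing c_str() directly as a function argument is fine.
std::string send_in_one_expression() {
    return send_data(pose_str().c_str());
}
```

What breaks the original code is storing the pointer in one statement and using it in a later one.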

.c_str() returns a char const* by value, which means you get a copy of the pointer. But after that statement, the actual character array that it points to is destroyed along with the temporary string. That is why you get garbage. In the latter case you are creating a new string by copying the characters from their original location. In this case, although the original character array is destroyed, the copy remains in the string object.

You can't use the data pointed to by c_str() past the lifetime of the std::string object from whence it came. Sometimes it's not clear what the lifetime is, such as the code below. The solution is also shown:
#include <string>
#include <cstddef>
#include <cstring>
std::string foo() { return "hello"; }
char *
make_copy(const char *s) {
std::size_t sz = std::strlen(s);
char *p = new char[sz + 1]; // +1 for the terminating null character
std::strcpy(p, s);
return p;
}
int
main() {
const char *p1 = foo().c_str(); // Whoops, can't use p1 after this statement.
const char *p2 = make_copy(foo().c_str()); // Okay, but you have to delete [] when done.
}
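If remembering the delete [] is a concern, one possible variation is to let a smart pointer own the copy. This is only a sketch; make_owned_copy is a name made up here and is not part of the answer above:

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

std::string foo() { return "hello"; }

// Copies s, including its terminating null, into a buffer that frees itself
// when the returned unique_ptr goes out of scope.
std::unique_ptr<char[]> make_owned_copy(const char* s) {
    const std::size_t len = std::strlen(s);
    std::unique_ptr<char[]> copy(new char[len + 1]);  // +1 for '\0'
    std::memcpy(copy.get(), s, len + 1);
    return copy;
}
```

A call such as make_owned_copy(foo().c_str()) is fine because the copy is made before the temporary returned by foo() is destroyed.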

From c_str():
The pointer obtained from c_str() may be invalidated by:
Passing a non-const reference to the string to any standard library function, or
Calling non-const member functions on the string, excluding operator[], at(), front(), back(), begin(), rbegin(), end() and rend().
Which means that, if the string returned by robot.pose_Str() is destroyed or changed by any non-const function, the pointer to the string will be invalidated. Since you may be returning a temporary copy from robot.pose_Str(), the pointer returned by c_str() on it becomes invalid right after that statement.
Yet, if you return a reference to the inner string you may be holding, instead of a temporary copy, you can either:
be sure it is going to work, in case your function udp_send is synchronous;
or rely on an invalid pointer, and thus experience undefined behavior if udp_send may finish after some possible modification on the inner contents of the original string.
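A small sketch of the reference-returning variant, assuming a hypothetical Robot class (the member names are made up for illustration). The pointer is taken and consumed in one go, so a later modification of the member cannot invalidate it in between:

```cpp
#include <string>

// Hypothetical robot class; the names are illustrative only.
class Robot {
public:
    // Returns a reference to the member, so no temporary copy is created.
    const std::string& pose_str() const { return pose_; }
    // Non-const operation: may invalidate pointers obtained from c_str().
    void set_pose(const std::string& pose) { pose_ = pose; }
private:
    std::string pose_ = "x=0 y=0";
};

// Fetches a fresh c_str() on every call instead of caching an old pointer.
std::string snapshot(const Robot& robot) {
    const char* c = robot.pose_str().c_str();  // points into robot's member
    return std::string(c);                     // copied while still valid
}
```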

Q
const char* c;
c = robot.pose_Str().c_str(); // is this safe??????
udp_slave.sendData(c);
A
This is potentially unsafe. It depends on what robot.pose_Str() returns. If the life of the returned std::string is longer than the life of c, then it is safe. Otherwise, it is not.
You are storing an address in c that is going to be invalid right after the statement is finished executing.
std::string s = robot.pose_Str();
const char* c = s.c_str(); // This is safe
udp_slave.sendData(c);
Here, you are storing an address in c that will be valid until you leave the scope in which s and c are defined.

Related

C++ sending forming and sending JSON structures and posting with CurlLib [duplicate]

I have a function that is returning a string. However, when I call it and do c_str() on it to convert it into a const char*, it only works when I store it into another string first. If I directly call c_str() off of the function, it stores garbage value in the const char*.
Why is this happening? Feel like I'm missing something very fundamental here...
string str = SomeFunction();
const char* strConverted = str.c_str(); // strConverted stores the value of the string properly
const char* charArray= SomeFunction().c_str(); // charArray stores garbage value
static string SomeFunction()
{
string str;
// does some string stuff
return str;
}
SomeFunction().c_str() gives you a pointer into a temporary (the returned copy of the automatic variable str from the body of SomeFunction). Unlike with references, the lifetime of the temporary isn't extended in this case, so charArray ends up being a dangling pointer, which explains the garbage value you see later on when you try to use charArray.
On the other hand, when you do
string str_copy = SomeFunction();
str_copy is a copy of the return value of SomeFunction(). Calling c_str() on it now gives you a pointer to valid data.
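For comparison, here is a sketch of the reference case mentioned above: binding the returned temporary to a const reference does extend its lifetime. The body of SomeFunction is a placeholder, and read_through_reference is a name invented for this example:

```cpp
#include <string>

// Placeholder body; the original function "does some string stuff".
std::string SomeFunction() { return "some string stuff"; }

std::string read_through_reference() {
    // Binding the returned temporary to a const reference extends its
    // lifetime to that of the reference, so c_str() stays valid here.
    const std::string& kept = SomeFunction();
    const char* p = kept.c_str();
    return std::string(p);
}
```

The extension applies to the std::string temporary itself, not to a pointer obtained from it, so const char* p = SomeFunction().c_str(); still dangles.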
The value object returned by a function is a temporary. The results of c_str() are valid only through the lifetime of the temporary. The lifetime of the temporary in most cases is to the end of the full expression, which is often the semicolon.
const char *p = SomeFunction().c_str();
printf("%s\n", p); // p points to invalid memory here.
The workaround is to make sure that you use the result of c_str() before the end of the full expression, for example by copying it:
#include <cstring>
char *strdup(const char *src_str) noexcept {
char *new_str = new char[std::strlen(src_str) + 1];
std::strcpy(new_str, src_str);
return new_str;
}
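const char *p = strdup(SomeFunction().c_str());
Note that strdup is a POSIX function, so if you are on a platform that supports POSIX, it's already there. The POSIX version allocates with malloc, so its result must be released with free() rather than delete[].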
The "string str" in SomeFunction() is a local variable and only exists inside the scope of SomeFunction().
Since the return type of SomeFunction() is string, not a reference to string, "return str;" returns a copy of the value of str. That copy is stored as a temporary, and the temporary is destroyed at the end of the statement that calls SomeFunction().
"string str = SomeFunction();" copies the returned temporary into str, which has its own memory and outlives the temporary. After the ";" the temporary is destroyed and its memory is released, but the copy is still stored in str. That is why "const char* strConverted = str.c_str();" gets the right value: c_str() returns a pointer to the first character of str, not of the returned temporary.
"const char* charArray= SomeFunction().c_str();" is different. "SomeFunction().c_str()" returns a pointer to the first character of the returned temporary. Once the statement ends, the temporary is destroyed and its memory may be reused, so charArray still holds that address but no longer the value you expected.
Use strcpy to copy the string to a locally defined array and your code will work fine.

Is there a dangling pointer problem in this code?

string str;
char *a=str.c_str();
This code is working fine for me but every place else I see this code instead
string str;
char *a=new char[str.length()];
strcpy(a,str.c_str());
I wonder which one is correct and why?
Assuming that the type of str is std::string, neither snippet is correct.
char *a=str.c_str();
is invalid because c_str() will return const char* and removing const without casting (usually const_cast) is invalid.
char *a=new char[str.length()];
strcpy(a,str.c_str());
is invalid because str.length() doesn't count the terminating null character, while strcpy() requires room for it in the destination.
There is no dangling pointer problem in the code posted here because no pointers are invalidated.
The two code segments do different things.
The first assigns the pointer value of str to your new C-style string and implicitly converts from const char* (the c_str() return type) to char*, which is not allowed. If you were to change your new string you would face an error. Even if c_str() returned char*, altering the new string would also make changes in str.
The second, on the other hand, creates a new C-style string from the original string, copying it byte by byte into the memory allocated for your new string.
However, the line of code you wrote is incorrect, because it does not allocate space for the terminating null character \0 of a C-style string. To fix that, allocate 1 extra byte for it:
char *a=new char[str.length()+1];
After copying the data from the first string to your new one, making alterations to it will not result in changes in the original str.
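If a modifiable copy is all that is needed, a std::vector<char> can hold it without any manual new or delete. This is a sketch of that alternative, not what the question's code does; to_mutable_buffer is a made-up name:

```cpp
#include <string>
#include <vector>

// Returns a modifiable, null-terminated copy of str's characters.
std::vector<char> to_mutable_buffer(const std::string& str) {
    // size() + 1 so that the terminating '\0' from c_str() is copied too.
    return std::vector<char>(str.c_str(), str.c_str() + str.size() + 1);
}
```

The buffer's data() can then be passed wherever a char* is expected, and writing through it leaves the original string untouched.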
Possibly.
Consider this.
char const* get_string() {
string str{"Hello"};
return str.c_str();
}
That function returns a pointer to the internal value of str, which goes out of scope when the function returns. You have a dangling pointer. Undefined behaviour. Watch out for time-travelling nasal monkeys.
Now consider this.
char const* get_string() {
string str{"Hello"};
char* a = new char[str.length()+1]; // non-const so that strcpy can write to it
strcpy(a, str.c_str());
return a;
}
That function returns a valid pointer to a valid null-terminated C-style string. No dangling pointer. If you forget to delete[] it you will have a memory leak, but that's not what you asked about.
The difference is one of object lifetime. Be aware of scope.
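One way to sidestep both the dangling pointer and the manual delete[] is to return the std::string itself and take c_str() at the call site. A minimal sketch, where greeting_length is a hypothetical caller added for illustration:

```cpp
#include <cstddef>
#include <string>

// Returning the std::string by value hands ownership to the caller.
std::string get_string() {
    std::string str{"Hello"};
    return str;  // moved or elided; no pointer into a dead object escapes
}

// Hypothetical caller: keeps the string alive while the pointer is in use.
std::size_t greeting_length() {
    const std::string s = get_string();
    const char* a = s.c_str();  // valid while s is in scope
    return std::char_traits<char>::length(a);
}
```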

Why can't a pointer be initialized with another pointer to a const char?

Here's another 'reinventing-the-wheel' problem we were given in our Introduction to C++ classes:
Write a function that returns the position of the first occurrence of
a sequence of characters in a string, i.e. a variation of the strstr
function.
I started writing the function as follows:
int strstr2(const char *text, const char *pattern) {
int pos = 0;
char *temp;
temp = text;
}
I thought I'd remember the address of the first character of the string for future use within the function, but the compiler said:
A value of type "const char*" cannot be assigned to an entity of type "char*".
I know that one cannot change a constant once it has been initialized, but why am I not able to assign a pointer to a constant char to another non-constant pointer?
I read several questions referring to pointers and constants, and it seems the bottom of the accepted answer to this post might answer my question, but I'm not a hundred percent sure, as the discussion is still at too advanced a level for me.
My second question is, what is the workaround? How can I define a pointer pointing at the beginning of the string?
Thanks.
This has to do with const-correctness. const char *text means text is a pointer to a constant char. That means if you try to do something like
*text = 'a'
The compiler will issue an error since you are trying to modify a const object. If you could do
char *temp;
temp = text;
then you could do
*temp = 'a'
and there would be no error, even though you just modified a const object. This is why C++ requires you to use const_cast if you actually want to cast away const (there are a few use cases for this but they are by far not what you normally want to do).
Danger, there be dragons below. Be very, very careful if you decide to use const_cast
Let's say you have to deal with an old API call that only takes a char*, but it guarantees it won't modify the string. Then you could use something like
int wrap_api(const char *text)
{
return api_call(const_cast<char*>(text));
}
and this will be "okay" since api_call guarantees it won't modify the string. If, on the other hand, api_call could modify it, then this would only be legal if what text points to isn't actually const, like
char foo[] = "test";
wrap_api(foo);
would be legal if foo gets modified but
const char* foo = "test";
wrap_api(foo);
would not be legal if foo gets modified and is undefined behavior.
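To make the wrapper idea concrete, here is a self-contained sketch. legacy_count_chars is a made-up stand-in for an old API that takes char* but only reads from it; it is not a real library function:

```cpp
#include <cstddef>

// Hypothetical legacy-style function: takes char* but never writes to it.
std::size_t legacy_count_chars(char* text) {
    std::size_t n = 0;
    while (text[n] != '\0') ++n;
    return n;
}

// Const-correct wrapper. The cast is safe only because the callee is
// known not to modify the characters it is given.
std::size_t count_chars(const char* text) {
    return legacy_count_chars(const_cast<char*>(text));
}
```

Callers can pass string literals or const buffers to count_chars, and the const_cast stays confined to one place.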
If the assignment would be allowed, you would be able to legally write:
*temp = 'x'; // write to char* is legal in general
But that would be bad in this case because you'd be writing to a constant.
The standard could have said that the temp = text assignment is legal and that *temp = 'x' is undefined behavior, but that wouldn't make sense because the only difference between T* and const T* is whether you can write through it.
So it is only logical that C++ disallows assigning a T* type the value of a const T* type to save you from later doing something bad, and instead forces you to use const char* temp; temp = text; in this case.
This is a part of const-correctness type safety. Since text is a pointer to const char, you can't modify the characters it points to through it. This is a good thing and a safety measure.
However, that safety would be defeated if you were allowed to assign text to a pointer to non-const char, because then you could modify the characters through that pointer and bypass the protection.
Because of that, this assignment is not allowed. To fix it, mark your temp as pointer to const char as well: const char* temp.
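With temp declared as a pointer to const char, the function from the question can be completed along these lines. This is one possible sketch of the exercise; returning -1 when the pattern is not found is an assumption made here, not something the assignment text specifies:

```cpp
// Returns the index of the first occurrence of pattern in text,
// or -1 if pattern does not occur (the -1 convention is assumed).
int strstr2(const char* text, const char* pattern) {
    const char* start = text;  // pointer to const char: reading is allowed
    for (;; ++start) {
        const char* t = start;
        const char* p = pattern;
        // Advance while the pattern has characters left and they match.
        while (*p != '\0' && *t == *p) { ++t; ++p; }
        if (*p == '\0') return static_cast<int>(start - text);  // full match
        if (*start == '\0') return -1;  // reached the end of text
    }
}
```

Nothing in the function writes through the pointers, so const char* is all it needs.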

Casting c_str() only works for short strings

I'm using a C library in C++ and wrote a wrapper. At one point I need to convert an std::string to a c-style string. There is a class with a function, which returns a string. Casting the returned string works if the string is short, otherwise not. Here is a simple and reduced example illustrating the issue:
#include <iostream>
#include <string>
class StringBox {
public:
std::string getString() const { return text_; }
StringBox(std::string text) : text_(text){};
private:
std::string text_;
};
int main(int argc, char **argv) {
const unsigned char *castString = NULL;
std::string someString = "I am a loooooooooooooooooong string"; // Won't work
// std::string someString = "hello"; // This one works
StringBox box(someString);
castString = (const unsigned char *)box.getString().c_str();
std::cout << "castString: " << castString << std::endl;
return 0;
}
Executing the file above prints this to the console:
castString:
whereas if I swap the commenting on someString, it correctly prints
castString: hello
How is this possible?
You are invoking c_str on a temporary string object returned by the getString() member function. The pointer returned by c_str() is only valid as long as the original string object exists, so at the end of the line where you assign castString it ends up being a dangling pointer. Officially, this leads to undefined behavior.
So why does this work for short strings? I suspect that you're seeing the effects of the Short String Optimization, an optimization where for strings less than a certain length the character data is stored inside the bytes of the string object itself rather than in the heap. It's possible that the temporary string that was returned was stored on the stack, so when it was cleaned up no deallocations occurred and the pointer to the expired string object still holds your old string bytes. This seems consistent with what you're seeing, but it still doesn't mean what you're doing is a good idea. :-)
box.getString() is an anonymous temporary. The pointer returned by c_str() is only valid for the lifetime of that temporary.
So in your case, c_str() is invalidated by the time you get to the std::cout. The behaviour of reading the pointer contents is undefined.
(Interestingly the behaviour of your short string is possibly different due to std::string storing short strings in a different way.)
As you return by value
box.getString() is a temporary and so
box.getString().c_str() is valid only during the expression, then it is a dangling pointer.
You may fix that with
const std::string& getString() const { return text_; }
box.getString() produces a temporary. Calling c_str() on that gives you a pointer to a temporary. After the temporary ceases to exist, which is immediately, the pointer is invalid, a dangling pointer.
Using a dangling pointer is Undefined Behavior.
First of all, your code has UB independent of the length of the string: At the end of
castString = (const unsigned char *)box.getString().c_str();
the string returned by getString is destroyed and castString is a dangling pointer to the internal buffer of the destroyed string object.
The reason your code "works" for small strings is probably Small String Optimization: short strings are (commonly) saved in the string object itself instead of in a dynamically allocated array, and apparently that memory is still accessible and unmodified in your case.
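Putting the reference-returning fix together with the cast, a sketch could look like this. as_bytes is a helper name invented here, and the C-style cast is replaced by reinterpret_cast:

```cpp
#include <string>

class StringBox {
public:
    explicit StringBox(std::string text) : text_(text) {}
    // Returning a const reference avoids creating a temporary copy.
    const std::string& getString() const { return text_; }
private:
    std::string text_;
};

// The pointer refers to box's own member, so it is valid while box is
// alive and unmodified, regardless of the string's length.
const unsigned char* as_bytes(const StringBox& box) {
    return reinterpret_cast<const unsigned char*>(box.getString().c_str());
}
```

The box itself still has to outlive every use of the returned pointer.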

Initializing char pointer

I have a function
void ValArgument(char* ptr){
char str[] = "hello world";
ptr = &str[0];
}
In this function, I want to initialize a char array and assign it to the char pointer ptr. I call the function like this:
char* ptr= NULL;
ValArgument(ptr);
The pointer returned still has the value NULL. Why? I expected that the pointer will point onto the char array str[].
The pointer returned still has the value NULL. Why?
Because you passed the pointer by value. That means that the function is given a separate copy of the pointer, and any changes it makes to the pointer will not affect the caller's copy.
You can either pass by reference:
void ValArgument(char *& ptr)
// ^
or return a value:
char * ValArgument();
I expected that the pointer will point onto the char array str[].
No; once you've fixed that problem, it will point to the undead husk of the local variable that was destroyed when the function returned. Any attempt to use the pointer will cause undefined behaviour.
Depending on what you need to do with the string, you might want:
a pointer to a string literal, char const * str = "hello world";. Note that this should be const, since string literals can't be modified.
a pointer to a static array, static char str[] = "hello world";. This means that there is only one string shared by everyone, so any modification will affect everyone.
a pointer to a dynamically allocated array. Don't go there.
a string object, std::string str = "hello world";. This is the least error-prone, since it can be passed around like a simple value.
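Two of these options can be sketched as follows. MakeGreeting is a name made up for this example, and ValArgument is changed here to take a reference to a pointer to const char:

```cpp
#include <string>

// Option 1: pass the pointer by reference and point it at a string literal,
// which has static storage duration and therefore outlives the call.
void ValArgument(const char*& ptr) {
    ptr = "hello world";
}

// Option 2: return a std::string by value; the caller owns the characters.
std::string MakeGreeting() {
    return "hello world";
}
```

In both cases nothing refers to a local array after the function has returned.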