Code:
#include <iostream>
using namespace std;
int main() {
string str("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
const char* temp;
temp = str.substr(0, str.length()).c_str();
printf(str.substr(0, str.length()).c_str());
printf(temp);
const char* test = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
printf(test);
return 0;
}
Output:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
�$P
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Can someone explain this?
Your compiler should warn you about this line:
temp = str.substr(0, str.length()).c_str();
Warning C26815 The pointer is dangling because it points at a temporary instance which was destroyed.
What is happening is that str.substr() creates and returns a std::string object, but that object is never assigned to a variable. Instead, a pointer to its buffer is retrieved with c_str(), and the temporary object itself is destroyed at the end of that statement (you could say it is 'abandoned').
The pointer to its buffer is therefore no longer valid. It is only by accident that some of the data still partially looks right. Using the pointer is undefined behavior.
The way you are assigning temp creates a dangling pointer. A dangling pointer is a pointer that points to invalid data; in this case, the invalid data is the sub-string returned by str.substr. That sub-string is a temporary, so it is destroyed at the end of the statement that creates it. You can correct this by storing the sub-string in a new variable:
#include <iostream>
using namespace std;
int main() {
string str("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
const char* temp;
//this is the important line.
string substr = str.substr(0, str.length());
temp = substr.c_str();
printf(str.substr(0, str.length()).c_str());
printf(temp);
const char* test = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
printf(test);
return 0;
}
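A minimal sketch of the same fix, pulled into two small helpers. The names take_prefix and print_line are made up for illustration and do not appear in the question. The sketch also passes the text through a "%s" format instead of using it as the format string, since printf(temp) would misbehave if the data ever contained a % character.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: returns the substring by value, so the caller
// owns a real std::string instead of a pointer into a dead temporary.
std::string take_prefix(const std::string& s, std::size_t n) {
    return s.substr(0, n);
}

// Hypothetical helper: prints the string followed by a newline and
// returns what printf returns (the number of characters written).
int print_line(const std::string& s) {
    // s is alive for the whole call, so s.c_str() is valid here.
    return std::printf("%s\n", s.c_str());
}
```

As long as the result of take_prefix is stored in a named std::string, any pointer obtained from its c_str() stays valid until that string is modified or goes out of scope.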
Related
My question can be boiled down to, where does the string returned from stringstream.str().c_str() live in memory, and why can't it be assigned to a const char*?
This code example will explain it better than I can
#include <string>
#include <sstream>
#include <iostream>
using namespace std;
int main()
{
stringstream ss("this is a string\n");
string str(ss.str());
const char* cstr1 = str.c_str();
const char* cstr2 = ss.str().c_str();
cout << cstr1 // Prints correctly
<< cstr2; // ERROR, prints out garbage
system("PAUSE");
return 0;
}
The assumption that stringstream.str().c_str() could be assigned to a const char* led to a bug that took me a while to track down.
For bonus points, can anyone explain why replacing the cout statement with
cout << cstr // Prints correctly
<< ss.str().c_str() // Prints correctly
<< cstr2; // Prints correctly (???)
prints the strings correctly?
I'm compiling in Visual Studio 2008.
stringstream.str() returns a temporary string object that's destroyed at the end of the full expression. If you get a pointer to a C string from that (stringstream.str().c_str()), it will point to a string which is deleted where the statement ends. That's why your code prints garbage.
You could copy that temporary string object to some other string object and take the C string from that one:
const std::string tmp = stringstream.str();
const char* cstr = tmp.c_str();
Note that I made the temporary string const, because any changes to it might cause it to re-allocate and thus render cstr invalid. It is therefore safer not to store the result of the call to str() at all and to use cstr only until the end of the full expression:
use_c_str( stringstream.str().c_str() );
Of course, the latter might not be easy and copying might be too expensive. What you can do instead is to bind the temporary to a const reference. This will extend its lifetime to the lifetime of the reference:
{
const std::string& tmp = stringstream.str();
const char* cstr = tmp.c_str();
}
IMO that's the best solution. Unfortunately it's not very well known.
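To watch the lifetime extension happen, here is a small sketch that uses a counting type in place of std::string. Probe, make_probe, alive_while_bound and alive_after_discard are all hypothetical names invented for this illustration.

```cpp
// Hypothetical probe type: counts how many instances are currently alive.
struct Probe {
    static int alive;
    Probe() { ++alive; }
    Probe(const Probe&) { ++alive; }
    ~Probe() { --alive; }
};
int Probe::alive = 0;

// Returns a temporary, much like stringstream::str() does.
Probe make_probe() { return Probe(); }

// Number of live Probe objects while a const reference is bound
// to the temporary returned by make_probe().
int alive_while_bound() {
    const Probe& ref = make_probe();  // lifetime extended to that of ref
    (void)ref;
    return Probe::alive;
}

// Number of live Probe objects after a temporary was simply discarded.
int alive_after_discard() {
    make_probe();         // destroyed at the end of this full expression
    return Probe::alive;
}
```

The first function sees one live object because the reference keeps the temporary alive until the end of the block; the second sees none.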
What you're doing is creating a temporary. That temporary exists in a scope determined by the compiler, such that it's long enough to satisfy the requirements of where it's going.
As soon as the statement const char* cstr2 = ss.str().c_str(); is complete, the compiler sees no reason to keep the temporary string around, and it's destroyed, and thus your const char * is pointing to freed memory.
Your statement string str(ss.str()); means that the temporary is used in the constructor for the string variable str that you've put on the local stack, and that stays around as long as you'd expect: until the end of the block, or function you've written. Therefore the const char * within is still good memory when you try the cout.
In this line:
const char* cstr2 = ss.str().c_str();
ss.str() will make a copy of the contents of the stringstream. When you call c_str() on the same line, you'll be referencing legitimate data, but after that line the string will be destroyed, leaving your char* to point to unowned memory.
The std::string object returned by ss.str() is a temporary object that will have a life time limited to the expression. So you cannot assign a pointer to a temporary object without getting trash.
Now, there is one exception: if you use a const reference to get the temporary object, it is legal to use it for a wider life time. For example you should do:
#include <string>
#include <sstream>
#include <iostream>
using namespace std;
int main()
{
stringstream ss("this is a string\n");
string str(ss.str());
const char* cstr1 = str.c_str();
const std::string& resultstr = ss.str();
const char* cstr2 = resultstr.c_str();
cout << cstr1 // Prints correctly
<< cstr2; // No more error : cstr2 points to resultstr memory that is still alive as we used the const reference to keep it for a time.
system("PAUSE");
return 0;
}
That way you get the string for a longer time.
Now, you have to know that there is a kind of optimisation called RVO: if the compiler sees an initialization via a function call and that function returns a temporary, it will not make a copy but will just make the assigned value be the temporary. That way you don't actually need to use a reference; it's only necessary if you want to be sure that no copy is made. So doing:
std::string resultstr = ss.str();
const char* cstr2 = resultstr.c_str();
would be better and simpler.
The ss.str() temporary is destroyed after initialization of cstr2 is complete. So when you print it with cout, the C string that was associated with that std::string temporary has long been destroyed; you will be lucky if it crashes or asserts, and unlucky if it prints garbage or appears to work.
const char* cstr2 = ss.str().c_str();
The C-string where cstr1 points to, however, is associated with a string that still exists at the time you do the cout - so it correctly prints the result.
In the following code, the first cstr is correct (I assume it is cstr1 in the real code?). The second prints the C string associated with the temporary string object ss.str(). The object is destroyed at the end of evaluating the full-expression in which it appears. The full-expression is the entire cout << ... expression, so while the C string is output, the associated string object still exists. For cstr2, it is pure badness that it succeeds. Most likely the implementation picks the same storage location for the new temporary that it already chose for the temporary used to initialize cstr2. It could just as well crash.
cout << cstr // Prints correctly
<< ss.str().c_str() // Prints correctly
<< cstr2; // Prints correctly (???)
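The full-expression rule is what makes it safe to use c_str() on a temporary as long as the pointer never outlives the statement. A minimal sketch, with length_via_temporary as a made-up helper name:

```cpp
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>

// Hypothetical helper: measures the stream contents through a pointer
// taken from a temporary string.
std::size_t length_via_temporary(const std::string& text) {
    std::stringstream ss(text);
    // The temporary returned by ss.str() lives until the end of this
    // full expression, which includes the strlen call, so the pointer
    // is valid for as long as it is used.
    return std::strlen(ss.str().c_str());
}
```

Storing the pointer in a variable and using it on a later line would be the dangling case from the question.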
The return of c_str() will usually just point to the internal string buffer, but that's not a requirement. The string could make up a buffer if its internal implementation is not contiguous, for example (that was permitted before C++11; from C++11 on, strings must be stored contiguously).
In older GCC versions (libstdc++ before the C++11-conforming string introduced with GCC 5), strings use reference counting and copy-on-write. Thus, you will find that the following holds true (it does, at least on my GCC version)
string a = "hello";
string b(a);
assert(a.c_str() == b.c_str());
The two strings share the same buffer here. At the time you change one of them, the buffer will be copied and each will hold its separate copy. Other string implementations do things differently, though.
I'm pretty new to C++ and I need to create a MyString class with a method that creates a new MyString object from a substring of another. However, the chosen substring changes between the time the object is created and the time I print it with my method.
Here is my code:
#include <iostream>
#include <cstring>
using namespace std;
class MyString {
public:
char* str;
MyString(char* str2create){
str = str2create;
}
MyString Substr(int index2start, int length) {
char substr[length];
int i = 0;
while(i < length) {
substr[i] = str[index2start + i];
i++;
}
cout<<substr<<endl; // prints normal string
return MyString(substr);
}
void Print() {
cout<<str<<endl;
}
};
int main() {
char str[] = {"hi, I'm a string"};
MyString myStr = MyString(str);
myStr.Print();
MyString myStr1 = myStr.Substr(10, 7);
cout<<myStr1.str<<endl;
cout<<"here is the substring I've done:"<<endl;
myStr1.Print();
return 0;
}
And here is the output:
hi, I'm a string
string
stri
here is the substring I've done:
♦
Have to walk this through to explain what's going wrong properly so bear with me.
int main() {
char str[] = {"hi, I'm a string"};
Allocated a temporary array of 17 characters (16 letters plus the terminating null), placed the characters "hi, I'm a string" in it, and ended it off with a null. Temporary means what it sounds like. When the function ends, str is gone. Anything pointing at str is now pointing at garbage. It may shamble on for a while and give some semblance of life before it is reused and overwritten, but really it's a zombie and can only be trusted to kill your program and eat its brains.
MyString myStr = MyString(str);
Creates myStr, another temporary variable. Called the constructor with the array of characters. So let's take a look at the constructor:
MyString(char* str2create){
str = str2create;
}
Takes a pointer to a character; in this case it will be a pointer to the first element of main's str. This pointer will be assigned to MyString's str. There is no copying of the "hi, I'm a string". Both main's str and MyString's str point to the same place in memory. This is a dangerous condition because alterations to one will affect the other. If one str goes away, so goes the other. If one str is overwritten, so too is the other.
What the constructor should do is:
MyString(char* str2create){
size_t len = strlen(str2create); //
str = new char[len+1]; // create appropriately sized buffer to hold string
// +1 to hold the null
strcpy(str, str2create); // copy source string to MyString
}
A few caveats: This is really, really easy to break. Pass in a str2create that never ends, for example, and the strlen will go spinning off into unassigned memory and the results will be unpredictable.
For now we'll assume no one is being particularly malicious and will only enter good values, but this has been shown to be a really bad assumption in the real world.
This also forces a requirement for a destructor to release the memory used by str
virtual ~MyString(){
delete[] str;
}
It also adds a requirement for copy and move constructors and copy and move assignment operators to avoid violating the Rule of Three/Five.
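As a rough sketch of what satisfying the Rule of Three/Five could look like for an owning string, here is one possible layout. The class is called OwnedString to make clear it is a hypothetical stand-in, not the OP's MyString, and the copy-and-swap assignment is just one of several reasonable choices.

```cpp
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical owning string: every instance has its own heap buffer.
class OwnedString {
public:
    explicit OwnedString(const char* src) : str_(duplicate(src)) {}

    // Copy constructor: deep copy, so the two objects never share a buffer.
    OwnedString(const OwnedString& other) : str_(duplicate(other.str_)) {}

    // Move constructor: steal the buffer and leave the source empty.
    OwnedString(OwnedString&& other) noexcept : str_(other.str_) {
        other.str_ = nullptr;
    }

    // Copy-and-swap: taking the argument by value covers both copy and
    // move assignment; the old buffer is released when other is destroyed.
    OwnedString& operator=(OwnedString other) noexcept {
        std::swap(str_, other.str_);
        return *this;
    }

    ~OwnedString() { delete[] str_; }  // delete[] on nullptr is a no-op

    const char* c_str() const { return str_ ? str_ : ""; }

private:
    // Allocates a new buffer and copies src into it, including the null.
    static char* duplicate(const char* src) {
        if (!src) src = "";
        const std::size_t len = std::strlen(src);
        char* buf = new char[len + 1];
        std::memcpy(buf, src, len + 1);
        return buf;
    }

    char* str_;
};
```

With this in place, copying an OwnedString produces an independent buffer, so destroying or reassigning one object cannot invalidate another.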
Back to OP's Code...
str and myStr point at the same place in memory, but this isn't bad yet. Because this program is a trivial one, it never becomes a problem. myStr and str both expire at the same point and neither is modified again.
myStr.Print();
Will print correctly because nothing has changed in str or myStr.
MyString myStr1 = myStr.Substr(10, 7);
Requires us to look at MyString::Substr to see what happens.
MyString Substr(int index2start, int length) {
char substr[length];
Creates a temporary character array of size length. First off, this is non-standard C++. It won't compile under a lot of compilers, so just don't do this in the first place. Second, it's temporary. When the function ends, this value is garbage. Don't take any pointers to substr because it won't be around long enough to use them. Third, no space was reserved for the terminating null. This string will be a buffer overrun waiting to happen.
int i = 0;
while(i < length) {
substr[i] = str[index2start + i];
i++;
}
OK that's pretty good. Copy from source to destination. What it is missing is the null termination so users of the char array know when it ends.
cout<<substr<<endl; // prints normal string
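And that buffer overrun waiting to happen? It is dodged here only by luck. Substr(10, 7) copies seven characters starting at index 10 of a 16-character string, so the seventh character copied is the source's own terminating null and the output looks right. Ask for a length that stops short of that null and cout will run off the end of the array instead of crashing and letting you know.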
return MyString(substr);
And this created a new MyString that points to substr. Right before substr hit the end of the function and died. This new MyString points to garbage almost instantly.
}
What Substr should do:
MyString Substr(int index2start, int length)
{
std::unique_ptr<char[]> substr(new char[length + 1]); // needs #include <memory>
// unique_ptr is probably paranoid overkill, but if something does go
// wrong, the array's destruction is virtually guaranteed
int i = 0;
while (i < length)
{
substr[i] = str[index2start + i];
i++;
}
substr[length] = '\0';// null terminate
cout<<substr.get()<<endl; // get() gets the array out of the unique_ptr
return MyString(substr.get()); // relies on the copying constructor shown
// earlier; google "copy elision" for more
// information on this line.
}
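The corrected Substr still trusts index2start and length. One way to guard against out-of-range requests is to clamp them to the source length first. The sketch below is a hypothetical free function, copy_range, rather than a member of MyString, and clamping is only one possible policy (throwing would be another).

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical helper: returns an owning, null-terminated copy of at most
// `length` characters of `src`, starting at `start`. Both values are
// clamped so the copy never reads past the end of `src`.
std::unique_ptr<char[]> copy_range(const char* src, std::size_t start,
                                   std::size_t length) {
    const std::size_t src_len = std::strlen(src);
    if (start > src_len) start = src_len;                     // clamp start
    if (length > src_len - start) length = src_len - start;   // clamp length

    std::unique_ptr<char[]> out(new char[length + 1]);  // +1 for the null
    std::memcpy(out.get(), src + start, length);
    out[length] = '\0';                                 // null terminate
    return out;
}
```

Called with the OP's arguments, copy_range("hi, I'm a string", 10, 7) clamps the length to 6 and yields "string" with its own terminator.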
Back in OP's code, once Substr returns to the main function, the memory that used to be substr starts to be reused and overwritten.
cout<<myStr1.str<<endl;
Prints myStr1.str and already we can see some of it has been reused and destroyed.
cout<<"here is the substring I've done:"<<endl;
myStr1.Print();
More death, more destruction, less string.
Things to not do in the future:
Sharing pointers where data should have been copied.
Pointers to temporary data.
Not null terminating strings.
Your function Substr returns the address of a local variable substr indirectly by storing a pointer to it in the return value MyString object. It's invalid to dereference a pointer to a local variable once it has gone out of scope.
I suggest you decide whether your class wraps an external string, or owns its own string data, in which case you will need to copy the input string to a member buffer.
I am confused by const pointers in C++ and wrote a small application to see what the output would be. I am attempting (I believe) to add a pointer to a string, which should not work correctly, but when I run the program I correctly get "helloworld". Can anyone help me figure out how this line (s += s2) is working?
My code:
#include <iostream>
#include <stdio.h>
#include <string>
using namespace std;
const char* append(const char* s1, const char* s2){
std::string s(s1); //this will copy the characters in s1
s += s2; //add s and s2, store the result in s (shouldn't work?)
return s.c_str(); //return result to be printed
}
int main() {
const char* total = append("hello", "world");
printf("%s", total);
return 0;
}
The variable s is local inside the append function. Once the append function returns that variable is destructed, leaving you with a pointer to a string that no longer exists. Using this pointer leads to undefined behavior.
My tip to you on how to solve this: Use std::string all the way!
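A minimal sketch of what "std::string all the way" could look like here. The name append_strings is chosen only to keep it apart from the original append; the function returns the string by value, so the caller owns the result and nothing dangles.

```cpp
#include <string>

// Hypothetical replacement for append(): the result is returned by value,
// so its buffer lives as long as the caller keeps the returned string.
std::string append_strings(const std::string& s1, const std::string& s2) {
    std::string s(s1);  // copy the characters of s1
    s += s2;            // append s2, growing the buffer if needed
    return s;
}
```

In main this would be used as std::string total = append_strings("hello", "world"); followed by printf("%s", total.c_str());, with total staying alive for as long as the pointer is needed.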
You're appending a const char* pointer to a std::string, and that is possible. The same operation would not work between two raw char* pointers (C-style strings).
However, you're returning a pointer into a local variable, so once the function append returns and its frame is popped off the stack, the string that your returned pointer points to no longer exists. This leads to undefined behavior.
Class std::string has overloaded operator += for an operand of type const char *
basic_string& operator+=(const charT* s);
In fact it simply appends the string pointed to by this pointer to the contents of the object of type std::string, allocating additional memory if required. For example, internally the overloaded operator could use the standard C function strcat.
Conceptually it is similar to the following code snippet.
char s[12] = "Hello ";
const char *s2 = "World";
std::strcat( s, s2 );
Take into account that your program has undefined behaviour because total becomes invalid when the local object s is destroyed on exit from the function append. So the next statement in main
printf("%s", total);
can result in undefined behaviour.
I must have missed an obvious fact here -- haven't been programming C++ for a while. Why can't I print the c-style string after assigning it to a const char* variable? But if I try to print it directly without assigning it works fine:
#include "boost/lexical_cast.hpp"
using namespace std;
using boost::lexical_cast;
int main (int argc, char** argv)
{
int aa=500;
cout << lexical_cast<string>(aa).c_str() << endl; // prints the string "500" fine
const char* bb = lexical_cast<string>(aa).c_str();
cout << bb << endl; // prints nothing
return EXIT_SUCCESS;
}
The C String returned by c_str is only usable while the std::string from which it was obtained exists. Once that std::string is destroyed, the C String is gone too. (At that point, attempting to use the C String yields undefined behavior.)
Other operations may also invalidate the C String. In general, any operation that modifies the string will invalidate the pointer returned by c_str.
The c_str function is called on the temporary string created by lexical_cast. Since you don't save that string, it is destroyed at the end of that expression, and accessing the pointer obtained from c_str on the destroyed string is undefined behaviour.
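A small sketch of both rules from the answers above: keep the std::string in a named variable, and fetch c_str() again after any modification. std::to_string (C++11) is used here as a stand-in for boost::lexical_cast so the example has no external dependency; int_to_text and labelled_length are hypothetical names.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: converts an int to text and returns it by value.
std::string int_to_text(int value) {
    return std::to_string(value);
}

// Hypothetical helper: builds "<value><suffix>" and measures it through
// a C-string pointer that is known to be valid.
std::size_t labelled_length(int value, const char* suffix) {
    std::string text = int_to_text(value);  // named object owns the buffer
    const char* p = text.c_str();           // valid while text is unmodified
    text += suffix;                         // may reallocate: old p is invalid
    p = text.c_str();                       // fetch the pointer again
    return std::strlen(p);
}
```

In the question's program the equivalent change would be to store the lexical_cast result in a std::string variable and take bb from that variable.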