GetTypeName returns a std::string. The following code
printf("%#x\n", proto->GetTypeName().c_str());
printf("%s\n", proto->GetTypeName().c_str());
const char *res = proto->GetTypeName().c_str();
printf("%#x\n",res);
printf("%s\n",res);
produces this output:
0x90ef78
ValidTypeName
0x90ef78
ю■ю■ю■ю■ю■ю■ю■ю■ю■ю■ю■ю■ю■ю■ю■ю■ю■ю■ю■ю■ю■ю■ю■ю■ю■ю■ю■ю■←ЬЬQщZ
The addresses are always the same. The following code (with the lines exchanged)
const char *res = proto->GetTypeName().c_str();
printf("%#x\n",res);
printf("%s\n",res);
printf("%#x\n", proto->GetTypeName().c_str());
printf("%s\n", proto->GetTypeName().c_str());
produces this output, addresses are always different:
0x57ef78
Y
0x580850
ValidTypeName
What am I doing wrong?
strlen(res)
returns invalid size, so I can't even strcpy.
Your GetTypeName function returns a std::string, and you are calling c_str to get a pointer to the internal data of that string.
Because the returned std::string is a temporary, it is destroyed at the end of the statement
const char *res = proto->GetTypeName().c_str();
But res still points to the data of the now destroyed string.
Edit: Change your code to something like :-
const std::string& res = proto->GetTypeName();
and call .c_str() on that string in the printf like this :-
printf("%#x\n",res.c_str());
printf("%s\n",res.c_str());
Assigning a temporary to a reference extends the lifetime of that temporary to be the same as the lifetime of the reference...
Better still, just use std::string and iostream for printing and stop messing about with low level pointers when unnecessary :)
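As a minimal sketch of that advice: the snippet below keeps the returned string in a named variable, so the pointer from c_str() stays valid for as long as that variable lives. The Proto struct and the safe_length helper are hypothetical stand-ins for illustration only; the asker's real class is not shown in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the asker's class; only GetTypeName() matters here.
struct Proto {
    std::string GetTypeName() const { return "ValidTypeName"; }
};

// Measures the name through a C pointer while the owning string is alive.
std::size_t safe_length(const Proto& proto) {
    const std::string name = proto.GetTypeName(); // named object owns the buffer
    const char* res = name.c_str();               // valid as long as `name` lives
    assert(std::strlen(res) == name.size());      // holds when there is no embedded '\0'
    return std::strlen(res);
}
```

With this arrangement strlen and strcpy see a valid, null-terminated buffer, because name outlives every use of res.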
My question can be boiled down to, where does the string returned from stringstream.str().c_str() live in memory, and why can't it be assigned to a const char*?
This code example will explain it better than I can
#include <string>
#include <sstream>
#include <iostream>
using namespace std;
int main()
{
stringstream ss("this is a string\n");
string str(ss.str());
const char* cstr1 = str.c_str();
const char* cstr2 = ss.str().c_str();
cout << cstr1 // Prints correctly
<< cstr2; // ERROR, prints out garbage
system("PAUSE");
return 0;
}
The assumption that stringstream.str().c_str() could be assigned to a const char* led to a bug that took me a while to track down.
For bonus points, can anyone explain why replacing the cout statement with
cout << cstr // Prints correctly
<< ss.str().c_str() // Prints correctly
<< cstr2; // Prints correctly (???)
prints the strings correctly?
I'm compiling in Visual Studio 2008.
stringstream.str() returns a temporary string object that's destroyed at the end of the full expression. If you get a pointer to a C string from that (stringstream.str().c_str()), it will point to a string which is deleted where the statement ends. That's why your code prints garbage.
You could copy that temporary string object to some other string object and take the C string from that one:
const std::string tmp = stringstream.str();
const char* cstr = tmp.c_str();
Note that I made the tmp string const, because any change to it might cause it to reallocate and thus render cstr invalid. It is therefore safer not to store the result of the call to str() at all and to use cstr only until the end of the full expression:
use_c_str( stringstream.str().c_str() );
Of course, the latter might not be easy and copying might be too expensive. What you can do instead is to bind the temporary to a const reference. This will extend its lifetime to the lifetime of the reference:
{
const std::string& tmp = stringstream.str();
const char* cstr = tmp.c_str();
}
IMO that's the best solution. Unfortunately it's not very well known.
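The lifetime extension can be observed directly. The following is only a sketch: Tracer and make_tracer are hypothetical names invented for this illustration, with Tracer counting how many of its instances are currently alive and make_tracer playing the role of stringstream::str().

```cpp
#include <cassert>

// Hypothetical helper type that counts how many instances are currently alive.
struct Tracer {
    static int alive;
    Tracer() { ++alive; }
    Tracer(const Tracer&) { ++alive; }
    ~Tracer() { --alive; }
};
int Tracer::alive = 0;

// Returns a temporary, like stringstream::str() does.
Tracer make_tracer() { return Tracer(); }

// The temporary is bound to a const reference, so it is still alive
// when the return value is computed.
int alive_while_bound() {
    assert(Tracer::alive == 0);
    const Tracer& ref = make_tracer();
    (void)ref; // silence unused-variable warnings
    return Tracer::alive;
}

// The temporary is discarded; it is destroyed at the end of the full-expression.
int alive_after_statement() {
    assert(Tracer::alive == 0);
    make_tracer();
    return Tracer::alive;
}
```

alive_while_bound() reports one live object and alive_after_statement() reports none, which matches the rule described above: the reference keeps the temporary alive, a plain statement does not.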
What you're doing is creating a temporary. That temporary exists in a scope determined by the compiler, such that it's long enough to satisfy the requirements of where it's going.
As soon as the statement const char* cstr2 = ss.str().c_str(); is complete, the compiler sees no reason to keep the temporary string around, so it is destroyed, and your const char * is left pointing to freed memory.
Your statement string str(ss.str()); means that the temporary is used in the constructor for the string variable str that you've put on the local stack, and that stays around as long as you'd expect: until the end of the block, or function you've written. Therefore the const char * within is still good memory when you try the cout.
In this line:
const char* cstr2 = ss.str().c_str();
ss.str() will make a copy of the contents of the stringstream. When you call c_str() on the same line, you'll be referencing legitimate data, but after that line the string will be destroyed, leaving your char* to point to unowned memory.
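To illustrate the "legitimate data on the same line" part, here is a small sketch in which the pointer is consumed inside the same full-expression that creates the temporary. The function name length_in_one_expression is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>

// The temporary returned by ss.str() lives until the end of the full-expression,
// so strlen() reads valid memory here. Nothing keeps the pointer afterwards.
std::size_t length_in_one_expression(const std::stringstream& ss) {
    const std::size_t n = std::strlen(ss.str().c_str());
    assert(n <= ss.str().size()); // strlen stops at the first '\0', if any
    return n;
}
```

The important design point is that no const char* variable outlives the statement; only the computed length does.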
The std::string object returned by ss.str() is a temporary whose lifetime is limited to the expression. So you cannot keep a pointer into a temporary object without getting trash.
Now, there is one exception: if you bind the temporary to a const reference, its lifetime is extended to that of the reference. For example you should do:
#include <string>
#include <sstream>
#include <iostream>
using namespace std;
int main()
{
stringstream ss("this is a string\n");
string str(ss.str());
const char* cstr1 = str.c_str();
const std::string& resultstr = ss.str();
const char* cstr2 = resultstr.c_str();
cout << cstr1 // Prints correctly
<< cstr2; // No more error : cstr2 points to resultstr memory that is still alive as we used the const reference to keep it for a time.
system("PAUSE");
return 0;
}
That way you get the string for a longer time.
Now, you should know that there is an optimisation called RVO: if the compiler sees an initialization from a function call and that function returns a temporary, it does not make a copy but constructs the result directly in the variable being initialized. That way you don't actually need a reference; one is only necessary if you want to be sure that no copy is made. So doing:
std::string resultstr = ss.str();
const char* cstr2 = resultstr.c_str();
would be better and simpler.
The ss.str() temporary is destroyed after the initialization of cstr2 is complete. So when you print it with cout, the C string that was associated with that std::string temporary has long been destroyed. You will be lucky if it crashes or asserts, and unlucky if it prints garbage or appears to work.
const char* cstr2 = ss.str().c_str();
The C-string where cstr1 points to, however, is associated with a string that still exists at the time you do the cout - so it correctly prints the result.
In the following code, the first cstr is correct (I assume it is cstr1 in the real code?). The second prints the C string associated with the temporary string object ss.str(). That object is destroyed at the end of evaluating the full-expression in which it appears. The full-expression is the entire cout << ... expression, so while the C string is output, the associated string object still exists. For cstr2, it is pure luck that it succeeds. Most likely the implementation chooses the same storage location for the new temporary that it already chose for the temporary used to initialize cstr2. It could just as well crash.
cout << cstr // Prints correctly
<< ss.str().c_str() // Prints correctly
<< cstr2; // Prints correctly (???)
The pointer returned by c_str() will usually just point to the internal string buffer, but that's not a requirement. The string could build a separate buffer if its internal representation is not contiguous, for example (that's quite possible, although from C++11 onward strings are required to be stored contiguously).
In older GCC versions (libstdc++ before the C++11 ABI introduced with GCC 5), strings use reference counting and copy-on-write. Thus, you will find that the following holds true (it does, at least on my GCC version)
string a = "hello";
string b(a);
assert(a.c_str() == b.c_str());
The two strings share the same buffer here. As soon as you change one of them, the buffer is copied and each string holds its own separate copy. Other string implementations do things differently, though.
I saw some strange behavior the other day.
I wanted to store lines (held in a vector) in a char array, using '\n' as the delimiter.
I know the c_str() method of the string class returns a pointer to a char array ending in '\0'.
Based on my experience and understanding of C++ (see the greet0 and greet2 functions),
I assumed it should work, but it didn't.
Can anyone explain the different behavior of the three greet functions? What is the scope of the object mentioned in each greet function?
(I also guessed that the string object was destroyed in the greet1 function, but if that were the case there should be a segmentation fault in cout<<"greet1:"<<w1<<endl;. That does not happen, so what exactly is happening in the background?)
//The snippet where I first encountered the issue.
const char* concatinated_str(std::vector<std::string> lines, const char *delimiter)
{
std::stringstream buf;
std::copy(lines.begin(), lines.end(), std::ostream_iterator<std::string>(buf, delimiter));
string w = buf.str();
const char *ret = w.c_str();
return ret;
}
//Implementation 0
string greet0(){
string msg = "hello";
return msg;
}
//Implementation 1
const char* greet1(){
string msg = "hello";
cout<<&msg<<endl;
return msg.c_str();
}
//Implementation 2
const char* greet2(){
const char* msg = "hello";
return msg;
}
int main(){
auto w0 = greet0();
cout<<&w0<<endl;
cout<<"greet0:"<<w0<<endl;
auto w1 = greet1();
cout<<"greet1:"<<w1<<endl;
const char* w2 = greet2();
cout<<"greet2:"<<w2<<endl;
}
Output:
0x7fff0ff3e8e0
0x7fff0ff3e8e0
greet0:hello
greet1:
greet2:hello
Returning a std::string, or a pointer to a string literal, by value is perfectly fine.
Using the return value of greet1(), though, has undefined behavior, because the std::string whose elements you try to print died at the end of its enclosing function, leaving the returned pointer dangling.
What happens when you dereference a dangling pointer is not defined. Behaving as if you had a pointer to an empty string, because the storage was re-used, is one of the more benign possibilities.
As an aside, the address of a std::string is rarely that interesting to someone executing your program, though printing it is perfectly fine.
In the statements cout<<&w0<<endl; and cout<<&msg<<endl; you're outputting a pointer to a std::string. Remove the & to print the string itself rather than its address. If you're mystified by getting the same result for two different objects, that may be because they are addresses of local variables. The memory can be reused: those objects have limited lifetimes and do not necessarily have unique locations.
In greet0, msg is technically a local variable that stops existing on exit from the function, but the compiler may optimize the returned value: instead of copying msg to the outside, the generated code constructs the object directly at the destination w0. Since C++17 this elision is guaranteed when a function returns a temporary (prvalue); for a named local variable such as msg it remains an optional optimization (NRVO), and without it the string is moved into w0.
In function
const char* greet1(){
string msg = "hello";
cout<<&msg<<endl;
return msg.c_str();
}
msg here is a function-local variable, so it denotes an object that stops existing at the end of its enclosing scope, i.e. when the function returns. After the return statement the pointer obtained from c_str() is dangling, because that method returns a pointer to the internal storage of the std::string. The storage of msg has been destroyed, and you invoke undefined behaviour by accessing it. A segmentation fault (which is a Linux notion, by the way; the mechanics on Windows are different) is a possible outcome but not a necessary one.
In third function
const char* greet2(){
const char* msg = "hello";
return msg;
}
msg points to an array containing the constant string "hello". Strings created by string literals have the same lifetime as a global static object. They are formed during compilation. Exiting the function doesn't invalidate the pointer; you can still dereference it because the string still exists.
The only code that invokes undefined behavior is related to this function
//Implementation 1
const char* greet1(){
string msg = "hello";
cout<<&msg<<endl;
return msg.c_str();
}
The local object msg of the type std::string will not be alive after exiting the function. It will be destroyed. So the function returns an invalid pointer.
In this function implementation
//Implementation 2
const char* greet2(){
const char* msg = "hello";
return msg;
}
the function returns a pointer to the first character of the string literal "hello", which has static storage duration. This means the string literal is still alive after the function exits. Thus the function returns a valid pointer.
This function
//Implementation 0
string greet0(){
string msg = "hello";
return msg;
}
returns a temporary object of type std::string that is moved (possibly with the move elided) into the variable w0 in main
auto w0 = greet0();
So this function is correct.
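Applied to the snippet from the question, the same reasoning suggests returning the std::string itself instead of a pointer into a local. The following is a sketch under that assumption; join_lines is a hypothetical replacement name for the asker's concatinated_str.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Joins the lines, writing the delimiter after every element, and returns
// the result by value so the caller owns the storage.
std::string join_lines(const std::vector<std::string>& lines, const char* delimiter) {
    assert(delimiter != nullptr);
    std::ostringstream buf;
    std::copy(lines.begin(), lines.end(),
              std::ostream_iterator<std::string>(buf, delimiter));
    return buf.str();
}
```

The caller can store the result in a std::string and call c_str() on that variable whenever a const char* is needed; the pointer then stays valid for as long as the caller's string is alive and unmodified.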
Quite new to C / C++. I have a question about the code below:
char* string2char(String command){
if (command.length() != 0) {
char *p = const_cast<char*>(command.c_str());
return p;
}
}
void setup() {}
void loop() {
String string1 = "Bob";
char *string1Char = string2char(string1);
String string2 = "Ross";
char *string2Char = string2char(string2);
Serial.println(string1Char);
Serial.println(string2Char);
}
This basically outputs repeatedly:
Ross
Ross
I understand I'm failing to grasp the concept of how pointers are working here - would someone be able to explain it? And how would I alter this so that it could show:
Bob
Ross
This function :
char* string2char(String command){
if (command.length() != 0) {
char *p = const_cast<char*>(command.c_str());
return p;
}
}
Does not make much sense: it takes the string by value and returns a pointer to its internal buffer, with the constness cast away (don't do that). You are getting odd behaviour because you are returning a pointer into an object that has already been destroyed; pass it by reference instead. Also, I'm curious why you need all of this. Can't you just pass:
Serial.println(string1.c_str());
Serial.println(string2.c_str());
As noted by Mark Ransom in the comments, when you pass the string by value, the string command is a local copy of the original string. Therefore you can't return a pointer to its c_str(), because that one points at the local copy, which will go out of scope when the function is done. So you get the same bug as described here: How to access a local variable from a different function using pointers?
A possible solution is to rewrite the function like this:
const char* string2char(const String& command){
return command.c_str();
}
Now the string is passed by reference, so that c_str() refers to the same string object as the one in the caller (string1). I also took the liberty of fixing const-correctness at the same time.
Please note that you cannot modify the string by the pointer returned by c_str()! So it is very important to keep this const.
The problem here is that you've passed String command to the function by value, which makes a copy of whatever String you passed to the function. So, when you call const_cast<char*>(command.c_str()); you're making a pointer to the c string of that copied String. Since the String you've cast is within the scope of the function, the memory is freed when the function returns and the pointer is essentially invalid. What you want to do is change the argument to String & command which will pass a reference to the string, whose memory won't be freed when the function returns.
Your issue revolves around your argument.
char* string2char(String command){
// create a new string that's a copy of the thing you pass in, and call it command
if (command.length() != 0) {
char *p = const_cast<char*>(command.c_str());
// get the const char* that this string contains.
// It's valid only while the string command does; and is invalidated on changing the string.
return p; /// and destroy command - making p invalid
}
}
There are two ways to resolve this. The first, and more complex, is to pass command by reference: make the parameter const String& command and work with that.
The alternative, which is much simpler, is to delete your function completely, change your char* to const char*, and just call c_str() on the string, i.e.
String string1 = "Bob";
const char *string1Char = string1.c_str();
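The difference between the by-value and by-reference versions can be sketched in standard C++. Note the assumption: std::string is used here as a stand-in for the Arduino String class, and to_c_str is a made-up name; the lifetime reasoning is the same, but this is not Arduino code.

```cpp
#include <cassert>
#include <string>

// Takes the string by const reference, so the returned pointer refers to the
// caller's object and stays valid while that object is alive and unmodified.
const char* to_c_str(const std::string& s) {
    const char* p = s.c_str();
    assert(p[s.size()] == '\0'); // c_str() is always null-terminated
    return p;
}
```

Because no copy is made, two different caller strings yield two different pointers, each tied to its own string, which is what the "Bob" and "Ross" example needs.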
I am confused by const pointers in C++ and wrote a small application to see what the output would be. I am attempting (I believe) to add a pointer to a string, which should not work correctly, but when I run the program I correctly get "hello world". Can anyone help me figure out how this line (s += s2) is working?
My code:
#include <iostream>
#include <stdio.h>
#include <string>
using namespace std;
const char* append(const char* s1, const char* s2){
std::string s(s1); //this will copy the characters in s1
s += s2; //add s and s2, store the result in s (shouldn't work?)
return s.c_str(); //return result to be printed
}
int main() {
const char* total = append("hello", "world");
printf("%s", total);
return 0;
}
The variable s is local inside the append function. Once the append function returns that variable is destructed, leaving you with a pointer to a string that no longer exists. Using this pointer leads to undefined behavior.
My tip to you on how to solve this: Use std::string all the way!
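A minimal sketch of what "std::string all the way" could look like for the function in the question, assuming the caller is free to accept a std::string instead of a const char*:

```cpp
#include <cassert>
#include <string>

// Returns the concatenation by value; the caller owns the resulting string.
std::string append(const char* s1, const char* s2) {
    assert(s1 != nullptr && s2 != nullptr);
    std::string s(s1); // copies the characters of s1
    s += s2;           // appends the characters of s2
    return s;
}
```

In main, the result would be stored in a std::string and printed with std::cout, or with printf("%s", total.c_str()) while that string is still in scope.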
You're adding a const char* pointer to a std::string, and that is possible (see this reference). It wouldn't be possible to perform that operation on the char* type (a C-style string).
However, you're returning a pointer into a local variable, so once the function append returns and its frame is popped off the stack, the string that your returned pointer points to no longer exists. This leads to undefined behavior.
Class std::string has overloaded operator += for an operand of type const char *
basic_string& operator+=(const charT* s);
In fact it simply appends the string pointed to by this pointer to the contents of the std::string object, allocating additional memory if required. For example, internally the overloaded operator could use the standard C function strcat.
Conceptually it is similar to the following code snippet.
char s[12] = "Hello ";
const char *s2 = "World";
std::strcat( s, s2 );
Take into account that your program has undefined behaviour, because total becomes invalid once the local object s is destroyed on exit from the function append. So the next statement in main
printf("%s", total);
can result in undefined behaviour.
I have a situation in which I'm performing binary serialization of some items and writing them to an opaque byte buffer:
int SerializeToBuffer(unsigned char* buffer)
{
stringstream ss;
vector<Serializable> items = GetSerializables();
string serializedItem("");
short len = 0;
for(int i = 0; i < items.size(); ++i)
{
serializedItem = items[i].Serialize();
len = serializedItem.length();
// Write the bytes to the stream
ss.write(*(char*)&(len), 2);
ss.write(serializedItem.c_str(), len);
}
buffer = reinterpret_cast<unsigned char*>(
const_cast<char*>(ss.str().c_str()));
return items.size();
}
Is it safe to remove the const-ness from the ss.str().c_str() and then reinterpret_cast the result as unsigned char* then assign it to the buffer?
Note: the code is just to give you an idea of what I'm doing, it doesn't necessarily compile.
No. Removing the const-ness of an inherently constant string and then modifying it will result in undefined behavior.
const char* c_str ( ) const;
Get C string equivalent
Generates a null-terminated sequence of characters (c-string) with the same content as the string object and returns it as a pointer to an array of characters.
A terminating null character is automatically appended.
The returned array points to an internal location with the required storage space for this sequence of characters plus its terminating null-character, but the values in this array should not be modified in the program and are only guaranteed to remain unchanged until the next call to a non-constant member function of the string object.
Short answer: No
Long answer: No. You really can't do that. The internal buffers of those objects belong to the objects. Taking a reference to an internal structure is definitely a no-no and breaks encapsulation. In any case, those objects (with their internal buffers) will be destroyed at the end of the function, and your buffer variable will point at freed memory.
Use of const_cast<> is usually a sign that something in your design is wrong.
Use of reinterpret_cast<> usually means you are doing it wrong (or you are doing some very low level stuff).
You probably want to write something like this:
std::ostream& operator<<(std::ostream& stream, Data const& serializable)
{
return stream << serializable.internalData;
// Or, if you want to write binary data to the file instead:
// stream.write(reinterpret_cast<char const*>(&serializable.internalData), sizeof(serializable.internalData));
// return stream;
}
This is unsafe, partially because you're stripping off const, but more importantly because you're returning a pointer to an array that will be reclaimed when the function returns.
When you write
ss.str().c_str()
The return value of c_str() is only valid as long as the string object you invoked it on still exists. The signature of stringstream::str() is
string stringstream::str() const;
Which means that it returns a temporary string object. Consequently, as soon as the line
ss.str().c_str()
finishes executing, the temporary string object is reclaimed. This means that the outstanding pointer you received via c_str() is no longer valid, and any use of it leads to undefined behavior.
To fix this, if you really must return an unsigned char*, you'll need to manually copy the C-style string into its own buffer:
/* Get a copy of the string that won't be automatically destroyed at the end of a statement. */
string value = ss.str();
/* Extract the C-style string. */
const char* cStr = value.c_str();
/* Allocate a buffer and copy the contents of cStr into it. */
unsigned char* result = new unsigned char[value.length() + 1];
copy(cStr, cStr + value.length() + 1, result);
/* Hand back the result. */
return result;
Additionally, as @Als has pointed out, stripping off const is a Bad Idea if you're planning on modifying the contents. If you aren't modifying the contents, it should be fine, but then you ought to be returning a const unsigned char* instead of an unsigned char*.
Hope this helps!
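If the signature is allowed to change, a container that owns its bytes avoids the manual new[] and the matching delete[] the caller would otherwise have to remember. This is only a sketch of that alternative; to_bytes is a hypothetical helper, not something from the question.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Copies the stream contents into a vector that owns its storage.
std::vector<unsigned char> to_bytes(const std::ostringstream& ss) {
    const std::string s = ss.str(); // named copy, alive for the whole function
    std::vector<unsigned char> bytes(s.begin(), s.end());
    assert(bytes.size() == s.size());
    return bytes;
}
```

The vector carries its own length, so embedded zero bytes in binary data are preserved and no terminating null is needed.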
Since it appears that your primary consumer of this function is a C# application, making the signature more C#-friendly is a good start. Here's what I'd do if I were really crunched for time and didn't have time to do things "The Right Way" ;-]
using System::Runtime::InteropServices::OutAttribute;
void SerializeToBuffer([Out] array<unsigned char>^% buffer)
{
using System::Runtime::InteropServices::Marshal;
vector<Serializable> const& items = GetSerializables();
// or, if Serializable::Serialize() is non-const (which it shouldn't be)
//vector<Serializable> items = GetSerializables();
ostringstream ss(ios_base::binary);
for (size_t i = 0u; i != items.size(); ++i)
{
string const& serializedItem = items[i].Serialize();
unsigned short const len =
static_cast<unsigned short>(serializedItem.size());
ss.write(reinterpret_cast<char const*>(&len), sizeof(unsigned short));
ss.write(serializedItem.data(), len);
}
string const& s = ss.str();
buffer = gcnew array<unsigned char>(static_cast<int>(s.size()));
Marshal::Copy(
IntPtr(const_cast<char*>(s.data())),
buffer,
0,
buffer->Length
);
}
To C# code, this will have the signature:
void SerializeToBuffer(out byte[] buffer);
Here is the underlying problem:
buffer = ... ;
return items.size();
In the second-to-last line you're assigning a new value to the local variable that (up until that point) held the pointer your function was given as an argument. Then, immediately after, you return from the function, forgetting everything about the variable you just assigned to. That does not make sense!
What you probably want to do is to copy data from the memory pointed to by ss.str().c_str() to the memory pointed to by the pointer stored in buffer. Something like
memcpy(buffer, ss.str().c_str(), <an appropriate length here>)
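A sketch of that copy-into-the-caller's-buffer approach follows. Several things here are assumptions rather than part of the question: the items are plain std::string payloads instead of the asker's Serializable type, the function takes an explicit capacity parameter, the 2-byte length prefix is written low byte first, and the return value is the number of bytes written (0 if the data does not fit).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Writes each item as a 2-byte little-endian length followed by its bytes,
// then copies everything into the caller-provided buffer.
std::size_t serialize_to_buffer(const std::vector<std::string>& items,
                                unsigned char* buffer, std::size_t capacity) {
    assert(buffer != nullptr || capacity == 0);
    std::string bytes;
    for (const std::string& item : items) {
        // Items longer than 65535 bytes are truncated to the prefix width.
        const unsigned short len = static_cast<unsigned short>(item.size());
        bytes.push_back(static_cast<char>(len & 0xFF));
        bytes.push_back(static_cast<char>((len >> 8) & 0xFF));
        bytes.append(item.data(), len);
    }
    if (bytes.size() > capacity) {
        return 0; // does not fit; nothing is written
    }
    if (!bytes.empty()) {
        std::memcpy(buffer, bytes.data(), bytes.size());
    }
    return bytes.size();
}
```

The data is copied while the local string is still alive, and the caller keeps ownership of buffer throughout, so no pointer into a destroyed object ever escapes the function.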