C strings vs const char* is confusing me... help please - c++

I'm a C/C++ beginner trying to build what seems like a pretty simple program: it loads a file into a c-string (const char*). However, although the program is incredibly simple, it's not working in a way I understand. Take a look:
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
std::string loadStringFromFile(const char* file)
{
std::ifstream shader_file(file, std::ifstream::in);
std::string str((std::istreambuf_iterator<char>(shader_file)), std::istreambuf_iterator<char>());
return str;
}
const char* loadCStringFromFile(const char* file)
{
std::ifstream shader_file(file, std::ifstream::in);
std::string str((std::istreambuf_iterator<char>(shader_file)), std::istreambuf_iterator<char>());
return str.c_str();
}
int main()
{
std::string hello = loadStringFromFile("hello.txt");
std::cout << "hello: " << hello.c_str() << std::endl;
const char* hello2 = loadCStringFromFile("hello.txt");
std::cout << "hello2: " << hello2 << std::endl;
hello2 = hello.c_str();
std::cout << "hello2 = hello.c_str(), hello2: " << hello2 << std::endl;
return 0;
}
The output looks like this:
hello: Heeeeyyyyyy
hello2: 青!
hello2 = hello.c_str(), hello2: Heeeeyyyyyy
The initial hello2 value changes every time, always some random kanji (I'm using a Japanese computer, so I'm guessing that's why it's kanji).
In my naive view, it seems like the two values should print identically. One function returns a c++ string, which I then convert to a c-string, and the other loads the string, converts the c-string from that and returns it. I made sure that the string was loading properly in loadCStringFromFile by couting the value before I returned it, and indeed it was what I had thought, e.g.:
/*(inside loadCStringFromFile)*/
const char* result = str.c_str();
std::cout << result << std::endl; // prints out "Heeeeyyyyyy" as expected
return result;
So why should the value change? Thanks for the help...

Your problem is that str in loadCStringFromFile is a local variable, and it is destroyed when the function returns. At that point the pointer returned by c_str() is dangling, and reading through it is undefined behavior.
Your first function, loadStringFromFile, is a more C++-like way of doing it, and illustrates the benefit of having a class manage memory for you. If you use char* then you have to take much more care where memory is allocated and freed.

The function
std::string loadStringFromFile(const char* file)
returns a copy of the string created inside the function. The copy is made before the local string goes out of scope at the end of the function, which is why it works.
const char* loadCStringFromFile(const char* file)
on the other hand, returns a pointer into the local string. That string goes out of scope and is destroyed when the function returns, so the returned const char* is left pointing at memory that is no longer valid.
For the second approach to work, you either need to create the string before calling the function:
const char* loadCStringFromFile(const char* file, std::string& str); // fills str and returns str.c_str()
...
std::string str;
const char* result = loadCStringFromFile(file, str);
or create the string on the heap inside the function and return its address, but that gets a bit messy because the caller would need to delete the string to avoid a memory leak.

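You should duplicate the output of str.c_str():
return strdup(str.c_str());
strdup is a POSIX function declared in <string.h> (usually also reachable through <cstring>). The caller must release the returned copy with free().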
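Since strdup is not part of standard C++, a portable stand-in can be written with malloc and memcpy. The sketch below is an illustration only; the name duplicateCString is hypothetical.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical portable stand-in for strdup(): returns a malloc'ed copy of
// the string's characters (including the terminating '\0'), or nullptr if
// the allocation fails. The caller must release it with std::free().
char* duplicateCString(const std::string& str)
{
    const std::size_t bytes = str.size() + 1; // +1 for the terminator
    char* copy = static_cast<char*>(std::malloc(bytes));
    if (copy != nullptr) {
        std::memcpy(copy, str.c_str(), bytes);
    }
    return copy;
}
```

Because it is easy to forget the matching std::free, returning a std::string is usually the simpler choice.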

Related

Can you safely get a pointer to a string from its c_str() const char*?

I have a const char pointer which I know for sure came from a string. For example:
std::string myString = "Hello World!";
const char* myCstring = myString.c_str();
In my case I know myCstring came from a string, but I no longer have access to that string (I received the const char* from a function call, and I cannot modify the function's argument list).
Given that I know myCstring points to contents of an existing string, is there any way to safely access the pointer of the parent string from which it originated? For example, could I do something like this?
std::string* hackyStringPointer = myCstring - 6; //Along with whatever pointer casting stuff may be needed
My concern is that perhaps the string's contents possibly cannot be guaranteed to be stored in contiguous memory on some or all platforms, etc.
Given that I know myCstring points to contents of an existing string, is there any way to safely access the pointer of the parent string from which it originated?
No, there is no way to obtain a valid std::string* pointer from a const char* pointer to character data that belongs to a std::string.
I received the const char* from a function call, and I cannot modify the function's argument list
Your only option in this situation would be if you can pass a pointer to the std::string itself as the actual const char* pointer, but that will only work if whatever is calling your function does not interpret the const char* in any way (and certainly not as a null-terminated C string), eg:
void doSomething(void (*func)(const char*), const char *data)
{
...
func(data);
...
}
void myFunc(const char *myCstring)
{
const std::string* hackyStringPointer = reinterpret_cast<const std::string*>(myCstring); // reinterpret_cast cannot drop const
...
}
...
std::string myString = "Hello World!";
doSomething(&myFunc, reinterpret_cast<char*>(&myString));
You cannot convert a const char* that you get from std::string::c_str() to a std::string*. The reason you can't do this is because c_str() returns a pointer to the string data, not the string object itself.
If you are trying to get a std::string so you can use its member functions, then what you can do is wrap myCstring in a std::string_view (available since C++17). This is a non-copying wrapper that lets you treat a c-string much like a read-only std::string. To do that you would need something like
std::string_view sv{myCstring, std::strlen(myCstring)};
// use sv here like it was a std::string
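As a rough illustration of what "use sv like a std::string" can look like, here is a small sketch that uses find and substr on the view. It assumes a C++17 compiler, and the helper name firstWord is hypothetical.

```cpp
#include <cstring>
#include <string_view>

// Hypothetical helper: returns the first space-delimited word of a C string
// as a view. No characters are copied; the view is only valid while the
// original character buffer is alive.
std::string_view firstWord(const char* text)
{
    std::string_view sv{text, std::strlen(text)};
    const std::size_t space = sv.find(' ');
    // If no space is found, find returns npos and substr yields the whole view.
    return sv.substr(0, space);
}
```

Note that a std::string_view is read-only and does not own the characters, so it does not solve lifetime problems; it only makes a valid const char* more convenient to work with.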
Yes (it seems), although I agree that if I need to do this it's likely a sign that my code needs reworking in general. Nevertheless, on my platform the std::string object starts 4 bytes before the const char* that c_str() returns, and I did recover a string* from a const char* belonging to a string.
#include <cassert>
#include <cstdint>
#include <iostream>
#include <string>
int main()
{
    std::string myString = "Hello World!";
    const char* myCstring = myString.c_str();
    std::size_t strPtrSize = sizeof(std::string*);
    std::size_t cStrPtrSize = sizeof(const char*);
    // Store the addresses in an integer type wide enough to hold a pointer
    std::intptr_t strAddress = reinterpret_cast<std::intptr_t>(&myString);
    std::intptr_t cStrAddress = reinterpret_cast<std::intptr_t>(myCstring);
    std::intptr_t addressDifference = strAddress - cStrAddress;
    // Note: this is computed from strAddress itself, so it always equals strAddress
    std::intptr_t estStrAddress = cStrAddress + addressDifference;
    std::string* hackyStringPointer = reinterpret_cast<std::string*>(estStrAddress);
    std::cout << "Size of String* " << strPtrSize << ", Size of const char*: " << cStrPtrSize << "\n";
    std::cout << "String Address: " << strAddress << ", C String Address: " << cStrAddress << "\n";
    std::cout << "Address Difference: " << addressDifference << "\n";
    std::cout << "Estimated String Address: " << estStrAddress << "\n";
    std::cout << "Hacky String: " << *hackyStringPointer << "\n";
    // If any of these asserts trigger on any platform, I may need to re-evaluate my answer
    assert(addressDifference == -4);
    assert(strPtrSize == cStrPtrSize);
    assert(hackyStringPointer == &myString);
    return 0;
}
The output of this is as follows:
Size of String* 4, Size of const char*: 4
String Address: 15725656, C String Address: 15725660
Address Difference: -4
Estimated String Address: 15725656
Hacky String: Hello World!
It seems to work so far. If someone can show that the address difference between a string and its c_str() can change over time on the same platform, or if all members of a string are not guaranteed to reside in contiguous memory, I'll change my answer to "No."
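One way to probe that last question is to check whether the character data lives inside the std::string object at all. The sketch below is an illustration under assumptions: the helper name dataIsInsideObject is hypothetical, and the standard does not specify where the data is stored. Common implementations keep short strings inside the object (small-string optimisation) and move longer ones to a separate heap allocation, in which case the offset between &str and str.c_str() is not a fixed constant.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: returns true if the character data of `s` is stored
// inside the std::string object itself, and false if it lives in a separate
// allocation.
bool dataIsInsideObject(const std::string& s)
{
    const auto objectBegin = reinterpret_cast<std::uintptr_t>(&s);
    const auto objectEnd = objectBegin + sizeof(std::string);
    const auto data = reinterpret_cast<std::uintptr_t>(s.c_str());
    return data >= objectBegin && data < objectEnd;
}
```

For a string that is much longer than sizeof(std::string), the data cannot fit inside the object, so the helper returns false and no fixed offset can lead from c_str() back to the object.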
This reference says
The pointer returned may be invalidated by further calls to other member functions that modify the object.
You say you got the const char* from a function call. This means you do not know what happens to the string in the meantime, is that right? If you know that the original string is not modified or destroyed (e.g. by going out of scope), then you can still use the const char*.
Your example code however has multiple problems. You want to do this:
std::string* hackyStringPointer = myCstring - 6;
but I think you meant
const char* hackyStringPointer = myCstring;
First, you cannot cast the const char* to a string*. Second, you do not want to go BEFORE the start of the character data. The pointer refers to the first character of the string, and you can use it to access the characters up to the trailing 0 character. You should not go before the first character or past the trailing 0 character, though, as you do not know what is in that memory or whether it even exists.
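The valid range described above can be illustrated with a small loop that walks forward from the first character to the terminator. This is only a sketch; the helper name countCharacters is hypothetical.

```cpp
#include <cstddef>

// Hypothetical helper: counts the characters of a null-terminated string by
// walking forward from the first character until the terminating '\0'.
// It never reads before the start or past the terminator.
std::size_t countCharacters(const char* text)
{
    std::size_t count = 0;
    for (const char* p = text; *p != '\0'; ++p) {
        ++count;
    }
    return count;
}
```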

std::to_string store in const char*

I need to convert number to string and store it into a const char* but problem is that const char* variable is blank after assignment.
In the following example code I expect to see number output converted to const char*
#include <iostream>
#include <string>
int main()
{
int number = 123;
const char* ptr_num_string = std::to_string(number).c_str();
std::cout << "number to string is: " << ptr_num_string << std::endl;
std::cin.get();
return 0;
}
Output is blank:
number to string is:
How do I convert number into a const char* ?
std::to_string returns a temporary std::string.
The pointer returned by std::string::c_str is invalidated by any non-const operation on the string itself - this is because it basically gives you a pointer to the string's internal buffer.
Destroying a std::string is definitely a non-const operation!
Therefore, you can't expect to take a pointer into the string returned by to_string and ever be able to use that. You need to save a copy of that string in the first place:
int number = 123;
std::string const numAsString = std::to_string(number);
char const* ptrToNumString = numAsString.c_str(); // use this as long as numAsString is alive
What you are doing is called Undefined Behaviour - this means that your program is invalid, and anything could happen. It could crash, it could print out some garbage, it could appear to work normally, it could do a different one of these each time you run your program...
It is not "blank"; it points to the buffer of a temporary string that has already been destroyed, so accessing it causes undefined behavior. You need to keep the string alive:
auto str{std::to_string(number)};
auto ptr_num_string{str.c_str()};
std::cout << "number to string is: " << ptr_num_string << std::endl;
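A related detail: a temporary lives until the end of the full expression that creates it, so reading through the pointer inside that same expression is fine; only storing the pointer for later use is the problem. The sketch below illustrates both cases. The helper names digitCount and numberAsString are hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: the temporary std::string returned by std::to_string
// stays alive until the end of the full expression, so std::strlen can
// safely read through the pointer here.
std::size_t digitCount(int number)
{
    return std::strlen(std::to_string(number).c_str());
}

// Returning the owning std::string lets the caller decide how long the
// characters stay alive before calling c_str() on it.
std::string numberAsString(int number)
{
    return std::to_string(number);
}
```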

Working with const std::string pointers as function parameter

After years of writing Java, I would like to dig deeper into C++ again.
Although I think I can handle it, I don't know whether I am doing it the "state of the art" way.
Currently I am trying to understand how to handle std::strings that are passed as const pointers to a method.
In my understanding, any string manipulations I would like to perform on the content of the pointer (the actual string) are not possible because it is const.
I have a method that should convert the given string to lower case and I did quite a big mess (I believe) in order to make the given string editable. Have a look:
class Util
{
public:
static std::string toLower(const std::string& word)
{
// in order to make a modifiable string from the const parameter
// copy into char array and then instantiate new std::string
int length = word.length();
char workingBuffer[length];
word.copy(workingBuffer, length, 0);
// create modifiable string
std::string str(workingBuffer, length);
std::cout << str << std::endl;
// string to lower case (include <algorithm> for this!!!!)
std::transform(str.begin(), str.end(), str.begin(), ::tolower);
std::cout << str << std::endl;
return str;
}
};
Especially the first part, where I use the char buffer, to copy the given string into a modifiable string annoys me.
Are there better ways to implement this?
Regards,
Maik
The parameter is const (it's a reference, not a pointer!), but that does not prevent you from copying it:
// create modifiable string
std::string str = word;
That being said, why did you make the parameter a const reference in the first place? Using a const reference is good for avoiding a copy of the argument, but if you need the copy anyhow, then simply take the parameter by value:
std::string toLower(std::string word) {
std::transform(word.begin(), word.end(), word.begin(), ::tolower);
// ....
Remember that C++ is not Java: values are values, not references. Copies are real copies, and modifying word inside the function won't have any effect on the argument that was passed to the function.
You should replace all this:
// in order to make a modifiable string from the const parameter
// copy into char array and then instantiate new std::string
int length = word.length();
char workingBuffer[length];
word.copy(workingBuffer, length, 0);
// create modifiable string
std::string str(workingBuffer, length);
with simply this:
std::string str(word);
and it should work just fine =)
As you must make a copy of the input string, you may as well take it by value (also better use a namespace than a class with static members):
#include <cctype>
#include <string>
namespace util {
    // modifies the input string (taken by reference), then returns a reference to
    // the modified string
    inline std::string& convert_to_lower(std::string& str)
    {
        for (auto& c : str)
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        return str;
    }
    // returns a modified version of the input string, taken by value such that
    // the passed string at the caller remains unaltered
    inline std::string to_lower(std::string str)
    {
        // str is a (deep) copy of the string provided by caller
        convert_to_lower(str);
        // str is a by-value parameter, so it is moved (not deep-copied) on return
        return str;
    }
}
std::string str = "Hello";
auto str1 = util::to_lower(str);
std::cout << str << ", " << str1 << std::endl;
leaves str un-modified: it prints
Hello, hello
See here for why I cast to unsigned char.
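In short, std::tolower takes an int whose value must be representable as unsigned char (or be EOF); otherwise the behavior is undefined. A plain char can be negative on platforms where char is signed, which is why the conversion is done first. A minimal sketch of a per-character wrapper, with the hypothetical name toLowerChar:

```cpp
#include <cctype>

// Hypothetical wrapper: convert to unsigned char before calling std::tolower
// so that a negative char value is never passed to it, then convert the
// int result back to char.
char toLowerChar(char c)
{
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}
```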

Why does returning a C-string from a function result in random characters?

I've had to stop coding so many projects because of this weird quirk that I'm fed up enough to ask and risk looking like an idiot, so here goes...
I wrote a function like this:
const char* readFileToString(const char* filename) {
const char* result;
std::ifstream t(filename);
std::stringstream buffer;
buffer << t.rdbuf();
result = buffer.str().c_str();
return result;
}
I would expect that, if file.txt contains hello, that readFileToString("file.txt") should return hello. Instead, it returns garbled text, something along the lines of H�rv�0. However, if I put a std::cout << result; just before the return, it'll print hello.
Is this some weird, impossible quirk with C++? How do I fix it?
It's neither weird nor impossible; you returned a pointer to a buffer that went out of scope. The const char* doesn't "own" the string data, it only refers to it. Or, it used to! Once returned, that pointer is now invalid. You shall not dereference it.
I suggest you stick with std::string instead of venturing into advanced pointer techniques.
std::string readFileToString(const char* filename)
{
std::ifstream t(filename);
std::stringstream buffer;
buffer << t.rdbuf();
return buffer.str();
}
Unfortunately, I am not aware of any way to avoid a copy here.
If you don't mind shuffling your design about a little, and if you have some way to avoid a stream-to-string copy down the line, you could do this instead:
void readFileToStream(const char* filename, std::ostream& os)
{
std::ifstream t(filename);
os << t.rdbuf();
}
You may wish to return bool signifying the stream's state, but again you can do that at the callsite anyway.
Please see the commentary below:
const char* readFileToString(const char* filename) {
const char* result;
std::ifstream t(filename);
std::stringstream buffer; // Behind the scenes, some memory is allocated
buffer << t.rdbuf(); // That memory is filled with the file's contents
result = buffer.str().c_str(); // str() returns a temporary std::string; result points into it
// The temporary string is destroyed at the end of the statement above,
// so result is already dangling at this point
return result;
}
After the call, result points to an invalid memory location, which is why the text appears corrupted.

std::ostringstream isn't returning a valid string

I'm trying to use std::ostringstream to convert a number into a string (char *), but it doesn't seem to be working. Here's the code I have:
#include <windows.h>
#include <sstream>
int main()
{
std::ostringstream out;
out << 1234;
const char *intString = out.str().c_str();
MessageBox(NULL, intString, intString, MB_OK|MB_ICONEXCLAMATION);
return 0;
}
The resulting message box simply has no text in it.
This leads me to believe that the call to out.str().c_str() is returning an invalid string, but I'm not sure. Since I've trimmed this program down so far and am still getting the problem, I must have made an embarrassingly simple mistake. Help is appreciated!
out.str() returns a std::string by value, which means that you are calling .c_str() on a temporary. Consequently, by the time intString is initialized, it is already pointing at invalid (destroyed) data.
Cache the result of .str() and work with that:
std::string const& str = out.str();
char const* intString = str.c_str();