Is there an equivalent way to do CString::GetBuffer in std::string? - c++

Many Windows API functions, such as GetModuleFileName, write their output to a char* buffer. But it is more convenient to use std::string. Is there a way to have them write directly to the buffer of a std::string (or std::wstring)?
Sorry for my poor English. I'm not a native English speaker. -_-
Taworn T.

If you're using C++11 (formerly known as C++0x), then the following is guaranteed to work:
std::string s;
s.resize(max_length);
size_t actual_length = SomeApiCall(&s[0], max_length);
s.resize(actual_length);
Before C++11 the contents of a std::string were not guaranteed to be contiguous in memory, so the code is not reliable in theory; in practice it works for popular standard library implementations.
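The resize, call, resize pattern above can be sketched in a self-contained form. Here fake_api_call is a hypothetical stand-in for a C-style API such as GetModuleFileName (it is not a real Windows function), and call_into_string is an illustrative helper name; the sketch assumes C++11 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical C-style API: writes at most max_length chars into buffer
// (no terminator) and returns the number of chars actually written.
std::size_t fake_api_call(char* buffer, std::size_t max_length) {
    assert(buffer != nullptr);
    const char text[] = "C:\\app\\demo.exe";
    std::size_t n = std::strlen(text);
    if (n > max_length) n = max_length;
    std::memcpy(buffer, text, n);
    return n;
}

// Resize, let the API write into the string's own storage, then trim
// the string down to the length the API reported.
std::string call_into_string(std::size_t max_length) {
    std::string s;
    s.resize(max_length);
    std::size_t actual_length = fake_api_call(&s[0], max_length);
    s.resize(actual_length);
    return s;
}
```

With a real API, the meaning of the returned length (with or without the terminator, truncation behavior) has to be checked against that API's documentation before the final resize.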

Use std::string::c_str() to retrieve a const char * that is null-terminated.
std::string::data() also returns a const char *, but before C++11 it was not guaranteed to be null-terminated.
But like zeuxcg says, I don't suggest writing directly into that buffer.

Related

char8_t and utf8everywhere: How to convert to const char* APIs without invoking undefined behaviour?

As this question is some years old
Is C++20 'char8_t' the same as our old 'char'?
I would like to know what the recommended way to handle the char8_t and char conversion is right now. boost::nowide (1.80.0) does not yet understand char8_t, nor (AFAIK) does boost::locale.
As Tom Honermann noted that
reinterpret_cast<const char *>(u8"text"); // Ok.
reinterpret_cast<const char8_t*>("text"); // Undefined behavior.
So: How do I interact with APIs that just accept const char* or const wchar_t* (think Win32 API) if my application's "default" string type is std::u8string? The recommendation seems to be https://utf8everywhere.org/.
If I have a std::u8string and convert to std::string by
std::u8string convert(std::string str)
{
return std::u8string(reinterpret_cast<const char8_t*>(str.data()), str.size());
}
std::string convert(std::u8string str)
{
return std::string(reinterpret_cast<const char*>(str.data()), str.size());
}
This would invoke the same UB that Tom Honermann mentioned. This would be used when I talk to the Win32 API or any other API that wants a const char* or gives a const char* back. I could route all conversions through boost::nowide, but in the end I get a const char* back from boost::nowide::narrow() that I need to cast.
Is the current recommendation to just stay at char and ignore char8_t?
This would invoke the same UB that Tom Honermann mentioned.
As pointed out in the post you referred to, UB only happens when you cast from a char* to a char8_t*. The other direction is fine.
If you are given a char* which is encoded in UTF-8 (and you care to avoid the UB of just doing the cast for some reason), you can use std::ranges::transform to convert each char to a char8_t:
std::u8string convert(std::string str)
{
std::u8string ret(str.size(), u8'\0'); // basic_string has no size-only constructor
std::ranges::transform(str, ret.begin(), [](char c) {return char8_t(c);});
return ret;
}
C++23's ranges::to will make using a named return variable unnecessary.
For dealing with wchar_t interfaces (which you shouldn't have to, since nowadays UTF-8 support exists through narrow character interfaces on Windows), you'll have to do an actual UTF-8->UTF-16 conversion. Which you would have had to do anyway.
Personally, I think all the char8_t stuff in C++ is practically unusable!
With the current standard combined with OS support, I would recommend avoiding it if possible.
But that is not all. There is more to criticize:
Unfortunately the C++ standard itself deprecates its own conversion support before it offers a replacement!
For example, the std::filesystem support for a UTF-8 encoded standard string (not u8string), std::filesystem::u8path, is deprecated. As a result, even using a UTF-8 encoded std::string is a pain, because you must always convert from one type to the other and back again!
To your questions: it depends on what you want to do. If you want to have a std::string which is UTF-8 encoded but you only have a std::u8string, then you can simply do the following (no reinterpret_cast needed):
std::string convert( std::u8string str )
{
return std::string(str.begin(), str.end());
}
But here, I personally would expect the standard to offer a move constructor in std::string taking a std::u8string. Otherwise you must always make a copy, with an extra allocation, of unchanged data.
Unfortunately the standard does not offer such simple things. It forces users to do uncomfortable and expensive work.
The same is true in the other direction: if you have a std::string and you have verified 100% that it is valid UTF-8, then you can convert it directly:
std::u8string convert( std::string str )
{
return std::u8string( str.begin(), str.end() );
}
While writing this long answer I realized that it is even worse than I thought when it comes to conversion! If you need to do a real conversion of the encoding, it turns out that std::u8string is not supported at all.
The only possible way (as far as my research goes) is to use std::string as the data holder for the conversion, since the available routines work on char and NOT on char8_t!
So, for the conversion from std::string to std::u8string you must do the following:
Use std::mbrtoc16 or std::mbrtoc32 to convert narrow chars to either UTF-16 or UTF-32.
Use std::codecvt_utf8 to produce a UTF-8 encoded std::string.
Finally use the routine above to convert from the UTF-8 encoded std::string to std::u8string.
For the other way round, from std::u8string to std::string, you must do the following:
Use the routine above to create a UTF-8 encoded std::string.
Use std::codecvt_utf8 to create a UTF-16/32 string.
And finally use std::c16rtomb or std::c32rtomb to produce a narrow encoded std::string.
But guess what? The codecvt routines are deprecated without a replacement...
So, personally, I would recommend to use the Windows API for it and use std::string only (or on Windows std::wstring). Usually only on Windows the std::string / char is encoded with a Windows code page and everywhere else you can normally expect it is UTF-8 (except maybe for Mainframes and some very rare old systems).
The conclusion can only be: Don't mess around with char8_t and std::u8string at all. It is practically unusable.

Are the methods in the <cstring> applicable for string class too?

I tried using memcpy() on strings but got a "no matching function call" error, although it works perfectly when I use a char[] array.
Can someone explain why?
www.cplusplus.com/reference/cstring/memcpy/
std::string is an object, not a contiguous array of bytes (which is what memcpy expects). std::string is not char*; std::string contains char* (somewhere really deep).
Although you can get at the inner byte array of a std::string by using &str[0] (see note), I strongly encourage you not to. Almost anything you need to do is already implemented as a std::string method, including appending, removing, transforming and anything else that makes sense with a text object.
So yes, you can do something as stupid as:
std::string str (100,0);
memcpy(&str[0],"hello world", 11);
but you shouldn't.
Even if you do need memcpy behaviour, try to use std::copy instead.
Note: this is often done with C functions that expect some buffer, while the developer wants to maintain a RAII style in the code. So he or she creates a std::string object but passes it as a C string. But if you write clean C++ code you don't need to.
Because there's no matching function call. You're trying to use C library functions with C++ types.
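As a rough illustration of the advice above, the following sketch copies raw characters into a std::string without memcpy. The helper names from_bytes and overwrite_prefix are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Preferred: construct the string from the bytes and let it manage storage.
std::string from_bytes(const char* src, std::size_t count) {
    assert(src != nullptr);
    return std::string(src, count);
}

// If an existing string has to be overwritten in place, std::copy works on
// iterators, so no raw pointer into the string is needed.
void overwrite_prefix(std::string& dst, const char* src, std::size_t count) {
    assert(src != nullptr);
    if (dst.size() < count) dst.resize(count);  // make room before copying
    std::copy(src, src + count, dst.begin());
}
```

In most code the first form, or simply dst.assign(src, count), is enough; the second form only matters when the rest of the string must be preserved.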

dynamic allocation in c++

I am trying to load a .txt file into an object (C++). The problem is that I don't know how big it is.
In C I would do it with the malloc() function, but I have no idea how to do that in C++ or how to google that issue =/
Why not use std::ostringstream?
Or if you want to use an equivalent to malloc, use:
char *storage = new char[__size__];
....
delete[] storage;
But if your file is a binary file, odds are it contains a null byte, and strlen won't work the way you expect.
You can also use std::string or std::vector<char>, which can hold any byte values and can be converted to const char * easily.
Why not save it as a string in a field of type std::string?
Try (assuming myFile is an open std::ifstream):
myObj.someString.assign(std::istreambuf_iterator<char>(myFile), std::istreambuf_iterator<char>());
You can use the new operator in C++, or better yet one of the standard library containers.
Try this: http://www.fredosaurus.com/notes-cpp/newdelete/50dynamalloc.html
Prefer standard containers over raw dynamic allocation.
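A minimal sketch of the std::ostringstream suggestion, for the case where the file size is not known in advance. The function name read_file and its bool-plus-output-parameter shape are choices made for this example only.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Reads the whole file at `path` into `out`. The string grows as needed,
// so no size has to be computed and nothing has to be freed by hand.
// Returns false if the file cannot be opened.
bool read_file(const std::string& path, std::string& out) {
    assert(!path.empty());
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in) return false;
    std::ostringstream buffer;
    buffer << in.rdbuf();   // copy the entire stream buffer
    out = buffer.str();
    return true;
}
```

Opening in binary mode keeps null bytes and line endings exactly as they are in the file; out.size() then gives the length, so strlen is never needed.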

use of char * vs std::string in different environments

I have been using std::string in my code. I was going to make a std::string and pass it by reference. However, someone suggested using a char * instead, saying that std::string is not reliable when porting code. Is that true? I have avoided using char * because I would need to do some memory management for it. I find std::string much easier to use.
Basically I have a 10 digit output that I am storing in this string. Atm, I am not sure which would be better to use.
std::string is part of the C++ Standard, and has been since 1998. It is available in all the current C++ compilers. There really is no portability reason not to use it. If you have an API that needs to use a C-style string, you can use the std::string's c_str() member to get one from a string:
std::string s = "foo";
int n = strlen( s.c_str() );
In C++, almost every string should be std::string unless another library requires a cstring, in which case you should still be using an std::string and passing string.c_str(), unless you're using functions that work with buffers.
However, if you're writing a library and exporting functions, it's better to use const char* parameters rather than std::string parameters for portability.
Using a char * you are sure that you will not get portability issues among libraries.
If a library exports a function that uses an std::string, it might have problems communicating with another library that has been linked against a different version of the standard library.
I think that there is nothing to worry about unless you are going to provide some API to 3rd party.
Just use std::string
There's nothing unportable about std::string that isn't also an issue with char *. std::string actually uses a char * internally...
std::string is better. There is nothing unreliable about it on any platform. If you're worried about passing large classes, you can pass const references to your strings into functions. It makes coding faster and less bug-prone.
In addition to the fact that it's easier, std::string will probably be more efficient. Its small string optimization can keep the 10 digits in the std::string object itself, instead of putting them in another memory block on the heap.
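The answers above can be combined into a small sketch: std::string is used inside the program and passed by const reference, while a function meant to sit on a library boundary takes a plain const char*. Both function names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical library-boundary function: a plain const char* parameter
// avoids passing std::string between differently built libraries.
std::size_t count_digits(const char* text) {
    assert(text != nullptr);
    std::size_t digits = 0;
    for (const char* p = text; *p != '\0'; ++p) {
        if (*p >= '0' && *p <= '9') ++digits;
    }
    return digits;
}

// Caller side: keep std::string, pass it by const reference, and hand
// c_str() to the C-style function only at the boundary.
bool is_ten_digit_output(const std::string& value) {
    return value.size() == 10 && count_digits(value.c_str()) == 10;
}
```

No manual memory management is involved on either side; the pointer returned by c_str() stays valid as long as the string is not modified or destroyed.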

Opening a file with std::string

This should be a fairly trivial problem. I'm trying to open an ofstream using a std::string (or std::wstring) and having problems getting this to work without a messy conversion.
std::string path = ".../file.txt";
ofstream output;
output.open(path);
Ideally I don't want to have to convert this by hand or involve c-style char pointers if there's a nicer way of doing this?
In the path string, use two dots instead of three.
Also, you can use the c_str() method on the string to get the underlying C string.
output.open(path.c_str());
this should work:
output.open(path.c_str())
I'm afraid that before C++11 it's simply not possible. You have to use c_str, and yes, it sucks. (Since C++11, the fstream constructors and open() also accept a std::string.)
Incidentally, using char* also means fstream has no support for Unicode file names... a shame.
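A minimal sketch of opening an ofstream from a std::string, assuming a C++11 or later compiler; with an older compiler the path.c_str() form from the answers above is needed instead. The function name write_text is illustrative.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Writes `text` to the file named by `path` and reports whether the
// stream is still in a good state afterwards.
bool write_text(const std::string& path, const std::string& text) {
    assert(!path.empty());
    std::ofstream output(path);   // C++11 overload taking std::string
    if (!output) return false;    // the file could not be opened
    output << text;
    return static_cast<bool>(output);
}
```

The same overload exists for output.open(path), so the code from the question compiles unchanged under C++11.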