C++ convert string to wstring and back with Minimal code - c++

The following code converts a string to a wstring which I need to call the stemming method I am using. However, the map in which I am storing the stemmed words is full of strings. I looked around at some of the solutions on SO and many of the conversions from wstring to string are circa a dozen lines of code. Is there any way to convert quickly (preferably inline or similar) from a string to a wstring and back?
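std::string ANSIWord("documentation");
// Let std::wstring own the buffer, so nothing has to be deleted afterwards.
std::wstring wWord(ANSIWord.length(), L'\0');  // at most one wchar_t per input byte
// std::mbstowcs (from <cstdlib>) returns the count written, or (size_t)-1 on error.
std::size_t written = std::mbstowcs(&wWord[0], ANSIWord.c_str(), wWord.size());
wWord.resize(written == static_cast<std::size_t>(-1) ? 0 : written);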
Otherwise, I will look into converting my map and other methods to use wstring.
EDIT:
Epiphany: I decided to place the entire conversion-method-conversion sequence in a method of its own, thereby reducing it to the desired one line. However, I would still like to know, out of curiosity and for future reference.

Why don't you write a function that includes the circa 12 lines of code that do the conversion, and then call that function?
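A minimal sketch of such a pair of functions, assuming the text is encoded in the current C locale's multibyte encoding. The names to_wide and to_narrow are made up for illustration, and on a conversion error both simply return an empty string:

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: narrow -> wide using the current C locale.
std::wstring to_wide(const std::string& s) {
    std::wstring out(s.size(), L'\0');  // at most one wchar_t per input byte
    std::size_t n = std::mbstowcs(&out[0], s.c_str(), out.size());
    if (n == static_cast<std::size_t>(-1)) return std::wstring();  // invalid sequence
    out.resize(n);  // drop the unused tail
    return out;
}

// Hypothetical helper: wide -> narrow using the current C locale.
std::string to_narrow(const std::wstring& w) {
    std::string out(w.size() * MB_CUR_MAX, '\0');  // worst-case bytes per wchar_t
    std::size_t n = std::wcstombs(&out[0], w.c_str(), out.size());
    if (n == static_cast<std::size_t>(-1)) return std::string();  // unrepresentable character
    out.resize(n);
    return out;
}
```

Each call site then becomes one line, for example wWord = to_wide(ANSIWord); before stemming and to_narrow(wWord) when storing the result in the map.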

Related

How can I replicate compile time hex string interpretation at run time!? c++

In my code the following line gives me data that performs the task it's meant for:
const char *key = "\xf1`\xf8\a\\\x9cT\x82z\x18\x5\xb9\xbc\x80\xca\x15";
The problem is that it gets converted at compile time according to rules that I don't fully understand. How does "\x" work in a String?
What I'd like to do is to get the same result but from a string exactly like that fed in at run time. I have tried a lot of things and looked for answers but none that match closely enough for me to be able to apply.
I understand that \x denotes a hex number. But I don't know in which form that gets 'baked out' by the compiler (gcc).
What does that ` translate into?
Does the "\a" do something similar to "\x"?
This is indeed done by the compiler, but that part is not exposed through the standard library. That leaves you with three options:
1) Dynamically generate a C++ source file that contains the string as a literal and writes it to its standard output. Compile it, then (provided popen is available) execute it from your main program and read its output. Pretty ugly, isn't it...
2) Use the source of an existing compiler, or its internal libraries directly. Clang is probably a good starting point because it has been designed to be modular. But it could take a good amount of work to find where that damned specific point is coded and how to use it...
3) Just mimic what the compiler does and write your own parser by hand. It is not that hard, and it will teach you why tests are useful...
In case it was not clear until here, I strongly urge you to use the third option ;-)
If you want to translate "escape" codes in strings that you get as input at run-time then you need to do it yourself, explicitly.
One way is to read the input into one string, then copy the characters from that source string into a new destination string, one by one. If you see a backslash, discard it and fetch the next character. If it's an x, convert the hex digits that follow into their integer value (e.g. with std::stoi and base 16) and append that value to the destination string as a single char. Note that appending it with std::to_string, or streaming the int with the << operator, would add the decimal digits of the number rather than the byte itself.
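The loop described above could be sketched as follows. This is only an illustration: the name unescape is made up, it handles \x plus the common single-character escapes (octal and \u are left out), and it accumulates the hex digits by hand instead of calling std::stoi. As in a compiled literal, \x consumes every hex digit that follows it, \a becomes the alert character (0x07), and a backtick is an ordinary character that is copied unchanged.

```cpp
#include <cctype>
#include <string>

// Value of one hex digit; the caller has already checked std::isxdigit.
static int hex_value(char d) {
    unsigned char u = static_cast<unsigned char>(d);
    return std::isdigit(u) ? u - '0' : std::tolower(u) - 'a' + 10;
}

// Hypothetical helper: expands a subset of C++ escape sequences at run time.
std::string unescape(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != '\\' || i + 1 == in.size()) {  // ordinary char or trailing backslash
            out += in[i];
            continue;
        }
        char c = in[++i];  // the character after the backslash
        if (c == 'x') {
            unsigned value = 0;
            std::size_t digits = 0;
            while (i + 1 < in.size() &&
                   std::isxdigit(static_cast<unsigned char>(in[i + 1]))) {
                value = value * 16 + hex_value(in[++i]);
                ++digits;
            }
            if (digits == 0) out += "\\x";          // no digits: keep the text as-is
            else out += static_cast<char>(value);   // append the byte, not its digits
            continue;
        }
        switch (c) {
            case 'a': out += '\a'; break;
            case 'b': out += '\b'; break;
            case 'f': out += '\f'; break;
            case 'n': out += '\n'; break;
            case 'r': out += '\r'; break;
            case 't': out += '\t'; break;
            case 'v': out += '\v'; break;
            case '0': out += '\0'; break;
            default:  out += c;    break;  // covers \\, \" and \'
        }
    }
    return out;
}
```

Fed the text of the key from the question at run time, this should yield the same 16 bytes that the compiler bakes into the literal.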

How to hard code binary data to string

I want to test serialized data conversion in my application. Currently the object is stored in a binary file, and the application reads that file to reload the object.
In my unit test case I want to test this operation. As the file operations are costly I want to hard code the binary file content in the code itself.
How can I do this?
Currently I am trying like this,
std::string FileContent = "\00\00\00\00\00.........";
and it is not working.
You're right that a string can contain '\0', but here you're still initializing it from a const char*, which, by definition, stops at the first '\0'. I'd recommend using uint8_t[] or even uint32_t[] (that is, without going through std::string), even if the second might have up to 3 bytes of overhead (but it's more compact in source). That's how X bitmaps are usually stored, for example.
Another possibility is base64 encoding, which is printable but needs (a relatively quick) decoding.
If you really want to put the const char[] to a std::string, first convert the pointer to const char*, then use the two-iterator constructor of std::string. While it's true that std::string can hold '\0', it's somewhat an antipattern to store binary in a string, thus I'm not giving the exact code, just the hint.
The following should do what you need, though it is probably not recommended, as most people wouldn't expect a std::string to contain null bytes.
std::string FileContent { "\x00\x00\x00\x00\x00", 5 };
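To illustrate both suggestions, here is a small sketch that keeps the test data in a byte array and only builds a std::string from it when one is needed. The array contents and the names kFileContent and file_content_as_string are placeholders, not the real serialized data:

```cpp
#include <cstdint>
#include <iterator>
#include <string>

// Hypothetical test fixture: five bytes with embedded zeros.
static const std::uint8_t kFileContent[] = {0x00, 0x01, 0x00, 0xFF, 0x00};

// The two-iterator constructor copies every byte, zeros included,
// because the length comes from the array rather than from a terminator.
std::string file_content_as_string() {
    return std::string(std::begin(kFileContent), std::end(kFileContent));
}
```

By contrast, std::string("\x00\x01") goes through the const char* constructor and ends up empty, which is why the original attempt did not work.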

why do they have a system string and standard string

Why do they have a System::String and a std::string in C++?
I couldn't find anything about this, except some topics about converting one to the other.
I noticed this when I wanted to put the contents of a textbox into a std::string variable, so I had to do some odd converting to get this.
Why do they have these two different strings when they serve the same purpose in code (holding a string value)?
std::string is a type from the C++ standard library (a specialization of the std::basic_string class template for char) that stores and manipulates strings. The data in a std::string is basically a sequence of bytes, i.e. it doesn't have an encoding. std::string supports the most basic set of operations that you would expect from a string, namely it gives you methods for substring search and replace.
System::String is a class from Microsoft's .NET framework. It represents text as a series of Unicode characters, and has some more specialized methods like StartsWith, EndsWith, Split, Trim, and so on.
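As a rough illustration of the difference in convenience, the sketch below builds a StartsWith-style check and a replace from the basic std::string operations. The helper names starts_with and replace_first are invented for this example:

```cpp
#include <string>

// Hypothetical helper mirroring System::String::StartsWith.
bool starts_with(const std::string& s, const std::string& prefix) {
    // Compare the first prefix.size() characters of s against prefix.
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical helper: replace the first occurrence of `from` with `to`.
std::string replace_first(std::string s, const std::string& from, const std::string& to) {
    std::size_t pos = s.find(from);  // substring search
    if (pos != std::string::npos) s.replace(pos, from.size(), to);
    return s;
}
```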

String-Conversion: MBCS <-> UNICODE with multiple \0 within

I am trying to convert a std::string Buffer - containing data from a bitmap file - to std::wstring.
I am using MultiByteToWideChar, but that does not work, because the function stops when it encounters the first '\0' character. It seems to interpret it as the end of the string.
When I don't pass -1 as the length parameter, but the real length of the data in the std::string buffer, it messes up the Unicode string with characters that definitely did not appear at that position in the original string...
Do I have to write my own conversion function?
Or should I keep the data as a plain char array, because the special symbols will be converted incorrectly?
With regards
There are many, many things that will fail with this approach. Among other things, extra bytes may be added to your data without your realizing it.
It's odd that your only option takes a std::wstring. If this is a home-grown library, you should take the trouble to write a new function. If it's not, make sure there's nothing more suitable before writing your own.

I'm getting contradictory answers, what should I do with my code?

I am making a RPG game in C++ and DirectX.
I store all the data for the game in .txt files and read/write it using ifstream/ofstream. This has worked well for me so far for creature stats, and I have a hack for creature names, but this is becoming a problem.
I can store strings in the txt file and read them, but I am having trouble using them. For single words I have a hack, but now that I am up to the storyline, where characters are talking to each other, it is a real problem.
I asked on gamedevelopment.stackexchange how to put text on screen and was told to use D3Dtext, but that only accepts C-style strings and I can only read C++ strings from the text file. This is such a big problem now that I am willing to go back and refactor whatever needs it, as no progress can be made until this is sorted.
So now I have a bunch of questions and I don't know which to ask first:
1) I want a way to draw the letters like graphics. I was told this is what D3Dtext does, but I want to implement it myself if I can. I just need info on how, if someone knows?
2) If I am to use D3Dtext as the so-called experts advise, I have to use C-style strings. So how can I convert C-style strings to C++ strings? I have a method now, but it requires the new and delete operators for every string, and I can see that being a big problem as the game grows in complexity.
3) Is there a way to read C-style strings? Maybe a replacement for ifstream. I would like to keep the txt files as I really don't want to use XML, but I could change the file format if it was a viable solution.
4) Premature optimisation, I know, but I plan to use the same function for every piece of text in the game, so what would be a good way of doing this in terms of speed (which is why I don't want new/delete for every string)?
I am happy to provide any information that would be needed to help me, just ask.
std::string mystr = "Hello World.";
mystr.c_str(); // gets a null terminated const char* C-style string
Read your file as you are currently doing; then, whenever you need a C-style string, get it with c_str() as above.
You can convert freely between C-style strings and C++'s std::string. Just use my_cpp_string.c_str() to get the C-string representation of a C++ string, and std::string my_cpp_string(my_c_string) to initialize a new std::string from a C-style string.
2) Use the c_str() method to pass your C++ strings to D3Dtext
some_D3Dtext_function(some_text.c_str())
3 and 4 then become non-issues.