I am trying to convert a string into a wchar_t string so that I can use it with the WNetUseConnection function.
Basically it's a UNC name that looks like this: "\\remoteserver".
I get return code 1113, which is described as: "No mapping for the Unicode character exists in the target multi-byte code page."
My code looks like this:
std::string serverName = "\\uncDrive";
wchar_t *remoteName = new wchar_t[ serverName.size() ];
MultiByteToWideChar(CP_ACP, 0, serverName.c_str(), serverName.size(), remoteName, serverName.size()); //also doesn't work if CP_UTF8
NETRESOURCE nr;
memset( &nr, 0, sizeof( nr ));
nr.dwType = RESOURCETYPE_DISK;
nr.lpRemoteName = remoteName;
wchar_t pswd[] = L"user"; //would have the same problem if converted and not set
wchar_t usrnm[] = L"pwd"; //would have the same problem if converted and not set
int ret = WNetUseConnection(NULL, &nr, pswd, usrnm, 0, NULL, NULL, NULL);
std::cerr << ret << std::endl;
The interesting thing is that if remoteName is hard-coded like this:
wchar_t remoteName[] = L"\\\\uncName";
Everything works fine. But since the server, user and pwd will later be parameters that I receive as strings, I need a way to convert them (I also tried the mbstowcs function, with the same result).
MultiByteToWideChar will not 0-terminate the converted string with your current code, because the input length you pass (serverName.size()) does not include the terminator. As a result you get garbage characters following the converted "\uncDrive".
Use this:
std::string serverName = "\\uncDrive";
int CharsNeeded = MultiByteToWideChar(CP_ACP, 0, serverName.c_str(), serverName.size() + 1, 0, 0);
wchar_t *remoteName = new wchar_t[ CharsNeeded ];
MultiByteToWideChar(CP_ACP, 0, serverName.c_str(), serverName.size() + 1, remoteName, CharsNeeded);
This first checks with MultiByteToWideChar how many chars are needed to store the specified string and the 0-termination, then allocates the string and converts it. Note that I didn't compile/test this code, beware of typos.
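The same measure-then-convert pattern can be wrapped so that a std::wstring owns the buffer and supplies the terminator. This is only a sketch: widen is a hypothetical helper name, and the non-Windows branch (std::mbstowcs, relying on the common POSIX extension that a null destination returns the required count) is an assumption added so the sketch builds off Windows too.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not from the original answer): converts a narrow
// string to a wide string. std::wstring owns the buffer and keeps it
// null-terminated, so no manual "+ 1" bookkeeping or delete[] is needed.
std::wstring widen(const std::string& narrow) {
    if (narrow.empty()) return std::wstring();
#ifdef _WIN32
    // Explicit input length: the output then has no terminator of its own,
    // which is fine because std::wstring adds one.
    const int srcLen = static_cast<int>(narrow.size());
    const int needed = MultiByteToWideChar(CP_ACP, 0, narrow.data(), srcLen, nullptr, 0);
    if (needed <= 0) return std::wstring();
    std::wstring wide(static_cast<std::size_t>(needed), L'\0');
    MultiByteToWideChar(CP_ACP, 0, narrow.data(), srcLen, &wide[0], needed);
    return wide;
#else
    // Assumed fallback: first call measures, second call converts.
    const std::size_t needed = std::mbstowcs(nullptr, narrow.c_str(), 0);
    if (needed == static_cast<std::size_t>(-1)) return std::wstring();
    std::wstring wide(needed, L'\0');
    std::mbstowcs(&wide[0], narrow.c_str(), needed);
    return wide;
#endif
}
```

Since NETRESOURCE::lpRemoteName is a non-const pointer, the caller would keep the returned std::wstring alive for the duration of the call and pass &wide[0].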
I am trying to convert a Shift-JIS file to UTF-8 with the code below, but the output file contains corrupted characters when opened. It looks like something is missing here. Any thoughts?
// From file
FILE* shiftJisFile = _tfopen(lpszShiftJs, _T("rb"));
int nLen = _filelength(fileno(shiftJisFile));
LPSTR lpszBuf = new char[nLen];
fread(lpszBuf, 1, nLen, shiftJisFile);
// convert multibyte to wide char
int utf16size = ::MultiByteToWideChar(CP_ACP, 0, lpszBuf, -1, 0, 0);
LPWSTR pUTF16 = new WCHAR[utf16size];
::MultiByteToWideChar(CP_ACP, 0, lpszBuf, -1, pUTF16, utf16size);
wstring str(pUTF16);
// convert wide char to multi byte utf-8 before writing to a file
fstream File("filepath", std::ios::out);
string result = string();
result.resize(WideCharToMultiByte(CP_UTF8, 0, str.c_str(), -1, NULL, 0, 0, 0));
char* ptr = &result[0];
WideCharToMultiByte(CP_UTF8, 0, str.c_str(), -1, ptr, result.size(), 0, 0);
File << result;
File.close();
There are multiple problems.
The first problem is that when you are writing the output file, you need to set it to binary for the same reason you need to do so when reading the input.
fstream File("filepath", std::ios::out | std::ios::binary);
The second problem is that when you read the input file, you are only reading the bytes of the input stream and then treating them like a string. However, those bytes do not have a terminating null character. If you call MultiByteToWideChar with a length of -1, it infers the input string length from the terminating null character, which is missing in your case. That means both utf16size and the contents of pUTF16 are already wrong. Add the terminator manually after reading the file:
int nLen = _filelength(fileno(shiftJisFile));
LPSTR lpszBuf = new char[nLen+1];
fread(lpszBuf, 1, nLen, shiftJisFile);
lpszBuf[nLen] = 0;
The last problem is that you are using CP_ACP. That means "the current code page". In your question, you were specifically asking how to convert Shift-JIS. The code page Windows uses for its closest equivalent to what is commonly called "Shift-JIS" is 932 (you can look that up on Wikipedia, for example). So use 932 instead of CP_ACP:
int utf16size = ::MultiByteToWideChar(932, 0, lpszBuf, -1, 0, 0);
LPWSTR pUTF16 = new WCHAR[utf16size];
::MultiByteToWideChar(932, 0, lpszBuf, -1, pUTF16, utf16size);
Additionally, there is no reason to create wstring str(pUTF16). Just use pUTF16 directly in the WideCharToMultiByte calls.
Also, I'm not sure how kosher char *ptr = &result[0] is. I personally would not create a string specifically as a buffer for this.
Here is the corrected code. I would personally not write it this way, but I don't want to impose my coding ideology on you, so I made only the changes necessary to fix it:
// From file
FILE* shiftJisFile = _tfopen(lpszShiftJs, _T("rb"));
int nLen = _filelength(fileno(shiftJisFile));
LPSTR lpszBuf = new char[nLen+1];
fread(lpszBuf, 1, nLen, shiftJisFile);
lpszBuf[nLen] = 0;
// convert multibyte to wide char
int utf16size = ::MultiByteToWideChar(932, 0, lpszBuf, -1, 0, 0);
LPWSTR pUTF16 = new WCHAR[utf16size];
::MultiByteToWideChar(932, 0, lpszBuf, -1, pUTF16, utf16size);
// convert wide char to multi byte utf-8 before writing to a file
fstream File("filepath", std::ios::out | std::ios::binary);
string result;
result.resize(WideCharToMultiByte(CP_UTF8, 0, pUTF16, -1, NULL, 0, 0, 0));
char *ptr = &result[0];
WideCharToMultiByte(CP_UTF8, 0, pUTF16, -1, ptr, result.size(), 0, 0);
File << ptr;
File.close();
Also, you have a memory leak -- lpszBuf and pUTF16 are not cleaned up.
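One way to avoid both the leak and the missing terminator is to let a standard container own the bytes. The sketch below is an assumption-laden illustration, not the answer's code: read_all_bytes and write_all_bytes are hypothetical helper names, and both open the file in binary mode as recommended above.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: reads a whole file as raw bytes. The std::string
// owns the buffer, so nothing has to be deleted by hand. Returns an empty
// string if the file cannot be opened.
std::string read_all_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Hypothetical helper: writes the bytes unchanged. Binary mode prevents
// newline translation from altering the encoded output.
bool write_all_bytes(const std::string& path, const std::string& bytes) {
    std::ofstream out(path, std::ios::out | std::ios::binary | std::ios::trunc);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(out);
}
```

With the bytes held this way, MultiByteToWideChar can be given bytes.data() and static_cast<int>(bytes.size()) instead of -1, so it never has to search for a terminator; the output is then not null-terminated either, and its length is the value returned by the call.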
You should try using std::locale to perform this conversion:
namespace fs = std::filesystem;
void convert(const fs::path inName, const fs::path outName)
{
std::wifstream in{inName};
in.imbue(std::locale{".932"}); // or "ja_JP.SJIS"
if (in) {
std::wofstream out{outName};
out.imbue(std::locale{".utf-8"});
std::wstring line;
while (getline(in, line)) {
out << line << L'\n';
}
}
}
Note that locale names are platform specific; I think I used the proper ones for Windows.
Update: I've tested this on my Windows 10 machine with MSVC 19.29.30145 and it works perfectly. I used a wiki page to get some valid Japanese text and used Notepad++ to save this text in the proper encoding (Shift-JIS).
I also used Beyond Compare to verify the results.
Note that I used a similar method for Korean and it worked nicely.
wstring str(pUTF16); - pUTF16 there does not end with zero char. It should be wstring str(pUTF16, utf16size);
I've got this piece of code:
const char * c = ...; //This pointer contains "JPG" string
//Wide char conversion
wchar_t *cc = new wchar_t[128];
MultiByteToWideChar(CP_ACP, 0, c, -1, cc, wcslen(cc));
Then I declare a wstring variable:
wstring sFilter;
sFilter.append(L"Format: ");
sFilter.append(cc);
sFilter.push_back('\0');
sFilter.append(L"*.");
sFilter.append(cc);
sFilter.push_back('\0');
sFilter.push_back('\0');
const wchar_t * extensionFilter = sFilter.c_str();
I'm building this wchar_t string to apply a filter to the GetOpenFileName function from the WinAPI: ofn.lpstrFilter = extensionFilter; where lpstrFilter is a member of the OPENFILENAME structure.
Extension filter randomly contains: "3ormat: JPG" or ":ormat: JPG"...
I cannot change project to Unicode just because the IDE I'm working on doesn't allow it. So I need to work with this.
wchar_t *cc = new wchar_t[128];
MultiByteToWideChar(CP_ACP, 0, c, -1, cc, wcslen(cc));
new[] does not fill the memory that it allocates. You are calling wcslen() on a buffer that is not guaranteed to be null-terminated. And even if it were, the null would be at the front of the buffer so wcslen() would return 0. You need to pass the actual length of the allocated buffer:
MultiByteToWideChar(CP_ACP, 0, c, -1, cc, 128);
I cannot change project to Unicode just because the IDE I'm working on doesn't allow it.
You don't need to change the whole project. That only affects TCHAR-based declarations anyway. Since your input data is ANSI, you could simply call GetOpenFileNameA() directly and not worry about converting your input data to Unicode first:
const char * c = ...; //This pointer contains "JPG" string
string sFilter;
sFilter.append("Format: ");
sFilter.append(c);
sFilter.push_back('\0');
sFilter.append("*.");
sFilter.append(c);
sFilter.push_back('\0');
sFilter.push_back('\0');
const char * extensionFilter = sFilter.c_str();
OPENFILENAMEA ofn;
...
ofn.lpstrFilter = extensionFilter;
...
GetOpenFileNameA(&ofn);
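The layout of the filter string can be checked without any Windows API. The following is a minimal sketch that builds the same double-null-terminated buffer as the code above; make_filter is a hypothetical helper name introduced only for illustration.

```cpp
#include <string>

// Hypothetical helper: builds a filter buffer laid out as
//   "Format: <ext>" '\0' "*.<ext>" '\0' '\0'
// The embedded '\0' characters are kept because they are appended with
// push_back rather than passed through a C-string constructor.
std::string make_filter(const std::string& ext) {
    std::string filter;
    filter += "Format: ";
    filter += ext;
    filter.push_back('\0');   // ends the display name
    filter += "*.";
    filter += ext;
    filter.push_back('\0');   // ends the pattern
    filter.push_back('\0');   // ends the whole list
    return filter;
}
```

The pointer returned by c_str() stays valid only while the std::string is alive and unmodified, so the string should outlive the GetOpenFileNameA() call.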
In a Visual Studio 2008 MFC project I have to handle UTF-8 strings containing the names of Arabic cities. After searching online, I wrote this little piece of code:
CString MyClass::convertString(string input) {
int l = MultiByteToWideChar(CP_UTF8, 0, input.c_str(), -1, NULL, 0);
wchar_t *str = new wchar_t[l];
int r = MultiByteToWideChar(CP_UTF8, 0, input.c_str(), -1, str, l);
CString output = str;
delete str ;
return output;
}
When I try to convert a string it remains the same, and if I print the two strings the result is identical.
What am I doing wrong?
Thanks in advance.
You don't want to convert strings to UTF-8 for display purposes. There is no UTF-8 charset that will allow you to display them correctly. If you already have them in Unicode, just keep them in Unicode. I would build your application in Unicode and avoid MBCS if you can. It makes life easier. Otherwise, for displaying those Arabic strings, you would have to convert them to the Arabic code page and then use an Arabic font/charset to display them.
Thanks for all the replies. I've found a solution: the input string was not encoded in UTF-8 (I should have checked that before posting on Stack Overflow). I then edited the code, changing the output from CString to wstring.
wstring MyClass::convertString(string input) {
int l = MultiByteToWideChar(CP_UTF8, 0, input.c_str(), -1, NULL, 0);
wchar_t *str = new wchar_t[l];
int r = MultiByteToWideChar(CP_UTF8, 0, input.c_str(), -1, str, l);
wstring output = wstring(str);
delete[] str; // allocated with new[], so it must be released with delete[]
return output;
}
Now everything works fine. Thanks.
I'm having a bit of trouble with handling unicode conversions.
The following code outputs this into my text file.
HELLO??O
std::string test = "HELLO";
std::string output;
int len = WideCharToMultiByte(CP_OEMCP, 0, (LPCWSTR)test.c_str(), -1, NULL, 0, NULL, NULL);
char *buf = new char[len];
int len2 = WideCharToMultiByte(CP_OEMCP, 0, (LPCWSTR)test.c_str(), -1, buf, len, NULL, NULL);
output = buf;
std::wofstream outfile5("C:\\temp\\log11.txt");
outfile5 << test.c_str();
outfile5 << output.c_str();
outfile5.close();
But as you can see, output is just a unicode conversion from the test variable. How is this possible?
Check whether len is correct after the first (measuring) call. In general, you should not cast test.c_str() to LPCWSTR: test is a char-based std::string, not a wchar_t-based std::wstring. You could cast it to LPCSTR (note the missing 'W'); the WinAPI distinguishes between the two. You really should be using wstring if you want to keep wide chars in it. After re-reading your code: test should be a wstring, and then you can pass test.c_str() as an LPCWSTR safely.
After reading the Microsoft wstring reference, I changed
std::string test = "HELLO";
to
std::wstring test = L"HELLO";
And the string was outputted correctly and I got
HELLOHELLO
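To see why the cast misbehaves, it can help to look at what a function expecting 16-bit units finds when it is handed the bytes of a narrow string. The sketch below copies the bytes instead of casting the pointer; view_as_utf16_units is a hypothetical helper name, and the exact unit values depend on the machine's byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: regroups the bytes of a narrow string (including its
// single terminating '\0' byte) into 16-bit units, which is roughly what an
// API expecting LPCWSTR sees when given a cast char pointer.
std::vector<std::uint16_t> view_as_utf16_units(const std::string& narrow) {
    const std::size_t byteCount = narrow.size() + 1;   // text plus '\0'
    std::vector<std::uint16_t> units(byteCount / 2);   // two bytes per unit
    if (!units.empty()) {
        std::memcpy(units.data(), narrow.c_str(),
                    units.size() * sizeof(std::uint16_t));
    }
    return units;
}
```

For "HELLO" this yields three units ("HE", "LL" and "O" plus the terminator byte), and none of them is a 16-bit zero, so the API keeps reading past the end of the string. Two unmappable units followed by an 'O' would be consistent with the ??O seen in the output.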
I am trying to convert a char string to a wchar string.
In more detail: I am trying to convert a char[] to a wchar[] first, then append " 1" to that string, and then print it.
char src[256] = "c:\\user";
wchar_t temp_src[256];
mbtowc(temp_src, src, 256);
wchar_t path[256];
StringCbPrintf(path, 256, _T("%s 1"), temp_src);
wcout << path;
But it prints just c
Is this the right way to convert from char to wchar? I have come to know of another way since. But I'd like to know why the above code works the way it does?
mbtowc converts only a single character. Did you mean to use mbstowcs?
Typically you call this function twice; the first to obtain the required buffer size, and the second to actually convert it:
#include <cstdlib> // for mbstowcs
const char* mbs = "c:\\user";
size_t requiredSize = ::mbstowcs(NULL, mbs, 0);
wchar_t* wcs = new wchar_t[requiredSize + 1];
if(::mbstowcs(wcs, mbs, requiredSize + 1) != (size_t)(-1))
{
// Do what's needed with the wcs string
}
delete[] wcs;
If you would rather use mbstowcs_s (because of deprecation warnings), then do this:
#include <cstdlib> // also for mbstowcs_s
const char* mbs = "c:\\user";
size_t requiredSize = 0;
::mbstowcs_s(&requiredSize, NULL, 0, mbs, 0);
wchar_t* wcs = new wchar_t[requiredSize + 1];
::mbstowcs_s(&requiredSize, wcs, requiredSize + 1, mbs, requiredSize);
if(requiredSize != 0)
{
// Do what's needed with the wcs string
}
delete[] wcs;
Make sure you take care of locale issues via setlocale() or by using the versions of mbstowcs() (such as mbstowcs_l() or mbstowcs_s_l()) that take a locale argument.
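The single-character behaviour of mbtowc described at the start of this answer can be observed directly. This is a small sketch under the default "C" locale; convert_first_char is a hypothetical helper name, and the buffer is zero-filled here (unlike in the question) so that the untouched slots are visible as terminators.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: performs exactly one mbtowc call on src and returns
// whatever ended up in the destination buffer.
std::wstring convert_first_char(const char* src) {
    wchar_t buf[16] = {};                 // zero-filled destination
    std::mbtowc(nullptr, nullptr, 0);     // reset any conversion state
    // The third argument limits how many bytes of src may be examined for
    // one character; it is not the size of the destination buffer.
    const int consumed = std::mbtowc(buf, src, MB_CUR_MAX);
    if (consumed <= 0) return std::wstring();
    return std::wstring(buf);             // only buf[0] was written
}
```

In the question's code the destination array is not initialised, so everything after the first character is indeterminate; seeing just "c" suggests the next slot happened to hold a zero.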
Why are you using C code, and why not write it in a more portable way? For example, what I would do here is use the STL:
std::string src = std::string("C:\\user") +
std::string(" 1");
std::wstring dne = std::wstring(src.begin(), src.end());
wcout << dne;
it's so simple it's easy :D
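One caveat about the iterator-range constructor used in this answer: it widens each char individually and performs no decoding, so it is only a faithful conversion for plain ASCII text. The sketch below illustrates this; widen_per_char is a hypothetical helper name.

```cpp
#include <string>

// Hypothetical helper: the same construction as in the answer, wrapped in a
// function. Each char becomes exactly one wchar_t; multi-byte sequences are
// not combined into single characters.
std::wstring widen_per_char(const std::string& narrow) {
    return std::wstring(narrow.begin(), narrow.end());
}
```

For "C:\\user 1" the result matches the wide literal, but a two-byte UTF-8 sequence such as "\xC3\xA9" comes out as two wide characters instead of one, so a real conversion function is still needed for non-ASCII input.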
L"Hello World"
The prefix L in front of the string literal makes it a wide-character string.