I am trying to convert a char string to a wchar string.
In more detail: I am trying to convert a char[] to a wchar_t[] first, then append " 1" to that string, and then print it.
char src[256] = "c:\\user";
wchar_t temp_src[256];
mbtowc(temp_src, src, 256);
wchar_t path[256];
StringCbPrintf(path, 256, _T("%s 1"), temp_src);
wcout << path;
But it prints just c
Is this the right way to convert from char to wchar? I have come to know of another way since. But I'd like to know why the above code works the way it does?
mbtowc converts only a single character. Did you mean to use mbstowcs?
Typically you call this function twice; the first to obtain the required buffer size, and the second to actually convert it:
#include <cstdlib> // for mbstowcs
const char* mbs = "c:\\user";
size_t requiredSize = ::mbstowcs(NULL, mbs, 0);
wchar_t* wcs = new wchar_t[requiredSize + 1];
if(::mbstowcs(wcs, mbs, requiredSize + 1) != (size_t)(-1))
{
// Do what's needed with the wcs string
}
delete[] wcs;
If you would rather use mbstowcs_s (because of deprecation warnings), then do this:
#include <cstdlib> // also for mbstowcs_s
const char* mbs = "c:\\user";
size_t requiredSize = 0;
::mbstowcs_s(&requiredSize, NULL, 0, mbs, 0);
wchar_t* wcs = new wchar_t[requiredSize + 1];
::mbstowcs_s(&requiredSize, wcs, requiredSize + 1, mbs, requiredSize);
if(requiredSize != 0)
{
// Do what's needed with the wcs string
}
delete[] wcs;
Make sure you take care of locale issues via setlocale() or by using the versions of mbstowcs() (such as mbstowcs_l() or mbstowcs_s_l()) that take a locale argument.
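The two-call pattern above can also be wrapped so that the buffer is owned by a container instead of new/delete. This is only a sketch: the helper name widen is made up for illustration, and it assumes the platform returns the required length when the destination is null (POSIX and MSVC do, but standard C does not promise it).

```cpp
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of the answer above): converts a multibyte
// string to std::wstring using the current C locale.
std::wstring widen(const char* mbs)
{
    // First call: ask for the required length, not counting the terminator.
    const std::size_t len = std::mbstowcs(nullptr, mbs, 0);
    if (len == static_cast<std::size_t>(-1))
        throw std::runtime_error("invalid multibyte sequence");

    // Second call: convert into a buffer with room for the terminator.
    std::vector<wchar_t> buf(len + 1);
    std::mbstowcs(buf.data(), mbs, buf.size());
    return std::wstring(buf.data(), len);
}
```

With that, the original goal becomes something like std::wcout << widen(src) + L" 1";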
Why are you using C code instead of writing it in a more portable way? For example, what I would do here is use the STL! (Note that this widens each char as-is, so it is only correct for plain ASCII text.)
std::string src = std::string("C:\\user") +
std::string(" 1");
std::wstring dne = std::wstring(src.begin(), src.end());
wcout << dne;
it's so simple it's easy :D
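One caveat with the iterator-range constructor: it copies each char into a wchar_t without decoding anything, so it is only a real conversion for 7-bit ASCII. A minimal sketch that makes the assumption explicit (widen_ascii is a hypothetical name, not something from the answer):

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: widens a string byte by byte, which is only a correct
// conversion when every byte is plain ASCII.
std::wstring widen_ascii(const std::string& src)
{
    for (unsigned char ch : src)
        if (ch > 0x7F)
            throw std::invalid_argument("non-ASCII byte in input");
    // Each char is copied into one wchar_t; no decoding takes place.
    return std::wstring(src.begin(), src.end());
}
```

For anything beyond ASCII, a locale-aware function such as mbstowcs is the safer route.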
L"Hello World"
the prefix L in front of the string makes it a wide char string.
After getting a struct from C# to C++ using C++/CLI:
public value struct SampleObject
{
LPWSTR a;
};
I want to print its instance:
printf(sampleObject->a);
but I got this error:
Error 1 error C2664: 'printf' : cannot convert parameter 1 from
'LPWSTR' to 'const char *'
How can I convert from LPWSTR to char*?
Thanks in advance.
Use the wcstombs() function, which is located in <stdlib.h>. Here's how to use it:
LPCWSTR wideStr = L"Some message"; // string literals are const, so use LPCWSTR
char buffer[500];
// First arg is the pointer to destination char, second arg is
// the pointer to source wchar_t, last arg is the size of char buffer
wcstombs(buffer, wideStr, 500);
printf("%s", buffer);
Hope this helped someone! This function saved me from a lot of frustration.
Just use printf("%ls", sampleObject->a). The use of l in %ls means that you can pass a wchar_t[] such as L"Wide String".
(No, I don't know why the L and w prefixes are mixed all the time)
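To see what %ls does without printing, the same conversion can be captured with snprintf. This is a sketch with assumptions: format_wide is a made-up helper, the 256-byte buffer is arbitrary, and the conversion uses the current C locale, so characters the locale cannot represent make the call fail.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats a wide string into a narrow one via "%ls".
std::string format_wide(const wchar_t* ws)
{
    char buf[256]; // arbitrary size chosen for the sketch; longer text is truncated
    const int n = std::snprintf(buf, sizeof buf, "%ls", ws);
    if (n < 0)
        return std::string(); // a character could not be converted in this locale
    return std::string(buf);
}
```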
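// cp is the target code page, e.g. CP_ACP or CP_UTF8
int length = WideCharToMultiByte(cp, 0, sampleObject->a, -1, NULL, 0, NULL, NULL);
char* output = new char[length];
WideCharToMultiByte(cp, 0, sampleObject->a, -1, output, length, NULL, NULL);
printf("%s", output); // never pass converted data as the format string
delete[] output;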
Use the WideCharToMultiByte() function to convert a wide-character string to a multibyte string.
Here is an example of converting from LPWSTR to char* (wide character to narrow character).
/* LPWSTR to char* example.c */
#include <stdio.h>
#include <string.h>
#include <windows.h>
void LPWSTR_2_CHAR(LPCWSTR,LPSTR,size_t);
int main(void)
{
wchar_t w_char_str[] = L"This is wide character string test!";
/* One byte per wide character, terminator included (enough for ASCII text) */
char char_str[sizeof w_char_str / sizeof w_char_str[0]];
memset(char_str,'\0',sizeof char_str);
LPWSTR_2_CHAR(w_char_str,char_str,sizeof char_str);
puts(char_str);
return 0;
}
void LPWSTR_2_CHAR(LPCWSTR in_char,LPSTR out_char,size_t out_size)
{
/* -1 converts up to and including the terminating null, so out_size must leave room for it */
WideCharToMultiByte(CP_ACP,WC_COMPOSITECHECK,in_char,-1,out_char,(int)out_size,NULL,NULL);
}
Here is a Simple Solution. Check wsprintf
LPCWSTR wideStr = L"some text"; // wide literals need the L prefix
char* resultStr = new char [wcslen(wideStr) + 1];
wsprintfA ( resultStr, "%S", wideStr);
The "%S" will implicitly convert UNICODE to ANSI.
Don't convert.
Use wprintf instead of printf:
wprintf
See the examples, which explain how to use it.
Alternatively, you can use std::wcout as:
const wchar_t *wstr1 = L"string";
LPCWSTR wstr2 = L"string"; //same as above
std::wcout << wstr1 << L", " << wstr2;
Similarly, use functions which are designed for wide-char, and forget the idea of converting wchar_t to char, as it may lose data.
Have a look at the functions which deal with wide-char here:
Unicode in Visual C++
I'm trying to copy a CString to a char* using memcpy() and I have difficulties doing it. In fact, only the first character is copied. Here is my code:
CString str = _T("something");
char* buff = new char();
memcpy(buff, str, str.GetLength() + 1);
After this, all that buff contains is the letter s.
You probably are mixing ASCII and Unicode strings. If compiling with Unicode setting, then CString stores a Unicode string (two bytes per character, in your case each second byte is 0 and thus looks like an ASCII string terminator).
If you want all ASCII:
CStringA str = "something";
char* buff = new char[str.GetLength()+1];
memcpy(buff, (LPCSTR)str, str.GetLength() + 1);
If you want all Unicode:
CStringW str = L"something";
wchar_t* buff = new wchar_t[str.GetLength()+1];
memcpy(buff, (LPCWSTR)str, sizeof(wchar_t)*(str.GetLength() + 1));
If you want it working on both settings:
CString str = _T("something");
TCHAR* buff = new TCHAR[str.GetLength()+1];
memcpy(buff, (LPCTSTR)str, sizeof(TCHAR) * (str.GetLength() + 1));
If you want to convert a Unicode string to an ASCII string:
CString str = _T("something");
char* buff = new char[str.GetLength()+1];
memcpy(buff, (LPCSTR)CT2A(str), str.GetLength() + 1);
Please also note the casts from str to LPCSTR, LPCWSTR or LPCTSTR and the corrected buffer allocation (you need room for multiple characters, not only one).
Also, I am not quite sure if this is really what you need. A strdup for example looks much simpler than a new + memcpy.
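The "each second byte is 0" effect described above can be reproduced without MFC. The sketch below copies the raw bytes of a UTF-16 string into a char buffer and measures it with strlen; narrow_view_length is a hypothetical name, and char16_t stands in for the two-byte characters a Unicode-build CString holds.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies the raw bytes of a UTF-16 string into a char
// buffer and reports how long that buffer looks to strlen().
std::size_t narrow_view_length(const char16_t* s)
{
    std::size_t units = 0;
    while (s[units] != 0)
        ++units;

    // Raw bytes, including the two-byte terminator.
    std::vector<char> bytes((units + 1) * sizeof(char16_t));
    std::memcpy(bytes.data(), s, bytes.size());

    // For ASCII text every other byte is zero, so strlen stops almost at once.
    return std::strlen(bytes.data());
}
```

On a little-endian machine, narrow_view_length(u"something") sees only the byte for 's' before hitting a zero, which matches the single letter the question reports.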
You have only allocated memory to hold a char variable. To do what you intend, you need to allocate enough memory to hold the complete string.
CString str = _T("something");
LPTSTR buff = new TCHAR[str.GetLength()+1]; //allocate sufficient memory (new[] counts elements, not bytes)
memcpy(buff, (LPCTSTR)str, (str.GetLength() + 1) * sizeof(TCHAR)); //memcpy counts bytes
You are
Only allocating one char, which won't be enough unless the CString is empty, and
copying the CString instance instead of the string it represents.
Try
CString str = _T("something");
int size = str.GetLength() + 1;
char* buff = new char[size];
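memcpy(buff, (LPCSTR)CT2A(str), size); // CT2A yields narrow characters even in a Unicode build, where GetBuffer() would return wchar_t*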
How can I compare a wstring, such as L"Hello", to a string? If I need to have the same type, how can I convert them into the same type?
Since you asked, here's my standard conversion functions from string to wide string, implemented using C++ std::string and std::wstring classes.
First off, make sure to start your program with setlocale:
#include <clocale>
int main()
{
std::setlocale(LC_CTYPE, ""); // before any string operations
}
Now for the functions. First off, getting a wide string from a narrow string:
#include <string>
#include <vector>
#include <cassert>
#include <cstdlib>
#include <cwchar>
#include <cerrno>
#include <iostream> // for std::cout in the error paths
// Dummy overload
std::wstring get_wstring(const std::wstring & s)
{
return s;
}
// Real worker
std::wstring get_wstring(const std::string & s)
{
const char * cs = s.c_str();
const size_t wn = std::mbsrtowcs(NULL, &cs, 0, NULL);
if (wn == size_t(-1))
{
std::cout << "Error in mbsrtowcs(): " << errno << std::endl;
return L"";
}
std::vector<wchar_t> buf(wn + 1);
const size_t wn_again = std::mbsrtowcs(buf.data(), &cs, wn + 1, NULL);
if (wn_again == size_t(-1))
{
std::cout << "Error in mbsrtowcs(): " << errno << std::endl;
return L"";
}
assert(cs == NULL); // successful conversion
return std::wstring(buf.data(), wn);
}
And going back, making a narrow string from a wide string. I call the narrow string "locale string", because it is in a platform-dependent encoding depending on the current locale:
// Dummy
std::string get_locale_string(const std::string & s)
{
return s;
}
// Real worker
std::string get_locale_string(const std::wstring & s)
{
const wchar_t * cs = s.c_str();
const size_t wn = std::wcsrtombs(NULL, &cs, 0, NULL);
if (wn == size_t(-1))
{
std::cout << "Error in wcsrtombs(): " << errno << std::endl;
return "";
}
std::vector<char> buf(wn + 1);
const size_t wn_again = std::wcsrtombs(buf.data(), &cs, wn + 1, NULL);
if (wn_again == size_t(-1))
{
std::cout << "Error in wcsrtombs(): " << errno << std::endl;
return "";
}
assert(cs == NULL); // successful conversion
return std::string(buf.data(), wn);
}
Some notes:
If you don't have std::vector::data(), you can say &buf[0] instead.
I've found that the r-style conversion functions mbsrtowcs and wcsrtombs don't work properly on Windows. There, you can use the mbstowcs and wcstombs instead: mbstowcs(buf.data(), cs, wn + 1);, wcstombs(buf.data(), cs, wn + 1);
In response to your question, if you want to compare two strings, you can convert both of them to wide string and then compare those. If you are reading a file from disk which has a known encoding, you should use iconv() to convert the file from your known encoding to WCHAR and then compare with the wide string.
Beware, though, that complex Unicode text may have multiple different representations as code point sequences which you may want to consider equal. If that is a possibility, you need to use a higher-level Unicode processing library (such as ICU) and normalize your strings to some common, comparable form.
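A tiny illustration of that last point, using only standard C++: the same visible character, written once as a precomposed code point and once as a base letter plus combining accent, compares unequal with operator==. The names below are made up for the example.

```cpp
#include <string>

// "é" as one precomposed code point (U+00E9).
const std::wstring precomposed = L"\u00E9";
// "é" as "e" followed by a combining acute accent (U+0301).
const std::wstring decomposed = L"e\u0301";

// Plain comparison looks only at code points, not at what the text means.
bool same_code_points(const std::wstring& a, const std::wstring& b)
{
    return a == b;
}
```

Here same_code_points(precomposed, decomposed) is false even though both render as the same letter; a normalization step (for example with ICU) is what would make them comparable.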
You should convert the char string to a wchar_t string using mbstowcs, and then compare the resulting strings. Notice that mbstowcs works on char */wchar_t *, so you'll probably need to do something like this:
std::wstring StringToWstring(const std::string & source)
{
std::wstring target(source.size()+1, L' ');
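std::size_t newLength=std::mbstowcs(&target[0], source.c_str(), target.size());
if (newLength == static_cast<std::size_t>(-1)) newLength = 0; // invalid multibyte sequence: return an empty string
target.resize(newLength);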
return target;
}
I'm not entirely sure that this usage of &target[0] is standard-conforming on older compilers; if someone has a good answer to that, please tell me in the comments. (Since C++11, std::basic_string is required to store its characters contiguously, so it is well defined there.) Also, there's an implicit assumption that the converted string won't be longer (in number of wchar_ts) than the number of chars of the original string. That seems logical, since every wide character consumes at least one byte of input, but I'm still not sure it's covered by the standard.
On the other hand, standard C gives no guaranteed way to ask mbstowcs for the size of the needed buffer (POSIX and MSVC accept a null destination for that purpose, but that is an extension), so either you go this way, or go with (better done and better defined) code from Unicode libraries (be it Windows APIs or libraries like iconv).
Still, keep in mind that comparing Unicode strings without using special functions is slippery ground: two equivalent strings may be evaluated as different when compared bitwise.
Long story short: this should work, and I think it's the most you can do with just the standard library, but how Unicode is handled is highly implementation-dependent, and I wouldn't trust it much. In general, it's better to stick with one encoding inside your application and avoid this kind of conversion unless absolutely necessary, and, if you are working with definite encodings, to use APIs that are less implementation-dependent.
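Putting the pieces together for the original question, a comparison could be sketched like this. The helper name equals is hypothetical, the conversion uses the current C locale, and the comparison is a plain code-point comparison with no normalization.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: true when the narrow string, decoded in the current
// C locale, has exactly the same wide characters as the wide string.
bool equals(const std::string& narrow, const std::wstring& wide)
{
    // A multibyte string never decodes to more wide characters than it has bytes.
    std::vector<wchar_t> buf(narrow.size() + 1);
    const std::size_t n = std::mbstowcs(buf.data(), narrow.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1))
        return false; // invalid multibyte sequence
    return std::wstring(buf.data(), n) == wide;
}
```

For example, equals("Hello", L"Hello") is true and equals("Hello", L"World") is false.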
Think twice before doing this — you might not want to compare them in the first place. If you are sure you do and you are using Windows, then convert string to wstring with MultiByteToWideChar, then compare with CompareStringEx.
If you are not using Windows, then the analogous functions are mbstowcs and wcscmp. The standard wide-character functions often behave differently under Windows; for instance, MSVC flags mbstowcs as deprecated in favor of mbstowcs_s.
The cross-platform way to work with Unicode is to use the ICU library.
Take care to use special functions for Unicode string comparison; don't do it manually. Two Unicode strings could have different characters, yet still be the same.
wstring ConvertToUnicode(const string & str)
{
UINT codePage = CP_ACP;
DWORD flags = 0;
int resultSize = MultiByteToWideChar
( codePage // CodePage
, flags // dwFlags
, str.c_str() // lpMultiByteStr
, str.length() // cbMultiByte
, NULL // lpWideCharStr
, 0 // cchWideChar
);
vector<wchar_t> result(resultSize + 1);
MultiByteToWideChar
( codePage // CodePage
, flags // dwFlags
, str.c_str() // lpMultiByteStr
, str.length() // cbMultiByte
, &result[0] // lpWideCharStr
, resultSize // cchWideChar
);
return &result[0];
}
I am using libjson, which is awesome. The only problem I have is that I need to convert a UTF-8 string (char*) to a wide char string (wchar_t*). I googled and tried 3 different libs and they ALL failed (due to missing headers).
I don't need anything fancy. Just a one way conversion. How do I do this?
If you're on Windows (which, chances are, you are, given your need for wchar_t), use the MultiByteToWideChar function (declared in windows.h), like so:
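int length = MultiByteToWideChar(CP_UTF8, 0, src, src_length, NULL, 0);
wchar_t *output_buffer = new wchar_t [length + 1];
MultiByteToWideChar(CP_UTF8, 0, src, src_length, output_buffer, length);
output_buffer[length] = L'\0'; // the output is only null-terminated if src_length covers the terminator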
Alternatively, if all you're looking for is a conversion of a multibyte string in the current locale's encoding (which is improbable for UTF-8 input, but possible), use the following (stdlib.h):
wchar_t * output_buffer = NULL;
size_t length = mbstowcs(NULL, src, 0); // required length, not counting the terminator
if(length != (size_t)-1){ // (size_t)-1 signals an invalid multibyte sequence
output_buffer = new wchar_t[length+1];
mbstowcs(output_buffer, src, length+1);
}
Hope this helps.
The code below successfully enables CreateDirectoryW() to write to C:\Users\ПетрКарасев. It is basically an easier-to-understand wrapper around the MultiByteToWideChar function mentioned by someone earlier.
std::wstring utf16_from_utf8(const std::string & utf8)
{
// Special case of empty input string
if (utf8.empty())
return std::wstring();
// Step 1: get the length (in wchar_t's) of the resulting UTF-16 string
const int utf16_length = ::MultiByteToWideChar(
CP_UTF8, // convert from UTF-8
0, // default flags
utf8.data(), // source UTF-8 string
utf8.length(), // length (in chars) of source UTF-8 string
NULL, // unused - no conversion done in this step
0 // request size of destination buffer, in wchar_t's
);
if (utf16_length == 0)
{
// Error
DWORD error = ::GetLastError();
throw std::runtime_error("MultiByteToWideChar failed, error " + std::to_string(error)); // needs <stdexcept> and <string>
}
// Step 2: allocate a properly sized destination buffer for the UTF-16 string
std::wstring utf16;
utf16.resize(utf16_length);
// Step 3: do the actual conversion from UTF-8 to UTF-16
if ( ! ::MultiByteToWideChar(
CP_UTF8, // convert from UTF-8
0, // default flags
utf8.data(), // source UTF-8 string
utf8.length(), // length (in chars) of source UTF-8 string
&utf16[0], // destination buffer
utf16.length() // size of destination buffer, in wchar_t's
) )
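{
// Conversion failed
DWORD error = ::GetLastError();
throw std::runtime_error("MultiByteToWideChar failed, error " + std::to_string(error)); // needs <stdexcept> and <string>
}
return utf16; // done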
}
Here is a piece of code I wrote. It seems to work well enough. It returns 0 on a UTF-8 error or when the value is > FFFF (which can't be held by a 16-bit wchar_t).
#include <string>
using namespace std;
wchar_t* utf8_to_wchar(const char*utf8){
wstring sz;
wchar_t c;
auto p=utf8;
while(*p!=0){
auto v=(*p);
if((v&0x80)==0){ // single-byte (ASCII) character; does not rely on char being signed
c = v;
sz+=c;
++p;
continue;
}
int shiftCount=0;
if((v&0xE0) == 0xC0){
shiftCount=1;
c = v&0x1F;
}
else if((v&0xF0) == 0xE0){
shiftCount=2;
c = v&0xF;
}
else
return 0;
++p;
while(shiftCount){
v = *p;
++p;
if((v&0xC0) != 0x80) return 0;
c<<=6;
c |= (v&0x3F);
--shiftCount;
}
sz+=c;
}
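// Copy into a heap buffer: returning sz.c_str() would dangle once sz is destroyed
wchar_t* result = new wchar_t[sz.size()+1];
sz.copy(result, sz.size());
result[sz.size()] = 0;
return result; // the caller must delete[] the result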
}
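For comparison, here is a variant sketch of the same hand-rolled decoding idea that returns its result by value and also accepts 4-byte sequences by decoding into UTF-32. The name utf8_to_utf32 and the bool/out-parameter interface are choices made for this example only, and the sketch does not reject overlong encodings or surrogate code points.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decodes UTF-8 into UTF-32 code points.
// Returns false on malformed input (bad lead byte, bad or missing
// continuation byte). Overlong forms and surrogates are not rejected.
bool utf8_to_utf32(const std::string& in, std::u32string& out)
{
    out.clear();
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i++]);
        char32_t cp;
        int trailing;
        if (lead < 0x80)                { cp = lead;        trailing = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; trailing = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; trailing = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; trailing = 3; }
        else return false;               // stray continuation or invalid lead byte

        for (; trailing > 0; --trailing) {
            if (i >= in.size()) return false;         // truncated sequence
            const unsigned char next = static_cast<unsigned char>(in[i++]);
            if ((next & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (next & 0x3F);
        }
        out.push_back(cp);
    }
    return true;
}
```

Working on unsigned char avoids depending on whether plain char is signed, and returning into a std::u32string sidesteps both the lifetime question and the FFFF limit.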
The following (untested) code shows how to convert a multibyte string in your current locale into a wide string. So if your current locale is UTF-8, then this will suit your needs.
const char * inputStr = ... // your UTF-8 input
size_t maxSize = strlen(inputStr) + 1;
wchar_t * outputWStr = new wchar_t[maxSize];
size_t result = mbstowcs(outputWStr, inputStr, maxSize);
if (result == (size_t)-1) { // mbstowcs returns (size_t)-1 on failure
cerr << "Invalid multibyte characters in input";
}
You can use setlocale() to set your locale.
I'm making a Firefox extension (nsACString is from Mozilla) but LoadLibrary expects a LPCWSTR. I googled a few options but nothing worked. I'm sort of out of my depth with strings, so any references would also be appreciated.
It depends whether your nsACString (which I'll call str) holds ASCII or UTF-8 data:
ASCII
std::vector<WCHAR> wide(str.Length()+1);
std::copy(str.BeginReading(), str.EndReading(), wide.begin());
// I don't know whether nsACString has a terminating NUL, best to be sure
wide[str.Length()] = 0;
LPCWSTR newstr = &wide[0];
UTF-8
// get length of the converted string (no NUL terminator, since an explicit length is passed)
int len = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
str.BeginReading(), str.Length(), 0, 0);
if (len == 0) panic(); // happens if input data is invalid UTF-8
// allocate enough space, plus one zero-initialized element as the terminator
std::vector<WCHAR> wide(len + 1);
// convert string
MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
str.BeginReading(), str.Length(), &wide[0], len);
LPCWSTR newstr = &wide[0];
This allocates only as much space as is needed - if you want faster code that potentially uses more memory than necessary, you can replace the first two lines with:
int len = str.Length() + 1;
This works because a conversion from UTF-8 to WCHAR never results in more characters than there were bytes of input.
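That bound can be checked by counting: every UTF-16 code unit is started by a lead byte, and only 4-byte sequences produce two units. A minimal sketch, assuming well-formed UTF-8 input (utf16_units is a made-up name):

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts the UTF-16 code units needed for well-formed
// UTF-8 input, without converting anything.
std::size_t utf16_units(const std::string& utf8)
{
    std::size_t units = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) == 0x80)
            continue;                     // continuation byte: no new code unit
        units += (byte >= 0xF0) ? 2 : 1;  // 4-byte sequences become surrogate pairs
    }
    return units;
}
```

A 1-byte sequence gives 1 unit, 2 or 3 bytes give 1 unit, and 4 bytes give 2 units, so the count never exceeds the number of input bytes.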
Firstly note: LoadLibrary need not accept a LPWSTR. Only LoadLibraryW does. You may call LoadLibraryA directly (passing a narrow LPCSTR) and it will perform the translation for you.
If you choose to do it yourself however, below is one possible example.
nsACString sFoo = ...; // Some string.
size_t len = sFoo.Length() + 1;
WCHAR *swFoo = new WCHAR[len];
int written = MultiByteToWideChar(CP_ACP, 0, sFoo.BeginReading(), len - 1, swFoo, len);
swFoo[written] = 0; // Null-terminate right after the converted characters.
...
delete [] swFoo;
nsACString a;
const char* pData;
PRUint32 iLen = NS_CStringGetData(a, &pData);