Convert TCHAR * -> std::wstring in both unicode and non-unicode environments

I have some code in a library which has to internally work with wstring, that's all nice and fine. But it's called with a TCHAR string parameter, from both unicode and non-unicode projects, and I'm having trouble finding a neat conversion for both cases.
I see some ATL conversions and so on but can't see the right way, without defining multiple code paths using #define

Assuming TCHAR expands to wchar_t in Unicode builds:
#include <windows.h>
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Unicode builds: TCHAR is wchar_t, so the string is already wide.
inline std::wstring convert2widestr(const wchar_t* const psz)
{
    return psz;
}

// Non-Unicode builds: TCHAR is char, so widen from the ANSI code page.
inline std::wstring convert2widestr(const char* const psz)
{
    const std::size_t srclen = std::strlen(psz);
    if( srclen == 0 ) return std::wstring();
    // First call: query the required size in wide characters.
    const int len = MultiByteToWideChar( CP_ACP
                                       , 0
                                       , psz
                                       , static_cast<int>(srclen)
                                       , NULL
                                       , 0
                                       );
    if( len <= 0 ) return std::wstring();
    std::vector<wchar_t> result( len );
    // Second call: convert into the buffer.
    const int cchars = MultiByteToWideChar( CP_ACP
                                          , 0
                                          , psz
                                          , static_cast<int>(srclen)
                                          , &result[0]
                                          , static_cast<int>(result.size())
                                          );
    assert(cchars);
    return std::wstring( result.begin(), result.begin() + cchars );
}
Use like this:
void f(const TCHAR* psz)
{
std::wstring str = convert2widestr(psz);
// ...
}

Related

Registry RegQueryValueExW

Is it possible to read a value in the Registry not to an array of chars but directly to an AnsiString in this case?
LONG result;
wchar_t buf[255] = {0};
DWORD dwBufSize = sizeof(buf);
String d = "Nazwa";
DWORD dwType = REG_SZ;
result = ::RegQueryValueExW( hkSoftware, (LPCWSTR)(d.c_str()), NULL, &dwType, (LPBYTE)&buf, &dwBufSize );
First off, your code example is not using AnsiString. In C++Builder 2009 and later, String is an alias for UnicodeString instead.
And yes, you can use UnicodeString with RegQueryValueExW(), without using a typecast. UnicodeString::c_str() returns a WideChar*, and WideChar is an alias for wchar_t on Windows, so WideChar* (aka wchar_t *) is implicitly convertible to LPCWSTR (aka const wchar_t *), eg:
LONG result;
wchar_t buf[255] = {0};
DWORD dwBufSize = sizeof(buf);
UnicodeString d = L"Nazwa";
DWORD dwType = REG_SZ;
result = ::RegQueryValueExW( hkSoftware, d.c_str(), NULL, &dwType, reinterpret_cast<LPBYTE>(&buf), &dwBufSize );
You can also use UnicodeString as a buffer to receive string data from RegQueryValueExW(), eg:
LONG result;
UnicodeString buf;
buf.SetLength(...);
DWORD dwBufSize = ByteLength(buf);
UnicodeString d = L"Nazwa";
DWORD dwType = REG_SZ;
result = ::RegQueryValueExW( hkSoftware, d.c_str(), NULL, &dwType, reinterpret_cast<LPBYTE>(buf.c_str()), &dwBufSize );
if ( result == 0 ) {
buf.SetLength(dwBufSize/sizeof(WideChar));
...
}
That being said, you should consider using C++Builder's TRegistry class instead of using the Win32 Registry API directly. TRegistry has many methods for reading different kinds of data, including ReadString() for String data, eg:
#include <Registry.hpp>
TRegistry *Reg = new TRegistry;
String buf;
Reg->RootKey = ...;
if (Reg->OpenKeyReadOnly(_D("...")))
{
buf = Reg->ReadString(_D("Nazwa"));
Reg->CloseKey();
}
delete Reg;

Create StartMenu Entry

I am trying to create a link to a file in the Start Menu folder. My code:
bool createStartMenuEntry(string targetPath, string name){
std::wstring stemp = s2ws(targetPath);
LPCWSTR target = stemp.c_str();
WCHAR startMenuPath[MAX_PATH];
HRESULT result = SHGetFolderPathW(NULL, CSIDL_COMMON_PROGRAMS, NULL, 0, startMenuPath);
if (SUCCEEDED(result)) {
std::wstring linkPath = std::wstring(startMenuPath) + s2ws(name);
LPCWSTR link = linkPath.c_str();
//TEST MESSAGE!!!
MessageBox(NULL, LPCSTR(target), LPCSTR(link), MB_ICONWARNING);
CoInitialize(NULL);
IShellLinkW* shellLink = NULL;
result = CoCreateInstance(CLSID_ShellLink, NULL, CLSCTX_ALL, IID_IShellLinkW, (void**)&shellLink);
if (SUCCEEDED(result)) {
shellLink->SetPath(target);
//shellLink->SetDescription(L"Shortcut Description");
shellLink->SetIconLocation(target, 0);
IPersistFile* persistFile;
result = shellLink->QueryInterface(IID_IPersistFile, (void**)&persistFile);
if (SUCCEEDED(result)) {
result = persistFile->Save(link, TRUE);
persistFile->Release();
}
else {
return false;
}
shellLink->Release();
}
else {
return false;
}
}
else {
return false;
}
return true;
}
String to widestring conversion:
std::wstring s2ws(const std::string& s)
{
int len;
int slength = (int)s.length() + 1;
len = MultiByteToWideChar(CP_ACP, 0, s.c_str(), slength, 0, 0);
wchar_t* buf = new wchar_t[len];
MultiByteToWideChar(CP_ACP, 0, s.c_str(), slength, buf, len);
std::wstring r(buf);
delete[] buf;
return r;
}
When I call my function like createStartMenuEntry("E:\\file.exe", "File"), the test message shows only the first letters of the path and the shortcut isn't created. I think the problem is in the Unicode conversion.
There are multiple problems here:
MessageBox(NULL, LPCSTR(target), LPCSTR(link), MB_ICONWARNING); is all kinds of wrong. You should not be casting strings like this. If you are compiling without UNICODE defined, you must use MessageBoxW() to display a LPCWSTR string. You get a single character because "c:\\" as a Unicode string is 'c',0,':',0,'\\',0,0,0 in memory, and that is the same as a "c" string when treated as a narrow ANSI string.
You ignore the result of persistFile->Save()! You also ignore the results of SetPath() and SetIconLocation().
A normal user cannot write to CSIDL_COMMON_PROGRAMS, only administrators have write access to that folder, because it is shared by all users. If you are not planning to require UAC elevation, you must write to CSIDL_PROGRAMS instead.
You should not use std::string to store paths, only std::wstring and WCHAR*/LP[C]WSTR, because paths that contain certain Unicode characters cannot be represented in a narrow ANSI string.
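The single-character symptom described in the first point can be reproduced without any Windows API. The following is only a sketch: it assumes a little-endian machine (as on x86/x64 Windows) and uses char16_t as a portable stand-in for a 16-bit Windows wchar_t. The function name narrowLengthOfWide is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: reads a UTF-16 buffer the way a narrow (ANSI) API
// would, i.e. as plain bytes up to the first zero byte, and returns the
// length such an API would see.
std::size_t narrowLengthOfWide(const char16_t* wide)
{
    assert(wide != nullptr);
    // Inspecting any object through a char pointer is allowed in C++.
    const char* bytes = reinterpret_cast<const char*>(wide);
    // On a little-endian machine u'c' is stored as 0x63 0x00, so the scan
    // stops right after the first character.
    return std::strlen(bytes);
}
```

Calling narrowLengthOfWide(u"c:\\") on such a machine reports a length of 1, which is why the message box shows only the first letter when a wide string is cast to LPCSTR.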

How to declare wchar_t and set its string value later on?

I am developing for Windows, I have not found adequate information on how to correctly declare and later on set a unicode string. So far,
wchar_t myString[1024] = L"My Test Unicode String!";
What I assume the above does is [1024] is the allocated string length of how many characters I need to have max in that string. L"" makes sure the string in quotes is unicode (An alt I found is _T()). Now later on in my program when I am trying to set that string to another value by,
myString = L"Another text";
I get compiler errors, what am I doing wrong?
Also if anyone has an easy and in-depth unicode app resource I'd like to have some links, used to have bookmarked a website which was dedicated to that but seems that now is gone.
EDIT
I provide the entire code, I intend to use this as a DLL function but nothing so far is returned.
#include "dll.h"
#include <windows.h>
#include <string>
#include <cwchar>
export LPCSTR ex_test()
{
wchar_t myUString[1024];
std::wcsncpy(myUString, L"Another text", 1024);
int myUStringLength = lstrlenW(myUString);
MessageBoxW(NULL, (LPCWSTR)myUString, L"Test", MB_OK);
int bufferLength = WideCharToMultiByte(CP_UTF8, 0, myUString, myUStringLength, NULL, 0, NULL, NULL);
if (bufferLength <= 0) { return NULL; } //ERROR in WideCharToMultiByte
return NULL;
char *buffer = new char[bufferLength+1];
bufferLength = WideCharToMultiByte(CP_UTF8, 0, myUString, myUStringLength, buffer, bufferLength, NULL, NULL);
if (bufferLength <= 0) { delete[] buffer; return NULL; } //ERROR in WideCharToMultiByte
buffer[bufferLength] = 0;
return buffer;
}
The easiest approach is to declare the string differently in the first place:
std::wstring myString;
myString = L"Another text";
If you insist on using arrays of wchar_t directly, you'd use wcscpy() or, better, wcsncpy() from <cwchar>:
wchar_t myString[1024];
std::wcsncpy(myString, L"Another text", 1024);
wchar_t myString[1024] = L"My Test Unicode String!";
is initializing the array like this
wchar_t myString[1024] = { 'M', 'y', ' ', ..., 'n', 'g', '!', '\0' };
but
myString = L"Another text";
is an assignment, which you cannot do with arrays. You have to copy the contents of the new string into your existing array:
const auto& newstring = L"Another text";
std::copy(std::begin(newstring), std::end(newstring), myString);
or, if it is a pointer:
const wchar_t* newstring = L"Another text";
std::copy(newstring, newstring + std::wcslen(newstring) + 1, myString);
or, as Nawaz suggested, with copy_n:
std::copy_n(newstring, std::wcslen(newstring) + 1, myString);
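One caveat with the fixed-size array approaches: wcsncpy() does not write a terminating null when the source is as long as or longer than the count, and the std::copy variants do no bounds checking at all. A possible way to keep the array safe is a small wrapper like the one below. This is a sketch, and the name assignWide is hypothetical, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical helper: copies src into a fixed-size wchar_t array and
// always leaves the result null-terminated, truncating if necessary.
template <std::size_t N>
void assignWide(wchar_t (&dest)[N], const wchar_t* src)
{
    static_assert(N > 0, "destination must have room for the terminator");
    assert(src != nullptr);
    // Copies at most N-1 characters; pads with L'\0' if src is shorter.
    std::wcsncpy(dest, src, N - 1);
    // wcsncpy leaves no terminator when src has N-1 or more characters,
    // so write one explicitly in the last slot.
    dest[N - 1] = L'\0';
}
```

With wchar_t myString[1024], a call such as assignWide(myString, L"Another text") replaces the failed assignment. The array size is deduced from the argument, so it cannot get out of sync with the declaration.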

How to convert from wchar_t to LPSTR?

How can I convert a string from wchar_t to LPSTR?
On Windows, a wchar_t string is made of 16-bit units; an LPSTR is a pointer to a string of octets, defined like this:
typedef char* PSTR, *LPSTR;
By convention, an LPSTR points to a null-terminated string of 8-bit characters.
When translating from wchar_t to LPSTR, you have to decide on an encoding to use. Once you have done that, you can use the WideCharToMultiByte function to perform the conversion.
For instance, here's how to translate a wide-character string into UTF8, using STL strings to simplify memory management:
#include <windows.h>
#include <string>
#include <vector>
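static std::string utf16ToUTF8( const std::wstring &s )
{
    // First call: query the size in bytes, including the terminating null.
    const int size = ::WideCharToMultiByte( CP_UTF8, 0, s.c_str(), -1, NULL, 0, NULL, NULL );
    if ( size <= 0 ) return std::string(); // conversion failed
    std::vector<char> buf( size );
    // Second call: convert into the buffer.
    ::WideCharToMultiByte( CP_UTF8, 0, s.c_str(), -1, &buf[0], size, NULL, NULL );
    return std::string( &buf[0] );
}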
You could use this function to translate a wchar_t* to LPSTR like this:
const wchar_t *str = L"Hello, World!";
std::string utf8String = utf16ToUTF8( str );
LPSTR lpStr = &utf8String[0]; // c_str() returns const char*, which does not convert to LPSTR; valid only while utf8String is alive
I use this:
std::wstring mywstr( somewstring );
std::string mycstr( mywstr.begin(), mywstr.end() );
and then use it as mycstr.c_str().
(Edit, since I cannot comment:) This is how I used it, and it works fine:
#include <string>
std::wstring mywstr(ffd.cFileName);
std::string mycstr(mywstr.begin(), mywstr.end());
pRequest->Write(mycstr.c_str());
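Note that constructing a std::string from a wstring's iterator range performs no encoding conversion: each wchar_t is converted to a char on its own. That is harmless for plain ASCII text such as many file names, but any character above 0x7F is typically reduced to its low byte and the original text is lost. The sketch below illustrates this. Both function names are hypothetical, and isAscii is just one possible guard before taking the shortcut.

```cpp
#include <cassert>
#include <string>

// Hypothetical wrapper around the iterator-range construction shown above.
std::string narrowByTruncation(const std::wstring& w)
{
    // Each wchar_t is converted to char individually; no code page is involved.
    std::string out(w.begin(), w.end());
    assert(out.size() == w.size()); // always one char per wide character
    return out;
}

// Hypothetical guard: true when every character is 7-bit ASCII, i.e. when
// the truncating conversion above cannot lose information.
bool isAscii(const std::wstring& w)
{
    for (wchar_t ch : w)
    {
        if (static_cast<unsigned long>(ch) > 0x7Ful)
            return false;
    }
    return true;
}
```

For text that may contain non-ASCII characters, a real conversion such as WideCharToMultiByte with an explicit code page is the safer choice.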

How do I use MultiByteToWideChar?

I want to convert a normal string to a wstring. For this, I am trying to use the Windows API function MultiByteToWideChar.
But it does not work for me.
Here is what I have done:
string x = "This is c++ not java";
wstring Wstring;
MultiByteToWideChar( CP_UTF8 , 0 , x.c_str() , x.size() , &Wstring , 0 );
The last line produces the compiler error:
'MultiByteToWideChar' : cannot convert parameter 5 from 'std::wstring *' to 'LPWSTR'
How do I fix this error?
Also, what should be the value of the argument cchWideChar? Is 0 okay?
You must call MultiByteToWideChar twice:
The first call to MultiByteToWideChar is used to find the buffer size you need for the wide string. Look at Microsoft's documentation; it states:
If the function succeeds and cchWideChar is 0, the return value is the required size, in characters, for the buffer indicated by lpWideCharStr.
Thus, to make MultiByteToWideChar give you the required size, pass 0 as the value of the last parameter, cchWideChar. You should also pass NULL as the one before it, lpWideCharStr.
Obtain a non-const buffer large enough to accommodate the wide string, using the buffer size from the previous step. Pass this buffer to another call to MultiByteToWideChar. And this time, the last argument should be the actual size of the buffer, not 0.
A sketchy example:
int wchars_num = MultiByteToWideChar( CP_UTF8 , 0 , x.c_str() , -1, NULL , 0 );
wchar_t* wstr = new wchar_t[wchars_num];
MultiByteToWideChar( CP_UTF8 , 0 , x.c_str() , -1, wstr , wchars_num );
// do whatever with wstr
delete[] wstr;
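Also, note the use of -1 as the cbMultiByte argument. It tells the function to process the whole input including its terminating null character, so the resulting wide string is null-terminated and the returned count already includes the terminator.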
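The same "ask for the size, then convert" shape can be tried outside Windows with the standard library function std::mbstowcs. To be clear, this is a portable analogue and not MultiByteToWideChar: it converts according to the current C locale rather than a chosen code page. It also relies on the null-destination size query, which POSIX specifies and MSVC supports but the C++ standard itself does not guarantee. The helper name widenWithCLocale is made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: widens a multibyte string using the current C locale.
// Returns an empty string if the input is not valid in that locale.
std::wstring widenWithCLocale(const std::string& narrow)
{
    // Call 1: a null destination only asks for the required length,
    // in wide characters, not counting the terminator.
    const std::size_t needed = std::mbstowcs(nullptr, narrow.c_str(), 0);
    if (needed == static_cast<std::size_t>(-1))
        return std::wstring();

    // Call 2: convert into a buffer with room for the text plus terminator.
    std::wstring wide(needed + 1, L'\0');
    const std::size_t written = std::mbstowcs(&wide[0], narrow.c_str(), wide.size());
    if (written == static_cast<std::size_t>(-1))
        return std::wstring();
    assert(written == needed);

    wide.resize(written); // drop the slot that held the terminator
    return wide;
}
```

Using std::wstring as the buffer also avoids the manual new[] / delete[] pair from the example above.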
A few common conversions:
#define WIN32_LEAN_AND_MEAN
#include <Windows.h>
#include <string>
std::string ConvertWideToANSI(const std::wstring& wstr)
{
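// Pass the same source length to both calls; using -1 in the second call would also write a terminating null that the buffer has no room for.
int count = WideCharToMultiByte(CP_ACP, 0, wstr.c_str(), static_cast<int>(wstr.length()), NULL, 0, NULL, NULL);
std::string str(count, 0);
WideCharToMultiByte(CP_ACP, 0, wstr.c_str(), static_cast<int>(wstr.length()), &str[0], count, NULL, NULL);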
return str;
}
std::wstring ConvertAnsiToWide(const std::string& str)
{
int count = MultiByteToWideChar(CP_ACP, 0, str.c_str(), str.length(), NULL, 0);
std::wstring wstr(count, 0);
MultiByteToWideChar(CP_ACP, 0, str.c_str(), str.length(), &wstr[0], count);
return wstr;
}
std::string ConvertWideToUtf8(const std::wstring& wstr)
{
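// Pass the same source length to both calls; using -1 in the second call would also write a terminating null that the buffer has no room for.
int count = WideCharToMultiByte(CP_UTF8, 0, wstr.c_str(), static_cast<int>(wstr.length()), NULL, 0, NULL, NULL);
std::string str(count, 0);
WideCharToMultiByte(CP_UTF8, 0, wstr.c_str(), static_cast<int>(wstr.length()), &str[0], count, NULL, NULL);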
return str;
}
std::wstring ConvertUtf8ToWide(const std::string& str)
{
int count = MultiByteToWideChar(CP_UTF8, 0, str.c_str(), str.length(), NULL, 0);
std::wstring wstr(count, 0);
MultiByteToWideChar(CP_UTF8, 0, str.c_str(), str.length(), &wstr[0], count);
return wstr;
}
You can try the solution below. I tested it: it works, handles special characters (for example º ä ç á), and runs on Windows 2000 with SP4 and later, Windows XP, and Windows 7, 8, 8.1 and 10.
Using std::wstring instead of new wchar_t[] / delete[] reduces the risk of resource leaks, buffer overflows and heap corruption.
dwFlags is set to MB_ERR_INVALID_CHARS so that invalid input makes the call fail. On Windows 2000 with SP4 and later and on Windows XP, if this flag is not set, the function silently drops illegal code points.
std::wstring ConvertStringToWstring(const std::string &str)
{
if (str.empty())
{
return std::wstring();
}
int num_chars = MultiByteToWideChar(CP_ACP, MB_ERR_INVALID_CHARS, str.c_str(), str.length(), NULL, 0);
std::wstring wstrTo;
if (num_chars)
{
wstrTo.resize(num_chars);
if (MultiByteToWideChar(CP_ACP, MB_ERR_INVALID_CHARS, str.c_str(), str.length(), &wstrTo[0], num_chars))
{
return wstrTo;
}
}
return std::wstring();
}
Second question about this, this morning!
WideCharToMultiByte() and MultiByteToWideChar() are a pain to use. Each conversion requires two calls to the routines and you have to look after allocating/freeing memory and making sure the strings are correctly terminated. You need a wrapper!
I have a convenient C++ wrapper on my blog, here, which you are welcome to use.
Here's the other question this morning
The function cannot take a pointer to a C++ string. It expects a pointer to a buffer of wide characters of sufficient size; you must allocate this buffer yourself.
string x = "This is c++ not java";
wstring Wstring;
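Wstring.resize(x.size()); // UTF-16 output never needs more units than the UTF-8 input has bytes
int c = MultiByteToWideChar( CP_UTF8 , 0 , x.c_str() , (int)x.size() , &Wstring[0], (int)Wstring.size() ); // the last argument is the buffer size; 0 would only query the size
Wstring.resize(c); // trim to the number of wide characters actually written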