How do I convert from std::wstring to _TCHAR[]?

I'm using a library that sends me a std::wstring from one of its functions, and another library that requires a _TCHAR[] to be passed to it. How can I convert between them?

Assuming you're using a Unicode build, std::wstring::c_str() is what you need. Note that c_str() guarantees that the string it returns is null-terminated.
e.g.
void func(const wchar_t str[])
{
}
std::wstring src;
func(src.c_str());
If you're using a non-Unicode build, you'll need to convert the Unicode string to a non-Unicode string via WideCharToMultiByte.
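For example, a minimal sketch of such a conversion (the helper name to_ansi is mine; error handling is omitted for brevity):
#include <string>
#include <Windows.h>

std::string to_ansi(const std::wstring& src)
{
    if (src.empty()) return std::string();
    // First call computes the required buffer size in bytes.
    int len = WideCharToMultiByte(CP_ACP, 0, src.c_str(), (int)src.size(),
                                  NULL, 0, NULL, NULL);
    std::string result(len, '\0');
    // Second call performs the actual conversion.
    WideCharToMultiByte(CP_ACP, 0, src.c_str(), (int)src.size(),
                        &result[0], len, NULL, NULL);
    return result;
}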

As @Zach Saw said, if you build only for Unicode you can get away with std::wstring::c_str(), but conceptually it would be better to define a tstring (a typedef for std::basic_string<TCHAR>) so you can safely use this kind of string with all the Windows and library functions which expect TCHARs [1].
For additional fun you should define also all the other string-related C++ facilities for TCHARs, and create conversion functions std::string/std::wstring <=> tstring.
Fortunately, this work has already been done; see here and here.
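In outline, the idea looks like this (an illustrative sketch, not the linked implementation; takesTchars stands in for any TCHAR-based API):
#include <string>
#include <tchar.h>

typedef std::basic_string<TCHAR> tstring;

void takesTchars(const TCHAR* str)
{
    // ... some TCHAR-based library call ...
}

int main()
{
    tstring src = _T("hello");
    takesTchars(src.c_str()); // compiles in both ANSI and Unicode builds
    return 0;
}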
[1]: Actually no compiled library function can really expect a TCHAR*, since TCHARs are resolved to chars or wchar_ts at compile time, but you get the idea.

Use the ATL and MFC String Conversion Macros. This works regardless of whether you are compiling in _UNICODE or ANSI mode.
You can use these macros even if you aren’t using MFC. Just include the two ATL headers shown in this example:
#include <string>
#include <Windows.h>
#include <AtlBase.h>
#include <AtlConv.h>
int main()
{
    std::wstring myString = L"Hello, World!";

    // Here is an ATL string conversion macro:
    CW2T pszT(myString.c_str());

    // pszT is now an object which can be used anywhere a `const TCHAR*`
    // is required. For example:
    ::MessageBox(NULL, pszT, _T("Test MessageBox"), MB_OK);

    return 0;
}
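Note that the converted buffer is owned by the pszT object itself, so the const TCHAR* it yields is only valid for that object's lifetime; use it immediately (as above) rather than storing the pointer.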

Related

'HMODULE LoadLibraryA(LPCSTR)': cannot convert argument 1 from 'const _Elem *' to 'LPCSTR'

In VC++ I have a solution with two projects. Project A has a dllLoader.h and dllLoader.cpp which load a DLL with LoadLibrary, and I need to call its functions in Project B. So I copied and pasted the header and cpp file to Project B.
Project A Main.cpp
------------------
#include "../Plugin/DllLoader.h"
#include "../Plugin/Types.h"
int main()
{
    std::string str("plugin.dll");
    bool successfulLoad = LoadDll(str);
}
and here is the dllLoader in Project A (the mirror/copy in Project B gets changed along with changes here):
bool LoadDll(std::string FileName)
{
    std::wstring wFileName = std::wstring(FileName.begin(), FileName.end());
    HMODULE dllHandle1 = LoadLibrary(wFileName.c_str());
    if (dllHandle1 != NULL)
    {
        return TRUE;
    }
}
Building the project itself does not show any errors and completes successfully, but when I build the solution (which contains other projects) I get the error:
C2664 'HMODULE LoadLibraryA(LPCSTR)': cannot convert argument 1 from
'const _Elem *' to 'LPCSTR'
Your LoadDll() function takes a std::string as input, converts it (the wrong way [1]) to a std::wstring, and then passes that to LoadLibrary(). However, LoadLibrary() is not a real function; it is a preprocessor macro that expands to either LoadLibraryA() or LoadLibraryW(), depending on whether your project is configured to map TCHAR to char for ANSI or to wchar_t for UNICODE:
WINBASEAPI
__out_opt
HMODULE
WINAPI
LoadLibraryA(
    __in LPCSTR lpLibFileName
    );

WINBASEAPI
__out_opt
HMODULE
WINAPI
LoadLibraryW(
    __in LPCWSTR lpLibFileName
    );

#ifdef UNICODE
#define LoadLibrary  LoadLibraryW
#else
#define LoadLibrary  LoadLibraryA
#endif // !UNICODE
In your situation, the project that is failing to compile is configured for ANSI, thus the compiler error because you are passing a const wchar_t* to LoadLibraryA() where a const char* is expected instead.
The simplest solution is to just get rid of the conversion altogether and call LoadLibraryA() directly:
bool LoadDll(std::string FileName)
{
    HMODULE dllHandle1 = LoadLibraryA(FileName.c_str());
    ...
}
If you still want to convert the std::string to a std::wstring [1], then you should call LoadLibraryW() directly instead:
bool LoadDll(std::string FileName)
{
    std::wstring wFileName = ...;
    HMODULE dllHandle1 = LoadLibraryW(wFileName.c_str());
    ...
}
This way, your code always matches your data and is not dependent on any particular project configuration.
[1]: the correct way to convert a std::string to a std::wstring is to use a proper data conversion method, such as the Win32 MultiByteToWideChar() function, C++11's std::wstring_convert class, a 3rd party Unicode library, etc. Passing std::string iterators to std::wstring's constructor DOES NOT perform any conversion; it simply widens the char values as-is to wchar_t, so any non-ASCII char values > 0x7F will NOT be converted to Unicode correctly (UTF-16 is Windows' native encoding for wchar_t strings). Only the 7-bit ASCII characters (0x00 - 0x7F) have the same values in ASCII, ANSI codepages, Unicode UTF encodings, etc. Higher-valued characters require conversion.
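For reference, a minimal sketch of the MultiByteToWideChar route (assuming the input is in the system ANSI code page; error handling omitted):
#include <string>
#include <Windows.h>

std::wstring to_wide(const std::string& src)
{
    if (src.empty()) return std::wstring();
    // First call computes the required length in wide characters.
    int len = MultiByteToWideChar(CP_ACP, 0, src.c_str(), (int)src.size(), NULL, 0);
    std::wstring result(len, L'\0');
    MultiByteToWideChar(CP_ACP, 0, src.c_str(), (int)src.size(), &result[0], len);
    return result;
}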
You pass a wide string to the function. So the code is clearly intended to be compiled targeting UNICODE, so that the LoadLibrary macro expands to LoadLibraryW. But the project in which the code fails does not target UNICODE. Hence the macro here expands to LoadLibraryA. And hence the compiler error because you are passing a wide string.
The problem therefore is that you have inconsistent compiler settings across different projects. Review the project configuration for the failing project to make sure that consistent conditionals are defined. That is, make sure that the required conditionals (presumably to enable UNICODE) are defined in all of the projects that contain this code.
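One cheap way to catch such a mismatch early (a sketch, not something the answer requires) is a compile-time guard in the shared header:
// Fails the build in any project that forgot to enable Unicode,
// instead of producing a confusing conversion error later.
#if !defined(UNICODE) || !defined(_UNICODE)
#error "This code requires a Unicode build: define UNICODE and _UNICODE."
#endif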

Reading UTF-8 characters from console

I'm trying to read UTF-8 encoded polish characters from console for my c++ application.
I'm sure that the console uses this code page (I checked in its properties).
What I have already tried:
Using cin - instead of "zażółć" I read "za\0\0\0\0"
Using wcin - instead of "zażółć" - same result as with cin
Using scanf - instead of 'zażółć\0' I read 'za\0\0\0\0\0'
Using wscanf - same result as with scanf
Using getchar to read characters one by one - same result as with scanf
On the beginning of the main function I have following lines:
setlocale(LC_ALL, "PL_pl.UTF-8");
SetConsoleOutputCP(CP_UTF8);
SetConsoleCP(CP_UTF8);
I would be really grateful for help.
Although you’ve already accepted an answer, here’s a more portable version, which sticks closer to the standard library. Unfortunately, this is one area where I’ve found that a lot of widely-used implementations do not support things that are supposedly in the standard. For example, there is supposed to be a standard way to print multi-byte strings (which theoretically could be something unusual like shift-JIS, but in practice are UTF-8 on every modern OS), but it does not actually work portably. Microsoft’s runtime library is especially poor in this regard, but I’ve also found bugs in libc++.
/* Boilerplate feature-test macros: */
#if _WIN32 || _WIN64
# define _WIN32_WINNT 0x0A00 // _WIN32_WINNT_WIN10
# define NTDDI_VERSION 0x0A000002 // NTDDI_WIN10_RS1
# include <sdkddkver.h>
#else
# define _XOPEN_SOURCE 700
# define _POSIX_C_SOURCE 200809L
#endif
#include <iostream>
#include <locale>
#include <locale.h>
#include <stdlib.h>
#include <string>
#ifndef MS_STDLIB_BUGS // Allow overriding the autodetection.
/* The Microsoft C and C++ runtime libraries that ship with Visual Studio, as
* of 2017, have a bug that neither stdio, iostreams or wide iostreams can
* handle Unicode input or output. Windows needs some non-standard magic to
* work around that. This includes programs compiled with MinGW and Clang
* for the win32 and win64 targets.
*
* NOTE TO USERS OF TDM-GCC: This code is known to break on tdm-gcc 4.9.2. As
* a workaround, "-D MS_STDLIB_BUGS=0" will at least get it to compile, but
* Unicode output will still not work.
*/
# if ( _MSC_VER || __MINGW32__ || __MSVCRT__ )
/* This code is being compiled either on MS Visual C++, or MinGW, or
* clang++ in compatibility mode for either, or is being linked to the
* msvcrt (Microsoft Visual C RunTime) library.
*/
# define MS_STDLIB_BUGS 1
# else
# define MS_STDLIB_BUGS 0
# endif
#endif
#if MS_STDLIB_BUGS
# include <io.h>
# include <fcntl.h>
#endif
using std::endl;
using std::istream;
using std::wcin;
using std::wcout;
void init_locale(void)
// Does magic so that wcout can work.
{
#if MS_STDLIB_BUGS
    // Windows needs a little non-standard magic.
    constexpr char cp_utf16le[] = ".1200";
    setlocale( LC_ALL, cp_utf16le );
    _setmode( _fileno(stdout), _O_WTEXT );
    _setmode( _fileno(stdin), _O_WTEXT );
#else
    // The correct locale name may vary by OS, e.g., "en_US.utf8".
    constexpr char locale_name[] = "";
    setlocale( LC_ALL, locale_name );
    std::locale::global(std::locale(locale_name));
    wcout.imbue(std::locale());
    wcin.imbue(std::locale());
#endif
}

int main(void)
{
    init_locale();

    static constexpr size_t bufsize = 1024;
    std::wstring input;
    input.reserve(bufsize);

    while ( wcin >> input )
        wcout << input << endl;

    return EXIT_SUCCESS;
}
This reads in wide-character input from the console regardless of its initial locale or code page. If what you meant instead was that the input will be bytes in the UTF-8 encoding (such as from a redirected file in UTF-8 encoding), not console input, the standard way to accomplish this is supposed to be the conversion facet from UTF-8 to wchar_t in <codecvt> and <locale>, but in practice Windows doesn’t support Unicode locales, so you have to read the bytes in and then convert them manually. A more standard way to do that is mbstowcs(). I have some old code to do the conversion for STL iterators, but there are also conversion functions in the standard library. You might need to do this anyway, if for example you need to save or transmit in UTF-8.
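For instance, here is a minimal sketch of the manual conversion the paragraph describes (assuming wchar_t holds UTF-16, as on Windows; std::wstring_convert is deprecated since C++17 but still widely available):
#include <codecvt>
#include <locale>
#include <string>

std::wstring utf8_to_wide(const std::string& utf8)
{
    // Converts UTF-8 bytes to UTF-16 stored in wchar_t.
    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> conv;
    return conv.from_bytes(utf8);
}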
There are some who will recommend you store all strings in UTF-8 internally, even when using an API like Windows' based on some form of UTF-16, converting to the other encoding only when you make API calls. I strongly advise you to use UTF-8 externally whenever you possibly can, but I don't go quite that far. Note, however, that storing strings as UTF-8 will save you a lot of memory, especially on systems where wchar_t is UTF-32. You would have a better idea than I how many bytes this would typically save you for Polish text.
Here is the trick I use for UTF-8 support. The result is a multibyte string which can then be used elsewhere:
#include <cstdio>
#include <windows.h>

#define MAX_INPUT_LENGTH 255

int main()
{
    SetConsoleOutputCP(CP_UTF8);
    SetConsoleCP(CP_UTF8);

    wchar_t wstr[MAX_INPUT_LENGTH];
    char mb_str[MAX_INPUT_LENGTH * 3 + 1];
    unsigned long read;

    // Read wide characters from the console; the W version is called
    // explicitly so this works regardless of the UNICODE setting.
    void *con = GetStdHandle(STD_INPUT_HANDLE);
    ReadConsoleW(con, wstr, MAX_INPUT_LENGTH, &read, NULL);

    // Convert the UTF-16 input to a UTF-8 multibyte string.
    int size = WideCharToMultiByte(CP_UTF8, 0, wstr, read, mb_str, sizeof(mb_str), NULL, NULL);
    mb_str[size] = 0;

    std::printf("ENTERED: %s\n", mb_str);
    return 0;
}
P.S. Big thanks to Remy Lebeau for pointing out some flaws!

How to convert string to wchar* in unicode

I'm trying to define a function like this:
#ifdef _UNICODE
LPCTSTR A2T(const string& str);
#else
#define A2T
#endif
If my project is in Ansi, then A2T(str) is str itself. When my project is in Unicode, A2T(str) returns an LPCTSTR.
When UNICODE is defined, LPCTSTR is an alias for const wchar_t*, otherwise it is an alias for const char*.
Your current macro returns const wchar_t* for Unicode, but returns std::string for Ansi. That doesn't make sense: you wouldn't be able to use A2T() consistently everywhere LPCTSTR is expected. The code would not even compile for Ansi, since a std::string cannot be assigned directly to a char*. For Unicode, the code would compile, but you would have a memory leak, since a conversion from std::string to wchar_t* requires a dynamic memory allocation that you would have to free eventually.
A better option is to have A2T() return std::wstring for Unicode, and std::string for Ansi:
#ifdef UNICODE
std::wstring A2T(const std::string& str)
{
    std::wstring result;
    // do the conversion from Ansi to Unicode as needed...
    // MultiByteToWideChar(), std::wstring_convert, etc...
    return result;
}
#else
std::string A2T(const std::string& str)
{
    return str;
}
#endif
Alternatively:
std::basic_string<TCHAR> A2T(const std::string& str)
{
#ifdef UNICODE
    std::wstring result;
    // do the conversion from Ansi to Unicode as needed...
    // MultiByteToWideChar(), std::wstring_convert, etc...
    return result;
#else
    return str;
#endif
}
Either way, you get the automatic memory management you need after conversion, and you can use A2T() consistently in both Ansi and Unicode (when passing the return value of A2T(str) to a LPCTSTR, you can use A2T(str).c_str()).
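For illustration (SomeFunctionTakingLPCTSTR is a hypothetical TCHAR-based API):
std::basic_string<TCHAR> converted = A2T(str);
SomeFunctionTakingLPCTSTR(converted.c_str()); // buffer lives as long as 'converted'
// Passing the temporary directly is also fine for the duration of the call:
SomeFunctionTakingLPCTSTR(A2T(str).c_str());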
Or, you could simply forgo writing your own function and just use the existing A2CT() macro or CA2CT class that is already available in MFC/ATL:
ATL and MFC String Conversion Macros

How to get %AppData% path as std::string?

I've read that one can use SHGetSpecialFolderPath(); to get the AppData path. However, it returns a TCHAR array. I need to have an std::string.
How can it be converted to an std::string?
Update
I've read that it is possible to use getenv("APPDATA"), but that it is not available in Windows XP. I want to support Windows XP - Windows 10.
The T in the name means that SHGetSpecialFolderPath is really a pair of functions:
SHGetSpecialFolderPathA for Windows ANSI encoded char based text, and
SHGetSpecialFolderPathW for UTF-16 encoded wchar_t based text, Windows' “Unicode”.
The ANSI variant is just a wrapper for the Unicode variant, and it cannot logically produce a correct path in all cases. But it is what you need to use for char-based data.
An alternative is to use the wide variant of the function, and use whatever machinery you're comfortable with to convert the wide text result to a byte-oriented char-based encoding of your choice, e.g. UTF-8.
Note that UTF-8 strings can't be used directly to open files etc. via the Windows API, so this approach involves even more conversion just to use the string.
However, I recommend switching over to wide text in Windows.
For this, define the macro symbol UNICODE before including <windows.h>.
That's also the default in a Visual Studio project.
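If you do need a std::string, here is a minimal sketch of the alternative described above, using the wide API plus a UTF-8 conversion (the helper name is mine; errors are reduced to returning an empty string):
#include <shlobj.h>
#include <string>
#include <Windows.h>

std::string getAppDataUtf8()
{
    wchar_t wide[MAX_PATH];
    if (!SHGetSpecialFolderPathW(NULL, wide, CSIDL_APPDATA, FALSE))
        return "";
    // -1 makes the function process the terminating null as well,
    // so the returned length includes it.
    int len = WideCharToMultiByte(CP_UTF8, 0, wide, -1, NULL, 0, NULL, NULL);
    std::string utf8(len, '\0');
    WideCharToMultiByte(CP_UTF8, 0, wide, -1, &utf8[0], len, NULL, NULL);
    utf8.resize(len - 1); // drop the embedded terminating null
    return utf8;
}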
https://msdn.microsoft.com/en-gb/library/windows/desktop/dd374131%28v=vs.85%29.aspx
#ifdef UNICODE
typedef wchar_t TCHAR;
#else
typedef char TCHAR;
#endif
Basically you can convert this array to a std::wstring. Converting to a std::string is then straightforward with std::wstring_convert.
http://en.cppreference.com/w/cpp/locale/wstring_convert
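For example (a sketch assuming UTF-8 output is acceptable and that wchar_t holds UTF-16, as on Windows):
#include <codecvt>
#include <locale>
#include <string>

std::string wide_to_utf8(const std::wstring& wide)
{
    // Converts UTF-16 stored in wchar_t to UTF-8 bytes.
    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> conv;
    return conv.to_bytes(wide);
}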
You should use SHGetSpecialFolderPathA() to have the function deal with ANSI characters explicitly.
Then, just convert the array of char to std::string as usual.
/* to have MinGW declare SHGetSpecialFolderPathA() */
#if !defined(_WIN32_IE) || _WIN32_IE < 0x0400
#undef _WIN32_IE
#define _WIN32_IE 0x0400
#endif
#include <shlobj.h>
#include <string>
std::string getPath(int csidl) {
    char out[MAX_PATH];
    if (SHGetSpecialFolderPathA(NULL, out, csidl, 0)) {
        return out;
    } else {
        return "";
    }
}
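For example, to get the %AppData% folder (CSIDL_APPDATA is the CSIDL constant for it):
std::string appData = getPath(CSIDL_APPDATA);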
Typedef String as either std::string or std::wstring depending on your compilation configuration. The following code might be useful:
#ifndef UNICODE
typedef std::string String;
#else
typedef std::wstring String;
#endif

How do I convert an ATL/MFC CString to a QString?

Given that the project's encoding is probably Unicode (but not known for sure), what is the best way of converting an ATL::CString to a QString?
What I have thought of is this:
CString c(_T("SOME_TEXT"));
//...
std::basic_string<TCHAR> intermediate((LPCTSTR)c);
QString q;
#ifdef _UNICODE
q = QString::fromStdWString(intermediate);
#else
q = QString::fromStdString(intermediate);
#endif
Do you think that it works? Any other ideas?
You don't need the intermediate conversion to a std::string. The CString class can be treated as a simple C-style string; that is, an array of characters. All you have to do is cast it to an LPCTSTR.
And once you have that, you just need to create the QString object depending on whether the characters in your CString are of type char or wchar_t. For the former, you can use one of the standard constructors for QString, and for the latter, you can use the fromWCharArray function.
Something like the following code (untested, I don't have Qt installed anymore):
CString c(_T("SOME_TEXT"));
QString q;
#ifdef _UNICODE
q = QString::fromWCharArray((LPCTSTR)c, c.GetLength());
#else
q = QString((LPCTSTR)c);
#endif
Edit: As suggested in the comments, you have to disable "Treat wchar_t as a built-in type" in your project's Properties to get the above code to link correctly in Visual Studio (source).
For _UNICODE, I believe you could also use the fromUtf16 function (note the cast, since fromUtf16 takes a const ushort* rather than a wchar_t*):
CString c(_T("SOME TEXT"));
QString q = QString::fromUtf16(reinterpret_cast<const ushort*>(c.GetString()), c.GetLength());