Proper cross-platform way to convert from std::string to 'const TCHAR *' - c++

I'm working on a cross-platform project in C++. I have a variable of type std::string and need to convert it to const TCHAR *. What is the proper way? Maybe functions from some library?
UPD 1: As I see in the function definition, there are separate Windows and non-Windows implementations:
#if defined _MSC_VER || defined __MINGW32__
#define _tinydir_char_t TCHAR
#else
#define _tinydir_char_t char
#endif
So is there really no way to pass a parameter from a std::string without splitting the implementation per platform?

Proper cross-platform way to convert from std::string to 'const TCHAR *'
TCHAR should not be used in cross-platform programs at all, except of course when interacting with Windows API calls; but those need to be abstracted away from the rest of the program, or else it won't be cross-platform. So you only need to convert between TCHAR strings and char strings in Windows-specific code.
The rest of the program should use char, and preferably assume that it contains UTF-8 encoded strings. If user input, or system calls return strings that are in a different encoding, you need to figure out what that encoding is, and convert accordingly.
The character encoding conversion functionality of the C++ standard library is rather weak, so it is not of much use. You can implement the conversion according to the encoding specification, or you can use a third-party implementation, as always.
may be functions from some library ?
I recommend this.
as I see in function definition there is split windows and non-Windows implementations
The library that you use doesn't provide a uniform API to different platforms, so it cannot be used in a truly cross-platform way. You can write a wrapper library with uniform function declarations that handles the character encoding conversion on platforms that need it.
Or, you can use another library, which provides a uniform API and converts the encoding transparently.
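A minimal sketch of such a wrapper, assuming the program keeps UTF-8 in std::string and the Windows build defines UNICODE (so the library's TCHAR is wchar_t). The names native_string and to_native are hypothetical, not part of any library; MultiByteToWideChar is the Win32 conversion call.

```cpp
#include <cstddef>
#include <string>

#ifdef _WIN32
#include <windows.h>

// Hypothetical alias: the string type the platform API expects.
using native_string = std::wstring;

// Convert UTF-8 to UTF-16 with the Win32 MultiByteToWideChar call.
inline native_string to_native(const std::string& utf8) {
    if (utf8.empty()) return native_string();
    const int in_len = static_cast<int>(utf8.size());
    // First call asks for the required length in wide characters.
    const int out_len = MultiByteToWideChar(CP_UTF8, 0, utf8.data(), in_len, nullptr, 0);
    native_string out(static_cast<std::size_t>(out_len), L'\0');
    // Second call performs the conversion into the buffer.
    MultiByteToWideChar(CP_UTF8, 0, utf8.data(), in_len, &out[0], out_len);
    return out;
}
#else
using native_string = std::string;

// Elsewhere the API takes char, and the string is assumed to be UTF-8 already.
inline native_string to_native(const std::string& utf8) { return utf8; }
#endif
```

A call site would then look the same on every platform, for example open_dir(to_native(path).c_str()), where open_dir stands in for whatever function takes the platform character type.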

TCHAR is a Windows type, and it is defined this way:
#ifdef UNICODE
typedef wchar_t TCHAR, *PTCHAR;
#else
typedef char TCHAR, *PTCHAR;
#endif
The UNICODE macro is typically defined in the project settings (when you use a Visual Studio project on Windows).
You can get the const TCHAR* from std::string (which is ASCII or UTF8 in most cases) in this way:
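#include <codecvt>  // std::codecvt_utf8_utf16 (deprecated since C++17)
#include <locale>   // std::wstring_convert (deprecated since C++17)
#include <string>

std::string s("hello world");
const TCHAR* pstring = nullptr;
#ifdef UNICODE
// Convert UTF-8 to UTF-16; wstr must outlive every use of pstring.
std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
std::wstring wstr = converter.from_bytes(s);
pstring = wstr.c_str();
#else
pstring = s.c_str();
#endif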
pstring will be the result.
But using TCHAR on other platforms is strongly discouraged. It's better to use UTF-8 strings (char*) held in std::string.
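Since std::wstring_convert is deprecated as of C++17, the UTF-8 to UTF-16 step can also be written by hand. The following is only a sketch: utf8_to_utf16 is a hypothetical helper, it rejects truncated or malformed input but does not check for overlong encodings, and a well-tested library is usually the safer choice.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode UTF-8 into UTF-16 code units.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // continuation bytes expected after the lead byte
        if (lead < 0x80)                { cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (extra > in.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);  // append six payload bits
        }
        i += extra + 1;
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("code point outside the Unicode range");
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;  // split into a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

On Windows, where wchar_t is 16 bits wide, the result could be copied into a std::wstring (constructed from the begin/end iterators) before taking c_str() for a const TCHAR* in a UNICODE build.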

I came across boost.nowide the other day. I think it will do exactly what you want.
http://cppcms.com/files/nowide/html/

As others have pointed out, you should not be using TCHAR except in code that interfaces with the Windows API (or libraries modeled after the Windows API).
Another alternative is to use the character conversion classes/macros defined in atlconv.h. CA2T will convert an 8-bit character string to a TCHAR string. CA2CT will convert to a const TCHAR string (LPCTSTR). Assuming your 8-bit strings are UTF-8, you should specify CP_UTF8 as the code page for the conversion.
If you want to declare a variable containing a TCHAR copy of a std::string:
CA2T tstr(stdstr.c_str(), CP_UTF8);
If you want to call a function that takes an LPCTSTR:
FunctionThatTakesString(CA2CT(stdstr.c_str(), CP_UTF8));
If you want to construct a std::string from a TCHAR string:
std::string mystdstring(CT2CA(tstr, CP_UTF8));
If you want to call a function that takes an LPTSTR then maybe you should not be using these conversion classes. (But you can if you know that the function you are calling does not modify the string outside its current length.)

Related

Differences using std::string in C++ Builder and VC++

Since I got my hands on the new RAD Studio XE4, I thought I'd give it a try.
Unfortunately, I am not so experienced with C++, and therefore I was wondering why code that works perfectly fine in VC++ doesn't work at all in C++ Builder.
Most of the problems involve converting between different variable types.
For example :
std::string Test = " ";
GetFileAttributes(Test.c_str());
works in VC++, but in C++ Builder it won't compile, telling me "E2034 Cannot convert 'const char *' to 'wchar_t *'".
Am I missing something? Why doesn't this work the same on all compilers?
Thanks
Welcome to Windows Unicode/ASCII hell.
The function
GetFileAttributes
is actually a macro defined to either GetFileAttributesA or GetFileAttributesW depending on whether UNICODE is defined when you include the Windows headers (_UNICODE is the matching macro for the C runtime's <tchar.h>, so projects normally define both). The *A variants take char* and related arguments, the *W functions take wchar_t* and related arguments.
I suggest calling only the wide *W variants directly in new code. This would mean switching to std::wstring for Windows only code and some well-thought out design choices for a cross-platform application.
Your C++ Builder config is set to use the UNICODE character set, which means that Win32 APIs are resolved to their wide-character versions. Therefore you need to use wide-char strings in your C++ code. If you set your VS config to use UNICODE, you would get the same error.
You can try this:
// wstring = basic_string<wchar_t>
// the _T macro yields a wide char literal when _UNICODE is defined (an L prefix is always wide)
std::wstring Test = _T(" ");
GetFileAttributes(Test.c_str()); // c_str now returns const wchar_t*, not const char*
See more details about _T/_TEXT macros here: http://docwiki.embarcadero.com/RADStudio/XE3/en/TCHAR_Mapping
You have defined _UNICODE and/or UNICODE in Builder and not defined it in VC.
Most Windows APIs come in 2 flavours: the ANSI flavour and the UNICODE flavour.
For example, when you call SetWindowText, there really is no SetWindowText function. Instead there are 2 different functions
- SetWindowTextA which takes an ANSI string
and
- SetWindowTextW which takes a UNICODE string.
If your program is compiled with /DUNICODE /D_UNICODE, SetWindowText maps to SetWindowTextW, which expects a const wchar_t *.
If your program is compiled without these macros defined, it maps to SetWindowTextA which takes a const char *.
The windows headers typically do something like this to make this happen.
#ifdef UNICODE
#define SetWindowText SetWindowTextW
#else
#define SetWindowText SetWindowTextA
#endif
Likewise, there are 2 GetFileAttributes.
DWORD WINAPI GetFileAttributesA(LPCSTR lpFileName);
DWORD WINAPI GetFileAttributesW(LPCWSTR lpFileName);
In VC, you haven't defined UNICODE/_UNICODE, and hence you are able to pass string::c_str(), which returns a const char *.
In Builder, you probably have defined UNICODE/_UNICODE, so it expects a const wchar_t *.
You may not have done this UNICODE/_UNICODE thing explicitly; maybe the IDE is doing it for you, so check the options in the IDE.
You have several ways of fixing this:
- Find the UNICODE/_UNICODE option in the IDE and disable it,
or
- use std::wstring; then c_str() will return a const wchar_t *,
or
- call GetFileAttributesA directly instead of GetFileAttributes; you will need to do this for every other Windows API which comes in these 2 variants.

GetWindowText with char[]

I am quite new to Windows programming. I am trying to retrieve the name of a window.
char NewName[128];
GetWindowText(hwnd, NewName, 128);
I need to use a char[] but it gives me the error of wrong type.
From what I read, LPWSTR is a kind of char*.
How can I use a char[] with GetWindowText ?
Thanks a lot !
You are probably compiling a Unicode project, so you can either:
Explicitly call the ANSI version of the function (GetWindowTextA), or
Use wchar_t instead of char (LPWSTR is a pointer to wchar_t)
For modern Windows programming (that means, after the year 2000 when Microsoft introduced the Layer for Unicode for Windows 9x), you're far better off using "Unicode", which in C++ in Windows means using wchar_t.
That is, use wchar_t instead of char, and use std::wstring instead of std::string.
Remember to define UNICODE before including <windows.h>. It's also a good idea to define NOMINMAX and STRICT, although nowadays the latter is defined by default.
When calling Windows APIs without specifying an explicit version by appending either A (ANSI) or W (wide char) you should always use TCHAR. TCHAR will map to the correct type depending on whether UNICODE is #defined or not.
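The mapping can be sketched without the Windows headers. Everything prefixed Demo or DEMO_ below is a hypothetical stand-in (DemoChar for TCHAR, DEMO_TEXT for TEXT/_T, DemoLength for an API with A and W variants); only the pattern is taken from the real headers.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-ins for an A/W pair such as GetWindowTextA / GetWindowTextW.
inline std::size_t DemoLengthA(const char* s)    { return std::string(s).size(); }
inline std::size_t DemoLengthW(const wchar_t* s) { return std::wstring(s).size(); }

// One macro picks the character type, the literal prefix,
// and the function variant together.
#ifdef UNICODE
typedef wchar_t DemoChar;        // plays the role of TCHAR
#define DEMO_TEXT(s) L##s        // plays the role of TEXT() / _T()
#define DemoLength DemoLengthW
#else
typedef char DemoChar;
#define DEMO_TEXT(s) s
#define DemoLength DemoLengthA
#endif

// Call sites written against DemoChar compile unchanged in both configurations.
inline std::size_t title_length() {
    const DemoChar* title = DEMO_TEXT("Some title");
    return DemoLength(title);
}
```

A plain char buffer passed to DemoLength would stop compiling as soon as UNICODE is defined, which is the same error the question ran into with GetWindowText.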

Set console title in C++ using a string

I would like to know how to change the console title in C++ using a string as the new parameter.
I know you can use the SetConsoleTitle function of the Win32 API but that does not take a string parameter.
I need this because I am doing a Java native interface project with console effects and commands.
I am using windows and it only has to be compatible with Windows.
The SetConsoleTitle function does indeed take a string argument. It's just that the kind of string depends on the use of UNICODE or not.
You have to use e.g. the _T macro to make sure the string literal has the correct format (wide character or single byte):
SetConsoleTitle(_T("Some title"));
If you are using e.g. std::string things get a little more complicated, as you might have to convert between std::string and std::wstring depending on the UNICODE macro.
One way of not having to do that conversion is to always use only std::string if UNICODE is not defined, or only std::wstring if it is defined. This can be done by adding a typedef in the "stdafx.h" header file:
#ifdef UNICODE
typedef std::wstring tstring;
#else
typedef std::string tstring;
#endif
If your problem is that SetConsoleTitle doesn't take a std::string (or std::wstring), that's because it has to be compatible with C programs, which don't have the string classes (or classes at all). In that case, use the c_str member function of the string classes to get a pointer that can be passed to functions requiring old-style C strings:
tstring title = _T("Some title");
SetConsoleTitle(title.c_str());
There's also another solution, and that is to use the explicit narrow-character "ASCII" version of the function, which has an A suffix:
SetConsoleTitleA("Some title");
There's of course also a wide-character variant, with a W suffix:
SetConsoleTitleW(L"Some title");
std::wstring str(L"Console title");  // a wide literal needs std::wstring, not std::string
SetConsoleTitleW(str.c_str());
The comment is old but you can do it with the system method...
#include <cstdlib>  // std::system
int main(){
    std::system("title This is a title");  // runs the command interpreter's "title" command
}

proper style for interfacing with legacy TCHAR code

I'm modifying someone else's code which uses TCHAR extensively. Is it better form to just use std::wstring in my code? wstring should be equivalent to TString on wide-char platforms, so I don't see an issue. The rationale being that it's easier to use a raw wstring than to support TCHAR, e.g., when using boost::wformat.
Which style will be more clear to the next maintainer? I wasted several hours myself trying to understand string intricacies, it seems just using wstring would cut off half of the stuff you need to understand.
typedef std::basic_string<TCHAR> TString; //on winxp, TCHAR resolves to wchar_t
typedef basic_string<wchar_t, char_traits<wchar_t>, allocator<wchar_t> > wstring;
...the only difference is the allocator.
In the unlikely case that your program lands on a Windows 9x machine, there's still an API layer that can translate your UTF-16 strings to 8-bit chars. There's no point left in using TCHAR for new code development.
source
If you only intend to target Unicode (wchar_t) platforms, you are better off using std::wstring. If you want to support both multibyte and Unicode builds, you will need to use TString and similar.
Also note that basic_string defaults the char_traits and allocator to ones based on the passed-in character type, so on builds where UNICODE (or _UNICODE, I can never remember which) is defined, TString and wstring will be the same.
NOTE: If you are just passing the arguments to various APIs and not doing any manipulations on them, you are better off using const wchar_t * instead of std::wstring directly (especially if mixing Win32, COM and standard C++ code) as you will end up doing less conversions and copying.
TCHAR used to be more important when you were going to compile the binaries twice, once for char and a second time for wchar_t.
You can still make this choice if you like, changing the MSVC project settings from MBCS to Unicode and back.
This also means when calling the windows API you will have the matching data type.

Portable wchar_t in C++

Is there a portable wchar_t in C++? On Windows, it's 2 bytes. On everything else, it's 4 bytes. I would like to use wstring in my application, but this will cause problems if I decide down the line to port it.
If you're dealing with use internal to the program, don't worry about it; a wchar_t in class A is the same as in class B.
If you're planning to transfer data between Windows and Linux/MacOSX versions, you've got more than wchar_t to worry about, and you need to come up with means to handle all the details.
You could define a type that you'll define to be four bytes everywhere, and implement your own strings, etc. (since most text handling in C++ is templated), but I don't know how well that would work for your needs.
Something like typedef int my_char; typedef std::basic_string<my_char> my_string;
What do you mean by "portable wchar_t"? There is a uint16_t type that is 16 bits wide everywhere, which is often available. But that of course doesn't make up a string yet. A string has to know its encoding to make sense of functions like length(), substring() and so on (so it doesn't cut characters in the middle of a code point when using UTF-8 or UTF-16). There are some Unicode-compatible string classes I know of that you can use. All can be used in commercial programs for free (the Qt one will be compatible with commercial programs for free in a couple of months, when Qt 4.5 is released).
- ustring from the gtkmm project. If you program with gtkmm or use glibmm, that should be the first choice; it uses UTF-8 internally.
- Qt also has a string class, called QString. It's encoded in UTF-16.
- ICU is another project that creates portable Unicode string classes, and has a UnicodeString class that internally seems to be encoded in UTF-16, like Qt. I haven't used that one, though.
The proposed C++0x standard will have char16_t and char32_t types. Until then, you'll have to fall back on using integers for the non-wchar_t character type.
#if defined(__STDC_ISO_10646__)
#define WCHAR_IS_UTF32
#elif defined(_WIN32) || defined(_WIN64)
#define WCHAR_IS_UTF16
#endif
#if defined(__STDC_UTF_16__)
typedef _Char16_t CHAR16;
#elif defined(WCHAR_IS_UTF16)
typedef wchar_t CHAR16;
#else
typedef uint16_t CHAR16;
#endif
#if defined(__STDC_UTF_32__)
typedef _Char32_t CHAR32;
#elif defined(WCHAR_IS_UTF32)
typedef wchar_t CHAR32;
#else
typedef uint32_t CHAR32;
#endif
According to the standard, you'll need to specialize char_traits for the integer types. But on Visual Studio 2005, I've gotten away with std::basic_string<CHAR32> with no special handling.
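With C++11 and later, char16_t, char32_t, std::u16string and std::u32string are part of the language and library, so the typedef fallback is only needed for older compilers. A small sketch of how the two widths differ for a character outside the Basic Multilingual Plane; utf16_code_points is a hypothetical helper and assumes well-formed UTF-16.

```cpp
#include <cstddef>
#include <string>

// Count code points in well-formed UTF-16 by skipping low (trailing) surrogates.
inline std::size_t utf16_code_points(const std::u16string& s) {
    std::size_t n = 0;
    for (char16_t c : s)
        if (c < 0xDC00 || c > 0xDFFF) ++n;
    return n;
}

// U+1F600 needs a surrogate pair in UTF-16 but a single unit in UTF-32.
inline std::u16string sample_utf16() { return u"a\U0001F600"; }
inline std::u32string sample_utf32() { return U"a\U0001F600"; }
```

Here sample_utf16().size() counts three code units while sample_utf32().size() counts two, which is why size() alone says little about the number of characters in a wide string.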
I plan to use a SQLite database.
Then you'll need to use UTF-16, not wchar_t.
The SQLite API also has a UTF-8 version. You may want to use that instead of dealing with the wchar_t differences.
My suggestion: use UTF-8 and std::string. Wide strings would not bring you much added value, since you can't interpret a wide character as a letter anyway: some characters are created from several Unicode code points.
So use UTF-8 everywhere and use a good library to deal with natural languages, for example Boost.Locale.
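To illustrate that std::string works as a UTF-8 container, here is a minimal sketch that counts code points by skipping continuation bytes. The helper name utf8_code_points is hypothetical, and the input is assumed to be valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Count code points in valid UTF-8: every byte that is not a
// continuation byte (10xxxxxx) starts a new code point.
inline std::size_t utf8_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s)
        if ((c & 0xC0) != 0x80) ++n;
    return n;
}
```

Note that "e" followed by the combining acute accent U+0301 is two code points but one visible character, so even a code point count is not a letter count; that part is what a library such as Boost.Locale is for.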
Defining something like typedef uint32_t mychar; is a bad idea. You can't use iostream with it; for example, you can't create a stringstream based on this character type, because you would not be able to write to it.
For example this would not work:
std::basic_ostringstream<unsigned> s;
s << 10;
It would not give you a string.
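If a 32-bit string is still needed somewhere, one possible workaround is to format with the ordinary narrow facilities and widen afterwards. This sketch uses char32_t and std::u32string instead of the uint32_t typedef above (my substitution, since the standard library ships character traits for char32_t), and to_u32 is a hypothetical helper.

```cpp
#include <string>

// Format the number as narrow text, then widen it.
// Digits and the minus sign are ASCII, so each char maps 1:1 to a code point.
inline std::u32string to_u32(int value) {
    const std::string digits = std::to_string(value);
    return std::u32string(digits.begin(), digits.end());
}
```

This only covers text that is known to be ASCII; arbitrary UTF-8 would need a real decoder before widening.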