proper style for interfacing with legacy TCHAR code

proper style for interfacing with legacy TCHAR code - c++

I'm modifying someone else's code which uses TCHAR extensively. Is it better form to just use std::wstring in my code? wstring should be equivalent to TString on widechar platforms so I don't see an issue. The rationale being, its easier to use a raw wstring than to support TCHAR... e.g., using boost:wformat.
Which style will be more clear to the next maintainer? I wasted several hours myself trying to understand string intricacies, it seems just using wstring would cut off half of the stuff you need to understand.
typedef std::basic_string<TCHAR> TString; //on winxp, TCHAR resolves to wchar_t
typedef basic_string<wchar_t, char_traits<wchar_t>, allocator<wchar_t> > wstring;
...the only difference is the allocator.
In the unlikely case that your program
lands on a Window 9x machine, there's
still an API layer that can translate
your UTF-16 strings to 8-bit chars.
There's no point left in using TCHAR
for new code development.
source

If you are only intending on targetting Unicode (wchar_t) platforms, you are better off using std::wstring. If you want to support multibyte and Unicode builds, you will need to use TString and similar.
Also note that basic_string defaults the char_traits and allocator to one based on the passed in character type, so on builds where UNICODE (or _UNICODE, I can never remember which), TString and wstring will be the same.
NOTE: If you are just passing the arguments to various APIs and not doing any manipulations on them, you are better off using const wchar_t * instead of std::wstring directly (especially if mixing Win32, COM and standard C++ code) as you will end up doing less conversions and copying.

TCHAR used to be more important when you where going to compile the binaries twice, once for char and a second for wchar_t.
You can still make this choice if you like, changing the MSVC project settings from MBCS to Unicode and back.
This also means when calling the windows API you will have the matching data type.

Related

Proper way crossplatfom convert from std::string to 'const TCHAR *'

I'm working for crossplatrofm project in c++ and I have variable with type std::string and need convert it to const TCHAR * - what is proper way, may be functions from some library ?
UPD 1: - as I see in function definition there is split windows and non-Windows implementations:
#if defined _MSC_VER || defined __MINGW32__
#define _tinydir_char_t TCHAR
#else
#define _tinydir_char_t char
#endif
- so is it a really no way for non spliting realization for send parameter from std::string ?

Proper way crossplatfom convert from std::string to 'const TCHAR *'
TCHAR should not be used in cross platform programs at all; Except of course, when interacting with windows API calls, but those need to be abstracted away from the rest of the program or else it won't be cross-platform. So, you only need to convert between TCHAR strings and char strings in windows specific code.
The rest of the program should use char, and preferably assume that it contains UTF-8 encoded strings. If user input, or system calls return strings that are in a different encoding, you need to figure out what that encoding is, and convert accordingly.
Character encoding conversion functionality of the C++ standard library is rather weak, so that is not of much use. You can implement the conversion according the encoding specification or you can use a third party implementation, as always.
may be functions from some library ?
I recommend this.
as I see in function definition there is split windows and non-Windows implementations
The library that you use doesn't provide a uniform API to different platforms, so it cannot be used in a truly cross-platform way. You can write a wrapper library with uniform function declarations that handles the character encoding conversion on platforms that need it.
Or, you can use another library, which provides a uniform API and converts the encoding transparently.

TCHAR are Windows type and it defined in this way:
#ifdef UNICODE
typedef wchar_t TCHAR, *PTCHAR;
#else
typedef char TCHAR, *PTCHAR;
#endif
UNICODE macro is typically defined in project settings (in case when your use Visual Studio project on Windows).
You can get the const TCHAR* from std::string (which is ASCII or UTF8 in most cases) in this way:
std::string s("hello world");
const TCHAR* pstring = nullptr;
#ifdef UNICODE
std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
std::wstring wstr = converter.from_bytes(s);
pstring = wstr.data();
#else
pstring = s.data();
#endif
pstring will be the result.
But it's highly not recommended to use the TCHAR on other platforms. It's better to use the UTF8 strings (char*) within std::string

I came across boost.nowide the other day. I think it will do exactly what you want.
http://cppcms.com/files/nowide/html/

As others have pointed out, you should not be using TCHAR except in code that interfaces with the Windows API (or libraries modeled after the Windows API).
Another alternative is to use the character conversion classes/macros defined in atlconv.h. CA2T will convert an 8-bit character string to a TCHAR string. CA2CT will convert to a const TCHAR string (LPCTSTR). Assuming your 8-bit strings are UTF-8, you should specify CP_UTF8 as the code page for the conversion.
If you want to declare a variable containing a TCHAR copy of a std::string:
CA2T tstr(stdstr.c_str(), CP_UTF8);
If you want to call a function that takes an LPCTSTR:
FunctionThatTakesString(CA2CT(stdsr.c_str(), CP_UTF8));
If you want to construct a std::string from a TCHAR string:
std::string mystdstring(CT2CA(tstr, CP_UTF8));
If you want to call a function that takes an LPTSTR then maybe you should not be using these conversion classes. (But you can if you know that the function you are calling does not modify the string outside its current length.)

C++ C2440 Error, cannot convert from 'std::_String_iterator<_Elem,_Traits,_Alloc>' to 'LPCTSTR'

Unfortunately I have been tasked with compiling an old C++ DLL so that it will work on Win 7 64 Bit machines. I have zero C++ experience. I have muddled through other issues, but this one has stumped me and i have not found a solution elsewhere. The following code is throwing a C2440 compiler error.
The error: "error C2440: '' : cannot convert from 'std::_String_iterator<_Elem,_Traits,_Alloc>' to 'LPCTSTR'"
the code:
#include "StdAfx.h"
#include "Antenna.h"
#include "MyTypes.h"
#include "objbase.h"
const TCHAR* CAntenna::GetMdbPath()
{
if(m_MdbPath.size() <= 0) return NULL;
return LPCTSTR(m_MdbPath.begin());
}
"m_MdbPath" is defined in the Antenna.h file as
string m_MdbPath;
Any assistance or guidance someone could provide would be extremely helpful. Thank you in advance. I am happy to provide additional details on the code if needed.

std::string has a .c_str() member function that should accomplish what you're looking for. It will return a const char* (const wchar_t* with std::wstring). std::string also has an empty() member function and I recommend using nullptr instead of the NULL macro.
const TCHAR* CAntenna::GetMdbPath()
{
if(m_MdbPath.empty()) return nullptr;
return m_MdbPath.c_str();
}

The solution:
const TCHAR* CAntenna::GetMdbPath()
{
if(m_MdbPath.size() <= 0) return NULL;
return m_MdbPath.c_str();
}
will work if you can guarantee that your program is using the MBCS character set option (or if the Character Set option is set to Not Setif you're using Visual Studio), since a std::string character type will be the same as the TCHAR type.
However, if your build is UNICODE, then a std::string character type will not be the same as the TCHAR type, since TCHAR is defined as a wide character, not a single-byte character. Thus returning c_str() will give a compiler error. So your original code, plus the tentative fix of returning std::string::c_str() may work, but it is technically wrong with respect to the types being used.
You should be working directly with the types presented to your function -- if the type is TCHAR, then you should be using TCHAR based types, but again, a TCHAR is a type that changes definition depending on the build type.
As a matter of fact, the code will fail miserably at runtime if it were a UNICODE build and to fix the compiler error(s), you casted to an LPCTSTR to keep the compiler quiet. You cannot cast a string pointer type to another string pointer type. Casting is not converting, and merely casting will not convert a narrow string into a wide string and vice-versa. You will wind up with at the least, odd characters being used or displayed, and worse, a program with random weird behavior and crashes.
In addition to this you never know when you really will need to build a UNICODE version of the DLL, since MBCS builds are becoming more rare. Going by your original post, the DLL's usage of TCHAR indicates that yes, this DLL may (or even has) been built for UNICODE.
There are a few alternate solutions that you can use. Note that some of these solutions may require you to make additional coding changes beyond the string types. If you're using C++ stream objects, they may also need to be changed to match the character type (for example, std::ostream and std::wostream)
Solution 1. Use a typedef, where the string type to use depends on the build type.
For example:
#ifdef UNICODE
typedef std::wstring tchar_string;
#else
typedef std::string tchar_string;
#endif
and then use tchar_string instead of std::string. Switching between MBCS and UNICODE builds will work without having to cast strin types, but requires coding changes.
Solution 2. Use a typedef to use TCHAR as the underlying character type in the std::basic_string template.
For example:
typedef std::basic_string<TCHAR> tchar_string;
And use tchar_string in your application. You get the same public interface as std::(w)string and this allows you to switch between MBCS and UNICODE build seamlessly (with respect to the string types).
Solution 3. Go with UNICODE builds and drop MBCS builds altogether
If the application you're building is a new one, then UNICODE is the default option (at least in Visual Studio) when creating a new application. Then all you really need to do is use std::wstring.
This is sort of the opposite of the original solution given of making sure you use MBCS, but IMO this solution makes more sense if there is only going to be one build type. MBCS builds are becoming more rare, and basically should be used only for legacy apps.

Should string encoding for library conform to Unicode or flexible?

I am created a library in C++ which exposes c style interface APIs. Some of the arguments are string so they would be char *. Now I know they should be all Unicode but because it is a library I don't think I want to force users to use decide or not. Ideally I thought it would be best to use TCHAR so I can build it either way for unicode code and ASCII users. Than I read this and it opposes the idea in general.
As an example of API, the strings are filenames or error messages like below.
void LoadSomeFile(char * fileName );
const char * GetErrorMsg();
I am using c++ and STL. There is this debate of std::string vs std::wstring as well.
Personally I really like MFC's CString class which takes care of all this nicely but that means I have to use MFC just for its string class.
Now I think TCHAR is probably the best solution for me but do I have to use CString (internally) for that to work? Can I use it with STL string? As far as I can see, it is either string or wstring there.

The TCHAR type is an unfortunate design choice that has thankfully been left behind us. Nobody has to use TCHAR any more, thank goodness. The Unicode choice has been made for us as well: Unicode is the only sane choice going forwards.
The question is, is your library Windows-only? Or is it portable?
If your library is portable, then the typical choice is char * or std::string with UTF-8 encoded strings. For more information, see UTF-8 Everywhere. The summary is that wchar_t is UTF-16 on Windows but UTF-32 everywhere else, which makes it almost useless for cross-platform programming.
If your library runs on Win32 only, then you may feel free to use wchar_t instead. On Windows, wchar_t is UTF-16.
Don't use both, it will make your code and API bloated and difficult to read. TCHAR is a hack for supporting the Win32 API and migrating to Unicode.

GetWindowText with char[]

I am quite new to Windows programming. I am trying to retrieve the name of a window.
char NewName[128];
GetWindowText(hwnd, NewName, 128);
I need to use a char[] but it gives me the error of wrong type.
From what I read, LPWSTR is a kind of char*.
How can I use a char[] with GetWindowText ?
Thanks a lot !

You are probably compiling a Unicode project, so you can either:
Explicitly call the ANSI version of the function (GetWindowTextA), or
Use wchar_t instead of char (LPWSTR is a pointer to wchar_t)

For modern Windows programming (that means, after the year 2000 when Microsoft introduced the Layer for Unicode for Windows 9x), you're far better off using "Unicode", which in C++ in Windows means using wchar_t.
That is, use wchar_t instead of char, and use std::wstring instead of std::string.
Remember to define UNICODE before including <windows.h>. It's also a good idea to define NOMINMAX and STRICT. Although nowadays the latter is defined by default.

When calling Windows APIs without specifying an explicit version by appending either A (ANSI) or W (wide char) you should always use TCHAR. TCHAR will map to the correct type depending on whether UNICODE is #defined or not.

Would std::basic_string<TCHAR> be preferable to std::wstring on Windows?

As I understand it, Windows #defines TCHAR as the correct character type for your application based on the build - so it is wchar_t in UNICODE builds and char otherwise.
Because of this I wondered if std::basic_string<TCHAR> would be preferable to std::wstring, since the first would theoretically match the character type of the application, whereas the second would always be wide.
So my question is essentially: Would std::basic_string<TCHAR> be preferable to std::wstring on Windows? And, would there be any caveats (i.e. unexpected behavior or side effects) to using std::basic_string<TCHAR>? Or, should I just use std::wstring on Windows and forget about it?

I believe the time when it was advisable to release non-unicode versions of your application (to support Win95, or to save a KB or two) is long past: nowadays the underlying Windows system you'll support are going to be unicode-based (so using char-based system interfaces will actually complicate the code by interposing a shim layer from the library) and it's doubtful whether you'd save any space at all. Go std::wstring, young man!-)

I have done this on very large projects and it works great:
namespace std
{
#ifdef _UNICODE
typedef wstring tstring;
#else
typedef string tstring;
#endif
}
You can use wstring everywhere instead though if you'd like, if you do not need to ever compile using a multi-byte character string. I don't think you need to ever support multi byte character strings though in any modern application.
Note: The std namespace is supposed to be off limits, but I have not had any problems with the above method for several years.

One thing to keep in mind. If you decide to use std::wstring all the way in your program, you might still need to use std::string if you are communicating with other systems using UTF8.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

proper style for interfacing with legacy TCHAR code - c++

Related

Proper way crossplatfom convert from std::string to 'const TCHAR *'

C++ C2440 Error, cannot convert from 'std::_String_iterator<_Elem,_Traits,_Alloc>' to 'LPCTSTR'

Should string encoding for library conform to Unicode or flexible?

GetWindowText with char[]

Would std::basic_string<TCHAR> be preferable to std::wstring on Windows?

Categories

Resources