C++: LPWSTR prints as an address in cout

I have a variable of type LPTSTR, which I print to std::cout with <<. In an ANSI system (I don't know exactly where that is determined) it worked fine and printed the string. Now, in a Unicode system, I get a hex address instead of the string. So why does LPSTR (to which LPTSTR resolves if UNICODE is not defined) act differently from LPWSTR (to which it resolves if UNICODE is defined), and how do I print the string pointed to by the latter?

For Unicode strings you want wcout.
You are seeing hex because the narrow stream std::cout has no overload of << that treats a wchar_t pointer as text, so it prints the pointer value as an address.
LPTSTR and LPWSTR are actually C-isms inherited from the C Windows API days. For C++ I would strongly encourage you to use std::string and/or std::wstring instead.
If you need a stream name that follows the build setting, you'll want something like:
#include <iostream>
#ifdef _UNICODE
std::wostream& COUT = std::wcout; // wide build
#else
std::ostream& COUT = std::cout; // narrow build
#endif
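To see the difference without a Windows build, here is a minimal, portable sketch. The helper names throughWideStream and throughNarrowStream are made up for illustration. It sends the same wide string through a wide and a narrow string stream; the explicit const void* cast spells out the conversion that older compilers apply implicitly when a wchar_t pointer reaches std::cout (since C++20 the implicit form no longer compiles).

```cpp
#include <sstream>
#include <string>

// Wide stream: operator<< has a const wchar_t* overload, so the text is written.
std::wstring throughWideStream(const wchar_t* text) {
    std::wostringstream out;
    out << text;
    return out.str();
}

// Narrow stream: there is no text overload for wchar_t*, so the pointer can
// only be printed as an address through the const void* overload.
std::string throughNarrowStream(const wchar_t* text) {
    std::ostringstream out;
    out << static_cast<const void*>(text);
    return out.str();
}
```

The first helper returns the original characters; the second returns whatever textual form of the address the implementation produces, which is the "hex" seen in the question.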

Related

Converting std::string to a const WCHAR *string required by a GDI+ parameter

Short version: I am using Unicode, and I am trying to pass a std::string to a function that requires a const WCHAR* string: DrawString(const WCHAR*, ...
I compile with GCC, and I have specified Unicode everywhere.
I have been trying to convert a string into a wchar_t*, so that I can output it with a GDI+ function whose parameters require that type.
Here is how I have output string literals; this debugs and works fine.
See http://msdn.microsoft.com/en-us/library/ms535991%28v=vs.85%29.aspx for reference:
// works fine
wchar_t* wcBuff;
wcBuff = (wchar_t*)L"Some text here.\0";
AddString(wcBuff, wcslen(wcBuff), &gFontFamilyInfo, FontStyleBold, 20, ptOrg_Controls, &strFormat_Info);
Now this is what I have been trying, all day, and a side note: my conversion function works fine, it is not an issue, nor creating one.
// problems
string s = "Level " + convert::intToString(6) + "\0";
// try 1 - Segfault
wchar_t* wcBuff = new wchar_t[s.length() + 1];
copy(s.begin(), s.end(), wcBuff);
// random tries: these compile but cause access violations (my conversion function has worked in other places; I do not know for sure here)
wchar_t* wcBuff;
wstring wstr = convert::stringToWideChar(s);
wstring strvalue = convert::stringToWideChar(s);
wcBuff = (wchar_t*)strvalue.c_str();
wcBuff = (wchar_t*)wstr.c_str();
wstring foo;
foo.assign(s.begin(), s.end());
wcBuff = (wchar_t*)foo.c_str();
Everything compiles, but then presents problems. Some attempts fail with runtime errors as soon as execution reaches that point; others cause access violations and segfaults. Some compile and debug with no problem, but the string output keeps changing to random characters.
Any ideas?

(this is not really an answer, but it's too big for a comment)
Try 1: you didn't null-terminate the string
Try 2: can't comment without seeing the conversion function. Remove the casts.
Try 3: Remove the casts, should be OK.
In all cases use wchar_t const *wcBuff. If "Try 3" fails, you have a bug somewhere else in your code that is showing up here. Try to produce an MCVE; you should be able to get it down to about 10-20 lines.
Even if you manage to write the correct code for what you're intending, this is a fairly naive conversion as it doesn't handle characters outside the 0-127 range properly. You need to think about whether that is what you want, or whether you want to do a UTF-8 conversion, etc.
In Windows you can use MultiByteToWideChar.
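To make the 0-127 limitation explicit, here is a minimal portable sketch of the naive conversion that refuses anything outside ASCII instead of silently mangling it. The function name widenAscii is hypothetical and is not part of the question's convert namespace.

```cpp
#include <stdexcept>
#include <string>

// Widens a narrow string one char at a time. This is only correct for
// 7-bit ASCII, so any other byte is rejected instead of being copied blindly.
std::wstring widenAscii(const std::string& narrow) {
    std::wstring wide;
    wide.reserve(narrow.size());
    for (unsigned char c : narrow) {
        if (c > 127) {
            throw std::invalid_argument("non-ASCII byte: decode the text properly instead");
        }
        wide.push_back(static_cast<wchar_t>(c));
    }
    return wide;
}
```

Keep the returned std::wstring in a named variable for as long as the wchar_t const * from c_str() is in use; the pointer is only valid while that string is alive and unmodified.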

#include <cwchar>
#include <string>
int main() {
    // Can use convenient wstring
    std::wstring wstr = L"My wide string";
    // When you need a wchar_t* just do this
    const wchar_t* s = wstr.c_str();
    // Wide-character form of strcpy (buf must be large enough for s)
    wchar_t buf[100] = {0};
    wcscpy(buf, s);
    // And if you want to convert from string to wstring (ASCII-only text)
    std::string thin = "I only take up one byte per character!";
    std::wstring wide(thin.begin(), thin.end());
    return 0;
}

I first get my data into a wstring. Like this:
(Converting from string):
std::string sString = "This is my string text";
std::wstring str1(sString.begin(), sString.end());
(Converting from int):
wstring str1 = std::to_wstring(BirthDate);
Then, I use it in GDI+ Command like this:
graphics.DrawString(str1.c_str(), -1,
&font, PointF(10, 5), &st);

First thing first. GDI+ is a C++ library. It uses Microsoft C++ ABI. Microsoft C++ ABI is wildly incompatible with gcc so you might just forget about using it. You can try to use WinAPI or any other library that uses C calling conventions.
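Now for the wstring question. wchar_t is 32 bits wide in gcc on Linux and most other Unix-like targets, but Windows APIs require it to be 16 bits wide. With such a toolchain you cannot directly use any native Windows call that requires wchar_t. (MinGW builds of gcc that target Windows already use a 16-bit wchar_t.)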
You can use the -fshort-wchar command line option in gcc, which makes wchar_t 16 bits wide. You regain compatibility with Windows APIs but lose compatibility with libc, so no library functions that act on wchar_t for you. std::wstring will probably work as it's header-only, but wprintf, wcscpy and all other compiled functions won't.
None of this is detected at compile time, as the only things gcc sees are header files. It cannot tell whether corresponding libraries are compiled with 16-bit wchar_t or 32-bit wchar_t.
You can use uint16_t when you need to pass an array of wchar_t to a Windows function. If you can use C++11, it has char16_t that you can use too. Here's an example that should work with Basic Multilingual Plane characters:
std::wstring myLittleNiceWstring;
...
std::vector<uint16_t> myUglyCompatibilityString;
std::copy(myLittleNiceWstring.begin(),
          myLittleNiceWstring.end(),
          std::back_inserter(myUglyCompatibilityString));
myUglyCompatibilityString.push_back(0); // null terminator
// reinterpret_cast, since uint16_t* and WCHAR* are unrelated pointer types
UglyWindowsAPI(reinterpret_cast<WCHAR*>(myUglyCompatibilityString.data()));
If you have non-BMP characters, you need to convert UTF-32 to UTF-16 rather than just copy characters with std::copy. You can use libiconv for that, write a conversion routine yourself (it's rather simple), or just borrow some code from the internet.
It is my opinion that Windows-centric development with GCC is rather difficult because of this and other issues. You can use gcc as long as you stick to POSIX-ish APIs.
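For the non-BMP case, a hand-written conversion routine could look roughly like the sketch below. The function name utf32ToUtf16 is made up, and the code assumes the input already holds valid code points (it does no range or surrogate checks).

```cpp
#include <string>

// Encodes UTF-32 code points as UTF-16. Code points above U+FFFF become a
// surrogate pair; BMP code points are copied through unchanged.
std::u16string utf32ToUtf16(const std::u32string& in) {
    std::u16string out;
    for (char32_t cp : in) {
        if (cp >= 0x10000) {
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));   // high surrogate
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF))); // low surrogate
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
    }
    return out;
}
```

char16_t is always a 16-bit code unit regardless of how wide wchar_t is, so the result does not depend on -fshort-wchar.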

What is a #define string's type?

I think I am going about this wrong. I'm working on making an SO for an MSR type object. By default (if I read it correctly) OPOS uses Unicode, so I made my C++ automated class use Unicode as well, and from what I understand there is no way around it. In the OPOS head class there are 2 string definitions; the third one is of my own creation:
#define OPOS_ROOTKEY "SOFTWARE\\OLEforRetail\\ServiceOPOS"
#define OPOS_CLASSKEY_MSR "MSR"
#define OPOSMSR OPOS_ROOTKEY "\\" OPOS_CLASSKEY_MSR "\\"
This is so that a person can access the registry. So I decided to make myself a registry helper class instead of having it all in my SO. Looks like I'm having a hard time trying to figure out how I should do this in the end. I copied working code from another SO I had, but I feel that that code was not made correctly, and I want my code made right the first time.
So I came up with this, but I can not figure out how to combine my string with the class name. I made the class name as a parameter in my constructor.
RegistryHelper::RegistryHelper(LPCTSTR deviceName) {
cout << "RegistryHelper::RegistryHelper()+" << endl;
baseOpen = true;
CString test;
test.Format("%s%s",OPOSMSR, theClass); //fail
REGSAM access = KEY_READ | KEY_WOW64_64KEY;
LONG nError = RegOpenKeyEx(HKEY_LOCAL_MACHINE, theClass ,0, access,&hBaseKey); //not what I want, but would compile, I want test here instead of theClass
if (nError != ERROR_SUCCESS) {
cerr << "(E)RegistryHelper::RegistryHelper(): Failed to load base key. [" <<(int)nError << "]" << endl;
RegCloseKey(hBaseKey);
baseOpen = false;
}
cout << "RegistryHelper::RegistryHelper()-" << endl;
}
Any tips on what I am doing wrong? Since I'm on the subject, I'm going to post all my code for this. How bad is it?
What I'm after is something like this
unsigned int baud;
char* parity;
bool MSRSO::LoadRegistryValuesIntoMemory(LPCSTR deviceName) {
RegistryHelper reg(deviceName);
bool required = reg.LoadDWORD("BaudRate", 19200, baud);
required = required && reg.LoadREGSZ("Parity", "NONE", parity);
//other values
reg.Close();
return required;
}
Keep in mind I'm a C# and java guy so I may have my data types wrong. I only wrote simple hello world programs and temp conversion programs for myself in C++ on a SUPER old linux box back in the day. Although I am getting better at C++ I'm still not comfortable with it. So to sum up what is the data type of the #define type? How do I combine it with LPCTSTR? Should I have done that so that I can access registry values ONLY?
Thank you.

Your code has inconsistencies between narrow and wide strings. A literal 'a' has the type char, and is a narrow character. A literal L'a' has the type wchar_t, and is a wide character.
Next, we can apply these to strings:
"abc" is a narrow string literal of type const char[4] (three characters plus the terminator).
L"abc" is a wide string literal of type const wchar_t[4].
To reduce the hassle of supporting both, there is what is known as the TCHAR. Defined in a Windows header, this type is char or wchar_t, depending on whether UNICODE is defined. If it is defined, TCHAR will be wchar_t. If it is not defined, TCHAR will be char.
This also came with a TEXT macro that converts a string literal into characters of type TCHAR. That is, if UNICODE is defined, TEXT("abc") will be equivalent to L"abc", and if it is not defined, TEXT("abc") will be equivalent to "abc".
Strings were also given some typedefs as well:
LP[C][W|T]STR
The LP indicates a pointer, and the STR indicates "to a string". If C is included, the string will be a constant one. If W or T is included, the string will be made of characters of type wchar_t or TCHAR respectively.
For example:
LPSTR: char *
LPCSTR: const char *
LPWSTR: wchar_t *
LPCTSTR: const TCHAR *
With this in mind, you can see why using TCHAR and TEXT consistently keeps your code compatible with whichever character width the build uses, narrow or wide.
Here's a simple example, keeping in mind that std::string is, for our purposes, std::basic_string<char>:
std::basic_string<TCHAR> s(TEXT("abcd"));
s += TEXT("ZYXW"); // s is now TEXT("abcdZYXW")

Change
#define OPOS_ROOTKEY "SOFTWARE\\OLEforRetail\\ServiceOPOS"
#define OPOS_CLASSKEY_MSR "MSR"
#define OPOSMSR OPOS_ROOTKEY "\\" OPOS_CLASSKEY_MSR "\\"
to
#define OPOS_ROOTKEY L"SOFTWARE\\OLEforRetail\\ServiceOPOS"
#define OPOS_CLASSKEY_MSR L"MSR"
#define OPOSMSR OPOS_ROOTKEY L"\\" OPOS_CLASSKEY_MSR L"\\"
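On the "what type is it" part of the question: a #define has no type of its own. The preprocessor pastes the text in, and the compiler then joins adjacent string literals into one array, here a const wchar_t array. A minimal sketch of combining it with a device name follows; buildMsrKeyPath is a hypothetical helper, not part of OPOS.

```cpp
#include <string>

// The three macros with every piece written as a wide literal.
#define OPOS_ROOTKEY      L"SOFTWARE\\OLEforRetail\\ServiceOPOS"
#define OPOS_CLASSKEY_MSR L"MSR"
#define OPOSMSR           OPOS_ROOTKEY L"\\" OPOS_CLASSKEY_MSR L"\\"

// Hypothetical helper: appends a device name to the MSR registry path.
std::wstring buildMsrKeyPath(const std::wstring& deviceName) {
    return std::wstring(OPOSMSR) + deviceName;
}
```

The resulting string's c_str() can then be handed to a wide registry call, as long as the std::wstring outlives the call.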

Filenames truncate to only show first character

I'm following this guide from MSDN on how to list the files in a directory (I'm using the current directory). In my case I need to put the information in the message part of my packet (a char array of size 1016) to send it to the client. When I print packet.message on both the client and the server, only the first character of each filename is shown. What's wrong? Here's a snippet of the relevant section of code:
WIN32_FIND_DATA f;
HANDLE h = FindFirstFile(TEXT("./*.*"), &f);
string file;
int size_needed;
do
{
sprintf(packet.message,"%s", &f.cFileName);
//Send packet
} while(FindNextFile(h, &f));

This is commonly caused by a wide character string being mistakenly treated as an ASCII string. The build is targeting UNICODE and cFileName contains a wide character string, but sprintf() assumes it is an ASCII string. Each ASCII-range character in the wide string is followed by a zero byte, which sprintf() takes as the end of the string, so only the first character is copied.
FindFirstFile() will be mapped to either FindFirstFileA() or FindFirstFileW() depending on whether the build targets UNICODE.
A solution would be to use FindFirstFileA() and ASCII strings explicitly.
Note that the & is not required in the sprintf():
sprintf(packet.message, "%s", f.cFileName);
As the application is consuming strings that are outside of its control (i.e file names) I would recommend using the safer _snprintf() to avoid buffer overruns:
/* From your comment on the question 'packet.message' is a 'char[1016]'
so 'sizeof()' will function correctly. */
if (_snprintf(packet.message, sizeof(packet.message), "%s", f.cFileName) > 0)
{
}
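The "first character only" symptom can be reproduced without any Windows API. The sketch below uses char16_t as a stand-in for the 16-bit wide characters in cFileName, and the function name narrowLengthOfUtf16 is made up. It measures how far a narrow string function gets before it meets a zero byte.

```cpp
#include <cstddef>
#include <cstring>

// Returns how many bytes a narrow string function such as strlen() or a
// "%s" conversion would see when handed UTF-16 text by mistake.
// The terminating code unit is two zero bytes, so the scan stays in bounds.
std::size_t narrowLengthOfUtf16(const char16_t* text) {
    return std::strlen(reinterpret_cast<const char*>(text));
}
```

On a little-endian machine such as x86, narrowLengthOfUtf16(u"file.txt") is 1: the bytes are 'f', 0, 'i', 0, and so on, so the scan stops right after the 'f'.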

You're almost certainly using the Unicode version of FindFirstFile. Either invoke the narrow version or change the format specifier of your print. Personally I would do the former:
WIN32_FIND_DATAA f;
HANDLE h = FindFirstFileA("./*.*", &f);
string file;
int size_needed;
if (h != INVALID_HANDLE_VALUE) // f is only filled in when the search succeeds
{
    do
    {
        sprintf(packet.message, "%s", f.cFileName);
        //Send packet
    } while (FindNextFileA(h, &f));
    FindClose(h);
}
Alternatively, you can compile with MBCS or regular characters.

As others have mentioned, you are calling the Unicode version of FindFirstFile() and are passing Unicode data to the ANSI sprintf() function. The %s specifier expects ANSI input. You have a few choices to address the issue in your code:
Continue using sprintf(), but change the %s specifier to %ls so it will accept Unicode input and convert it to ANSI when writing to your message buffer:
sprintf(packet.message, "%ls", f.cFileName);
This is not ideal, though, because it will use the ANSI encoding of the local machine, which may be different from the ANSI encoding used by the receiving machine.
Change your message buffer to use TCHAR instead of char, and then switch to either wsprintf() or _stprintf() instead of sprintf(). Like FindFirstFile(), they will match whatever character format TCHAR and TEXT() use:
TCHAR message[1016];
wsprintf(packet.message, TEXT("%s"), f.cFileName);
Or:
#include <tchar.h>
_TCHAR message[1016];
_stprintf(packet.message, _T("%s"), f.cFileName);
If you must use a char buffer, then you should accept Unicode data from the API and convert it to UTF-8 for transmission; the receiver can then convert it back to Unicode and use it as needed.
WIN32_FIND_DATAW f;
HANDLE h = FindFirstFileW(L"./*.*", &f);
if (h != INVALID_HANDLE_VALUE)
{
    do
    {
        // -1 converts the whole string, including its null terminator
        WideCharToMultiByte(CP_UTF8, 0, f.cFileName, -1, packet.message, sizeof(packet.message), NULL, NULL);
        //Send packet
    } while (FindNextFileW(h, &f));
    FindClose(h);
}

wostringstream, Ascii, Unicode, Win32 and integer concatenation to string

I am writing a library that uses Win32 APIs, and I would like to be able to compile it for both ASCII and Unicode (wide characters). I am generating an internal class name (read: WinAPI "class") by appending an integer to a string, to create unique class names for various windows functions.
The definitions of the variables used:
LPCTSTR lpszClassName; // This is char* if ASCII, wchar_t* if Unicode.
#ifdef UNICODE
std::wostringstream Convert;
#else
std::ostringstream Convert;
#endif
The function in question:
void Base::MakeClassName () {
#ifdef _DEBUG_
cerr << "Base::MakeClassName() called\n";
#endif
static int name_mod = 0;
name_mod++;
lpszClassName = TEXT("Win32WinNo");
Convert << lpszClassName << name_mod;
lpszClassName = Convert.str().c_str();
#ifdef _DEBUG_
cerr << "Generated class name = " << lpszClassName << "\n";
#endif
}
In ASCII, I get Generated class name = Win32WinNo1
In Unicode, I get a hex value. Which suggests to me the wide character wostringstream is not doing what I want. Either way, CreateWindow doesn't seem to like it (program hangs, if I debug it, it crashes.)
I am not 100% familiar with stringstream, and going by the limited documentation, it returns a 'string' object, but I need a pointer to a C style string for LPCTSTR, so thus, the Convert.str().c_str(). What I am getting is not working right, and If I try TEXT("Win32WinNo1") in my RegisterClass and CreateWindow calls, it works, but this returned string from above is junk.
What am I doing wrong? I am also concerned that it is not appending the integer to the string. Does wostringstream convert the integer to wchar_t?

ostringstream::str returns a copy of the string object currently associated with the string stream buffer. c_str points to a buffer internal to that temporary string. lpszClassName is a dangling pointer as soon as this temporary string goes out of scope.
This is probably the reason why your program crashes/hangs.
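One way to avoid the dangling pointer is to store the generated string itself and hand out c_str() from that stored copy. The sketch below is portable and uses a hypothetical ClassNameHolder class in place of the question's Base, with a member counter instead of the static one. It also shows that a wide string stream does format the integer as wide characters.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for the question's Base class: the name is stored by
// value, so the pointer from c_str() stays valid until the name is regenerated
// or the object is destroyed.
class ClassNameHolder {
public:
    void makeClassName() {
        std::wostringstream convert;
        convert << L"Win32WinNo" << ++counter_; // the int is formatted as wide text
        name_ = convert.str();                  // copy the temporary into a member
    }
    const wchar_t* className() const { return name_.c_str(); }

private:
    int counter_ = 0;
    std::wstring name_;
};
```

In a TCHAR build the same idea applies with std::basic_string<TCHAR> as the member type.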

You output to cerr, which is still a narrow stream. It will likely display the pointer value of lpszClassName and not the wide string it points to.

I think you need to surround your string-literals with _T() so they will be chars or wchar_ts depending on your UNICODE settings.
For example _T("Hello World").

TCHAR[], LPWSTR, LPTSTR and the GetWindowText function

So the GetWindowText is declared on MSDN as follows:
int GetWindowText(
HWND hWnd,
LPTSTR lpString,
int nMaxCount
);
However for the code to work we have to declare the second parameter as
TCHAR[255] WTitle;
and then call the function GetWindowText(hWnd,Wtitle,255);
The LPTSTR is a pointer to an array of tchar, so declaring LPTSTR is similar to declaring TCHAR[]? It doesn't work this way though.
When using TCHAR[] the program returns valid GetWindowText result (it is an integer equal to the number of symbols in the title). The question is : how can I get the exact title out of TCHAR[] ? Code like
TCHAR[255] WTitle;
cout<< WTitle;
or
cout<< *Wtitle;
returns numbers. How can I compare this with a given string?
TCHAR[4] Test= __T("TEST")
if (WTitle == Test) do smth
doesn't work also.

Wow, let's see where to start from.
First off, the declaration of WTitle needs to look like this:
TCHAR WTitle[255];
Next, if cout is not working right, it's because you are in Unicode mode, so you need to do this:
wcout << WTitle;
Or to fit better with the whole tchar framework, you can add this (actually, I'm surprised that this is not already part of tchar.h):
#ifdef _UNICODE
#define tcout wcout
#else
#define tcout cout
#endif
and then use:
tcout << WTitle;

OK, a few definitions first.
The 'T' types are definitions that will evaluate to either CHAR (single byte) or WCHAR (double-byte), depending upon whether you've got the _UNICODE symbol defined in your build settings. The intent is to let you target both ANSI and UNICODE with a single set of source code.
The definitions:
TCHAR title[100];
TCHAR * pszTitle;
...are not equivalent. The first defines a buffer of 100 TCHARs. The second defines a pointer to one or more TCHARs, but doesn't point it at a buffer. Further,
sizeof(title) == 100 (or 200, if _UNICODE symbol is defined)
sizeof(pszTitle) == 4 (size of a pointer in a 32-bit build; 8 in a 64-bit build)
If you have a function like this:
void foo(LPCTSTR str);
...you can pass either of the above two variables in:
foo(title); // passes in the address of title[0]
foo(pszTitle); // passes in a copy of the pointer value
OK, so the reason you're getting numbers is probably because you do have UNICODE defined (so characters are wide), and you're using cout, which is specific to single-byte characters. Use wcout instead:
wcout << title;
Finally, these won't work:
TCHAR[4] Test = __T("TEST") (the brackets go after the name, and the array needs 5 elements to hold the terminator: TCHAR Test[5] = __T("TEST");)
if (WTitle == Test) do smth (you're comparing pointers, use wcscmp or similar)

Short answer: Unless you're coding for Win98, use wchar_t instead of TCHAR and wcout instead of cout
Long version:
The TCHAR type exists to allow code to be compiled in multiple string modes, for example supporting both ASCII and Unicode. The TCHAR type will conditionally compile to the appropriate character type based on the setting.
All new Windows systems are Unicode based. When ASCII strings are passed to OS functions, they are converted to Unicode and then the real function is called. So it's best to just use Unicode throughout your application.

Use _tcscmp or a variant (which takes in the number of characters to compare). http://msdn.microsoft.com/en-us/library/e0z9k731.aspx
Like:
if (_tcscmp(WTitle, Test) == 0) {
// They are equal! Do something.
}
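For a build that is always wide, the same comparison can be written with std::wcscmp, which is what _tcscmp maps to in a Unicode build. A minimal portable sketch, with sameTitle as a made-up helper name:

```cpp
#include <cwchar>

// Compares the characters of two wide C strings. Writing `a == b` on the
// arrays or pointers would only compare their addresses.
bool sameTitle(const wchar_t* a, const wchar_t* b) {
    return std::wcscmp(a, b) == 0;
}
```

A wchar_t WTitle[255] buffer filled by the API can be passed directly as the first argument, since the array decays to a pointer to its first element.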

In C, wchar_t is a typedef for some integer type (unsigned short with Microsoft's compilers). In C++, it's required to be a separate type of its own, but older Microsoft compilers defaulted to using a typedef for it anyway; newer versions of Visual C++ treat it as a native type by default. To make it a separate type of its own on a compiler that doesn't, you need to use the /Zc:wchar_t compiler switch. Offhand, I don't know if that will entirely fix the problem though: I'm not sure if the library has real overloads for wchar_t as a native type to print those out as characters instead of short ints.
Generally speaking, however, I'd advise against messing with Microsoft's "T" variants anyway -- getting them right is a pain, and they were intended primarily to provide compatibility with 16-bit Windows anyway. Given that it's now been about 10 years since the last release in that line, it's probably safe to ignore it in new code unless you're really sure at least a few of your customers really use it.
