CEdit and GetWindowText in MFC - C++

I have added a simple CEdit control to my dialog and have an OnEnChangeEdit callback. I am trying to retrieve the text typed in the box, but the call to _cprintf below prints only the first character:
void MFCDlg::OnEnChangeEdit() {
CString s;
m_platformSliceOverrideEditBox.GetWindowText(s);
_cprintf("%s", s.GetString());
}
If it helps I am using the Unicode character set for compilation.

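_cprintf expects ANSI (char) strings. In a Unicode build, CString holds wide (UTF-16) characters, so for ordinary ASCII text the second byte of the first character is zero, and _cprintf treats that zero as the end of the string.
Use _tcprintf instead, with the format string wrapped in _T(). It maps to _cwprintf and expects wide strings when you build as Unicode.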
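The truncation can be reproduced without MFC. The following is a minimal, portable sketch rather than the poster's code: ansi_visible_length is a hypothetical helper that copies the raw bytes of a UTF-16 string and measures them the way a byte-oriented routine such as _cprintf would, up to the first zero byte. It assumes a little-endian machine, which is what Windows runs on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: how many bytes a char-based (ANSI) routine would
// read from UTF-16 text before it hits a zero byte.
std::size_t ansi_visible_length(const std::u16string& text) {
    // Copy the raw bytes, including the two-byte terminator.
    const std::string bytes(reinterpret_cast<const char*>(text.c_str()),
                            (text.size() + 1) * sizeof(char16_t));
    // strlen stops at the first zero byte, as "%s" does in a char-based printf.
    return std::strlen(bytes.c_str());
}

// Self-check; the expected value holds on little-endian machines only.
void check_ansi_visible_length() {
    // 'H' is stored as the bytes 0x48 0x00, so only one byte is visible.
    assert(ansi_visible_length(u"Hello") == 1);
}
```

One visible byte corresponds to the single character the question reports seeing.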

Related

Printing em-dash to console window using printf? [duplicate]

This question already has answers here:
Is it possible to cout an EM DASH on Linux and Windows? [duplicate]
(2 answers)
Closed 5 years ago.
A simple problem: I'm writing a chatroom program in C++ (but it's primarily C-style) for a class, and I'm trying to print, “#help — display a list of commands...” to the output window. While I could use two hyphens (--) to achieve roughly the same effect, I'd rather use an em-dash (—). printf(), however, doesn't seem to support printing em-dashes. Instead, the console just prints out the character, ù, in its place, despite the fact that entering em-dashes directly into the prompt works fine.
How do I get this simple Unicode character to show up?
Looking at Windows alt key codes, I find it interesting how alt+0151 is "—" and alt+151 is "ù". Is this related to my problem, or a simple coincidence?
Windows is a Unicode (UTF-16) system, and so is the console. If you want to print Unicode text, the most direct and effective way is to call WriteConsoleW:
BOOL PrintString(PCWSTR psz)
{
DWORD n;
return WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), psz, (ULONG)wcslen(psz), &n, 0);
}
PrintString(L"—");
In this case the binary contains the wide character — (a single 16-bit code unit, 0x2014), and the console prints it as is.
If an ANSI (multi-byte) function is used for console output, such as WriteConsoleA or WriteFile, the console first translates the multi-byte string to Unicode via MultiByteToWideChar, using the code page returned by GetConsoleOutputCP. This translation is where problems can appear if you use characters at or above 0x80.
First of all, the compiler can give you a warning: "The file contains a character that cannot be represented in the current code page (number). Save the file in Unicode format to prevent data loss." (C4819). But even after you save the source file in Unicode format, you can get the following:
wprintf(L"ù"); // no warning
printf("ù"); //warning C4566
L"ù" is stored in the binary as a wide-character string, as is, so there is no problem and no warning. But "ù" is stored as a char (single-byte) string. The compiler has to convert the character ù from the Unicode source file to a multi-byte string in the binary (the .obj file, from which the linker then creates the PE). For this the compiler uses WideCharToMultiByte with CP_ACP (the current system default Windows ANSI code page).
So what happens if you call printf("ù");?
First, at compile time, the Unicode string "ù" is converted to multi-byte by WideCharToMultiByte(CP_ACP, ..), and the resulting multi-byte string is stored in the binary.
Then, at run time, the console converts your multi-byte string to wide characters by MultiByteToWideChar(GetConsoleOutputCP(), ..) and prints that string.
So you get two conversions: Unicode -> (CP_ACP) -> multi-byte -> (GetConsoleOutputCP()) -> Unicode
By default GetConsoleOutputCP() == CP_OEMCP != CP_ACP, even if you run the program on the computer where you compiled it (and all the more so on another computer with a different CP_OEMCP).
The problem is that the two conversions are incompatible because they use different code pages. But even if you change the console code page to your CP_ACP, the conversion can still mistranslate some characters.
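The mismatch can be illustrated with a single byte. The sketch below is a portable illustration, not the console's actual code path: decode_byte is a hypothetical helper with a deliberately partial table, and it assumes the ANSI code page is Windows-1252 and the console's OEM code page is 437, which is a common US-English setup (the asker's actual code pages are not stated). Under those assumptions byte 0x97 (decimal 151) is the em dash in one table and ù in the other, which also matches the Alt+0151 versus Alt+151 observation in the question.

```cpp
#include <cassert>

// The two code pages assumed for this illustration.
enum class CodePage { Windows1252, Oem437 };

// Hypothetical, partial decoder: ASCII plus two bytes relevant here.
// Every other byte maps to U+FFFD (the replacement character).
char32_t decode_byte(unsigned char b, CodePage cp) {
    if (b < 0x80) return b;              // ASCII is the same in both pages
    if (cp == CodePage::Windows1252) {
        if (b == 0x96) return 0x2013;    // en dash
        if (b == 0x97) return 0x2014;    // em dash
    } else {
        if (b == 0x96) return 0x00FB;    // u with circumflex
        if (b == 0x97) return 0x00F9;    // u with grave
    }
    return 0xFFFD;
}

// Self-check: the same byte decodes to two different characters.
void check_decode_byte() {
    assert(decode_byte(0x97, CodePage::Windows1252) == 0x2014);
    assert(decode_byte(0x97, CodePage::Oem437) == 0x00F9);
}
```

The compiler writes 0x97 using the first table and the console reads it back using the second, so the em dash comes out as ù.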
As for the CRT function wprintf, the situation is as follows:
wprintf first converts the given string from Unicode to multi-byte using the CRT's current locale (note that the CRT locale is independent of, and can differ from, the console code page). It then calls WriteFile with the multi-byte string, and the console converts this multi-byte string back to Unicode:
Unicode -> (current CRT locale) -> multi-byte -> (GetConsoleOutputCP()) -> Unicode
So to use wprintf we first need to set the current CRT locale to GetConsoleOutputCP():
char sz[16];
sprintf(sz, ".%u", GetConsoleOutputCP());
setlocale(LC_ALL, sz);
wprintf(L"—");
Even so, on my computer this shows - on screen instead of —, so the output is -— if PrintString(L"—"); (which uses WriteConsoleW) is called right after it.
So the only reliable way to print any Unicode character (supported by Windows) is the WriteConsoleW API.
After going through the comments, I've found eryksun's solution to be the simplest (...and the most comprehensible):
#include <stdio.h>
#include <io.h>
#include <fcntl.h>
int main()
{
    //other stuff
    _setmode(_fileno(stdout), _O_U16TEXT); // switch stdout to UTF-16 text mode
    wprintf(L"#help — display a list of commands...");
    return 0;
}
Portability isn't a concern of mine, and this solves my initial problem—no more ù—my beloved em-dash is on display.
I acknowledge this question is essentially a duplicate of the one linked by sata300.de. Albeit, with printf in the place of cout, and unnecessary ramblings in the place of relevant information.

Can I specify the code page for an individual string variable in the `Watch1` window?

Visual Studio 2015, C++ language, debugging.
In the Watch1 window I look at the values of my string variables of the wchar_t* and char* types. The first is Unicode and the second is ANSI (CP_OEMCP code page). The Watch1 window displays the text of the wchar_t* variable correctly, but the text of the char* variable is unreadable. Can I specify the code page for an individual string variable in the Watch1 window? I want to see the values of both strings correctly there.
Maybe there is some syntax for such cases, similar to $err,hr (the text of the last error, as obtained via the GetLastError() function).
UPD (screenshot added)
The console window shows the right output, but in memory and in the Watch1 window I see an unreadable string for my ansiText variable.
The problem is that the original string (starting with hex values 8D A0 A6) is not in the Windows-1251 (Windows Cyrillic) code page, but in the OEM 866 code page. These two are different, and Visual Studio expects Windows-1251, because that is the system's code page (the code page used for non-Unicode applications).
It is not possible to specify a code page when you watch a string in debugger. Everything inside should be Unicode anyway, or at least UTF-8, and for those two there are format specifiers, su and s8. See MSDN for all format specifiers.
What you can do is have the following function integrated in the code, and when you want to see some non-ANSI (or non-CP_ACP, to be precise) string just call this function with the string and code page as parameters (but use the function only once in Watch window):
LPCWSTR ViewString(LPCSTR szString, UINT nCodePage)
{
static WCHAR szTemp[1024];
MultiByteToWideChar(nCodePage, 0, szString, -1, szTemp, 1024);
return szTemp;
}
So, in your case in Watch window instead of (char*)ansiText there would be ViewString(ansiText, 866). Also, note that this is not actually "ANSI text", but "OEM text".
I don't know what exactly your program is supposed to do, but I would convert all non-Unicode strings to Unicode at the earliest point in code (right where you get a non-Unicode string), and in your code always work just with Unicode strings. To convert OEM 866 string to Unicode you can use function MultiByteToWideChar with CodePage parameter = 866.
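To make the mismatch concrete, here is a minimal, portable sketch of what a conversion with code page 866 does to the bytes from the question. decode_cp866_letters is a hypothetical helper, not a Windows API. It covers only ASCII and the Cyrillic letter ranges of CP866 and maps every other byte to U+FFFD. In real code, MultiByteToWideChar with code page 866 does the full job.

```cpp
#include <cassert>
#include <string>

// Hypothetical, partial CP866 decoder: ASCII and the Cyrillic letters only.
std::u32string decode_cp866_letters(const std::string& bytes) {
    std::u32string out;
    for (unsigned char b : bytes) {
        if (b < 0x80) {
            out.push_back(b);                      // ASCII
        } else if (b <= 0xAF) {
            out.push_back(0x0410 + (b - 0x80));    // U+0410..U+043F
        } else if (b >= 0xE0 && b <= 0xEF) {
            out.push_back(0x0440 + (b - 0xE0));    // U+0440..U+044F
        } else {
            out.push_back(0xFFFD);                 // not covered by this sketch
        }
    }
    return out;
}

// Self-check with the three bytes quoted in the answer (8D A0 A6).
void check_decode_cp866_letters() {
    assert(decode_cp866_letters("\x8D\xA0\xA6") == U"\u041D\u0430\u0436");
}
```

Read as CP866, the three bytes are the Cyrillic letters U+041D, U+0430 and U+0436. Read as Windows-1251, the same bytes are unrelated characters, which is why the Watch window shows an unreadable string.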

QTextBrowser not displaying non-english characters

I'm developing a Qt GUI application to parse out a custom Windows binary file that stores Unicode text using wchar_t (default UTF-16 encoding). I've constructed a QString using QString::fromWCharArray and passed it to QTextBrowser::insertPlainText like this:
wchar_t *p = ; // pointer to a wchar_t string in the binary file
QString t = QString::fromWCharArray(p);
ui.logBrowser->insertPlainText(t);
The displayed text displays ASCII characters correctly, but non-ASCII characters are displayed as a rectangular box instead. I've followed the code in a debugger and p points to a valid wchar_t string and the constructed QString t is also a valid string matching the wchar_t string. The problem happens when printing it out on a QTextBrowser.
How do I fix this?
First of all, read the documentation for QString::fromWCharArray: depending on the system, the input is interpreted as UCS-4 or as UTF-16. What is the size of wchar_t on your platform?
Secondly, there is an alternative API: try QString::fromUtf16.
Finally, what kind of characters are you using: Hebrew, Cyrillic, Japanese, something else? Are you sure those characters are supported by the font you are using?
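The point about the size of wchar_t can be sketched without Qt. The helper below, to_utf16, is hypothetical and only illustrates the idea: where wchar_t is 2 bytes the code units are already UTF-16 and are copied unchanged, and where it is 4 bytes each code point above U+FFFF has to be split into a surrogate pair. The result is UTF-16, the kind of input QString::fromUtf16 expects.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: produce UTF-16 code units from a wchar_t string,
// whatever the platform's wchar_t size is.
std::u16string to_utf16(const std::wstring& in) {
    std::u16string out;
    for (wchar_t wc : in) {
        const char32_t cp = static_cast<char32_t>(wc);
        if (sizeof(wchar_t) == 2 || cp < 0x10000) {
            // Already a UTF-16 code unit, or a code point that fits in one.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // 4-byte wchar_t holding a code point above U+FFFF:
            // encode it as a high and a low surrogate.
            const char32_t v = cp - 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));
        }
    }
    return out;
}

// Self-check: the result is the same UTF-16 text on either kind of platform.
void check_to_utf16() {
    assert(to_utf16(L"\u0416") == u"\u0416");
    assert(to_utf16(L"\U0001F600") == u"\U0001F600");
}
```

If the file was written on Windows, its text is 2-byte UTF-16 regardless of the platform that reads it, so reading it through a 4-byte wchar_t pointer would misinterpret the data.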

Set Unicode text on MFC form controls in Multi-Byte Char Set application

I have a Multi-Byte Character Set MFC Windows application. Now I need to display international single-byte ASCII characters on Windows controls. I can't use these characters directly, because displaying them correctly requires the Windows locale to be set to the matching country, and I need them to display correctly under every Windows locale. For this purpose I must convert them to Unicode. I can display the required international characters with MessageBoxW, but how do I display them on MFC controls using SetWindowText?
To show a Unicode string in MessageBoxW I construct it in a wstring:
WORD g [] = {0x105,0x106,0x107,0x108,0x109,0x110,0x111,0x112,0x113,0x114,0x115,0x116,0x117,0x118,0x119,0x120};
wstring gg (reinterpret_cast<wchar_t*>(g),15);
MessageBoxW(NULL, gg.c_str() , gg.c_str() , MB_ICONEXCLAMATION | MB_OK);
Setting the MFC form control text:
class MyFrm: public CDialogEx
{
virtual BOOL OnInitDialog();
}
...
BOOL MyFrm::OnInitDialog()
{
GetDlgItem(IDC_EDIT_TICKET_NUMBER)->SetWindowText( ???);
}
Is it possible to convert the wstring gg to a CString somehow and show the Unicode characters on the window control?
You could take the HWND of your CDialogEx object and then explicitly call the wide-character Win32 API to set the text. Your code would look something like this:
BOOL MyFrm::OnInitDialog()
{
    CDialogEx::OnInitDialog();
    // m_hWnd is the dialog's window handle; the W suffix selects the wide-character API.
    ::SetDlgItemTextW(m_hWnd, IDC_EDIT_TICKET_NUMBER, gg.c_str());
    return TRUE;
}
Unicode is natively supported on Windows XP and every later version (and on the NT line before it), so using the ANSI APIs is really not a good idea unless you are targeting very old operating systems. Nowadays every ANSI string you pass is first converted to Unicode by the Win32 API. So it is a better idea to switch your project entirely to UNICODE.
First, note that you can simply directly initialize a std::wstring with your Unicode hex character data, without any ugly useless reinterpret_cast<wchar_t*>, etc.
Instead of this:
WORD g [] = {0x105,0x106,0x107,0x108,...,0x120};
wstring gg (reinterpret_cast<wchar_t*>(g),15);
just consider that:
wstring text = L"\x0105\x0106\x0108...\x0120";
The latter seems much cleaner to me.
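If the values do have to stay in an integer array, the cast can still be avoided. This is a small sketch under assumptions: make_text is a hypothetical function, and unsigned short stands in for the Windows WORD type so that the snippet is portable. The iterator-range constructor of std::wstring converts each element to wchar_t and takes the length from the array itself, so all 16 values are copied (the original call passed 15).

```cpp
#include <cassert>
#include <iterator>
#include <string>

// Hypothetical function: build the wide string from the code unit array
// used in the question, without reinterpret_cast.
std::wstring make_text() {
    // unsigned short stands in for WORD here (an assumption for portability).
    static const unsigned short g[] = {
        0x105, 0x106, 0x107, 0x108, 0x109, 0x110, 0x111, 0x112,
        0x113, 0x114, 0x115, 0x116, 0x117, 0x118, 0x119, 0x120};
    // Each element is converted to wchar_t; the range covers the whole array.
    return std::wstring(std::begin(g), std::end(g));
}

// Self-check: every array element ends up in the string.
void check_make_text() {
    const std::wstring text = make_text();
    assert(text.size() == 16);
    assert(text.front() == 0x105 && text.back() == 0x120);
}
```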
Second, if you want to pass an instance of std::wstring to an MFC method that expects a const wchar_t* input string pointer, just use the wstring::c_str() method.
In addition, the best suggestion I can give you is to port your app to Unicode.
ASCII/MBCS should be considered a programming model of the past for MFC; it brings lots of problems when you want to write "international" code.

How to view the value of a unicode CString in VC6?

I'm using Visual Studio 6 to debug a C++ application. The application is compiled for unicode string support. The CString type is used for manipulating strings. When I am using the debugger, the watch window will display the first character of the string, but will not display the full string. I tried using XDebug, but this tool does not handle unicode strings properly. As a work around, I can create a custom watch for each character of the string by indexing into the private array the CString maintains, but this is very tedious.
How can I view the full, unicode value of a CString in the VC6 debugger?
Go to tools->options->Debug, and check the "Display unicode string" check-box. That would probably fix the problem. Two other options:
In the watch window, if you have a Unicode string variable named szText, add it to the watch as szText,su. This will tell VS to interpret it as a Unicode string (See Symbols for Watch Variables for more of this sort).
If worst comes to worst, you can have a global ANSI string buffer and a global function that takes a Unicode CString and stores its content as ANSI in that global buffer. Then, when needed, call that function with the string whose content you'd like to see, and watch the ANSI buffer.
But the "Display unicode string" thing is probably the problem...