Unicode in wxWidgets


I'm creating a calculator application in C++ wxWidgets using Visual Studio 2019. I have created a custom button class that I want to use for all mathematical operations and symbols.
How can I set the button's label to √ instead of sqrt? When I try, a ? symbol is displayed instead. I also need to display these symbols in a wxTextCtrl, but when I do that I get the following error at compile time: (ignore App.razor, the picture is not mine)
Do I need to change the current character set from ASCII to Unicode? How do you do that?

For a single character, you can just use wxUniChar. You create a wxUniChar with a value in hexadecimal of the Unicode code point for the desired character. Since the Unicode code point of the square root character is U+221A, you can create a wxUniChar for this character like so:
wxUniChar c(0x221A);
wxUniChar is implicitly convertible to wxString, so (assuming wxWidgets was built in Unicode mode) you can use wxUniChar variables exactly as you would use a wxString. For example, you could do something like:
control->SetLabel(c);
or
dc.DrawText(c,0,0);

The answer by @New-Pagodi (sorry, I don't know how to tag people with spaces in their names) works, but simply saving your file in UTF-8 encoding, as MSVS proposes, is a much nicer solution. Even then, note that you still need to do one of the following: use wxString::FromUTF8("√"); explicitly set your locale encoding to UTF-8 with setlocale(), which is (finally) supported by recent Windows versions, in which case you can use just "√"; or use wide strings, i.e. L"√".
In other words, you must both have the correct bytes (e2 88 9a for the UTF-8 encoding of U+221A) in the file and use the correct encoding when creating a wxString from it if you're using char* strings. By default this encoding is not UTF-8 under Windows, so using just wxString("√") doesn't work.
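To see where the bytes e2 88 9a come from, here is a minimal sketch in plain C++ (no wxWidgets needed). The helper name encodeUtf8 is made up for illustration and it does no validation (surrogates and values above U+10FFFF are not rejected), so treat it as a way to check the byte values rather than as production code.

```cpp
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8 bytes.
// No validation is done; this only illustrates the bit layout.
std::string encodeUtf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Calling encodeUtf8(0x221A) gives the three bytes E2 88 9A, which is what a UTF-8 source file contains for √ and what wxString::FromUTF8 expects to receive.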

Related

Playing cards Unicode printing in C++

According to this wiki link, the playing cards have Unicode code points of the form U+1F0A1.
I wanted to create an array in C++ to store the 52 standard playing cards, but I noticed that this code point is longer than 2 bytes.
So my simple example below does not work. How do I store a Unicode character that is longer than 2 bytes?
wchar_t t = '\u1f0a1';
printf("%lc",t);
The above code truncates t to \u1f0a

How do I store a Unicode character that is longer than 2 bytes?
You can use char32_t with the prefix U, but there is no way to print it to the console. Besides, you don't need char32_t at all; UTF-16 is enough to encode that character (as a surrogate pair, so in a string rather than in a single 16-bit wchar_t). For a character that fits in one unit, such as wchar_t t = L'\u2660', you need the prefix L to specify that it is a wide character.
If you are using Windows with the Visual C++ compiler, I recommend the following:
Save your source file with UTF-8 encoding.
Set the compiler option /utf-8 (reference here).
Use a console that supports UTF-8, such as Git Bash, to see the result.

On Windows, wchar_t stores a UTF-16 code unit, so you have to store your string as UTF-16 (using a string literal with a prefix). This doesn't help you either, since the Windows console can only output characters up to 0xFFFF. See this:
How to use unicode characters in Windows command line?
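As a rough sketch of what "store your string as UTF-16" means for a code point above 0xFFFF, the following plain C++ splits U+1F0A1 into its surrogate pair. The function name encodeUtf16 is an assumption for illustration, and it does not reject invalid input.

```cpp
#include <string>

// Hypothetical helper: encode one code point as UTF-16 code units.
// Code points above 0xFFFF become a high/low surrogate pair.
std::u16string encodeUtf16(char32_t cp) {
    std::u16string out;
    if (cp < 0x10000) {
        // Fits in a single 16-bit code unit.
        out += static_cast<char16_t>(cp);
    } else {
        // Subtract 0x10000, then split the remaining 20 bits in half.
        char32_t v = cp - 0x10000;
        out += static_cast<char16_t>(0xD800 + (v >> 10));   // high surrogate
        out += static_cast<char16_t>(0xDC00 + (v & 0x3FF)); // low surrogate
    }
    return out;
}
```

For U+1F0A1 this yields the two units 0xD83C 0xDCA1, which is why a single 16-bit wchar_t cannot hold the card, while a string (or the literal u"\U0001F0A1") can. U+2660 still fits in one unit.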

C++: Qt 5.3 fails to display UTF-8 character

I am trying to display a unicode character (Euro sign) on a button using Qt and C++ in Visual Studio 2013. I tried the following code:
_rotateLeftButton->setText("\u20AC");
and
_rotateLeftButton->setText("€");
and
_rotateLeftButton->setText(QString::fromUtf8("\u20AC"));
and
_rotateLeftButton->setText(QString::fromUtf8("€"));
However, all of those lines result in the following:
All my code files are UTF-8 encoded, except for the moc files (.cxx). For some reason the moc executable does not generate them as Unicode. Still, I was not able to get this Unicode symbol displayed correctly. I also tried setting a font other than the default one, without success. Does anyone know what the problem could be?
Thank you for your help.

QString::fromUtf8("€")
This will work if the file really is handled as UTF-8. As @n.m. commented, VS requires some help from a faux-BOM to ensure this.
QString::fromUtf8("\u20AC")
\u doesn't make sense in a byte string literal. You could spell it using \x byte escapes for the UTF-8 encoded version:
QString::fromUtf8("\xE2\x82\xAC")
Or use a wide string literal:
QString::fromWCharArray(L"\u20AC")
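If you want to double-check that the \x escapes really spell the Euro sign, a small decoder in plain C++ can help (no Qt needed). This is only a sketch: decodeFirstUtf8 is a made-up name, it reads just the first code point, and it does not verify continuation bytes.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decode the first code point of a UTF-8 string.
// Returns U+FFFD for empty, truncated, or obviously invalid input.
char32_t decodeFirstUtf8(const std::string& s) {
    if (s.empty()) return 0xFFFD;
    unsigned char b0 = static_cast<unsigned char>(s[0]);
    std::size_t len = 0;
    char32_t cp = 0;
    if (b0 < 0x80)                { len = 1; cp = b0; }
    else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
    else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
    else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
    else return 0xFFFD;
    if (s.size() < len) return 0xFFFD;
    // Each continuation byte contributes its low 6 bits.
    for (std::size_t i = 1; i < len; ++i) {
        cp = (cp << 6) | (static_cast<unsigned char>(s[i]) & 0x3F);
    }
    return cp;
}
```

decodeFirstUtf8("\xE2\x82\xAC") comes out as 0x20AC, so the byte-escaped literal is the UTF-8 form of the character that \u20AC names.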

what locale does wstring support?

In my program I used wstring to print the text I needed, but it gave me random ciphers (due to a different encoding scheme). For example, I have this block of code.
wstring text;
text.append(L"Some text");
Then I use DirectX to render it on screen. I used to use wchar_t, but I heard it has portability problems, so I switched to wstring. wchar_t worked fine, but as far as I can tell it only accepted English characters (the output simply ignored any non-English characters entered), which was fine until I switched to wstring: then I only got random ciphers that looked like Chinese and Korean mixed together. Interestingly, my computer's locale for non-Unicode text is Chinese. Based on what I saw, I suspected that it would render Chinese characters correctly, so I tried, and it does display the character correctly but with a square in front (which is still somewhat incorrect). I then guessed that the encoding might depend on the language locale, so I switched the locale to English (US) (I use Windows 8) and restarted. My Chinese test character in the source file had turned into random stuff (my file is not saved in a Unicode format, since all the text is English). I then tried with English characters, but no luck: the display looked exactly the same and had nothing to do with the locale. I don't understand why it doesn't display correctly and looks like Asian characters even when I use the English locale.
Is there some conversion that should be done, or should I save my file in a different encoding? The problem is that I want to display English characters correctly, which is the default.

In the absence of code that demonstrates your problem, I will give you a correspondingly general answer.
You are trying to display English characters, but see Chinese characters. That is what happens when you pass 8-bit ANSI text to an API that expects UTF-16 text. Look for somewhere in your program where you cast from char* to wchar_t*.
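To illustrate why English text turns into Chinese-looking characters, here is a small sketch in plain C++ that mimics what a UTF-16 API sees when it is handed 8-bit text through a cast. The function names are invented for this example, and little-endian byte order (as on Windows) is assumed.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: fuse each pair of bytes into one 16-bit unit,
// which is effectively what a char* -> wchar_t* cast does on a
// little-endian machine with 16-bit wchar_t.
std::u16string misreadAsUtf16(const std::string& bytes) {
    std::u16string out;
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        unsigned lo = static_cast<unsigned char>(bytes[i]);
        unsigned hi = static_cast<unsigned char>(bytes[i + 1]);
        out += static_cast<char16_t>(lo | (hi << 8));
    }
    return out;
}

// Hypothetical helper: the correct conversion for plain ASCII text,
// widening each byte to its own 16-bit unit.
std::u16string widenAscii(const std::string& bytes) {
    std::u16string out;
    for (unsigned char c : bytes) {
        out += static_cast<char16_t>(c);
    }
    return out;
}
```

With the input "Some", misreadAsUtf16 produces the units 0x6F53 and 0x656D, both of which fall in the CJK Unified Ideographs range (U+4E00 to U+9FFF), while widenAscii gives back the four expected letters.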

First of all, what type of file are you trying to store the text in? Normal .txt files are stored in ANSI by default (so are Excel files). So when you try to write a Unicode character to an ANSI file, it will print junk. There are two ways of overcoming this problem:
Open the file in UTF-8 or UTF-16 mode and then write.
Convert Unicode to ANSI before writing to the file. If you are using Windows, MSDN documents an API for converting Unicode to ANSI and vice versa. If you are using Linux, search for Unicode-to-ANSI conversion; there are a lot of solutions out there.
Hope this helps!

std::wstring does not have any locale/internationalisation support at all. It is just a container for storing sequences of wchar_t.
The problem with wchar_t is that its encoding is unspecified. It might be Unicode UTF-16, or Unicode UTF-32, or Shift-JIS, or something completely different. There is no way to tell from within a program.
You will have the best chances of getting things to work if you ensure that the encoding of your source code is the same as the encoding used by the locale under which the program will run.
But, the use of third-party libraries (like DirectX) can place additional constraints due to possible limitations in what encodings those libraries expect and support.

Bug solved: it turned out to be a casting problem (not a rendering problem, as I previously said).
The bugged text is an intermediate product of an internal conversion using wstringstream (which I forgot to mention). The code is as follows:
wstringstream wss;
wstring text;
textToGenerate.append(L"some text");
wss << timer->getTime();
text.append(wss.str());
Right after this step the debugger shows the text as a bunch of random stuff, but later it somehow converts back and is readable. The actual problem appears at the rendering stage with DirectX: I had left in a cast to wchar_t*, which caused the incorrect rendering.
old:
LPCWSTR lpcwstrText = (LPCWSTR)textToDraw->getText();
new:
LPCWSTR lpcwstrText = (*textToDraw->getText()).c_str();
Making that change solves the problem.
So this was caused by a bad cast, as some kind people pointed out when correcting my statement.
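For anyone hitting the same thing, here is a minimal sketch of the difference in plain C++. TextItem and toWidePointer are hypothetical stand-ins, assuming (as the fixed line implies) that getText() returns a pointer to a std::wstring.

```cpp
#include <string>

// Hypothetical stand-in for the object being drawn; getText() is
// assumed to return a pointer to a std::wstring.
struct TextItem {
    std::wstring text;
    const std::wstring* getText() const { return &text; }
};

// Get a pointer to the characters that a wide-character API can read.
const wchar_t* toWidePointer(const TextItem& item) {
    // Wrong: (const wchar_t*)item.getText() would reinterpret the
    // std::wstring object itself (its internal fields) as characters.
    // Right: ask the string for its null-terminated buffer.
    return item.getText()->c_str();
}
```

The C-style cast compiles without complaint because it silently reinterprets the pointer, which is why the bug only shows up at rendering time.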

Unicode character for superscript shows a square box: ࠚ

Using the following code to create a Unicode string:
wchar_t HELLO[20];
wsprintf(HELLO, TEXT("%c"), 0x2074);
When I display this onto a Win32 Control like a Text box or a button it gets mapped to a [] Square.
How do I fix this ?
I tried compiling with both Eclipse(MinGW) and Microsoft Visual C++ (2010).
Also, UNICODE is defined at the top
Edit:
I think it might be something to do with my system, because when I visit: http://en.wikipedia.org/wiki/Unicode_subscripts_and_superscripts
some of the unicode characters don't appear.

The font you are using does not contain a glyph for that character. You will likely need to install some new fonts to overcome this deficiency.
The character you have picked out is 'SAMARITAN MODIFIER LETTER EPENTHETIC YUT' (U+081A). Perhaps you were after 'SUPERSCRIPT FOUR' (U+2074). You need hex for that: 0x2074.
Note that you changed the question to read 0x2074, but the original version read 2074. Either way, if you see a box, that indicates your font is missing that glyph.
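A quick way to check which code point a number actually names is to print it in the usual U+XXXX form. The sketch below uses only the standard library; formatCodePoint is a name invented for this example.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: format a numeric value as a Unicode code point
// label, e.g. 0x2074 -> "U+2074".
std::string formatCodePoint(unsigned long cp) {
    char buf[16];
    std::snprintf(buf, sizeof buf, "U+%04lX", cp);
    return buf;
}
```

formatCodePoint(2074) gives "U+081A" (the Samaritan letter), while formatCodePoint(0x2074) gives "U+2074" (superscript four), which makes the decimal versus hexadecimal mix-up easy to spot.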

The characters you are getting from Wikipedia are expressed in hexadecimal, so your code should be:
wchar_t HELLO[20];
wsprintf(HELLO, TEXT("%c"), (wchar_t)0x2074); // or TEXT('\x2074')
If it still doesn't work, it's a font problem; if you need a pan-Unicode font, it seems that Code2000 is one of the most complete out there.
Funny fact: the character that has the decimal code 2074 (i.e. hex 81a) seems to actually be a box (or it's such a strange beast that even the image outline at FileFormat.Info is wrong). :)
For the curious ones: it turns out that 0x081a is this thing:

Does Visual Studio 2010 Supports C++ Source Code in Unicode with Unicode Char in String Literal

I want to directly embed non-ASCII Unicode characters in string literals and use them in printf. This implies that my source code must be saved in UTF-8 or UTF-16. Visual Studio 2010 does support editing and saving C++ source files in either format, but when compiled and executed, the program does not produce the correct Unicode characters. Does the compiler support string literals with embedded Unicode characters?
e.g.
wprintf(L" chinese characters:中文字\n"); // the trailing Chinese characters cannot be displayed

I don't have a Chinese version of Windows to test with, so this is complete speculation.
The console and file output functions are aware that files are not coded in UTF-16, so they attempt to convert the characters to a code page before output. Just as the default locale is "C" rather than anything based on your system settings, so too the default code page is probably an inappropriate one that does not include Chinese characters.
There is a function SetConsoleOutputCP to change the code page for the console. It is not clear if this function changes the code page used by the actual console window, or if it only affects conversions from Unicode within the program.

The easy way to test wide literals is to skip the formatting part of printf, and give your string straight to the OS: WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), L" chinese characters:中文字", ....

It's possible that #pragma setlocale may be what you need.
