C++: Qt 5.3 fails to display UTF-8 character

I am trying to display a unicode character (Euro sign) on a button using Qt and C++ in Visual Studio 2013. I tried the following code:
_rotateLeftButton->setText("\u20AC");
and
_rotateLeftButton->setText("€");
and
_rotateLeftButton->setText(QString::fromUtf8("\u20AC"));
and
_rotateLeftButton->setText(QString::fromUtf8("€"));
However, all of those lines result in the following:
All my code files are UTF-8 encoded, except for the moc files (.cxx). For whatever reason, the moc executable does not generate them as Unicode. Still, I was not able to get this Unicode symbol displayed correctly. I also tried setting a font other than the default one, without success. Does anyone know what the problem could be?
Thank you for your help.

QString::fromUtf8("€")
Will work if the file really is handled as UTF-8. As @n.m. commented, VS requires some help from a faux-BOM to ensure this.
QString::fromUtf8("\u20AC")
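\u in a narrow (byte) string literal is converted to the compiler's execution character set, which for Visual Studio 2013 is the system ANSI code page rather than UTF-8, so fromUtf8 receives the wrong bytes. You could instead spell out the UTF-8 encoded version using \x byte escapes: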
QString::fromUtf8("\xE2\x82\xAC")
Or use a wide string literal:
QString::fromWCharArray(L"\u20AC")
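To check where the \xE2\x82\xAC bytes come from, here is a minimal sketch in plain C++ (no Qt) that encodes a code point as UTF-8 by hand. The helper name encodeUtf8 is hypothetical and not part of Qt or the standard library, and the sketch assumes a valid code point (it does not reject surrogates or values above U+10FFFF).

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not a Qt API): encode one Unicode code point as UTF-8.
// Assumes cp is a valid scalar value; no surrogate or range checks in this sketch.
std::string encodeUtf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Calling encodeUtf8(0x20AC) yields the three bytes E2 82 AC, which is exactly the sequence passed to QString::fromUtf8 in the \x form above.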

Related

Printing ASCII code in C ++ (Visual studio not recognizing encoding)

I'm trying to make a program which prints ASCII art in the console with characters such as ⣿, but when running, the program just prints question marks (?). I understand that it's either because I am using the wrong encoding or because Microsoft Visual Studio does not have the dictionary of these ASCII characters.
If you have any idea how to either change the encoding or fix the issue, it would be much appreciated.

Possible solutions:
Try to change the source file encoding to UTF-8 without signature or UTF-8 with signature.
Try to use wchar_t literal, i.e. std::wcout << L"Your String";.
Learn more:
how to change source file encoding in csharp project (visual studio / msbuild machine)? (Also applies to C++)
What does the 'L' in front a string mean in C++?

There is not a problem with your code but rather a problem with the console that shows your output. It does not show Unicode characters correctly. In order to show these characters correctly, it needs to recognize Unicode and use a font that actually has those characters. To verify this, simply open a cmd window, copy/paste the character into it and see what happens.

QML word wrap with nbsp on Linux

I have the following problem:
When I build my application on Windows QML texts do actually wrap correctly with respect to the nbsp character (U+00A0 I think). On my Raspberry Pi with Raspbian however, it seems that the nbsp is ignored and the text is wrapped as if it was just a normal space.
There are several things that may have some importance here:
On Windows I have Qt 5.4, whereas on the Raspberry Pi there is 5.2.
I think it may have something to do with encoding. The thing is, I remember it worked before I forced the G++ compiler on the Pi to take the input files as CP1250 (I added QMAKE_CXXFLAGS += -finput-charset=CP1250 to the project file). I had to make this tweak because of the diacritics in some of the string literals (otherwise the texts are absolutely broken on the Raspberry). So as I said, I think the word wrap worked before I changed this compiler switch.
But still, nothing else is displayed incorrectly; the only problem is that the texts get broken where they shouldn't. Note that no "random" character appears there, just a regular space. That's very strange, as it looks like the problem is not with the encoding but rather with the word wrapping algorithm itself. But as I said, it used to work when the compiler assumed the string literals were in whatever the default on Linux is (UTF-8, I guess...).
As for the QML Text assignment these strings are taken from C array and assigned to the QML text using QObject::setProperty if that is of any importance...
Also note that I probably cannot change the encoding of my sources to UTF-8 because the file with the strings is shared also for some embedded project that works on the other side of the communication and this one has to be CP1250 because of the IDE.
Thanks in advance
EDIT:
I have some additional information: If I go through one of the affected string literals on Windows, it is in fact shorter than the same literal compiled on Raspberry, even when the source encoding is set to CP1250. For example the nbsp is encoded in only one byte on Windows (160d), but it is two bytes on Raspberry (194d,160d). That's strange, isn't it? I'd expect that after explaining g++ that the source code is encoded in CP1250, it should encode the literals in the same way? Or maybe not because this is then encoding of the string in the memory which is different by default on both Windows and Linux. But still I don't see where's the problem.

As suggested by Kevin Krammer,
QString::fromLocal8Bit()
was the solution.
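The byte counts from the EDIT above (160 on Windows versus 194, 160 on the Raspberry) are consistent with the literal being re-encoded as UTF-8 on the Pi, since GCC's default execution character set is UTF-8 and -finput-charset only describes the source file. A minimal sketch of that re-encoding for one byte follows; the helper name latin1ByteToUtf8 is hypothetical, and Latin-1 is used as a stand-in for CP1250, which is only an assumption that holds for bytes both code pages share, such as the nbsp at 0xA0.

```cpp
#include <string>

// Hypothetical helper: re-encode a single Latin-1 byte as UTF-8.
// Assumption: Latin-1 stands in for CP1250 here. Both place the no-break
// space at 0xA0, but many other bytes above 0x7F differ between the two.
std::string latin1ByteToUtf8(unsigned char b) {
    std::string out;
    if (b < 0x80) {
        // ASCII range is identical in UTF-8.
        out += static_cast<char>(b);
    } else {
        // U+0080..U+00FF take two bytes: 110xxxxx 10xxxxxx.
        out += static_cast<char>(0xC0 | (b >> 6));
        out += static_cast<char>(0x80 | (b & 0x3F));
    }
    return out;
}
```

For the nbsp, latin1ByteToUtf8(0xA0) gives the two bytes 0xC2 0xA0, i.e. 194 and 160 in decimal. A decoder that assumes one byte per character sees two characters there, while QString::fromLocal8Bit() decodes using the system locale, which would explain why it fixed the wrapping if the Pi's locale is UTF-8.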

Translate Unicode Literal in Qt 5.3

In a Qt 5.3 application, I have a string literal that contains non-ASCII characters (specifically German Umlauts) that will need to be translated into foreign languages. So I have two issues: (1) I have to mark that literal with tr() and (2) I have to display the string correctly on the screen for which I would seem to have to use QString::fromLatin1() or some such function.
If I do
QString s = tr("ä");
the string is marked for translation but will not display right.
If I do
QString r = QString::fromLatin1("ä");
the string will display right but will not be marked for translation.
How can I combine the two into one? And yes, my source file is saved in UTF8 encoding.
I've been searching up and down the forums and none of the hints work, mainly because most of the solutions apply to Qt 4.8 and have been removed or deprecated in Qt 5.3. Thank you for your help!!
PS: I'm developing using Visual Studio 2010 on Windows 8. According to VS2010 and Notepad++ my sources are saved in UTF8 with BOM encoding.

If QString::fromLatin1("ä") gives you the correct output, then your source files are not UTF-8 encoded.
When a source file containing
printf("%x\n", QString("ä").at(0).unicode());
printf("%x\n", QString::fromLatin1("ä").at(0).unicode());
is UTF-8 encoded, the output is
e4
c3
but when it is Latin-1 (ISO-8859-1) encoded, the output is
fffd
e4
e4 is the Unicode code point of the letter ä (U+00E4).
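The c3 and fffd values can be reproduced without Qt. The sketch below is an illustration under stated assumptions, not what QString does internally: decodeLatin1 and utf8SequenceLength are hypothetical helpers. The first maps each byte to one code point, the second reports how many bytes a UTF-8 lead byte announces.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: decode bytes as Latin-1, where every byte is one code point.
std::vector<std::uint32_t> decodeLatin1(const std::string& bytes) {
    std::vector<std::uint32_t> codePoints;
    for (char c : bytes) {
        codePoints.push_back(static_cast<unsigned char>(c));
    }
    return codePoints;
}

// Hypothetical helper: length of the UTF-8 sequence announced by a lead byte,
// or 0 if the byte cannot start a sequence (continuation or invalid byte).
int utf8SequenceLength(unsigned char lead) {
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0;
}
```

In a UTF-8 file, ä is stored as the bytes C3 A4, so decoding them as Latin-1 gives two characters and the first one is 0xC3. In a Latin-1 file, ä is the single byte E4; read as UTF-8, E4 announces a three-byte sequence whose continuation bytes are missing, which is why a UTF-8 decoder substitutes U+FFFD.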

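Read the documentation of trUtf8 (deprecated/obsolete in Qt 5).
In Qt 4 you don't have to use this function; just set the proper default codec. Add this line in main:
QTextCodec::setCodecForTr(QTextCodec::codecForName("UTF-8"));
If you prefer to avoid changing the default codec, just use trUtf8 instead of tr. Note that QTextCodec::setCodecForTr was removed in Qt 5, where tr() already expects UTF-8 string literals.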

what locale does wstring support?

In my program I used wstring to print out the text I needed, but it gave me garbled characters (the kind caused by a mismatched encoding scheme). For example, I have this block of code.
wstring text;
text.append(L"Some text");
Then I use DirectX to render it on screen. I used to use wchar_t, but I heard it has portability problems, so I switched to wstring. wchar_t worked fine, but as far as I can tell it only accepted English characters (the output just ignored any non-English characters entered), which was fine until I switched to wstring: I only got random characters that looked like Chinese and Korean mixed together. Interestingly, my computer's locale for non-Unicode text is Chinese. Based on what I saw, I suspected that it would render Chinese characters correctly, so I tried that, and it does display the character correctly but with a square in front (which is still an incorrect display). I then guessed the encoding might depend on the language locale, so I switched the locale to English (US) (I use Windows 8) and restarted. The Chinese test character in my source file turned into random stuff (my file is not saved in a Unicode format, since all the text is English). Then I tried English characters, but no luck: the display was exactly the same and seems to have nothing to do with the locale. I don't understand why it doesn't display correctly and looks like Asian characters (even with the English locale).
Is there some conversion that should be done, or should I save my file in a different encoding? The problem is that I want to display English characters correctly, which is the default.

In the absence of code that demonstrates your problem, I will give you a correspondingly general answer.
You are trying to display English characters, but see Chinese characters. That is what happens when you pass 8 bit ANSI text to an API that receives UTF-16 text. Look for somewhere in your program where you cast from char* to wchar_t*.
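To see why English text turns into Chinese-looking characters, the following minimal sketch pairs up narrow bytes the way a little-endian UTF-16 reader would after such a cast. All three helper names are hypothetical. Unlike a real bad cast, this version stops at the end of the string instead of reading past it.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: combine pairs of narrow bytes into little-endian
// UTF-16 code units, mimicking what a (wchar_t*) cast exposes on Windows.
std::vector<std::uint16_t> misreadAsUtf16LE(const std::string& bytes) {
    std::vector<std::uint16_t> units;
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        const unsigned lo = static_cast<unsigned char>(bytes[i]);
        const unsigned hi = static_cast<unsigned char>(bytes[i + 1]);
        units.push_back(static_cast<std::uint16_t>(lo | (hi << 8)));
    }
    return units;
}

// True if the code unit falls in the CJK Unified Ideographs block (U+4E00..U+9FFF).
bool isCjkUnifiedIdeograph(std::uint16_t unit) {
    return unit >= 0x4E00 && unit <= 0x9FFF;
}

// Hypothetical helper: a correct conversion for pure ASCII input, widening
// each character instead of reinterpreting the buffer.
std::wstring widenAscii(const std::string& s) {
    return std::wstring(s.begin(), s.end());
}
```

For the input "Some", the bytes 53 6F 6D 65 pair up into the code units 0x6F53 and 0x656D, both of which land in the CJK Unified Ideographs block. widenAscii("Some") instead produces the four expected wide characters; for non-ASCII input a real conversion function would be needed.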

First of all, what type of file are you trying to store the text in? Normal txt files are stored in ANSI by default (so are Excel files). So when you try to print a Unicode character to an ANSI file, it will print junk. Two ways of overcoming this problem are:
try to open the file in UTF-8 or UTF-16 mode and then write
convert Unicode to ANSI before writing to the file. If you are using Windows, then MSDN documents a specific API for Unicode to ANSI conversion and vice versa. If you are using Linux, then Google for conversion of Unicode to ANSI. There are lots of solutions out there.
Hope this helps!!!

std::wstring does not have any locale/internationalisation support at all. It is just a container for storing sequences of wchar_t.
The problem with wchar_t is that its encoding is unspecified. It might be Unicode UTF-16, or Unicode UTF-32, or Shift-JIS, or something completely different. There is no way to tell from within a program.
You will have the best chances of getting things to work if you ensure that the encoding of your source code is the same as the encoding used by the locale under which the program will run.
But, the use of third-party libraries (like DirectX) can place additional constraints due to possible limitations in what encodings those libraries expect and support.

Bug solved: it turned out to be a casting problem (not a rendering problem, as I said previously).
The bugged text is an intermediate product of an internal conversion process using wstringstream (which I forgot to mention). The code is as follows:
wstringstream wss;
wstring text;
text.append(L"some text");
wss << timer->getTime();
text.append(wss.str());
Right after this process the debugger shows the text as a bunch of random stuff, but later it somehow converts back and becomes readable. The actual problem appears at the rendering stage using DirectX: I had left in a cast to wchar_t*, which resulted in the incorrect rendering.
old:
LPCWSTR lpcwstrText = (LPCWSTR)textToDraw->getText();
new:
LPCWSTR lpcwstrText = (*textToDraw->getText()).c_str();
Changing that solves the problem.
So this was caused by a bad cast, as some kind people pointed out when correcting my statement.

Does Visual Studio 2010 Supports C++ Source Code in Unicode with Unicode Char in String Literal

I want to directly embed non-ASCII Unicode characters in string literals and use them in printf. This implies my source codes must be saved in utf-8 or utf-16. Visual Studio 2010 does support editing and saving C++ source files in either format. But when compiled & executed, it does not produce the correct unicode characters. Does the compiler support string literals with unicode characters embedded?
e.g.
wprintf(L" chinese characters:中文字\n"); // the trailing Chinese characters cannot be displayed

I don't have a Chinese version of Windows to test with, so this is complete speculation.
The console and file output functions are aware that files are not coded in UTF-16, so they attempt to convert the characters to a code page before output. Just as the default locale is "C" rather than anything based on your system settings, so too the default code page is probably an inappropriate one that does not include Chinese characters.
There is a function SetConsoleOutputCP to change the code page for the console. It is not clear if this function changes the code page used by the actual console window, or if it only affects conversions from Unicode within the program.

The easy way to test wide literals is to skip the formatting part of printf, and give your string straight to the OS: WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), L" chinese characters:中文字", ....

It's possible that #pragma setlocale may be what you need.
