How do you print unicode text to an output file? - c++

I'm writing a C++ program in Visual Studio for class. I am using certain Unicode characters within my program like:
╚, █, ╗, ╝, & ║
I have figured out how to print these characters onto the console properly but I have yet to find a way to output it to a file properly.
In Visual Studio, choosing [OEM United States - Codepage 437] encoding when saving the .cpp file allows it to display properly onto the console.
Now I just need a way to output these characters to a file without errors.
Hopefully someone knows how. Thank You!

Create the file using a wofstream, which uses wide (wchar_t) characters instead of an ofstream (which uses char).

Related

Printing ASCII code in C ++ (Visual studio not recognizing encoding)

I'm trying to make a xy program which prints ASCII art in the console with chracters such as ⣿, when running the program just prints question marks (?). I understand that its either because of me using the wrong encoding or Microsoft Visual Studio not having the dictionary of these ASCII Characters.
If you have any idea on how to either change encoding or fixing the isue ,it would be much appreciated
Possible solutions:
Try to change the source file encoding to UTF-8 without signature
or UTF-8 with signature.
Try to use wchar_t literal, i.e. std::wcout << L"Your String";.
Learn more:
how to change source file encoding in csharp project (visual studio / msbuild machine)? (Also applies to C++)
What does the 'L' in front a string mean in C++?
There is not a problem with your code but rather a problem with the console that shows your output. It does not show unicode character correctly. In order for it to show these characters correctly it need to recognize unicode and use a font that actually have those characters. To verify this, simple open a cmd window and copy/paste the character into it and see what heppens.

c++ visual studio code appears weird on windows command line?

opening on windows
opening on powershell
I had the problem of exporting my c++ files from visual studio to my school server/folder, where I would use powershell to open and run the files on the command line. The code is all spaced out and weird font when I open them on file, and it appears as strange characters when I open them on the command line. This causes the code to not run at all.
How do I fix this issue?
edit: I have added some pictures for better reference
This may be because the file is encoded UTF-8 but being read as ANSI or vice-versa (or some other mismatch of encodings). Try navigating to the files directly in powershell, i.e.
cd C:\Users\username\source\repos\projectname\projectname
if you are using the default path, and open a file with notepad then click 'Save as' and check the encoding (left of save button). The default indicates what encoding is being used, try changing to UTF-8 or ANSI - whichever the default is not. If that doesn't work you can also try UTF-16 and UTF-32 (which I believe are listed as Unicode and Unicode Big Endian in notepad, but I haven't verified that).
In visual studio, per this article, you can do this from the save dialog by going to File > Save As and in the Save As dialog you click the down arrow next to Save and select Save with encoding... The default appears to be code 1252, I would recommend trying UTF-8 first and see if that works.
What you have is an encoding problem. The first file starts with Unicode byte order mark ÿþ. That is, UTF-16 little endian. Because UTF-16 uses two bytes for each character and your characters are from ASCII subset, each other byte is 00 - which is rendered as extra spaces.
The second file is harder to dechipher, as Nano doesn't render the characters properly. I'd guess it has exactly the same problem - UTF-16.
It seems that some version of Visual Studio ninja-changed default file encoding as UTF-16.
As how to fix the situation, save the files in ASCII or UTF8 encoding on your Windows system, then upload those just like #Ghost adviced.

C++: Qt 5.3 fails to display UTF-8 character

I am trying to display a unicode character (Euro sign) on a button using Qt and C++ in Visual Studio 2013. I tried the following code:
_rotateLeftButton->setText("\u20AC");
and
_rotateLeftButton->setText("€");
and
_rotateLeftButton->setText(QString::fromUtf8("\u20AC"));
and
_rotateLeftButton->setText(QString::fromUtf8("€"));
However, all of those lines result in the following:
All my code files are UTF-8 encoded, except for the moc files (.cxx). For whichever reason the moc executable does not generate them using unicode. Yet I was not able to get this unicode symbol displayed correctly. I also tried setting another font than the default one withouth success. Does anyone know what could be the problem?
Thank you for your help.
QString::fromUtf8("€")
Will work if the file really is handled as UTF-8. As #n.m. commented, VS requires some help from a faux-BOM to ensure this.
QString::fromUtf8("\u20AC")
\u doesn't make sense in a byte string literal. You could spell it using \x byte escapes for the UTF-8 encoded version:
QString::fromUtf8("\xE2\x82\xAC")
Or use a wide string literal:
QString::fromWCharArray(L"\u20AC")

How to keep characters in C++ from combining when outputted to text file

I have a fairly simple program with a vector of characters which is then outputted to a .txt file.
ofstream op ("output.txt");
vector <char> outp;
for(int i=0;i<outp.size();i++){
op<<outp[i]; //the final output of this is incorrect
cout<<outp[i]; //this output is correct
}
op.close();
the text that is output by cout is correct, but when I open the text file that was created, the output is wrong with what look like Chinese characters that shouldn't have been an option for the program to output. For example, when the program should output:
O dsof
And cout prints the right output, the .txt file has this:
O獤景
I have even tried adding the characters into a string before outputting it but it doesn't help. My best guess is that the characters are combining together and getting a different value for unicode or ascii but I don't know enough about character codes to know for sure or how to stop this from happening. Is there a way to correct the output so that it doesn't do this? I am currently using a windows 8.1 computer with code::blocks 12.11 and the GNU GCC compiler in case that helps.
Some text editors try to guess the encoding of a file and occasionally get it wrong. This can particularly happen with very small amounts of text because whatever statistical analysis is being used just doesn't have enough data to make a good conclusion. Window's Notepad has/had an infamous example with the text "Bush hid the facts".
More advanced text editors (for example Notepad++) may either not experience the same problem or may give you options to change what encoding is being assumed. You could use such to verify that the contents of the file are actually correct.
Hex editors/viewers are another way, since they allow you to examine the raw bytes of the file without interpretation. For instance, HxD is a hex editor that I have used in the past.
Alternatively, you can simply output more text. The more there is, generally the less likely something will guess wrong. From some of my experiences, newlines are particularly helpful in convincing the text editor to assume the correct encoding.
there is nothing wrong with your code.
maybe the text editor you use has a default encoding.
use more advanced editors and you will get the right output.

Does Visual Studio 2010 Supports C++ Source Code in Unicode with Unicode Char in String Literal

I want to directly embed non-ASCII Unicode characters in string literals and use them in printf. This implies my source codes must be saved in utf-8 or utf-16. Visual Studio 2010 does support editing and saving C++ source files in either format. But when compiled & executed, it does not produce the correct unicode characters. Does the compiler support string literals with unicode characters embedded?
e.g.
wprintf(L" chinese characters:中文字\n"); the trailing chinese characters cannot be displayed
I don't have a Chinese version of Windows to test with, so this is complete speculation.
The console and file output functions are aware that files are not coded in UTF-16, so they attempt to convert the characters to a code page before output. Just as the default locale is "C" rather than anything based on your system settings, so too the default code page is probably an inappropriate one that does not include Chinese characters.
There is a function SetConsoleOutputCP to change the code page for the console. It is not clear if this function changes the code page used by the actual console window, or if it only affects conversions from Unicode within the program.
The easy way to test wide literals is to skip the formatting part of printf, and give your string straight to the OS: WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), L" chinese characters:中文字", ....
It's possible that #pragma setlocale may be what you need.