cout<< "привет"; or wcout<< L"привет"; - c++

Why
cout<< "привет";
works well while
wcout<< L"привет";
does not? (in Qt Creator for linux)

GCC and Clang default to treating the source file as UTF-8. Your Linux terminal is most likely configured for UTF-8 as well. So with cout<< "привет" a UTF-8 string is printed to a UTF-8 terminal, and all is well.
wcout<< L"привет" depends on a proper Locale configuration in order to convert the wide characters into the terminal's character encoding. The Locale needs to be initialized in order for the conversion to work (the default "classic" aka "C" locale doesn't know how to convert the wide characters). Use std::locale::global (std::locale ("")) for the Locale to match the environment configuration or std::locale::global (std::locale ("en_US.UTF-8")) to use a specific Locale (similar to this C example).
Here's the full source of the working program:
#include <iostream>
#include <locale>
using namespace std;
int main() {
std::locale::global (std::locale ("en_US.UTF-8"));
wcout << L"привет\n";
}
With g++ test.cc && ./a.out this prints "привет" (on Debian Jessie).
See also this answer about dangers of using wide characters with standard output.
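As a minimal sketch of the locale setup described above (the helper name try_set_global_locale is made up for illustration), the following reports failure instead of throwing, since std::locale throws std::runtime_error when the requested locale is not available on the system:

```cpp
#include <iostream>
#include <locale>
#include <stdexcept>

// Hypothetical helper: install the named locale globally and on wcout.
// Returns false if the locale is not available on this system.
bool try_set_global_locale(const char* name) {
    try {
        std::locale loc(name);      // "" means: take it from the environment (LANG, LC_*)
        std::locale::global(loc);   // also updates the C locale when loc has a name
        std::wcout.imbue(loc);      // wcout already exists at this point, so imbue it as well
        return true;
    } catch (const std::runtime_error&) {
        return false;               // unknown or uninstalled locale name
    }
}
```

Call try_set_global_locale("") at the start of main, before the first write to wcout. If it returns false, falling back to a specific name such as "en_US.UTF-8" is one option.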

Related

What header do I have to include to write Hangeul out to a console window using Visual Studio 2022?

First, briefly:
My computer, running Windows 11, showed this in the console window.
?? ? ?? C++????
C:\Users\MYNAME\source\repos\20220816ConsoleApplication1\x64\Debug\20220816ConsoleApplication1.exe (process 17604) exited with code 0.
The question marks are NOT what I intended.
I think I need to use Unicode explicitly here.
But I don't know how; I tried once and failed, as I'll explain below.
So, this is the code that I want to fix. I'm using Visual Studio, and the solution was created as a Windows Console Application (C++/WinRT).
#include <iostream>
int main()
{
using namespace std;
/*표준 출력 스트림으로 문장을 출력함
근데 왜 한글이 물음표로 나올까*/
cout << "나의 첫 번째 C++프로그램" << endl;
return 0;
}
But Visual Studio's Error List showed the following.
Warning C4566 character represented by universal-character-name '\uB098' cannot be represented in the current code page (1252)
The property I changed in order to build successfully was the following:
Project > Properties (the last menu item) > Configuration Properties > C/C++ > Precompiled Headers > Precompiled Header: changed from Use (/Yu) to Not Using Precompiled Headers
I also tried changing the "Language" property, as in the screenshot below, but then the build failed entirely, so not even the question marks were shown.
What I tried in order to show Hangul instead of question marks, without success.
On Windows, use the following to set the console stdout to Unicode mode. Save the file as UTF-8 with BOM, or as UTF-8 without BOM and compile with the /utf-8 switch. Your Windows installation needs to support the Korean language, or at least use a console font that supports Korean characters, such as NSimSun.
test.cpp: compiled with "cl /W4 /EHsc /utf-8 test.cpp"
#include <iostream>
#include <io.h>
#include <fcntl.h>
int main()
{
using namespace std;
_setmode(_fileno(stdout), _O_U16TEXT);
wcout << L"나의 첫 번째 C++프로그램" << endl;
return 0;
}
Output (cut/paste from cmd window):
나의 첫 번째 C++프로그램
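If the same source also has to build on other platforms, the _setmode call can be wrapped in a small helper. This is only a sketch: the function name enable_wide_stdout is hypothetical, and the non-Windows branch assumes the usual setlocale(LC_ALL, "") initialization rather than anything stated in the answer above.

```cpp
#include <clocale>
#include <cstdio>
#include <iostream>
#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#endif

// Hypothetical helper: prepare stdout for wide-character output.
// Call it once, before anything has been written to stdout.
bool enable_wide_stdout() {
#ifdef _WIN32
    // Switch the CRT's stdout to UTF-16 text mode; _setmode returns -1 on failure.
    return _setmode(_fileno(stdout), _O_U16TEXT) != -1;
#else
    // Assumption: elsewhere, wide output is converted according to the C locale,
    // so take the locale from the environment. Returns false if that locale is unavailable.
    return std::setlocale(LC_ALL, "") != nullptr;
#endif
}
```

After enable_wide_stdout() succeeds on Windows, write only through wcout or wprintf. Mixing in narrow output (cout, printf) on a stream that is in _O_U16TEXT mode is known to trigger CRT assertions.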

non-ASCII file paths Windows

I work on Windows and have file paths containing non-ASCII characters. Windows uses wstring for such paths. I convert them and pass the result to luaL_dofile, but it fails because it cannot find the file.
Here is my example code:
std::wstring wstr_path = L"non-ASCII path";
using convert_type = std::codecvt_utf8_utf16<wchar_t>;
std::wstring_convert<convert_type, wchar_t> converter;
std::string str_path = converter.to_bytes(wstr_path);
luaL_dofile(mRoot, str_path.c_str());
I know nothing about luaL_dofile, but it's rather unlikely that it uses UTF-8. For programs that are not Unicode-aware, the Windows file API uses the ANSI code page (which corresponds to the system default locale). The ANSI code page on English/US systems is 1252, but other system default locales have different code pages: Central European is 1250, Cyrillic is 1251, etc.
Also, you could try generating the short name for the file (see the GetShortPathName API) and feed that.
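A rough sketch of the first suggestion, assuming (as this answer does) that luaL_dofile ends up in a narrow-character file API that expects the ANSI code page. The helper name to_ansi_path is hypothetical. On Windows it uses WideCharToMultiByte with CP_ACP; the non-Windows branch exists only so the snippet compiles elsewhere and uses wcstombs with the current C locale instead.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: convert a wide path to the narrow encoding that
// non-Unicode file APIs expect. Throws if a character cannot be represented.
std::string to_ansi_path(const std::wstring& wide) {
    if (wide.empty()) return std::string();
#ifdef _WIN32
    const int in_len = static_cast<int>(wide.size());
    // First call: ask for the required buffer size in bytes.
    const int out_len = WideCharToMultiByte(CP_ACP, WC_NO_BEST_FIT_CHARS,
                                            wide.data(), in_len, nullptr, 0, nullptr, nullptr);
    if (out_len <= 0) throw std::runtime_error("conversion failed");
    std::string out(static_cast<std::size_t>(out_len), '\0');
    BOOL used_default = FALSE;  // set when a character had no ANSI equivalent
    WideCharToMultiByte(CP_ACP, WC_NO_BEST_FIT_CHARS,
                        wide.data(), in_len, &out[0], out_len, nullptr, &used_default);
    if (used_default) throw std::runtime_error("path not representable in the ANSI code page");
    return out;
#else
    // POSIX: a null destination makes wcstombs return the required byte count.
    const std::size_t out_len = std::wcstombs(nullptr, wide.c_str(), 0);
    if (out_len == static_cast<std::size_t>(-1))
        throw std::runtime_error("path not representable in the current locale");
    std::string out(out_len, '\0');
    std::wcstombs(&out[0], wide.c_str(), out_len);
    return out;
#endif
}
```

The result would then be passed as luaL_dofile(mRoot, to_ansi_path(wstr_path).c_str()). If the conversion throws, the path contains characters outside the ANSI code page, and the short-name route via GetShortPathName is the remaining option mentioned above.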

How to print 4 byte Unicode character in Windows C++ console app?

How to print "👩" emoji (Unicode code 1F469) in Windows console app using C++?
In example below I followed Printing UTF-8 Text to the Windows Console.
#include <iostream>
#include <io.h>
#include <fcntl.h>
int main()
{
_setmode(_fileno(stdout), _O_U16TEXT);
std::wcout << L"face: 👩\n";
return 0;
}
However, it only prints two question marks.
The "Command Prompt" (cmd.exe) app can't render this character, so I'm using Windows Terminal, which can render it.
The Windows Console cannot display characters outside of Plane 0. The Windows Terminal was designed to improve on the limitations of the Windows Console.
Further reading: How to use unicode characters in Windows command line?
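The two question marks are consistent with how UTF-16 stores this character: a code point outside Plane 0 does not fit in one 16-bit wchar_t, so on Windows it is stored as a surrogate pair, and each half is likely shown as one unknown character. A small sketch of that encoding (to_surrogates is an illustrative name, not a library function):

```cpp
#include <cstdint>
#include <utility>

// Split a code point above U+FFFF into its UTF-16 high and low surrogates.
// Precondition: 0x10000 <= cp <= 0x10FFFF.
std::pair<std::uint16_t, std::uint16_t> to_surrogates(std::uint32_t cp) {
    const std::uint32_t v = cp - 0x10000;  // 20 significant bits remain
    const std::uint16_t high = static_cast<std::uint16_t>(0xD800 + (v >> 10));    // top 10 bits
    const std::uint16_t low  = static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF));  // bottom 10 bits
    return {high, low};
}
```

For U+1F469 this gives 0xD83D and 0xDC69, which are the two wchar_t values the literal L"👩" contains on Windows.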

Wide Characters not printing while using Ncurses (C++)

The below code fails to print the wide character:
#include <ncurses.h>
using namespace std;
int main(void){
initscr();
printw("█");
getch();
endwin();
}
This code seems to work on some computers and not others, although all the libraries are installed correctly.
(The terminal is capable of displaying extended char!)
I compiled this using:
g++ -std=c++14 widechartest.cpp -o widechar -lncursesw
Could somebody let me know what the problem is?
Thanks 8)
You didn't initialize the locale. The manual page points this out:
The library uses the locale which the calling program has initialized.
That is normally done with setlocale:
setlocale(LC_ALL, "");
If the locale is not initialized, the library assumes that characters
are printable as in ISO-8859-1, to work with certain legacy programs.
You should initialize the locale and not rely on specific details of
the library when the locale has not been setup.
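To see what the locale changes, one can ask the C library how it decodes the same bytes before and after setlocale. The sketch below does not use ncurses at all; decoded_length is a hypothetical helper, and the assumption is that the wide-character ncurses build relies on the same locale-dependent multibyte decoding.

```cpp
#include <clocale>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: number of wide characters the current C locale sees
// in a multibyte string, or -1 if the bytes are invalid in that locale.
long decoded_length(const char* s) {
    // With a null destination, mbstowcs only counts (POSIX and MSVC behavior).
    const std::size_t n = std::mbstowcs(nullptr, s, 0);
    return n == static_cast<std::size_t>(-1) ? -1L : static_cast<long>(n);
}
```

Comparing decoded_length("█") before and after setlocale(LC_ALL, "") shows the difference. In a UTF-8 locale the three bytes of "█" decode to one wide character. Without setlocale the result depends on how the C library's default "C" locale treats bytes above 0x7F.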

Unicode output on windows console

The article Unicode apps in the MinGW-w64 wiki gives the following example of a Unicode application, e.g. main.c:
#define UNICODE
#define _UNICODE
#include <tchar.h>
int _tmain(int argc, TCHAR * argv[])
{
_tprintf(argv[1]);
return 0;
}
The above code makes use of tchar.h mapping, which allows it to both compile in Unicode and non-Unicode mode. [...] The -municode option is still required when linking if Unicode mode is used.
So I used
C:\> i686-w64-mingw32-gcc main.c -municode -o hello
_tmain.c:1:0: warning: "UNICODE" redefined
#define UNICODE
^
<command-line>:0:0: note: this is the location of the previous definition
to compile a Unicode application. But, when I run it, it returns
C:\> hello Süßer
S³▀er
So the Unicode string is wrong. I used the latest version of MinGW-w64 (4.9.2, i686 architecture) and tried the Win32 and POSIX threads variants; both produce the same error. My operating system is 32-bit German Windows 7. When I use the Unicode code page (chcp 65001), I have to use the font "Lucida Console". With this setting I get a similar error:
C:\> hello Süßer
S��er
I want to use a parameter with "ü" or "ß" in a Windows C++ program.
Solution
nwellnhof is right: the problem is the output on the console. This problem is explained in Unicode part 1: Windows console i/o approaches and Unicode part 2: UTF-8 stream mode. The latter gives a solution for Visual C++, which also worked with Intel C++ 15. This blog post does "not yet consider the g++ compiler. All this code is Visual C++ specific. However, [the blog author has] done generally the same with g++, and [he] will probably discuss that in a third installment."
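The first garbled output is consistent with bytes in Windows-1252 (the ANSI code page on a German system) being displayed by a console that uses the OEM code page 850: ü is 0xFC and ß is 0xDF in Windows-1252, and code page 850 shows those two bytes as ³ and ▀. The sketch below hard-codes just those two mappings to make the point. show_as_cp850 is a made-up helper, and the assumption that the console was in code page 850 is mine, not stated in the question.

```cpp
#include <string>

// Illustrative only: render a byte string the way a code page 850 console
// would, for ASCII plus the two bytes relevant here. The result is UTF-8.
std::string show_as_cp850(const std::string& bytes) {
    std::string out;
    for (unsigned char b : bytes) {
        if (b < 0x80)       out += static_cast<char>(b);  // ASCII is identical in both code pages
        else if (b == 0xFC) out += "\xC2\xB3";            // CP850 0xFC is U+00B3, superscript three
        else if (b == 0xDF) out += "\xE2\x96\x80";        // CP850 0xDF is U+2580, upper half block
        else                out += '?';                   // other bytes are outside this sketch
    }
    return out;
}
```

Feeding it the Windows-1252 bytes of "Süßer" (S, 0xFC, 0xDF, e, r) yields "S³▀er", matching the output shown in the question.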
I want to open a file whose name is given as a parameter. This works simply, e.g. main.cpp:
#include <iostream>
#include <fstream>
using namespace std;
int main(int argc, char* argv[])
{
if ( argc > 1 ) {
// The output will be wrong, ...
cout << argv[1] << endl;
// but the name of this file will be right:
fstream fl_rsl(argv[1], ios::out);
fl_rsl.close();
}
return 0;
}
and compiling without Unicode mode
C:\> g++ main.cpp -o hello && hello Süßer
Its console output is still wrong, but the created file name is right. That is fine for me.