How to output unicode box drawing in C++? - c++

Sorry for what may sound simple, but I am trying to draw just a simple box in Visual Studio 2017 using the unicode characters from https://en.wikipedia.org/wiki/Box-drawing_character using the code below
#include <iostream>
using namespace std;
int main()
{
cout << "┏━━━━━━━━━━━━━━━━━┓" << endl;
cout << "┃" << endl;
and so on...
However, whenever I run it all of the above code simply outputs as a ? wherever there should be a line.
So is it possible to output code like this directly to the console or for each character do I have to write the numeric values for each character?

Windows console supports UTF-16LE UNICODE.
You can use a box-drawing library such as PDCurses, for example.
Otherwise you can use the following approach:
#include <windows.h>
#include <cwchar>
class output_swap {
output_swap(const output_swap&) = delete;
output_swap& operator=(const output_swap&) = delete;
public:
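output_swap( ) noexcept:
prevInCP_( ::GetConsoleCP() ),
prevOutCP_( ::GetConsoleOutputCP() )
{
::SetConsoleCP( CP_WINUNICODE );
::SetConsoleOutputCP( CP_WINUNICODE );
}
~output_swap() noexcept {
// Restore the input and output code pages separately; they can differ.
::SetConsoleCP( prevInCP_ );
::SetConsoleOutputCP( prevOutCP_ );
}
private:
::DWORD prevInCP_;
::DWORD prevOutCP_;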
};
void draw_text(const wchar_t* text)
{
static ::HANDLE _out = ::GetStdHandle(STD_OUTPUT_HANDLE);
::DWORD written;
::WriteConsoleW( _out, text, static_cast<DWORD>(std::wcslen(text)), &written, nullptr );
}
int main(int argc, const char** argv) {
output_swap swap;
draw_text(L"┏━━━━━━━━━━━━━━━━━┓\n");
draw_text(L"┃ OK ┃\n");
draw_text(L"┗━━━━━━━━━━━━━━━━━┛\n");
return 0;
}
Also check your console font in the console settings. You probably need a raster font, but this also works with Consolas, for example.
If you need console I/O streams that can handle Unicode as well as box drawing, you can use my library.

Windows console apps can output wide strings (L"...") directly to the terminal if the mode is set correctly. Note the use of wcout as well. Save the following source in UTF-8 encoding:
#include <iostream>
#include <io.h>
#include <fcntl.h>
using namespace std;
int main()
{
_setmode(_fileno(stdout), _O_U16TEXT);
wcout << L"┏━━━━━━━━━━━━━━━━━┓" << endl;
wcout << L"┃" << endl;
}
Compile with "cl /EHsc /utf-8 test.cpp". Output is:
┏━━━━━━━━━━━━━━━━━┓
┃
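On the question of whether each character has to be written as a numeric value: that is optional, but it does remove the dependency on the source file's encoding. A minimal sketch, assuming a hypothetical helper named make_box (not part of either answer), that builds the box from \u escapes only:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: builds a three-line box around a label, with every
// box-drawing character written as a \u escape instead of a literal glyph.
std::wstring make_box(const std::wstring& label)
{
    const std::size_t inner = label.size() + 2;   // one space of padding per side
    const std::wstring bar(inner, L'\u2501');     // heavy horizontal line

    std::wstring box;
    box += L'\u250F';                             // top-left corner
    box += bar;
    box += L'\u2513';                             // top-right corner
    box += L'\n';

    box += L'\u2503';                             // heavy vertical line
    box += L' ';
    box += label;
    box += L' ';
    box += L'\u2503';
    box += L'\n';

    box += L'\u2517';                             // bottom-left corner
    box += bar;
    box += L'\u251B';                             // bottom-right corner
    box += L'\n';
    return box;
}
```

The returned string still has to reach the console through one of the routes shown above, for example std::wcout after the _setmode call, or WriteConsoleW.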

Related

std::wcout printing unicode characters but they are hidden

So, the following code:
#include <iostream>
#include <string>
#include <io.h>
#include <fcntl.h>
#include <codecvt>
int main()
{
setlocale(LC_ALL, "");
std::wstring a;
std::wcout << L"Type a string: " << std::endl;
std::getline(std::wcin, a);
std::wcout << a << std::endl;
getchar();
}
When I type "åäö" I get some weird output. The terminal's cursor is indented, but there is no text behind it. If I use my right arrow key to move the cursor forward the "åäö" reveal themselves as I click the right arrow key.
If I include English letters so that the input is "helloåäö" the output is "hello" but as I click my right arrow key "helloåäö" appears letter by letter.
Why does this happen and more importantly how can I fix it?
Edit: I compile with Visual Studio's compiler on Windows. When I tried this exact code in repl.it (they use clang) it works like a charm. Is the problem caused by my code, Windows or Visual Studio?
Windows requires some OS-specific calls to set up the console for Unicode:
#include <iostream>
#include <string>
#include <io.h>
#include <fcntl.h>
// From fcntl.h:
// #define _O_U16TEXT 0x20000 // file mode is UTF16 no BOM (translated)
// #define _O_WTEXT 0x10000 // file mode is UTF16 (translated)
int main()
{
_setmode(_fileno(stdout), _O_WTEXT); // or _O_U16TEXT, either works
_setmode(_fileno(stdin), _O_WTEXT);
std::wstring a;
std::wcout << L"Type a string: ";
std::getline(std::wcin, a);
std::wcout << a << std::endl;
getwchar();
}
Output:
Type a string: helloåäö马克
helloåäö马克

C++ string to wstring prints out incorrectly, cant get unicode path

#include <iostream>
#include <Windows.h>
#include <locale>
#include <string>
#include <codecvt>
typedef wchar_t* LPWSTR, *PWSTR;
template <typename Facet>
struct deletable_facet : Facet
{
using Facet::Facet;
};
int main(int argc, char *argv[])
{
std::cout << argv[0] << std::endl;
std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
//std::wcout << converter.from_bytes(argv[0]) << std::endl; // range error
std::wstring_convert<deletable_facet<std::codecvt<wchar_t, char, std::mbstate_t>>> conv;
std::wstring ns = conv.from_bytes(argv[0]);
std::wcout << ns << std::endl;
wchar_t filename[MAX_PATH];
//GetModuleFileName(NULL,filename,MAX_PATH); // cant convert wstring_t* to char*
GetModuleFileNameW(NULL,filename,MAX_PATH);
std::wcout << filename << std::endl;
getchar();
return 0;
}
Output:
C:\Users\luka\Desktop\ⁿ?icΣ\unicode.exe
C:\Users\luka\Desktop\ⁿ?icΣ\unicode.exe
C:\Users\luka\Desktop\ⁿ
Actual name of the folder is üлicä
I've been trying many different ways for about 2 hours now, and as far as I've seen people suggest GetModuleFileName, but as you can see that returns a conversion error (typedef wchar_t* LPWSTR, *PWSTR; isn't fixing it).
So is there any way to get the current folder path in Unicode, and get the rest of the input arguments in Unicode (non-Latin characters)?
The usage for GetModuleFileName is correct. You should see the expected result with MessageBoxW(0, filename, 0, 0);
The problem is in printing L"üлicä" on Windows console.
Try printing "üлicä" on the console:
int main(int argc, char *argv[])
{
DWORD count;
std::wstring str = GetCommandLineW() + (std::wstring)L"\n";
WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), str.c_str(), str.size(), &count, 0);
MessageBoxW(0, str.c_str(), 0, 0);
wchar_t filename[MAX_PATH];
GetModuleFileNameW(0, filename, MAX_PATH);
WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), filename, wcslen(filename), &count, 0);
return 0;
}
In Visual Studio you can also use _setmode to enable usage of std::wcout/std::wcin.
You can also use the optional entry point wmain(int argc, wchar_t *argv[]), which provides argv in UTF-16 encoding.
The main entry point provides argv in ANSI encoding (not UTF-8 encoding). ANSI can lose information, unlike Unicode.
This probably is related not to the program but the console, I suggest you try to output into a file and check if the encoding is correct.
You can do that using freopen:
int main(int argc, char *argv[]){
freopen("output-file-name.txt", "w", stdout);
/*rest of code*/
}
If the problem persists, try using Visual Studio along with _setmode(..., _O_U16TEXT) just before using wcout, as described here: https://stackoverflow.com/a/9051543/9541897
Here's an example that works with Windows. You'll have to find the right compiler/linker settings to support wmain on MinGW, but it will work. _setmode enables writing Unicode directly to the terminal, and should work as long as the font supports the characters. In my example I use some Chinese, which my font supports:
#include <Windows.h>
#include <iostream>
#include "fcntl.h"
#include "io.h"
int wmain(int argc, wchar_t* argv[])
{
_setmode(_fileno(stdout), _O_U16TEXT);
std::wcout << argv[0] << std::endl;
wchar_t filename[MAX_PATH];
GetModuleFileNameW(NULL,filename,MAX_PATH);
std::wcout << filename << std::endl;
return 0;
}
Output:
马克.exe
C:\üлicä\马克.exe
Why are you typedefing LPWSTR and PWSTR manually? windows.h already handles that for you.
In any case, as @n.m. said in comments, the arguments for main() are NOT encoded in UTF-8 on Windows, so converting non-ASCII characters using a UTF8->UTF16 converter will not produce the correct output. Use the Win32 MultiByteToWideChar() function instead to convert the arguments, using CP_ACP as the codepage to convert from. Or, use wmain() instead, which provides arguments as wchar_t* instead of as char*.
That will get you the data you want. Then, you just need to deal with the issue of Unicode output to the console. As other answers point out, the Windows console does not support UTF-16 output via std::wcout by default, so you have to jump through some additional hoops to make it work correctly (there are many other questions on StackOverflow about that issue).
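The MultiByteToWideChar() route can be sketched roughly as follows. The helper name widen is hypothetical, and the non-Windows branch is an assumption added only so the sketch compiles elsewhere: it falls back to std::mbstowcs, which follows the current C locale rather than CP_ACP.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: converts a narrow string in the system's default
// encoding (for example an element of main()'s argv) to a wide string.
// Returns an empty string if the conversion fails.
std::wstring widen(const std::string& narrow)
{
    if (narrow.empty())
        return std::wstring();
#ifdef _WIN32
    const int srcLen = static_cast<int>(narrow.size());
    // First call: ask how many wchar_t units the result needs.
    const int len = ::MultiByteToWideChar(CP_ACP, 0, narrow.data(), srcLen, nullptr, 0);
    if (len <= 0)
        return std::wstring();
    std::wstring wide(static_cast<std::size_t>(len), L'\0');
    // Second call: perform the conversion into the sized buffer.
    ::MultiByteToWideChar(CP_ACP, 0, narrow.data(), srcLen, &wide[0], len);
    return wide;
#else
    // Assumption: outside Windows, use the C library conversion instead.
    std::wstring wide(narrow.size(), L'\0');
    const std::size_t n = std::mbstowcs(&wide[0], narrow.c_str(), wide.size());
    if (n == static_cast<std::size_t>(-1))
        return std::wstring();
    wide.resize(n);
    return wide;
#endif
}
```

Characters that the ANSI code page could not represent are already lost by the time main() receives argv, so this conversion cannot recover them; wmain() avoids that loss.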

wstring::find() doesn't work with non-latin symbols?

I have a wide-character string (std::wstring) in my code, and I need to search for a wide character in it.
I use the find() function for it:
wcin >> str;
wcout << ((str.find(L'ф') != wstring::npos)? L"EXIST":L"NONE");
L'ф' is a Cyrillic letter.
But find() in this call always returns npos. With Latin letters find() works correctly.
Is it a problem with this function?
Or am I doing something incorrectly?
UPD
I use MinGW and save the source in UTF-8.
I also set the locale with setlocale(LC_ALL, "");.
Code such as wcout << L'ф'; works correctly.
But the following
wchar_t w;
wcin >> w;
wcout << w;
works incorrectly.
It is strange. Earlier I had no problems with the encoding, using setlocale ().
The encoding of your source file and the execution environment's encoding may be wildly different. C++ makes no guarantees about any of this. You can check this by outputting the hexadecimal value of the character in your literal:
std::wcout << std::hex << static_cast<unsigned int>(L'ф');
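To inspect every code unit of a string rather than a single character, a minimal sketch along these lines may help. The helper name hex_units is hypothetical. For a correctly compiled L"ф" it should yield the single value 444; a source file read in the wrong encoding would typically show up as different values or as more than one unit.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: lists each code unit of a wide string in hexadecimal,
// separated by spaces, so the result can be compared with the expected
// Unicode code points.
std::wstring hex_units(const std::wstring& text)
{
    std::wostringstream out;
    out << std::hex << std::uppercase;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (i != 0)
            out << L' ';
        // Cast so the stream prints a number instead of the character itself.
        out << static_cast<unsigned long>(text[i]);
    }
    return out.str();
}
```

The same helper can be applied to a string read from wcin, which makes it possible to tell whether the literal or the input is the side that was decoded incorrectly.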
Before C++11, you could use non-ASCII characters in source code by using their hex values:
"\x05" "five"
You can also specify the Unicode code point with a universal character name, which in your case (Cyrillic ф is U+0444) would be
L"\u0444"
If you're going full C++11 (and your environment ensures these are encoded in UTF-*), you can use any of char, char16_t, or char32_t, and do:
const char* ef_utf8 = "\u0444";
const char16_t* ef_utf16 = u"\u0444";
const char32_t* ef_utf32 = U"\u0444";
You must set the encoding of the console.
This works:
#include <iostream>
#include <string>
#include <io.h>
#include <fcntl.h>
#include <stdio.h>
using namespace std;
int main()
{
_setmode(_fileno(stdout), _O_U16TEXT);
_setmode(_fileno(stdin), _O_U16TEXT);
wstring str;
wcin >> str;
wcout << ((str.find(L'ф') != wstring::npos)? L"EXIST":L"NONE");
system("pause");
return 0;
}
std::wstring::find() works fine. But you have to read the input string correctly.
The following code runs fine on Windows console (the input Unicode string is read using ReadConsoleW() Win32 API):
#include <exception>
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <windows.h>
using namespace std;
class Win32Error : public runtime_error
{
public:
Win32Error(const char* message, DWORD error)
: runtime_error(message)
, m_error(error)
{}
DWORD Error() const
{
return m_error;
}
private:
DWORD m_error;
};
void ThrowLastWin32(const char* message)
{
const DWORD error = GetLastError();
throw Win32Error(message, error);
}
void Test()
{
const HANDLE hStdIn = GetStdHandle(STD_INPUT_HANDLE);
if (hStdIn == INVALID_HANDLE_VALUE)
ThrowLastWin32("GetStdHandle failed.");
static const int kBufferLen = 200;
wchar_t buffer[kBufferLen];
DWORD numRead = 0;
if (! ReadConsoleW(hStdIn, buffer, kBufferLen, &numRead, nullptr))
ThrowLastWin32("ReadConsoleW failed.");
const wstring str(buffer, numRead - 2);
static const wchar_t kEf = 0x0444;
wcout << ((str.find(kEf) != wstring::npos) ? L"EXIST" : L"NONE");
}
int main()
{
static const int kExitOk = 0;
static const int kExitError = 1;
try
{
Test();
return kExitOk;
}
catch(const Win32Error& e)
{
cerr << "\n*** ERROR: " << e.what() << '\n';
cerr << " (GetLastError returned " << e.Error() << ")\n";
return kExitError;
}
catch(const exception& e)
{
cerr << "\n*** ERROR: " << e.what() << '\n';
return kExitError;
}
}
Output:
C:\TEMP>test.exe
abc
NONE
C:\TEMP>test.exe
abcфabc
EXIST
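The numRead - 2 in the code above assumes that the input always ends with a carriage return and line feed. A slightly more defensive variant, sketched here with a hypothetical helper named strip_line_ending, removes whatever line-ending characters are actually present and cannot underflow when fewer than two characters were read:

```cpp
#include <string>

// Hypothetical helper: removes any trailing '\r' and '\n' characters from a
// line, without assuming that exactly two of them are present.
std::wstring strip_line_ending(std::wstring line)
{
    while (!line.empty() && (line.back() == L'\r' || line.back() == L'\n'))
        line.pop_back();
    return line;
}
```

With it, the string could be built as strip_line_ending(std::wstring(buffer, numRead)) before calling find().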
That's probably an encoding issue. wcin works with an encoding different from your compiler's/source code's. Try entering the ф in the console/wcin -- it will work. Try printing the ф via wcout -- it will show a different character or no character at all.
There is no platform-independent way to circumvent this, but if you are on Windows, you can manually change the console encoding, either with the chcp command-line command or programmatically with SetConsoleCP() (input) and SetConsoleOutputCP() (output).
You could also change your source file's/compiler's encoding. How this is done depends on your editor/compiler. If you are using MSVC, this answer might help you: https://stackoverflow.com/a/1660901/2128694

Print content of a web page issue

I've no idea what's wrong with my code, but it prints nothing to stdout, although there is some content as shown in a debugger.
#include "stdafx.h"
#include <afx.h>
#include <afxinet.h>
#include <iostream>
#include <list>
#include <string>
#include <vector>
#include "wininet.h"
using namespace std;
void DisplayPage(LPCTSTR pszURL)
{
CInternetSession session(_T("Mozilla/5.0"));
CStdioFile* pFile = NULL;
pFile = session.OpenURL(pszURL);
CString str = _T("");
while ( pFile->ReadString(str) )
{
wcout << str.GetString() << endl; // <-- here I expect some output, get nothing
// not even newline !
}
delete pFile;
session.Close();
}
// --- MAIN ---
int _tmain(int argc, _TCHAR* argv[])
{
DisplayPage( _T("http://www.google.com") );
cout << "done !" << endl;
cin.get();
return 0;
}
It is a console project. Console window pops up with message "done !" displayed only.
In case anybody is interested: the issue was caused by non-OEM characters received from the web page being written to the default console (which expects OEM characters in translating mode). At the first non-OEM character std::wcout stops processing.
Either set the console to a Unicode (UTF-16) text mode or convert the received string to the appropriate encoding before sending it to standard output.
#include <fcntl.h>
#include <io.h>
...
int old_transmode = _setmode(_fileno(stdout), _O_U16TEXT);
std::wcout << str.GetString() << std::endl; // print wide string characters
...
_setmode(_fileno(stdout), old_transmode); // restore original console output mode

How to use wstring and wcout to output Chinese words in Xcode?

I tried to run the code below in Xcode 4.2:
int main(int argc, const char * argv[])
{
locale loc("chs");
locale::global(loc);
wstring text(L"你好");
wcout << text << endl;
return 0;
}
I got an error "Thread 1:signal SIGABRT".
Can you tell me why the error happens, or how to use wstring and wcout to output the Chinese words?
You don't. Mac, like other Unix systems, uses UTF8 while Windows uses "Unicode" (UTF-16).
You can print that perfectly well on Mac by using string and cout instead of wstring and wcout.
ADDENDUM
This sample works great. Compile with g++ and run as-is.
#include <string>
#include <iostream>
using namespace std;
int main(int arg, char **argv)
{
string text("汉语");
cout << text << endl;
return 0;
}
The crash is coming from the call to locale(). This SO answer seems related.
As mentioned by Mahmoud Al-Qudsi, you don't need it as you can use UTF-8 in a normal string object:
#include <string>
#include <iostream>
using namespace std;
int main(int argc, const char * argv[])
{
string text("你好");
cout<<text<<endl;
return 0;
}
Produces:
$ ./test
你好
EDIT: Oops, too late :)
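One consequence of keeping UTF-8 in a plain std::string is that size() counts bytes, not characters. A minimal sketch, assuming valid UTF-8 input and a hypothetical helper named count_code_points, that counts code points by skipping continuation bytes:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a UTF-8 string by
// counting every byte that is not a continuation byte (bit pattern 10xxxxxx).
// Assumes the input is valid UTF-8; it does no validation.
std::size_t count_code_points(const std::string& utf8)
{
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For the two-character string 你好 from the question, size() reports 6 bytes while this helper reports 2 code points.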