Using Unicode font in C++ console app

How do I change the font in my C++ Windows console app?
It doesn't seem to use the font cmd.exe uses by default (Lucida Console). When I run my app from an existing cmd.exe (by typing name.exe) it looks like this: http://dathui.mine.nu/konsol3.png which is entirely correct.
But when I run my app separately (by double-clicking the .exe) it looks like this: http://dathui.mine.nu/konsol2.png.
Same code, two different looks.
So now I wonder how I can change the font so it always looks correct regardless of how it's run.
EDIT:
Ok, some more information. When I just use this little snippet:
SetConsoleOutputCP(CP_UTF8);
wchar_t s[] = L"èéøÞǽлљΣæča";
int bufferSize = WideCharToMultiByte(CP_UTF8, 0, s, -1, NULL, 0, NULL, NULL);
char* m = new char[bufferSize];
WideCharToMultiByte(CP_UTF8, 0, s, -1, m, bufferSize, NULL, NULL);
wprintf(L"%S", m);
it works with the correct font. But in my real application I use WriteConsoleOutput() to print strings instead:
CHAR_INFO* info = new CHAR_INFO[mWidth * mHeight];
for(unsigned int a = 0; a < mWidth*mHeight; ++a) {
info[a].Char.UnicodeChar = mWorld.getSymbol(mWorldX + (a % mWidth), mWorldY + (a / mWidth));
info[a].Attributes = mWorld.getColour(mWorldX + (a % mWidth), mWorldY + (a / mWidth));
}
COORD zero;
zero.X = zero.Y = 0;
COORD buffSize;
buffSize.X = mWidth;
buffSize.Y = mHeight;
if(!WriteConsoleOutputW(window, info, buffSize, zero, &rect)) {
exit(-1);
}
and then it uses the wrong font. I use two different windows, created like this:
mHandleA = CreateConsoleScreenBuffer(GENERIC_READ | GENERIC_WRITE, 0,
NULL, CONSOLE_TEXTMODE_BUFFER, NULL);
Might I be setting the codepage for just the standard output or something?

Windows stores the console settings (including the font) in the registry, using the exe path as the key. The root key is 'HKEY_CURRENT_USER\Console', so if you take a look in there with regedit you should see several sub-keys named after various exes.
To copy the settings of an existing exe, you can export its key to a text file, edit the file to change the key name to that of your exe, and then re-import it.
You can also modify the registry programmatically, though I doubt that would take immediate effect on your console window.
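A minimal sketch of the programmatic route mentioned above. Two things here are assumptions to verify in regedit before relying on them: that the per-application sub-key is the exe path with backslashes replaced by underscores, and that the font face is stored in a REG_SZ value named FaceName. The function names consoleKeyName and setDefaultConsoleFont are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Assumption: console sub-keys are named after the exe path with '\' replaced by '_'.
std::wstring consoleKeyName(std::wstring exePath) {
    std::replace(exePath.begin(), exePath.end(), L'\\', L'_');
    return exePath;
}

#ifdef _WIN32
// Writes the font face for the given exe under HKEY_CURRENT_USER\Console.
// Assumption: the value name is "FaceName"; link against advapi32.
bool setDefaultConsoleFont(const std::wstring& exePath, const std::wstring& faceName) {
    const std::wstring subKey = L"Console\\" + consoleKeyName(exePath);
    HKEY key = nullptr;
    if (RegCreateKeyExW(HKEY_CURRENT_USER, subKey.c_str(), 0, nullptr, 0,
                        KEY_SET_VALUE, nullptr, &key, nullptr) != ERROR_SUCCESS) {
        return false;
    }
    // REG_SZ data size is in bytes and includes the terminating null.
    const DWORD bytes = static_cast<DWORD>((faceName.size() + 1) * sizeof(wchar_t));
    const LONG rc = RegSetValueExW(key, L"FaceName", 0, REG_SZ,
                                   reinterpret_cast<const BYTE*>(faceName.c_str()), bytes);
    RegCloseKey(key);
    return rc == ERROR_SUCCESS;
}
#endif
```

As the answer notes, a change like this would most likely only apply to console windows opened afterwards.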

You could try the SetCurrentConsoleFontEx() function.
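A rough sketch of what such a call might look like on Vista and later. The helper names copyFaceName and useConsoleFont are hypothetical, and it is an assumption that the handle to pass is the screen buffer you actually write to (for example the ones created with CreateConsoleScreenBuffer in the question).

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#ifdef _WIN32
#include <windows.h>
#endif

// Matches LF_FACESIZE, the size of CONSOLE_FONT_INFOEX::FaceName.
constexpr std::size_t kFaceSize = 32;

// Copies a face name into a fixed-size buffer; fails instead of truncating.
bool copyFaceName(wchar_t (&dst)[kFaceSize], const wchar_t* name) {
    const std::size_t n = std::wcslen(name);
    if (n >= kFaceSize) return false;
    std::wmemcpy(dst, name, n + 1);  // includes the terminating null
    return true;
}

#ifdef _WIN32
// Asks the console to use the given TrueType face for this screen buffer.
bool useConsoleFont(HANDLE screenBuffer, const wchar_t* faceName, SHORT height) {
    CONSOLE_FONT_INFOEX info = {};
    info.cbSize = sizeof(info);
    info.dwFontSize.X = 0;       // let the console pick a matching width
    info.dwFontSize.Y = height;
    info.FontFamily = FF_DONTCARE;
    info.FontWeight = FW_NORMAL;
    if (!copyFaceName(info.FaceName, faceName)) return false;
    return SetCurrentConsoleFontEx(screenBuffer, FALSE, &info) != 0;
}
#endif
```

A call such as useConsoleFont(mHandleA, L"Lucida Console", 16) would then be made once per screen buffer, before the first WriteConsoleOutputW.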

For Vista and above, there is SetCurrentConsoleFontEx, as has already been said.
For 2K and XP, there is an undocumented function, SetConsoleFont:
typedef BOOL (WINAPI *FN_SETCONSOLEFONT)(HANDLE, DWORD);
FN_SETCONSOLEFONT SetConsoleFont;
..........
HMODULE hm = GetModuleHandle(_T("KERNEL32.DLL"));
SetConsoleFont = (FN_SETCONSOLEFONT) GetProcAddress(hm, "SetConsoleFont");
// add error checking
..........
SetConsoleFont(GetStdHandle(STD_OUTPUT_HANDLE), console_font_index);
Now, console_font_index is an index into the console font table, whose definition is unknown. However, console_font_index == 10 is known to identify Lucida Console (a Unicode font). I'm not sure how stable this value is across different OS versions.
UPDATE
After dutt's comment, I've run an experiment on a clean XP SP2 setup.
Initially, GetNumberOfConsoleFonts(), indeed, returns 10, and font indices 0..9 specify various raster fonts.
After I open a console with the Lucida font selected in its properties (just once; I can close it immediately after opening and the effect is the same), GetNumberOfConsoleFonts() suddenly starts to return 12, and indices 10 and 11 select Lucida in different sizes.
So it seems this trick worked for me when I played with it only because I always had at least one console app running with the Lucida font selected.
Thus, for practical purposes, jon hanson's answer seems better. Besides offering better control, it actually works. :)

Related

Visualisation of UTF-8 (Polish) not working properly

My software supports multiple languages (English, German, Polish, Russian, ...). For this reason I have language-specific files containing the dialog texts in each language (encoded as UTF-8).
In my MFC application I open and read those files and insert the text into my AfxMessageBoxes and other UI windows.
// Get the codepage number. 65001 = UTF-8
// In the real code this is a parameter in the function I call (just for clarification)
LANGID languageID = 65001;
TCHAR szCodepage[10];
GetLocaleInfo (MAKELCID (languageID, SORT_DEFAULT), LOCALE_IDEFAULTANSICODEPAGE, szCodepage, 10);
int nAnsiCodePage = _ttoi (szCodepage);
// Open the file
CFile file;
CString filename = getName();
if (!file.Open(filename, CFile::modeRead, NULL))
{
//Check if everything is fine, else break
}
// Read the file
CString inString;
int len = file.GetLength ();
UINT n = file.Read (inString.GetBuffer(len), len);
inString.ReleaseBuffer ();
int size = MultiByteToWideChar (CP_ACP, 0, strAllItems, -1, NULL, 0);
WCHAR *ubuf = new WCHAR[size + 1];
MultiByteToWideChar ((UINT) nAnsiCodePage, (nAnsiCodePage == CP_UTF8 ?
0 : MB_PRECOMPOSED), inString, -1, ubuf, (int) size);
outString = ubuf;
file.Close ();
Result:
This mechanism is working fine for special letters of Russian and German, but not for Polish. I already checked the UTF-8 table (http://www.utf8-chartable.de/unicode-utf8-table.pl?number=1024) and the Polish characters are part of it.
I also checked the hex values of my CString and everything seems to be all right, but it is not displayed correctly. Just for testing I changed the codepage from UTF-8 to 1250 (Eastern Europe, Polish included) and it did not work either.
What am I doing wrong?
EDIT:
When I use:
MultiByteToWideChar (CP_UTF8 , 0, inString, -1, ubuf, (int) size);
The hex values are shortened to the "best match" letters, meaning my result is: mezczyzna
I am using Windows 7 with the English language selected.

Well, you have two options:
A. Make your application Unicode. You don't tell us whether it actually is, but I conclude it's not. This is the "best" solution technically, but it may require a lot of effort, and it may not be feasible at all (e.g. if you use non-Unicode libraries).
B. If your app is non-Unicode, you have some limitations:
- Your application will only be capable of correctly displaying one codepage through the non-Unicode APIs & messages. Unfortunately this cannot be set per application; it's set globally in Windows with the "Language for non-Unicode programs" option, and changing it requires a reboot.
- To correctly display strings containing characters not in the default codepage, you need to convert them to Unicode and explicitly use the "wide" versions of the APIs & messages to display them (e.g. MessageBoxW()). A little cumbersome, but doable if the operation concerns only a small number of controls.
The machine you're working on has some Western European language as the "Language for non-Unicode programs". I come to this conclusion because "This mechanism is working fine for special letters of Russian and German" and "Using MessageBoxA(0, "mężczyzna", 0, 0) does not work", as you said (though I'm not sure at all about Russian, as it's a different codepage).
Apart from this, as IInspectable said, int size = MultiByteToWideChar (CP_ACP, 0, strAllItems, -1, NULL, 0); makes no sense at all, as the string is known to be UTF-8 rather than the default codepage. You may also need to remove the UTF-8 BOM header, if your file contains it.
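The two fixes above (drop the BOM, then size and convert with the same UTF-8 code page) could be sketched as follows. The helper names stripUtf8Bom and utf8ToWide are hypothetical, and the file contents are assumed to have been read into a std::string of raw bytes first.

```cpp
#include <cassert>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Removes a leading UTF-8 byte order mark (EF BB BF) if present.
std::string stripUtf8Bom(std::string bytes) {
    if (bytes.size() >= 3 &&
        static_cast<unsigned char>(bytes[0]) == 0xEF &&
        static_cast<unsigned char>(bytes[1]) == 0xBB &&
        static_cast<unsigned char>(bytes[2]) == 0xBF) {
        bytes.erase(0, 3);
    }
    return bytes;
}

#ifdef _WIN32
// Converts UTF-8 bytes to UTF-16, using CP_UTF8 for both the sizing
// call and the actual conversion.
std::wstring utf8ToWide(const std::string& utf8) {
    if (utf8.empty()) return std::wstring();
    const int srcLen = static_cast<int>(utf8.size());
    const int n = MultiByteToWideChar(CP_UTF8, 0, utf8.data(), srcLen, nullptr, 0);
    if (n <= 0) return std::wstring();
    std::wstring wide(static_cast<std::size_t>(n), L'\0');
    MultiByteToWideChar(CP_UTF8, 0, utf8.data(), srcLen, &wide[0], n);
    return wide;
}
#endif
```

The result can then be passed to a wide API explicitly, for example MessageBoxW(nullptr, utf8ToWide(stripUtf8Bom(fileBytes)).c_str(), L"", MB_OK), so that it is displayed correctly even in a non-Unicode build.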

Compiling with font resource

I'm on Windows 8.1, Visual Studio 2017.
I'm using the Pricedown font in a DirectX project I'm working on.
I load it with AddFontResourceEx and create a font for it with D3DXCreateFont.
When I hit "Local Windows Debugger" everything is fine and the font renders, in both release and debug mode.
The problem arises when I run the built executable directly: it never renders the font, in either release or debug.
So I went reading: the articles on MSDN, and others whenever needed.
I don't think I'm doing anything wrong. My Resource View lists the font as IDR_FONT1 (screenshots not preserved). The file was added to the Solution Explorer automatically (VS picked it up from the Resource.rc file; I didn't add it myself).
I add it like so:
AddFontResourceEx("pricedown.ttf", FR_PRIVATE, 0);
this->createFont("Pricedown", 60, true, false);
Where createfont is my function to add the font (stripped down, it has arrays):
bool D3D9Render::createFont(char *name, int size, bool bold, bool italic)
{
D3DXCreateFont(m_pD3dDev, size, 0, (bold) ? FW_BOLD : FW_NORMAL, 0, (italic) ? 1 : 0, DEFAULT_CHARSET, OUT_DEFAULT_PRECIS, ANTIALIASED_QUALITY, DEFAULT_PITCH, name, &m_pFont);
return true;
}
I'm compiling it as x64 release.
As I've said, it works and renders the font when I press "Local Windows Debugger" (in any mode, including x64 release), but when I run the executable from project/x64/Release, it just won't render the font. Even the executable size looks right.
GetLastError after AddFontResourceEx returns 2 (ERROR_FILE_NOT_FOUND).
What am I doing wrong?

(Read the answer until the end, or you'll waste a lot of time.)
I got it. I had read over this blog post.
Here is an example of how to use AddFontMemResourceEx on a font file embedded in the resources.
HINSTANCE hResInstance = AfxGetResourceHandle( ); //Read the edit
HRSRC res = FindResource(
hResInstance,
MAKEINTRESOURCE(IDR_MYFONT),
L"BINARY" //Read The Edit
);
if (res)
{
HGLOBAL mem = LoadResource(hResInstance, res);
void *data = LockResource(mem);
size_t len = SizeofResource(hResInstance, res);
DWORD nFonts;
m_fonthandle = AddFontMemResourceEx(
data, // font resource
len, // number of bytes in font resource
NULL, // Reserved. Must be 0.
&nFonts // number of fonts installed
);
if(m_fonthandle==0)
{
MessageBox(L"Font add fails", L"Error");
}
}
Though you need afxwin.h, and from here:
afxwin.h is MFC and MFC is not included in the free version of VC++ (Express Edition)
EDIT:
You do not need to use AfxGetResourceHandle (which is why you would need afxwin.h); you can simply do:
HINSTANCE hResInstance = (HINSTANCE)GetModuleHandle(NULL);
And in FindResource, the 3rd parameter should be RT_FONT, and so you'd get:
HRSRC res = FindResource(hResInstance, MAKEINTRESOURCE(IDR_FONT1), RT_FONT);

Capture spawned process stdout as unicode

In my C++/WinAPI code, I want to run some commands and capture their output. To test non-ASCII output, I renamed my network connection to Ethérnét אבג БбГгДд and ran ipconfig. When running in a command prompt, the output comes out correctly (visible when using a supporting font like Courier New):
C:\>ipconfig
Windows IP Configuration
Ethernet adapter Ethérnét אבג БбГгДд:
(...)
I tried to redirect the output to a pipe, following the example in this answer. But the byte array returned from ReadFile() is not Unicode: it's encoded in CP_OEMCP (CP437 in my case), so the Hebrew and Russian characters come out as '?'s. Since the characters are already lost, no further handling can restore them.
Obviously it's possible, since cmd in a console window does it. How can I do it?

It would seem that ipconfig produces Unicode output when it detects that the output device is the console, and ANSI output otherwise. This is likely to be a backwards-compatibility measure.
Most other built-in command-line tools are likely to either be ANSI-only or to behave in the same way as ipconfig, for the same reason. In Windows, command-line tools are meant, well, for use on the command line; programmers are discouraged from shelling out to them and parsing the output. Instead, you should use the corresponding APIs.
If you know which language you are expecting, you might be able to choose a code page that will preserve the content.
Added by @Jonathan: Undocumented: it turns out you can control the encoding of built-in commands using the environment variable OutputEncoding. I tested with ipconfig, but presumably it works with other built-in tools as well:
> for %e in ("" Unicode Ansi UTF8) do (set OutputEncoding=%~e& ipconfig >ipconfig-%~e.txt)
> (set OutputEncoding= & ipconfig 1>ipconfig-.txt )
> (set OutputEncoding=Unicode & ipconfig 1>ipconfig-Unicode.txt )
> (set OutputEncoding=Ansi & ipconfig 1>ipconfig-Ansi.txt )
> (set OutputEncoding=UTF8 & ipconfig 1>ipconfig-UTF8.txt )
And indeed, the ipconfig-*.txt files are encoded as expected! Note that this is undocumented, but it does work for me.
Addendum: as of Windows 10 v1809, another alternative is to create a pseudoconsole.
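If the OutputEncoding trick described above is used with the value Unicode, the bytes read from the pipe should be UTF-16LE rather than CP_OEMCP. The sketch below shows one way to request that and to turn the raw bytes into UTF-16 code units. The function names are hypothetical, the environment variable itself is undocumented (as noted above), and whether the child emits a byte order mark is an assumption, so the decoder simply tolerates one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#endif

// Decodes little-endian UTF-16 bytes into code units, skipping a leading
// FF FE byte order mark if there is one. A trailing odd byte is ignored.
std::u16string decodeUtf16Le(const std::vector<unsigned char>& bytes) {
    std::size_t i = 0;
    if (bytes.size() >= 2 && bytes[0] == 0xFF && bytes[1] == 0xFE) i = 2;
    std::u16string out;
    for (; i + 1 < bytes.size(); i += 2) {
        out.push_back(static_cast<char16_t>(bytes[i] | (bytes[i + 1] << 8)));
    }
    return out;
}

#ifdef _WIN32
// Sets the undocumented variable in this process so that a child started
// afterwards with an inherited environment sees it.
bool requestUnicodeOutput() {
    return SetEnvironmentVariableW(L"OutputEncoding", L"Unicode") != 0;
}
#endif
```

Because a single ReadFile() call can end in the middle of a code unit, it is safer to accumulate all pipe data first and decode once at the end.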

A console application can produce its output in different ways.
For a console handle, we can use WriteConsoleW to write text that is already Unicode.
If we want to use WriteConsoleA or WriteFile on a console handle, we first need to convert the Unicode text to multi-byte with WideCharToMultiByte, using CodePage := GetConsoleOutputCP().
If the text to output is not Unicode to begin with (say UTF-8 or ANSI), we first need to convert it to Unicode with MultiByteToWideChar (with CP_UTF8 or CP_ACP) and then convert it back to multi-byte with WideCharToMultiByte(GetConsoleOutputCP(), ..).
By default, GetConsoleOutputCP() usually returns the same value as GetOEMCP(), so it has the same effect in MultiByteToWideChar and WideCharToMultiByte as CP_OEMCP (this constant is translated to GetOEMCP()).
When the output handle is redirected to a file, the application can only use WriteFile. However, it can write data to the file in any format: Unicode, ANSI (CP_ACP), UTF-8 (CP_UTF8), etc. Which format is used depends on the particular application, and you cannot fully control it. Usually you will receive multi-byte output in CP_OEMCP encoding. You then need to decide how to process it: most likely you will first convert it to Unicode and work with the Unicode form. If you need ANSI, you will need one more conversion.
For example, if you pass pipe output in CP_OEMCP encoding to OutputDebugStringA, you get garbled (unreadable) output for non-English text.
But after two conversions, CP_OEMCP -> Unicode -> CP_ACP, the text is displayed correctly by OutputDebugStringA.
And because OutputDebugStringW exists, converting to Unicode alone is enough here.
Some applications also have special options to control the format of output written to a file. For example, ipconfig.exe looks for the "OutputEncoding" environment variable and produces different output depending on its string value ("Unicode", "Ansi", "UTF-8"). By default (if this environment variable does not exist or has an unknown value), CP_OEMCP is used.
Example of a pipe read procedure, assuming that the input data is in CP_OEMCP encoding:
void OnRead(PVOID buf, ULONG cbTransferred)
{
if (cbTransferred)
{
if (int len = MultiByteToWideChar(CP_OEMCP, 0, (PSTR)buf, cbTransferred, 0, 0))
{
PWSTR pwz = (PWSTR)alloca((1 + len) * sizeof(WCHAR));
if (len = MultiByteToWideChar(CP_OEMCP, 0, (PSTR)buf, cbTransferred, pwz, len))
{
if (g_bUseAnsi)
{
if (cbTransferred = WideCharToMultiByte(CP_ACP, 0, pwz, len, 0, 0, 0, 0))
{
PSTR psz = (PSTR)alloca(cbTransferred + 1);
if (cbTransferred = WideCharToMultiByte(CP_ACP, 0, pwz, len, psz, cbTransferred, 0, 0))
{
DoPrint(psz, cbTransferred, OutputDebugStringA);
}
}
}
else
{
DoPrint(pwz, len, OutputDebugStringW);
}
}
}
}
}
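// The debugger may not print a very large buffer completely, so split it into small chunks.
// Note: this template must be declared before OnRead() so that the calls above compile.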
template<typename T> void DoPrint(T* p, ULONG len, void (WINAPI* fnOutput)(const T*))
{
ULONG cb;
T* q = p;
do
{
cb = min(len, 256);
q = p + cb;
T c = *q;
*q = 0;
fnOutput(p);
*q = c;
p = q;
} while (len -= cb);
}
As for your specific case: ipconfig.exe uses WriteConsoleW for output to the console. As a result, it does not depend on the current system locale and can display multilingual text correctly. Other tools, such as route.exe, use WriteFile for output (both to the console and to a file) and first convert the Unicode text to multi-byte with WideCharToMultiByte(CP_OEMCP,..). This causes problems when displaying characters that do not exist in the CP_OEMCP code page (current system locale). If you have CP437, Hebrew and Russian characters will be lost in the Unicode -> CP_OEMCP conversion; the only way to preserve them is to write Unicode directly to the console or file. Whether this is possible depends on the particular application. For route.exe, say, it is not possible. For ipconfig.exe it is possible, because it always writes to the console in Unicode, and it can also write to a file in Unicode or UTF-8 if you set "OutputEncoding" to "Unicode" or "UTF-8".

SHFileOperation fails to copy the all files from the source folder

Recently I ran into very strange behavior of the SHFileOperation Windows API (on Windows 7). This simple C++ code
TCHAR szFrom[_MAX_PATH+1];
_tcscpy(szFrom, From.c_str());
::PathAppend(szFrom, _T("*.*"));
*(szFrom + _tcslen(szFrom) + 1) = 0;
TCHAR szTo[_MAX_PATH+1];
_tcscpy(szTo, To.c_str());
*(szTo + _tcslen(szTo) + 1) = 0;
// Perform Copy operation
SHFILEOPSTRUCT Op = { GetDesktopWindow(), FO_COPY, szFrom, szTo,
FOF_NOCONFIRMMKDIR | FOF_NOCOPYSECURITYATTRIBS, 0, 0, 0 };
int Res = ::SHFileOperation(&Op);
if (Res != 0 || Op.fAnyOperationsAborted)
{
CString Message;
Message.Format(_T("Error code = %d"), Res);
AfxMessageBox(Message, MB_OK | MB_ICONWARNING);
}
copies only some of the files from the network folder (the "From" parameter) to the local folder (the "To" parameter). I tried all combinations of the "FOF" flags and tried running as administrator; neither helps. The return code from SHFileOperation is always 1223. I also checked my access rights to the network folder: I have all the necessary rights (read, write, modify and execute).
Originally I had 0 for the owning window handle, and changed it to the desktop window on advice I found in a Google search. The FOF_ flags shown here are also the result of many variations I tried in an attempt to understand what is happening. The number of files copied and sub-folders created varies from trial to trial; sometimes a progress dialog appears, and sometimes (rarely) it even succeeds in copying all the files. But I never get an error message.
What could be the reason for such behavior, and is there a way to overcome it?
PS: I saw some similar questions, but none had an intelligible answer.
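For reference, SHFileOperation expects its pFrom and pTo strings to be double-null-terminated, which the code above arranges by hand in fixed-size buffers. The sketch below builds the same kind of buffer from a std::wstring without a _MAX_PATH limit; the helper names toDoubleNullTerminated and copyTree are hypothetical, and this is not claimed to cure the partial-copy behavior. The value 1223 matches ERROR_CANCELLED in winerror.h, which may be worth comparing with fAnyOperationsAborted.

```cpp
#include <cassert>
#include <string>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#include <shellapi.h>
#endif

// Returns the path followed by two null characters, as SHFileOperation requires.
std::vector<wchar_t> toDoubleNullTerminated(const std::wstring& path) {
    std::vector<wchar_t> buf(path.begin(), path.end());
    buf.push_back(L'\0');
    buf.push_back(L'\0');
    return buf;
}

#ifdef _WIN32
// Copies everything matching fromPattern into toDir; returns the raw result code.
int copyTree(const std::wstring& fromPattern, const std::wstring& toDir) {
    const std::vector<wchar_t> from = toDoubleNullTerminated(fromPattern);
    const std::vector<wchar_t> to = toDoubleNullTerminated(toDir);
    SHFILEOPSTRUCTW op = {};
    op.wFunc = FO_COPY;
    op.pFrom = from.data();
    op.pTo = to.data();
    op.fFlags = FOF_NOCONFIRMMKDIR | FOF_NOCOPYSECURITYATTRIBS;
    return SHFileOperationW(&op);
}
#endif
```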

Why does the system locale for non-Unicode apps make Unicode fonts with a symbol charset display incorrectly?

I'm trying to display Unicode chars from the Wingdings font (it's a Unicode TrueType font supporting the symbol charset only).
They are displayed correctly on my Win7/64 system with the following regional OS settings:
Formats: Russian
Location: Russia
System locale (AKA Language for Non-Unicode applications): English
But if I switch the system locale to Russian, Unicode characters with codes > 127 are displayed incorrectly (replaced with boxes).
My application is built with the Unicode character set in Visual Studio, and it calls only Unicode Windows API functions.
I also noticed that several Windows apps display such chars incorrectly with symbol fonts (Symbol, Wingdings, Webdings etc.), e.g. Notepad and Beyond Compare 3. But WordPad and the MS Office apps aren't affected.
Here is minimal code snippet (resources cleanup skipped for brevity):
LOGFONTW lf = { 0 };
lf.lfCharSet = SYMBOL_CHARSET;
lf.lfHeight = 50;
wcscpy_s(lf.lfFaceName, L"Wingdings");
HFONT f = CreateFontIndirectW(&lf);
SelectObject(hdc, f);
// First two chars displayed OK, 3rd and 4th aren't (replaced with boxes) if
// Non-Unicode apps language is NOT English.
TextOutW(hdc, 10, 10, L"\x7d\x7e\x81\xfc");
So the question is: why the hell does the language setting for non-Unicode apps affect Unicode apps?
And what is the correct (and simplest) way to display SYMBOL_CHARSET fonts without depending on the OS system locale?

The root cause of the problem is that the Wingdings font is actually a non-Unicode font. It supports Unicode only partially, so some symbols are still displayed correctly. See @Adrian McCarthy's answer for details about how it probably works under the hood.
Also see more info here: http://www.fileformat.info/info/unicode/font/wingdings
and here: http://www.alanwood.net/demos/wingdings.html
So what can we do to avoid such problems? I found several ways:
1. Quick & dirty
Fall back to the ANSI version of the API, as @user1793036 suggested:
TextOutA(hdc, 10, 10, "\x7d\x7e\x81\xfc"); // Displayed correctly!
2. Quick & clean
Use the special Unicode range U+F0xx (in the Private Use Area) instead of ASCII character codes. It's supported by Wingdings:
TextOutW(hdc, 10, 10, L"\xf07d\xf07e\xf081\xf0fc"); // Displayed correctly!
To explore which Unicode symbols a font actually supports, a font viewer can be used, e.g. dp4 Font Viewer.
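If the symbol codes already exist as single bytes, the shift into the Private Use Area shown above can be done mechanically. This is a small sketch with a hypothetical helper name, toSymbolPua; it assumes every byte should be mapped, which may not hold for control characters.

```cpp
#include <cassert>
#include <string>

// Maps each single-byte symbol code c to the code point U+F000 + c.
std::wstring toSymbolPua(const std::string& codes) {
    std::wstring out;
    out.reserve(codes.size());
    for (unsigned char c : codes) {
        out.push_back(static_cast<wchar_t>(0xF000 + c));
    }
    return out;
}
```

With this, TextOutW(hdc, 10, 10, s.c_str(), (int)s.size()) for s = toSymbolPua("\x7d\x7e\x81\xfc") draws the same string as the literal above.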
3. Slow & clean, but generic
But what if you don't know which characters you have to display or which font will actually be used? Here is the most universal solution: draw the text by glyphs to avoid any undesired translation:
void TextOutByGlyphs(HDC hdc, int x, int y, const CStringW& text)
{
CStringW glyphs;
GCP_RESULTSW gcpRes = {0};
gcpRes.lStructSize = sizeof(GCP_RESULTS);
gcpRes.lpGlyphs = glyphs.GetBuffer(text.GetLength());
gcpRes.nGlyphs = text.GetLength();
const DWORD flags = GetFontLanguageInfo(hdc) & FLI_MASK;
GetCharacterPlacementW(hdc, text.GetString(), text.GetLength(), 0,
&gcpRes, flags);
glyphs.ReleaseBuffer(gcpRes.nGlyphs);
ExtTextOutW(hdc, x, y, ETO_GLYPH_INDEX, NULL, glyphs.GetString(),
glyphs.GetLength(), NULL);
}
TextOutByGlyphs(hdc, 10, 10, L"\x7d\x7e\x81\xfc"); // Displayed correctly!
Note the use of GetCharacterPlacementW(). For some unknown reason the similar function GetGlyphIndicesW() does not work: it returns 'unsupported' dummy values for chars > 127.

Here's what I think is happening:
The Wingdings font doesn't have Unicode mappings (a cmap table?). (You can see this by using charmap.exe: the Character set drop down control is grayed out.)
For fonts without Unicode mappings, I think Windows assumes that it depends on the "Language for Non-Unicode applications" setting.
When that's English, Windows (probably) uses code page 1252, and all the values map to themselves.
When that's Russian, Windows (probably) uses code page 1251, and then tries to remap them.
The '\x81' value in code page 1251 maps to U+0403, which obviously doesn't exist in the font, so you get a box. Similarly, '\xFC' maps to U+044C.
I assumed that if you used ExtTextOutW with the ETO_GLYPH_INDEX flag, Windows wouldn't try to interpret the values at all and just treat them as glyph indexes into the font. But that assumption is wrong.
However, there is another flag called ETO_IGNORELANGUAGE, which is reserved, but, empirically, it seems to solve the problem.
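To make the remapping described above concrete, here is a deliberately partial sketch of the code page 1251 table. It covers only ASCII, the byte 0x81, and the contiguous Cyrillic block 0xC0..0xFF; everything else returns U+FFFD as a placeholder. The function name cp1251ToUnicode is hypothetical, and this only illustrates the theory in this answer, not what GDI is confirmed to do internally.

```cpp
#include <cassert>

// Partial code page 1251 lookup: enough to reproduce the two values
// discussed above (0x81 and 0xFC).
char32_t cp1251ToUnicode(unsigned char b) {
    if (b < 0x80) return b;                        // ASCII maps to itself
    if (b >= 0xC0) return 0x0410 + (b - 0xC0);     // U+0410..U+044F
    if (b == 0x81) return 0x0403;                  // CYRILLIC CAPITAL LETTER GJE
    return 0xFFFD;                                 // not covered by this sketch
}
```

Under code page 1252, by contrast, 0xFC maps to U+00FC, i.e. to itself, which fits the observation that the problem does not appear with an English system locale.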
