Ok, so I'm trying to develop an app using C++ and Qt4 for Linux that will map certain key sequences to special Unicode characters. Also, I'm trying to make it bilingual, so the special Unicode character sent depends on the selected language. Example: AltGr+s will send ß or ș, depending whether German or Romanian is selected. On Windows, I have achieved this using AutoHotKey. However, I couldn't get IronAHK to work on Linux so I have written myself a nice Qt Application for it, using Qxt to register "global" shortcuts. I have tried this snippet:
void mainWnd::sendKeypress( unsigned int keycode )
{
Display *display = QX11Info::display();
Window curr_focus;
int revert_to;
XGetInputFocus( display, &curr_focus, &revert_to );
XTestFakeKeyEvent( display, keycode, true, 0 );
XTestFakeKeyEvent( display, keycode, false, 1 );
XFlush( display );
}
copied from another application(where it works), but here it seems to do nothing. Also, there might be a problem with the fact that the characters I'm trying to send aren't found on a US 101 Keyboard, that I currently use on my laptop(and as the layout in the OS).
So my question is: how do I make the app send a Unicode character to whichever app has focus, inserting a special character(sort of like KCharMap)? Remember, these are special characters which are not found on a normal US Keyboard. Thanks in advance.
Related
glfwSetWindowTitle(win, "Nämen");
Becomes "N?men", where '?' is in a little black, twisted square, indicating that the character could not be displayed.
How do I display 'ä'?
If you want to use non-ASCII letters in the window title, then the string has to be utf-8 encoded.
GLFW: Window title:
The window title is a regular C string using the UTF-8 encoding. This means for example that, as long as your source file is encoded as UTF-8, you can use any Unicode characters.
If you see a little black, twisted square then this indicates that the ä is encoded with some iso encoding that is not UTF-8, maybe something like latin1. To fix this you need to open it in the editor in which you can change the encoding of the file, change it to uft-8 (without BOM) and fix the ä in the title.
It seems like the GLFW implementation does not work according to the specification in this case. Probably the function still uses Latin-1 instead of UTF-8.
I had the same problem on GLFW 3.3 Windows 64 bit precompiled binaries and fixed it like this:
SetWindowTextA(glfwGetWin32Window(win),"Nämen")
The issue does not lie within GLFW but within the compiler. Encodings are handled by the major compilers as follows:
Good guy clang assumes that every file is encoded in UTF-8
Trusty gcc checks the system's settings1 and falls back on UTF-8, when it fails to determine one.
MSVC checks for BOM and uses the detected encoding; otherwise it assumes that file is encoded using the users current code page2.
You can determine your current code page by simply running chcp in your (Windows) console or PowerShell. For me, on a fresh install of Windows 11, it yields "850" (language: English; keyboard: German), which stands for Code Page 850.
To fix this issue you have several solutions:
Change your systems code page. Arguably the worst solution.
Prefix your strings with u8, escape all unicode literals and convert the string to wide char before passing Win32 functions; e.g.:
const char* title = u8"\u0421\u043b\u0430\u0432\u0430\u0020\u0423\u043a\u0440\u0430\u0457\u043d\u0456\u0021";
// This conversion is actually performed by GLFW; see footnote ^3
const int l = MultiByteToWideChar(CP_UTF8, 0, title, -1, NULL, 0);
wchar_t* buf = _malloca(l * sizeof(wchar_t));
MultiByteToWideChar(CP_UTF8, 0, title, -1, buf, l);
SetWindowTextW(hWnd, buf);
_freea(buf);
Save your source files with UTF-8 encoding WITH BOM. This allows you to write your strings without having to escape them. You'll still need to convert the string to a wide char string using the method seen above.
Specify the /utf-8 flag when compiling; this has the same effect as the previous solution, but you don't need the BOM anymore.
The solutions stated above still require you convert your good and nice string to a big chunky wide string.
Another option would be to provide a manifest4 with the activeCodePage set to UTF-8. This way all Win32 functions with the A-suffix (e.g. SetWindowTextA) now accept and properly handle UTF-8 strings, if the running system is at least or newer than Windows Version 1903.
TL;DR
Compile your application with the /utf-8 flag active.
IMPORTANT: This works for the Win32 APIs. This doesn't let you magically write Unicode emojis to the console like a hipster JS developer.
1 I suppose it reads the LC_ALL setting on linux. In the last 6 years I have never seen a distribution, that does NOT specify UTF-8. However, take this information with a grain of salt; I might be entirely wrong on how gcc handles this now.
2 If no byte-order mark is found, it assumes that the source file is encoded in the current user code page [...].
3 GLFW performs the conversion as seen here.
4 More about Win32 ANSI-APIs, Manifest and UTF-8 can be found here.
I'm trying to use WriteConsoleOutput from the WinApi to write characters to the command prompt window buffer. The thing is, I'd really like to be able to write characters such as ☺ directly into the source code, as-is, instead of using some kind of encoding/notation like '\uFFFF' or '0xFF', since I don't understand them too well (differences between codepages/character sets/etc.)
The code below showcases the simplest form of my problem. Running this code does not print ☺ into the command prompt window, but a question mark (?) instead.
#include <Windows.h>
int main()
{
HANDLE h = GetStdHandle(STD_OUTPUT_HANDLE);
CHAR_INFO c[1] = {0};
COORD cS = {1, 1};
COORD cH = {0, 0};
SMALL_RECT sr = {0, 0, 0, 0};
c[0].Attributes = FOREGROUND_INTENSITY;
c[0].Char.UnicodeChar = '☺';
WriteConsoleOutput(h, c, cS, cH, &sr);
Sleep(5000);
return 0;
}
It is vital for my code to display output identically between all Windows versions, regardless of the languages installed/used. So to my knowledge (which admittedly is absolutely minimal), I'd need to set a specific codepage (one which would hopefully be supported by the command prompt in any language Windows).
I've tried:
• Changing from using the CHAR_INFO.UnicodeChar to CHAR_INFO.AsciiChar
• Fiddling around with SetConsoleCP and SetConsoleOutputCP functions, but I haven't got a clue on how to utilize them to help me with this problem.
• Changing the Visual Studio -> Project -> Project properties.. -> Character Set setting to every possible value.
• Using specifically either WriteConsoleOutputA or WriteConsoleOutputW in addition to the aforementioned settings
• Changing the source code file encoding to UTF-8 with(/out) signature.
In my project I'm programmatically setting the command prompt font to 8x8 Terminal, which to my knowledge does not support actual unicode characters. The available characters are displayed here. Those characters do include '☺', so I'm not entirely sure my question is about unicode. I have no idea anymore. Please help.
C source has to be ascii only. If you embed non-ascii characters in a C source file, and IDE might show them in what appears to be the correct format, but the compiler quite likely treats them differently, and the executable function you pass them to can treat them differently still. It's just not portable or reliable. But you can use the escape sequence \x to embed arbitrary bytes in C strings.
UTF-8 is good for internal use, but Windows APIs don't yet support it, so you need to convert to Windows 16 bit chars (UTF-16 nearly but not quite), to display extended characters. However you have to ensure that you are calling the wide character version of the Windows API. Most Windows API functions that take string come in a A and W version (ascii and wide) for binary backwards compatibility. If you query the identifier in the IDE (go to definition etc) you should see which version you have.
This question already has an answer here:
Using SendInput to send unicode characters beyond U+FFFF
(1 answer)
Closed 7 years ago.
I don't really know what I should type in the title, but anyway, here is what I need :
I make small programs that do stuff like, "typing" the given input. Here is a small example to type "test" (as example).
#include <windows.h>
void Press(int Touch);
int main()
{
Sleep(5000);//Sleep a bit, so that you can select where to type
Press(VkKeyScan('t'));
Press(VkKeyScan('e'));
Press(VkKeyScan('s'));
Press(VkKeyScan('t'));
return 0;
}
void Press(int Touch)
{
keybd_event(Touch, 0x9d, 0, 0);
keybd_event(Touch, 0x9d, KEYEVENTF_KEYUP, 0);
}
So what I need is barely this, but with emojis. I need to be able to "type" any emoji like this one : "😂", from my program. Any ideas please ?
There's two ways you can approach this.
The first is using "alt codes":
Hold the ALT key
Press + on the number pad
Type the Unicode code point in hex
Release the ALT key
However, this method requires having EnableHexNumpad set in the Windows registry.
The second would be using the Windows clipboard.
Save the contents of the Windows clipboard
Set the clipboard contents to the Unicode character you wish to insert
Send CTRL + V to paste the character
Revert the clipboard to its previous content
keybd_event is deprecated, as mentioned on its MSDN page. Usually when you look up a Windows function and it is deprecated, you really should consider using the newer one.
Use SendInput, which among other things supports unicode keyboard input emulation.
You can send non-16-bit clean unicode characters by packing two different INPUT structures in a row.
I have Multi-Byte Char Set MFC windows application. Now I need to display international single byte ASCI characters on windows controls. I can't use ASCI characters directly because to display them correctly it is required windows locale be set to adequate country. I need to display characters in all windows locale cases. For this purpose I must convert ASCI to unicode. I can display required international characters in MessageBoxW, but how to display them on windows MFC controls using SetWindowText?
To show unicode string in MessageBoxW I construct it in wstring
WORD g [] = {0x105,0x106,0x107,0x108,0x109,0x110,0x111,0x112,0x113,0x114,0x115,0x116,0x117,0x118,0x119,0x120};
wstring gg (reinterpret_cast<wchar_t*>(g),15);
MessageBoxW(NULL, gg.c_str() , gg.c_str() , MB_ICONEXCLAMATION | MB_OK);
Seting MFC form control text:
class MyFrm: public CDialogEx
{
virtual BOOL OnInitDialog();
}
...
BOOL MyFrm::OnInitDialog()
{
GetDlgItem(IDC_EDIT_TICKET_NUMBER)->SetWindowText( ???);
}
Is it possible somehow convert wstring gg to CString and show unicode chars on window control?
You could try casting your CDialogEx 'this' object to HWND and then call explictly Win32 API to set text using wchars. So your code will look something like this:
BOOL MyFrm::OnInitDialog()
{
SetDlgItemTextW((HWND)(*this), IDC_EDIT_TICKET_NUMBER, gg.c_str());
}
But as I mentioned earlier Unicode is supported starting from Windows XP and using ASCII is really not a good idea unless you're targeting those very very old OS'es before it. Using them nowdays will cause ALL ASCII strings you pass to be firstly converted into Unicode by the Win32 API. So it is a better idea to switch your project entirely to UNICODE.
First, note that you can simply directly initialize a std::wstring with your Unicode hex character data, without any ugly useless reinterpret_cast<wchar_t*>, etc.
Instead of this:
WORD g [] = {0x105,0x106,0x107,0x108,...,0x120};
wstring gg (reinterpret_cast<wchar_t*>(g),15);
just consider that:
wstring text = L"\x0105\x0106\x0108...\0x0120";
The latter seems much cleaner to me.
Second, if you want to pass an instance to std::wstring to an MFC method that expects a const wchar_t* input string pointer, just consider using wstring::c_str() method.
In addition, the best suggestion I can give you is to just port your app to Unicode.
ASCII/MBCS should be considered programming model of the past for MFC; they bring lots of problem when you want to write "international" code.
I'm trying to display Unicode chars from Wingdings font (it's Unicode TrueType font supporting symbol charset only).
It's displayed correctly on my Win7/64 system using corresponding regional OS settings:
Formats: Russian
Location: Russia
System locale (AKA Language for Non-Unicode applications): English
But if I switch System locale to Russian, Unicode characters with codes > 127 are displayed incorrectly (replaced with boxes).
My application is created as using Unicode Charset in Visual Studio, it calls only Unicode Windows API functions.
Also I noted that several Windows apps also display such chars incorrectly with symbol fonts (Symbol, Wingdings, Webdings etc), e.g. Notepad, Beyond Compare 3. But WordPad and MS Office apps aren't affected.
Here is minimal code snippet (resources cleanup skipped for brevity):
LOGFONTW lf = { 0 };
lf.lfCharSet = SYMBOL_CHARSET;
lf.lfHeight = 50;
wcscpy_s(lf.lfFaceName, L"Wingdings");
HFONT f = CreateFontIndirectW(&lf);
SelectObject(hdc, f);
// First two chars displayed OK, 3rd and 4th aren't (replaced with boxes) if
// Non-Unicode apps language is NOT English.
TextOutW(hdc, 10, 10, L"\x7d\x7e\x81\xfc");
So the question is: why the hell Non-Unicode apps language setting affects Unicode apps?
And what is the correct (and most simple) way to display SYMBOL_CHARSET fonts without dependency to OS system locale?
The root cause of the problem is that Wingdings font is actually non-Unicode font. It supports Unicode partially, so some symbols are still displayed correctly. See #Adrian McCarthy's answer for details about how it's probably works under the hood.
Also see more info here: http://www.fileformat.info/info/unicode/font/wingdings
and here: http://www.alanwood.net/demos/wingdings.html
So what can we do to avoid such problems? I found several ways:
1. Quick & dirty
Fall back to ANSI version of API, as #user1793036 suggested:
TextOutA(hdc, 10, 10, "\x7d\x7e\x81\xfc"); // Displayed correctly!
2. Quick & clean
Use special Unicode range F0 (Private Use Area) instead of ASCII character codes. It's supported by Wingdings:
TextOutW(hdc, 10, 10, L"\xf07d\xf07e\xf081\xf0fc"); // Displayed correctly!
To explore which Unicode symbols are actually supported by font some font viewer can be used, e.g. dp4 Font Viewer
3. Slow & clean, but generic
But what to do if you don't know which characters you have to display and which font actually will be used? Here is most universal solution - draw text by glyphs to avoid any undesired translations:
void TextOutByGlyphs(HDC hdc, int x, int y, const CStringW& text)
{
CStringW glyphs;
GCP_RESULTSW gcpRes = {0};
gcpRes.lStructSize = sizeof(GCP_RESULTS);
gcpRes.lpGlyphs = glyphs.GetBuffer(text.GetLength());
gcpRes.nGlyphs = text.GetLength();
const DWORD flags = GetFontLanguageInfo(hdc) & FLI_MASK;
GetCharacterPlacementW(hdc, text.GetString(), text.GetLength(), 0,
&gcpRes, flags);
glyphs.ReleaseBuffer(gcpRes.nGlyphs);
ExtTextOutW(hdc, x, y, ETO_GLYPH_INDEX, NULL, glyphs.GetString(),
glyphs.GetLength(), NULL);
}
TextOutByGlyphs(hdc, 10, 10, L"\x7d\x7e\x81\xfc"); // Displayed correctly!
Note GetCharacterPlacementW() function usage. For some unknown reason similar function GetGlyphIndicesW() would not work returning 'unsupported' dummy values for chars > 127.
Here's what I think is happening:
The Wingdings font doesn't have Unicode mappings (a cmap table?). (You can see this by using charmap.exe: the Character set drop down control is grayed out.)
For fonts without Unicode mappings, I think Windows assumes that it depends on the "Language for Non-Unicode applications" setting.
When that's English, Windows (probably) uses code page 1252, and all the values map to themselves.
When that's Russian, Windows (probably) uses code page 1251, and then tries to remap them.
The '\x81' value in code page 1251 maps to U+0403, which obviously doesn't exist in the font, so you get a box. Similarly the, '\xFC' maps to U+044C.
I assumed that if you used ExtTextOutW with the ETO_GLYPH_INDEX flag, Windows wouldn't try to interpret the values at all and just treat them as glyph indexes into the font. But that assumption is wrong.
However, there is another flag called ETO_IGNORELANGUAGE, which is reserved, but, empirically, it seems to solve the problem.