Rendering unicode characters correctly on textbox - c++

I am working on a translation application in which users are allowed to give English input and I need to convert to a target language and display on a text box. I am facing problems in displaying unicode characters.
Complex characters are not rendering correctly. I know Windows uses Uniscribe for rendering complex characters. So do I need to use that explicitly to get the correct rendering? What is the equivalent of Uniscribe on Linux and Mac?
I am using C++ with wxWidgets framework and trying to display unicode characters on a text box. Any help would be great!

Considering that Uniscribe support in wxWidgets was merely a Google Summer of Code idea this year, it seems unlikely that it's working today.
There's no trivial Linux or Mac equivalent for Uniscribe.

Read up on Pango. It's the library that supports full OpenType rendering on Linux. Mac's another story.

Related

How to display characters of any language on the screen using opengl

My requirement is to display string of any language on the screen.
Currently we are using opengl to display English characters.
Same APIs are not working for other languages. Instead of characters, boxes are displayed on screen.
Can someone help me understand OpenGL and find appropriate APIs to display characters of any language?
"Currently we are using opengl to display English characters."
No, you're not using OpenGL. How do I know this? Because OpenGL does not do text rendering. All it does is draw points, lines and triangles.
What you're using is some library that knows how to draw characters with points, lines and triangles and then uses OpenGL to get that job done. And the particular library you're using apparently doesn't know how to deal with characters outside of the ASCII character set.
Of course, that's not all that matters. Encoding matters as well. The most recent versions of C++ support Unicode in program sources (so that you can write Unicode in string literals), but that does not automatically give you Unicode support in your program: the compiler knows how to deal with it, but that knowledge does not automatically carry over into the compiled program.
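To make the point about source-level versus run-time support concrete, here is a minimal sketch (not taken from any particular library) showing that a UTF-8 string in a compiled program is just a sequence of bytes, and that the program itself has to decode it into code points. The function name decode_utf8 is hypothetical, and the decoder is deliberately simplified: it replaces malformed sequences with U+FFFD but does not reject overlong forms or surrogates.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Decode UTF-8 bytes into Unicode code points.
// Illustrative sketch: malformed sequences become U+FFFD; overlong forms
// and surrogates are not rejected as a strict decoder would.
std::vector<char32_t> decode_utf8(const std::string& bytes) {
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // continuation bytes expected after the lead
        bool ok = true;
        if (lead < 0x80)                { cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else                            { ok = false; }  // stray continuation or invalid lead
        if (ok && i + extra >= bytes.size()) ok = false;  // sequence is truncated
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (c & 0x3F);
        }
        out.push_back(ok ? cp : char32_t{0xFFFD});
        i += ok ? extra + 1 : 1;  // on error, skip one byte and resynchronise
    }
    assert(i == bytes.size());  // every byte was consumed exactly once
    return out;
}
```

With this, a string holding the six bytes for "ł", "a" and "日" has size() equal to 6 but decodes to three code points. A text renderer needs the code points, not the bytes, to pick glyphs.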
So far there is only one operating system in which Unicode support is so deeply ingrained that no extra work is required; in fact a particular way of encoding Unicode (UTF-8) was invented for it. Unfortunately it is one of the most niche OS projects around: Plan 9.
Apart from Unicode, there are also many other character encoding schemes, all incompatible with each other, each for a particular kind of writing. This means that it's impossible to mix characters from different writing systems in text encoded with such localized character sets. Hence a universal encoding scheme was invented.
You're most likely on Windows, Linux, BSD, Solaris or MacOS X. On all of them, making non-ASCII characters work means extra work for you, the programmer. MacOS X is probably the OS with the lowest barrier to entry.
So here are the questions you have to answer for yourself:
what character encoding is used (hopefully Unicode)?
does the text rendering library you use support the code points in that encoding?
does the text renderer library come with a layout engine (the thing that positions characters) or does this have to be supplied extra?
Among the existing text renderers that can draw to OpenGL, Freetype-GL is currently the most capable; it has support for Unicode:
https://github.com/rougier/freetype-gl

Visual C++/MFC: getting Japanese characters to work without UNICODE

I have software originally developed 20 years ago in Visual C++ using MFC without UNICODE. Currently strings are held either in char[] or CString, and it works on English and Japanese Windows PCs until Japanese characters are used, as these tend to get converted to strange characters or empty boxes.
Setting UNICODE is presumably the way forward but will require a massive code change, whereas quite a lot seems to work simply by setting System Locale to Japan (in Windows' "Language for non-Unicode programs" setting). I have no idea how Windows does this, but some Japanese character things now work on my English Windows PC, e.g. I can open and save Japanese filenames with no code changes. And in Japan they set System Locale to English and again much works, but not everything.
I get the impression the problems are due to using a font that doesn’t include Japanese characters. Currently I am using Arial / MS Sans Serif and charset set to ANSI_CHARSET or DEFAULT_CHARSET. Is there a different font I should be using, or can I extend these fonts to include Japanese characters? Or am I barking up the wrong tree in which case what do I do next? Am very new to all this unfortunately …
That's a common question (OK I guess not so common any more in 2015, as MBCS programs luckily are a dying breed - I still maintain several though...)
Either way, I'm afraid that, depending on your definition of 'working', to get this working you'll have to bite the bullet and convert to a Unicode build. If you can't make a business case for that, then you'll have to set the right locale (well, worse, have the user set the 'right' one) and test what works and what doesn't, and ask more specific questions on what doesn't.
If your goal is to make one application that correctly displays strings in various encodings in the 'right' way regardless of the locale settings on the computer, and compatible with every input data set / database content without the user having to be aware of encoding issues, then you're out of luck with an MBCS build.
The font missing characters is most likely not the problem. Before you go any further and/or ask further questions, you should read http://www.joelonsoftware.com/articles/Unicode.html, read it again, sleep on it, read it again, explain to somebody else what the relationship is between 'encoding', 'locale', 'character set', 'font' and 'Unicode code point', because only after you can do that, you can decide on how to progress with your application. Sorry, it's not what you want to hear, but it's the reality if you've been tasked with handling internationalization.

Can I define my own custom character shapes in ncurses?

Title says pretty much everything. Once upon a time, when I was under 13, my older bro did a thing in Borland Pascal which amazed me. He defined a kind of [8][8] table with values of 1 and 0, meaning foreground and background respectively. Having several such tables, he could somehow redefine the default ASCII characters to look like these tables. I have no idea how it was done, but it worked.
My question is: can I do a similar thing in ncurses, and if so, how?
The short answer is no. What ncurses does is generate escape codes which are interpreted by the terminal. There are no codes for altering the font. (Although extensions have been proposed, no commonly used terminal supports them, and neither does ncurses.) And there is no generic way of communicating with the terminal through some kind of side channel to change the font. But there might be ways in some specific situations.
If you have direct access to a Linux console, for example, you could do all sorts of things, much like in Borland Pascal. But it will likely be messier and less impressive.
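To illustrate what "generating escape codes" means, here is a small sketch of the kind of byte sequences a curses-style library ends up writing to the terminal. The helper names are hypothetical and this is not how ncurses is implemented internally (it looks the sequences up in the terminfo database rather than hardcoding them); the sequences shown are the common ANSI/ECMA-48 ones that most terminal emulators accept.

```cpp
#include <cassert>
#include <string>

// CSI row;col H : move the cursor (1-based coordinates).
std::string move_cursor(int row, int col) {
    assert(row >= 1 && col >= 1);
    return "\x1b[" + std::to_string(row) + ";" + std::to_string(col) + "H";
}

// CSI 3n m : select one of the eight basic foreground colours (n = 0..7).
std::string set_foreground(int color) {
    assert(color >= 0 && color <= 7);
    return "\x1b[3" + std::to_string(color) + "m";
}

// CSI 0 m : reset all character attributes to the terminal's defaults.
std::string reset_attributes() {
    return "\x1b[0m";
}

// CSI 2 J : erase the whole screen.
std::string clear_screen() {
    return "\x1b[2J";
}
```

Writing clear_screen() + move_cursor(3, 10) + set_foreground(4) + "hello" + reset_attributes() to standard output would, on a compatible terminal, print a blue "hello" at row 3, column 10. Note that every sequence here moves the cursor or changes attributes; none of them can change what a glyph looks like, which is the point made above.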
As the selected answer explains, it is not possible for ncurses to render custom glyphs. ncurses only manipulates the terminal screen state via escape codes (clearing and rewriting lines to achieve interactivity).
However, it should be noted that it's very possible to use custom glyphs in the terminal via custom fonts.
This is what Powerline does (a popular terminal UI status line for vim, tmux and friends): https://github.com/powerline/fonts
By patching the fonts, you can inject your glyphs into the existing font being used by the terminal, which then you can access and render via ncurses as any other character.
Of course this is not an ideal solution, but with some automatic patching of the fonts and careful testing, it makes it possible to build an app that uses custom glyphs when you're really in a pinch for more expressive UI tools than ncurses can offer.
Further reading: https://apw-bash-settings.readthedocs.io/en/latest/fontpatching.html

How to correctly display characters from different languages?

I am finishing application in Visual C++/Windows API and I am using MySql C Connector.
Whole application code uses ANSI, MySql C Connector is in ANSI too.
This program will be used on Polish and German computers with Windows XP/Vista/7 or 8.
I want to correctly display German umlauts and Polish accented characters on:
DialogBox controls (strings are loaded from language files)
Generated XHTML documents
Strings retrieved from MySql database displayed on controls and in XHTML documents
I have heard about MultiByteToWideChar and Unicode functions (MessageBoxW etc.), but application code is nearly finished, converting is a lot of work...
How can I get the character encoding right with the least work and time?
Maybe changing system code page for non-Unicode program?
First, of course: what code set is MySQL returning? Or perhaps:
what code set was used when writing the data into the data base?
Other than that, I don't think you'll be able to avoid using
either wide characters or multibyte characters: for single byte
characters, German would use ISO 8859-1 (or the closely related Windows code page 1252) or
ISO 8859-15, Polish ISO 8859-2 (or the similar Windows code page 1250). But what are
you doing with the characters in your own code? You may be able
to get away with UTF-8 (code page 65001), without many changes.
The real question is where the characters originally come from
(although it might not be too difficult to translate them into
UTF-8 immediately at the source); I don't think that Windows
respects the code page for input.
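As a rough sketch of what "translating into UTF-8 at the source" could look like, the code below converts single-byte text to UTF-8. All function names are hypothetical. The ISO 8859-1 conversion is complete, because each byte maps to the code point of the same value; the ISO 8859-2 table is deliberately partial and covers only the Polish letters, with every other high byte mapped to U+FFFD. On Windows you would more likely call MultiByteToWideChar and WideCharToMultiByte than hand-write tables like this.

```cpp
#include <cassert>
#include <string>

// Encode one Unicode code point as UTF-8.
std::string encode_utf8(char32_t cp) {
    assert(cp <= 0x10FFFF);
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// ISO 8859-1: byte value and code point are identical.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    for (unsigned char b : in) out += encode_utf8(b);
    return out;
}

// ISO 8859-2, PARTIAL table: Polish letters only (an assumption made to
// keep the example short). Unlisted high bytes become U+FFFD.
char32_t latin2_to_code_point(unsigned char b) {
    if (b < 0xA0) return b;  // ASCII and C1 controls are shared with 8859-1
    switch (b) {
        case 0xA1: return 0x0104; case 0xB1: return 0x0105;  // Ą ą
        case 0xC6: return 0x0106; case 0xE6: return 0x0107;  // Ć ć
        case 0xCA: return 0x0118; case 0xEA: return 0x0119;  // Ę ę
        case 0xA3: return 0x0141; case 0xB3: return 0x0142;  // Ł ł
        case 0xD1: return 0x0143; case 0xF1: return 0x0144;  // Ń ń
        case 0xD3: return 0x00D3; case 0xF3: return 0x00F3;  // Ó ó
        case 0xA6: return 0x015A; case 0xB6: return 0x015B;  // Ś ś
        case 0xAC: return 0x0179; case 0xBC: return 0x017A;  // Ź ź
        case 0xAF: return 0x017B; case 0xBF: return 0x017C;  // Ż ż
        default:   return 0xFFFD;
    }
}

std::string latin2_to_utf8(const std::string& in) {
    std::string out;
    for (unsigned char b : in) out += encode_utf8(latin2_to_code_point(b));
    return out;
}
```

The same byte means different things depending on the assumed code set: 0xB3 is "³" under ISO 8859-1 but "ł" under ISO 8859-2. That is why the first question is which code set the data was written in; once everything is UTF-8, German and Polish text can live in the same string.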
Although it doesn't help you much to know it, you're dealing
with an almost impossible problem, since so much depends on
things outside your program: things like the encoding of the
display font, or the keyboard driver, for example. In fact,
it's not rare for programs to display one thing on the screen,
and something different when outputting to the printer, or to
display one thing on the screen, but something different if the
data is written to a file, and read with another program. The
situation is improving—modern Unix and the Internet are
gradually (very gradually) standardizing on UTF-8, everywhere
and for everything, and Windows normally uses UTF-16 for
everything that is pure Windows (but needs to support UTF-8 for
the Internet). But even using the platform standard won't help
if the human client has installed (and is using) fonts which
don't have the characters you need.

Pseudographical environment in windows Command Prompt

I'm thinking of creating a cool interface for my programming assignment, so I have been searching for how to create such an effect. Below is the image.
The Question
1.) What is needed to create a program with a pseudographic (semigraphic, or whatever it is called) interface that has menus like a BIOS setup wizard? I often see programs that run in the console but have a graphical look, for example a blue background, where the user can use the keyboard to choose from a list of settings in a menu.
Thanks for spending time reading my question.
It's called a text-based user interface. There are several libraries for doing this. I think this is what you're looking for. :)
Cross platform, Interactive text-based interface with command completion
http://www.gnu.org/s/ncurses/
Ncurses (or maybe PDCurses) is probably what you need.
In the days of 16-bit Windows, console windows used to support ANSI escape sequences (via the ansi.sys driver), but they no longer do.
For the apparent line graphics you need to use a platform-specific solution anyway, so I recommend just writing an abstraction (functions, class) over the Windows API's console functions.
The line graphics are done using characters from the original IBM PC character set, codepage 437. At first you can just hardcode the various patterns. To make it look more like line drawing from the code's perspective, you'll have to abstract things again. As I remember, there is a partial but not complete system in the original codepage 437 character codes. But for the Windows console you will need to use the Unicode character codes, which probably do not preserve the original system, so perhaps just define a map where these graphics characters are placed more systematically.
Possibly that already exists.
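As a sketch of such a map, the code below translates a handful of codepage 437 single-line box characters to their Unicode equivalents (encoded as UTF-8) and builds a rectangular frame from them. The function names are hypothetical, and only six of the line-drawing characters are covered. How the resulting UTF-8 reaches the screen is left out: on Windows that typically means converting to UTF-16 for WriteConsoleW or switching the console to code page 65001, which is an assumption about your setup.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Map a codepage 437 line-drawing byte to the UTF-8 encoding of the
// corresponding Unicode box-drawing character. Partial: single lines only.
const char* cp437_line_to_utf8(unsigned char code) {
    switch (code) {
        case 0xB3: return "\xE2\x94\x82";  // U+2502 vertical line
        case 0xBF: return "\xE2\x94\x90";  // U+2510 top-right corner
        case 0xC0: return "\xE2\x94\x94";  // U+2514 bottom-left corner
        case 0xC4: return "\xE2\x94\x80";  // U+2500 horizontal line
        case 0xD9: return "\xE2\x94\x98";  // U+2518 bottom-right corner
        case 0xDA: return "\xE2\x94\x8C";  // U+250C top-left corner
        default:   return "?";             // not covered by this sketch
    }
}

// Repeat a (possibly multi-byte) glyph n times.
std::string repeat(const std::string& glyph, int n) {
    std::string out;
    for (int i = 0; i < n; ++i) out += glyph;
    return out;
}

// Build an empty frame of the given size in character cells,
// one UTF-8 string per screen row.
std::vector<std::string> make_box(int width, int height) {
    assert(width >= 2 && height >= 2);
    const std::string h = cp437_line_to_utf8(0xC4);
    const std::string v = cp437_line_to_utf8(0xB3);
    std::vector<std::string> rows;
    rows.push_back(cp437_line_to_utf8(0xDA) + repeat(h, width - 2) +
                   cp437_line_to_utf8(0xBF));
    for (int r = 0; r < height - 2; ++r)
        rows.push_back(v + std::string(width - 2, ' ') + v);
    rows.push_back(cp437_line_to_utf8(0xC0) + repeat(h, width - 2) +
                   cp437_line_to_utf8(0xD9));
    return rows;
}
```

Calling make_box(20, 5) and printing each row gives a 20 by 5 frame. Menus and highlighted entries can then be drawn inside it with whatever console API you wrap.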
If you don't care about portability, the Windows API for this can be found here. It will supply all the functions you need, without the need to pack additional libraries with your application.
You can also look into graphics.h, a non-standard Borland extension that a lot of older graphical games used. It goes far beyond the normal limits of the console, but it is only supported on 16-bit systems, for which Microsoft has all but removed support from Windows. You'd also need an ancient Borland compiler, or an emulation, though you probably want the original look and feel.