Supporting multiple human languages

I am thinking about my final year project and the possibility of supporting multiple languages, e.g. English, Welsh, German, etc.
Is there a standard way of supporting multiple human languages in a program?
What is the recommended file format for storing the different languages?
It is something I am clueless on, but it is obviously a very common feature, so any advice is welcome.
I am most familiar with C++ using MFC for UI applications, and I am currently learning Qt, so an answer with this bias in mind would be good.
(Sorry if this has been covered before, but searching for 'Languages' on SO returns streams of programming-language-related questions.)

If you want to browse Stack Overflow for ideas, you could try the internationalization, i18n, localization and l10n tags.
("i18n" stands for "internationalization" because the 18 letters "nternationalizatio" sit between the "i" and the "n". The same goes for "localization" and "l10n".)
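The numeronym rule above is simple enough to check mechanically. A minimal sketch follows; the function name numeronym is made up for this illustration.

```cpp
#include <cassert>
#include <string>

// Build a numeronym: first letter, number of letters in between, last letter.
// Precondition: the word has at least three letters, so there is something to count.
std::string numeronym(const std::string& word) {
    assert(word.size() >= 3 && "need at least three letters");
    return word.front() + std::to_string(word.size() - 2) + word.back();
}
```

Calling numeronym("internationalization") yields "i18n", and numeronym("localization") yields "l10n".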

As for MFC, you could use resource DLLs as described here. One portable solution is to use the gettext library.

Apart from the already made suggestions of internationalization and localization, another term you might want to research is "Unicode".

Related

How to write your own input tool software for windows for my language?

My language, Kachhi, has no official Unicode support, but I have developed my own fonts in TTF, OTF, SVG and other formats.
I already run a website using the same fonts.
I want users to be able to write or input text in my language using my fonts
(preferably on all platforms, but if not, then mainly on Windows).
So how can I develop an input tool for Windows
to enter text using the custom fonts designed for my language?
Can anyone help by pointing out how to build your own Windows IME? A link to a tutorial, a book or anything else would help.

I apologise if I misunderstood the question; however, I think you may want to consider using the Unicode Private Use Area.
This part of Unicode exists for exactly this situation (I remember someone used it for the fictional Klingon language at one point).
You can assign your characters to code points in these zones of the Unicode tables, then provide input/output mechanisms through the usual Unicode methods.
Obviously, without a custom font (such as the one you've developed), these sections of the table have no meaning.
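To make the Private Use Area idea concrete, here is a minimal sketch that checks whether a code point falls in one of Unicode's private use ranges and encodes it as UTF-8, which is what an input tool would ultimately emit. The function names are made up for this illustration, and which Kachhi letter maps to which code point would be entirely the font author's own assignment.

```cpp
#include <cassert>
#include <string>

// True if cp lies in one of Unicode's three private use ranges.
bool is_private_use(char32_t cp) {
    return (cp >= 0xE000 && cp <= 0xF8FF) ||      // Private Use Area in the BMP
           (cp >= 0xF0000 && cp <= 0xFFFFD) ||    // Supplementary Private Use Area-A
           (cp >= 0x100000 && cp <= 0x10FFFD);    // Supplementary Private Use Area-B
}

// Encode a single code point as UTF-8.
// Precondition: cp is a valid scalar value (in range and not a surrogate).
std::string encode_utf8(char32_t cp) {
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For example, a letter assigned to U+E000 would be stored as the three bytes EE 80 80, and only a font that defines a glyph at U+E000 can display it.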

What you're aiming for is called an Input Method Editor. Essentially, this is a small program with a standardized interface that translates user input into Unicode text.
You can pick pretty much any language that has decent Windows support. In other words, VC++.

C++ library for creating pdfs with many languages supported?

My dilemma is this: we have been using libharu to create our PDFs, but we recently added Hindi to our software and, from what I can find, libharu doesn't support it.
I have looked around but have been unable to find a library similar to libharu (it doesn't need to be open source) that supports all the languages we use.
I checked out all the libraries mentioned in this post, but none of them met my needs:
Open source PDF library for C/C++ application?
Also, that post is a few years old. >_<
So I ask you, kind Stack Overflow people, do you know of a library for creating and editing PDFs (in C++) that supports at least the following languages? (English, Spanish, French, Turkish, German, Russian, Japanese, Chinese, Arabic, and Hindi)

Have you tried going to the source?
Adobe PDF Developer SDK
Adobe Systems invented the PDF format. I recommend talking to them first before going open source. They may have some libraries or SDKs to use with PDFs.

I have used JagPDF a few times, and from my last experience I think embedding fonts in the PDF might solve your problem.

Internationalization in MFC

It's finally (after years of postponing) time to localize my app into a few languages other than English.
The first challenge is to design the integration into my C++ / MFC application that has dozens of dialogs and countless strings. I came across two possible alternative implementations:
(1) Compile and deploy localized resource files as DLLs.
(2) Extract and replace all strings with the localized version. For each language there will be an XML (or simple text) file.
Personally I opt for the second alternative since it seems to me more flexible. The changes are many but not hard to make, and very importantly the XML files will be very easy to modify for the translators.
Any advice is greatly appreciated.
Regards,
Cosmin Unguru
http://www.batchphoto.com/

I have worked on several long-lived MFC projects with different languages.
I strongly recommend the first approach with resource-only DLLs.
The reasons:
(1) If the user does an XCOPY install, he always has the default language (English) in the main executables.
(2) If you don't translate everything (e.g. you're late with your release or forget some strings), the Windows resource functions, if properly used, automatically return the resource in the default language; you don't have to implement this on your own.
(3) My very personal opinion: (a) Line breaks, tabs and whitespace in XML files are a pain in your a**. (b) Merging XML files is even worse...
(4) Don't forget the encoding. It's okay in XML but your translators might use an unsuitable editor and damage the file.
And now for the main reason:
(5) You will have to rearrange many of your dialogs, because many strings are longer in e.g. French or German than in English. And making all statics, buttons, ... larger "just in case" looks crappy.
Another hint: Spend some bucks and buy one of the translation tools which import your projects / binaries and build up a translation database. This will be amortized after the first release.
Another hint (2): If possible make a release which doesn't contain any changes but only the multi-language feature. Also in future, if possible: Release your product in English. Then do the translation in one single step (per language) and release the other languages.

My good and friendly suggestion from somebody who worked a lot with localization:
Grab GNU Gettext.
Mark all your strings as _("English").
Extract all strings using the gettext tool xgettext and compile dictionaries.
Translate the strings using great tools like poedit.
Use gettext in your project and make your localization life simpler!
You can also use boost::locale for the same purpose. It uses GNU Gettext dictionaries and the same approach, but provides a different and more powerful runtime. For Windows developers it has a very good add-on: it supports the wide strings that MFC requires for proper Unicode support.
Don't use resources and other "translation" tools that are total crap from a linguistic point of view (and from a developer's point of view as well).
Further reading: http://cppcms.sourceforge.net/boost_locale/html/tutorial.html
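As a rough illustration of the gettext steps listed above, the sketch below marks one string with _() and sets up a message catalog. The domain name "myapp", the catalog directory "./locale" and the two function names are placeholders invented for this example. It assumes a platform that ships libintl.h (glibc does; on Windows you would link against a gettext build).

```cpp
#include <cassert>
#include <clocale>
#include <libintl.h>
#include <string>

// Conventional shorthand; xgettext can be told to extract every _("...") literal
// with its --keyword=_ option.
#define _(msgid) gettext(msgid)

// One-time setup. "myapp" and "./locale" are placeholders for this sketch.
void init_translations() {
    std::setlocale(LC_ALL, "");              // adopt the user's locale from the environment
    bindtextdomain("myapp", "./locale");     // directory holding the compiled .mo catalogs
    const char* domain = textdomain("myapp");  // default catalog for gettext() calls
    assert(domain != nullptr);               // textdomain() only fails when out of memory
    (void)domain;
}

// Returns the translation if a catalog provides one, otherwise the English original.
std::string save_prompt() {
    return _("Save changes?");
}
```

With this layout a German catalog would live at ./locale/de/LC_MESSAGES/myapp.mo, compiled from the translated .po file with msgfmt. When no catalog is found, gettext simply returns the original English string, so untranslated builds keep working.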

Using a DLL resource library is a relatively straightforward operation, and allows you to manage not only strings, but other resources as well. And this is its main advantage, because i18n is not only about string translation.
However, depending on your needs, a text-based solution may be a better decision, because of its easier handling - resource scripts being more complex than xml files, especially for the average translator.
I would suggest creating your own abstraction layer, something like "LoadLocalizedString", etc.; in this way, you can start implementing it just with text files, and then change to something more complex when and if required in a transparent way - all the effort for making your software i18n aware would still be valid.
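A minimal sketch of such an abstraction layer, starting with plain text files, might look like this. Only the name LoadLocalizedString comes from the suggestion above; the StringTable class, the "key=value" file format and the choice of "en" as the default language are assumptions made for this example.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical abstraction layer: callers only ever see LoadLocalizedString(),
// so the storage behind it can later change from text files to something else.
class StringTable {
public:
    // Parse "key=value" lines; blank lines and lines starting with '#' are skipped.
    void Load(const std::string& language, std::istream& in) {
        std::string line;
        while (std::getline(in, line)) {
            if (line.empty() || line[0] == '#') continue;
            const std::size_t eq = line.find('=');
            if (eq == std::string::npos) continue;  // ignore malformed lines
            tables_[language][line.substr(0, eq)] = line.substr(eq + 1);
        }
    }

    void SetLanguage(const std::string& language) { current_ = language; }

    // Try the current language, then the default language, then fall back to the key.
    std::string LoadLocalizedString(const std::string& key) const {
        assert(!key.empty() && "string keys must not be empty");
        if (const std::string* s = Find(current_, key)) return *s;
        if (const std::string* s = Find("en", key)) return *s;
        return key;
    }

private:
    const std::string* Find(const std::string& language, const std::string& key) const {
        const auto table = tables_.find(language);
        if (table == tables_.end()) return nullptr;
        const auto entry = table->second.find(key);
        return entry == table->second.end() ? nullptr : &entry->second;
    }

    std::map<std::string, std::map<std::string, std::string>> tables_;
    std::string current_ = "en";
};
```

The fallback chain means a partially translated language file still produces usable output, which is the behaviour the resource DLL approach gets from Windows for free. In a real application Load would be fed a std::ifstream per language file.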

In our case we had different dialogs per language. The resource file was the same, as the multiple languages were implemented at development time. You could basically append the different languages to the existing resource files. I hope this helps you find your way.

The DLL option is commonly used for this since the resource loading procedure (e.g. LoadLibrary) is already written - meaning you don't have to write any parsing/loading code. While XML is easier to edit, DLLs have a bit more security (users won't be able to easily edit them) and will allow the developer (meaning you) more time to work on application logic instead of writing a language loading system.
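#include <windows.h>
HMODULE hLangDLL = LoadLibrary(TEXT("text_en.dll")); // resource-only DLL with the English strings
// more stuff (check hLangDLL for NULL before using it)
TCHAR mybuffer[1024] = {0};
// the last argument is the buffer size in characters, including the terminating null
LoadString(hLangDLL, IDS_MYSTRING, mybuffer, 1024);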

If it is just the strings that are changing then I agree that XML is the way forward here for the exact reasons you outline. Easy for other people to edit, easy to change language at runtime, etc.
The only reason (in my eyes) that you'd choose option 1 is if things other than strings are being localized such as needing different icons.
If it's just text? I say go with the XML.

Parsing HTML to find specific links (Without Keywords)

I posted about this sort of earlier, but I am not sure how to post back to my original question as I can only comment or answer my own question.
Anyways, I need to get 4 links from a website, the latest stable build links for windows and linux, and the latest development build links for windows and linux (4 links total) within my C++ application.
I can download the page (http://www.sourcemod.net/snapshots.php) with LibCURL which is already implemented in the project, but after that I am not sure. I was looking at parsers, but I can't think of how I am going to discern link from link. Obviously using a parser I could get the first link from each table, but this does not seem efficient and would only provide me with the links to windows builds.
It looks like the links I need will be the fourth in both tables, but I am just not very familiar with a good way to go about this, so any help would be appreciated.
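If the "fourth link in each table" observation holds, one rough way to pull those links out of the downloaded page is sketched below. It is not a real HTML parser: it assumes lowercase <table> tags, double-quoted href attributes and no nested tables, none of which has been verified against the actual page. The function name is made up for this example.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Return the n-th (1-based) href found inside each <table>...</table> block.
// Tables with fewer than n links are skipped.
std::vector<std::string> nth_link_per_table(const std::string& html, std::size_t n) {
    assert(n >= 1 && "n is 1-based");
    // Matches <a ... href="..."> and captures the quoted URL.
    static const std::regex href_re(R"rx(<a\b[^>]*\bhref\s*=\s*"([^"]*)")rx",
                                    std::regex::icase);
    std::vector<std::string> result;
    std::size_t pos = 0;
    while ((pos = html.find("<table", pos)) != std::string::npos) {
        std::size_t end = html.find("</table>", pos);
        if (end == std::string::npos) end = html.size();  // unterminated table: take the rest
        const std::string table = html.substr(pos, end - pos);
        std::size_t count = 0;
        for (std::sregex_iterator it(table.begin(), table.end(), href_re), last;
             it != last; ++it) {
            if (++count == n) {
                result.push_back((*it)[1].str());
                break;
            }
        }
        pos = end;  // continue searching after this table
    }
    return result;
}
```

Passing the page body fetched with libcurl and n = 4 would return one URL per table that has at least four links. Counting positions is fragile, so it is worth checking the result against the page whenever its layout changes.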

Maybe you'll find the location of the actual downloads, http://www.sourcemod.net/smdrop/, easier to parse.

I'm not too familiar with C++, but if you don't come across any better solutions, there's BeautifulSoup for Python, which is really nice for parsing HTML and even deals well with malformed documents. And here's a highly rated CodeProject article on embedding Python in C/C++ that claims "This is written for programmers who are more experienced in C/C++ than in Python, the tutorial takes a practical approach and omits all theoretical discussions."
(I haven't read through it personally; as I mentioned, I'm not terribly familiar with C++.)

What are some open-source applications written in C/C++ using PostgreSQL?

I'm trying to find open source applications using PostgreSQL that are written in C/C++ so I can study them. A few open source projects using PostgreSQL are Evergreen ILS, SpamAssassin, and pgpool. However, Evergreen and SpamAssassin are written in Perl, and pgpool (written in C) is a replication tool, not a typical application. Moreover, I looked at the SQL code in Evergreen, and it is quite voluminous and complicated.
Hence, I'm looking for one or more applications using PostgreSQL, preferably those that are somewhat trivial (but not too trivial).

Have you seen libpqxx? Try asking on its mailing list (but scour their wiki first).
http://pqxx.org/development/libpqxx

pgAdmin is written in C++ using wxWidgets.

How about pgAdmin 3?
Also, you may find Qt4 a very easy way to interact with databases when programming in C++.
http://doc.trolltech.com/4.6-snapshot/sql-programming.html

Have you searched through the projects at http://pgfoundry.org ?

Two examples that are open-source:
Kexi (see kexi-project.org)
FOST4 (http://support.felspar.com/Fost%204)

It's pretty big, but the KDE Project's Amarok is written in C++ and can use a PgSQL backend (among several others). While it's pretty large, you may be able to find some interesting things in the database code. Since it uses a pre-defined schema (as opposed to the extremely general types of access that something like pgAdmin uses) it may have some good things to teach you. It will definitely be easier to pick apart than Evergreen, which actually has an entire middleware layer that actually does the data access through exposed services (The OpenSRF Project).
