Standardised Language conversion?

Is there a standard for language conversion in programming? If this is too broad a question, then specifically for my example:
I've designed a program in C++ with hardcoded English words, but I wish to adapt it to display the equivalent words in Italian. I am thinking of using a simple Lang.ini file like so:
English=Language
Do=Fare
Column=Colonna
etc
Load this and swap out the words at run-time. There is nothing web related.
Would there be a better way to do this, and are there any issues I should be mindful of?
Thanks.
EDIT:
To clarify:
I wish to have the English words that I've hardcoded in my program automatically converted to whatever language is being used on the user's PC.
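The lookup-table idea described above can be sketched in portable C++ roughly as follows. This is only a minimal illustration, assuming a file of key=value lines like the Lang.ini shown; the names load_translations and tr are made up for this sketch, and it ignores text encoding, plural forms and word order, all of which real translations have to deal with.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>

// English text -> translated text.
using Translations = std::unordered_map<std::string, std::string>;

// Hypothetical loader: reads "English=Translated" lines from a stream.
// Lines without '=' (or with an empty key) are skipped.
Translations load_translations(std::istream& in) {
    Translations table;
    std::string line;
    while (std::getline(in, line)) {
        // Tolerate Windows line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos || eq == 0) continue;
        table[line.substr(0, eq)] = line.substr(eq + 1);
    }
    return table;
}

// Hypothetical lookup: falls back to the English text when no
// translation is available, so the UI never shows an empty label.
std::string tr(const Translations& table, const std::string& english) {
    const Translations::const_iterator it = table.find(english);
    return it != table.end() ? it->second : english;
}
```

With the file from the question loaded through a std::ifstream, tr(table, "Column") would give "Colonna", while a word missing from the file comes back unchanged.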

What you're looking for is described as "internationalization" (or, for those who appreciate a little irony, as "internationalisation"). A fair amount of introductory material can be found using Google.
The topic involves more than just translating words. There are also considerations about how numeric values are output, how currency is represented, and so on.
Standard C and C++ support such features. An article (from C/C++ Users Journal) on the topic is http://www.angelikalanger.com/Articles/Cuj/Internationalization/I18N.html
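As a rough illustration of the numeric-output point, the sketch below formats the same number under two sets of conventions using the standard std::numpunct facet. The ItalianNumpunct facet and the format_number helper are hypothetical names, and the separators are hard-coded only to keep the example self-contained; a real program would normally pick up the user's conventions with std::locale("") instead.

```cpp
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet with Italian-style separators, hard-coded for
// illustration: '.' between thousands and ',' before the decimals.
struct ItalianNumpunct : std::numpunct<char> {
    char do_decimal_point() const override { return ','; }
    char do_thousands_sep() const override { return '.'; }
    std::string do_grouping() const override { return "\3"; } // groups of 3 digits
};

// Formats a value with two decimals using the conventions of `loc`.
std::string format_number(double value, const std::locale& loc) {
    std::ostringstream out;
    out.imbue(loc);
    out << std::fixed << std::setprecision(2) << value;
    return out.str();
}
```

Calling format_number with std::locale::classic() uses the plain "C" conventions, while std::locale(std::locale::classic(), new ItalianNumpunct) applies the custom separators; the locale object takes ownership of the facet, so it must not be deleted by hand.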
Separately from C++, Windows also has its own features that may be used for internationalization of applications. One starting point is https://msdn.microsoft.com/en-au/library/windows/desktop/dd318661%28v=vs.85%29.aspx

It is possible to load resources for language, and save strings in resources.
https://msdn.microsoft.com/en-us/library/cc194810.aspx
Useful standard language macros:
WORD lang_id = MAKELANGID( primary, sublang )
BYTE primary = PRIMARYLANGID( lang_id )
BYTE sublang = SUBLANGID( lang_id )
Loading resources:
HRSRC hrsrc = FindResourceEx(hMod, RT_ICON, id, langID );
HGLOBAL hglb = LoadResource(hMod, hrsrc);
LPVOID lpsz = LockResource(hglb);
Language initialization code:
static DWORD dwJapanese =
MAKELCID(MAKELANGID(LANG_JAPANESE, SUBLANG_DEFAULT), SORT_DEFAULT);
// load Japanese resource
SetThreadLocale(dwJapanese);
Use the LoadString function; it can be convenient to write a wrapper function around it, like this:
http://www.codeproject.com/Tips/86496/Load-a-Windows-string-resource-into-a-std-string-o

Related

C++ force variables to a fixed memory location

I have written a C++ application in which some variables must have a different value for every user who will use it (let's call one such variable X for simplicity).
X has a different value for each user. X should never change, and it must be embedded in the exe itself (so I can't read it from a file or use any similar solution).
I don't want to distribute the source code and have it compiled. Instead, I want a method that lets me edit the final exe directly, without recompiling (only the value of X differs!). Is this possible?
My idea is that if I can force X to a constant memory location, I can then easily edit its value with a hex editor, for example (the same idea hackers use when they write a cheat tool for a game).
Is a fixed memory position possible?
Is there any other way to do what I want?
I hope my question is clear enough.

In this answer I'll use Visual Studio 2017 Community Edition because I wanted to be sure to have a development environment fully compatible with Windows.
I'll present five methods, from the most maintainable to the least. Of course, the focus of this answer is strictly limited to the goal of "sharing" a C++ variable with an external tool.
Securing such an operation is a different topic, and ultimately a futile attempt anyway.
Method 1 - Resources
Windows APIs [1] and the PE format [2] support embedding resources in an executable [3].
Resources are typically images, icons or localized strings but they can be anything - including raw binary data.
With Visual Studio it is quite easy to add a resource: in the Solution Explorer > Resource files > Add > New item > Resource > Resource file (.rc)
This will open the Resource view; right-click on Resource.rc and select Add resource....
It's possible to create the standard resources, but we need a Custom... type that we can call RAW.
This will create a new binary resource, give it an ID and add a few files to the solution.
Switching back to the Solution Explorer we can see these new files and, optionally, edit the .bin file with a better hex editor than the one integrated in VS.
Of particular interest is the resource.h file, which we can include to get the definition of the resource ID; in my case it was IDR_RAW1.
After the bin file has been crafted we are ready to read it in the application. The pattern is the usual one. I don't feel like going over these APIs one more time in a new answer, so I'll link the official documentation and provide some sample code:
#include <Windows.h>
#include "resource.h"
int WINAPI WinMain(HMODULE hModule, HMODULE hPrevModule, LPSTR lpCmdLine, int showCmd)
{
//Get a handle to our resource
HRSRC hRes = FindResource(hModule, MAKEINTRESOURCE(IDR_RAW1), "RAW");
//Load the resource (compatibility reasons make this use two APIs)
HGLOBAL hResData = LoadResource(hModule, hRes);
LPVOID ptrData = LockResource(hResData);
/*
ptrData is our binary content. Here it is assumed to be an ASCIIZ string
*/
MessageBox(NULL, (LPCSTR)ptrData, "Title", MB_ICONINFORMATION);
return 0;
}
Resources are good because they allow for an easy integration with other automatic build tools: it's easy to add a build step before the resources are compiled to generate them on the fly.
It is also very easy to alter them after the exe file has been generated - CFF Explorer III is a simple and effective tool to edit a PE module's resources.
It's even possible to replace a resource entirely thereby not limiting ourselves to keeping the new resource the same size as the old one.
Just open the module in CFF, select Resource editor, browse to the raw resource and edit/replace it. Then save.
Method 2 - PE exports
Executables are ordinary PE modules just like DLLs; the difference is really a matter of a bit.
Just as DLLs can export functions and variables [4], so can exes.
With VC++ the way to tag a symbol as exported is __declspec(dllexport):
#include <Windows.h>
__declspec(dllexport) char var[30] = "Hello, cruel world!";
int WINAPI WinMain(HMODULE hModule, HMODULE hPrevModule, LPSTR lpCmdLine, int showCmd)
{
MessageBox(NULL, var, "Title 2", MB_ICONINFORMATION);
return 0;
}
The C++ side of the matter is little affected.
Editing the PE module is less user friendly, but still easy for anyone to follow.
With CFF, open the export directory; all the exports will be listed.
C++ compilers have to mangle the names of variables that can be shared, due to the C++ features they support, so you won't find a nice name like var in the exports but something like ?var@@3PADA.
The export name doesn't really serve any purpose in this context, but you must be able to identify the correct export.
This should be easy since it's very likely to be the only one.
CFF will show you the function RVA; this is the RVA (relative to the image base) of the variable. You can easily convert it into a file offset or simply use the Address Converter integrated in CFF.
This will open a hex editor and point you at the right bytes.
Method 3 - Map files
If you don't want to have a PE export pointing right at your variable, you can tell VS to generate a MAP file.
Map files list all the symbols exported by an object file (note: an object file, not a PE module).
So you must make sure the variable, in this case, is exported by your translation unit. This is the default for "global" variables, but remember not to attach the static linkage modifier to it and, if needed, make it volatile to prevent the compiler from eliminating it during the constant folding step.
#include "Windows.h"
//extern is redundant, I use it only for documenting the intention
//volatile is a hack to prevent constant folding in this simple case
extern volatile int var2 = 3;
int WINAPI WinMain(HMODULE hModule, HMODULE hPrevModule, LPSTR lpCmdLine, int showCmd)
{
//A simple use of an int
return var2;
}
A MAP file will be generated in the output dir, along with the exe. Inside it there is a row like this one:
0003:00000018 ?var2@@3HC 00403018 Source.obj
This gives you the VA of the variable (403018), which you can use in CFF's address converter.
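For readers who prefer to do the address conversion by hand, here is a minimal sketch of the arithmetic: subtract the image base from the VA to get the RVA, find the section containing that RVA, then add the offset within the section to the section's position in the file. The Section struct mirrors three fields of IMAGE_SECTION_HEADER; the function name, the sentinel, and any image base or section values you feed it are assumptions of this sketch and must be read from the headers of your own exe.

```cpp
#include <cstdint>
#include <vector>

// Subset of a PE section header; field names follow IMAGE_SECTION_HEADER.
struct Section {
    std::uint32_t VirtualAddress;   // RVA at which the section is mapped
    std::uint32_t SizeOfRawData;    // bytes of the section stored in the file
    std::uint32_t PointerToRawData; // file offset of the section's data
};

// Sentinel returned when the address has no backing bytes in the file.
const std::uint32_t kNotInFile = 0xFFFFFFFFu;

// Converts a virtual address (as printed in the MAP file) to a file offset.
std::uint32_t va_to_file_offset(std::uint32_t va,
                                std::uint32_t image_base,
                                const std::vector<Section>& sections) {
    if (va < image_base) return kNotInFile;
    const std::uint32_t rva = va - image_base;
    for (const Section& s : sections) {
        if (rva >= s.VirtualAddress &&
            rva - s.VirtualAddress < s.SizeOfRawData) {
            return s.PointerToRawData + (rva - s.VirtualAddress);
        }
    }
    return kNotInFile;
}
```

For the row above, assuming an image base of 0x00400000, the RVA is 0x3018, which is offset 0x18 into whichever section starts at RVA 0x3000 (this matches the 0003:00000018 column of the MAP file).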
Method 4 - PE scan
You can initialise the variable with a unique value.
To be able to do so, the variable must be big enough that the probability of a random sequence of bits of equal size ending up with the same value is negligible.
For example, if the var is a QWORD the probability of finding, in the PE module, another QWORD with the same value is very low (one in 2^64), but if the var is a byte then the probability is just one in 256.
If needed, add a marker variable (I'd use a random array of 16 bytes) before the variable to mark it (i.e. to act as the unique value).
To modify the PE, use a hex editor to look for that unique value; this will give you the offset of the var to edit.
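The search a hex editor performs here can also be scripted. The sketch below scans a byte buffer (the exe file read into memory) for a marker and returns the offset just past it. The function name and sentinel are hypothetical, and the sketch assumes the variable is stored immediately after the marker, which the compiler only guarantees if both live in the same struct without padding in between.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sentinel returned when the marker is missing or not unique.
const std::size_t kNoOffset = static_cast<std::size_t>(-1);

// Returns the file offset of the first byte after `marker`, or kNoOffset
// if the marker does not occur exactly once in `image`.
std::size_t offset_after_marker(const std::vector<std::uint8_t>& image,
                                const std::vector<std::uint8_t>& marker) {
    if (marker.empty()) return kNoOffset;
    const auto first = std::search(image.begin(), image.end(),
                                   marker.begin(), marker.end());
    if (first == image.end()) return kNoOffset;
    // A second hit means the marker is not unique, so refuse to guess.
    const auto again = std::search(first + 1, image.end(),
                                   marker.begin(), marker.end());
    if (again != image.end()) return kNoOffset;
    return static_cast<std::size_t>(first - image.begin()) + marker.size();
}
```

Rejecting a duplicated marker is a deliberate choice: patching the wrong occurrence would silently corrupt the executable.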
Method 5 - Reverse engineering
After each release, reverse engineer the application (this is easy, as you can debug it with VS along with the sources) and look at where the compiler allocated the variable.
Take note of the RVA (nota bene: RVA not VA, the VA is variable) and then use CFF to edit the exe.
This requires a reverse engineering analysis each time a new release is built.
1 To be correct, "Win32" APIs.
2 I strongly advise the reader to be at least familiar with the PE file format, as I must assume so to keep this answer on topic and short. Having no understanding of the PE file format will likely result in no understanding of the question as a whole.
3 Actually, in any PE module.
4 Symbols in general.

Version Info Table Changing pe file by UpdateResource?

My program runs correctly and I can read the version information, but when I call the UpdateResource API it runs without replacing the CompanyName.
LPCWSTR filename = _T("r1.exe");
size = GetFileVersionInfoSize(filename, &dwHandle);
std::vector<BYTE> fileInfo(size,0);
f = GetFileVersionInfo(filename, 0, size, &fileInfo[0]);
VerQueryValue(&fileInfo[0], TEXT("\\VarFileInfo\\Translation"), (LPVOID*)&pValueBuffer, &verLength);
SubBlock.Format(_T("\\StringFileInfo\\040904B0\\CompanyName"), "0x0409", "1200");
VerQueryValue(&fileInfo[0], SubBlock, (LPVOID *)&lpBuffer, &dwBytes);
ZeroMemory(lpBuffer, _tcslen(lpBuffer) * sizeof(TCHAR));
_tcscpy(lpBuffer, _T("My Company"));
HANDLE hResource = BeginUpdateResource(filename, FALSE);
VerQueryValueW(&fileInfo[0], TEXT("\\VarFileInfo\\Translation"), (LPVOID*)&pValueBuffer, &verLength);
f=UpdateResource(hResource, RT_VERSION, MAKEINTRESOURCE(VS_VERSION_INFO), MAKELANGID(SUBLANG_ENGLISH_UK, SUBLANG_DEFAULT), &fileInfo[0], sizeof(lpBuffer));
EndUpdateResource(hResource, FALSE);
How can I replace the company name or other entries of the string info table?

Your code snippet does not do what you expect it to do.
BeginUpdateResource, UpdateResource, EndUpdateResource indeed do the update cycle and you use the API in a presumably correct order. However your UpdateResource uses the same original data block you read from the file.
VerQueryValue extracts the string for you, but it does not provide a way to update the value within the original block.
If you want to update the resource, you are responsible for reading the entire VERSIONINFO resource, parsing it into its parts, updating the string in question, assembling the resource back into a byte buffer and then calling the UpdateResource API. To the best of my knowledge there is no API that helps you parse and assemble the VERSIONINFO data end to end; you have to take care of this yourself, following the data structures documented on MSDN (and it's doable).

The GetFileVersionInfo[Size] and VerQueryValue functions abstract away some of the resource version layout details and cannot be used when you want to build resources. You can use them to read if you really want to but you have to manually create the full version resource in memory if you want to update it because 1) there are some alignment requirements and 2) it stores the string size in the string header.
MSDN has decent documentation that should help you to lay things out correctly in memory. It starts with VS_VERSIONINFO and VS_FIXEDFILEINFO and the rest are not true C/C++ compatible structs but you can study other resources in a hex-editor to make sure you are doing it correctly.
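To give an idea of what "laying things out in memory" involves, here is a rough sketch that serialises a single String entry (a key/value pair such as CompanyName) following the field order MSDN documents: wLength, wValueLength, wType, the null-terminated UTF-16 key, padding to a 32-bit boundary, then the value. The helper names are hypothetical. Two points are assumptions of this sketch and should be checked against an existing resource in a hex editor: that wValueLength counts the terminating null, and that the entry itself starts on a 32-bit boundary within the enclosing block.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Appends a 16-bit value in little-endian order.
static void put_u16(std::vector<std::uint8_t>& out, std::uint16_t v) {
    out.push_back(static_cast<std::uint8_t>(v & 0xFF));
    out.push_back(static_cast<std::uint8_t>(v >> 8));
}

// Appends a UTF-16 string followed by its terminating null.
static void put_wstr(std::vector<std::uint8_t>& out, const std::u16string& s) {
    for (char16_t c : s) put_u16(out, static_cast<std::uint16_t>(c));
    put_u16(out, 0);
}

// Pads with zero bytes up to the next 32-bit boundary.
static void pad32(std::vector<std::uint8_t>& out) {
    while (out.size() % 4 != 0) out.push_back(0);
}

// Builds one String entry of a version resource's StringTable.
std::vector<std::uint8_t> build_version_string(const std::u16string& key,
                                               const std::u16string& value) {
    std::vector<std::uint8_t> out;
    put_u16(out, 0);  // wLength: patched once the total size is known
    // wValueLength: size of the value in WORDs (assumed to include the null)
    put_u16(out, static_cast<std::uint16_t>(value.size() + 1));
    put_u16(out, 1);  // wType: 1 means text data
    put_wstr(out, key);
    pad32(out);       // the value must start on a 32-bit boundary
    put_wstr(out, value);
    const std::uint16_t total = static_cast<std::uint16_t>(out.size());
    out[0] = static_cast<std::uint8_t>(total & 0xFF);
    out[1] = static_cast<std::uint8_t>(total >> 8);
    return out;
}
```

The enclosing StringTable, StringFileInfo and VS_VERSIONINFO blocks use the same header pattern, and each of their lengths has to be recomputed whenever a child changes size, which is why the existing buffer cannot simply be patched in place when the new string is longer.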

Can I get a code page from a language preference?

Windows seems to keep track of at least four dimensions of "current locale":
http://www.siao2.com/2005/02/01/364707.aspx
DEFAULT USER LOCALE
DEFAULT SYSTEM LOCALE
DEFAULT USER INTERFACE LANGUAGE
DEFAULT INPUT LOCALE
My brain hurts just trying to keep track of what the hell four separate locales are useful for...
However, I don't grok the relationship between code page and locale (or LCID, or Language ID), all of which appear to be different (e.g. Japanese (Japan) is LANGID = 0x411 location code 1, but the code page for Japan is 932).
How can I configure our application to use the user's desired language as the default MBCS target when converting between Unicode and narrow strings?
That is to say, we used to be an MBCS application. Then we switched to Unicode. Things work well in English, but fail in Asian languages, apparently because Windows conversion functions WideCharToMultiByte and MultiByteToWideChar take an explicit code page (not a locale ID or language ID), which can be set to CP_ACP (default to ANSI code page), but don't appear to have a value for "default to user's default interface language's code page".
I mean, this is some seriously convoluted twaddle. Four separate dimensions of "current language", three different identifier types, as well as (different) string-identifiers for C library and C++ standard library.
In our previous MBCS builds, disk I/O and user I/O worked correctly: everything remained in the DEFAULT SYSTEM LOCALE (Windows XP term: "Language for non-Unicode Programs"). But now, in our UNICODE builds, everything tries to use "C" as the locale, and file I/O fails to properly transcode UNICODE to the user's locale, and vice versa.
We want to have text files written out (when narrow) using the current user's language's code page. And when read in, the current user's language's code page should be converted back to UNICODE.
Help!!!
Clarification: I would ideally like to use the MUI language code page rather than the OS default code page. GetACP() returns the system default code page, but I am unaware of a function that returns the user's chosen MUI language (which auto-reverts to system default if no MUI specified / installed).

I agree with the comments by Jon Trauntvein, the GetACP function does reflect the user's language settings in the control panel. Also, based on the link to the "sorting it all out" blog, that you provided, DEFAULT USER INTERFACE LANGUAGE is the language that the Windows user interface will use, which is not the same as the language to be used by programs.
However, if you really want to use DEFAULT USER INTERFACE LANGUAGE then you get it by calling GetUserDefaultUILanguage and then you can map the language id to a code page, using the following table.
Language Identifiers and Locales
You can also use the GetLocaleInfo function to do the mapping, but first you would have to convert the language id that you got from GetUserDefaultUILanguage into a locale id, and I think you will get the name of the code page instead of a numeric value, but you could try it and see.

If all you want to be able to do is configure a locale object to use the currently selected locale settings, you should be able to do something like this:
std::locale loc = std::locale("");
You can also access the current code page in Windows using the Win32 ::GetACP() function. Here is an example that I implemented in a string class to append multi-byte characters to a Unicode string:
void StrUni::append_mb(char const *buff, size_t buff_len)
{
UINT current_code_page = ::GetACP();
int space_needed;
if(buff_len == 0)
return;
space_needed = ::MultiByteToWideChar(
current_code_page,
MB_PRECOMPOSED | MB_ERR_INVALID_CHARS,
buff,
buff_len,
0,
0);
if(space_needed > 0)
{
reserve(this->buff_len + space_needed + 1);
MultiByteToWideChar(
current_code_page,
MB_PRECOMPOSED | MB_ERR_INVALID_CHARS,
buff,
buff_len,
storage + this->buff_len,
space_needed);
this->buff_len += space_needed;
terminate();
}
}

Just use CW2A() or CA2W() which will take care of the conversion for you using the current system locale (or language used for non-Unicode applications).

FWIW, this is what I ended up doing:
#define _CONVERSION_DONT_USE_THREAD_LOCALE // force CP_ACP *not* CP_THREAD_ACP for MFC CString auto-converters!!!
At application startup, construct the desired locale: m_locale(FStringA(".%u", GetACP()).GetString(), LC_CTYPE)
Then force it to agree with GetACP():
// force C++ and C libraries based on setlocale() to use system locale for narrow strings
m_locale = ::std::locale::global(m_locale); // we store the previous global so we can restore before termination to avoid memory loss
This gives me relatively ideal use of MFC's built-in narrow<->wide conversions in CString to automatically use the user's default language when converting to or from MBCS strings for the current locale.
Note: m_locale is type ::std::locale

GetDiskFreeSpaceEx with NULL Directory Name failing

I'm trying to use GetDiskFreeSpaceEx in my C++ win32 application to get the total available bytes on the 'current' drive. I'm on Windows 7.
I'm using this sample code: http://support.microsoft.com/kb/231497
And it works! Well, almost. It works if I provide a drive, such as:
...
szDrive[0] = 'C'; // <-- specifying drive
szDrive[1] = ':';
szDrive[2] = '\\';
szDrive[3] = '\0';
pszDrive = szDrive;
...
fResult = pGetDiskFreeSpaceEx ((LPCTSTR)pszDrive,
(PULARGE_INTEGER)&i64FreeBytesToCaller,
(PULARGE_INTEGER)&i64TotalBytes,
(PULARGE_INTEGER)&i64FreeBytes);
fResult becomes true and i can go on to accurately calculate the number of free bytes available.
The problem, however, is that I was hoping to not have to specify the drive, but instead just use the 'current' one. The docs I found online (Here) state:
lpDirectoryName [in, optional]
A directory on the disk. If this parameter is NULL, the function uses the root of the current disk.
But if I pass in NULL for the Directory Name then GetDiskFreeSpaceEx ends up returning false and the data remains as garbage.
fResult = pGetDiskFreeSpaceEx (NULL,
(PULARGE_INTEGER)&i64FreeBytesToCaller,
(PULARGE_INTEGER)&i64TotalBytes,
(PULARGE_INTEGER)&i64FreeBytes);
//fResult == false
Is this odd? Surely I'm missing something? Any help is appreciated!
EDIT
As per JosephH's comment, I did a GetLastError() call. It returned the DWORD for:
ERROR_INVALID_NAME 123 (0x7B)
The filename, directory name, or volume label syntax is incorrect.
2nd EDIT
Buried down in the comments I mentioned:
I tried GetCurrentDirectory and it returns the correct absolute path, except it prefixes it with \\?\

it returns the correct absolute path, except it prefixes it with \\?\
That's the key to this mystery. What you got back is the name of the directory in the native API path format. Internally, Windows looks very different from what you are familiar with from winapi programming. The Windows kernel has a completely different API, one that resembles the DEC VMS operating system a lot. No coincidence: David Cutler used to work for DEC. On top of that native OS there were originally three API layers: Win32, POSIX and OS/2. They made it easy to port programs from other operating systems to Windows NT. Nobody cared much for the POSIX and OS/2 layers; they were dropped at XP time.
One infamous restriction in Win32 is the value of MAX_PATH, 260. It sets the largest permitted size of a C string that stores a file path name. The native API permits much larger names, roughly 32,000 characters (32,767 to be precise). You can bypass the Win32 restriction by passing the path name in the native API format, which is simply the same path name you are familiar with, but prefixed with \\?\.
So surely the reason that you got such a string back from GetCurrentDirectory() is because your current directory name is longer than 259 characters. Extrapolating further, GetDiskFreeSpaceEx() failed because it has a bug, it rejects the long name it sees when you pass NULL. Somewhat understandable, it isn't normally asked to deal with long names. Everybody just passes the drive name.
This is fairly typical for what happens when you create directories with such long names. Stuff just starts falling over randomly. In general there is a lot of C code around that uses MAX_PATH and that code will fail miserably when it has to deal with path names that are longer than that. This is a pretty exploitable problem too for its ability to create stack buffer overflow in a C program, technically a carefully crafted file name could be used to manipulate programs and inject malware.
There is no real cure for this problem, that bug in GetDiskFreeSpaceEx() isn't going to be fixed any time soon. Delete that directory, it can cause lots more trouble, and write this off as a learning experience.

I am pretty sure you will have to retrieve the current drive and directory and pass that to the function. I remember attempting to use GetDiskFreeSpaceEx() with the directory name as ".", but that did not work.
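A minimal sketch of that workaround: take the path returned by GetCurrentDirectory, drop a leading \\?\ if present, and keep only the drive root, which can then be passed to GetDiskFreeSpaceEx in place of NULL. The helper name drive_root_of is made up for this example, and the function only handles drive-letter paths; UNC paths yield an empty string and would need separate handling.

```cpp
#include <string>

// Hypothetical helper: returns the drive root ("C:\") of an absolute
// Windows path, tolerating the native "\\?\" prefix. Returns an empty
// string when the path does not start with a drive letter.
std::wstring drive_root_of(const std::wstring& path) {
    const std::wstring prefix = L"\\\\?\\";
    std::wstring p = path;
    if (p.compare(0, prefix.size(), prefix) == 0) {
        p.erase(0, prefix.size());
    }
    const bool has_drive =
        p.size() >= 2 &&
        ((p[0] >= L'A' && p[0] <= L'Z') || (p[0] >= L'a' && p[0] <= L'z')) &&
        p[1] == L':';
    if (!has_drive) return std::wstring();
    return std::wstring(1, p[0]) + L":\\";
}
```

Keeping the string handling separate from the Win32 call makes it easy to check the prefix logic without needing a directory with a very long name.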

VC++ Resource Files and Lengthy String Resources

In our app we have resource strings that are apparently too long for the compiler. The build breaks stating the "line length is too long." I have found little information about the topic of lengthy string resources and even had a difficult time finding what the limit on such a resource string is. Eventually I found this article which gives the limit: MSDN . Have you had any experience with limits on string resources?
Is there some way to concatenate these without doing any coding?
Any other suggestions would be greatly appreciated.

I would have a look at RCDATA resources. I used it to store large text files in my application.
Edit: Here is my MFC code, it should be able to give you some pointers.
CString CWSApplication::LoadTextResource(UINT nID)
{
// Use the same module handle for every call; passing NULL to
// LoadResource/SizeofResource would look in the exe even when the
// resources live in another module.
HINSTANCE hInst = AfxGetResourceHandle();
HRSRC hResInfo = ::FindResource(hInst,
MAKEINTRESOURCE(nID),
RT_RCDATA);
if ( hResInfo == NULL )
{
return CString();
}
HGLOBAL hResData = ::LoadResource(hInst, hResInfo);
if ( hResData == NULL )
{
return CString();
}
const char *data = (const char*)(::LockResource(hResData));
DWORD len = ::SizeofResource(hInst, hResInfo);
return CString(data, len);
}

String resources are designed to store essentially UI-related resources and messages to be shown to the user; this way an application can be internationalized by switching from one DLL containing strings for language A to another DLL containing the same string IDs for language B. I recommend reviewing what you are using string resources for. If you intend to store large data, use a custom binary resource in the RC. Later you can interpret it as you want.

You can embed a text file into the resource, load it and use it inside CString.

You need to use custom data (RCDATA) to avoid such a limitation. Basically, by using a binary field the compiler leaves your data alone and doesn't try to "massage" it. On the other hand, if you have string resources they are subject to getting merged (to conserve space, if you set that compiler option, that is) and are typically stored in a special section in the image. So you want to avoid all that and tell the compiler to "just store" your data. Use RCDATA; you already have sample code to extract it.

You don't have to use resource files for storing your lengthy strings.
Instead, you may put all your huge strings into, say, an XML file and read each string as and when you need it. If you want NLS support you can also have language-specific files.
