Set Registry Value to a Wide Character String (WCHAR) in C++

I'm trying to add a wide character string to the registry in C++. The problem is that the RegSetValueEx() function does not seem to support wide chars; it only accepts the BYTE type (BYTE = unsigned char).
WCHAR myPath[] = L"C:\\éâäà\\éâäà.exe";
RegSetValueExA(HKEY_CURRENT_USER, "MyProgram", 0, REG_SZ, myPath, sizeof(myPath)); // error: cannot convert argument 5 from WCHAR* to BYTE*
And please don't tell me I should convert WCHAR to BYTE because characters such as é and â can't be stored as 8 bit characters.
I'm sure this is possible because I tried opening regedit and adding a new key with value C:\\éâäà\\éâäà.exe and it worked. I wonder how other programs can add themselves to startup on a Russian or Chinese computer.
Is there another way to do so? Or is there a way to format wide character path using wildcards?
Edit: The Unicode version of the function RegSetValueExW() only changes the type of the second argument.

You are calling RegSetValueExA() when you should be calling RegSetValueExW() instead. In either case, RegSetValueEx() writes bytes, not characters, which is why the lpData parameter is declared as BYTE*. Simply type-cast your character array. The REG_SZ value in the dwType parameter lets RegSetValueExW() know that the bytes represent a Unicode string. Make sure to include the null terminator in the size that you pass to the cbData parameter, per the documentation:
cbData [in]
The size of the information pointed to by the lpData parameter, in bytes. If the data is of type REG_SZ, REG_EXPAND_SZ, or REG_MULTI_SZ, cbData must include the size of the terminating null character or characters.
For example:
WCHAR myPath[] = L"C:\\éâäà\\éâäà.exe";
RegSetValueExW(HKEY_CURRENT_USER, L"MyProgram", 0, REG_SZ, (LPBYTE)myPath, sizeof(myPath));
Or:
LPCWSTR myPath = L"C:\\éâäà\\éâäà.exe";
RegSetValueExW(HKEY_CURRENT_USER, L"MyProgram", 0, REG_SZ, (LPCBYTE)myPath, (lstrlenW(myPath) + 1) * sizeof(WCHAR));
That being said, you should not be writing values to the root of HKEY_CURRENT_USER itself. You should be writing to a subkey instead, eg:
WCHAR myPath[] = L"C:\\éâäà\\éâäà.exe";
HKEY hKey;
if (RegCreateKeyExW(HKEY_CURRENT_USER, L"Software\\MyProgram", 0, NULL, REG_OPTION_NON_VOLATILE, KEY_SET_VALUE, NULL, &hKey, NULL) == ERROR_SUCCESS)
{
    RegSetValueExW(hKey, L"MyValue", 0, REG_SZ, (LPBYTE)myPath, sizeof(myPath));
    RegCloseKey(hKey);
}

It seems to me you're trying to use the narrow (non-wide-char) version of that function, which only supports strings in the ANSI code page. How about trying RegSetValueExW? You may also want to look up how the Windows API tries to support ANSI and Unicode as transparently as possible.

Edit: The Unicode version of the function RegSetValueExW() only changes the type of the second argument.
No it does not.
REG_SZ: A null-terminated string. This will be either a Unicode or an ANSI string, depending on whether you use the Unicode or ANSI functions.
From here:
https://learn.microsoft.com/en-us/windows/win32/sysinfo/registry-value-types

Related

Is there a way to get an application's path to add it to the registry automatically and run alongside Windows startup in C++?

I'm developing an application and would like to know if there's a way to get its executable path automatically and run alongside Windows startup by adding it to the registry.
This is my function so far:
void Open(){
HKEY hKey;
WCHAR path[MAX_PATH]; //to store the directory
DWORD size = GetModuleFileNameW(NULL, path, MAX_PATH);
const char* StartName = "MyApplication";
LONG lnRes = RegOpenKeyEx( HKEY_CURRENT_USER,
"SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run",
0 , KEY_WRITE,
&hKey);
if( ERROR_SUCCESS == lnRes )
{
lnRes = RegSetValueEx( hKey,
StartName,
0,
REG_SZ,
(LPBYTE)path,
size );
}
RegCloseKey(hKey);
}
I'm using GetModuleFileName to get the path, but it returns me the path with a single backslash and in the registry it only recognizes the "D" drive. For example: D:\Usuario\Desktop\log\mariobros.exe
https://prnt.sc/vondsi (Here's a print from my registry)
I suspect the problem is that, for the path to be recognized with a single backslash, it needs to have a double backslash. This is how I think it should be: D:\\Usuario\\Desktop\\log\\mariobros.exe
Does anyone know what I could do here?
Thanks in advance.
You are clearly compiling with UNICODE undefined in your project, which means RegOpenKeyEx() and RegSetValueEx() are actually calling the ANSI functions RegOpenKeyExA() and RegSetValueExA(), respectively (as evidenced by you being able to pass char* strings to them without compiler errors).
However, you are retrieving the file path as a UTF-16 string and passing it as-is to RegSetValueExA(). RegSetValueExA() misinterprets your UTF-16 data as an ANSI string and converts each of its bytes individually to a Unicode character, so you end up with embedded nul characters written to the Registry, because every UTF-16 code unit in the ASCII range has a nul high byte.
Since you are using a Unicode function to retrieve the file path, and because the Registry internally stores strings in Unicode form only, you should use the Registry's Unicode functions to match that same encoding.
Also, note that the return value of GetModuleFileName(A|W) does not include the null terminator in the output string's length, but RegSetValueEx(A|W) expects the cbData parameter to include enough bytes for a null terminator for REG_(EXPAND_|MULTI_)SZ value types.
Try this:
void Open()
{
WCHAR path[MAX_PATH]; //to store the directory
DWORD size = GetModuleFileNameW(NULL, path, MAX_PATH);
if ((size > 0) && (size < MAX_PATH))
{
HKEY hKey;
LONG lnRes = RegOpenKeyExW(HKEY_CURRENT_USER,
L"SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run",
0, KEY_SET_VALUE,
&hKey);
if( ERROR_SUCCESS == lnRes )
{
lnRes = RegSetValueExW(hKey,
L"MyApplication",
0,
REG_SZ,
(LPBYTE)path,
(size + 1) * sizeof(WCHAR) );
RegCloseKey(hKey);
}
}
}
It looks like you are passing a wide string while promising that it is a narrow string (evident from your C-style cast).
For ASCII-range characters, the second byte of each UTF-16 character is 0, and this terminates your narrow string.
Suggestion: use wide strings only while dealing with Win API.
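To see why only the "D" shows up, here is a small portable sketch (it does not call the Windows API). It lays a UTF-16 string out as little-endian bytes, the layout Windows uses, and then reads those bytes the way a narrow-string consumer would. The function names are made up for illustration, and char16_t stands in for WCHAR.

```cpp
#include <cassert>
#include <string>

// Serialize UTF-16 code units to little-endian bytes, then read those bytes
// as a narrow (char*) string, which stops at the first zero byte.
std::string read_utf16le_as_narrow(const std::u16string& wide)
{
    std::string bytes;
    for (char16_t unit : wide) {
        bytes.push_back(static_cast<char>(unit & 0xFF));        // low byte first
        bytes.push_back(static_cast<char>((unit >> 8) & 0xFF)); // then high byte
    }
    // Constructing from c_str() copies only up to the first zero byte.
    return std::string(bytes.c_str());
}

// Self-check: 'D' is 0x44 0x00 in UTF-16LE, so the narrow view ends after "D".
void check_truncation()
{
    assert(read_utf16le_as_narrow(u"D:\\Usuario\\Desktop") == "D");
}
```

Any path that starts with an ASCII character behaves the same way: the narrow view contains just that first character.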

C++: Trying to create a Run key, all I get is Chinese characters in registry.

Please save me! I am new to this, trying to figure this out. I would like to have my program add a run key to run itself on startup . Here is "my" code:
HKEY hKey = 0;
RegOpenKeyEx( HKEY_LOCAL_MACHINE,
L"Software\\Microsoft\\Windows\\CurrentVersion\\Run",
0,
KEY_ALL_ACCESS,
&hKey );
const unsigned char Path[ MAX_PATH ] = "C:\\test.exe";
RegSetValueEx( hKey, L"Testing", 0, 1, Path, strlen("C:\\test.exe") );
RegCloseKey(hKey);
This "works", except the value added reads "㩃瑜獥⹴硥" under Data. It took me a while to figure out that the key was going to WoW6432Node too; I thought it compiled but wasn't working for the first 5 hours, much head-to-wall action there...
I am sure this has something to do with the way my string is formatted, ANSI vs ASCII vs the other 10 types of strings C++ doesn't seem to be able to convert between... I've tried using (BYTE*)"C:\virus.exe" and anything else I could think of... If I set the length to 1, the first character shows fine. But if it's any other length, Chinese starts to show again.
Please help! I am about ready to start choking kittens here!
The problem is this:
const unsigned char Path[ MAX_PATH ] = "C:\\test.exe";
You have defined an ANSI string and then attempted to use the Unicode (UTF-16) version of RegSetValueEx:
RegSetValueEx( hKey, L"Testing", 0, 1, Path, strlen("C:\\test.exe") );
Under the hood, RegSetValueEx is a macro that aliases to RegSetValueExW because you defined the macro UNICODE.
The correct solution is to use a Unicode string literal:
const wchar_t Path[] = L"C:\\test.exe";
RegSetValueEx( hKey, L"Testing", 0, 1, (const BYTE *) Path, sizeof(Path) );
Here I used sizeof because the string is an array of characters whose size is known at compile time. For dynamic strings, use (wcslen(Path) + 1) * sizeof(*Path) instead.
Note: There is no need to specify the length of a constant literal in the declaration because the compiler can automatically deduce that in this specific scenario. It's also a bad idea to duplicate the string literal inside your strlen/wcslen because if it goes out of sync your code could be broken and trigger undefined behavior.
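If you are curious where the "Chinese" comes from, this portable sketch (no Windows API involved) pairs up the ANSI bytes as little-endian 16-bit units, which is roughly what a UTF-16 consumer does with that buffer. The function names are hypothetical and char16_t stands in for WCHAR.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Interpret narrow bytes as little-endian UTF-16 code units.
// A trailing odd byte is ignored.
std::u16string read_narrow_as_utf16le(const std::string& bytes)
{
    std::u16string units;
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        const unsigned low  = static_cast<unsigned char>(bytes[i]);
        const unsigned high = static_cast<unsigned char>(bytes[i + 1]);
        units.push_back(static_cast<char16_t>(low | (high << 8)));
    }
    return units;
}

// Self-check: 'C' (0x43) followed by ':' (0x3A) becomes the unit 0x3A43.
void check_first_unit()
{
    assert(read_narrow_as_utf16le("C:")[0] == 0x3A43);
}
```

Feeding it the 11 bytes of "C:\\test.exe" gives the five code units U+3A43, U+745C, U+7365, U+2E74 and U+7865, which are the characters shown in the question.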

Number of bytes of CString in C++

I have a Unicode string stored in a CString and I need to know the number of bytes this string takes in UTF-8 encoding. I know CString has a method GetLength(), but that returns the number of characters, not bytes.
I tried (beside other things) converting to char array, but I get (logically, I guess) only array of wchar_t, so this doesn't solve my problem.
To be clear about my goal: for the input, let's say, "aaa" I want "3" as output (since "a" takes one byte in UTF-8). But for the input "āaa", I'd like to see output "4" (since ā is a two-byte character).
I think this has to be quite a common request, but even after 1.5 hours of searching and experimenting, I couldn't find the correct solution.
I have very little experience with Windows programming, so maybe I left out some crucial information. If you feel like that, please let me know, I'll add any information you request.
As your CString contains a series of wchar_t, you can just use WideCharToMultiByte with CP_UTF8 as the output code page. The function returns the number of bytes written to the output buffer, which is the length of the UTF-8 encoded string:
LPWSTR instr;
char outstr[MAX_OUTSTR_SIZE];
int utf8_len = WideCharToMultiByte(CP_UTF8, 0, instr, -1, outstr, MAX_OUTSTR_SIZE, NULL, NULL);
If you don't need the output string, you can simply set the output buffer size to 0
cbMultiByte
Size, in bytes, of the buffer indicated by lpMultiByteStr. If this parameter is set to 0, the function returns the required buffer size for lpMultiByteStr and makes no use of the output parameter itself.
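In that case the function returns the number of bytes in UTF-8 without actually writing anything. Note that when -1 is passed for cchWideChar, the function processes the terminating null as well, so the returned count includes one extra byte for it; subtract 1, or pass the character count explicitly, to get the byte length of the string itself.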
int utf8_len = WideCharToMultiByte(CP_UTF8, 0, instr, -1, NULL, 0, NULL, NULL);
If your CString is really CStringA, i.e. _UNICODE is not defined, then you need to use MultiByteToWideChar to convert the string to UTF-16 and then convert from UTF-16 to UTF-8 with WideCharToMultiByte. See How do I convert an ANSI string directly to UTF-8? But new code should never be compiled without Unicode support anyway.
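For reference, the byte count can also be worked out by hand. The following is a portable sketch, not a replacement for WideCharToMultiByte: it walks UTF-16 code units and adds up the UTF-8 bytes each one needs, without counting a null terminator. The function name is made up, char16_t stands in for wchar_t on Windows, and counting a lone surrogate as 3 bytes (as if replaced by U+FFFD) is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of bytes the UTF-16 text occupies in UTF-8, excluding any terminator.
std::size_t utf8_byte_count(const std::u16string& text)
{
    std::size_t bytes = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char16_t unit = text[i];
        if (unit < 0x80) {
            bytes += 1;  // ASCII
        } else if (unit < 0x800) {
            bytes += 2;  // e.g. U+0101
        } else if (unit >= 0xD800 && unit <= 0xDBFF &&
                   i + 1 < text.size() &&
                   text[i + 1] >= 0xDC00 && text[i + 1] <= 0xDFFF) {
            bytes += 4;  // valid surrogate pair
            ++i;         // skip the low surrogate
        } else {
            bytes += 3;  // rest of the BMP; lone surrogates counted as U+FFFD (assumption)
        }
    }
    return bytes;
}

// Self-check with the two inputs from the question.
void check_question_inputs()
{
    assert(utf8_byte_count(u"aaa") == 3);
    assert(utf8_byte_count(u"\u0101aa") == 4);
}
```

With the inputs from the question, "aaa" comes out as 3 and "āaa" as 4.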

How to find if a character belongs to a particular codepage using c++ or calling winapi

How can we find if a character belongs to a particular codepage?
or how can we determine whether a character fits into the currently active IME for an application?
First, convert your UTF-8 string of characters to UTF-16 using MultiByteToWideChar.
Now, reverse the process using WideCharToMultiByte passing the desired codepage as the first parameter.
Use the WC_ERR_INVALID_CHARS flag and WideCharToMultiByte will fail outright if any invalid characters are used. If you want to know which characters are not represented in the target codepage, use the lpDefaultChar, and lpUsedDefaultChar parameters.
LPCWSTR pszUtf16; // converted from utf8 source character
UINT nTargetCP = CP_ACP;
BOOL fBadCharacter = FALSE;
if (WideCharToMultiByte(nTargetCP, WC_NO_BEST_FIT_CHARS, pszUtf16, -1, NULL, 0, NULL, &fBadCharacter))
{
if(fBadCharacter)
{
// at least one character in the string was not represented in nTargetCP
}
}
The two previous answers have correctly suggested using MultiByteToWideChar then WideCharToMultiByte to translate your UTF-8 character to UTF-16, then to the current Windows codepage (CP_ACP). Check the result of WideCharToMultiByte to see if the conversion was successful.
What wasn't clear from the original question is that you are having a particular issue with Hindi. For this language, your question is meaningless because there is no Windows ANSI codepage for Hindi, as Chris Becke pointed out. Therefore, you can never convert a Hindi character to CP_ACP, and WideCharToMultiByte will always fail.
To use Hindi on Windows, as far as I understand it, you must be a Unicode app that calls Unicode APIs.
Using the windows functions WideCharToMultiByte and MultiByteToWideChar you can convert between UTF-8 and 16-bit Unicode characters. The functions have arguments to specify the code page and to specify the behavior if an invalid character is encountered.
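The "used default character" idea can be illustrated without the Windows API. In this sketch ISO-8859-1 stands in for the target code page, only because its mapping from Unicode is trivial (code points up to 0xFF map to the same byte value); real Windows code pages need the actual WideCharToMultiByte call. The function name and parameters are hypothetical.

```cpp
#include <cassert>
#include <string>

// Convert UTF-16 to ISO-8859-1, substituting default_char for anything the
// target cannot represent and reporting whether a substitution happened.
std::string to_latin1(const std::u16string& text, char default_char, bool& used_default)
{
    used_default = false;
    std::string out;
    for (char16_t unit : text) {
        if (unit <= 0xFF) {
            out.push_back(static_cast<char>(static_cast<unsigned char>(unit)));
        } else {
            out.push_back(default_char);  // not representable in the target
            used_default = true;
        }
    }
    return out;
}

// Self-check: 'A' fits, the Devanagari letter U+0930 does not.
void check_default_flag()
{
    bool used = false;
    to_latin1(u"A", '?', used);
    assert(!used);
    to_latin1(u"\u0930", '?', used);
    assert(used);
}
```

The flag plays the role of lpUsedDefaultChar: it ends up true as soon as one character had to be replaced.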
Thanks, Chris. I am running the following code:
#define CP_HINDI 0
#define CP_JAPANESE 932
#define CP_ENGLISH 1252
wchar_t wcsStringJapanese = 'あ';
wchar_t wcsStringHindi = 'र';
wchar_t wcsStringEnglish = 'A';
int main()
{
BOOL usedDefaultCharacter = FALSE;
/* Test for ENGLISH */
WideCharToMultiByte( CP_ENGLISH,
0, &wcsStringEnglish,
-1,
NULL,
0,
NULL,
&usedDefaultCharacter);
printf("usedDefaultCharacters for English? %d \n",usedDefaultCharacter);
usedDefaultCharacter = FALSE;
/*TEST FOR JAPANESE */
WideCharToMultiByte( CP_JAPANESE,
0,
&wcsStringJapanese,
-1,
NULL,
0,
NULL,
&usedDefaultCharacter);
printf("usedDefaultCharacters for Japanese? %d \n",usedDefaultCharacter);
//TEST FOR HINDI
usedDefaultCharacter = FALSE;
WideCharToMultiByte( CP_HINDI,
0,
&wcsStringHindi,
-1,
NULL,
0,
NULL,
&usedDefaultCharacter);
printf("usedDefaultCharacters for Hindi? %d \n",usedDefaultCharacter);
}
The above code returns:
usedDefaultCharacters for English? 0
usedDefaultCharacters for Japanese? 0
usedDefaultCharacters for Hindi? 1
The third line is incorrect, as the code page for Hindi is 0 and the string passed consists of a Hindi character, yet usedDefaultChar is still set to 1, which should not be the case.

convert std::string to const BYTE* for RegSetValueEx()

I have a function that gets a std::string. That function calls
RegSetValueEx
the 5th parameter is the value of the registry value and expects a variable of type const BYTE*.
So I have to convert the std::string to const BYTE* and also give the length of the resulting array as the 6th parameter.
I have found a way to do it, but it feels ugly and I don't really understand what is going on. Here is a slimmed down version of that function:
void function(const std::string& newValue)
{
HKEY keyHandle;
if(RegOpenKeyEx(HKEY_CLASSES_ROOT, TEXT("some key"),0,KEY_ALL_ACCESS,&keyHandle) == ERROR_SUCCESS)
{
std::wstring wNewValue;
wNewValue.assign(newValue.begin(),newValue.end());
if (RegSetValueEx(keyHandle, TEXT("some value"), NULL, REG_SZ, (const BYTE*)(LPCTSTR)(wNewValue.c_str()), wNewValue.size()*2)==ERROR_SUCCESS)
{
//do something
}
RegCloseKey(keyHandle);
}
}
As you can see, I first make a wide string (UNICODE is defined), then use a double cast, and for the length I have to do *2, else it will only set half of the input string.
Is this form of cast the normal/best way to do it?
Why the * 2, what would be a better way?
void function(const std::string& newValue)
{
HKEY keyHandle;
if(RegOpenKeyEx(HKEY_CLASSES_ROOT, TEXT("some key"),0,KEY_ALL_ACCESS,&keyHandle) == ERROR_SUCCESS)
{
if (RegSetValueExA(keyHandle, "some value", 0, REG_SZ, (const BYTE*)newValue.c_str(), (DWORD)(newValue.size() + 1))==ERROR_SUCCESS)
{
//do something
}
RegCloseKey(keyHandle);
}
}
I removed the part where you convert your string to a wstring; instead, you'll be using the ANSI version of RegSetValueEx explicitly.
Quote from the RegSetValueEx remarks on MSDN:
If dwType is the REG_SZ, REG_MULTI_SZ, or REG_EXPAND_SZ type and the ANSI version of this function is used (either by explicitly calling RegSetValueExA or by not defining UNICODE before including the Windows.h file), the data pointed to by the lpData parameter must be an ANSI character string. The string is converted to Unicode before it is stored in the registry.
Also note that the cbData parameter should include the size of the null terminator as well.
The * 2 is because RegSetValueEx wants to know the number of bytes to write, and each character (wchar_t) in a wstring is two bytes wide on Windows. So the resulting byte array has twice the size!
Shouldn't it be wNewValue.size()*2+2 ? The +2 for the null character?
MSDN says: The size of the information pointed to by the lpData parameter, in bytes. If the data is of type REG_SZ, REG_EXPAND_SZ, or REG_MULTI_SZ, cbData must include the size of the terminating null character or characters.
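Putting the * 2 and the + 2 together, here is a portable sketch of the arithmetic (no Windows API calls). char16_t stands in for the 2-byte wchar_t on Windows, and both function names are made up for illustration. The widening helper mirrors the assign(begin, end) trick from the question and is only meaningful for ASCII input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Widen an ASCII-only narrow string by copying each char into a 16-bit unit.
std::u16string widen_ascii(const std::string& narrow)
{
    std::u16string wide;
    for (char c : narrow) {
        wide.push_back(static_cast<char16_t>(static_cast<unsigned char>(c)));
    }
    return wide;
}

// Byte count to report for a REG_SZ value: every code unit plus the
// terminating null, each sizeof(char16_t) == 2 bytes wide.
std::size_t reg_sz_byte_count(const std::u16string& value)
{
    return (value.size() + 1) * sizeof(char16_t);
}

// Self-check: "abc" is 3 units, plus the terminator, times 2 bytes each.
void check_abc()
{
    assert(reg_sz_byte_count(widen_ascii("abc")) == 8);
}
```

So for a three-character value the count is 3 * 2 + 2 = 8 bytes, which is what the size()*2+2 suggestion above works out to.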
You could also copy the unicode string into a byte array:
LPCWSTR pData = L"SampleGrabber";
DWORD dwSize = (DWORD)((wcslen(pData) + 1) * sizeof(WCHAR)); // size in bytes, including the null terminator
BYTE slump[256];
memset(slump, 0, sizeof(slump));
memcpy(slump, pData, dwSize); // dwSize (28 bytes) fits in the 256-byte buffer
LONG globit = RegSetValueEx(hKey, NULL, 0, REG_SZ, slump, dwSize);
This can be useful if you have the misfortune of writing code for a WinCE 5.0 device that lacks part of the registry API.