safe convert uint8_t into wchar_t - c++

This is my very first native extension between AIR and C++, and in C++ programming as well, so I am not sure where to look for help with type conversions.
From AIR side I receive a string in uint8_t (FRE... function call). I need to use this string for writing to registry, where wchar_t is expected.
I am not sure if I can simply make a type casting, or make new wchar[] array of the same size and copy characters or do I need to use someting like MultiByteToWideChar() to convert UTF-8 characters to wchar_t. Im not sure about the correct size of output array in this case.
...
FREObject retObj = NULL;
uint32_t strLength = 0;
const uint8_t *path8 = NULL;
// this is what I get from actionscript call
FREResult status = FREGetObjectAsUTF8(argv[ARG_PATH], &strLength, &path8);
if ((status != FRE_OK) || (strLength <= 0) || (path8 == NULL)) return retObj;
LONG hStatus;
HKEY hKey;
wchar_t *path;
// ??? how to copy uint_t into wchar_t
hStatus = RegOpenKeyEx(HKEY_CURRENT_USER, path, 0, KEY_ALL_ACCESS, &hKey);
...

std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> cv;
std::wstring dst = cv.from_bytes(src);

Related

Issue with converting a filesystem path to a const BYTE*

What I'm trying to achieve is getting my program executed on Windows startup using the Registry. When I try to put the file location, the compiler complains it cannot convert from filesystem::path to const BYTE*. I have no idea how to fix this, since I'm a beginner when it comes to C++. I have provided the code below:
HKEY newValue;
RegOpenKey(HKEY_CURRENT_USER, "Software\\Microsoft\\Windows\\CurrentVersion\\Run", &newValue);
RegSetValueEx(newValue, "myprogram", 0, REG_SZ, fs::temp_directory_path().append(filename), sizeof("tes3t")); // This line is the issue. fs::temp_directory_path().append(filename)
RegCloseKey(newValue);
return 0;
EXCEPTION: No suitable conversion function from "std:filesystem::path" to "const BYTE *" exists
Per the RegSetValueExA() documentation, the function does not accept a std::filesystem::path object. That is what the error message is complaining about.
LSTATUS RegSetValueExA(
HKEY hKey,
LPCSTR lpValueName,
DWORD Reserved,
DWORD dwType,
const BYTE *lpData, // <-- here
DWORD cbData
);
The 5th parameter takes a const BYTE* pointer to a null-terminated C-style string. The 6th parameter takes the number of characters in the string, including the null terminator:
lpData
The data to be stored.
For string-based types, such as REG_SZ, the string must be null-terminated. With the REG_MULTI_SZ data type, the string must be terminated with two null characters.
Note lpData indicating a null value is valid, however, if this is the case, cbData must be set to '0'.
cbData
The size of the information pointed to by the lpData parameter, in bytes. If the data is of type REG_SZ, REG_EXPAND_SZ, or REG_MULTI_SZ, cbData must include the size of the terminating null character or characters.
std::filesystem::path does not have an implicit conversion to const BYTE*, hence the compiler error. You need to explicitly convert the path to a std::string or std::wstring first (prefer the latter, since the Registry stores strings as Unicode internally), before you can then save that string value to the Registry, eg:
// using std::string...
HKEY newValue;
// don't use RegOpenKey()! It is provided only for backwards compatibility with 16bit apps...
if (RegOpenKeyExA(HKEY_CURRENT_USER, "Software\\Microsoft\\Windows\\CurrentVersion\\Run", 0, KEY_SET_VALUE, &newValue) == 0)
{
// this may lose data for non-ASCII characters!
std::string s = fs::temp_directory_path().append(filename).string();
// this will convert the ANSI string to Unicode for you...
RegSetValueExA(newValue, "myprogram", 0, REG_SZ, reinterpret_cast<LPCBYTE>(s.c_str()), s.size()+1);
RegCloseKey(newValue);
}
return 0;
// using std::wstring...
HKEY newValue;
// don't use RegOpenKey()! It is provided only for backwards compatibility with 16bit apps...
if (RegOpenKeyExW(HKEY_CURRENT_USER, L"Software\\Microsoft\\Windows\\CurrentVersion\\Run", 0, KEY_SET_VALUE, &newValue) == 0)
{
// no data loss here!
std::wstring s = fs::temp_directory_path().append(filename).wstring();
// no ANSI->Unicode conversion is performed here...
RegSetValueExW(newValue, L"myprogram", 0, REG_SZ, reinterpret_cast<LPCBYTE>(s.c_str()), (s.size()+1) * sizeof(WCHAR));
RegCloseKey(newValue);
}
return 0;
The WinAPI functions do not take arguments of std::filesystem::path type so you need to convert it to a const BYTE* somehow.
This is one example:
std::string fullpath = (fs::temp_directory_path() / filename).string();
RegSetValueEx(
newValue,
"myprogram",
0,
REG_SZ,
reinterpret_cast<LPCBYTE>(fullpath.c_str()), // returns a "const char*" then cast
fullpath.size() + 1 // + 1 for null terminator
);

C++ - getting null values in value read from registry

My application properly reads and writes to the registry. Now, I need to read a registry value from:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Cryptography\MachineGuid
Here is my code:
bool GetWindowsID(string &winID)
{
HKEY hKey = 0, hKeyType = HKEY_LOCAL_MACHINE;
bool status = false;
DWORD dwType = 0;
DWORD dwBufSize = 256;
char value[256] = "\0";
if (RegOpenKeyEx(hKeyType, L"SOFTWARE\\Microsoft\\Cryptography", NULL, KEY_QUERY_VALUE|KEY_WOW64_64KEY, &hKey) == ERROR_SUCCESS)
{
dwType = REG_SZ;
if (RegQueryValueEx(hKey, L"MachineGuid", NULL, &dwType, (LPBYTE)value, &dwBufSize) == ERROR_SUCCESS)
status = true;
RegCloseKey(hKey);
}
winID.assign(value);
return status;
}
I get the guid but in value array after each character their is a "\0" value due to which only first character of array gets assigned to string. This is wierd!
You have built targeting Unicode and so the registry API is returning UTF-16 Unicode text. Instead of char use wchar_t and remember that each wchar_t element is 2 bytes wide.
Do also make sure that you account for the returned string not being null-terminated, as described in the documentation. You must take account of the value returned in dwBufSize.
I get the guid but in value array after each character their is a "\0"
value due to which only first character of array gets assigned to
string. This is wierd!
This is because you are calling the Unicode version of RegQueryValueEx(), so the string is returned in Unicode (UTF-16).
You will have to use wide character parameters to get the value.
Change this line:
char value[256] = "\0";
To use wchar_t instead.

Why obtained MachineGuid looks not alike a GUID but like Korean?

I created a simple function:
std::wstring GetRegKey(const std::string& location, const std::string& name){
const int valueLength = 10240;
auto platformFlag = KEY_WOW64_64KEY;
HKEY key;
TCHAR value[valueLength];
DWORD bufLen = valueLength*sizeof(TCHAR);
long ret;
ret = RegOpenKeyExA(HKEY_LOCAL_MACHINE, location.c_str(), 0, KEY_READ | platformFlag, &key);
if( ret != ERROR_SUCCESS ){
return std::wstring();
}
ret = RegQueryValueExA(key, name.c_str(), NULL, NULL, (LPBYTE) value, &bufLen);
RegCloseKey(key);
if ( (ret != ERROR_SUCCESS) || (bufLen > valueLength*sizeof(TCHAR)) ){
return std::wstring();
}
std::wstring stringValue(value, (size_t)bufLen - 1);
size_t i = stringValue.length();
while( i > 0 && stringValue[i-1] == '\0' ){
--i;
}
return stringValue;
}
And I call it like auto result = GetRegKey("SOFTWARE\\Microsoft\\Cryptography", "MachineGuid");
yet string looks like
㤴ㄷ㤵戰㌭㉣ⴱ㔴㍥㤭慣ⴹ㍥摢㘵〴㉡ㄵ\0009ca9-e3bd5640a251
not like RegEdit
4971590b-3c21-45e3-9ca9-e3bd5640a251
So I wonder what shall be done to get a correct representation of MachineGuid in C++?
RegQueryValueExA is an ANSI wrapper around the Unicode version since Windows NT. When building on a Unicode version of Windows, it not only converts the the lpValueName to a LPCWSTR, but it will also convert the lpData retrieved from the registry to an LPWSTR before returning.
MSDN has the following to say:
If the data has the REG_SZ, REG_MULTI_SZ or REG_EXPAND_SZ type, and
the ANSI version of this function is used (either by explicitly
calling RegQueryValueExA or by not defining UNICODE before including
the Windows.h file), this function converts the stored Unicode string
to an ANSI string before copying it to the buffer pointed to by
lpData.
Your problem is that you are populating the lpData, which holds TCHARs (WCHAR on Unicode versions of Windows) with an ANSI string.
The garbled string that you see is a result of 2 ANSI chars being used to populate a single wchar_t. That explains the Asian characters. The portion that looks like the end of the GUID is because the print function blew past the terminating null since it was only one byte and began printing what is probably a portion of the buffer that was used by RegQueryValueExA before converting to ANSI.
To solve the problem, either stick entirely to Unicode, or to ANSI (if you are brave enough to continue using ANSI in the year 2014), or be very careful about your conversions. I would change GetRegKey to accept wstrings and use RegQueryValueExW instead, but that is a matter of preference and what sort of code you plan on using this in.
(Also, I would recommend you have someone review this code since there are a number of oddities in the error checking, and a hard coded buffer size.)

One file lib to conv utf8 (char*) to wchar_t?

I am using libjson which is awesome. The only problem I have is I need to convert an utf8 string (char*) to a wide char string (wchar_t*). I googled and tried 3 different libs and they ALL failed (due to missing headers).
I don't need anything fancy. Just a one way conversion. How do I do this?
If you're on windows (which, chances are you are, given your need for wchar_t), use MultiByteToWideChar function (declared in windows.h), as so:
int length = MultiByteToWideChar(CP_UTF8, 0, src, src_length, 0, 0);
wchar_t *output_buffer = new wchar_t [length];
MultiByteToWideChar(CP_UTF8, 0, src, src_length, output_buffer, length);
Alternatively, if all you're looking for is a literal multibyte representation of your UTF8 (which is improbable, but possible), use the following (stdlib.h):
wchar_t * output_buffer = new wchar_t [1024];
int length = mbstowcs(output_buffer, src, 1024);
if(length > 1024){
delete[] output_buffer;
output_buffer = new wchar_t[length+1];
mbstowcs(output_buffer, src, length);
}
Hope this helps.
the below successfully enables CreateDirectoryW() to write to C:\Users\ПетрКарасев , basically an easier-to-understand wrapper around the MultiByteTyoWideChar mentioned by someone earlier.
std::wstring utf16_from_utf8(const std::string & utf8)
{
// Special case of empty input string
if (utf8.empty())
return std::wstring();
// Шаг 1, Get length (in wchar_t's) of resulting UTF-16 string
const int utf16_length = ::MultiByteToWideChar(
CP_UTF8, // convert from UTF-8
0, // default flags
utf8.data(), // source UTF-8 string
utf8.length(), // length (in chars) of source UTF-8 string
NULL, // unused - no conversion done in this step
0 // request size of destination buffer, in wchar_t's
);
if (utf16_length == 0)
{
// Error
DWORD error = ::GetLastError();
throw ;
}
// // Шаг 2, Allocate properly sized destination buffer for UTF-16 string
std::wstring utf16;
utf16.resize(utf16_length);
// // Шаг 3, Do the actual conversion from UTF-8 to UTF-16
if ( ! ::MultiByteToWideChar(
CP_UTF8, // convert from UTF-8
0, // default flags
utf8.data(), // source UTF-8 string
utf8.length(), // length (in chars) of source UTF-8 string
&utf16[0], // destination buffer
utf16.length() // size of destination buffer, in wchar_t's
) )
{
// не работает сука ...
DWORD error = ::GetLastError();
throw;
}
return utf16; // ура!
}
Here is a piece of code i wrote. It seems to work well enough. It returns 0 on utf8 error or when the value is > FFFF (which cant be held by a wchar_t)
#include <string>
using namespace std;
wchar_t* utf8_to_wchar(const char*utf8){
wstring sz;
wchar_t c;
auto p=utf8;
while(*p!=0){
auto v=(*p);
if(v>=0){
c = v;
sz+=c;
++p;
continue;
}
int shiftCount=0;
if((v&0xE0) == 0xC0){
shiftCount=1;
c = v&0x1F;
}
else if((v&0xF0) == 0xE0){
shiftCount=2;
c = v&0xF;
}
else
return 0;
++p;
while(shiftCount){
v = *p;
++p;
if((v&0xC0) != 0x80) return 0;
c<<=6;
c |= (v&0x3F);
--shiftCount;
}
sz+=c;
}
return (wchar_t*)sz.c_str();
}
The following (untested) code shows how to convert a multibyte string in your current locale into a wide string. So if your current locale is UTF-8, then this will suit your needs.
const char * inputStr = ... // your UTF-8 input
size_t maxSize = strlen(inputStr) + 1;
wchar_t * outputWStr = new wchar_t[maxSize];
size_t result = mbstowcs(outputWStr, inputStr, maxSize);
if (result == -1) {
cerr << "Invalid multibyte characters in input";
}
You can use setlocale() to set your locale.

RegQueryValueExW only brings back one value from registry

I am querying the registry on Windows CE. I want to pull back the DhcpDNS value from the TcpIp area of the registry, which works.
What happens though, however, is if there is two values - displayed as "x.x.x.x" "x.x.x.x" in my CE registry editor - then it only brings back one of them. I am sure this is a silly mistake but I am unsure why it is happening.
Here is the code I am using
std::string ISAPIConfig::GetTcpIpRegSetting(const std::wstring &regEntryName)
{
HKEY hKey = 0;
HKEY root = HKEY_LOCAL_MACHINE;
LONG retVal = 0;
wchar_t buffer[3000];
DWORD bufferSize = 0;
DWORD dataType = 0;
std::string dataString = "";
//Open IP regkey
retVal = RegOpenKeyExW(root, L"Comm\\PCI\\FETCE6B1\\Parms\\TcpIp", 0, 0, &hKey);
//Pull out info we need
memset(buffer, 0, sizeof(buffer));
bufferSize = sizeof(buffer);
retVal = RegQueryValueExW(hKey, regEntryName.c_str(), 0, &dataType, reinterpret_cast<LPBYTE>(buffer), &bufferSize);
Unicode::UnicodeToAnsi(buffer, dataString);
return dataString;
}
void UnicodeToAnsi(const std::wstring &wideString, std::string &ansiString){
std::wostringstream converter;
std::ostringstream converted;
std::wstring::const_iterator loop;
for(loop = wideString.begin(); loop != wideString.end(); ++loop){
converted << converter.narrow((*loop));
}
ansiString = converted.str();
}
The value is a multi_sz, which is in the format:
{data}\0{data}\0\0
I don't know what the Unicode::UnicodeToAnsi does, but it's likely just looking for that first null terminator and stopping there. You have to parse past single nulls until you hit the double-null.
EDIT
You have to update your code - very likely your interfaces. Right now you're trying to returns a string for a multi_sz which, by definition, means multiple strings. you probably want to returns a string[] (though I'd probably opt to use a couple output parameters - one that's an array pointer and the other that is a element count).
You then need to loop through the data that came back from the RegQuery call, something maybe like this (off the top of my head, not tested or compiled):
TCHAR *p = buffer;
if(bufferSize > 0)
{
do
{
Unicode::UnicodeToAnsi(p, dataString);
// do something with dataString - store it in an array or whatever
p+= _tcslen(p);
} while((p-buffer) < bufferSize)
}