Convert between wstring and string, got different results with the "same" approach

I use a function s2ws() (found on SO; if you see something wrong with it, please let me know) to convert from string to wstring, and I use tinyxml2 to read values from XML. Some of the tinyxml2 interfaces take char * as input and return char * as well.
I convert from string to wstring because the whole project uses wchar_t types for strings.
/*
string converts to wstring
*/
std::wstring s2ws(const std::string& src)
{
std::wstring res = L"";
size_t const wcs_len = mbstowcs(NULL, src.c_str(), 0);
std::vector<wchar_t> buffer(wcs_len + 1);
// The limit is the destination capacity in wide characters, not the source length in bytes
mbstowcs(&buffer[0], src.c_str(), buffer.size());
res.assign(buffer.begin(), buffer.end() - 1);
return res;
}
/*
wstring converts to string
*/
std::string ws2s(const std::wstring & src)
{
setlocale(LC_CTYPE, "");
std::string res = "";
size_t const mbs_len = wcstombs(NULL, src.c_str(), 0);
std::vector<char> buffer(mbs_len + 1);
wcstombs(&buffer[0], src.c_str(), buffer.size());
res.assign(buffer.begin(), buffer.end() - 1);
return res;
}
ClassES->Attribute() returns a char *, and s2ws() converts it to a wstring. The two approaches below leave different results in the map m_UpdateClassification. The second approach is the code between #if 0 and #endif. I think the two approaches should make no difference.
The second approach yields an empty string after the conversion, and I cannot figure out why. If you have any clue, please let me know.
typedef std::map<std::wstring, std::wstring> CMapString;
CMapString m_UpdateClassification;
const wchar_t * First = NULL;
const wchar_t * Second = NULL;
const char *name = ClassES->Attribute( "name" );
const char *value = ClassES->Attribute( "value" );
std::wstring wname = s2ws(name);
std::wcout<< wname << std::endl;
First = wname.c_str();
std::wstring wvalue = s2ws(value);
std::wcout<< wvalue << std::endl;
Second = wvalue.c_str();
#if 0
First = s2ws(ClassES->Attribute( "name" )).c_str();
if( !First ) { m_ProdectFamily.clear(); return FALSE; }
Second = s2ws(ClassES->Attribute( "value" )).c_str();
if( !Second ) { m_ProdectFamily.clear(); return FALSE; }
#endif
m_UpdateClassification[Second] = First;

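I think I found the reason: I was storing the result of c_str() in a wchar_t * instead of keeping the std::wstring itself. In the #if 0 version, the std::wstring returned by s2ws() is a temporary that is destroyed at the end of the statement, so the stored pointer dangles. After modifying the code like this, everything runs well.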
std::wstring First = L"";
std::wstring Second = L"";
First = s2ws(ClassES->Attribute("name"));
if( First.empty() ) { m_ProdectFamily.clear(); return FALSE; }
Second = s2ws(ClassES->Attribute("value"));
if( Second.empty() ) { m_ProdectFamily.clear(); return FALSE; }
Another question: should I check the results of mbstowcs() in s2ws() and wcstombs() in ws2s()?
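On the question of checking the result: both mbstowcs and wcstombs return (size_t)-1 when they meet input that cannot be converted in the current LC_CTYPE locale, and s2ws() as posted would then size its buffer from that value. A minimal sketch of a checked variant follows; the name try_s2ws and the bool-plus-out-parameter shape are assumptions made for illustration, not part of the original code.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical checked variant of s2ws(): returns false instead of
// producing a bogus string when the input cannot be converted.
bool try_s2ws(const std::string& src, std::wstring& out)
{
    // With a NULL destination, mbstowcs only measures the result length.
    // (size_t)-1 means the input is not valid in the current LC_CTYPE locale.
    const std::size_t len = std::mbstowcs(NULL, src.c_str(), 0);
    if (len == static_cast<std::size_t>(-1))
        return false;

    // One extra element for the terminating L'\0'.
    std::vector<wchar_t> buffer(len + 1);
    if (std::mbstowcs(&buffer[0], src.c_str(), buffer.size()) == static_cast<std::size_t>(-1))
        return false;

    // Copy everything except the terminator.
    out.assign(buffer.begin(), buffer.begin() + len);
    return true;
}
```

A caller would keep the converted value in a std::wstring and only use it when the function returns true. The same pattern would apply to wcstombs in ws2s().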

Related

How to read Unicode string from process in Windows?

I'm trying to read a Unicode string from another process's memory with this code:
Function:
bool ReadWideString(const HANDLE& hProc, const std::uintptr_t& addr, std::wstring& out) {
std::array<wchar_t, maxStringLength> outStr;
auto readMemRes = ReadProcessMemory(hProc, (LPCVOID)addr,(LPVOID)&out, sizeof(out), NULL);
if (!readMemRes)
return false;
else {
out = std::wstring(outStr.data());
}
return true;
}
Call:
std::wstring name;
bool res = ReadWideString(OpenedProcessHandle, address, name);
std::wofstream test("test.txt");
test << name;
test.close();
This is working well with English letters, but when I try to read Cyrillic, it outputs nothing. I tried with std::string, but all I get is just a random junk like "EC9" instead of "Дебил".
I'm using Visual Studio 17 and the C++17 standard.

You can't read directly into the wstring the way you are doing. That will overwrite its internal data members and corrupt surrounding memory, which would be very bad.
You are allocating a local buffer, but you are not using it for anything. Use it, e.g.:
bool ReadWideString(HANDLE hProc, std::uintptr_t addr, std::wstring& out) {
std::array<wchar_t, maxStringLength> outStr;
SIZE_T numRead = 0;
if (!ReadProcessMemory(hProc, reinterpret_cast<LPVOID>(addr), &outStr, sizeof(outStr), &numRead))
return false;
out.assign(outStr.data(), numRead / sizeof(wchar_t));
return true;
}
std::wstring name;
if (ReadWideString(OpenedProcessHandle, address, name)) {
std::ofstream test("test.txt", std::ios::binary);
wchar_t bom = 0xFEFF;
test.write(reinterpret_cast<char*>(&bom), sizeof(bom));
test.write(reinterpret_cast<const char*>(name.c_str()), name.size() * sizeof(wchar_t));
}
Alternatively, get rid of the local buffer and preallocate the wstring's memory buffer instead, then you can read directly into it, eg:
bool ReadWideString(HANDLE hProc, std::uintptr_t addr, std::wstring& out) {
out.resize(maxStringLength);
SIZE_T numRead = 0;
if (!ReadProcessMemory(hProc, reinterpret_cast<LPVOID>(addr), &out[0], maxStringLength * sizeof(wchar_t), &numRead)) {
out.clear();
return false;
}
out.resize(numRead / sizeof(wchar_t));
return true;
}
Or
bool ReadWideString(HANDLE hProc, std::uintptr_t addr, std::wstring& out) {
std::wstring outStr;
outStr.resize(maxStringLength);
SIZE_T numRead = 0;
if (!ReadProcessMemory(hProc, reinterpret_cast<LPVOID>(addr), &outStr[0], maxStringLength * sizeof(wchar_t), &numRead))
return false;
outStr.resize(numRead / sizeof(wchar_t));
out = std::move(outStr);
return true;
}
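One thing to keep in mind with the versions above: they keep numRead / sizeof(wchar_t) characters, so when the string in the target process is shorter than maxStringLength, the result also contains the terminator and whatever followed it in memory. A possible follow-up step is to cut the data at the first L'\0'. The helper below is a portable sketch of that step only; the name fromFixedBuffer is hypothetical and it does not call any Windows API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: build a string from `count` wide characters that were
// copied out of a fixed-size buffer, stopping at the first L'\0' if there is one.
std::wstring fromFixedBuffer(const wchar_t* data, std::size_t count)
{
    std::size_t len = 0;
    while (len < count && data[len] != L'\0')
        ++len;
    // If no terminator was found, all `count` characters are kept.
    return std::wstring(data, len);
}
```

In the first version of ReadWideString, this would take outStr.data() and numRead / sizeof(wchar_t) as its arguments.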

Convert C++ std::string to UTF-16-LE encoded string

I've been searching for hours today and just can't find anything that works out for me. The one I've just had a look at, with no luck, is "How to convert UTF-8 encoded std::string to UTF-16 std::string".
My question is, with a brief explanation:
I want to make a valid NTLM hash in std C++, and I'm using OpenSSL's library to create the hash using its MD4 routines. I know how to do that, so does anyone know how to convert the std::string into a UTF-16 LE encoded string which I can pass to the MD4 functions to get a correct digest?
So, can I have a std::string which holds the char type, and convert it to a UTF16-LE encoded variable length std::string_type? Whether that be std::u16string, or std::wstring?
And would I use s.c_str() or s.data() and would the length() function report correctly in both cases?

I think something like this should do the trick:
std::string utf16_to_utf8(std::u16string const& s)
{
std::wstring_convert<std::codecvt_utf8_utf16<char16_t, 0x10ffff,
std::codecvt_mode::little_endian>, char16_t> cnv;
std::string utf8 = cnv.to_bytes(s);
if(cnv.converted() < s.size())
throw std::runtime_error("incomplete conversion");
return utf8;
}
std::u16string utf8_to_utf16(std::string const& utf8)
{
std::wstring_convert<std::codecvt_utf8_utf16<char16_t, 0x10ffff,
std::codecvt_mode::little_endian>, char16_t> cnv;
std::u16string s = cnv.from_bytes(utf8);
if(cnv.converted() < utf8.size())
throw std::runtime_error("incomplete conversion");
return s;
}
Note that std::wstring_convert is deprecated in C++17, but I still favor using it over a non-standard library, given that it is portable, has no dependencies, and will no doubt remain until replaced.
And, if all else fails, you can reimplement these same functions with alternative code without changing any other part of the application.
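For the hashing part of the question, note that length() on a std::u16string counts 16-bit code units, not bytes, and that the in-memory byte order of char16_t depends on the machine. If the digest must be computed over UTF-16LE bytes, one option is to serialize the code units explicitly. This is a sketch under that assumption; utf16le_bytes is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: lay out UTF-16 code units as little-endian bytes,
// independent of the byte order of the machine running the code.
std::string utf16le_bytes(const std::u16string& s)
{
    std::string bytes;
    bytes.reserve(s.size() * 2);
    for (char16_t unit : s) {
        bytes.push_back(static_cast<char>(unit & 0xFF));        // low byte first
        bytes.push_back(static_cast<char>((unit >> 8) & 0xFF)); // then high byte
    }
    return bytes;
}
```

The result's data() and size() then describe exactly the byte sequence to hash, and size() is twice the number of code units.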

Apologies in advance: this will be an ugly reply with some long code. I ended up using the following function, after effectively compiling iconv into my Windows application file by file :)
Hope this helps.
char* conver(const char* in, size_t in_len, size_t* used_len)
{
const int CC_MUL = 2; // 16 bit
setlocale(LC_ALL, "");
char* t1 = setlocale(LC_CTYPE, "");
char* locn = (char*)calloc(strlen(t1) + 1, sizeof(char));
if(locn == NULL)
{
return 0;
}
strcpy(locn, t1);
const char* enc = strchr(locn, '.') + 1;
#if _WINDOWS
std::string win = "WINDOWS-";
win += enc;
enc = win.c_str();
#endif
iconv_t foo = iconv_open("UTF-16LE", enc);
if(foo == (void*)-1)
{
if (errno == EINVAL)
{
fprintf(stderr, "Conversion from %s is not supported\n", enc);
}
else
{
fprintf(stderr, "Initialization failure:\n");
}
free(locn);
return 0;
}
size_t out_len = CC_MUL * in_len;
size_t saved_in_len = in_len;
iconv(foo, NULL, NULL, NULL, NULL);
char* converted = (char*)calloc(out_len, sizeof(char));
char *converted_start = converted;
char* t = const_cast<char*>(in);
int ret = iconv(foo,
&t,
&in_len,
&converted,
&out_len);
iconv_close(foo);
*used_len = CC_MUL * saved_in_len - out_len;
if(ret == -1)
{
switch(errno)
{
case EILSEQ:
fprintf(stderr, "EILSEQ\n");
break;
case EINVAL:
fprintf(stderr, "EINVAL\n");
break;
}
perror("iconv");
free(locn);
return 0;
}
else
{
free(locn);
return converted_start;
}
}

Using strtok and get "cannot convert 'char*' to 'char**' in assignment"

I am trying to make a program that reads a string with four tab-separated fields from a file in SPIFFS and then splits it into four char arrays to be used in another function. However, I get the error cannot convert 'char*' to 'char**' in assignment. Any idea why? Here's my code:
#include <string.h>
#include "FS.h"
#include "AdafruitIO_WiFi.h"
char *ssid;
char *pass;
char *aiduser;
char *aidkey;
// comment out the following two lines if you are using fona or ethernet
#include "AdafruitIO_WiFi.h"
//AdafruitIO_WiFi io(IO_USERNAME, IO_KEY, WIFI_SSID, WIFI_PASS);
void setupWifi(char* *aiduser, char* *aidkey, char* *ssid, char* *pass){
#define WIFIFILE "/config.txt"
int addr = 0;
bool spiffsActive = false;
if (SPIFFS.begin()) {
spiffsActive = true;
}
File f = SPIFFS.open(WIFIFILE, "r");
String str;
while (f.position()<f.size())
{
str=f.readStringUntil('\n');
str.trim();
}
// Length (with one extra character for the null terminator)
int str_len = str.length() + 1;
// Prepare the character array (the buffer)
char char_array[str_len];
// Copy it over
str.toCharArray(char_array, str_len);
const char s[2] = {9, 0};
/* get the first token */
aiduser = strtok(char_array, s);
aidpass = strtok(NULL, s);
ssid = strtok(NULL, s);
pass = strtok(NULL, s);
/* walk through other tokens
while( token != NULL ) {
printf( " %s\n", token );
token = strtok(NULL, s);
}*/
// RESULT: A thingy
}
void setup(){
setupWifi(&aiduser, &aidkey, &ssid, &pass);
AdafruitIO_WiFi io(aiduser, aidkey, ssid, pass);}
Also, I can't run the setupWifi function unless it is in setup or loop, but I can't make it in another setup because this is #included into another main file.

You get this error because of this:
void setupWifi(char* *aiduser, char* *aidkey, char* *ssid, char* *pass)
{
...
aiduser = strtok(char_array, s);
aidpass = strtok(NULL, s);
ssid = strtok(NULL, s);
pass = strtok(NULL, s);
}
These variables are double pointers (char **), but strtok returns a pointer to char (char *); the two types are not compatible.
Because strtok returns char_array + some_offset and char_array is a local variable in setupWifi, you need to make a copy of each token and return the copy instead. You can do that with strdup.
*aiduser = strdup(strtok(char_array, s));
*aidkey = strdup(strtok(NULL, s));
*ssid = strdup(strtok(NULL, s));
*pass = strdup(strtok(NULL, s));
I encourage you to always check the return value of strdup, because it can return NULL. [1]
If your system does not have strdup, then you can write your own:
char *strdup(const char *text)
{
if(text == NULL)
return NULL;
char *copy = (char *)calloc(strlen(text) + 1, 1); /* the cast is required when compiled as C++ */
if(copy == NULL)
return NULL;
return strcpy(copy, text);
}
One last thing:
void setupWifi(char* *aiduser, char* *aidkey, char* *ssid, char* *pass);
It looks really awkward; I have never seen a double pointer declared this way. This would be much easier to read:
void setupWifi(char **aiduser, char **aidkey, char **ssid, char **pass);
Footnotes
[1] While the syntax is correct, I still consider this bad practice, because you should always check the return values of functions that return pointers. If they return NULL, you cannot access the memory. Checking adds a little more code, but your program will not die of segfaults and it can recover from the errors.
I'd also change your function to return 1 on success, 0 otherwise:
int parse_and_set(char *txt, const char *delim, char **var)
{
if(delim == NULL || var == NULL)
return 0;
char *token = strtok(txt, delim);
if(token == NULL)
return 0;
token = strdup(token);
if(token == NULL)
return 0;
*var = token;
return 1;
}
void init_parse(char ***vars, size_t len)
{
for(size_t i = 0; i < len; ++i)
**(vars + i) = NULL;
}
int cleanup_parse(char ***vars, size_t len, int retval)
{
for(size_t i = 0; i < len; ++i)
{
free(**(vars + i));
**(vars + i) = NULL;
}
return retval;
}
int setupWifi(char **aiduser, char **aidkey, char **ssid, char **pass)
{
if(aiduser == NULL || aidkey == NULL || ssid == NULL || pass == NULL)
return 0;
...
/* get the tokens */
char **vars[] = { aiduser, aidkey, ssid, pass };
size_t len = sizeof vars / sizeof *vars;
init_parse(vars, len);
if(parse_and_set(char_array, s, aiduser) == 0)
return cleanup_parse(vars, len, 0);
if(parse_and_set(NULL, s, aidkey) == 0)
return cleanup_parse(vars, len, 0);
if(parse_and_set(NULL, s, ssid) == 0)
return cleanup_parse(vars, len, 0);
if(parse_and_set(NULL, s, pass) == 0)
return cleanup_parse(vars, len, 0);
...
return 1;
}
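If the C++ standard library is available on the target (an assumption; the original code uses plain char pointers), the copies can also be held in std::string objects instead of strdup'd memory, which removes the manual free step. The sketch below swaps that in; splitFourFields and the fixed count of four fields are illustrative choices, not part of the original answer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: split one tab-separated line into exactly four fields.
// Each token is copied into a std::string, so nothing points into the
// temporary buffer once the function returns.
bool splitFourFields(const std::string& line, std::array<std::string, 4>& out)
{
    // strtok modifies its input, so work on a writable, terminated copy.
    std::vector<char> buf(line.begin(), line.end());
    buf.push_back('\0');

    char* token = std::strtok(buf.data(), "\t");
    for (std::size_t i = 0; i < out.size(); ++i) {
        if (token == NULL)
            return false;              // fewer than four fields
        out[i] = token;                // copies the characters
        token = std::strtok(NULL, "\t");
    }
    return true;
}
```

The caller would then pass out[0].c_str() and so on to any function that expects const char *, for as long as the array stays alive.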

wcscpy_s not affecting wchar_t*

I'm trying to load some strings from a database into a struct, but I keep running into an odd issue. Using my struct datum,
struct datum {
wchar_t* name;
wchar_t* lore;
};
I tried the following code snippet
datum thisDatum;
size_t len = 0;
wchar_t wBuffer[2048];
mbstowcs_s(&len, wBuffer, (const char*)sqlite3_column_text(pStmt, 1), 2048);
if (len) {
thisDatum.name = new wchar_t[len + 1];
wcscpy_s(thisDatum.name, len + 1, wBuffer);
} else thisDatum.name = 0;
mbstowcs_s(&len, wBuffer, (const char*)sqlite3_column_text(pStmt, 2), 2048);
if (len) {
thisDatum.lore = new wchar_t[len + 1];
wcscpy_s(thisDatum.lore, len + 1, wBuffer);
} else thisDatum.name = 0;
However, while thisDatum.name copies correctly, thisDatum.lore is always garbage, with two exceptions. If the project is built as Debug, everything is fine, but shipping a Debug build just isn't an option. I also discovered that rewriting the struct datum as
struct datum {
wchar_t* lore;
wchar_t* name;
};
completely fixes the issue for thisDatum.lore, but gives me garbage for thisDatum.name.

Try something more like this:
struct datum {
wchar_t* name;
wchar_t* lore;
};
wchar_t* widen(const char *str)
{
wchar_t *wBuffer = NULL;
size_t len = strlen(str) + 1;
size_t wlen = 0;
mbstowcs_s(&wlen, NULL, 0, str, len);
if (wlen)
{
wBuffer = new wchar_t[wlen];
mbstowcs_s(NULL, wBuffer, wlen, str, len);
}
return wBuffer;
}
datum thisDatum;
thisDatum.name = widen((const char*)sqlite3_column_text(pStmt, 1));
thisDatum.lore = widen((const char*)sqlite3_column_text(pStmt, 2));
...
delete[] thisDatum.name;
delete[] thisDatum.lore;
That being said, I would use std::wstring instead:
struct datum {
std::wstring name;
std::wstring lore;
};
#include <locale>
#include <codecvt>
std::wstring widen(const char *str)
{
std::wstring_convert< std::codecvt<wchar_t, char, std::mbstate_t> > conv;
return conv.from_bytes(str);
}
datum thisDatum;
thisDatum.name = widen((const char*)sqlite3_column_text(pStmt, 1));
thisDatum.lore = widen((const char*)sqlite3_column_text(pStmt, 2));

application crashes at first strcat_s

I have tried both strcat and strcat_s, but they both crash. Does anyone know why this happens? I can't find the problem.
Crash: "Unhandled exception at 0x58636D2A (msvcr110d.dll)"
_Dst 0x00ea6b30 "C:\\Users\\Ruben\\Documents\\School\\" char *
_SizeInBytes 260 unsigned int
_Src 0x0032ef64 "CKV" const char *
available 228 unsigned int
p 0x00ea6b50 "" char *
Code:
#include <Windows.h>
#include <strsafe.h>
extern "C"
{
char* GetFilesInFolders(LPCWSTR filedir, char* path)
{
char* files = "";
char DefChar = ' ';
char* Streepje = "-";
bool LastPoint = false;
WIN32_FIND_DATA ffd;
TCHAR szDir[MAX_PATH];
HANDLE hFind = INVALID_HANDLE_VALUE;
DWORD dwError = 0;
StringCchCopy(szDir, MAX_PATH, filedir);
hFind = FindFirstFile(szDir, &ffd);
if (INVALID_HANDLE_VALUE == hFind)
return "";
do
{
DWORD attributes = ffd.dwFileAttributes;
LPCWSTR nm = ffd.cFileName;
char name[260];
WideCharToMultiByte(CP_ACP,0,ffd.cFileName,-1, name,260,&DefChar, NULL);
for (int i = 0; i <= 260; i++)
{
if (name[i] == '.')
LastPoint = true;
else if (name[i] == ' ')
break;
}
if (LastPoint == true)
{
LastPoint = false;
continue;
}
if (attributes & FILE_ATTRIBUTE_HIDDEN)
{
continue;
}
else if (attributes & FILE_ATTRIBUTE_DIRECTORY)
{
char* newfiledir = "";
char* newpath = path;
char* add = "\\";
char* extra = "*";
strcat_s(newpath, sizeof(name), name);
strcat_s(newpath, sizeof(add), add);
puts(newpath);
strcpy_s(newfiledir, sizeof(newpath) + 1, newpath);
strcat_s(newfiledir, sizeof(extra) + 1, extra);
puts(newfiledir);
size_t origsize = strlen(newfiledir) + 1;
const size_t newsize = 100;
size_t convertedChars = 0;
wchar_t wcstring[newsize];
mbstowcs_s(&convertedChars, wcstring, origsize, newfiledir, _TRUNCATE);
LPCWSTR dir = wcstring;
GetFilesInFolders(dir, newpath);
}
else
{
char* file = path;
strcat_s(file, sizeof(name), name);
puts(file);
strcat_s(files, sizeof(file), file);
strcat_s(files, sizeof(Streepje), Streepje);
puts(files);
}
}
while (FindNextFile(hFind, &ffd) != 0);
FindClose(hFind);
return files;
}
}
int _tmain(int argc, _TCHAR* argv[])
{
char* path = "C:\\Users\\Ruben\\Documents\\School\\";
char* filedir = "C:\\Users\\Ruben\\Documents\\School\\*";
size_t origsize = strlen(filedir) + 1;
const size_t newsize = 100;
size_t convertedChars = 0;
wchar_t wcstring[newsize];
mbstowcs_s(&convertedChars, wcstring, origsize, filedir, _TRUNCATE);
LPCWSTR dir = wcstring;
char* files = GetFilesInFolders(dir, path);
return 0;
}
Extra info: I don't want to use boost or strings and I want to keep this in unicode (default).

You assign a const char* to files, then attempt to append to it.
char* files = "";
// ...
strcat_s(files, sizeof(file), file);
You cannot modify a constant string literal.
I would recommend that you turn on compiler warnings and make sure to look at them. This would warn you about assigning a const char* to a char*. To fix it, you might have changed files to be const, which would then cause your strcpy_s to no longer compile.

It looks like you don't understand how variables are stored in memory or how pointers work. In your _tmain() you have char * path pointing to a constant string literal, which you pass into GetFilesInFolders(), where it gets modified. Compilers tend to allow char *s to point at constant strings for backward compatibility with old C programs. You cannot modify these. You cannot append to them. The compiler (generally) puts these in a read-only segment. That's one reason why you're getting an exception.
Your whole GetFilesInFolders() is wrong. And as DarkFalcon pointed out, you haven't allocated any space anywhere for files, you have it pointing to a constant string literal.
Get "The C++ Programming Language" and read chapter 5.
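To illustrate the point both answers make without using std::string: the destination has to be a writable array with known capacity, and the size passed along has to be that capacity. In the question's code, sizeof applied to a char * yields the size of the pointer, not of the buffer it points to. A minimal sketch with a hypothetical appendTo helper, written with portable calls rather than strcat_s:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: append src to the string already in dst, where
// dstSize is the total capacity of dst in bytes. Returns false and leaves
// dst untouched if the result would not fit.
bool appendTo(char* dst, std::size_t dstSize, const char* src)
{
    const std::size_t used = std::strlen(dst);
    const std::size_t extra = std::strlen(src);
    if (used + extra + 1 > dstSize)
        return false;
    std::memcpy(dst + used, src, extra + 1); // +1 also copies the terminator
    return true;
}
```

A caller would declare something like char path[MAX_PATH] = "..." so that the literal is copied into writable storage, and pass sizeof(path) while path is still an array in scope.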
