unsigned char array to wide char array/string - c++

How can/should I cast from an unsigned char array to a wide char array (wchar_t) or std::wstring? And how can I convert it back to an unsigned char array?
Or can OpenSSL produce a widechar hash from SHA256_Update?

Try the following:
#include <cstdlib>
using namespace std;
unsigned char* temp; // pointer to initial data
// memory allocation and filling
// calculation of string length
wchar_t* wData = new wchar_t[len+1];
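mbstowcs(&wData[0], reinterpret_cast<const char*>(temp), len);
wData[len] = L'\0'; // mbstowcs does not write a terminator when it fills all len characters
For the inverse conversion, use wcstombs, which mirrors mbstowcs: the narrow destination buffer comes first and the wide source string second.
The WideCharToMultiByte function can also be useful for Windows development, and setting the locale should be considered as well, because both mbstowcs and wcstombs interpret the bytes according to the current C locale.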
UPDATE:
To calculate the length of the string pointed to by unsigned char* temp, the following approach can be used:
const char* ccp = reinterpret_cast<const char*>(temp);
size_t len = mbstowcs(nullptr, &ccp[0], 0);
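Putting both directions together, here is a minimal sketch of the round trip. The helper names bytes_to_wide and wide_to_bytes are made up for illustration, and the code assumes the input is a null-terminated multibyte string that is valid in the current C locale. It also relies on mbstowcs/wcstombs accepting a null destination to report the required length, which POSIX and the common C libraries support.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: decode a null-terminated multibyte buffer into a wide
// string using the current C locale. Returns an empty string on invalid input.
std::wstring bytes_to_wide(const unsigned char* bytes) {
    const char* src = reinterpret_cast<const char*>(bytes);
    const std::size_t len = std::mbstowcs(nullptr, src, 0); // length query only
    if (len == static_cast<std::size_t>(-1)) return std::wstring();
    std::vector<wchar_t> buf(len + 1);                      // +1 for the terminator
    std::mbstowcs(buf.data(), src, buf.size());
    return std::wstring(buf.data(), len);
}

// Hypothetical helper: the inverse direction, using wcstombs.
// The returned bytes do not include a terminating zero.
std::vector<unsigned char> wide_to_bytes(const std::wstring& wide) {
    const std::size_t len = std::wcstombs(nullptr, wide.c_str(), 0);
    if (len == static_cast<std::size_t>(-1)) return std::vector<unsigned char>();
    std::vector<char> buf(len + 1);
    std::wcstombs(buf.data(), wide.c_str(), buf.size());
    return std::vector<unsigned char>(buf.begin(), buf.begin() + len);
}
```

For anything beyond plain ASCII, call std::setlocale first so that the byte encoding you expect (for example UTF-8) is actually the one in effect.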

std::setlocale(LC_ALL, "en_US.utf8");
const char* mbstr = "hello";
std::mbstate_t state = std::mbstate_t();
// calc length
int len = 1 + std::mbsrtowcs(nullptr, &mbstr, 0, &state);
std::vector<wchar_t> wstr(len);
std::mbsrtowcs(&wstr[0], &mbstr, wstr.size(), &state);

How can/should I cast from an unsigned char array to a wide char array (wchar_t) or std::wstring? And how can I convert it back to an unsigned char array?
They are completely different, so you should not be doing it under most circumstances. If you provide a specific question with real code, then we can probably tell you more.
Or can OpenSSL produce a widechar hash from SHA256_Update?
No, OpenSSL cannot do this. It produces hashes which are binary strings composed of bytes, not chars. You are responsible for presentation details, like narrow/wide character sets or base64 encoding.
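As an illustration of handling the presentation yourself, here is a small sketch that renders raw digest bytes as a wide hexadecimal string and parses them back. The function names to_wide_hex and from_wide_hex are invented for this example and nothing here calls OpenSSL; the digest is assumed to be whatever byte buffer your hashing code filled in.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: render raw bytes as lowercase hex in a std::wstring.
std::wstring to_wide_hex(const unsigned char* digest, std::size_t size) {
    static const wchar_t digits[] = L"0123456789abcdef";
    std::wstring out;
    out.reserve(size * 2);
    for (std::size_t i = 0; i < size; ++i) {
        out.push_back(digits[digest[i] >> 4]);   // high nibble
        out.push_back(digits[digest[i] & 0x0F]); // low nibble
    }
    return out;
}

// Hypothetical helper: parse the hex text back into bytes.
// Returns an empty vector if the text is malformed.
std::vector<unsigned char> from_wide_hex(const std::wstring& hex) {
    std::vector<unsigned char> out;
    if (hex.size() % 2 != 0) return out;
    auto value = [](wchar_t c) -> int {
        if (c >= L'0' && c <= L'9') return c - L'0';
        if (c >= L'a' && c <= L'f') return c - L'a' + 10;
        if (c >= L'A' && c <= L'F') return c - L'A' + 10;
        return -1;
    };
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        const int hi = value(hex[i]);
        const int lo = value(hex[i + 1]);
        if (hi < 0 || lo < 0) return std::vector<unsigned char>();
        out.push_back(static_cast<unsigned char>(hi * 16 + lo));
    }
    return out;
}
```

Hex text is pure ASCII, so it survives any narrow/wide conversion unchanged, which is usually what you want for displaying or storing a hash.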

Related

Take wchar_t and put into char?

I've tried a few things and haven't yet been able to figure out how to get const wchar_t *text (shown below) to pass into the variable StoreText (shown below). What am I doing wrong?
void KeyboardComplete(int localClientNum, const wchar_t *text, unsigned int len)
{
char* StoreText = text; //This is where error occurs
}
You cannot directly assign a wchar_t* to a char*, as they are different and incompatible data types.
If StoreText needs to point at the same memory address that text is pointing at, such as if you are planning on looping through the individual bytes of the text data, then a simple type-cast will suffice:
const char* StoreText = reinterpret_cast<const char*>(text); // keeps the const qualifier of text
However, if StoreText is expected to point to its own separate copy of the character data, then you would need to convert the wide character data into narrow character data instead. Such as by:
using the WideCharToMultiByte() function on Windows:
void KeyboardComplete(int localClientNum, const wchar_t *text, unsigned int len)
{
int StoreTextLen = 1 + WideCharToMultiByte(CP_ACP, 0, text, len, NULL, 0, NULL, NULL);
std::vector<char> StoreTextBuffer(StoreTextLen);
WideCharToMultiByte(CP_ACP, 0, text, len, &StoreTextBuffer[0], StoreTextLen, NULL, NULL);
char* StoreText = &StoreTextBuffer[0];
//...
}
using the std::wcsrtombs() function:
#include <cwchar>
void KeyboardComplete(int localClientNum, const wchar_t *text, unsigned int len)
{
std::mbstate_t state = std::mbstate_t();
int StoreTextLen = 1 + std::wcsrtombs(NULL, &text, 0, &state);
std::vector<char> StoreTextBuffer(StoreTextLen);
std::wcsrtombs(&StoreTextBuffer[0], &text, StoreTextLen, &state);
char *StoreText = &StoreTextBuffer[0];
//...
}
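using the std::wstring_convert class (C++11 and later, deprecated since C++17):
#include <codecvt>
#include <locale>
#include <string>
void KeyboardComplete(int localClientNum, const wchar_t *text, unsigned int len)
{
// std::codecvt has a protected destructor, so it cannot be passed to wstring_convert directly;
// std::codecvt_utf8 is used here instead, which means the output is UTF-8
std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
std::string StoreTextBuffer = conv.to_bytes(text, text+len);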
char *StoreText = &StoreTextBuffer[0];
//...
}
using similar conversions from the ICONV or ICU library.
First of all, for strings you should use std::wstring/std::string instead of raw pointers.
The C++11 Locale (http://en.cppreference.com/w/cpp/locale) library can be used to convert wide string to narrow string.
I wrote a wrapper function below and have used it for years. Hope it will be helpful to you, too.
#include <string>
#include <locale>
#include <codecvt>
std::string WstringToString(const std::wstring & wstr, const std::locale & loc /*= std::locale()*/)
{
std::string buf(wstr.size(), 0);
std::use_facet<std::ctype<wchar_t>>(loc).narrow(wstr.c_str(), wstr.c_str() + wstr.size(), '?', &buf[0]);
return buf;
}
wchar_t is a wide character. It is typically 16 or 32 bits per character, but this is system dependent.
char is a good ol' CHAR_BIT-sized data type. Again, how big it is is system dependent. Most likely it's going to be one byte, but I can't think of a reason why CHAR_BIT can't be 16 or 32 bits, making it the same size as wchar_t.
If they are different sizes, a direct assignment is doomed. For example an 8 bit char will see 2 characters, and quite likely 2 completely unrelated characters, for every 1 character in a 16 bit wchar_t. This would be bad.
Second, even if they are the same size, they may have different encodings. For example, the numeric value assigned to the letter 'A' may be different for the char and the wchar_t. It could be 65 in char and 16640 in wchar_t.
To make any sense as the other data type, char and wchar_t data must be translated into the other's encoding. std::wstring_convert will often perform this translation for you, but look into the locale library for more complicated translations. Both require a compiler supporting C++11 or better. In previous C++ Standards, a small army of functions provided conversion support. Third-party libraries such as Boost.Locale are helpful to unify and provide wider support.
Conversion functions are supplied by the operating system to translate between the encoding used by the OS and other common encodings.
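To see why reading a wchar_t through a char pointer gives you unrelated "characters", it can help to look at the raw bytes. The helper raw_bytes_of below is a made-up name for this sketch; it simply copies the object representation of one wchar_t into a byte vector.

```cpp
#include <cstring>
#include <vector>

// Hypothetical helper: return the object representation of a single wchar_t.
// The vector has sizeof(wchar_t) elements; their order depends on the
// platform's byte order.
std::vector<unsigned char> raw_bytes_of(wchar_t c) {
    std::vector<unsigned char> bytes(sizeof(wchar_t));
    std::memcpy(bytes.data(), &c, sizeof(wchar_t));
    return bytes;
}
```

On a typical platform with 8-bit bytes, raw_bytes_of(L'A') contains one byte equal to 65 and one or three zero bytes. A char-based function looking at that memory sees an 'A' next to string terminators, not a string of wide characters.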
You have to do a cast, you can do this:
char* StoreText = (char*)text;
I think this may work.
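But you can also use the wcstombs function from <cstdlib>:
char someText[12];
wcstombs(someText, text, 12);
The last parameter must be the number of bytes available in the destination array. Note that wcstombs does not write a terminating null if the converted text fills all 12 bytes.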

Convert char* to uint8_t

I transfer messages through a CAN protocol.
To do so, the CAN message needs data of uint8_t type, so I need to convert my char* to uint8_t. From my research on this site, I produced this code:
char* bufferSlidePressure = ui->canDataModifiableTableWidget->item(6,3)->text().toUtf8().data();//My char*
/* Conversion */
uint8_t slidePressure [8];
sscanf(bufferSlidePressure,"%c",
&slidePressure[0]);
As you can see, my char* must fit in slidePressure[0].
My problem is that even though I get no error during compilation, the data in slidePressure is totally incorrect. Indeed, when I test it with a char* = 0, I get unknown characters ... So I think the problem must come from the conversion.
My data can be Bool, Uchar, Ushort and float.
Thanks for your help.
Is your string an integer? E.g. char* bufferSlidePressure = "123";?
If so, I would simply do:
uint8_t slidePressure = (uint8_t)atoi(bufferSlidePressure);
Or, if you need to put it in an array:
slidePressure[0] = (uint8_t)atoi(bufferSlidePressure);
Edit: Following your comment, if your data could be anything, I guess you would have to copy it into the buffer of the new data type. E.g. something like:
/* in case you'd expect a float*/
float slidePressure;
memcpy(&slidePressure, bufferSlidePressure, sizeof(float));
/* in case you'd expect a bool*/
bool isSlidePressure;
memcpy(&isSlidePressure, bufferSlidePressure, sizeof(bool));
/*same thing for uint8_t, etc */
/* in case you'd expect char buffer, just a byte to byte copy */
char * slidePressure = new char[ size ]; // or a stack buffer
memcpy(slidePressure, (const char*)bufferSlidePressure, size ); // no sizeof, since sizeof(char)=1
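Going the other way, from a typed value into the 8 data bytes of a CAN frame, the same memcpy idea applies. This is only a sketch: CanPayload, pack_value and unpack_value are names invented here, the payload is assumed to be 8 bytes, and the bytes are written in the host's byte order, so both ends of the bus must agree on that.

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Assumed layout for this sketch: the 8 data bytes of a classic CAN frame.
using CanPayload = std::array<std::uint8_t, 8>;

// Hypothetical helper: copy the bytes of a value into the start of a payload.
// Unused bytes stay zero. Byte order is whatever the host uses.
template <typename T>
CanPayload pack_value(const T& value) {
    static_assert(sizeof(T) <= 8, "value does not fit into a CAN payload");
    static_assert(std::is_trivially_copyable<T>::value, "value must be memcpy-safe");
    CanPayload payload{};
    std::memcpy(payload.data(), &value, sizeof(T));
    return payload;
}

// Hypothetical helper: read a value of type T back from the start of a payload.
template <typename T>
T unpack_value(const CanPayload& payload) {
    static_assert(sizeof(T) <= 8, "value does not fit into a CAN payload");
    static_assert(std::is_trivially_copyable<T>::value, "value must be memcpy-safe");
    T value;
    std::memcpy(&value, payload.data(), sizeof(T));
    return value;
}
```

For example, pack_value(1.5f) fills the first four bytes with the float's representation, and unpack_value<float> on the receiving side recovers it, provided both sides use the same float format and byte order.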
uint8_t is 8 bits of memory, and can store values from 0 to 255
char is probably 8 bits of memory
char * is probably 32 or 64 bits of memory containing the address of a different place in memory in which there is a char
First, make sure you don't try to put the memory address (the char *) into the uint8 - put what it points to in:
char from;
char * pfrom = &from;
uint8_t to;
to = *pfrom;
Then work out what you are really trying to do ... because this isn't quite making sense. For example, a float is probably 32 or 64 bits of memory. If you think there is a float somewhere in your char * data you have a lot of explaining to do before we can help :/
char * is a pointer, not a single character. It is possible that it points to the character you want.
uint8_t is unsigned but on most systems will be the same size as a char and you can simply cast the value.
You may need to manage the memory and lifetime of what your function returns. This could be done with vector< unsigned char> as the return type of your function rather than char *, especially if toUtf8() has to create the memory for the data.
Your question is totally ambiguous.
ui->canDataModifiableTableWidget->item(6,3)->text().toUtf8().data();
That is a lot of cascading calls. We have no idea what any of them do and whether they are yours or not. It looks dangerous.
A safer example, done the C++ way (requires <string> and <sstream>):
const char* bufferSlidePressure = "123";
std::string buffer(bufferSlidePressure);
std::stringstream stream;
stream << buffer;
int n = 0;
// convert to int
if (!(stream >> n)){
//could not convert
}
Also, if Boost is available:
int n = boost::lexical_cast<int>(buffer);
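Since the target here is a uint8_t, a range check is worth adding after the text-to-integer step. Below is a minimal sketch using std::strtol; parse_u8 is a made-up name, and it assumes the text is a plain decimal number with nothing after it.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: parse decimal text into a uint8_t.
// Returns false (and leaves out untouched) if the text is empty, has
// trailing characters, or is outside the range 0..255.
bool parse_u8(const char* text, std::uint8_t& out) {
    if (text == nullptr || *text == '\0') return false;
    errno = 0;
    char* end = nullptr;
    const long value = std::strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0') return false;
    if (value < 0 || value > 255) return false;
    out = static_cast<std::uint8_t>(value);
    return true;
}
```

With this, parse_u8("123", v) stores 123 in v, while "256" or "12x" is rejected instead of silently wrapping around.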

How to convert wstring into byte vector

Hi I have a few typedefs:
typedef unsigned char Byte;
typedef std::vector<Byte> ByteVector;
typedef std::wstring String;
I need to convert String into ByteVector, I have tried this:
String str = L"123";
ByteVector vect(str.begin(), str.end());
As a result the vector contains 3 elements: 1, 2, 3. However, it is a wstring, so every character in this string is wide, and my expected result would be: 0, 1, 0, 2, 0, 3.
Is there any standard way to do that, or do I need to write some custom function?
Byte const* p = reinterpret_cast<Byte const*>(&str[0]);
std::size_t size = str.size() * sizeof(str.front());
ByteVector vect(p, p+size);
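Note that the reinterpret_cast approach yields sizeof(wchar_t) bytes per character in the machine's byte order, which on a little-endian system with a 4-byte wchar_t is not the 0, 1, 0, 2, 0, 3 layout from the question. If you want exactly two bytes per character with the high byte first, you can build it explicitly. The sketch below uses a made-up function name, to_big_endian_16, and assumes every character value fits in 16 bits; anything above 0xFFFF would be truncated.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: emit two bytes per wide character, high byte first.
// Assumption: each character value fits in 16 bits.
std::vector<unsigned char> to_big_endian_16(const std::wstring& str) {
    std::vector<unsigned char> out;
    out.reserve(str.size() * 2);
    for (wchar_t c : str) {
        const unsigned long v = static_cast<unsigned long>(c);
        out.push_back(static_cast<unsigned char>((v >> 8) & 0xFF)); // high byte
        out.push_back(static_cast<unsigned char>(v & 0xFF));        // low byte
    }
    return out;
}
```

This gives the same result on every platform, independent of sizeof(wchar_t) and endianness.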
What is your actual goal? If you just want to get the bytes representing the wchar_t objects, a fairly trivial conversion would do the trick, although I wouldn't use just a cast to unsigned char const* but rather an explicit conversion.
On the other hand, if you actually want to convert the std::wstring into a sequence encoded using e.g. UTF8 or UTF16 as is usually the case when dealing with characters, the conversion used for the encoding becomes significantly more complex. Probably the easiest approach to convert to an encoding is to use C's wcstombs():
std::vector<char> target(source.size() * 4);
size_t n = wcstombs(&target[0], &source[0], target.size());
The above fragment assumes that source isn't empty and that the sequence starting at &source[0] is terminated by wchar_t(). The conversion uses C's global locale and converts to whatever character encoding is set up there. Some platforms also provide a non-standard wcstombs_l() where you can specify the locale.
C++ has similar functionality but it is a bit harder to use in the std::codecvt<...> facet. I can provide an example if necessary.
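For reference, here is roughly what the std::codecvt route looks like. Treat it as a sketch under assumptions rather than a finished implementation: narrow_with_codecvt is an invented name, the target encoding is whatever the supplied std::locale defines, and any result other than ok is reported as an error.

```cpp
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: convert a wide string to the narrow encoding of loc
// using the std::codecvt facet.
std::string narrow_with_codecvt(const std::wstring& source,
                                const std::locale& loc = std::locale()) {
    using Facet = std::codecvt<wchar_t, char, std::mbstate_t>;
    const Facet& facet = std::use_facet<Facet>(loc);
    if (source.empty()) return std::string();

    // Worst case: every wide character needs max_length() bytes.
    std::string target(source.size() * facet.max_length(), '\0');
    std::mbstate_t state = std::mbstate_t();
    const wchar_t* from_next = nullptr;
    char* to_next = nullptr;

    const std::codecvt_base::result res = facet.out(
        state,
        source.data(), source.data() + source.size(), from_next,
        &target[0], &target[0] + target.size(), to_next);
    if (res != std::codecvt_base::ok) {
        throw std::runtime_error("wide-to-narrow conversion failed");
    }
    target.resize(static_cast<std::size_t>(to_next - &target[0]));
    return target;
}
```

Unlike wcstombs(), this does not depend on the global C locale: you pass the locale in, so different calls can target different encodings.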

convert std::wstring to const *char in c++

How can I convert std::wstring to const *char in C++?
You can convert a std::wstring to a const wchar_t * using the c_str member function :
std::wstring wStr;
const wchar_t *str = wStr.c_str();
However, a conversion to a const char * isn't natural : it requires an additional call to std::wcstombs, like for example:
#include <cstdlib>
#include <cwchar> // wcslen
// ...
std::wstring wStr;
const wchar_t *input = wStr.c_str();
// Count required buffer size (plus one for null-terminator).
size_t size = (wcslen(input) + 1) * sizeof(wchar_t);
char *buffer = new char[size];
#ifdef __STDC_LIB_EXT1__
// wcstombs_s is only guaranteed to be available if __STDC_LIB_EXT1__ is defined.
// It is a C Annex K function, so it is not in namespace std.
size_t convertedSize;
wcstombs_s(&convertedSize, buffer, size, input, size);
#else
std::wcstombs(buffer, input, size);
#endif
/* Use the string stored in "buffer" variable */
// Free allocated memory:
delete[] buffer; // buffer was allocated with new[]
You cannot do this just like that. std::wstring represents a string of wide (Unicode) characters, while char* in this case is a string of ASCII characters. There has to be a code page conversion from Unicode to ASCII.
To make the conversion you can use standard library functions such as wcstombs, or Windows' WideCharToMultiByte function.
Updated to incorporate information from comments, thanks for pointing that out.

What is the easiest way to convert a char array to a WCHAR array?

In my code, I receive a const char array like the following:
const char * myString = someFunction();
Now I want to postprocess it as a wchar array since the functions I use afterwards don't handle narrow strings.
What is the easiest way to accomplish this goal?
Perhaps MultiByteToWideChar? (However, since what I get as input is a narrow string, it doesn't contain multibyte characters, so this is probably not the most beautiful solution.)
const char * myString = someFunction();
const size_t len = strlen(myString);
std::vector<wchar_t> myWString (len + 1); // the extra element stays L'\0' and terminates the string
std::copy(myString, myString + len, myWString.begin()); // only correct for 7-bit ASCII input
const wchar_t * result = &myWString[0];
MultiByteToWideChar will work unless you are using extended characters in your narrow string. If it's a plain alphanumeric string then it should work fine.
You can also look at mbstowcs, which is a little less convoluted but doesn't offer the same amount of control.
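If the input really is guaranteed to be plain ASCII, a checked element-by-element widening is about as simple as it gets. The sketch below uses a made-up function name, widen_ascii, and rejects any byte above 0x7F rather than guessing at its encoding. It also assumes the platform's wide character set is ASCII-compatible, which holds for the usual Unicode-based wchar_t.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper: widen a null-terminated, ASCII-only string.
// Throws std::invalid_argument if a byte outside 0x00..0x7F is found.
std::wstring widen_ascii(const char* narrow) {
    const std::size_t len = std::strlen(narrow);
    std::wstring wide;
    wide.reserve(len);
    for (std::size_t i = 0; i < len; ++i) {
        const unsigned char byte = static_cast<unsigned char>(narrow[i]);
        if (byte > 0x7F) {
            throw std::invalid_argument("non-ASCII byte in input");
        }
        wide.push_back(static_cast<wchar_t>(byte));
    }
    return wide;
}
```

The returned std::wstring owns its storage, and c_str() gives you a null-terminated const wchar_t* to pass to the wide-string functions.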