C++: How to convert wstring with md5 hash to byte* array?

std::wstring hashStr(L"4727b105cf792b2d8ad20424ed83658c");
//....
byte digest[16];
How can I get my md5 hash in digest?
My answer is:
wchar_t *EndPtr;
for (int i = 0; i < 16; i++) {
    std::wstring bt = hashStr.substr(i * 2, 2);
    digest[i] = static_cast<BYTE>(wcstoul(bt.c_str(), &EndPtr, 16));
}

You need to read two characters from hashStr, convert them from hex to a binary value, and put that value into the next spot in digest -- something on this order:
for (int i = 0; i < 16; i++) {
    std::wstring byte = hashStr.substr(i * 2, 2);
    digest[i] = hextobin(byte);
}
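hextobin is left undefined above; a minimal sketch of it (an illustrative helper, assuming the two characters are valid hex digits of either case) could be:

unsigned char hextobin(const std::wstring &hex)
{
    auto nibble = [](wchar_t c) -> unsigned char {
        if (c >= L'0' && c <= L'9') return (unsigned char)(c - L'0');
        if (c >= L'a' && c <= L'f') return (unsigned char)(c - L'a' + 10);
        return (unsigned char)(c - L'A' + 10); // assumes valid input
    };
    return (unsigned char)((nibble(hex[0]) << 4) | nibble(hex[1]));
}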

The C way (I didn't test it, but it should work, and you'll get the method anyway).
memset(digest, 0, sizeof(digest));
for (int i = 0; i < 32; i++)
{
    wchar_t numwc = hashStr[i];
    BYTE numbt;
    if (numwc >= L'0' && numwc <= L'9') // I assume the string contains only valid hex digits
    {
        numbt = (BYTE)(numwc - L'0');
    }
    else if (numwc >= L'a' && numwc <= L'f') // the example hash is lowercase
    {
        numbt = 0xA + (BYTE)(numwc - L'a');
    }
    else // uppercase A-F
    {
        numbt = 0xA + (BYTE)(numwc - L'A');
    }
    digest[i/2] += numbt << (4 * ((i + 1) % 2)); // high nibble for even i, low nibble for odd i
}
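To sanity-check either version, you can print the digest back as hex and compare it with the original string. A quick test fragment (not part of the answers above; it assumes the conversion loop has already run and needs <cstdio>):

for (int i = 0; i < 16; i++)
    printf("%02x", digest[i]); // should reproduce the original 32-char hash
printf("\n");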

Why is strtoul returning 0 from a "1" string?

I'm trying to compact a raster file in a way that is easy to read without the GDAL library (my web server cannot install GDAL). Following this question, I'm doing the following to convert a raster's bytes (only 0 and 1 values) to bits:
#include <cstdio>
#include <cstdlib>
#include <cstdint>
#include "gdal_priv.h"
#include "cpl_conv.h"

int main(int argc, char *argv[]) {
    if (argc < 3) {
        return 1;
    }
    GDALDataset *poDataset;
    GDALAllRegister();
    poDataset = (GDALDataset*)GDALOpen(argv[1], GA_ReadOnly);
    if (poDataset == NULL) {
        return 2;
    }
    int tx = poDataset->GetRasterXSize(), ty = poDataset->GetRasterYSize();
    GDALRasterBand *poBand;
    int nBlockXSize, nBlockYSize;
    poBand = poDataset->GetRasterBand(1);
    printf("Type: %s\n", GDALGetDataTypeName(poBand->GetRasterDataType()));
    // Type: Byte
    poBand->GetBlockSize(&nBlockXSize, &nBlockYSize);
    int i, x, y, ch, nX = tx/nBlockXSize, nY = ty/nBlockYSize;
    char *data = (char*)CPLMalloc(nBlockXSize*nBlockYSize + 1);
    uint32_t out[nBlockXSize*nBlockYSize/32]; // VLA: a GCC extension in C++
    char temp;
    CPLErr erro;
    FILE *pFile = fopen(argv[2], "wb");
    for (y=0; y<nY; y++) {
        for (x=0; x<nX; x++) {
            erro = poBand->ReadBlock(x, y, data);
            if (erro > 0) {
                return 3;
            }
            for (i=0; i<nBlockXSize*nBlockYSize; i+=32) {
                temp = data[i+32];
                data[i+32] = 0;
                out[i/32] = strtoul(&data[i], 0, 2);
                if (data[i] != 0) {
                    printf("%u/%u ",data[i],out[i/32]);
                }
                data[i+32] = temp;
            }
            ch = getchar(); // for debugging
        }
        fwrite(out, 4, nBlockXSize*nBlockYSize/32, pFile);
    }
    fclose(pFile);
    CPLFree(data);
    return 0;
}
After the first set of bytes is read (for (i=0; i<nBlockXSize*nBlockYSize; i+=32)), I can see that printf("%u/%u ",data[i],out[i/32]); is printing some "1/0", meaning that, where my raster has a 1 value, this is being passed to strtoul, which is returning 0. Obviously I'm messing with something (pointers, probably), but can't find where. What am I doing wrong?
strtoul is for converting printable character data to an integer. The string should contain character codes for digits, e.g. '0', '1', etc.
Apparently in your case the source data is actually the integer value 1, so strtoul finds no characters of the expected form and returns 0.
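A minimal sketch of the inner loop rewritten to pack the raw 0/1 byte values directly with bit shifts (reusing the question's data and out variables) instead of going through strtoul:

for (i = 0; i < nBlockXSize*nBlockYSize; i += 32) {
    uint32_t word = 0;
    for (int b = 0; b < 32; b++)
        word = (word << 1) | (data[i + b] & 1); // raw value 0 or 1, not '0'/'1'
    out[i/32] = word;
}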

Hexadecimal QString representation to Unsigned char array

A QString with some user input contains a MAC address, for instance "68F542F9AB22". I need to convert the QString to unsigned char array[6] of numbers, not the ASCII representation. So for the QString 68F542F9AB22 the unsigned char array at first position should be 104.
You can iterate over your QString, cut it into pieces of two characters, and then use QString::toUShort(&ok, 16), which will give you a ushort from your hex string.
Something like:
for (int i = 0; i < 6; ++i)
{
    QString hexString = yourstring.mid(i*2, 2);
    bool ok = false;
    yourBuf[i] = (unsigned char) hexString.toUShort(&ok, 16);
    // if not ok, handle error
}
You should do some checks for the correct length of your input string and some error handling on conversion errors.
Hope this helps.
I would do it in the following way:
QString s("68F542F9AB22");
assert(s.size() % 2 == 0);
std::vector<unsigned char> array;
for (int i = 0; i < s.size(); i += 2)
{
QString num = s.mid(i, 2);
bool ok = false;
array.push_back(num.toUInt(&ok, 16));
assert(ok);
}
Hex to long long is a single operation and will convert up to 64 bits. That's sufficient for a 48-bit MAC. So:
bool ok = false;
auto result = input.toULongLong(&ok, 16);
for (int i = 5; i >= 0; --i)
{
    buf[i] = result & 0xFF;
    result >>= 8;
}
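A short usage sketch with the validation the other answers mention (the variable names are illustrative):

QString input = "68F542F9AB22";
unsigned char buf[6];
bool ok = false;
auto result = input.toULongLong(&ok, 16);
if (!ok || input.size() != 12) {
    // not a valid 12-digit hex MAC; handle the error
} else {
    for (int i = 5; i >= 0; --i) {
        buf[i] = result & 0xFF;
        result >>= 8;
    }
}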
You can try:
QString s = "68F542F9AB22";
unsigned char array[6];

if (s.length() % 2 == 0) { // test that the string length is even
    for (int i = 0; i < s.length(); i += 2) {
        QString chunk = s.mid(i, 2);
        bool ok;
        array[i/2] = static_cast<unsigned char>(chunk.toInt(&ok, 16));
    }
}
Note that the above code works only for even-length strings (but for a MAC address, that's fine).
And to test the result you may use:
for (int i = 0; i < 6; i++) {
    std::cout << "Array[" << i << "] = " << static_cast<unsigned long>(array[i]) << std::endl;
}

How do I convert xor encryption from using std::string to just char or char *?

So essentially, with the libraries I'm working with, I cannot use std::string, as they use a somewhat dated version of C++. I need to convert this xor function from using std::string to using just char or char *. I have been trying, but I cannot figure out what I am doing wrong, as I get an error. Here is the code:
string encryptDecrypt(string toEncrypt) {
    char key[] = "DSIHKGDSHIGOK$%#%45434etG34th8349ty"; // Any chars will work
    string output = toEncrypt;

    for (int i = 0; i < toEncrypt.size(); i++)
        output[i] = toEncrypt[i] ^ key[i % (sizeof(key) / sizeof(char))];

    return output;
}
If anyone could help me out, that would be great. I am unsure as to why I cannot do it by simply changing the strings to char *.
Edit:
What I have tried is:
char * encryptDecrypt(char * toEncrypt) {
    char key[] = "DSIHKGDSHIGOK$%#%45434etG34th8349ty"; // Any chars will work
    char * output = toEncrypt;

    for (int i = 0; i < sizeof(toEncrypt); i++)
        output[i] = toEncrypt[i] ^ key[i % (sizeof(key) / sizeof(char))];

    return output;
}
Please note I am not trying to convert an std::string to char, I simply cannot use std::string in any instance of this function. Therefore, my question is not answered. Please read my question more carefully before marking it answered...
The issue here is
char * output = toEncrypt;
This is making output point at toEncrypt, which is not what you want. What you need to do is allocate a new buffer and then copy the contents of toEncrypt into output:
#include <cstring> // for std::strlen and std::strcpy

char * encryptDecrypt(char * toEncrypt) {
    char key[] = "DSIHKGDSHIGOK$%#%45434etG34th8349ty"; // Any chars will work
    int string_size = std::strlen(toEncrypt);
    char * output = new char[string_size + 1]; // add one for the null byte
    std::strcpy(output, toEncrypt); // copy toEncrypt into output

    for (int i = 0; i < string_size; i++)
        output[i] = toEncrypt[i] ^ key[i % (sizeof(key) / sizeof(char))];

    return output;
}
Since we are using dynamic memory allocation here we need to make sure that the caller deletes the memory when done otherwise it will be a memory leak.
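For example (illustrative usage of the function above):

char message[] = "attack at dawn";
char *encrypted = encryptDecrypt(message);
// ... use encrypted ...
delete[] encrypted; // the caller owns the returned buffer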
sizeof() is a compile-time operator that evaluates the size of the type of its argument. When you do sizeof(toEncrypt), you're really just doing sizeof(char*) -- not the length of the string, which is what you want. You'll need to somehow indicate how long the toEncrypt string is. Here are two possible solutions:
1. Add an integer argument to encryptDecrypt specifying the length of toEncrypt in characters.
2. If you know that toEncrypt will never contain the null byte as a valid character for encryption / decryption (not sure of your application) and can assume that toEncrypt is null-terminated, you could use the strlen function to determine string length at runtime.
I'd recommend option 1, as strlen can introduce security holes if you're not careful, and also because it allows the use of null bytes within your string arguments.
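A sketch of option 1 (the signature and the sizeof(key) - 1 choice are mine, not the original poster's; the - 1 skips the key's trailing '\0', and either convention works as long as encryption and decryption agree):

// Option 1: take the length explicitly, so embedded null bytes are fine.
char *encryptDecrypt(const char *toEncrypt, int length)
{
    char key[] = "DSIHKGDSHIGOK$%#%45434etG34th8349ty";
    char *output = new char[length]; // caller must delete[] and keep the length

    for (int i = 0; i < length; i++)
        output[i] = toEncrypt[i] ^ key[i % (sizeof(key) - 1)];

    return output;
}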
What error are you getting? You can easily use a char* to do the same thing, I've included a sample program that verifies the functionality. This was built under VS2012.
#include <string>
#include <stdio.h>
std::string encryptDecrypt( std::string toEncrypt)
{
char key[] = "DSIHKGDSHIGOK$%#%45434etG34th8349ty"; //Any chars will work
std::string output = toEncrypt;
for (int i = 0; i < toEncrypt.size(); i++)
output[i] = toEncrypt[i] ^ key[i % (sizeof(key) / sizeof(char))];
return output;
}
void encryptDecrypt( char* toEncrypt )
{
char key[] = "DSIHKGDSHIGOK$%#%45434etG34th8349ty"; //Any chars will work
int len = strlen( toEncrypt );
for (int i = 0; i < len; i++)
toEncrypt[i] = toEncrypt[i] ^ key[i % (sizeof(key) / sizeof(char))];
}
int main( int argc, char* argv[] )
{
const char* sample = "This is a sample string to process";
int len = strlen( sample );
char* p = new char[ len + 1 ];
p[len] = '\0';
strcpy( p, sample );
std::string output = encryptDecrypt( sample );
encryptDecrypt( p );
bool match = strcmp(output.c_str(), p) == 0;
printf( "The two encryption functions %smatch.\n", match ? "" : "do not " );
return 0;
}
Why not, instead of string output = toEncrypt, do this:
char *output = new char[std::strlen(toEncrypt) + 1];
std::strcpy(output, toEncrypt);

Base 64 Encoding Losing data

This is my fourth attempt at doing base64 encoding. My first tries worked, but they weren't standard. They were also extremely slow!!! I used vectors, push_back, and erase a lot.
So I decided to re-write it, and this is much, much faster! Except that it loses data. -__-
I need as much speed as I can possibly get because I'm compressing a pixel buffer and base64 encoding the compressed string. I'm using ZLib. The images are 1366 x 768, so yeah.
I do not want to copy any code I find online because... Well, I like to write things myself, and I don't like worrying about copyright stuff or having to put a ton of credits from different sources all over my code.
Anyway, my code is as follows below. It's very short and simple.
const static std::string Base64Chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

inline bool IsBase64(std::uint8_t C)
{
    return (isalnum(C) || (C == '+') || (C == '/'));
}

std::string Copy(std::string Str, int FirstChar, int Count)
{
    if (FirstChar <= 0)
        FirstChar = 0;
    else
        FirstChar -= 1;
    return Str.substr(FirstChar, Count);
}

std::string DecToBinStr(int Num, int Padding)
{
    int Bin = 0, Pos = 1;
    std::stringstream SS;

    while (Num > 0)
    {
        Bin += (Num % 2) * Pos;
        Num /= 2;
        Pos *= 10;
    }

    SS.fill('0');
    SS.width(Padding);
    SS << Bin;
    return SS.str();
}

int DecToBinStr(std::string DecNumber)
{
    int Bin = 0, Pos = 1;
    int Dec = strtol(DecNumber.c_str(), NULL, 10);

    while (Dec > 0)
    {
        Bin += (Dec % 2) * Pos;
        Dec /= 2;
        Pos *= 10;
    }
    return Bin;
}

int BinToDecStr(std::string BinNumber)
{
    int Dec = 0;
    int Bin = strtol(BinNumber.c_str(), NULL, 10);

    for (int I = 0; Bin > 0; ++I)
    {
        if (Bin % 10 == 1)
        {
            Dec += (1 << I);
        }
        Bin /= 10;
    }
    return Dec;
}

std::string EncodeBase64(std::string Data)
{
    std::string Binary = std::string();
    std::string Result = std::string();

    for (std::size_t I = 0; I < Data.size(); ++I)
    {
        Binary += DecToBinStr(Data[I], 8);
    }

    for (std::size_t I = 0; I < Binary.size(); I += 6)
    {
        Result += Base64Chars[BinToDecStr(Copy(Binary, I, 6))];
        if (I == 0) ++I;
    }

    int PaddingAmount = ((-Result.size() * 3) & 3);
    for (int I = 0; I < PaddingAmount; ++I)
        Result += '=';

    return Result;
}

std::string DecodeBase64(std::string Data)
{
    std::string Binary = std::string();
    std::string Result = std::string();

    for (std::size_t I = Data.size(); I > 0; --I)
    {
        if (Data[I - 1] != '=')
        {
            std::string Characters = Copy(Data, 0, I);
            for (std::size_t J = 0; J < Characters.size(); ++J)
                Binary += DecToBinStr(Base64Chars.find(Characters[J]), 6);
            break;
        }
    }

    for (std::size_t I = 0; I < Binary.size(); I += 8)
    {
        Result += (char)BinToDecStr(Copy(Binary, I, 8));
        if (I == 0) ++I;
    }

    return Result;
}
I've been using the above like this:
int main()
{
    std::string Data = EncodeBase64("IMG." + ::ToString(677) + "*" + ::ToString(604)); // IMG.677*604
    std::cout << DecodeBase64(Data); // Prints IMG.677*601
}
As you can see in the above, it prints the wrong string. It's fairly close but for some reason, the 4 is turned into a 1!
Now if I do:
int main()
{
    std::string Data = EncodeBase64("IMG." + ::ToString(1366) + "*" + ::ToString(768)); // IMG.1366*768
    std::cout << DecodeBase64(Data); // Prints IMG.1366*768
}
It prints correctly.. I'm not sure what is going on at all or where to begin looking.
Just in-case anyone is curious and want to see my other attempts (the slow ones): http://pastebin.com/Xcv03KwE
I'm really hoping someone could shed some light on speeding things up or at least figuring out what's wrong with my code :l
The main encoding issue is that you are not accounting for data that is not a multiple of 6 bits. In this case, the final 4 you have is being converted into 0100 instead of 010000 because there are no more bits to read. You are supposed to pad with 0s.
After changing your Copy like this, the final encoded character is Q, instead of the original E.
std::string data = Str.substr(FirstChar, Count);
while (data.size() < static_cast<std::size_t>(Count))
    data += '0';
return data;
Also, it appears that your logic for adding padding = is off because it is adding one too many = in this case.
As far as comments on speed, I'd focus primarily on trying to reduce your usage of std::string. The way you are currently converting the data into a string of 0s and 1s is pretty inefficient, considering that the source could be read directly with bitwise operators.
I'm not sure whether I could easily come up with a slower method of doing Base-64 conversions.
The code requires 4 headers (on Mac OS X 10.7.5 with G++ 4.7.1) and the compiler option -std=c++11 to make the #include <cstdint> acceptable:
#include <string>
#include <iostream>
#include <sstream>
#include <cstdint>
It also requires a function ToString() that was not defined; I created:
std::string ToString(int value)
{
    std::stringstream ss;
    ss << value;
    return ss.str();
}
The code in your main() — which is what uses the ToString() function — is a little odd: why do you need to build a string from pieces instead of simply using "IMG.677*604"?
Also, it is worth printing out the intermediate result:
int main()
{
    std::string Data = EncodeBase64("IMG." + ::ToString(677) + "*" + ::ToString(604));
    std::cout << Data << std::endl;
    std::cout << DecodeBase64(Data) << std::endl; // Prints IMG.677*601
}
This yields:
SU1HLjY3Nyo2MDE===
IMG.677*601
The output string (SU1HLjY3Nyo2MDE===) is 18 bytes long; that has to be wrong as a valid Base-64 encoded string has to be a multiple of 4 bytes long (as three 8-bit bytes are encoded into four bytes each containing 6 bits of the original data). This immediately tells us there are problems. You should only get zero, one or two pad (=) characters; never three. This also confirms that there are problems.
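As a quick cross-check on the expected length: "IMG.677*604" is 11 bytes, which is ceil(11/3) = 4 groups of three, so a correct encoder must emit 4 * 4 = 16 characters. The last group holds only 2 input bytes, so it contributes 3 data characters and exactly one = pad (matching the SU1HLjY3Nyo2MDQ= shown below).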
Removing two of the pad characters leaves a valid Base-64 string. When I use my own home-brew Base-64 encoding and decoding functions to decode your (truncated) output, it gives me:
Base64:
0x0000: SU1HLjY3Nyo2MDE=
Binary:
0x0000: 49 4D 47 2E 36 37 37 2A 36 30 31 00 IMG.677*601.
Thus it appears you have encoded the null that terminates the string. When I encode IMG.677*604, the output I get is:
Binary:
0x0000: 49 4D 47 2E 36 37 37 2A 36 30 34 IMG.677*604
Base64: SU1HLjY3Nyo2MDQ=
You say you want to speed up your code. Quite apart from fixing it so that it encodes correctly (I've not really studied the decoding), you will want to avoid all the string manipulation you do. It should be a bit manipulation exercise, not a string manipulation exercise.
I have 3 small encoding routines in my code, to encode triplets, doublets and singlets:
/* Encode 3 bytes of data into 4 */
static void encode_triplet(const char *triplet, char *quad)
{
    quad[0] = base_64_map[(triplet[0] >> 2) & 0x3F];
    quad[1] = base_64_map[((triplet[0] & 0x03) << 4) | ((triplet[1] >> 4) & 0x0F)];
    quad[2] = base_64_map[((triplet[1] & 0x0F) << 2) | ((triplet[2] >> 6) & 0x03)];
    quad[3] = base_64_map[triplet[2] & 0x3F];
}

/* Encode 2 bytes of data into 4 */
static void encode_doublet(const char *doublet, char *quad, char pad)
{
    quad[0] = base_64_map[(doublet[0] >> 2) & 0x3F];
    quad[1] = base_64_map[((doublet[0] & 0x03) << 4) | ((doublet[1] >> 4) & 0x0F)];
    quad[2] = base_64_map[((doublet[1] & 0x0F) << 2)];
    quad[3] = pad;
}

/* Encode 1 byte of data into 4 */
static void encode_singlet(const char *singlet, char *quad, char pad)
{
    quad[0] = base_64_map[(singlet[0] >> 2) & 0x3F];
    quad[1] = base_64_map[((singlet[0] & 0x03) << 4)];
    quad[2] = pad;
    quad[3] = pad;
}
This is written as C code rather than using native C++ idioms, but the code shown should compile with C++ (unlike the C99 initializers elsewhere in the source). The base_64_map[] array corresponds to your Base64Chars string. The pad character passed in is normally '=', but can be '\0' since the system I work with has eccentric ideas about not needing padding (pre-dating my involvement in the code, and it uses a non-standard alphabet to boot) and the code handles both the non-standard and the RFC 3548 standard.
The driving code is:
/* Encode input data as Base-64 string. Output length returned, or negative error */
static int base64_encode_internal(const char *data, size_t datalen, char *buffer, size_t buflen, char pad)
{
    size_t outlen = BASE64_ENCLENGTH(datalen);
    const char *bin_data = (const void *)data;
    char *b64_data = (void *)buffer;

    if (outlen > buflen)
        return(B64_ERR_OUTPUT_BUFFER_TOO_SMALL);

    while (datalen >= 3)
    {
        encode_triplet(bin_data, b64_data);
        bin_data += 3;
        b64_data += 4;
        datalen -= 3;
    }
    b64_data[0] = '\0';
    if (datalen == 2)
        encode_doublet(bin_data, b64_data, pad);
    else if (datalen == 1)
        encode_singlet(bin_data, b64_data, pad);
    b64_data[4] = '\0';
    return((b64_data - buffer) + strlen(b64_data));
}

/* Encode input data as Base-64 string. Output length returned, or negative error */
int base64_encode(const char *data, size_t datalen, char *buffer, size_t buflen)
{
    return(base64_encode_internal(data, datalen, buffer, buflen, base64_pad));
}
The base64_pad constant is the '='; there's also a base64_encode_nopad() function that supplies '\0' instead. The errors are somewhat arbitrary but relevant to the code.
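BASE64_ENCLENGTH is not shown in the excerpt; a plausible definition (an assumption on my part, not the author's actual macro) reserves four output bytes for every three input bytes, rounded up, plus the terminating null:

/* Hypothetical: padded Base-64 length of len input bytes, plus '\0'. */
#define BASE64_ENCLENGTH(len) ((((len) + 2) / 3) * 4 + 1)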
The main point to take away from this is that you should be doing bit manipulation and building up a string that is an exact multiple of 4 bytes for a given input.
std::string EncodeBase64(std::string Data)
{
    std::string Binary = std::string();
    std::string Result = std::string();

    for (std::size_t I = 0; I < Data.size(); ++I)
    {
        Binary += DecToBinStr(Data[I], 8);
    }

    if (Binary.size() % 6)
    {
        Binary.resize(Binary.size() + 6 - Binary.size() % 6, '0');
    }

    for (std::size_t I = 0; I < Binary.size(); I += 6)
    {
        Result += Base64Chars[BinToDecStr(Copy(Binary, I, 6))];
        if (I == 0) ++I;
    }

    if (Result.size() % 4)
    {
        Result.resize(Result.size() + 4 - Result.size() % 4, '=');
    }

    return Result;
}

How do I convert this code to c++?

I have this code:
string get_md5sum(unsigned char* md) {
    char buf[MD5_DIGEST_LENGTH + MD5_DIGEST_LENGTH];
    char *bptr;
    bptr = buf;

    for (int i = 0; i < MD5_DIGEST_LENGTH; i++) {
        bptr += sprintf(bptr, "%02x", md[i]);
    }
    bptr += '\0';

    string x(buf);
    return x;
}
Unfortunately, this is some C combined with some C++. It does compile, but I don't like the printf and char*'s. I always thought this was not necessary in C++, and that there were other functions and classes to realize this. However, I don't completely understand what is going on with this:
bptr += sprintf(bptr, "%02x", md[i]);
And therefore I don't know how to convert it into C++. Can someone help me out with that?
sprintf returns the number of bytes written. So this one writes two bytes to bptr (the value of md[i] converted with %02x, i.e. hex, left-padded to 2 chars with zeroes), and advances bptr by the number of bytes written, so that it points at the end of the string (buf).
I don't get the bptr += '\0'; line, IMO it should be *bptr = '\0';
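A tiny standalone demonstration of that pointer-advancing pattern, with the *bptr = '\0' correction applied (the values are illustrative):

#include <cstdio>

int main()
{
    char buf[8];
    char *bptr = buf;
    bptr += sprintf(bptr, "%02x", 0x4c); // writes "4c", returns 2
    bptr += sprintf(bptr, "%02x", 0x07); // appends "07"
    *bptr = '\0';                        // buf now holds "4c07"
    puts(buf);
}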
in C++ it should be written like this:
using namespace std;

stringstream buf;
for (int i = 0; i < MD5_DIGEST_LENGTH; i++)
{
    buf << hex << setfill('0') << setw(2) << static_cast<int>(static_cast<unsigned char>(md[i]));
}
return buf.str();
EDIT: updated my c++ answer
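Wrapped up as a complete replacement for the question's function (a sketch; MD5_DIGEST_LENGTH is assumed to come from the same header the question uses, e.g. <openssl/md5.h>):

#include <iomanip>
#include <sstream>
#include <string>

std::string get_md5sum(unsigned char *md)
{
    std::stringstream buf;
    for (int i = 0; i < MD5_DIGEST_LENGTH; i++)
    {
        buf << std::hex << std::setfill('0') << std::setw(2)
            << static_cast<int>(static_cast<unsigned char>(md[i]));
    }
    return buf.str();
}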
bptr += sprintf(bptr, "%02x", md[i]);
This is printing the character in md[i] as 2 hex characters into the buffer and advancing the buffer pointer by 2. Thus the loop prints out the hex form of the MD5.
bptr += '\0';
That line is probably not doing what you want... it's adding 0 to the pointer, giving you the same pointer back...
I'd implement it something like this:
string get_md5sum(unsigned char* md) {
    static const char hexdigits[] = "0123456789ABCDEF";
    char buf[2 * MD5_DIGEST_LENGTH];

    for (int i = 0; i < MD5_DIGEST_LENGTH; i++) {
        buf[2*i + 0] = hexdigits[md[i] / 16];
        buf[2*i + 1] = hexdigits[md[i] % 16];
    }
    return string(buf, 2 * MD5_DIGEST_LENGTH);
}
I don't know C++, so without using pointers and strings and stuff, here's (almost) pseudo-code for you :)
for (int i = 0; i < MD5_DIGEST_LENGTH; i++) {
    buf[i*2]     = hexdigits[(md[i] & 0xF0) >> 4];
    buf[i*2 + 1] = hexdigits[md[i] & 0x0F];
}