How can I pad my MD5 message with C/C++ - c++

I'm working on a program in C++ to do MD5 checksums. I'm doing this mainly because I think I'll learn a lot of different things about C++, checksums, OOP, and whatever else I run into.
I'm having trouble with the checksums, and I think the problem is in the function padbuff, which does the message padding.
#include "HashMD5.h"
int leftrotate(int x, int y);
void padbuff(uchar * buffer);
//HashMD5 constructor
HashMD5::HashMD5()
{
Type = "md5";
Hash = "";
}
HashMD5::HashMD5(const char * hashfile)
{
Type = "md5";
std::ifstream filestr;
filestr.open(hashfile, std::fstream::in | std::fstream::binary);
if(filestr.fail())
{
std::cerr << "File " << hashfile << " was not opened.\n";
std::cerr << "Open failed with error ";
}
}
std::string HashMD5::GetType()
{
return this->Type;
}
std::string HashMD5::GetHash()
{
return this->Hash;
}
bool HashMD5::is_open()
{
return !((this->filestr).fail());
}
void HashMD5::CalcHash(unsigned int * hash)
{
unsigned int *r, *k;
int r2[4] = {0, 4, 9, 15};
int r3[4] = {0, 7, 12, 19};
int r4[4] = {0, 4, 9, 15};
uchar * buffer;
int bufLength = (2<<20)*8;
int f,g,a,b,c,d, temp;
int *head;
uint32_t maxint = 1<<31;
//Initialized states
unsigned int h[4]{ 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476};
r = new unsigned int[64];
k = new unsigned int[64];
buffer = new uchar[bufLength];
if(r==NULL || k==NULL || buffer==NULL)
{
std::cerr << "One of the dyn alloc failed\n";
}
// r specifies the per-round shift amounts
for(int i = 0; i<16; i++)
r[i] = 7 + (5 * ((i)%4) );
for(int i = 16; i < 32; i++)
r[i] = 5 + r2[i%4];
for(int i = 32; i< 48; i++)
r[i] = 4 + r3[i%4];
for(int i = 48; i < 63; i++)
r[i] = 6 + r4[i%4];
for(int i = 0; i < 63; i++)
{
k[i] = floor( fabs( sin(i + 1)) * maxint);
}
while(!(this->filestr).eof())
{
//Read in 512 bits
(this->filestr).read((char *)buffer, bufLength-512);
padbuff(buffer);
//The 512 bits are now 16 32-bit ints
head = (int *)buffer;
for(int i = 0; i < 64; i++)
{
if(i >=0 && i <=15)
{
f = (b & c) | (~b & d);
g = i;
}
else if(i >= 16 && i <=31)
{
f = (d & b) | (~d & b);
g = (5*i +1) % 16;
}
else if(i >=32 && i<=47)
{
f = b ^ c ^ d;
g = (3*i + 5 ) % 16;
}
else
{
f = c ^ (b | ~d);
g = (7*i) % 16;
}
temp = d;
d = c;
c = b;
b = b + leftrotate((a + f + k[i] + head[g]), r[i]);
a = temp;
}
h[0] +=a;
h[1] +=b;
h[2] +=c;
h[3] +=d;
}
delete[] r;
delete[] k;
hash = h;
}
int leftrotate(int x, int y)
{
return(x<<y) | (x >> (32 -y));
}
void padbuff(uchar* buffer)
{
int lack;
int length = strlen((char *)buffer);
uint64_t mes_size = length % UINT64_MAX;
if((lack = (112 - (length % 128) ))>0)
{
*(buffer + length) = ('\0'+1 ) << 3;
memset((buffer + length + 1),0x0,lack);
memcpy((void*)(buffer+112),(void *)&mes_size, 64);
}
}
In my test program I run this on an empty message. Thus length in padbuff is 0. Then when I do *(buffer + length) = ('\0'+1 ) << 3;, I'm trying to pad the message with a 1. In the Netbeans debugger I cast buffer as a uint64_t and it says buffer=8. I was trying to put a 1 bit in the most significant spot of buffer, so my cast should have been UINT64_MAX. It's not, so I'm confused about how my padding code works. Can someone tell me what I'm doing wrong and what I'm supposed to do in padbuff? Thanks, and I apologize for the long freaking question.
Just to be clear about what the padding is supposed to be doing, here is the padding excerpt from Wikipedia:
The message is padded so that its length is divisible by 512. The padding works as follows: first a single bit, 1, is appended to the end of the message. This is followed by as many zeros as are required to bring the length of the message up to 64 bits fewer than a multiple of 512. The remaining bits are filled up with 64 bits representing the length of the original message, modulo 2^64.
I'm mainly looking for help for padbuff, but since I'm trying to learn all comments are appreciated.

The first question is what you did:
length % UINT64_MAX doesn't make sense at all, because length is a byte count while UINT64_MAX is the largest value a uint64_t can hold.
You thought that putting a 1 in the most significant bit would give the maximum value. In fact, you need to set all the bits to get it.
You shift 1 left by 3, which only reaches the middle of a byte (bit 3), not its most significant bit.
The byte pointed to by buffer is the least significant in little endian. (I assume you have little endian, since the debugger showed 8.)
The second question how it should work.
I don't know what exactly padbuff should do but if you want to pad and get UINT64_MAX, you need something like this:
int length = strlen((char *)buffer);
int len_of_padding = sizeof(uint64_t) - length % sizeof(uint64_t);
if(len_of_padding > 0)
{
memset((void*)(buffer + length), 0xFF, len_of_padding);
}
You worked with the length of two uint64 values. Maybe you wanted to zero the next one:
uint64_t *after = (uint64_t*)(buffer + length + len_of_padding);
*after = 0;
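For reference, here is a minimal sketch of what the full MD5 padding step should produce, based on the Wikipedia excerpt quoted in the question. This is my own illustration, not the asker's intended design: it assumes the message length in bytes is known up front and a little-endian host, and the names are mine.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Pad a message to a multiple of 64 bytes (512 bits), MD5-style:
// append the byte 0x80 (a single 1 bit in the MSB), then zeros, then
// the original length in *bits* as a little-endian 64-bit integer.
std::vector<uint8_t> md5Pad(const uint8_t* msg, size_t len)
{
    // Smallest multiple of 64 with room for msg + 0x80 + 8-byte length.
    size_t padded = ((len + 8) / 64 + 1) * 64;
    std::vector<uint8_t> out(padded, 0);
    std::memcpy(out.data(), msg, len);
    out[len] = 0x80;                                   // the single 1 bit
    uint64_t bitLen = static_cast<uint64_t>(len) * 8;
    std::memcpy(out.data() + padded - 8, &bitLen, 8);  // assumes little-endian host
    return out;
}
```

Note that the length field is 8 bytes (64 bits), so the memcpy size in the question's padbuff should be 8, not 64.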

Related

Copy 80 bit hex number from char array to uint16_t vector or array

Say I have a text file containing the 80bit hex number
0xabcdef0123456789abcd
My C++ program reads that using fstream into a char array called buffer.
But then I want to store it in a uint16_t array such that:
uint16_t key[] = {0xabcd, 0xef01, 0x2345, 0x6789, 0xabcd};
I have tried several approaches, but I continue to get decimal integers, for instance:
const std::size_t strLength = strlen(buffer);
std::vector<uint16_t> arr16bit((strLength / 2) + 1);
for (std::size_t i = 0; i < strLength; ++i)
{
arr16bit[i / 2] <<= 8;
arr16bit[i / 2] |= buffer[i];
}
Yields:
arr16bit = {24930, 25444, 25958, 12337, 12851}
There must be an easy way to do this that I'm just not seeing.
Here is the full solution I came up with based on the comments:
int hex_char_to_int(char c) {
if (int(c) < 58) //numbers
return c - 48;
else if (int(c) < 91) //capital letters
return c - 65 + 10;
else if (int(c) < 123) //lower case letters
return c - 97 + 10;
return -1; //not a hex digit; without this the function falls off the end (undefined behavior)
}
uint16_t ints_to_int16(int i0, int i1, int i2, int i3) {
return (i3 * 16 * 16 * 16) + (i2 * 16 * 16) + (i1 * 16) + i0;
}
void readKey() {
const int bufferSize = 25;
char buffer[bufferSize] = { 0 };
ifstream* pStream = new ifstream("key.txt");
if (pStream->is_open() == true)
{
pStream->read(buffer, bufferSize);
}
cout << buffer << endl;
const size_t strLength = strlen(buffer);
int* hex_to_int = new int[strLength - 2];
for (int i = 2; i < strLength; i++) {
hex_to_int[i - 2] = hex_char_to_int(buffer[i]);
}
cout << endl;
uint16_t* key16 = new uint16_t[5];
int j = 0;
for (int i = 0; i < 5; i++) {
key16[i] = ints_to_int16(hex_to_int[j + 3], hex_to_int[j + 2], hex_to_int[j + 1], hex_to_int[j]); // i0 is the least significant digit; four j++ in one call has no guaranteed evaluation order
j += 4;
cout << "0x" << hex << key16[i] << " ";
}
cout << endl;
}
This outputs:
0xabcdef0123456789abcd
0xabcd 0xef01 0x2345 0x6789 0xabcd
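A shorter route to the same result, sketched under the assumption that the input is a well-formed "0x"-prefixed hex string whose digit count is a multiple of four, is to let std::stoul parse four digits at a time:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Split "0xabcdef0123456789abcd" into 16-bit words: {0xabcd, 0xef01, ...}
std::vector<uint16_t> hexToWords(const std::string& s)
{
    std::string digits = s.substr(2);  // drop the "0x" prefix
    std::vector<uint16_t> words;
    for (std::size_t i = 0; i < digits.size(); i += 4)
        words.push_back(static_cast<uint16_t>(
            std::stoul(digits.substr(i, 4), nullptr, 16)));
    return words;
}
```

This avoids the manual digit arithmetic entirely; std::stoul with base 16 handles both upper- and lower-case digits.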

How to take input 128 bit unsigned integer in c++

I am new to C++. I want to read an unsigned 128-bit integer using scanf and print it using printf. As I am new to C++, these are the only two input/output methods I know. Can someone help me out?
You could use Boost, but that library set must be installed separately:
#include <boost/multiprecision/cpp_int.hpp>
#include <iostream>
int main()
{
using namespace boost::multiprecision;
uint128_t v = 0;
std::cin >> v; // read
std::cout << v << std::endl; // write
return 0;
}
If you want to get along without Boost, you can store the value in two uint64_t values as such:
std::string input;
std::cin >> input;
uint64_t high = 0, low = 0, tmp;
for(char c : input)
{
high *= 10;
tmp = low * 10;
if(tmp / 10 != low)
{
high += ((low >> 32) * 10 + ((low & 0xf) * 10 >> 32)) >> 32;
}
low = tmp;
tmp = low + c - '0';
high += tmp < low;
low = tmp;
}
Printing then, however, gets uglier:
std::vector<uint64_t> v;
while(high | low)
{
uint64_t const pow10 = 100000000;
uint64_t const mod = (((uint64_t)1 << 32) % pow10) * (((uint64_t)1 << 32) % pow10) % pow10;
tmp = high % pow10;
uint64_t temp = tmp * mod % pow10 + low % pow10;
v.push_back((tmp * mod + low) % pow10);
low = low / pow10 + tmp * 184467440737 + tmp * /*0*/9551616 / pow10 + (temp >= pow10);
high /= pow10;
}
std::vector<uint64_t>::reverse_iterator i = v.rbegin();
while(i != v.rend() && *i == 0)
{
++i;
}
if(i == v.rend())
{
std::cout << 0;
}
else
{
std::cout << *i << std::setfill('0');
for(++i; i != v.rend(); ++i)
{
std::cout << std::setw(8) << *i;
}
}
The above solution works up to (and including)
340282366920938463463374516198409551615
= 0x ffff ffff ffff ffff ffff ad06 1410 beff
Above that value, there is an error.
Note: pow10 can be varied, then some other constants need to be adjusted, e. g. pow10 = 10:
low = low / pow10 + tmp * 1844674407370955161 + tmp * 6 / pow10 + (temp >= pow10);
and
std::cout << std::setw(1) << *i; // setw also can be dropped in this case
Increasing pow10 reduces the maximum number for which printing still works correctly; decreasing it raises the maximum. With pow10 = 10, the maximum is
340282366920938463463374607431768211425
= ffff ffff ffff ffff ffff ffff ffff ffe1
I don't know yet where the error for the very highest numbers comes from; possibly some unconsidered overflow. Any suggestions are appreciated, and then I'll improve the algorithm. Until then, I'd reduce pow10 to 10 and introduce special handling for the 30 highest failing numbers:
std::string const specialValues[30] = { /*...*/ };
if(high == 0xffffffffffffffff && low > 0xffffffffffffffe1)
{
std::cout << specialValues[low - 0xffffffffffffffe2];
}
else
{
/* ... */
}
So at least, we can handle all valid 128-bit values correctly.
You can try from_string_128_bits and to_string_128_bits with 128-bit unsigned integers in C:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
__uint128_t from_string_128_bits(const char *str) {
__uint128_t res = 0;
for (; *str; res = res * 10 + *str++ - '0');
return res;
}
static char *to_string_128_bits(__uint128_t num) {
__uint128_t mask = -1;
size_t a, b, c = 1, d;
char *s = malloc(2);
strcpy(s, "0");
for (mask -= mask / 2; mask; mask >>= 1) {
for (a = (num & mask) != 0, b = c; b;) {
d = ((s[--b] - '0') << 1) + a;
s[b] = "0123456789"[d % 10];
a = d / 10;
}
for (; a; s = realloc(s, ++c + 1), memmove(s + 1, s, c), *s = "0123456789"[a % 10], a /= 10);
}
return s;
}
int main(void) {
__uint128_t n = from_string_128_bits("10000000000000000000000000000000000001");
n *= 7;
char *s = to_string_128_bits(n);
puts(s);
free(s); // string must be freed
// prints 70000000000000000000000000000000000007
}
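Where the GCC/Clang __uint128_t extension used above is available, printing can also be done with a short recursive helper instead of the digit-doubling loop. This is a sketch of my own; it is not portable to compilers without the extension.

```cpp
#include <cassert>
#include <string>

// Build the decimal representation by recursing on all but the last digit,
// then appending the last digit. Works for the full 128-bit range.
std::string to_string_u128(__uint128_t n)
{
    if (n < 10)
        return std::string(1, static_cast<char>('0' + static_cast<int>(n)));
    return to_string_u128(n / 10)
         + static_cast<char>('0' + static_cast<int>(n % 10));
}
```

The recursion depth is at most 39 (the number of decimal digits in 2^128 - 1), so stack usage is not a concern here.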

Can you explain what this binary swapping operation is doing?

I'm currently trying to solve this programming puzzle. The puzzle is about encrypting messages using the following C++ code:
int main()
{
int size;
cin >> size;
unsigned int* a = new unsigned int[size / 16]; // <- input tab to encrypt
unsigned int* b = new unsigned int[size / 16]; // <- output tab
for (int i = 0; i < size / 16; i++) { // Read size / 16 integers to a
cin >> hex >> a[i];
}
for (int i = 0; i < size / 16; i++) { // Write size / 16 zeros to b
b[i] = 0;
}
for (int i = 0; i < size; i++)
for (int j = 0; j < size; j++) {
b[(i + j) / 32] ^= ( (a[i / 32] >> (i % 32)) &
(a[j / 32 + size / 32] >> (j % 32)) & 1 ) << ((i + j) % 32); // Magic centaurian operation
}
for(int i = 0; i < size / 16; i++) {
if (i > 0) {
cout << ' ';
}
cout << setfill('0') << setw(8) << hex << b[i]; // print result
}
cout << endl;
/*
Good luck humans
*/
return 0;
}
The objective is to reverse this encoding (it should be a known mathematical operation once identified). The problem I'm facing is that I cannot understand how the encoding works or what all these binary operations are doing. Can you explain to me how this encoding works?
Thank you!
To learn what the operations are, break it down loop-by-loop and line-by-line, then apply the rules of precedence. Nothing more, nothing less. If I haven't lost track somewhere in the bitwise swamp, the effect all boils down to XORing the original value at index b[(i + j) / 32] with a power of 2 in the range of a signed integer (or 0). The analysis would look something like this:
for (int i = 0; i < size; i++)
for (int j = 0; j < size; j++) {
b[(i + j) / 32] ^=
( (a[i / 32] >> (i % 32)) &
(a[j / 32 + size / 32] >>
(j % 32)) & 1 ) <<
((i + j) % 32); // Magic centaurian operation
}
What is the first operation:
b[(i + j) / 32] ^=
This is an exclusive OR of the value at that index. If you just let idx represent the jumble that computes the index, you can write it as:
b[idx] ^= stuff
which applying the rules of precedence (right-to-left for ^=) is the same as writing:
b[idx] = b[idx] ^ stuff
The order of precedence tells us we need to figure out stuff before we can apply it to the value of b[idx]. Looking at stuff, you have:
| A | << | B |
| C | & | D | | |
| | | E | & 1 | | |
+-----------------+---+-----------------------+-----+----+-------------+
( (a[i/32]>>(i%32)) & (a[j/32+size/32]>>(j%32)) & 1 ) << ( (i+j) % 32 );
Breaking in down, you have A << B, which can be further broken down as:
( C & D ) << B
or finally:
(C & E & 1) << B
The rules of precedence relevant to (C & E & 1) << B are all applied left-to-right giving deference to the parenthesis grouping.
So what is B? It is just a number that the grouping (C & E & 1) will be shifted to the left by. In terms of the index values i and j modded with the number of bits in an integer, it will simply shift the bits in grouping (C & E & 1) to the left by 0-31 bits depending on the combined value of i+j.
The grouping (C & E & 1) is an entirely similar analysis. a[i/32]>>(i%32) is nothing more than the value at a[i/32] shifted right by (i%32). E is the same with slightly different index manipulation: (a[j/32+size/32]>>(j%32)), which is just the value at that index shifted right by (j%32). The results of both shifts are then ANDed with 1. What that means is the entire grouping (C & E & 1) will only have a value if both C and E are odd values.
Why only odd values? From a binary standpoint, odd numbers are the only values that have the ones bit set. (e.g. 5 & 7 & 1 (101 & 111 & 1) = 1). If either of the values is even or 0, then the whole grouping will be 0.
Understanding the grouping (C & E & 1) (or what we have largely grouped as A), you can now look at:
A << B
Knowing A will be 0 or 1, you know the only way the result of the shift will have a value is if A is 1, and then the result of the group is just the value 1 shifted left by B bits. Knowing B has the range 0-31, the values for A << B will be either 0 or a power of two between 1 and 2147483648 (binary: 1, 10, 100, 1000, etc.)
Then that finally brings us to
b[idx] = b[idx] ^ stuff
which, when you exclusive OR anything with a power of two, only serves to flip the bit at that power of two in the number. (e.g. 110101 (53) ^ 001000 (8) = 111101 (61)). All other bits are unchanged. So the final effect of all the operations is to make:
b[idx] = b[idx] ^ stuff
nothing more than:
b[idx] = b[idx] ^ (power of two)
or
b[idx] = b[idx] ^ 0 /* which is nothing more than b[idx] to begin with */
Let me know if you have any questions. You can easily dump the index calculations to look at the values, but this should cover the operations at issue.
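As a quick sanity check of the XOR-flip claim above, here is a tiny standalone helper (my own illustration, not part of the puzzle code):

```cpp
#include <cassert>

// XORing with a power of two flips exactly that bit; all others are unchanged.
// Applying it twice restores the original value.
unsigned flipBit(unsigned v, unsigned k)
{
    return v ^ (1u << k);
}
```

Flipping the same bit twice gives back the original value, which is exactly the property that makes the puzzle's encoding a candidate for reversal.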
This code snippet is doing a carry-free multiplication of the first half of the array (a[0:size/32]) and the second half of the array (a[size/32:size/16]).
I write an equivalent version in binary below the original version, hope this might help you.
#include <iostream>
#include <iomanip>
#include <ios>
using namespace std;
int main() {
int size;
cin >> size;
unsigned int* a = new unsigned int[size / 16]; // <- input tab to encrypt
unsigned int* b = new unsigned int[size / 16]; // <- output tab
bool *a1 = new bool[size];
bool *a2 = new bool[size];
bool *bb = new bool[size * 2];
for (int i = 0; i < size / 16; i++) { // Read size / 16 integers to a
cin >> hex >> a[i];
}
for (int i = 0; i < size * 2; i++) {
if (i < size) {
a1[i] = (a[i / 32] & (1 << (i % 32))) > 0; // first `size` bits are for a1
} else {
a2[i - size] = (a[i / 32] & (1 << (i % 32))) > 0; // rest `size` bits are for a2
}
}
for (int i = 0; i < size / 16; i++) { // Write size / 16 zeros to b
b[i] = 0;
}
for (int i = 0; i < size * 2; i++) {
bb[i] = 0;
}
for (int i = 0; i < size; i++)
for (int j = 0; j < size; j++) {
b[(i + j) / 32] ^= ( (a[i / 32] >> (i % 32)) &
(a[j / 32 + size / 32] >> (j % 32)) & 1 ) << ((i + j) % 32); // Magic centaurian operation
}
for (int i = 0; i < size; i++)
for (int j = 0; j < size; j++) {
bb[i + j] ^= (a1[i] & a2[j] & 1); // same operation as multiply (*) does, but with XOR instead of carried addition
}
for(int i = 0; i < size / 16; i++) {
if (i > 0) {
cout << ' ';
}
cout << setfill('0') << setw(8) << hex << b[i]; // print result
}
cout << endl;
for(int i = 0; i < size / 32 * 2; i++) {
if (i > 0) {
cout << ' ';
}
unsigned int hex_number = 0;
for (int j = 0; j < 32; j++) hex_number += bb[i * 32 + j] << j;
cout << setfill('0') << setw(8) << hex << hex_number; // print result
}
cout << endl;
return 0;
}
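To see the carry-free multiplication claim concretely, here is a small sketch on plain integers (my own illustration, not part of the puzzle code): for every set bit of a, XOR in a shifted copy of b; this is ordinary long multiplication with XOR in place of addition.

```cpp
#include <cassert>
#include <cstdint>

// Carry-less (XOR) multiply of two 32-bit values into a 64-bit result.
uint64_t clmul(uint32_t a, uint32_t b)
{
    uint64_t r = 0;
    for (int i = 0; i < 32; ++i)
        if (a & (1u << i))                         // for each set bit of a...
            r ^= static_cast<uint64_t>(b) << i;    // ...XOR in b shifted by it
    return r;
}
```

Viewed over GF(2), this is polynomial multiplication: e.g. (x^2 + 1)(x + 1) = x^3 + x^2 + x + 1, i.e. clmul(0b101, 0b11) == 0b1111.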

extracting binary data out of 8-bit byte and converting it to primitive types [C++]

I have a vector of integers vector<int> that has 48 items in it. I want to extract binary data out of it (not sure if this is the correct way to call it; please edit it if it's wrong), i.e. a sequence of one or more bits, and then convert them to a primitive type like int. I have come up with this solution:
int extractValue(vector<int> v, int startBit, int endBit) {
int beginByteIndex = (startBit / 8);
int endByteIndex = (endBit / 8);
vector<bool> bits;
bits.clear();
int startIndex = startBit % 8;
int endIndex = endBit % 8;
int value = v[beginByteIndex];
value = (value << startIndex);
int temp = 8;
if (beginByteIndex == endByteIndex) {
temp = endIndex + 1;
}
for (int i = startIndex; i < temp; i++) {
int temp = 0x80 & value;
bits.push_back(temp);
value <<= 1;
}
for (int i = beginByteIndex + 1; i < endByteIndex; i++) {
value = v[i];
for (int j = 0; j < 8; j++) {
int temp = 0x80 & value;
bits.push_back(temp);
value <<= 1;
}
}
if (endByteIndex > beginByteIndex) {
value = v[endByteIndex];
for (int i = 0; i <= endIndex; i++) {
int temp = 0x80 & value;
bits.push_back(temp);
value <<= 1;
}
}
int size = bits.size();
int p = 1;
int result = 0;
for (int i = size - 1; i >= 0; i--) {
result += (bits[i] * p);
p *= 2;
}
return result;
}
but this function is long, difficult to read, and is done in C style. Could someone please suggest a C++ way of doing this? I'm almost certain that C++ has a good, short and elegant way of doing this. Also, please edit the question so others with a similar problem can benefit from it. Unfortunately my English is not good enough to express it in a more general way.
EDIT:
As requested in the comments, for example I want to extract the following information at the following positions and lengths:
int year = extractValue(data, 0, 6);
int month = extractValue(data, 7, 10);
int day = extractValue(data, 11, 15);
A simple solution:
convert each byte to a hex string (ostringstream or even sprintf can help); you get 2 digits, each in the range 0 to F.
for each hex digit you can create the bitmap like this:
0 = 0000,
1 = 0001,
2 = 0010,
...,
F = 1111,
add bits to the vector according to the bitmap
To recover, you take 4 bits and translate them back to a digit, then take 2 digits and convert them back to a byte (say, by prepending 0x to the hex and using istringstream to read the byte).
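Another option, sketched under the assumption (matching the question's usage) that each int in the vector holds one byte and bits are numbered MSB-first starting at bit 0, is to collect the bits directly:

```cpp
#include <cassert>
#include <vector>

// Extract bits [startBit, endBit] (inclusive, MSB-first across bytes)
// and pack them into an int. The field must fit in an int.
int extractValue(const std::vector<int>& v, int startBit, int endBit)
{
    int result = 0;
    for (int i = startBit; i <= endBit; ++i) {
        int bit = (v[i / 8] >> (7 - i % 8)) & 1;  // i-th bit, MSB-first
        result = (result << 1) | bit;
    }
    return result;
}
```

This replaces the three byte-boundary cases and the temporary vector<bool> with a single loop over bit positions.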

filling a string made by new in a function

#include <iostream>
using namespace std;
void generCad(int n, char* cad){
int longi = 1, lastchar, m = n; // calculating length of binary string
char actual;
do{
longi++;
n /= 2;
}while(n/2 != 0);
cad = new char[longi];
lastchar = longi - 1;
do{
actual = m % 2;
cad[lastchar] = actual;
m /= 2;
lastchar--;
}while(m/2 != 0);
cout << "Cadena = " << cad;
}
Hi! I'm having a problem here because I need a function that creates a binary string for a number n. I think the process is "good", but cout doesn't print anything, and I don't know how to fill the string I've created using the new operator.
The code should look like this:
void generCad(int n, char** cad)
{
int m = n, c = 1;
while (m >>= 1) // this divides the m by 2, but by shifting which is faster
c++; // here you counts the bits
*cad = new char[c + 1];
(*cad)[c] = 0; // here you end the string by 0 character
while (n)
{
(*cad)[--c] = n % 2 + '0';
n /= 2;
}
cout << "Cadena = " << *cad;
}
Note that cad is now char ** and not char *. If it is just char *, then you do not get the pointer as you expect outside the function. If you do not need the string outside this function, then it may be passed as char *, but then do not forget to delete[] cad before you leave the function (a good habit ;-))
EDIT:
This code will probably be more readable and do the same:
char * toBin(int n)
{
int m = n, c = 1;
while (m >>= 1) // this divides the m by 2, but by shifting which is faster
c++; // here you counts the bits
char *cad = new char[c + 1];
cad[c] = 0; // here you end the string by 0 character
while (n)
{
cad[--c] = n % 2 + '0';
n /= 2;
}
cout << "Cadena = " << cad;
return cad;
}
int main()
{
char *buff;
buff = toBin(16);
delete [] buff;
return 1;
}
actual contains the numbers 0 and 1, not the characters '0' and '1'. To convert, use:
cad[lastchar] = actual + '0';
Also, since you're using cad as a C string, you need to allocate an extra character and add a NUL terminator.
actual = m % 2;
should be:
actual = m % 2 + '0';
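For comparison, the same conversion can be written without new at all by building a std::string (a sketch of mine; it also handles n = 0, which the original do/while loops mishandle):

```cpp
#include <cassert>
#include <string>

// Build the binary representation of n (n >= 0) by prepending digits.
std::string toBinString(int n)
{
    std::string s;
    do {
        s.insert(s.begin(), static_cast<char>('0' + (n % 2)));
        n /= 2;
    } while (n > 0);
    return s;
}
```

Because std::string owns its memory, there is no allocation size to compute, no terminator to place, and no delete[] for the caller to remember.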