Can you explain what this binary swapping operation is doing? - c++

I'm currently trying to solve this programming puzzle. The puzzle is about encrypting messages using the following C++ code:
#include <iostream>
#include <iomanip>
using namespace std;

int main()
{
    int size;
    cin >> size;
    unsigned int* a = new unsigned int[size / 16]; // <- input tab to encrypt
    unsigned int* b = new unsigned int[size / 16]; // <- output tab
    for (int i = 0; i < size / 16; i++) { // Read size / 16 integers to a
        cin >> hex >> a[i];
    }
    for (int i = 0; i < size / 16; i++) { // Write size / 16 zeros to b
        b[i] = 0;
    }
    for (int i = 0; i < size; i++)
        for (int j = 0; j < size; j++) {
            b[(i + j) / 32] ^= ( (a[i / 32] >> (i % 32)) &
                (a[j / 32 + size / 32] >> (j % 32)) & 1 ) << ((i + j) % 32); // Magic centaurian operation
        }
    for (int i = 0; i < size / 16; i++) {
        if (i > 0) {
            cout << ' ';
        }
        cout << setfill('0') << setw(8) << hex << b[i]; // print result
    }
    cout << endl;
    /*
    Good luck humans
    */
    return 0;
}
The objective is to reverse this encoding (which should be a known mathematical operation once identified). The problem I'm facing is that I cannot understand how the encoding works and what all these binary operations are doing. Can you explain to me how this encoding works?
Thank you!

To learn what the operations are, break it down loop-by-loop and line-by-line, then apply the rules of precedence. Nothing more, nothing less. If I haven't lost track somewhere in the bitwise swamp, the effect all boils down to XORing the original value at index b[(i + j) / 32] with a single power of 2 that fits in a 32-bit unsigned integer (or with 0). The analysis would look something like this:
for (int i = 0; i < size; i++)
    for (int j = 0; j < size; j++) {
        b[(i + j) / 32] ^=
            ( (a[i / 32] >> (i % 32)) &
              (a[j / 32 + size / 32] >>
               (j % 32)) & 1 ) <<
            ((i + j) % 32); // Magic centaurian operation
    }
What is the first operation:
b[(i + j) / 32] ^=
This is an exclusive OR (XOR) assignment of the value at that index. If you just let idx represent the jumble that computes the index, you can write it as:
b[idx] ^= stuff
which, applying the rules of precedence (compound assignments like ^= group right-to-left), is the same as writing:
b[idx] = b[idx] ^ stuff
The order of precedence tells us we need to figure out stuff before we can apply it to the value of b[idx]. Looking at stuff you have:
|<------------------------- A -------------------------->|      |<---- B ---->|
  |<----- C ----->|       |<-------- D = E & 1 --------->|
                          |<--------- E ---------->|
( (a[i/32]>>(i%32))   &   (a[j/32+size/32]>>(j%32)) & 1 )  <<  ( (i+j) % 32 );
Breaking it down, you have A << B, which can be further broken down as:
( C & D ) << B
or finally:
(C & E & 1) << B
The rules of precedence relevant to (C & E & 1) << B are applied left-to-right, with the parenthesized grouping evaluated first.
So what is B? It is just the number of bit positions that the grouping (C & E & 1) will be shifted to the left by. Since i and j are taken modulo the number of bits in an integer, the grouping (C & E & 1) is shifted left by 0-31 bits depending on the combined value of i + j.
The grouping (C & E & 1) yields to an entirely similar analysis. a[i/32] >> (i%32) is nothing more than the value at a[i/32] shifted to the right by (i%32). E is the same with slightly different index manipulation: (a[j/32+size/32] >> (j%32)) is just the value at that index shifted right by (j%32). The results of both of those shifts are then ANDed with 1. What that means is the entire grouping (C & E & 1) will only have a value if both C and E are odd.
Why only odd values? From a binary standpoint, odd numbers are the only values that have the ones-bit set (e.g. 5 & 7 & 1 (101 & 111 & 1) = 1). If either of the values is even (or 0), then the whole grouping will be 0.
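Put differently, the trailing & 1 just tests a single bit of each shifted value: (x >> k) & 1 is bit k of x. A minimal standalone illustration (my own sketch, not part of the puzzle code):
#include <cstdio>

int main()
{
    unsigned int x = 0x2C;                    // binary 101100
    for (int k = 5; k >= 0; --k)
        std::printf("%u", (x >> k) & 1u);     // prints the bits of x: 101100
    std::printf("\n");
    return 0;
}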
Understanding the grouping (C & E & 1) (or what we have largely grouped as A), you can now look at:
A << B
Knowing A will be 0 or 1, you know the only way the result of the shift will have a value is if A is 1, and then the result of the group is just the value 1 shifted left by B bits. Knowing B has the range 0-31, the result of A << B is either 0 or a power of two between 1 and 2147483648 (2^31): binary 1, 10, 100, 1000, and so on.
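If you want to see those values, a quick throwaway loop (my own illustration, not from the puzzle) prints every possible non-zero result of A << B:
#include <cstdio>

int main()
{
    for (unsigned int b = 0; b < 32; ++b)
        std::printf("1 << %2u = 0x%08x\n", b, 1u << b);   // 0x00000001 up to 0x80000000
    return 0;
}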
Then that finally brings us to
b[idx] = b[idx] ^ stuff
which, when you XOR anything with a power of two, only serves to flip the bit at that power of two in the number (e.g. 110101 (53) ^ 1000 (8) = 111101 (61)). All other bits are unchanged. So the final effect of all the operations is to make:
b[idx] = b[idx] ^ stuff
nothing more than:
b[idx] = b[idx] ^ (power of two)
or
b[idx] = b[idx] ^ 0 /* which is nothing more than b[idx] to begin with */
Let me know if you have any questions. You can easily dump the index calculations to look at the values, but this should cover the operations at issue.
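For example, a throwaway dump of the index calculations might look like this (my own sketch, with size hard-coded just to watch the values go by):
#include <cstdio>

int main()
{
    int size = 64;                            // assumed value, just for the dump
    for (int i = 0; i < size; i++)
        for (int j = 0; j < size; j++)
            std::printf("i=%2d j=%2d reads a[%d] bit %2d and a[%d] bit %2d, xors into b[%d] bit %2d\n",
                        i, j, i / 32, i % 32, j / 32 + size / 32, j % 32,
                        (i + j) / 32, (i + j) % 32);
    return 0;
}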

This code snippet is doing a carry-less multiplication (i.e. binary polynomial multiplication over GF(2)) of the first half of the array (a[0 : size/32]) and the second half of the array (a[size/32 : size/16]).
I wrote an equivalent bit-by-bit version right below the original operation in the code; I hope this helps you.
#include <iostream>
#include <iomanip>
#include <ios>
using namespace std;

int main() {
    int size;
    cin >> size;
    unsigned int* a = new unsigned int[size / 16]; // <- input tab to encrypt
    unsigned int* b = new unsigned int[size / 16]; // <- output tab
    bool* a1 = new bool[size];
    bool* a2 = new bool[size];
    bool* bb = new bool[size * 2];
    for (int i = 0; i < size / 16; i++) { // Read size / 16 integers to a
        cin >> hex >> a[i];
    }
    for (int i = 0; i < size * 2; i++) {
        if (i < size) {
            a1[i] = (a[i / 32] & (1u << (i % 32))) > 0; // first `size` bits are for a1
        } else {
            a2[i - size] = (a[i / 32] & (1u << (i % 32))) > 0; // rest `size` bits are for a2
        }
    }
    for (int i = 0; i < size / 16; i++) { // Write size / 16 zeros to b
        b[i] = 0;
    }
    for (int i = 0; i < size * 2; i++) {
        bb[i] = 0;
    }
    for (int i = 0; i < size; i++)
        for (int j = 0; j < size; j++) {
            b[(i + j) / 32] ^= ( (a[i / 32] >> (i % 32)) &
                (a[j / 32 + size / 32] >> (j % 32)) & 1 ) << ((i + j) % 32); // Magic centaurian operation
        }
    for (int i = 0; i < size; i++)
        for (int j = 0; j < size; j++) {
            bb[i + j] ^= (a1[i] & a2[j] & 1); // same operation as multiply (*) does, with ^ instead of +
        }
    for (int i = 0; i < size / 16; i++) {
        if (i > 0) {
            cout << ' ';
        }
        cout << setfill('0') << setw(8) << hex << b[i]; // print result
    }
    cout << endl;
    for (int i = 0; i < size / 32 * 2; i++) {
        if (i > 0) {
            cout << ' ';
        }
        unsigned int hex_number = 0;
        for (int j = 0; j < 32; j++) hex_number += (unsigned int)bb[i * 32 + j] << j;
        cout << setfill('0') << setw(8) << hex << hex_number; // print result (matches the line above)
    }
    cout << endl;
    return 0;
}
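As a quick standalone check of the carry-less multiplication reading (a toy example of mine, not part of the puzzle), multiplying the 4-bit values 0110 and 0101 carry-lessly should give 11110:
#include <cstdio>

int main()
{
    unsigned int a = 0x6, b = 0x5, p = 0;         // 0110 and 0101
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            p ^= (((a >> i) & 1u) & ((b >> j) & 1u)) << (i + j);
    std::printf("0x%x\n", p);                     // prints 0x1e, i.e. binary 11110
    return 0;
}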

Related

Copy 80 bit hex number from char array to uint16_t vector or array

Say I have a text file containing the 80-bit hex number
0xabcdef0123456789abcd
My C++ program reads that using fstream into a char array called buffer.
But then I want to store it in a uint16_t array such that:
uint16_t key[] = {0xabcd, 0xef01, 0x2345, 0x6789, 0xabcd};
I have tried several approaches, but I continue to get decimal integers, for instance:
const std::size_t strLength = strlen(buffer);
std::vector<uint16_t> arr16bit((strLength / 2) + 1);
for (std::size_t i = 0; i < strLength; ++i)
{
arr16bit[i / 2] <<= 8;
arr16bit[i / 2] |= buffer[i];
}
Yields:
arr16bit = {24930, 25444, 25958, 12337, 12851}
There must be an easy way to do this that I'm just not seeing.
Here is the full solution I came up with based on the comments:
int hex_char_to_int(char c) {
    if (int(c) < 58)        // numbers
        return c - 48;
    else if (int(c) < 91)   // capital letters
        return c - 65 + 10;
    else if (int(c) < 123)  // lower case letters
        return c - 97 + 10;
    return -1;              // not a hex digit
}
uint16_t ints_to_int16(int i0, int i1, int i2, int i3) {
    return (i3 * 16 * 16 * 16) + (i2 * 16 * 16) + (i1 * 16) + i0;
}
void readKey() {
    const int bufferSize = 25;
    char buffer[bufferSize] = { 0 };
    ifstream* pStream = new ifstream("key.txt");
    if (pStream->is_open() == true)
    {
        pStream->read(buffer, bufferSize);
    }
    cout << buffer << endl;
    const size_t strLength = strlen(buffer);
    int* hex_to_int = new int[strLength - 2];
    for (size_t i = 2; i < strLength; i++) {
        hex_to_int[i - 2] = hex_char_to_int(buffer[i]);
    }
    cout << endl;
    uint16_t* key16 = new uint16_t[5];
    int j = 0;
    for (int i = 0; i < 5; i++) {
        // index the four digits explicitly; passing j++ four times in one call
        // relies on an unspecified argument evaluation order
        key16[i] = ints_to_int16(hex_to_int[j + 3], hex_to_int[j + 2], hex_to_int[j + 1], hex_to_int[j]);
        j += 4;
        cout << "0x" << hex << key16[i] << " ";
    }
    cout << endl;
}
This outputs:
0xabcdef0123456789abcd
0xabcd 0xef01 0x2345 0x6789 0xabcd
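For what it's worth, here is a shorter sketch of my own (not from the comments above) that parses the hex string four characters at a time with std::stoul, assuming the leading "0x" has already been stripped:
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

int main()
{
    std::string s = "abcdef0123456789abcd";          // hypothetical input with "0x" removed
    std::vector<uint16_t> key;
    for (std::size_t i = 0; i + 4 <= s.size(); i += 4)
        key.push_back(static_cast<uint16_t>(std::stoul(s.substr(i, 4), nullptr, 16)));
    for (uint16_t v : key)
        std::cout << "0x" << std::hex << v << ' ';   // 0xabcd 0xef01 0x2345 0x6789 0xabcd
    std::cout << '\n';
    return 0;
}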

Implementation of the SHA256 algorithm does not return the expected result

With the implementation below, based on the pseudo-code available here, I am trying to convert a string generated by concatenating the members of this class:
class BlockHeader
{
private:
int version;
string hashPrevBlock;
string hashMerkleRoot;
int time;
int bits;
int nonce;
}
into a SHA256 hash, like what was done with the python code below, available here:
>>> import hashlib
>>> header_hex = ("01000000" +
"81cd02ab7e569e8bcd9317e2fe99f2de44d49ab2b8851ba4a308000000000000" +
"e320b6c2fffc8d750423db8b1eb942ae710e951ed797f7affc8892b0f1fc122b" +
"c7f5d74d" +
"f2b9441a" +
"42a14695")
>>> header_bin = header_hex.decode('hex')
>>> hash = hashlib.sha256(hashlib.sha256(header_bin).digest()).digest()
>>> hash.encode('hex_codec')
'1dbd981fe6985776b644b173a4d0385ddc1aa2a829688d1e0000000000000000'
>>> hash[::-1].encode('hex_codec')
'00000000000000001e8d6829a8a21adc5d38d0a473b144b6765798e61f98bd1d'
I expected my program to return the same result as the program above, but instead, when I compile and run this:
int main() {
BlockHeader header;
header.setVersion(0x01000000);
header.setHashPrevBlock("81cd02ab7e569e8bcd9317e2fe99f2de44d49ab2b8851ba4a308000000000000");
header.setHashMerkleRoot("e320b6c2fffc8d750423db8b1eb942ae710e951ed797f7affc8892b0f1fc122b");
header.setTime(0xc7f5d74d);
header.setBits(0xf2b9441a);
header.setNonce(0x42a14695);
Sha256 hash1(header.bytes());
array<BYTE, SHA256_BLOCK_SIZE> h1 = hash1.hash();
cout << "hash1: ";
for(int i=0; i<h1.size(); i++)
printf("%.2x", h1[i]);
printf("\n");
Sha256 hash2(h1);
array<BYTE, SHA256_BLOCK_SIZE> h2 = hash2.hash();
cout << "hash2: ";
for(int i=0; i<h2.size(); i++)
printf("%.2x", h2[i]);
printf("\n");
}
the result is that:
hash1: e2245204380a75c6bc6ac56f0000000040030901000000001100011000000000
hash2: 68a74f2a36c8906068c6cd6f00000000020000000000000080a7d06f00000000
I am aware the endianness in my program is not the same as in the Python result, but I can fix that later, once I get the correct result. Looking at the code below, can anyone give a hint about what I am missing here?
#define ROTLEFT(a,b) (((a) << (b)) | ((a) >> (32-(b))))
#define ROTRIGHT(a,b) (((a) >> (b)) | ((a) << (32-(b))))
#define CH(x,y,z) (((x) & (y)) ^ (~(x) & (z)))
#define MAJ(x,y,z) (((x) & (y)) ^ ((x) & (z)) ^ ((y) & (z)))
#define EP0(x) (ROTRIGHT(x,2) ^ ROTRIGHT(x,13) ^ ROTRIGHT(x,22))
#define EP1(x) (ROTRIGHT(x,6) ^ ROTRIGHT(x,11) ^ ROTRIGHT(x,25))
#define SIG0(x) (ROTRIGHT(x,7) ^ ROTRIGHT(x,18) ^ ((x) >> 3))
#define SIG1(x) (ROTRIGHT(x,17) ^ ROTRIGHT(x,19) ^ ((x) >> 10))
Sha256::Sha256(vector<BYTE> data) {
SIZE64 L = data.size() / 2;
SIZE64 K = 0;
while( (L + 1 + K + 8) % 64 != 0)
K = K + 1;
for(int i=0; i<L; i++) {
BYTE c = (data[i] % 32 + 9) % 25 * 16 + (data[i+1] % 32 + 9) % 25;
source.push_back(c);
}
source.push_back(0x80);
for(int i=0; i<K; i++)
source.push_back(0x00);
SIZE64 x = L + 1 + K + 8;
for(int i=0; i<sizeof(x); i++)
source.push_back( x >> i*8 );
}
Sha256::Sha256(array<BYTE, SHA256_BLOCK_SIZE> data) {
SIZE64 L = data.size() / 2;
SIZE64 K = 0;
while( (L + 1 + K + 8) % 64 != 0)
K = K + 1;
for(int i=0; i<L; i++) {
BYTE c = (data[i] % 32 + 9) % 25 * 16 + (data[i+1] % 32 + 9) % 25;
source.push_back(c);
}
source.push_back(0x80);
for(int i=0; i<K; i++)
source.push_back(0x00);
SIZE64 x = L + 1 + K + 8;
for(int i=0; i<sizeof(x); i++)
source.push_back( x >> i*8 );
}
array<BYTE, SHA256_BLOCK_SIZE> Sha256::hash() {
array<BYTE, SHA256_BLOCK_SIZE> result;
WORD32 h0 = 0x6a09e667, h1 = 0xbb67ae85, h2 = 0x3c6ef372, h3 = 0xa54ff53a, h4 = 0x510e527f, h5 = 0x9b05688c, h6 = 0x1f83d9ab, h7 = 0x5be0cd19;
WORD32 k[64] = {0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5, 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174, 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da, 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967, 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85, 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070, 0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3, 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2};
WORD32 a, b, c, d, e, f, g, h, i, j, t1, t2, m[64];
for(int chunk=0; chunk<=source.size()/64; chunk++) {
for (i = 0, j = chunk*64; i < 16; ++i, j += 4)
m[i] = (source[j] << 24) | (source[j + 1] << 16) | (source[j + 2] << 8) | (source[j + 3]);
for ( ; i < 64; ++i)
m[i] = SIG1(m[i - 2]) + m[i - 7] + SIG0(m[i - 15]) + m[i - 16];
a = h0;
b = h1;
c = h2;
d = h3;
e = h4;
f = h5;
g = h6;
h = h7;
for (i = 0; i < 64; ++i) {
t1 = h + EP1(e) + CH(e,f,g) + k[i] + m[i];
t2 = EP0(a) + MAJ(a,b,c);
h = g;
g = f;
f = e;
e = d + t1;
d = c;
c = b;
b = a;
a = t1 + t2;
}
h0 += a;
h1 += b;
h2 += c;
h3 += d;
h4 += e;
h5 += f;
h6 += g;
h7 += h;
}
for(int i=0; i<4; i++) result[0] = h0 >> i;
for(int i=0; i<4; i++) result[1] = h1 >> i;
for(int i=0; i<4; i++) result[2] = h2 >> i;
for(int i=0; i<4; i++) result[3] = h3 >> i;
for(int i=0; i<4; i++) result[4] = h4 >> i;
for(int i=0; i<4; i++) result[5] = h5 >> i;
for(int i=0; i<4; i++) result[6] = h6 >> i;
for(int i=0; i<4; i++) result[7] = h7 >> i;
return result;
}
In the Sha256::hash function, result is a BYTE array, whereas h0 is a WORD32. You might want to split h0 into 4 BYTEs and store into the result array, but the for loop at the end of the function won't achieve your goal.
What you want to do is to concatenate h0 to h7, and then extract the bytes from h0 to h7 by shifting 24, 16, 8, 0 bits:
// concatenate h0 to h7
WORD32 hs[8] = {h0, h1, h2, h3, h4, h5, h6, h7};
// extract bytes from hs to result
for(int i=0; i<8; i++) { // loop from h0 to h7
result[i*4 ] = hs[i] >> 24; // the most significant byte of h_i
result[i*4+1] = hs[i] >> 16;
result[i*4+2] = hs[i] >> 8;
result[i*4+3] = hs[i]; // the least significant byte of h_i
}
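If it helps to sanity-check the shift amounts, here is a tiny standalone example (illustrative only, not part of your code or the fix above):
#include <cstdint>
#include <cstdio>

int main()
{
    uint32_t h0 = 0x6a09e667;
    uint8_t bytes[4] = { uint8_t(h0 >> 24), uint8_t(h0 >> 16), uint8_t(h0 >> 8), uint8_t(h0) };
    for (int i = 0; i < 4; i++)
        std::printf("%02x", (unsigned)bytes[i]);   // prints 6a09e667, most significant byte first
    std::printf("\n");
    return 0;
}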
EDIT
After some testing, I found another error:
for(int chunk=0; chunk<=source.size()/64; chunk++) {
^^
should be
for(int chunk=0; chunk<source.size()/64; chunk++) {
^
chunk starts from 0, so you should use < instead of <=.
For example, when source.size() is 64, you only have 1 chunk to process.
EDIT2
I fully tested your code and found two problems in the constructors of the Sha256 class.
Your code implies that you assume the vector<BYTE> passed to the constructor is a hex string. That is OK, but you use the same code for the array<BYTE, SHA256_BLOCK_SIZE> version, which is the return type of the hash() function; that is a raw BYTE array, not a hex string.
For a BYTE array, you can simply push the byte data[i] into source. Also, L should be data.size(), because every element of a byte array is one byte.
Besides, you try to append the size of the input (x) to source, but x should not include the appended one-bit and zeros; it is the bit count of the original input, so x should simply be L*8. Also, the size should be appended as a big-endian integer, so you have to push the most significant byte first:
for(int i=0; i<sizeof(x); i++) // WRONG: little endian
for(int i=sizeof(SIZE64)-1; i>=0; i--) // Correct: big endian
I have made it execute correctly and output:
hash1: b9d751533593ac10cdfb7b8e03cad8babc67d8eaeac0a3699b82857dacac9390
hash2: 1dbd981fe6985776b644b173a4d0385ddc1aa2a829688d1e0000000000000000
If you encounter other problems, feel free to ask. You are very close to the correct answer. Hope you can fix all the bugs successfully :)
EDIT3: implementation of other function
struct BlockHeader {
int version;
string hashPrevBlock;
string hashMerkleRoot;
int time;
int bits;
int nonce;
vector<BYTE> bytes();
};
#define c2x(x) (x>='A' && x<='F' ? (x-'A'+10) : x>='a' && x<='f' ? (x-'a'+10) : x-'0')
vector<BYTE> BlockHeader::bytes() {
vector<BYTE> bytes;
for (int i=24; i>=0; i-=8) bytes.push_back(version>>i);
for (int i=0; i<hashPrevBlock.size(); i+=2)
bytes.push_back(c2x(hashPrevBlock[i])<<4 | c2x(hashPrevBlock[i+1]));
for (int i=0; i<hashMerkleRoot.size(); i+=2)
bytes.push_back(c2x(hashMerkleRoot[i])<<4 | c2x(hashMerkleRoot[i+1]));
for (int i=24; i>=0; i-=8) bytes.push_back(time>>i);
for (int i=24; i>=0; i-=8) bytes.push_back(bits>>i);
for (int i=24; i>=0; i-=8) bytes.push_back(nonce>>i);
return bytes; // return bytes instead of hex string
}
// exactly the same as the vector<BYTE> version
Sha256::Sha256(array<BYTE, SHA256_BLOCK_SIZE> data) {
SIZE64 L = data.size(); // <<
SIZE64 K = 0;
while( (L + 1 + K + 8) % 64 != 0)
K = K + 1;
// can be simplified to: int K = (128-1-8-L%64)%64;
// ** thanks to "chux - Reinstate Monica" pointing out i should be a SIZE64
for(SIZE64 i=0; i<L; i++) { // **
source.push_back(data[i]); // <<
}
source.push_back(0x80);
for(int i=0; i<K; i++)
source.push_back(0x00);
SIZE64 x = L*8; // <<
for(int i=sizeof(SIZE64)-1; i>=0; i--) { // big-endian
source.push_back(x >> i*8);
}
}
EDIT4: variable size in for loop
As "chux - Reinstate Monica" pointed out, it may be a problem if the size of the data is bigger than INT_MAX. All for-loop using a size as the upper limit should use a size_t type counter(instead of int) to prevent this problem.
// in BlockHeader::bytes()
for (size_t i=0; i<hashPrevBlock.size(); i+=2)
// in Sha256::hash()
for (size_t chunk=0; chunk<source.size()/64; chunk++)
// in main()
for (size_t i=0; i<h1.size(); i++)
for (size_t i=0; i<h2.size(); i++)
Notice that size_t is unsigned. The reverse version won't work because i is never less than 0.
for (size_t i=data.size()-1; i>=0; i--) // infinite loop
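If you really need to walk backwards with an unsigned index, one common pattern (my own note, not required by the fix above) is to decrement inside the condition:
#include <cstdio>
#include <vector>

int main()
{
    std::vector<int> data = {1, 2, 3};
    for (std::size_t i = data.size(); i-- > 0; )   // i takes the values 2, 1, 0, then the loop stops
        std::printf("%d ", data[i]);               // prints 3 2 1
    std::printf("\n");
    return 0;
}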

Implicit conversion or cast?

I have a function that interleaves the bits of 32 bit words and returns a 64 bit result. For this simple test case, the bottom 3 bytes are correct, and the contents of the top 5 bytes are incorrect. intToBin_32 and intToBin_64 are convenience functions to see the binary representation of the arguments and return val. I've placed casts from the 32 bit type to the 64 bit type everywhere I think they are needed, but I'm still seeing this unexpected (to me, at least) behavior. Is there an implicit conversion going on here, or is there some other reason this doesn't work correctly?
#include <stdint.h>
#include <stdio.h>
struct intString_32 {char bstr [32 + 1 + 8];};
struct intString_64 { char bstr [64 + 1 + 8];};
intString_32 intToBin_32(int a)
{
intString_32 b;
for (int i = 0; i < 8; i++)
{
for (int j = 0; j < 5; j++)
{
if (j != 4)
{
b.bstr[5*i + j] = * ((a & (1 << (31 - (4*i + j)))) ? "1" : "0");
}
else
{
b.bstr[5*i + j] = 0x20;
}
}
}
b.bstr[40] = * ( "\0" );
return b;
}
intString_64 intToBin_64(long long a)
{
intString_64 b;
for (int i = 0; i < 8; i++)
{
for (int j = 0; j < 9; j++)
{
if (j != 8)
{
b.bstr[9*i + j] = * ((a & (1 << (63 - (8*i + j)))) ? "1" : "0");
}
else
{
b.bstr[9*i + j] = 0x20;
}
}
}
b.bstr[72] = * ( "\0" );
return b;
}
uint64_t interleaveBits(unsigned int a, unsigned int b)
{
uint64_t retVal = 0;
for (unsigned int i = 0; i < 32; i++)
{
retVal |= (uint64_t)((uint64_t)((a >> i) & 0x1)) << (2*i);
retVal |= (uint64_t)((uint64_t)((b >> i) & 0x1)) << (2*i + 1);
}
return retVal;
}
int main(int arc, char* argv)
{
unsigned int foo = 0x0004EDC7;
unsigned int bar = 0x5A5A00FF;
uint64_t bat = interleaveBits(foo, bar);
printf("foo: %s \n", intToBin_32(foo).bstr);
printf("bar: %s \n", intToBin_32(bar).bstr);
printf("bat: %s \n\n", intToBin_64(bat).bstr);
}
Through debugging I noticed it's your intToBin_64 which is wrong, to be specific, in this line:
b.bstr[9*i + j] = * ((a & (1 << (63 - (8*i + j)))) ? "1" : "0");
take a closer look on the shift:
(1 << (63 - (8*i + j)))
The literal 1 is an int, and shifting an int by more than 31 bits is undefined behavior. Shift a long long instead:
b.bstr[9*i + j] = * ((a & (1ll << (63 - (8*i + j)))) ? "1" : "0");
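A slightly safer variant (my suggestion, not strictly required) is an unsigned 64-bit constant, so even the shift by 63 can never touch a sign bit:
b.bstr[9*i + j] = * ((a & (1ull << (63 - (8*i + j)))) ? "1" : "0");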

addition using bitwise operators

So the idea of my class is to take a string of digits, const char* s = "123456654987", and store each pair of digits in one byte:
num[0] = 12, num[1] = 34, and so on...
This is how I did it:
unsigned char* num;
num = new unsigned char[strlen(s)/2 + strlen(s)%2];
if (strlen(s)%2 == 1)
    num[0] = s[0]-'0';
unsigned int i;
int j = strlen(s)%2;
for (i = strlen(s)%2; i < strlen(s); i += 2)
{
    int left = s[i] - '0';
    int right = s[i+1] - '0';
    num[j] = left << 4;
    num[j] |= right;
    j++;
}
For example, num[0] = 12 is represented in memory as 00010010 (packed decimal), not as 00001100 (plain binary 12).
But now that I'm trying to overload the += operator, I don't know how to proceed.
My best try was this, but even I know it is not going to work:
int i, sum, carry = 0;
for (i = this->size-1; i >= 0; i--)
{
    sum = ((num[i] ^ rhs.num[i]) ^ carry);
    carry = ((num[i] & rhs.num[i]) | (num[i] & carry)) | (rhs.num[i] & carry);
    num[i] = sum;
}
Any help, guys?
You will need to do the addition one decimal digit (4 bits) at a time, because 9 + 9 = 18 and 18 won't fit in 4 bits.
XORing multi-bit digits, however, is not the correct operation; the correct algorithm for the sum is something like:
int carry = 0;
for (int i = n - 1; i >= 0; i--) {                   // n packed bytes; the least significant digit pair is the last byte
    int lo = (a[i] & 15) + (b[i] & 15) + carry;      // right (low-nibble) digit
    carry = lo > 9;
    if (carry) lo -= 10;
    int hi = (a[i] >> 4) + (b[i] >> 4) + carry;      // left (high-nibble) digit
    carry = hi > 9;
    if (carry) hi -= 10;
    result[i] = (hi << 4) | lo;
}
Working in assembler, many processors support detection of an overflow out of the lower 4 bits when doing an operation, and there are specific instructions to "fix" the result so that it becomes the correct two-digit binary-coded decimal representation (e.g. x86 provides the DAA instruction to fix the result of an addition).
Working at the C level, however, this machinery is not available.
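To make the digit-by-digit carry concrete, here is a tiny standalone example of mine (not the asker's operator+=) that adds two packed decimal bytes:
#include <cstdio>

int main()
{
    unsigned int a = 0x47, b = 0x85, carry = 0;        // packed decimal for 47 and 85
    unsigned int lo = (a & 15) + (b & 15) + carry;     // 7 + 5 = 12
    carry = lo > 9; if (carry) lo -= 10;               // digit 2, carry 1
    unsigned int hi = (a >> 4) + (b >> 4) + carry;     // 4 + 8 + 1 = 13
    carry = hi > 9; if (carry) hi -= 10;               // digit 3, carry 1
    std::printf("0x%x%x carry %u\n", hi, lo, carry);   // 0x32 carry 1, i.e. 47 + 85 = 132
    return 0;
}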

How can I pad my md5 message with c/c++

I'm working on a program in c++ to do md5 checksums. I'm doing this mainly because I think I'll learn a lot of different things about c++, checksums, OOP, and whatever else I run into.
I'm having trouble with the checksums, and I think the problem is in the function padbuff, which does the message padding.
#include "HashMD5.h"
int leftrotate(int x, int y);
void padbuff(uchar * buffer);
//HashMD5 constructor
HashMD5::HashMD5()
{
Type = "md5";
Hash = "";
}
HashMD5::HashMD5(const char * hashfile)
{
Type = "md5";
std::ifstream filestr;
filestr.open(hashfile, std::fstream::in | std::fstream::binary);
if(filestr.fail())
{
std::cerr << "File " << hashfile << " was not opened.\n";
std::cerr << "Open failed with error ";
}
}
std::string HashMD5::GetType()
{
return this->Type;
}
std::string HashMD5::GetHash()
{
return this->Hash;
}
bool HashMD5::is_open()
{
return !((this->filestr).fail());
}
void HashMD5::CalcHash(unsigned int * hash)
{
unsigned int *r, *k;
int r2[4] = {0, 4, 9, 15};
int r3[4] = {0, 7, 12, 19};
int r4[4] = {0, 4, 9, 15};
uchar * buffer;
int bufLength = (2<<20)*8;
int f,g,a,b,c,d, temp;
int *head;
uint32_t maxint = 1<<31;
//Initialized states
unsigned int h[4]{ 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476};
r = new unsigned int[64];
k = new unsigned int[64];
buffer = new uchar[bufLength];
if(r==NULL || k==NULL || buffer==NULL)
{
std::cerr << "One of the dyn alloc failed\n";
}
// r specifies the per-round shift amounts
for(int i = 0; i<16; i++)
r[i] = 7 + (5 * ((i)%4) );
for(int i = 16; i < 32; i++)
r[i] = 5 + r2[i%4];
for(int i = 32; i< 48; i++)
r[i] = 4 + r3[i%4];
for(int i = 48; i < 63; i++)
r[i] = 6 + r4[i%4];
for(int i = 0; i < 63; i++)
{
k[i] = floor( fabs( sin(i + 1)) * maxint);
}
while(!(this->filestr).eof())
{
//Read in 512 bits
(this->filestr).read((char *)buffer, bufLength-512);
padbuff(buffer);
//The 512 bits are now 16 32-bit ints
head = (int *)buffer;
for(int i = 0; i < 64; i++)
{
if(i >=0 && i <=15)
{
f = (b & c) | (~b & d);
g = i;
}
else if(i >= 16 && i <=31)
{
f = (d & b) | (~d & b);
g = (5*i +1) % 16;
}
else if(i >=32 && i<=47)
{
f = b ^ c ^ d;
g = (3*i + 5 ) % 16;
}
else
{
f = c ^ (b | ~d);
g = (7*i) % 16;
}
temp = d;
d = c;
c = b;
b = b + leftrotate((a + f + k[i] + head[g]), r[i]);
a = temp;
}
h[0] +=a;
h[1] +=b;
h[2] +=c;
h[3] +=d;
}
delete[] r;
delete[] k;
hash = h;
}
int leftrotate(int x, int y)
{
return(x<<y) | (x >> (32 -y));
}
void padbuff(uchar* buffer)
{
int lack;
int length = strlen((char *)buffer);
uint64_t mes_size = length % UINT64_MAX;
if((lack = (112 - (length % 128) ))>0)
{
*(buffer + length) = ('\0'+1 ) << 3;
memset((buffer + length + 1),0x0,lack);
memcpy((void*)(buffer+112),(void *)&mes_size, 64);
}
}
In my test program I run this on an empty message, so length in padbuff is 0. Then when I do *(buffer + length) = ('\0'+1 ) << 3;, I'm trying to pad the message with a 1. In the NetBeans debugger I cast buffer as a uint64_t and it says buffer=8. I was trying to put a 1 bit in the most significant spot of buffer, so the value shown should have been UINT64_MAX. It's not, so I'm confused about how my padding code works. Can someone tell me what I'm actually doing and what I'm supposed to do in padbuff? Thanks, and I apologize for the long freaking question.
Just to be clear about what the padding is supposed to be doing, here is the padding excerpt from Wikipedia:
The message is padded so that its length is divisible by 512. The padding works as follows: first a single bit, 1, is appended to the end of the message. This is followed by as many zeros as are required to bring the length of the message up to 64 bits fewer than a multiple of 512. The remaining bits are filled up with 64 bits representing the length of the original message, modulo 2^64.
I'm mainly looking for help for padbuff, but since I'm trying to learn all comments are appreciated.
The first question is what you did:
length % UINT64_MAX doesn't make sense at all, because length is in bytes and UINT64_MAX is the largest value a uint64_t can hold.
You thought that putting a 1 bit in the most significant position would give the maximum value. In fact, you need to set all bits to 1 to get it.
You shift 1 left by 3, which only gets you halfway up a single byte.
The byte pointed to by buffer is the least significant one when read as a little-endian uint64_t (I assume you have a little-endian machine, since the debugger showed 8).
The second question is how it should work.
I don't know exactly what padbuff should do, but if you want to pad and get UINT64_MAX, you need something like this:
int length = strlen((char *)buffer);
int len_of_padding = sizeof(uint64_t) - length % sizeof(uint64_t);
if (len_of_padding > 0)
{
    memset((void*)(buffer + length), 0xFF, len_of_padding);
}
You worked with the length of two uint64 values. Maybe you wanted to zero the next one:
uint64_t *after = (uint64_t*)(buffer + length + len_of_padding);
*after = 0;
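For reference, here is a minimal sketch of my own (not the asker's padbuff) of the MD5-style padding described in the Wikipedia excerpt above, assuming the whole message fits in a std::vector: append 0x80, zero-fill until the length is 56 mod 64, then append the original length in bits as a little-endian 64-bit integer.
#include <cstdint>
#include <cstdio>
#include <vector>

std::vector<uint8_t> md5_pad(std::vector<uint8_t> msg)
{
    uint64_t bit_len = static_cast<uint64_t>(msg.size()) * 8;    // length of the original message in bits
    msg.push_back(0x80);                                         // a single 1 bit followed by seven 0 bits
    while (msg.size() % 64 != 56)
        msg.push_back(0x00);                                     // zeros up to 64 bits short of a multiple of 512
    for (int i = 0; i < 8; ++i)
        msg.push_back(static_cast<uint8_t>(bit_len >> (8 * i))); // the original bit length, little-endian as MD5 uses
    return msg;
}

int main()
{
    std::vector<uint8_t> padded = md5_pad({});   // an empty message pads out to one 64-byte block
    std::printf("%zu\n", padded.size());         // prints 64
    return 0;
}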