Counting the number of bits required to represent an integer in 2's complement - bit-manipulation

I have to write a function that counts the number of bits required to represent an int in 2's complement form. The requirements:
1. can only use: ! ~ & ^ | + << >>
2. no loops and conditional statement
3. at most, 90 operators are used
Currently, I am thinking of something like this:
int howManyBits(int x) {
    int mostdigit1 = !!(0x80000000 & x);
    int mostdigit2 = mostdigit1 | !!(0x40000000 & x);
    int mostdigit3 = mostdigit2 | !!(0x20000000 & x);
    // and so on until it reaches the least significant digit
    return mostdigit1+mostdigit2+...+mostdigit32+1;
}
However, this algorithm doesn't work. It also exceeds the 90-operator limit. Any suggestions on how I can fix and improve this algorithm?

With 2's complement integers, the difficulty is the negative numbers. A negative number is indicated by the most significant bit: if it is set, the number is negative.
The negative of a 2's complement integer n is (1's complement of n) + 1, i.e. -n == ~n + 1.
Thus, I would first test the sign bit. If it is set, replace n by its 1's complement ~n, which is non-negative and needs exactly as many bits as n does. Then count the value bits by repeatedly shifting n right by one bit until the result is zero, and add one bit for the sign. If n is, e.g., +1 (000…001), you have to shift it right once to make the result zero; together with the sign bit you need 2 bits to represent it.
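The shift-and-count idea can also be written without loops or conditionals, which is what the question's restrictions ask for. The sketch below is one possible approach, not necessarily the intended solution. It assumes a 32-bit int and an arithmetic right shift for negative values (implementation-defined in C), and it counts the sign bit, so 0 and -1 need 1 bit while +1 needs 2. Negative inputs are folded onto non-negative ones with x ^ (x >> 31), then a binary search finds the highest set bit.

```c
/* Sketch: assumes 32-bit int and arithmetic >> on negative values. */
int howManyBits(int x) {
    int sign = x >> 31;            /* 0 for x >= 0, all ones for x < 0 */
    int b16, b8, b4, b2, b1, b0;

    x = x ^ sign;                  /* x stays as is, or becomes ~x if negative */

    b16 = !!(x >> 16) << 4;        /* 16 if any of the upper 16 bits is set */
    x = x >> b16;
    b8 = !!(x >> 8) << 3;          /* 8 if any of the next 8 bits is set */
    x = x >> b8;
    b4 = !!(x >> 4) << 2;
    x = x >> b4;
    b2 = !!(x >> 2) << 1;
    x = x >> b2;
    b1 = !!(x >> 1);
    x = x >> b1;
    b0 = x;                        /* 1 if a value bit is left, else 0 */

    return b16 + b8 + b4 + b2 + b1 + b0 + 1;   /* +1 for the sign bit */
}
```

This uses only the permitted operators and stays well under the 90-operator limit.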

Related

Set every nth bit in an integer without for loop

Is there a way to set every nth bit in an integer without using a for loop?
For example, if n = 3, then the result should be ...100100100100. This is easy enough with a for loop, but I am curious if this can be done without one.
--
For my particular application, I need to do this with a custom 256-bit integer type that has all the bit operations that a built-in integer has. I'm currently using lazily initialized tables (using for loops) and that is good enough for what I'm doing. This was mostly an exercise in bit-twiddling for me, but I couldn't figure out how to do it in a few steps/instructions, and couldn't easily find anything online about this.
… I need to do this with a custom 256-bit integer type.
Set r to 256 % n.
Set d to ((uint256_t) 1 << n) - 1. Then the binary representation of d is a string of n 1 bits.
Set t to UINT256_MAX << r >> r. This removes the top r bits from UINT256_MAX. UINT256_MAX is of course 2^256−1. This leaves t as a string of 256−r 1 bits, and 256−r is some multiple of n, say k*n.
Set t to t/d. As a string of k*n 1 bits divided by a string of n 1 bits, this produces a quotient that is 000…0001 repeated k times, where each 000…0001 is n-1 0 bits followed by one 1 bit.
Now t is the desired bit pattern except the highest desired bit may be missing if r is not zero. To add this bit, if needed, OR t with t << n.
Now t is the desired value.
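As a rough illustration, here are the same steps scaled down to a 64-bit unsigned integer, since standard C has no 256-bit type. The function name every_nth_bit_div is made up for this sketch, and it assumes 1 <= n <= 63 so that the shift 1 << n stays defined.

```c
#include <stdint.h>

/* Sketch of the division method on 64 bits; requires 1 <= n <= 63. */
uint64_t every_nth_bit_div(unsigned n) {
    unsigned r = 64 % n;                    /* bits left over at the top */
    uint64_t d = ((uint64_t)1 << n) - 1;    /* n one-bits */
    uint64_t t = UINT64_MAX;

    t = (t << r) >> r;                      /* 64 - r one-bits, a multiple of n */
    t = t / d;                              /* 00...01 repeated (64 - r) / n times */
    t = t | (t << n);                       /* restore the top bit lost when r != 0 */
    return t;
}
```

For n = 3 this yields bits 0, 3, 6, ..., 63.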
Alternately:
Set t to 1.
OR t with t << n.
OR t with t << 2*n.
OR t with t << 4*n.
OR t with t << 8*n.
OR t with t << 16*n.
OR t with t << 32*n.
OR t with t << 64*n.
OR t with t << 128*n.
Those shifts must be defined (shifting by zero would suffice) or suppressed when the shift amount exceeds the integer width, 256 bits.
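The doubling variant might look like this on a 64-bit integer, where six steps (shifts by n up to 32*n) are enough. The helper shl_or_zero is a hypothetical name for this sketch; it suppresses any shift of 64 or more, which would otherwise be undefined in C. The sketch assumes n >= 1 and that 32*n does not overflow unsigned.

```c
#include <stdint.h>

/* Shift left, but return 0 when the shift amount reaches the width. */
static uint64_t shl_or_zero(uint64_t v, unsigned s) {
    return s < 64 ? v << s : 0;
}

/* Sketch of the doubling method on 64 bits; requires n >= 1. */
uint64_t every_nth_bit_shift(unsigned n) {
    uint64_t t = 1;
    t |= shl_or_zero(t, n);        /* bits 0 and n */
    t |= shl_or_zero(t, 2 * n);    /* 4 bits set */
    t |= shl_or_zero(t, 4 * n);    /* 8 bits set */
    t |= shl_or_zero(t, 8 * n);
    t |= shl_or_zero(t, 16 * n);
    t |= shl_or_zero(t, 32 * n);   /* up to 64 bits set */
    return t;
}
```

Each step doubles the number of set bits, so a 256-bit type needs the eight steps listed above.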

Why is the result of a bitwise shift unrecoverable if there is a mathematical equivalent of the same operation?

Take for example the number 91. That number in binary is 1011011. If you shift that number to the right by 5 bits, you would get 2 (10 in binary). According to a Google search, bit shifting to the left or right by a certain number of bits is the same as multiplying or dividing the number by 2 to the power of the number of bits to be shifted, respectively. So to get from 91 to 2 by bit shifting, the equation would look like this: 91 / 2^5, which is also 91 / 32. Now, of course, if you did that on your calculator, there would be some decimal places, which aren't included when bit shifting. The resulting 2 is actually 2.84375. I'm sure you know that if you do a certain operation on a number and then you do the inverse, the result would be what you had in the first place. So does decimal precision have something to do with this?
There is a mathematical equivalent of shifting to the right... and the mathematical operation is UNRECOVERABLE.
You seem to think that shifting to the right is:
bit shifting to the left or right by a certain amount of bits is the same as multiplying or dividing the number by 2
This is what you will hear people casually say, but it is only half right: it is not the same, only similar.
The correct statement is:
shifting a base-2 number one digit to the right is THE SAME as dividing by two in the integer domain
If you have an integer calculator, if you did 91/32 you will get 2. You will not get ANY decimal point because we are operating in the integer domain.
For real numbers, the equivalent operation is:
FLOOR(91/32)
Which is also unrecoverable because it also results in 2.
The lesson here is to be careful when listening to what people CASUALLY say. Casual speech is often imprecise and assumes the listener is familiar with the subject. You need to dig deeper into what the statement is actually trying to say.
As for why it is unrecoverable: division of integers gives two results, the quotient (which is the main result) and the remainder. When we divide 91 by 32 we are doing this:
       2
     ____
32 ) 91
     64
     __
     27
So we get the result of 2 and a remainder of 27. The reason you can't get 91 by multiplying 2*32 is because we threw away the remainder.
You can get the result back if you saved the remainder. However, calculating the remainder is not a matter of a simple shift. Here's an example of how to make it reversible in C:
int test () {
    int a = 91;
    int b = 32;
    int result;
    int remainder;
    result = a / b; // result will be 2
    remainder = a % b; // remainder will be 27
    return (result * b) + remainder; // returns 91
}
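When the divisor is a power of two, the remainder is exactly the bits that the right shift drops, so it can be saved with a mask before shifting. The sketch below shows this for unsigned 32-bit values. The struct shifted_t and both function names are made up for the illustration, and n is assumed to be in the range 0..31.

```c
#include <stdint.h>

/* Result of a right shift together with the bits it discarded. */
typedef struct {
    uint32_t quotient;   /* x >> n */
    uint32_t dropped;    /* low n bits of x, i.e. x % 2^n */
} shifted_t;

/* Shift right by n (0 <= n <= 31) and remember the discarded bits. */
shifted_t shift_right_keep(uint32_t x, unsigned n) {
    shifted_t s;
    s.quotient = x >> n;
    s.dropped = x & (((uint32_t)1 << n) - 1);
    return s;
}

/* Undo the shift using the saved bits. */
uint32_t shift_restore(shifted_t s, unsigned n) {
    return (s.quotient << n) | s.dropped;
}
```

For 91 and n = 5 the quotient is 2 and the dropped bits are 27, matching the long division above.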
You can only recover the result of an operation if it has a 1-1 mapping between the inputs and outputs, i.e. it has an inverse function. But not all mathematical functions have an inverse function.
For example, if f(x) = x >> n, where >> is the shift operator, then it is equivalent to
f(x) = ⌊x/2^n⌋
with ⌊ ⌋ being the floor function. Since there are many inputs that lead to the same output, the relationship isn't 1-1 and there can't be an inverse function for it. This function works the same for both unsigned and (arithmetic) signed right shift:
91 >> 5 == floor(91.0/32.0) == 2
-91 >> 5 == floor(-91.0/32.0) == -3
Similarly, for an unsigned left shift function g(x) = x << n, the equivalent is
g(x) = (x * 2^n) mod 2^N
with N being the size in bits of x, because integer math in hardware, C (for unsigned types) and many other languages is reduced modulo 2^N due to the limit of register size and the use of two's complement. And it's clear that the modulo function also isn't invertible/recoverable. The signed left shift is almost the same with some small modifications.
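A small sketch can make the left-shift case concrete for N = 32. The names shl32 and shl32_math are invented for this example: the first uses the shift operator, the second evaluates (x * 2^n) mod 2^32 in 64-bit arithmetic. Both assume 0 <= n <= 31.

```c
#include <stdint.h>

/* Unsigned left shift on a 32-bit value, 0 <= n <= 31. */
uint32_t shl32(uint32_t x, unsigned n) {
    return x << n;
}

/* The same result computed as (x * 2^n) mod 2^32 in wider arithmetic. */
uint32_t shl32_math(uint32_t x, unsigned n) {
    uint64_t product = (uint64_t)x * ((uint64_t)1 << n);
    return (uint32_t)(product % ((uint64_t)1 << 32));
}
```

Two different inputs, such as 0x00000001 and 0x80000001, both shift left by one to 2, so the original value cannot be recovered from the result alone.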

What is the purpose of "int mask = ~0;"?

I saw the following line of code here in C.
int mask = ~0;
I have printed the value of mask in C and C++. It always prints -1.
So I do have some questions:
Why assigning value ~0 to the mask variable?
What is the purpose of ~0?
Can we use -1 instead of ~0?
It's a portable way to set all the binary bits in an integer to 1 bits without having to know how many bits are in the integer on the current architecture.
Before C23 and C++20, C and C++ allowed 3 different signed integer formats: sign-magnitude, one's complement and two's complement.
~0 will produce all-one bits regardless of the sign format the system uses, so it's more portable than -1.
You can add the U suffix (i.e. -1U) to generate an all-one bit pattern portably [1]. However, ~0 indicates the intention more clearly: invert all the bits in the value 0, whereas -1 suggests that a value of minus one is needed, not its binary representation.
[1] Because unsigned operations are always reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
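For unsigned types the two spellings can be compared directly. The sketch below uses made-up function names. It relies only on the guarantee that unsigned arithmetic wraps modulo UINT_MAX + 1, so both results should equal UINT_MAX on any conforming implementation.

```c
#include <limits.h>

/* All bits set by inverting zero. */
unsigned all_ones_not(void) {
    return ~0u;
}

/* All bits set by converting -1 to unsigned (wraps modulo UINT_MAX + 1). */
unsigned all_ones_minus_one(void) {
    return (unsigned)-1;
}

/* AND with an all-ones mask leaves every bit of v unchanged. */
unsigned keep_all_bits(unsigned v) {
    unsigned mask = ~0u;
    return v & mask;
}
```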
On a 2's complement platform (which is assumed here), ~0 gives you -1, but writing -1 directly is forbidden by the rules (only integers 0..255, unary ! and ~, and binary &, ^, |, +, << and >> are allowed).
You are studying a coding challenge with a number of restrictions on operators and language constructs for performing given tasks.
The first problem is to return the value -1 without using the - operator.
On machines that represent negative numbers with two's complement, the value -1 is represented with all bits set to 1, so ~0 evaluates to -1:
/*
 * minusOne - return a value of -1
 * Legal ops: ! ~ & ^ | + << >>
 * Max ops: 2
 * Rating: 1
 */
int minusOne(void) {
    // ~0 = 111...111 = -1
    return ~0;
}
Other problems in the file are not always implemented correctly. The second problem, returning a boolean value that indicates whether an int value would fit in a 16-bit signed short, has a flaw:
/*
 * fitsShort - return 1 if x can be represented as a
 * 16-bit, two's complement integer.
 * Examples: fitsShort(33000) = 0, fitsShort(-32768) = 1
 * Legal ops: ! ~ & ^ | + << >>
 * Max ops: 8
 * Rating: 1
 */
int fitsShort(int x) {
    /*
     * after left shift 16 and right shift 16, the left 16 of x is 00000..00 or 111...1111
     * so after shift, if x remains the same, then it means that x can be represented as 16-bit
     */
    return !(((x << 16) >> 16) ^ x);
}
Left shifting a negative value, or a value whose shifted result is beyond the range of int, has undefined behavior, and right shifting a negative value is implementation-defined, so the above solution is incorrect (although it is probably the expected solution).
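One way to sidestep the undefined left shift is to look only at x >> 15: a value fits in 16 bits exactly when everything from bit 15 upward is all zeros or all ones. This is a sketch under assumptions, not the challenge's official answer. The name fitsShortNoLeftShift is made up, and the code still relies on the implementation-defined arithmetic right shift for negative values, which the challenge takes for granted.

```c
/* Sketch: assumes arithmetic >> on negative values (implementation-defined). */
int fitsShortNoLeftShift(int x) {
    int s = x >> 15;        /* 0 for 0..32767, -1 for -32768..-1 */
    return !s | !(~s);      /* 1 if s is all zeros or all ones */
}
```

This uses five operators from the legal set and never shifts left.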
Loooong ago this was how you saved memory on extremely limited equipment such as the 1K ZX 80 or ZX 81 computer. In BASIC, you would
LET X = NOT PI
rather than
LET X = 0
Since numbers were stored as 4 byte floating points, the latter takes 2 bytes more than the first NOT PI alternative, where each of NOT and PI takes up a single byte.
There are multiple ways of encoding numbers across computer architectures. When using 2's complement this will always be true: ~0 == -1. On the other hand, some computers use 1's complement for encoding negative numbers, for which the above is untrue, because ~0 == -0. Yup, 1's complement has negative zero, and that is why it is not very intuitive.
So to your questions
~0 is assigned to mask so that all the bits in mask are equal to 1, which makes mask & sth == sth.
~0 is used to make all bits equal to 1 regardless of the platform used.
You can use -1 instead of ~0 if you are sure that your computer platform uses 2's complement number encoding.
My personal thought: make your code as platform-independent as you can. The cost is relatively small and the code becomes more robust.

How to use negative number with openssl's BIGNUM?

I want a C++ version of the following Java code.
BigInteger x = new BigInteger("00afd72b5835ad22ea5d68279ffac0b6527c1ab0fb31f1e646f728d75cbd3ae65d", 16);
BigInteger y = x.multiply(BigInteger.valueOf(-1));
//prints y = ff5028d4a7ca52dd15a297d860053f49ad83e54f04ce0e19b908d728a342c519a3
System.out.println("y = " + new String(Hex.encode(y.toByteArray())));
And here is my attempt at a solution.
BIGNUM* x = BN_new();
BN_CTX* ctx = BN_CTX_new();
std::vector<unsigned char> xBytes = hexStringToBytes("00afd72b5835ad22ea5d68279ffac0b6527c1ab0fb31f1e646f728d75cbd3ae65d");
BN_bin2bn(&xBytes[0], xBytes.size(), x);
BIGNUM* negative1 = BN_new();
std::vector<unsigned char> negative1Bytes = hexStringToBytes("ff");
BN_bin2bn(&negative1Bytes[0], negative1Bytes.size(), negative1);
BIGNUM* y = BN_new();
BN_mul(y, x, negative1, ctx);
char* yHex = BN_bn2hex(y);
std::string yStr(yHex);
//prints y = AF27542CDD7775C7730ABF785AC5F59C299E964A36BFF460B031AE85607DAB76A3
std::cout <<"y = " << yStr << std::endl;
(Ignore the case.) What am I doing wrong? How do I get my C++ code to output the correct value "ff5028d4a7ca52dd15a297d860053f49ad83e54f04ce0e19b908d728a342c519a3"? I also tried setting negative1 by doing BN_set_word(negative1, -1), but that gives me the wrong answer too.
The BN_set_negative function sets a negative number.
The negative of afd72b5835ad22ea5d68279ffac0b6527c1ab0fb31f1e646f728d75cbd3ae65d is actually -afd72b5835ad22ea5d68279ffac0b6527c1ab0fb31f1e646f728d75cbd3ae65d, in the same way as -2 is the negative of 2.
ff5028d4a7ca52dd15a297d860053f49ad83e54f04ce0e19b908d728a342c519a3 is a large positive number.
The reason you are seeing this number in Java is the toByteArray call. According to its documentation, it selects the minimum field width which is a whole number of bytes and is also capable of holding a two's complement representation of the negative number.
In other words, by using the toByteArray function on a number that currently has 1 sign bit and 256 value bits, you end up with a field width of 264 bits. However, if your negative number's first nibble were 7, for example, rather than a, then (according to this documentation - I haven't actually tried it) you would get a 256-bit field width out (i.e. 8028d4..., not ff8028d4...).
The leading 00 you have used in your code is insignificant in OpenSSL BN. I'm not sure if it is significant in BigInteger although the documentation for that constructor says "The String representation consists of an optional minus or plus sign followed by a sequence of one or more digits in the specified radix. "; so the fact that it accepts a minus sign suggests that if the minus sign is not present then the input is treated as a large positive number, even if its MSB is set. (Hopefully a Java programmer can clear this paragraph up for me).
Make sure you keep clear in your mind the distinction between a large negative value, and a large positive number obtained by modular arithmetic on that negative value, such as is the output of toByteArray.
So your question is really: does Openssl BN have a function that emulates the behaviour of BigInteger.toByteArray() ?
I don't know if such a function exists (the BN library has fairly bad documentation IMHO, and I've never heard of it being used outside of OpenSSL, especially not in a C++ program). I would expect it doesn't, since toByteArray's behaviour is kind of weird; and in any case, all of the BN output functions appear to output using a sign-magnitude format, rather than a two's complement format.
But to replicate that output, you could add either 2^256 or 2^264 to the large negative number, and then do BN_bn2hex. In this particular case, add 2^264. In general you would have to measure the current bit-length of the number being stored and round the exponent up to the nearest multiple of 8.
Or you could even output in sign-magnitude format (using BN_bn2hex or BN_bn2mpi) and then iterate through inverting each nibble and fixing up the start!
NB. Is there any particular reason you want to use OpenSSL BN? There are many alternatives.
Although this is a question from 2014 (more than five years ago), I would like to solve your problem / clarify the situation, which might help others.
a) Sign-magnitude and two's complement
In fixed-width integer arithmetic, there are "sign-magnitude" and "two's complement" representations of numbers. Sign-magnitude stores the absolute (positive) value only and does not encode a sign in it. If you want a sign, you have to store it separately, e.g. in one bit (0=positive, 1=negative). This is exactly the situation of floating point numbers (IEEE 754): the mantissa is stored as a magnitude together with the exponent and one additional sign bit. Sign-magnitude numbers have two zeros, -0 and +0, because the sign is treated independently of the absolute value itself.
In two's complement, the most significant bit is used as the sign bit. There is no '-0' because negating a value in two's complement means performing the bitwise NOT (in C: tilde) operation followed by adding one.
As an example, one byte (in two's complement) can be one of the three values 0xFF, 0x00, 0x01, meaning -1, 0 and 1. There is no room for -0. If you have, e.g., 0xFF (-1) and want to negate it, then the bitwise NOT operation computes 0xFF => 0x00. Adding one yields 0x01, which is 1.
b) OpenSSL BIGNUM and Java BigInteger
OpenSSL's BIGNUM implementation represents numbers in sign-magnitude form. The Java BigInteger exposes numbers as two's complement. That is where things went wrong. Your big integer (in hex) is 00afd72b5835ad22ea5d68279ffac0b6527c1ab0fb31f1e646f728d75cbd3ae65d. This is a positive 256-bit integer. It consists of 33 bytes because there is a leading zero byte 0x00, which is absolutely correct for an integer stored as two's complement: the most significant bit (omitting the initial 0x00) is set (in 0xAF), which would otherwise make this number negative.
c) Solution you were looking for
OpenSSL's function BN_bin2bn works with absolute values only. For OpenSSL, you can leave the initial zero byte or cut it off; it does not make any difference because OpenSSL canonicalizes the input data anyway, which means cutting off all leading zero bytes. The next problem in your code is the way you want to make this integer negative: you want to multiply it by -1. Using 0xFF as the only input byte to BN_bin2bn makes this 255, not -1. In fact, you multiply your big integer by 255, yielding the overall result AF27542CDD7775C7730ABF785AC5F59C299E964A36BFF460B031AE85607DAB76A3, which is still positive.
Multiplication with -1 works like this (snippet, no error checking):
BIGNUM* x = BN_bin2bn(&xBytes[0], (int)xBytes.size(), NULL);
BIGNUM* negative1 = BN_new();
BN_one(negative1); /* negative1 is +1 */
BN_set_negative(negative1, 1); /* negative1 is now -1 */
BN_CTX* ctx = BN_CTX_new();
BIGNUM* y = BN_new();
BN_mul(y, x, negative1, ctx);
Easier is:
BIGNUM* x = BN_bin2bn(&xBytes[0], (int)xBytes.size(), NULL);
BN_set_negative(x,1);
This does not solve your problem because as M.M said, this just makes -afd72b5835ad22ea5d68279ffac0b6527c1ab0fb31f1e646f728d75cbd3ae65d from afd72b5835ad22ea5d68279ffac0b6527c1ab0fb31f1e646f728d75cbd3ae65d.
You are looking for the two's complement of your big integer, which is
int i;
/* invert every byte (one's complement) */
for (i = 0; i < (int)sizeof(value); i++)
    value[i] = ~value[i];
/* add one, propagating the carry from the last (least significant) byte */
for (i = ((int)sizeof(value)) - 1; i >= 0; i--)
{
    value[i]++;
    if (0x00 != value[i])
        break;
}
This is an unoptimized version of the two's complement, where 'value' is an unsigned char array holding your 33-byte input: your big integer prefixed by the byte 0x00. The result of this operation is the 33 bytes ff5028d4a7ca52dd15a297d860053f49ad83e54f04ce0e19b908d728a342c519a3.
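The same operation can be wrapped in a small function that takes the buffer and its length explicitly. This is only a sketch: the name negate_bytes is made up, and the buffer is assumed to hold a big-endian number (most significant byte first), as in the 33-byte array above.

```c
#include <stddef.h>

/* Two's complement negation of a big-endian byte array, in place. */
void negate_bytes(unsigned char *buf, size_t len) {
    size_t i;

    /* Step 1: invert every bit. */
    for (i = 0; i < len; i++)
        buf[i] = (unsigned char)~buf[i];

    /* Step 2: add one, starting at the least significant (last) byte. */
    for (i = len; i-- > 0; ) {
        buf[i]++;
        if (buf[i] != 0)    /* no carry out of this byte, so stop */
            break;
    }
}
```

Applied to the bytes 00 5d, for example, it produces ff a3, which matches the last byte of the expected output.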
d) Working with two's complement and OpenSSL BIGNUM
The whole sequence is like this:
1. Prologue: If input is negative (check most significant bit), then compute two's complement of input.
2. Convert to BIGNUM using BN_bin2bn
3. If input was negative, then call BN_set_negative(x,1)
4. Main function: Carry out all arithmetic operations using OpenSSL BIGNUM package
5. Call BN_is_negative to check for negative result
6. Convert to raw binary byte using BN_bn2bin
7. If result was negative, then compute two's complement of result.
8. Epilogue: If result was positive and result raw (output of step 7) byte's most significant bit is set, then prepend a byte 0x00. If result was negative and result raw byte's most significant bit is clear, then prepend a byte 0xFF.
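The last two steps of the sequence can be sketched without any OpenSSL calls, working on the magnitude bytes and the sign flag that BN_bn2bin and BN_is_negative would provide. The function name magnitude_to_twos_complement and its signature are hypothetical. The sketch assumes the magnitude is big-endian with no leading zero bytes, and that out has room for len + 1 bytes.

```c
#include <stddef.h>
#include <string.h>

/* Invert all bits, then add one (big-endian, in place). */
static void negate_in_place(unsigned char *buf, size_t len) {
    size_t i;
    for (i = 0; i < len; i++)
        buf[i] = (unsigned char)~buf[i];
    for (i = len; i-- > 0; ) {
        buf[i]++;
        if (buf[i] != 0)
            break;
    }
}

/* Writes the shortest two's complement encoding; returns its length. */
size_t magnitude_to_twos_complement(const unsigned char *mag, size_t len,
                                    int negative, unsigned char *out) {
    size_t n = len + 1;

    out[0] = 0x00;                       /* room for the sign byte */
    if (len > 0)
        memcpy(out + 1, mag, len);
    if (negative)
        negate_in_place(out, n);         /* leading byte becomes 0xFF */

    /* Drop the leading byte when the next byte already carries the sign. */
    if (n > 1 &&
        ((out[0] == 0x00 && (out[1] & 0x80) == 0) ||
         (out[0] == 0xFF && (out[1] & 0x80) != 0))) {
        memmove(out, out + 1, n - 1);
        n--;
    }
    return n;
}
```

Working in a buffer one byte wider and trimming afterwards is equivalent to deciding in the epilogue whether to prepend 0x00 or 0xFF.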

Shift left/right adding zeroes/ones and dropping first bits

I've got to program a function that receives
a binary number like 10001, and
a decimal number that indicates how many shifts I should perform.
The problem is that if I use the C++ operator <<, the zeroes are pushed from behind but the first numbers aren't dropped... For example
shifLeftAddingZeroes(10001,1)
returns 100010 instead of 00010 that is what I want.
I hope I've made myself clear =P
I assume you are storing that information in an int. Take into consideration that this number actually has more leading zeroes than what you see; if your int is 16 bits wide, for example, the value is really 00000000 00010001. Maybe try AND-ing it with a number that has as many 1 bits as the width you want to keep after shifting? (Assuming you want to stick to bitwise operations.)
What you want is to bit shift and then limit the number of output bits that can be active (hold a value of 1). One way to do this is to create a mask for the number of bits you want, then AND the bit-shifted value with that mask. Below is a code sample for doing that; just replace int_type with the type of value you're using, or make it a template type.
int_type shiftLeftLimitingBitSize(int_type value, int numshift, int_type numbits=some_default) {
    int_type mask = 0;
    for (int_type bit = 0; bit < numbits; bit++) {
        // widen 1 to int_type before shifting so wide types do not overflow int
        mask |= int_type(1) << bit;
    }
    return (value << numshift) & mask;
}
Your output for 10001,1 would now be shiftLeftLimitingBitSize(0b10001, 1, 5) == 0b00010.
Note that unless numbits is exactly the length of your integer type, you will always have excess 0 bits on the 'front' of your number.
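The mask can also be built without a loop, as (1 << numbits) - 1. Here is a minimal C sketch for 32-bit unsigned values. The name shift_left_limited is made up, shift is assumed to be in 0..31, and numbits in 1..32 (the full-width case is handled separately because shifting by 32 is undefined).

```c
#include <stdint.h>

/* Shift left, then keep only the low numbits bits of the result. */
uint32_t shift_left_limited(uint32_t value, unsigned shift, unsigned numbits) {
    uint32_t mask = numbits >= 32 ? UINT32_MAX
                                  : (((uint32_t)1 << numbits) - 1);
    return (value << shift) & mask;
}
```

With value 0x11 (binary 10001), shift 1 and numbits 5, the result is 0x02 (binary 00010), which is the output asked for in the question.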