Is there a way to properly left rotate (not just shift) BigIntegers of a fixed size?
I tried writing a method which resembles the classic rotation method which is used to rotate integers, but it does not work on BigIntegers. It just shifts the bits to the left by r positions, filling zeros at the end.
public static BigInteger rotate(BigInteger n, int r) {
    return n.shiftLeft(r).or(n.shiftRight(128 - r));
}
EDIT: Not using BigIntegers and using arrays of longs or integers looks like another option, but I'm not sure how you'd be able to combine them (except using BigIntegers) to perform the rotation.
That is actually not so easy. Where would the rotation point be? That is easy for fixed size numbers like 32 bit or 64 bit integers, but not for BigIntegers.
But... in theory, BigIntegers are unlimited in size and behave as if stored in two's complement (internally they are usually sign-magnitude). So positive numbers are (virtually) preceded by an unlimited number of 0 bits, and negative numbers by an unlimited number of 1 bits.
So rotating left by 1 would actually mean that you shift left by 1, and if the number was/is negative, the lowest bit is set to 1.
UPDATE
If the BigInteger is just used to represent a fixed size integer (BigIntegers themselves do not have a fixed size), you will have to move the top bits to the bottom. Then you can do something like:
public static BigInteger rotateLeft(BigInteger value, int shift, int bitSize)
{
    // Note: shift must be positive; add checks if necessary.
    BigInteger topBits = value.shiftRight(bitSize - shift);
    BigInteger mask = BigInteger.ONE.shiftLeft(bitSize).subtract(BigInteger.ONE);
    return value.shiftLeft(shift).or(topBits).and(mask);
}
And you call it like:
public static void main(String[] args)
{
    BigInteger rotated = rotateLeft(new BigInteger(
            "1110000100100011010001010110011110001001101010111100110111101111" +
            "1111111011011100101110101001100001110110010101000011001000010010",
            2), 7, 128);
    System.out.println(rotated.toString(2));
}
Note: I did test this and it seems to produce the desired result:
10010001101000101011001111000100110101011110011011110111111111110110111001011101010011000011101100101010000110010000100101110000
If the bitSize is fixed (e.g. always 128), you can pre-calculate the mask and do not have to pass the bitSize to the function, of course.
EDIT:
To obtain the mask, instead of shifting BigInteger.ONE left, you can just as well do:
BigInteger.ZERO.setBit(bitSize).subtract(BigInteger.ONE);
That is probably a little faster.
Related
I made a simple function that I called symmetricDelta(), which calculates the delta between value and previous "symmetrically". What I mean by that is: consider a number line from e.g. 0 to ULLONG_MAX, where the left and right ends of the number line are joined into a circle. To determine a "symmetric" delta, assume the change is positive if value - previous is less than half of the span; otherwise assume the change is negative and we wrapped around the number line.
See a simple version of this for uint64_ts below:
int64_t symmetricDelta(uint64_t value, uint64_t previous) {
    if (value - previous < (1ULL << 63)) {
        uint64_t result = value - previous;
        return result;
    } else {
        uint64_t negativeResult = previous - value;
        return -1 * negativeResult;
    }
}
Usage:
uint64_t value = ULLONG_MAX;
uint64_t previous = 0;
// Result: -1, not ULLONG_MAX
cout << symmetricDelta(value, previous) << endl;
Demo: https://onlinegdb.com/BJ8FFZgrP
Other value examples, assume a uint8_t version for simplicity:
symmetricalDifference(1, 0) == 1
symmetricalDifference(0, 1) == -1
symmetricalDifference(0, 255) == 1
symmetricalDifference(255, 0) == -1
symmetricalDifference(227, 100) == 127
symmetricalDifference(228, 100) == -128
My question is: Is there an "official" name for what I'm calling "symmetrical subtraction"? This feels like the kind of thing that might already be implemented in the C++ STL, but I wouldn't even know what to search for...
Yes. The name is subtraction modulo 2^64. And it's identical to what your machine does with the instruction
int64_t symmetricDelta(uint64_t value, uint64_t previous) {
    return (int64_t)(value - previous);
}
In C and C++, unsigned arithmetic is defined to wrap around, effectively joining the ends of the representable number range into a circle. This is the basis for the two's complement representation of signed integers: your CPU simply declares half of the number circle to be interpreted as negative. That half is the upper part of the unsigned range, with -1 corresponding to the maximum representable unsigned integer, simply because 0 is next to it on the circle.
Side note:
This allows the CPU to use the exact same circuitry for signed and unsigned arithmetic. The CPU only provides an add instruction that is used irrespective of whether the numbers should be interpreted as signed or unsigned. This is true for addition, subtraction and multiplication, they all exist as sign-ignorant instructions. Only the division is implemented in a signed and an unsigned variant, as are the comparison instructions / the flag bits that the CPU provides.
Side note 2:
The above is not fully true, as modern CPUs implement saturating arithmetic as part of their vector units (AVX etc.). Because saturating arithmetic means clipping the result to the ends of the representable range instead of wrapping around, this clipping depends on where the circle of numbers is assumed to be broken. As such, saturating arithmetic instructions typically exist in signed and unsigned variants.
End of the needless background rambling...
So, when you subtract two numbers in unsigned representation, the result is the unsigned number of steps that you have to take to reach the minuend from the subtrahend. And by reinterpreting the result as a signed integer, you are interpreting a long route (that goes more than half around the circle) as the corresponding short route in the opposite direction.
There is one pitfall: 1 << 63 is not representable as a positive int64_t. It sits exactly opposite the zero on the number circle, and since its sign bit is set, it is interpreted as -(1 << 63). If you try to negate it, the bit pattern does not change at all (just like -0 == 0), so your computer happily declares that - -(1 << 63) == -(1 << 63). This is probably not a problem for you, but it's better to be aware of it, because it might bite you.
I need to do a large amount of simple arithmetic on equally sized arrays of small integers. The operations are of only three kinds: (i) add arrays element-wise, (ii) subtract arrays element-wise, and (iii) compare whether all elements in one array are no less than / no greater than their counterparts in another.
To boost cache locality and computing speed, I cram the small integers of every array bit-by-bit into a certain number of 64-bit integers. The number of 64-bit integers needed is determined by the number of bits assigned to each array element. Let a[j] denote an array element. My bit layout for a[j] consists of (i) enough bits to hold the largest absolute value a[j] could reach during computation, (ii) a sign bit, and (iii) one bit to the left of the sign bit. That leftmost bit absorbs any carry from the right and is zeroed after each addition or subtraction.
Below is a toy example of adding, subtracting and comparing two 64-bit integers, each of which packs five small integers: the first 10 bits, the next 5 bits, the next 10 bits, the next 13 bits, and the next 20 bits. The remaining bits are unused and set to 0.
// leftmostBitMask =
// 0b0111111111011110111111111011111111111101111111111111111111000000
// ^ ^ ^ ^ ^
// leftmost
std::size_t add(std::size_t x, std::size_t y, std::size_t leftmostBitMask)
{
    return (x + y) & leftmostBitMask;
}

std::size_t minus(std::size_t x, std::size_t y, std::size_t leftmostBitMask)
{
    return (x - y + ((~leftmostBitMask) << 1)) & leftmostBitMask;
}

bool notAllGreaterEqual(std::size_t x, std::size_t y, std::size_t leftmostBitMask)
{
    // return (minus(x, y, leftmostBitMask) & (leftmostBitMask >> 1)) == 0;
    return (x - y) & ((~leftmostBitMask) >> 1);
}
My algorithms seem complex, especially the comparison function. Are there any faster solutions?
Thanks!
BTW, SIMD is not what I am describing. My question is one lower level of optimization than SIMD.
More background: the idea serves a quite complex search algorithm in multidimensional space. We observed large differences between the magnitudes of values in different dimensions. For instance, while computing an important 6-dimensional test case, one dimension could reach 50000 in absolute value, yet all the others stayed well below 1000. Without integer compression, each object requires an array of six 32-bit integers, while integer compression reduces it to a single 64-bit integer. That reduction is what prompted the idea of cramming integers.
After careful thought and comprehensive simulations, the algorithms listed in the question turn out to be largely over-engineered. The leftmost bit for receiving the carry is unnecessary. The code below works:
// signBitMask =
// 0b1000000000100001000000000100000000000010000000000000000000000000
// ^ ^ ^ ^ ^
// sign bit
std::size_t add(std::size_t x, std::size_t y)
{
    return x + y;
}

std::size_t subtract(std::size_t x, std::size_t y)
{
    return x - y;
}

bool notAllGreaterEqual(std::size_t x, std::size_t y, std::size_t signBitMask)
{
    // The parentheses are required: != binds more tightly than &.
    return ((x - y) & signBitMask) != 0;
}
The key factor here is that every comparison made on two arrays is AND-based. We require that notAllGreaterEqual() return true whenever one elemental integer in x is below its counterpart in y. At first glance, the solution above could hardly be correct: what happens when a negative elemental integer is added to a positive counterpart and the result stays positive? There must be a carry over the sign bit, so isn't the next elemental integer contaminated? The answer is yes, but it does not matter: collectively, notAllGreaterEqual() still fully serves its purpose. Instead of thinking in bits, one can easily prove notAllGreaterEqual() correct with elementary algebra. Problems arise only if we want to recover the integer array from those 64-bit buffers.
Creating the 64-bit buffer consists of (i) casting each integer to std::size_t, (ii) shifting it left by its pre-computed offset, and (iii) adding the shifted integers together. If an integer is negative, it must be sign-extended, i.e. padded with 1 bits on its left.
I want a C++ version of the following Java code.
BigInteger x = new BigInteger("00afd72b5835ad22ea5d68279ffac0b6527c1ab0fb31f1e646f728d75cbd3ae65d", 16);
BigInteger y = x.multiply(BigInteger.valueOf(-1));
//prints y = ff5028d4a7ca52dd15a297d860053f49ad83e54f04ce0e19b908d728a342c519a3
System.out.println("y = " + new String(Hex.encode(y.toByteArray())));
And here is my attempt at a solution.
BIGNUM* x = BN_new();
BN_CTX* ctx = BN_CTX_new();
std::vector<unsigned char> xBytes = hexStringToBytes("00afd72b5835ad22ea5d68279ffac0b6527c1ab0fb31f1e646f728d75cbd3ae65d");
BN_bin2bn(&xBytes[0], xBytes.size(), x);
BIGNUM* negative1 = BN_new();
std::vector<unsigned char> negative1Bytes = hexStringToBytes("ff");
BN_bin2bn(&negative1Bytes[0], negative1Bytes.size(), negative1);
BIGNUM* y = BN_new();
BN_mul(y, x, negative1, ctx);
char* yHex = BN_bn2hex(y);
std::string yStr(yHex);
//prints y = AF27542CDD7775C7730ABF785AC5F59C299E964A36BFF460B031AE85607DAB76A3
std::cout <<"y = " << yStr << std::endl;
(Ignore the difference in letter case.) What am I doing wrong? How do I get my C++ code to output the correct value "ff5028d4a7ca52dd15a297d860053f49ad83e54f04ce0e19b908d728a342c519a3"? I also tried setting negative1 with BN_set_word(negative1, -1), but that gives me the wrong answer too.
The BN_set_negative function sets a negative number.
The negative of afd72b5835ad22ea5d68279ffac0b6527c1ab0fb31f1e646f728d75cbd3ae65d is actually -afd72b5835ad22ea5d68279ffac0b6527c1ab0fb31f1e646f728d75cbd3ae65d, in the same way as -2 is the negative of 2.
ff5028d4a7ca52dd15a297d860053f49ad83e54f04ce0e19b908d728a342c519a3 is a large positive number.
The reason you are seeing this number in Java is the toByteArray call. According to its documentation, it selects the minimum field width which is a whole number of bytes and is also capable of holding a two's complement representation of the negative number.
In other words, by using the toByteArray function on a number that currently has 1 sign bit and 256 value bits, you end up with a field width of 264 bits. However, if your negative number's first nibble were, for example, 7 rather than a, then (according to this documentation - I haven't actually tried it) you would get a 256-bit field width out (i.e. 8028d4..., not ff8028d4...).
The leading 00 you have used in your code is insignificant in OpenSSL BN. I'm not sure if it is significant in BigInteger although the documentation for that constructor says "The String representation consists of an optional minus or plus sign followed by a sequence of one or more digits in the specified radix. "; so the fact that it accepts a minus sign suggests that if the minus sign is not present then the input is treated as a large positive number, even if its MSB is set. (Hopefully a Java programmer can clear this paragraph up for me).
Make sure you keep clear in your mind the distinction between a large negative value, and a large positive number obtained by modular arithmetic on that negative value, such as is the output of toByteArray.
So your question is really: does Openssl BN have a function that emulates the behaviour of BigInteger.toByteArray() ?
I don't know if such a function exists (the BN library has fairly bad documentation IMHO, and I've never heard of it being used outside of OpenSSL, especially not in a C++ program). I would expect it doesn't, since toByteArray's behaviour is kind of weird; and in any case, all of the BN output functions appear to output using a sign-magnitude format, rather than a two's complement format.
But to replicate that output, you could add either 2^256 or 2^264 to the large negative number, and then do BN_bn2hex. In this particular case, add 2^264. In general, you would have to measure the current bit length of the stored number and round the exponent up to the nearest multiple of 8.
Or you could even output in sign-magnitude format (using BN_bn2hex or BN_bn2mpi) and then iterate through inverting each nibble and fixing up the start!
NB. Is there any particular reason you want to use OpenSSL BN? There are many alternatives.
Although this is a question from 2014 (more than five years ago), I would like to solve your problem / clarify the situation, which might help others.
a) Sign-magnitude and two's complement
In finite number representation there are, among others, the "sign-magnitude" and "two's complement" encodings. Sign-magnitude stores the absolute (positive) value only and does not itself know a sign; if you want a signed number, you have to store the sign separately, e.g. in one bit (0 = positive, 1 = negative). This is exactly the situation of floating point numbers (IEEE 754): the mantissa is stored as a magnitude together with the exponent and one additional sign bit. Numbers in sign-magnitude have two zeros, -0 and +0, because the sign is treated independently of the absolute value itself.
In two's complement, the most significant bit is used as the sign bit. There is no -0, because negating a value in two's complement means performing the logical NOT (in C: tilde) operation followed by adding one.
As an example, one byte (in two's complement) can hold the three values 0xFF, 0x00, 0x01, meaning -1, 0 and 1. There is no room for a -0. If you have, e.g., 0xFF (-1) and want to negate it, the logical NOT operation computes 0xFF => 0x00; adding one yields 0x01, which is 1.
b) OpenSSL BIGNUM and Java BigInteger
OpenSSL's BIGNUM implementation represents numbers in sign-magnitude form (an absolute value plus a separate sign flag). The Java BigInteger treats numbers as two's complement. That was your disaster. Your big integer (in hex) is 00afd72b5835ad22ea5d68279ffac0b6527c1ab0fb31f1e646f728d75cbd3ae65d. This is a positive 256-bit integer. It consists of 33 bytes because there is a leading zero byte 0x00, which is absolutely correct for an integer stored as two's complement: without it, the most significant bit (in 0xAF) would be set, which would make the number negative.
c) Solution you were looking for
OpenSSL's function BN_bin2bn works with absolute values only. For OpenSSL, you can keep the initial zero byte or cut it off; it makes no difference, because OpenSSL canonicalizes the input data anyway, which means cutting off all leading zero bytes. The next problem in your code is the way you try to make this integer negative: you want to multiply it by -1, but using 0xFF as the only input byte to BN_bin2bn makes this 255, not -1. In fact, you multiply your big integer by 255, yielding the overall result AF27542CDD7775C7730ABF785AC5F59C299E964A36BFF460B031AE85607DAB76A3, which is still positive.
Multiplication with -1 works like this (snippet, no error checking):
BIGNUM* x = BN_bin2bn(&xBytes[0], (int)xBytes.size(), NULL);
BIGNUM* negative1 = BN_new();
BN_one(negative1); /* negative1 is +1 */
BN_set_negative(negative1, 1); /* negative1 is now -1 */
BN_CTX* ctx = BN_CTX_new();
BIGNUM* y = BN_new();
BN_mul(y, x, negative1, ctx);
Easier is:
BIGNUM* x = BN_bin2bn(&xBytes[0], (int)xBytes.size(), NULL);
BN_set_negative(x,1);
This does not solve your problem because as M.M said, this just makes -afd72b5835ad22ea5d68279ffac0b6527c1ab0fb31f1e646f728d75cbd3ae65d from afd72b5835ad22ea5d68279ffac0b6527c1ab0fb31f1e646f728d75cbd3ae65d.
You are looking for the two's complement of your big integer, which is:
int i;
for (i = 0; i < (int)sizeof(value); i++)
    value[i] = ~value[i];
for (i = ((int)sizeof(value)) - 1; i >= 0; i--)
{
    value[i]++;
    if (0x00 != value[i])
        break;
}
This is an unoptimized version of the two's complement, where value is your 33-byte input array containing your big integer prefixed by the byte 0x00. The result of this operation is the 33 bytes ff5028d4a7ca52dd15a297d860053f49ad83e54f04ce0e19b908d728a342c519a3.
d) Working with two's complement and OpenSSL BIGNUM
The whole sequence is like this:
1. Prologue: if the input is negative (check the most significant bit), compute the two's complement of the input.
2. Convert to a BIGNUM using BN_bin2bn.
3. If the input was negative, call BN_set_negative(x, 1).
4. Main part: carry out all arithmetic operations using the OpenSSL BIGNUM package.
5. Call BN_is_negative to check for a negative result.
6. Convert to raw binary bytes using BN_bn2bin.
7. If the result was negative, compute the two's complement of the result.
8. Epilogue: if the result was positive and the most significant bit of the raw bytes (the output of steps 6/7) is set, prepend a byte 0x00. If the result was negative and the most significant bit of the raw bytes is clear, prepend a byte 0xFF.
I've got to program a function that receives
a binary number like 10001, and
a decimal number that indicates how many shifts I should perform.
The problem is that if I use the C++ operator <<, the zeroes are pushed from behind but the first numbers aren't dropped... For example
shifLeftAddingZeroes(10001,1)
returns 100010 instead of 00010 that is what I want.
I hope I've made myself clear =P
I assume you are storing that information in an int. Take into consideration that this number actually has more leading zeroes than what you see; with 16 bits, for example, your number is really 00000000 00010001. Maybe try AND-ing it with a number that has a 1 in each of the positions you want to keep after shifting? (Assuming you want to stick to bitwise operations.)
What you want is to bit-shift and then limit the number of output bits which can be active (hold a value of 1). One way to do this is to create a mask for the number of bits you want, then AND the bit-shifted value with that mask. Below is a code sample for doing that; just replace int_type with the type of value you're using, or make it a template type.
int_type shiftLeftLimitingBitSize(int_type value, int numshift, int_type numbits = some_default) {
    int_type mask = 0;
    for (unsigned int bit = 0; bit < numbits; bit++) {
        mask += 1 << bit;
    }
    return (value << numshift) & mask;
}
Your output for 10001,1 would now be shiftLeftLimitingBitSize(0b10001, 1, 5) == 0b00010.
Realize that unless your numbits is exactly the length of your integer type, you will always have excess 0 bits on the 'front' of your number.
I have to write a function that counts the number of bits required to represent an int in two's complement form. The requirements:
1. can only use the operators ! ~ & ^ | + << >>
2. no loops or conditional statements
3. at most 90 operators used
Currently, I am thinking of something like this:
int howManyBits(int x) {
    int mostdigit1 = !!(0x80000000 & x);
    int mostdigit2 = mostdigit1 | !!(0x40000000 & x);
    int mostdigit3 = mostdigit2 | !!(0x20000000 & x);
    // and so on, until it reaches the least significant digit
    return mostdigit1 + mostdigit2 + ... + mostdigit32 + 1;
}
However, this algorithm doesn't work, and it also exceeds the 90-operator limit. Any suggestions on how I can fix and improve it?
With two's complement integers, the problem is the negative numbers. A negative number is indicated by the most significant bit: if it is set, the number is negative.
The negative of a 2's complement integer n is defined as -(1's complement of n)+1.
Thus, I would first test for the sign bit. If it is set, the number of bits required is simply the number of bits available to represent an integer, e.g. 32 bits. If not, you can count the bits required by repeatedly shifting n right by one bit until the result is zero. If n were +1 (000…001), you would have to shift it right once to make the result zero, so you need 1 bit to represent it.