Bitwise operations and shifts problems - c++

I am testing the function fitsBits(int x, int n) on my own, and I found an input for which it gives the wrong result. What is the problem?
/*
* fitsBits - return 1 if x can be represented as an
* n-bit, two's complement integer.
* 1 <= n <= 32
* Examples: fitsBits(5,3) = 0, fitsBits(-4,3) = 1
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 15
* Rating: 2
*/
int fitsBits(int x, int n) {
int r, c;
c = 33 + ~n;
r = !(((x << c)>>c)^x);
return r;
}
It seems like it gives the wrong answer in
fitsBits(0x80000000, 0x20);
It gives me 1, but actually it should be 0...
How could I fix it?
Thank you!

fitsBits(0x80000000, 0x20);
This call returns 1 because the first parameter of your function is an int, which in practice these days is a 32-bit signed integer. The largest value a signed 32-bit integer can represent is 0x7FFFFFFF, which is less than the value you are passing in. The argument is therefore converted to int and, on typical two's complement implementations, becomes -0x80000000, which a 32-bit integer can represent. So your function returns 1: yes, the first argument can be represented using 0x20 = 32 bits.
If you want the function to classify 0x80000000 as something that cannot be represented in 32 bits, you need to change the type of its first parameter. One option would be unsigned int, but your problem definition requires handling negative numbers, so the remaining option is long long int, which can hold values between -0x8000000000000000 and 0x7FFFFFFFFFFFFFFF.
You will need a couple more adjustments: explicitly give your constant the type long long with the LL suffix, and shift by 64 - n instead of 32 - n:
#include <stdio.h>
int fitsBits(long long x, int n) {
long long r;
int c;
c = 65 + ~n;
r = !(((x << c)>>c)^x);
return r;
}
int main() {
printf("%d\n", fitsBits(0x80000000LL, 0x20));
return 0;
}
Link to IDEONE: http://ideone.com/G8I3kZ

Left shifts that cause overflow are undefined for signed types. Hence the compiler may optimise (x<<c)>>c as simply x, and the entire function reduces down to return 1;.
Probably you want to use unsigned types.
A second cause of undefined behavior in your code is that c may be greater than or equal to the width of int. A shift of more than the width of the integer type is undefined behavior.
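A minimal sketch of a version that avoids both kinds of undefined behaviour, assuming the puzzle's restriction to the legal operators can be dropped. It widens to 64 bits and compares against the n-bit range directly. The name fits_bits and the use of std::int32_t are illustrative choices, not part of the original assignment.

```cpp
#include <cstdint>

// Returns true if x is representable as an n-bit two's complement integer.
// Precondition (assumed, as in the assignment): 1 <= n <= 32.
bool fits_bits(std::int32_t x, int n) {
    // Work in 64 bits so that 1 << (n - 1) cannot overflow, even for n == 32.
    const std::int64_t half = std::int64_t{1} << (n - 1);
    const std::int64_t value = x;
    // An n-bit two's complement integer covers [-2^(n-1), 2^(n-1) - 1].
    return value >= -half && value <= half - 1;
}
```

With this version fits_bits(5, 3) is false and fits_bits(-4, 3) is true, matching the examples in the assignment. Every std::int32_t fits in 32 bits, so the n == 32 case always returns true.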

r = (((x << c)>>c)^x); //This will give you 0, meaning r = 0;
OR
r = !((x << c)>>c);
Your function can be simplified to
int fitsBits(int x) {
int r, c;
c = 33;
r = (((x << c)>>c)^x);
return r;
}
Note that applying NOT (!) gives you the logical opposite of r.

Related

unsigned bit field holding negative value

I'd like to work with a 12-bit unsigned integer. Since I am working with arrays, I want values to wrap around, e.g., 0 - 1 = 4095.
I tried the following but I don't obtain the expected behaviour:
struct bit_field
{
unsigned int x: 12; // 12 bits
};
bit_field ii, jj, kk;
ii.x = 4096;
jj.x = 1;
kk.x = 0;
cout << ii.x;
cout << kk.x - jj.x;
Output:
>> 0 // ov as expected
>> -1 // expected 4095
This is how C/C++ is expected to work; you don't get arbitrarily sized integers. Your storage width declaration within the struct doesn't change that: the declared type is still unsigned int. You are only saying "when I store this, it's 12 bits".
Because kk.x and jj.x are 12-bit bit-fields whose values all fit in an int, integral promotion converts both to (signed) int before the subtraction, so the result is the int value -1.
Note that you're writing C++, so you can perfectly well write your own class that implements the mathematical operations you want and has cast operators for integer types.
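A minimal sketch of such a class, assuming only addition and subtraction are needed. The name U12 and its interface are hypothetical; the idea is to do the arithmetic in unsigned int (which wraps modulo 2^N by definition) and mask the result back to 12 bits.

```cpp
// Hypothetical 12-bit unsigned integer with wrap-around arithmetic.
class U12 {
public:
    // Keep only the low 12 bits of whatever is stored.
    constexpr explicit U12(unsigned v = 0) : bits_(v & 0xFFFu) {}

    constexpr unsigned value() const { return bits_; }

    // Unsigned arithmetic wraps, and the constructor masks the result to 12 bits.
    friend constexpr U12 operator+(U12 a, U12 b) { return U12(a.bits_ + b.bits_); }
    friend constexpr U12 operator-(U12 a, U12 b) { return U12(a.bits_ - b.bits_); }

private:
    unsigned bits_;
};

static_assert((U12(0) - U12(1)).value() == 4095u, "0 - 1 wraps to 4095");
```

The constructor is explicit and there is no implicit conversion back to unsigned, so mixed expressions such as U12(0) - 1 do not compile instead of silently falling back to int arithmetic.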

Assign a negative number to an unsigned int

This code gives the meaningful output
#include <iostream>
int main() {
unsigned int ui = 100;
unsigned int negative_ui = -22u;
std::cout << ui + negative_ui << std::endl;
}
Output:
78
The variable negative_ui stores -22, but is an unsigned int.
My question is why does unsigned int negative_ui = -22u; work.
How can an unsigned int store a negative number? Is it safe to use, or does this yield undefined behaviour?
I use the intel compiler 18.0.3. With the option -Wall no warnings occurred.
Ps. I have read What happens if I assign a negative value to an unsigned variable? and Why unsigned int contained negative number
How can an unsigned int store a negative number?
It doesn't. Instead, it stores a representable number that is congruent with that negative number modulo the number of all representable values. The same is also true with results that are larger than the largest representable value.
Is it safe to use, or does this yield undefined behaviour?
There is no UB. Unsigned arithmetic overflow is well defined.
It is safe to rely on the result. However, it can be brittle. For example, if you add -22u and 100ull, then you get UINT_MAX + 79 (i.e. a large value assuming unsigned long long is a larger type than unsigned) which is congruent with 78 modulo UINT_MAX + 1 that is representable in unsigned long long but not representable in unsigned.
Note that signed arithmetic overflow is undefined.
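The brittleness described above can be checked at compile time. This is a small sketch that assumes unsigned long long is wider than unsigned int (true on common platforms but not guaranteed by the standard); the function names are illustrative.

```cpp
#include <climits>

// Both operands are unsigned int: the sum wraps modulo UINT_MAX + 1.
constexpr unsigned int sum_as_unsigned() {
    return 100u + -22u;
}

// -22u is first evaluated as an unsigned int (UINT_MAX - 21), then widened
// to unsigned long long, where the addition no longer wraps.
constexpr unsigned long long sum_as_ull() {
    return 100ull + -22u;
}

static_assert(sum_as_unsigned() == 78u, "wraps back to 78");
// Assumption: unsigned long long is wider than unsigned int.
static_assert(sum_as_ull() == UINT_MAX + 79ull, "no wrap in the wider type");
```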
Signed/unsigned is a convention about how the most significant bit is interpreted (bit 31 for a 32-bit x86 int). What you store in the variable occupies the full bit length.
It's the calculations that follow that either treat the upper bit as a sign indicator or ignore it. Therefore, an "unsigned" variable can hold the bit pattern of a negative value, and that pattern is interpreted as unsigned whenever the variable participates in a calculation.
unsigned int x = -1; // x is now 0xFFFFFFFF.
x -= 1; // x is now 0xFFFFFFFE.
if (x < 0) // false. x is compared as 0xFFFFFFFE.
int x = -1; // x stored as 0xFFFFFFFF
x -= 1; // x stored as 0xFFFFFFFE
if (x < 0) // true, x is compared as -2.
Technically valid, bad programming.

why using int64_t gives wrong result while double works as expected for simple integer multiplications

Here is my code:
#include <cstdint>
#include <iostream>
using integer = int64_t;
integer factorial(integer number) {
return number <= 0 ? 1 : number * factorial(number - 1);
}
integer binomial_coefficent(integer n, integer r) {
return factorial(n) / (factorial(r) * factorial(n - r));
}
int main()
{
using namespace std;
cout << binomial_coefficent(40, 20) << endl;
return 0;
}
this prints
0
which is the wrong answer, but if I change the integer type to double it prints 1.37847e+11,
which is the correct answer. My question is: why does using int64_t give me an incorrect answer?
and int64_t doesn't overflow either
It does though. For debugging things like this, you can run this with -fsanitize=signed-integer-overflow (implied by -fsanitize=undefined) in GCC or clang to see:
runtime error: signed integer overflow: 21 * 2432902008176640000 cannot be represented in type 'long'
runtime error: signed integer overflow: 2432902008176640000 * 2432902008176640000 cannot be represented in type 'long'
40! is about 8e47. A 64 bit signed integer could hold at max 2^63-1, about 1e19.
factorial(40) does overflow, and since overflow of signed integer types is undefined behavior, whatever you observe needs no further explanation.
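If a sanitizer is not available, the overflow can also be detected in the code itself. The following is a sketch, assuming C++17 for std::optional; checked_factorial is a hypothetical helper that tests each multiplication before performing it, so no signed overflow ever happens.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Returns n! if it fits in int64_t, otherwise std::nullopt.
// Like the original factorial, values of n <= 0 yield 1.
std::optional<std::int64_t> checked_factorial(int n) {
    std::int64_t result = 1;
    for (int i = 2; i <= n; ++i) {
        // result * i would exceed the maximum exactly when result > max / i.
        if (result > std::numeric_limits<std::int64_t>::max() / i) {
            return std::nullopt;
        }
        result *= i;
    }
    return result;
}
```

With a 64-bit signed type, 20! is the last factorial that fits, so checked_factorial(21) and everything above it, including checked_factorial(40), report failure.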
Welcome to the world of finite precision numbers! fact(40) is 815915283247897734345611269596115894272000000000 or 0x8eeae81b84c7f27e080fde64ff05254000000000, which will obviously not fit in a uint64_t, nor in a 128-bit integer, since it actually requires 160 bits!
But the binomial coefficient C(40, 20) can indeed be computed using uint64_t, provided you use the algorithm people used by hand before computers were everywhere:
integer binomial_coefficient(integer n, integer r) {
integer bc = 1;
integer q = n - r;
for(integer i=1; i<=r; i++) {
bc = bc * (q + i) / i; // exact: the running product is always divisible by i
}
return bc;
}
This one will give you the correct value of 137846528820 with no overflow.
(The function above omits the test for r <= n/2, a possible additional optimisation since C(n, p) equals C(n, n-p) by construction.)
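A sketch of the same loop with the symmetry test added, using std::uint64_t as suggested above. The name binomial is illustrative. Note the intermediate product can still overflow for large n even when the final result would fit, so this is not a general-purpose implementation.

```cpp
#include <cstdint>

// Computes C(n, r) incrementally. After step i the accumulator holds
// C(n - r + i, i), so every division by i is exact.
std::uint64_t binomial(std::uint64_t n, std::uint64_t r) {
    if (r > n) {
        return 0;  // no way to choose more items than are available
    }
    if (r > n - r) {
        r = n - r;  // C(n, r) == C(n, n - r): use the shorter loop
    }
    std::uint64_t bc = 1;
    for (std::uint64_t i = 1; i <= r; ++i) {
        bc = bc * (n - r + i) / i;
    }
    return bc;
}
```

For the case in the question, binomial(40, 20) gives 137846528820.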

Carry bits in incidents of overflow

/*
* isLessOrEqual - if x <= y then return 1, else return 0
* Example: isLessOrEqual(4,5) = 1.
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 24
* Rating: 3
*/
int isLessOrEqual(int x, int y)
{
int msbX = x>>31;
int msbY = y>>31;
int sum_xy = (y+(~x+1));
int twoPosAndNegative = (!msbX & !msbY) & sum_xy; //isLessOrEqual is FALSE.
// if = true, twoPosAndNegative = 1; Overflow true
// twoPos = Negative means y < x which means that this
int twoNegAndPositive = (msbX & msbY) & !sum_xy;//isLessOrEqual is FALSE
//We started with two negative numbers, and subtracted X, resulting in positive. Therefore, x is bigger.
int isEqual = (!x^!y); //isLessOrEqual is TRUE
return (twoPosAndNegative | twoNegAndPositive | isEqual);
}
Currently, I am trying to work through how to carry bits in this operator.
The purpose of this function is to identify whether or not int y >= int x.
This is part of a class assignment, so there are restrictions on casting and which operators I can use.
I'm trying to account for a carried bit by applying a mask of the complement of the MSB, to try and remove the most significant bit from the equation, so that they may overflow without causing an issue.
I am under the impression that, ignoring cases of overflow, the returned operator would work.
EDIT: Here is my adjusted code, still not working. But, I think this is progress? I feel like I'm chasing my own tail.
int isLessOrEqual(int x, int y)
{
int msbX = x >> 31;
int msbY = y >> 31;
int sign_xy_sum = (y + (~x + 1)) >> 31;
return ((!msbY & msbX) | (!sign_xy_sum & (!msbY | msbX)));
}
I figured it out with the assistance of one of my peers, alongside the commentators here on StackOverflow.
The solution is as seen above.
The asker has self-answered their question (a class assignment), so providing alternative solutions seems appropriate at this time. The question clearly assumes that integers are represented as two's complement numbers.
One approach is to consider how CPUs compute predicates for conditional branching by means of a compare instruction. "Signed less than" as expressed in processor condition codes is SF ≠ OF. SF is the sign flag, a copy of the sign bit, or most significant bit (MSB), of the result. OF is the overflow flag, which indicates overflow in signed integer operations; it is computed as the XOR of the carry-in and the carry-out of the sign bit. With two's complement arithmetic, a - b = a + ~b + 1, and therefore a <= b is equivalent to a + ~b < 0. It remains to separate the computation on the sign bit (MSB) sufficiently from the lower-order bits. This leads to the following code:
#include <climits> // CHAR_BIT
int isLessOrEqual (int a, int b)
{
int nb = ~b;
int ma = a & ((1U << (sizeof(a) * CHAR_BIT - 1)) - 1);
int mb = nb & ((1U << (sizeof(b) * CHAR_BIT - 1)) - 1);
// for the following, only the MSB is of interest, other bits are don't care
int cyin = ma + mb;
int ovfl = (a ^ cyin) & (a ^ b);
int sign = (a ^ nb ^ cyin);
int lteq = sign ^ ovfl;
// desired predicate is now in the MSB (sign bit) of lteq, extract it
return (int)((unsigned int)lteq >> (sizeof(lteq) * CHAR_BIT - 1));
}
The casting to unsigned int prior to the final right shift is necessary because right-shifting of signed integers with negative value is implementation-defined, per the ISO-C++ standard, section 5.8. Asker has pointed out that casts are not allowed. When right shifting signed integers, C++ compilers will generate either a logical right shift instruction, or an arithmetic right shift instruction. As we are only interested in extracting the MSB, we can isolate ourselves from the choice by shifting then masking out all other bits besides the LSB, at the cost of one additional operation:
return (lteq >> (sizeof(lteq) * CHAR_BIT - 1)) & 1;
The above solution requires a total of eleven or twelve basic operations. A significantly more efficient solution is based on the 1972 MIT HAKMEM memo, which contains the following observation:
ITEM 23 (Schroeppel): (A AND B) + (A OR B) = A + B = (A XOR B) + 2 (A AND B).
This is straightforward, as A AND B represent the carry bits, and A XOR B represent the sum bits. In a newsgroup posting to comp.arch.arithmetic on February 11, 2000, Peter L. Montgomery provided the following extension:
If XOR is available, then this can be used to average
two unsigned variables A and B when the sum might overflow:
(A+B)/2 = (A AND B) + (A XOR B)/2
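Montgomery's identity for unsigned values can be sketched as follows. The function name average_floor and the choice of std::uint32_t are illustrative. The point is that neither term can exceed the range of the type, even when a + b would.

```cpp
#include <cstdint>

// floor((a + b) / 2) without computing a + b.
// a & b holds the bits set in both operands (each contributes fully to the
// average); a ^ b holds the bits set in only one (each contributes half).
constexpr std::uint32_t average_floor(std::uint32_t a, std::uint32_t b) {
    return (a & b) + ((a ^ b) >> 1);
}

static_assert(average_floor(0xFFFFFFFFu, 0xFFFFFFFFu) == 0xFFFFFFFFu,
              "no overflow at the top of the range");
static_assert(average_floor(3u, 4u) == 3u, "rounds down");
```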
In the context of this question, this allows us to compute (a + ~b) / 2 without overflow, then inspect the sign bit to see if the result is less than zero. While Montgomery only referred to unsigned integers, the extension to signed integers is straightforward by use of an arithmetic right shift, keeping in mind that right shifting is an integer division which rounds towards negative infinity, rather than towards zero as regular integer division.
int isLessOrEqual (int a, int b)
{
int nb = ~b;
// compute avg(a,~b) without overflow, rounding towards -INF; lteq(a,b) = SF
int lteq = (a & nb) + arithmetic_right_shift (a ^ nb, 1);
return (int)((unsigned int)lteq >> (sizeof(lteq) * CHAR_BIT - 1));
}
Unfortunately, C++ itself provides no portable way to code an arithmetic right shift, but we can emulate it fairly efficiently using this answer:
int arithmetic_right_shift (int a, int s)
{
unsigned int mask_msb = 1U << (sizeof(mask_msb) * CHAR_BIT - 1);
unsigned int ua = a;
ua = ua >> s;
mask_msb = mask_msb >> s;
return (int)((ua ^ mask_msb) - mask_msb);
}
When inlined, this adds just a couple of instructions to the code when the shift count is a compile-time constant. If the compiler documentation indicates that right shifts of negative signed integers are implemented with an arithmetic right shift instruction, it is safe to simplify to this six-operation solution:
int isLessOrEqual (int a, int b)
{
int nb = ~b;
// compute avg(a,~b) without overflow, rounding towards -INF; lteq(a,b) = SF
int lteq = (a & nb) + ((a ^ nb) >> 1);
return (int)((unsigned int)lteq >> (sizeof(lteq) * CHAR_BIT - 1));
}
The previously made comments regarding use of a cast when converting the sign bit into a predicate apply here as well.
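Bit tricks like this are easy to get subtly wrong at the extremes, so a quick spot-check against the built-in operator is worthwhile. The sketch below repeats the six-operation variant (with the cast-free extraction of the sign bit) and compares it with <= on a handful of boundary values. It assumes >> on a negative int is an arithmetic shift; the names is_less_or_equal and agrees_with_builtin are illustrative.

```cpp
#include <climits>

// Six-operation variant; assumes >> on a negative int is an arithmetic
// shift (implementation-defined before C++20).
int is_less_or_equal(int a, int b) {
    int nb = ~b;
    // Floor of the average of a and ~b; it is negative exactly when a <= b.
    int lteq = (a & nb) + ((a ^ nb) >> 1);
    // Move the sign bit to bit 0 and mask off everything else.
    return (lteq >> (sizeof(lteq) * CHAR_BIT - 1)) & 1;
}

// Compares the bit trick with the built-in <= on boundary values.
bool agrees_with_builtin() {
    const int samples[] = {INT_MIN, INT_MIN + 1, -2, -1, 0,
                           1, 2, INT_MAX - 1, INT_MAX};
    for (int a : samples) {
        for (int b : samples) {
            if (is_less_or_equal(a, b) != (a <= b ? 1 : 0)) {
                return false;
            }
        }
    }
    return true;
}
```

A boundary sample is not a proof, but it covers the cases where a naive y - x approach overflows, such as comparing INT_MIN with INT_MAX.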

Modulo division returning negative number

I am carrying out the following modulo division operations from within a C program:
(5^6) mod 23 = 8
(5^15) mod 23 = 19
I am using the following function, for convenience:
int mod_func(int p, int g, int x) {
return ((int)pow((double)g, (double)x)) % p;
}
But the result of the operations when calling the function is incorrect:
mod_func(23, 5, 6) //returns 8
mod_func(23, 5, 15) //returns -6
Does the modulo operator have some limit on the size of the operands?
5 to the power 15 is 30,517,578,125
The largest value you can store in an int is 2,147,483,647
You could use 64-bit integers, but beware you'll have precision issues when converting from double eventually.
From memory, there is a rule from number theory about the calculation you are doing that means you don't need to compute the full power expansion in order to determine the modulo result. But I could be wrong. Been too many years since I learned that stuff.
Ahh, here it is: Modular Exponentiation
Read that, and stop using double and pow =)
int mod_func(int p, int g, int x)
{
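int r = 1 % p; // g^0 mod p, so that x == 0 is handled as well
g %= p; // keep the factor below p
for( int i = 0; i < x; i++ ) {
r = (r * g) % p; // reduce at every step so the product stays small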
}
return r;
}
The integral part of pow(5, 15) is not representable in an int (assuming the width of int is 32-bit). The conversion (from double to int in the cast expression) is undefined behavior in C and in C++.
To avoid undefined behavior, you should use fmod function to perform the floating point remainder operation.
My guess is the problem is 5 ^ 15 = 30517578125 which is greater than INT_MAX (2147483647). You are currently casting it to an int, which is what's failing.
As has been said, your first problem in
int mod_func(int p, int g, int x) {
return ((int)pow((double)g, (double)x)) % p;
}
is that pow(g,x) often exceeds the int range, and then you have undefined behaviour converting that result to int, and whatever the resulting int is, there is no reason to believe it has anything to do with the desired modulus.
The next problem is that the result of pow(g,x) as a double may not be exact. Unless g is a power of 2, the mathematical result cannot be exactly represented as a double for large enough exponents even if it is in range, and the computed result may be inexact even when the mathematical result is exactly representable (this depends on the implementation of pow).
If you do number-theoretic computations - and computing the residue of a power modulo an integer is one - you should only use integer types.
For the case at hand, you can use exponentiation by repeated squaring, computing the residue of all intermediate results. If the modulus p is small enough that (p-1)*(p-1) never overflows,
int mod_func(int p, int g, int x) {
int aux = 1;
g %= p;
while(x > 0) {
if (x % 2 == 1) {
aux = (aux * g) % p;
}
g = (g * g) % p;
x /= 2;
}
return aux;
}
does it. If p can be larger, you need to use a wider type for the calculations.
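A sketch of the wider-type variant, assuming a modulus that fits in 32 bits: the intermediates are held in std::uint64_t, so a product of two values below 2^32 cannot overflow. The name mod_pow and the unsigned parameter types are illustrative choices, not the original interface.

```cpp
#include <cstdint>

// Computes (base^exp) mod m by repeated squaring.
// Precondition (assumed): m > 0.
std::uint32_t mod_pow(std::uint32_t base, std::uint32_t exp, std::uint32_t m) {
    std::uint64_t result = 1 % m;  // 0 when m == 1
    std::uint64_t b = base % m;    // keep the base reduced from the start
    while (exp > 0) {
        if (exp & 1u) {
            result = result * b % m;  // both factors < 2^32: no overflow
        }
        b = b * b % m;
        exp >>= 1;
    }
    return static_cast<std::uint32_t>(result);
}
```

For the values in the question, mod_pow(5, 6, 23) gives 8 and mod_pow(5, 15, 23) gives 19.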