How to check for INT_MAX without overflowing int - c++

I created a BigInt class that can hold huge numbers (in any base from 2 to 36), far beyond the integer maximum, by storing each digit in a vector. I need to be able to convert this back to an int, but return INT_MAX or INT_MIN instead if the value is out of range; otherwise there will of course be an integer overflow.
My question is how can I check if I have exceeded the max without overflowing the integer I am building. I have tried moving the if statements at the bottom into the for loop but my integer still overflows. I feel like the solution is simple but I just can't grasp it.
// Convert BigInt to integer base 10 and return that int
// If BigInt > INT_MAX, return INT_MAX.
// If BigInt < INT_MIN, return INT_MIN.
int BigInt::to_int() const{
int number = 0;
for(size_t i = 0; i < vec.size(); i++) {
number += vec[i] * pow(base, i);
}
if (!isPositive) { number *= -1; }
if (number > INT_MAX) { return INT_MAX; }
if (number < INT_MIN) { return INT_MIN; }
return number;
}

Comparing an int value to INT_MAX is pointless except for equality because all values are less than or equal.
Performing overflow check after the overflowing signed operations is pointless because either they show that there was no overflow or the behaviour of the program is undefined. Always do the check before attempting operations that would overflow the result.
In this case, convert INT_MAX to your BigInt type and compare that with *this.

Preliminary info: you need to check for overflow before calculating something that overflows. If the overflow already happened, it's too late.
Adding an overflow check to your version of to_int() is tricky because you build up your value starting from the one's place. Because of this approach, you try to add pow(base, i), which could overflow an int by itself and that is not easy to detect in advance. Possible, but let's consider something else.
If you were to build up your value ending at the one's place (i.e. repeatedly calculate number*base + digit), you could check for overflow before multiplying. Here is some math, using shorter names for an easier read. Let x and M be integers, base a positive integer, and d some non-negative integer less than base. (M short for "max" and d short for "digit".) Division will mean real-valued division, as I can trunc() the result to get integer division. We want to know how x*base + d compares to M.
If x*base + d <= M then dividing by base gives x + d/base <= M/base, hence x <= trunc(M/base).
By the contrapositive, if x > trunc(M/base) then x*base + d > M.
If x*base + d >= M then dividing by base gives x + d/base >= M/base, hence x >= trunc(M/base).
By the contrapositive, if x < trunc(M/base) then x*base + d < M.
If x == trunc(M/base) then x*base == trunc(M/base)*base. Add M%base to both sides to get x*base + M%base == M. Well, I hope you'll accept the observation that trunc(M/base)*base + M%base == M. If you can accept that much, then the comparison between x*base + d and M is the same as the comparison between d and M%base.
Done with the math. Let's put this into code. You might note a performance increase as well, depending on how your compiler optimizes.
// Tests if number * base + next_digit will overflow an int.
bool will_overflow(int number, int base, int next_digit )
{
if ( number > INT_MAX/base )
return true;
if ( number < INT_MAX/base )
return false;
// It's close enough that the next digit decides it.
return next_digit > INT_MAX % base;
}
// Convert BigInt to integer base 10 and return that int
// If BigInt > INT_MAX, return INT_MAX.
// If BigInt < INT_MIN, return INT_MIN.
int BigInt::to_int() const {
int number = 0;
// Loop in the reverse direction. Be careful with unsigned values!
for(size_t i = vec.size(); i > 0; --i) {
if ( will_overflow(number, base, vec[i-1]) )
return isPositive ? INT_MAX : INT_MIN;
number = number * base + vec[i-1];
}
// number holds the magnitude, so apply the sign before returning.
return isPositive ? number : -number;
}
I will point out one small cheat in this. There is a single negative value that fits in an int, but whose absolute value is greater than INT_MAX. If that singular value comes up, this function will incorrectly detect it as an overflow and return INT_MIN. Fortunately, that works out fine since the singular value is INT_MIN. :)
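To try the approach outside the class, here is a minimal self-contained sketch. The free function to_int_clamped and its parameters are hypothetical stand-ins for the BigInt members (vec, base, isPositive); digits are assumed to be stored least significant first, as in the question, and the sketch negates the magnitude at the end for negative values.

```cpp
#include <climits>
#include <cstddef>
#include <vector>

// Tests if number * base + next_digit would overflow an int.
bool will_overflow(int number, int base, int next_digit)
{
    if (number > INT_MAX / base)
        return true;
    if (number < INT_MAX / base)
        return false;
    // Close enough that the next digit decides it.
    return next_digit > INT_MAX % base;
}

// Hypothetical stand-in for BigInt::to_int(): digits are least significant
// first, each digit is in [0, base), and the sign is passed separately.
int to_int_clamped(const std::vector<int>& digits, int base, bool is_positive)
{
    int number = 0;
    // Walk from the most significant digit down to the ones place.
    for (std::size_t i = digits.size(); i > 0; --i) {
        if (will_overflow(number, base, digits[i - 1]))
            return is_positive ? INT_MAX : INT_MIN;
        number = number * base + digits[i - 1];
    }
    // number is a magnitude no larger than INT_MAX, so negating is safe.
    return is_positive ? number : -number;
}
```

Every check happens before the multiplication, so no signed overflow is ever executed.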

Related

Floating point error in C++ code

I am trying to solve a problem in which I need to find the number of possible ways to make a team of two members. (Note: a team can have at most two people.)
My code works properly in general, but in some test cases it shows a floating point error and I can't find out what exactly it is.
Input: 1st line : Number of test cases
2nd line: number of total person
Thank you
#include<iostream>
using namespace std;
long C(long n, long r)
{
long f[n + 1];
f[0] = 1;
for (long i = 1; i <= n; i++)
{
f[i] = i * f[i - 1];
}
return f[n] / f[r] / f[n - r];
}
int main()
{
long n, r, m,t;
cin>>t;
while(t--)
{
cin>>n;
r=1;
cout<<C(n, min(r, n - r))+1<<endl;
}
return 0;
}
You aren't getting a floating-point error in the usual sense. You are getting a divide-by-zero error: your code attempts an integer division by 0, and many systems report that as a "floating point exception" even though no floating-point arithmetic is involved.
When you invoke C(100, 1), the loop that initializes the f array inside C computes factorials, which grow extremely fast. Eventually, two values are multiplied such that i * f[i-1] is zero due to overflow. That leads to all the subsequent f[i] values being zero as well. And then the division that follows the loop is a division by zero.
Although purists on these forums will say this is undefined, here's what's really happening on most 2's complement architectures. Or at least on my computer....
At i==21:
f[20] is already equal to 2432902008176640000
21 * 2432902008176640000 overflows for 64-bit signed, and will typically become -4249290049419214848 So at this point, your program is bugged and is now in undefined behavior.
At i==66
f[65] is equal to 0x8000000000000000. So 66 * f[65] gets calculated as zero for reasons that make sense to me, but should be understood as undefined behavior.
With f[66] assigned to 0, all subsequent assignments of f[i] become zero as well. After the main loop inside C is over, the f[n-r] is zero. Hence, divide by zero error.
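To see where the factorial first stops fitting without actually executing the overflow, one can test before each multiplication. This is a minimal sketch; the function name is hypothetical, and the result of 21 assumes the usual 64-bit long long.

```cpp
#include <climits>

// Returns the smallest n for which n! does not fit in a signed long long.
int first_overflowing_factorial()
{
    long long f = 1;  // holds (i-1)! at the top of each iteration
    for (int i = 1; ; ++i) {
        // i * f would exceed LLONG_MAX, so stop before multiplying.
        if (f > LLONG_MAX / i)
            return i;
        f *= i;
    }
}
```

With a 64-bit long long this agrees with the walk-through above: 20! still fits and 21! does not.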
Update
I went back and reverse engineered your problem. It seems like your C function is just trying to compute this expression:
N!
-------------
R! * (N-R)!
Which is the "number of unique sorted combinations"
In which case instead of computing the large factorial of N!, we can reduce that expression to this:
 (N-R+1) * (N-R+2) * ... * N
-----------------------------
             R!
This won't eliminate overflow, but will allow your C function to be able to take on larger values of N and R to compute the number of combinations without error.
But we can also take advantage of simple reduction before trying to do a big long factorial expression
For example, let's say we were trying to compute C(15,5). Mathematically that is:
15!
--------
10! 5!
Or as we expressed above:
1*2*3*4*5*6*7*8*9*10*11*12*13*14*15
-----------------------------------
1*2*3*4*5*6*7*8*9*10 * 1*2*3*4*5
The first 10 factors of the numerator and denominator cancel each other out:
11*12*13*14*15
-----------------------------------
1*2*3*4*5
But intuitively, you can see that "12" in the numerator is already evenly divisible by denominators 2 and 3. And that 15 in the numerator is evenly divisible by 5 in the denominator. So simple reduction can be applied:
11*2*13*14*3
-----------------------------------
1 * 4
There's even more room for greatest common divisor reduction, but this is a great start.
Let's start with a helper function that computes the product of all the values in a list.
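// Needs <climits>, <iostream> and <vector>. Values are assumed to be positive.
long long multiply_vector(const std::vector<int>& values)
{
    long long result = 1;
    for (long long v : values)
    {
        // Check before multiplying: signed overflow is undefined behaviour,
        // so testing for a negative result afterwards is not reliable.
        if (v != 0 && result > LLONG_MAX / v)
        {
            std::cout << "ERROR - multiply_vector hit overflow" << std::endl;
            return 0;
        }
        result = result * v;
    }
    return result;
}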
Now let's implement C using the above function after doing the reduction operation
long long C(int n, int r)
{
if ((r >= n) || (n < 0) || (r < 0))
{
std::cout << "invalid parameters passed to C" << std::endl;
return 0;
}
// compute
// n!
// -------------
// r! * (n-r)!
//
// assume (r < n)
// Which maps to
// n
// [∏ i]
// n - r
// --------------------
// R!
int end = n;
int start = n - r + 1;
std::vector<int> numerators;
std::vector<int> denominators;
long long numerator = 1;
long long denominator = 1;
for (int i = start; i <= end; i++)
{
numerators.push_back(i);
}
for (int i = 2; i <= r; i++)
{
denominators.push_back(i);
}
size_t n_length = numerators.size();
size_t d_length = denominators.size();
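for (size_t ni = 0; ni < n_length; ni++)
{
for (size_t d = 0; d < d_length; d++)
{
int dval = denominators[d];
// Re-read the numerator each time so that earlier reductions are kept
// (12 reduced by 2 and then by 3 must end up as 2, not 4).
if ((numerators[ni] % dval) == 0)
{
numerators[ni] /= dval;
denominators[d] = 1;
}
}
}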
numerator = multiply_vector(numerators);
denominator = multiply_vector(denominators);
if ((numerator == 0) || (denominator == 0))
{
std::cout << "Giving up. Can't resolve overflow" << std::endl;
return 0;
}
long long result = numerator / denominator;
return result;
}
You are not using floating-point. And you seem to be using variable sized arrays, which is a C feature and possibly a C++ extension but not standard.
Anyway, you will get overflow and therefore undefined behaviour even for rather small values of n.
In practice the overflow will lead to array elements becoming zero for not much larger values of n.
Your code will then divide by zero and crash.
They also might have a test case like (1000000000, 999999999) which is trivial to solve, but not for your code which I bet will crash.
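A case like (1000000000, 999999999) becomes trivial once the symmetry C(n, k) = C(n, n-k) is used together with the multiplicative formula. The following is only a sketch under that assumption; the function choose and its out-parameter interface are hypothetical, not part of the original code.

```cpp
#include <climits>

// Hypothetical helper: stores C(n, k) in `out` and returns true, or returns
// false if k > n or an intermediate product would not fit.
bool choose(unsigned long long n, unsigned long long k, unsigned long long& out)
{
    if (k > n)
        return false;
    if (k > n - k)
        k = n - k;  // symmetry: C(n, k) == C(n, n - k)
    unsigned long long result = 1;
    for (unsigned long long i = 1; i <= k; ++i) {
        unsigned long long factor = n - k + i;
        // Refuse before multiplying if the product would wrap around.
        if (result > ULLONG_MAX / factor)
            return false;
        // Exact division: result * factor equals i * C(n - k + i, i).
        result = result * factor / i;
    }
    out = result;
    return true;
}
```

The overflow test is conservative: it can reject a result that would fit when only the intermediate product is too large.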
You don't specify what you mean by "floating point error" - I reckon you are referring to the fact that you are doing an integer division rather than a floating point one so that you will always get integers rather than floats.
int a, b;
a = 7;
b = 2;
std::cout << a / b << std::endl;
this will result in 3, not 3.5! If you want floating point result you should use floats instead like this:
float a, b;
a = 7;
b = 2;
std::cout << a / b << std::endl;
So the solution to your problem would simply be to use float instead of long long int.
Note also that you are using variable-sized arrays, which are not standard C++ (some compilers accept them as an extension). Why not use std::vector instead?
Array syntax:
type name[size]
Note: size must be a constant, not a variable
Example #1:
int name[10];
Example #2:
const int asize = 10;
int name[asize];

Hash Function Clarification

Went over this in class today:
const int tabsize = 100000;
int hash(string s) {
const int init = 21512712, mult = 96169, emergency = 876127;
int v = init;
for (int i=0; i<s.length(); i+=1)
v = v * mult + s[i];
if (v < 0) v = -v;
if (v < 0) v = emergency;
return v % tabsize;
}
Having some trouble figuring out what the last 2 if-statements are supposed to do.
Any ideas?
Thanks
The first if statement takes care of overflow behavior of signed integers. Thus if the integer gets too big that it wraps and becomes negative, this if statement ensures that only the positive integer is returned.
The second if statement takes care of the rare case where v is -2147483648.
Note that positive signed 32-bit integers only go up to 2^31 - 1, or 2147483647, while negative ones can go down to -2^31, or -2147483648. This number is negative, and negating it still gives a negative number on typical machines, because +2147483648 cannot be represented. That is what the emergency number is for.
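#include <iostream>
int main() {
    int t = -2147483648;
    // Negating the most negative int overflows; on typical two's complement
    // machines this prints -2147483648 again.
    std::cout << (-t) << std::endl;
}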
They ensure the v is positive, because when you use the % operator on a negative number you can get a negative result which is not desirable for a hash value.
However, this does get into undefined behavior with the integer overflow so it might not work everywhere.
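One way to sidestep the undefined behaviour is to do the arithmetic in unsigned int, where wrap-around is well defined and the two if statements are no longer needed. This is a sketch of that variation, not the function from class; hash_unsigned is a hypothetical name, and it will generally return different bucket numbers than the original.

```cpp
#include <cstddef>
#include <string>

const unsigned int tabsize = 100000;

// Variation on the class example: unsigned arithmetic wraps modulo 2^n,
// so v can never become negative.
unsigned int hash_unsigned(const std::string& s)
{
    const unsigned int init = 21512712u, mult = 96169u;
    unsigned int v = init;
    for (std::size_t i = 0; i < s.length(); ++i)
        v = v * mult + static_cast<unsigned char>(s[i]);
    return v % tabsize;
}
```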

Determining if square root is an integer

In my program, I am trying to find the largest prime factor of the number 600851475143. I have made one for loop that determines all the factors of that number and stores them in a vector. The problem I am having is that I don't know how to determine whether the square root of a factor is a whole number rather than a decimal. My code so far is:
#include <iostream>
#include <vector>
#include <math.h>
using namespace std;
vector <int> factors;
int main()
{
double num = 600851475143;
for (int i=1; i<=num; i++)
{
if (fmod(num,i)==0)
{
factors.push_back(i);
}
}
for (int i=0; i<factors.size(); i++)
{
if (sqrt(factor[i])) // ???
}
}
Can someone show me how to determine whether a number can be square rooted or not through my if statement?
int s = sqrt(factors[i]);
if ((s * s) == factors[i])
As hobbs pointed out in the comments,
Assuming that double is the usual 64-bit IEEE-754 double-precision float, for values less than 2^53 the difference between one double and the next representable double is less than or equal to 1. Above 2^53, the precision is worse than integer.
So if your int is 32 bits you are safe. If you have to deal with numbers bigger than 2^53, you may have some precision errors.
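For values beyond 2^53, one option is to treat the floating-point square root only as a first guess and correct it with integer arithmetic. The sketch below assumes a 64-bit long long; the function name is hypothetical.

```cpp
#include <cmath>

// Integer perfect-square test that uses sqrt only as a starting estimate.
bool is_perfect_square(long long n)
{
    if (n < 0)
        return false;
    unsigned long long target = static_cast<unsigned long long>(n);
    unsigned long long r =
        static_cast<unsigned long long>(std::sqrt(static_cast<double>(n)));
    // Fix a possible off-by-one caused by rounding in the double.
    // The squares fit in unsigned long long for every non-negative input.
    while (r * r > target)
        --r;
    while ((r + 1) * (r + 1) <= target)
        ++r;
    return r * r == target;
}
```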
Perfect squares can only end in 0, 1, 4, or 9 in base 16, So for 75% of your inputs (assuming they are uniformly distributed) you can avoid a call to the square root in exchange for some very fast bit twiddling.
int isPerfectSquare(int n)
{
int h = n & 0xF; // h is the last hex "digit"
if (h > 9)
return 0;
// Use lazy evaluation to jump out of the if statement as soon as possible
if (h != 2 && h != 3 && h != 5 && h != 6 && h != 7 && h != 8)
{
int t = (int) floor( sqrt((double) n) + 0.5 );
return t*t == n;
}
return 0;
}
usage:
for ( int i = 0; i < factors.size(); i++) {
if ( isPerfectSquare( factors[ i]))
//...
}
Fastest way to determine if an integer's square root is an integer
The following should work. It takes advantage of integer truncation.
if (int (sqrt(factors[i])) * int (sqrt(factors[i])) == factors[i])
It works because the square root of a non-square number is not a whole number. Converting it to an integer removes the fractional part of the double, so squaring the truncated value no longer gives back the original number.
You also have to take round-off error into account when comparing to zero. You can use std::round if your compiler supports C++11; if not, you can write the rounding yourself.
#include <iostream>
#include <vector>
#include <math.h>
using namespace std;
vector <int> factors;
int main()
{
double num = 600851475143;
for (int i=1; i<=num; i++)
{
if (round(fmod(num,i))==0)
{
factors.push_back(i);
}
}
for (int i=0; i<factors.size(); i++)
{
int s = sqrt(factors[i]);
if ((s * s) == factors[i])
{
// factors[i] is a perfect square
}
}
}
You are asking the wrong question. Your algorithm is wrong. (Well, not entirely, but if it were to be corrected following the presented idea, it would be quite inefficient.) With your approach, you need also to check for cubes, fifth powers and all other prime powers, recursively. Try to find all factors of 5120=5*2^10 for example.
The much easier way is to remove a factor after it was found by dividing
num=num/i
and only increase i if it is no longer a factor. Then, if the iteration encounters some i=j^2 or i=j^3,... , all factors j, if any, were already removed at an earlier stage, when i had the value j, and accounted for in the factor array.
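The divide-out idea can be sketched as follows. The function name is hypothetical, and stopping once i * i exceeds the remaining number is an extra shortcut that is not part of the description above.

```cpp
#include <cstdint>

// Largest prime factor by trial division, removing each factor as it is found.
std::uint64_t largest_prime_factor(std::uint64_t num)
{
    std::uint64_t largest = 1;
    for (std::uint64_t i = 2; i * i <= num; ) {
        if (num % i == 0) {
            largest = i;
            num /= i;  // remove the factor and try the same i again
        } else {
            ++i;       // only advance once i no longer divides num
        }
    }
    // Whatever is left above 1 is a prime larger than every factor removed.
    if (num > 1)
        largest = num;
    return largest;
}
```

Because every smaller factor has already been divided out, any i that divides the remaining number is automatically prime, so no square or higher-power check is needed.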
You could also have mentioned that this is problem 3 from Project Euler. Then you might have found the recent discussion "advice on how to make my algorithm faster", where more efficient variants of the factorization algorithm were discussed.
Here is a simple C++ function I wrote for determining whether a number has an integer square root or not:
bool has_sqrtroot(int n)
{
    if (n < 0)
        return false;
    double sqrtroot = sqrt(n);
    double flr = floor(sqrtroot);
    // fabs is the floating-point absolute value; plain abs may pick the int overload.
    return fabs(sqrtroot - flr) <= 1e-9;
}
As sqrt() function works with floating-point it is better to avoid working with its return value (floating-point calculation occasionally gives the wrong result, because of precision error). Rather you can write a function- isSquareNumber(int n), which will decide if the number is a square number or not and the whole calculation will be done in integer.
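bool isSquareNumber(int n){
    // long long keeps l+h and m*m from overflowing for any int n.
    // Starting at 0 also lets 0 count as a square.
    long long l = 0, h = n;
    while (l <= h) {
        long long m = (l + h) / 2;
        if (m * m == n) {
            return true;
        } else if (m * m > n) {
            h = m - 1;
        } else {
            l = m + 1;
        }
    }
    return false;
}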
int main()
{
// ......
for (int i=0; i<factors.size(); i++){
if (isSquareNumber(factors[i]) == true){
/// code
}
}
}

Find largest unsigned int .... Why doesn't this work?

Couldn't you initialize an unsigned int and then increment it until it doesn't increment anymore? That's what I tried to do and I got a runtime error "Timeout." Any idea why this doesn't work? Any idea how to do it correctly?
#include <iostream>
int main() {
unsigned int i(0), j(1);
while (i != j) {
++i;
++j;
}
std::cout << i;
return 0;
}
Unsigned arithmetic is defined as modulo 2^n in C++ (where n is
the number of bits). So when you increment the maximum value,
you get 0.
Because of this, the simplest way to get the maximum value is to
use -1:
unsigned int i = -1;
std::cout << i;
(If the compiler gives you a warning, and this bothers you, you
can use 0U - 1, or initialize with 0, and then decrement.)
Since i will never be equal to j, you have an infinite loop.
Additionally, this is a very inefficient method for determining the maximum value of an unsigned int. numeric_limits gives you the result without looping for 2^(16, 32, 64, or however many bits are in your unsigned int) iterations. If you didn't want to do that, you could write a much smaller loop:
unsigned int shifts = sizeof(unsigned int) * 8; // or use CHAR_BIT from <climits> instead of 8
unsigned int maximum_value = 1;
for (int i = 1; i < shifts; ++i)
{
maximum_value <<= 1;
++maximum_value;
}
Or simply do
unsigned int maximum = (unsigned int)-1;
i will always be different than j, so you have entered an endless loop. If you want to take this approach, your code should look like this:
unsigned int i(0), j(1);
while (i < j) {
++i;
++j;
}
std::cout << i;
return 0;
Notice I changed it to while (i<j). Once j overflows i will be greater than j.
When an overflow happens, the value doesn't just stay at the highest; it wraps back around to the lowest possible number.
i and j will never be equal to each other. When an unsigned integral value reaches its maximum, adding 1 to it produces the minimum value, which is 0.
For example if to consider unsigned char then its maximum is 255. After adding 1 you will get 0.
So your loop is infinite.
I assume you're trying to find the maximum value that an unsigned integer can store (65,535 if unsigned int is 16 bits wide; 4,294,967,295 with the more common 32 bits). The program times out because when the integer hits the maximum value it can store, it "goes off the end" and wraps. When j increments past the maximum, it becomes 0; i will be at the maximum.
This means that the way you're going about it, i would NEVER equal j, and the loop would run indefinitely. If you changed it to what Damien has, you'd finish with i equal to the maximum and j equal to 0.
Couldn't you initialize an unsigned int and then increment it until it doesn't increment anymore?
No. Unsigned arithmetic is modular, so it wraps around to zero after the maximum value. You can carry on incrementing it forever, as your loop does.
Any idea how to do it correctly?
unsigned int max = -1; // or
unsigned int max = std::numeric_limits<unsigned int>::max();
or, if you want to use a loop to calculate it, change your condition to (j != 0) or (i < j) to stop when j wraps. Since i is one behind j, that will contain the maximum value. Note that this might not work for signed types - they give undefined behaviour when they overflow.
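The loop variant with the (j != 0) condition can be sketched as a small template. The name max_by_wrapping is hypothetical; trying it on unsigned char or unsigned short keeps the iteration count small, whereas a 32-bit unsigned int needs about four billion iterations.

```cpp
#include <limits>
#include <type_traits>

// Finds the largest value of an unsigned type by incrementing until the
// leading counter j wraps around to zero; i trails j by one.
template <typename U>
U max_by_wrapping()
{
    static_assert(std::is_unsigned<U>::value,
                  "only unsigned types wrap with defined behaviour");
    U i = 0, j = 1;
    while (j != 0) {
        ++i;
        ++j;
    }
    return i;
}
```

In real code std::numeric_limits<unsigned int>::max() gives the same value without any loop.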

Computing large combinations

How would you compute a combination such as (100,000 choose 50,000)?
I have tried three different approaches thus far but for obvious reasons each has failed:
1) Dynamic Programming- The size of the array just gets to be so ridiculous it seg faults
unsigned long long int grid[p+1][q+1];
//Initialise x boundary conditions
for (long int i = 0; i < q; ++i) {
grid[p][i] = 1;
}
//Initialise y boundary conditions
for (long int i = 0; i < p; ++i) {
grid[i][q] = 1;
}
for (long int i = p - 1; i >= 0; --i) {
for (long int j = q - 1; j >= 0; --j) {
grid[i][j] = grid[i+1][j] + grid[i][j+1];
}
}
2) Brute Force - Obviously calculating even 100! isn't realistic
unsigned long long int factorial(long int n)
{
return (n == 1 || n == 0) ? 1 : factorial(n - 1) * n;
}
3) Multiplicative Formula- I'm unable to store the values they are just so large
const int gridSize = 100000; //say 100,000
unsigned long long int paths = 1;
for (int i = 0; i < gridSize; i++) {
paths *= (2 * gridSize) - i;
paths /= i + 1;
}
// code from (http://www.mathblog.dk/project-euler-15/)
If it helps for context the aim of this is to solve the "How many routes are there through an m×n grid" problem for large inputs. Maybe I am miss-attacking the problem?
C(100000, 50000) is a huge number with 30101 decimal digits: http://www.wolframalpha.com/input/?i=C%28100000%2C+50000%29
Obviously unsigned long long will not be enough to store it. You need some arbitrary large integers library, like GMP: http://en.wikipedia.org/wiki/GNU_Multiple_Precision_Arithmetic_Library
Otherwise, multiplicative formula should be good enough.
"How would you compute ..." depends very much on the desired accuracy. Precise results can only be computed with arbitrary-precision numbers (e.g. GMP), but it is rather likely that you don't really need the exact result.
In that case I would use the Stirling Approximation for factorials ( http://en.wikipedia.org/wiki/Stirling%27s_approximation ) and calculate with doubles. The number of summands in the expansion can be used to regulate the error. The wikipedia page will also give you an error estimate.
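As a rough illustration of working with doubles, the sketch below estimates the size of C(n, k) through logarithms. Note that it swaps in std::lgamma (the log-gamma function from <cmath>) for a hand-written Stirling series; the function names are hypothetical.

```cpp
#include <cmath>

// log10 of C(n, k), using lgamma(x + 1) == ln(x!).
double log10_choose(double n, double k)
{
    const double ln_value = std::lgamma(n + 1.0)
                          - std::lgamma(k + 1.0)
                          - std::lgamma(n - k + 1.0);
    return ln_value / std::log(10.0);
}

// Number of decimal digits of C(n, k), derived from the logarithm.
long long decimal_digits_of_choose(double n, double k)
{
    return static_cast<long long>(std::floor(log10_choose(n, k))) + 1;
}
```

For (100000, 50000) this gives the 30101 digits quoted above; the digit count can be off by one when the logarithm lies very close to an integer.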
Here is recursive formula that might help : -
C(N, K) = C(N-1, K-1) * N / K
Use a recursive call for C(N-1, K-1) first, then evaluate C(N, K) from the result.
As your numbers will be very large use one of following alternatives.
GMP
Your own implementation, where you store numbers as a sequence of binary digits in an array and use Booth's algorithm for multiplication and shift-and-subtract for division.
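A hand-rolled implementation does not have to be elaborate for this particular problem, because every step only multiplies or divides the big number by a small one. The sketch below makes different choices from the suggestion above: it stores base-1000000000 limbs rather than bits and uses schoolbook multiply and divide by a 32-bit value rather than Booth's algorithm. All names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical minimal big integer: base 1e9 limbs, least significant first.
typedef std::vector<std::uint32_t> Limbs;
const std::uint64_t LIMB_BASE = 1000000000ULL;

void mul_small(Limbs& v, std::uint32_t m)
{
    std::uint64_t carry = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        std::uint64_t cur = v[i] * static_cast<std::uint64_t>(m) + carry;
        v[i] = static_cast<std::uint32_t>(cur % LIMB_BASE);
        carry = cur / LIMB_BASE;
    }
    while (carry > 0) {
        v.push_back(static_cast<std::uint32_t>(carry % LIMB_BASE));
        carry /= LIMB_BASE;
    }
}

void div_small(Limbs& v, std::uint32_t d)
{
    std::uint64_t rem = 0;
    for (std::size_t i = v.size(); i > 0; --i) {
        std::uint64_t cur = rem * LIMB_BASE + v[i - 1];
        v[i - 1] = static_cast<std::uint32_t>(cur / d);
        rem = cur % d;
    }
    while (v.size() > 1 && v.back() == 0)
        v.pop_back();  // drop leading zero limbs
}

// Exact C(n, k) as a decimal string, via C(n-k+i, i) = C(n-k+i-1, i-1) * (n-k+i) / i.
std::string choose_exact(std::uint32_t n, std::uint32_t k)
{
    if (k > n)
        return "0";
    if (k > n - k)
        k = n - k;
    Limbs v(1, 1);
    for (std::uint32_t i = 1; i <= k; ++i) {
        mul_small(v, n - k + i);
        div_small(v, i);  // always divides exactly
    }
    std::string out = std::to_string(v.back());
    for (std::size_t i = v.size() - 1; i > 0; --i) {
        std::string part = std::to_string(v[i - 1]);
        out += std::string(9 - part.size(), '0') + part;  // zero-pad inner limbs
    }
    return out;
}
```

For inputs as large as (100000, 50000) this does 50,000 passes over a number that grows to a few thousand limbs, which is slower than a tuned library such as GMP but avoids the dependency.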