Remainder of a division with c++ getting 'Stack Overflow' exception [duplicate] - c++

I'm trying to understand some concepts in C++, and I made this code to get the remainder of a division (like % operator):
double resto(double a, double b) {
if (a == b) return 0;
else if (a < b) return a;
else {
return resto(a-b, b);
}
}
When I run it with smaller numbers like (12,3) or (2,3), it runs fine.
But if I try to run it with the parameters (2147483647 * 1024, 3) I get:
Stack overflow (parameters: 0x0000000000000001, 0x000000F404403F20)
As I'm new to C++, I'm not sure if it's something to do with Visual Studio 2017, the compiler, stack memory, etc.

resto(2147483647 * 1024, 3);
is going to recurse 2147483647 * 1024 / 3, or about 733 billion, times. Every recursive call uses a small amount of automatic storage (the stack) for parameters and book-keeping, and the program will likely run out of storage before it reaches even a million iterations.
For this you will have to use a loop or smarter logic (for example subtracting larger multiples of b until using smaller numbers begins to make sense), but fmod is probably going to be faster and more effective.
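A minimal sketch of both suggestions, assuming a >= 0 and b > 0 as in the question's examples. The names resto_fast and resto_fmod are made up for illustration. resto_fast subtracts the largest b * 2^k that still fits, so it needs far fewer steps than one subtraction of b at a time, and it uses a loop rather than recursion.

```cpp
#include <cmath>

// Loop-based remainder that subtracts large multiples of b first.
// Assumes a >= 0 and b > 0 (an assumption of this sketch).
double resto_fast(double a, double b) {
    while (a >= b) {
        double step = b;
        // Grow step to the largest b * 2^k that does not exceed a.
        while (step * 2 <= a) {
            step *= 2;
        }
        a -= step;
    }
    return a;
}

// Library alternative: std::fmod computes the floating point remainder.
double resto_fmod(double a, double b) {
    return std::fmod(a, b);
}
```

Calling resto_fmod(2147483647.0 * 1024, 3) uses a floating point literal so the multiplication itself does not overflow an int.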
Other notes:
2147483647 * 1024
is an integer times an integer. This math will take place in ints and overflow if int is 16 or 32 bit on your system. Exactly what happens when you overflow a signed integer is undefined, but typically the number does a two's complement wrap-around (to -1024 assuming a 32 bit integer). More details on overflowing integers in Is signed integer overflow still undefined behavior in C++?. Use
2147483647.0 * 1024
to force floating point numbers.
Also watch out for Is floating point math broken? Floating point is imprecise and it's often difficult to get to floating point numbers that should be the same to actually be the same. a == b is often false when you expect true. In addition if one number gets too much larger than the other a-b may have no visible effect because b is lost in the noise at the end of a. The difference between the two cannot be represented correctly.
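Both effects can be seen with a short sketch, assuming double is the usual IEEE 754 64-bit format. The function names are illustrative only, and volatile is used so the results are stored as plain doubles before being compared.

```cpp
// True when subtracting b from a leaves a unchanged, i.e. b was
// lost in the noise at the end of a.
bool subtraction_is_lost(double a, double b) {
    volatile double diff = a - b;
    return diff == a;
}

// Compares 0.1 + 0.2 with 0.3; with IEEE 754 doubles the two differ.
bool sum_matches_literal() {
    volatile double sum = 0.1 + 0.2;
    return sum == 0.3;
}
```

Under that assumption, subtraction_is_lost(1e17, 3.0) is true because neighbouring doubles near 1e17 are 16 apart, while the question's value of roughly 2.2e12 is still small enough for a - 3 to be exact.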

Related

Warning C26451: Arithmetic overflow (when subtracting two ints)

When performing the following line of code:
int max = 50, min = -30;
double num = rand() % (max - min) - min;
I get the following warning from Visual Studio 2019:
Warning C26451 Arithmetic overflow: Using operator '-' on a 4 byte value and then casting the result to a 8 byte value. Cast the value to the wider type before calling operator '-' to avoid overflow (io.2).
I'm not sure how this is applicable, as I am taking the modulus of a double, which will return an integer, and then subtracting another integer from it, before storing it in a double (which I'm fairly certain isn't the problem).
Is this a bug or am I doing something that could result in truncation etc.?
Thanks
Either of the two subtractions can overflow. The overflow may happen if you subtract negative values from positive ones, or positive values from negative ones.
For example, if you subtract something negative from the maximum possible value, you'll get an overflow.
Because the destination type is wider, it is possible that you intend the result to be computed in that wider type so that it cannot overflow. This will not happen unless you cast one of your operands first.
I don't think such a cast is practical here: you use the % operator, which does not work with double. And it would not handle the overflow anyway, because you cannot have more range here than the range of rand().
If you want to fix the possible overflow around rand(), you'll need std::uniform_int_distribution with your range. This would fix other rand() problems along the way (thread safety and poor randomness), but would add a bit of complexity as well.
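A minimal sketch of that approach. The helper name random_in_range and the choice of std::mt19937 seeded from std::random_device are assumptions for illustration, not something the question prescribes.

```cpp
#include <random>

// Returns a uniformly distributed int in the closed range [min, max].
// One engine per thread avoids sharing state between threads.
int random_in_range(int min, int max) {
    static thread_local std::mt19937 engine{std::random_device{}()};
    std::uniform_int_distribution<int> dist(min, max);
    return dist(engine);
}
```

With this, double num = random_in_range(-30, 50); involves no subtraction in user code, so there is nothing for the analyzer to flag. Note that both bounds are included in the range.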
But if you always have the -30 to 50 range, there's no overflow, and you can treat it as a false warning. If you literally have int max = 50, min = -30; I even see it as a bug in the static analysis to emit this warning. Static analysis cannot predict the result of rand(), of course, but there's % to limit it. Maybe use Help > Send Feedback > Report a problem if you care.

How to properly avoid SIGFPE and overflow on arithmetic operations

I've been trying to create a Fraction class as complete as possible, to learn C++, classes and related stuff on my own. Among other things, I wanted to ensure some level of "protection" against floating point exceptions and overflows.
Objective:
Avoid overflow and floating point exceptions in arithmetic operations found in common operations, expending the least time/memory. If avoiding is not possible, then at least detect it.
Also, the idea is to not cast to some bigger type. That creates a handful of problems (like there might be no bigger type)
Cases I've found:
Overflow on +, -, *, /, pow, root
Operations are mostly straightforward (a and b are long):
a+b: if a > LONG_MAX - b then there's overflow. (not enough. a or b might be negative)
a-b: if a > LONG_MAX + b then there's overflow. (Idem)
a*b: if a > LONG_MAX / b then there's overflow. (if b != 0)
a/b: might raise SIGFPE if a << b or overflow if b << 0
pow(a,b): if a > pow(LONG_MAX, 1.0/b) then there's overflow.
pow(a,1.0/b): Similar to a/b
Overflow on abs(x) when x = LONG_MIN (or equivalent)
This is funny. Every signed type has a range [-x-1,x] of possible values. abs(-x-1) = x+1 = -x-1 because overflow. This means there is a case where abs(x) < 0
SIGFPE with big numbers divided by -1
Found when applying numerator/gcd(numerator,denominator). Sometimes gcd returned -1 and I got a floating point exception.
Easy fixes:
On some operations it is easy to check for overflow. If that's the case, I can always cast to double (with the risk of losing precision over big integers). The idea is to find a better solution, without casting.
In Fraction arithmetics, sometimes I can do extra checking for simplifications: to solve a/b * c/d (co-primes), I can reduce to co-primes a/d and c/b first.
I can always use cascading ifs asking whether a or b is < 0 or > 0. Not the prettiest. Besides that awful choice, I can create a function neg() that will avoid that overflow
T neg(T x){if (x > 0) return -x; else return x;},
I can take abs(x) of gcd and any similar situation (anywhere x > LONG_MIN)
I'm not sure if 2. and 3. are the best solutions, but seems good enough. I'm posting those here so maybe anyone has a better answer.
Ugliest fixes
In most operations I need to do a lot of extra operations to check and avoid overflow. Here is where I'm pretty sure I can learn a thing or two.
Example:
Fraction Fraction::operator+(Fraction f){
double lcm = max(den,f.den);
lcm /= gcd(den, f.den);
lcm *= min(den,f.den);
// a/c + b/d = [a*(lcm/d) + b*(lcm/c)] / lcm //use to create normal fractions
// a/c + b/d = [a/lcm * (lcm/c)] + [b/lcm * (lcm/d)] //use to create fractions through double
double p = (double)num;
p *= lcm / (double)den;
double q = (double)f.num;
q *= lcm / (double)f.den;
if(lcm >= LONG_MAX || (p + q) >= LONG_MAX || (p + q) <= LONG_MIN){
//cerr << "Aproximating " << num << "/" << den << " + " << f.num << "/" << f.den << endl;
p = (double)num / lcm;
p *= lcm / (double)den;
q = (double)f.num / lcm;
q *= lcm / (double)f.den;
return Fraction(p + q);
}
else
return normal(p + q, (long)lcm);
}
Which is the best way to avoid overflow on these arithmetic operations?
Edit: There are a handful of questions on this site that are quite similar, but they are not the same (detect instead of avoid, unsigned instead of signed, SIGFPE in specific unrelated situations).
Checking all of them I found some answers that, with modification, might be useful for giving a proper answer, like:
Detect overflow in unsigned addition (not my case, I'm working with signed):
uint32_t x, y;
uint32_t value = x + y;
bool overflow = value < x; // Alternatively "value < y" should also work
Detect overflow in signed operations. This might be a bit too general, with a lot of branches, and doesn't discuss how to avoid overflow.
The CERT rules mentioned in an answer, are a good starting point, but again only discuss how to detect.
Other answers are too general, and I wonder if there are any answers more specific to the cases I'm looking at.
You need to differentiate between floating point operations and integral operations.
Concerning the latter, operations on unsigned types do not normally overflow: results wrap around modulo 2^N, where N is the number of bits in the type. The exception is division by zero, which is undefined behaviour by definition IIRC. This is closely related to the fact that the C(++) standard mandates a binary representation for unsigned numbers, which virtually makes them a ring.
In contrast, the C(++) standard allows for multiple implementations of signed numbers (sign+magnitude, 1's complement or, most widely used, 2's complement). So signed overflow is defined to be undefined behaviour, possibly to give compiler implementers more freedom to generate efficient code for their target machines. Also this is the reason for your worries with abs(): At least in 2's complement representation, there is no positive number that is equal in magnitude to the largest negative number in magnitude. Refer to CERT rules for elaboration.
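Precondition tests in the spirit of the CERT rules can be sketched as follows. This is one possible formulation for long, without casting to a wider type; the function names are hypothetical. Each check runs before the operation, so the overflowing expression itself is never evaluated.

```cpp
#include <climits>

// True if a + b would overflow long.
bool add_overflows(long a, long b) {
    if (b > 0) return a > LONG_MAX - b;
    return a < LONG_MIN - b;
}

// True if a * b would overflow long.
bool mul_overflows(long a, long b) {
    if (a == 0 || b == 0) return false;
    if (a > 0) {
        if (b > 0) return a > LONG_MAX / b;  // both positive
        return b < LONG_MIN / a;             // a positive, b negative
    }
    if (b > 0) return a < LONG_MIN / b;      // a negative, b positive
    return b < LONG_MAX / a;                 // both negative
}

// True if a / b is undefined: division by zero, or LONG_MIN / -1,
// whose mathematical result LONG_MAX + 1 is not representable.
bool div_overflows(long a, long b) {
    return b == 0 || (a == LONG_MIN && b == -1);
}
```

The last function covers the gcd case from the question: dividing LONG_MIN by a gcd of -1 is the combination that typically raises SIGFPE on x86.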
On the floating point side, SIGFPE was historically introduced for signalling floating point exceptions. However, given the variety of implementations of the arithmetic units in processors nowadays, SIGFPE should be considered a generic signal that reports arithmetic errors. For instance, the glibc reference manual gives a list of possible reasons, explicitly including integral division by zero.
It is worth noting that floating point operations as per ANSI/IEEE Std 754, which I suppose is the most commonly used standard today, are specifically designed to be error-tolerant. For example, when an addition overflows it gives a result of infinity and typically sets a flag that you can check later. It is perfectly legal to use this infinite value in further calculations, as the floating point operations have been defined for affine arithmetic. This was once meant to allow long running computations (on slow machines) to continue even with intermediate overflows etc. Note that certain operations are forbidden even in affine arithmetic, for example dividing infinity by infinity or subtracting infinity from infinity.
So the bottom line is that floating point computations should not normally cause floating point exceptions. Yet you can have so-called traps which cause SIGFPE (or a similar mechanism) to be triggered whenever the above mentioned flags become raised.

Using scientific notation in for loops

I've recently come across some code which has a loop of the form
for (int i = 0; i < 1e7; i++){
}
I question the wisdom of doing this since 1e7 is a floating point type, and will cause i to be promoted when evaluating the stopping condition. Should this be a cause for concern?
The elephant in the room here is that the range of an int could be as small as -32767 to +32767, and the behaviour on assigning a larger value than this to such an int is undefined.
But, as for your main point, indeed it should concern you as it is a very bad habit. Things could go wrong as yes, 1e7 is a floating point double type.
The fact that i will be converted to a floating point due to type promotion rules is somewhat moot: the real damage is done if there is unexpected truncation of the apparent integral literal. By the way of a "proof by example", consider first the loop
for (std::uint64_t i = std::numeric_limits<std::uint64_t>::max() - 1024; i ++< 18446744073709551615ULL; ){
std::cout << i << "\n";
}
This outputs every consecutive value of i in the range, as you'd expect. Note that std::numeric_limits<std::uint64_t>::max() is 18446744073709551615ULL, which is 1 less than the 64th power of 2. (Here I'm using a slide-like "operator" ++< which is useful when working with unsigned types. Many folk consider --> and ++< as obfuscating but in scientific programming they are common, particularly -->.)
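Now on my machine, a double is an IEEE754 64 bit floating point. (Such a scheme is particularly good at representing powers of 2 exactly - IEEE754 can represent powers of 2 up to the 1023rd exactly.) So 18,446,744,073,709,551,616 (the 64th power of 2) can be represented exactly as a double. The nearest representable number before that is 18,446,744,073,709,549,568 (which is 2048 less), because a double carries 53 significant bits and so adjacent values just below the 64th power of 2 are 2048 apart.
So now let's write the loop as
for (std::uint64_t i = std::numeric_limits<std::uint64_t>::max() - 1024; i ++< 1.8446744073709551615e19; ){
std::cout << i << "\n";
}
On my machine that will only output one value of i: 18,446,744,073,709,550,592. The literal rounds to the 64th power of 2, and i is converted to a double for each comparison. The starting value rounds down to 18,446,744,073,709,549,568, so the first test passes and the incremented i is printed. That printed value sits exactly halfway between two representable doubles and (under the default round-to-nearest, ties-to-even mode) rounds up to the 64th power of 2, so the second test fails. This proves that 1.8446744073709551615e19 is a floating point type. If the compiler was allowed to treat the literal as an integral type then the output of the two loops would be equivalent.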
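The type of such a literal can also be checked at compile time. The following is a small sketch; the helper name literal_is_double is illustrative only.

```cpp
#include <type_traits>

// The literal 1e7 has type double, not int.
static_assert(std::is_same<decltype(1e7), double>::value,
              "1e7 is a double literal");

// Combining an int with it converts the int operand to double,
// which is what happens to i in the comparison i < 1e7.
static_assert(std::is_same<decltype(0 + 1e7), double>::value,
              "int combined with double yields double");

// Runtime-callable form of the first check.
constexpr bool literal_is_double() {
    return std::is_same<decltype(1e7), double>::value;
}
```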
It will work, assuming that your int is at least 32 bits.
However, if you really want to use exponential notation, it is better to define an integer constant outside the loop and use a proper cast, like this:
const int MAX_INDEX = static_cast<int>(1.0e7);
...
for (int i = 0; i < MAX_INDEX; i++) {
...
}
Considering this, I'd say it is much better to write
const int MAX_INDEX = 10000000;
or if you can use C++14
const int MAX_INDEX = 10'000'000;
1e7 is a literal of type double, and usually double is 64-bit IEEE 754 format with a 52-bit mantissa. Roughly every tenth power of 2 corresponds to a third power of 10, so double should be able to represent integers up to at least 10^(5*3) = 10^15, exactly. And if int is 32-bit then int has roughly 10^(3*3) = 10^9 as max value (asking Google search it says that "2**31 - 1" = 2 147 483 647, i.e. twice the rough estimate).
So, in practice it's safe on current desktop systems and larger.
But C++ allows int to be just 16 bits, and on e.g. an embedded system with that small int, one would have Undefined Behavior.
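One way to sidestep the 16-bit int concern is to use a type that is guaranteed to have at least 32 bits. This is a sketch only; the constant and function names are made up for illustration.

```cpp
#include <cstdint>
#include <limits>

// std::int_least32_t has at least 32 bits on every implementation,
// so ten million always fits, even where int is 16 bits wide.
constexpr std::int_least32_t kIterations = 10000000;

static_assert(kIterations < std::numeric_limits<std::int_least32_t>::max(),
              "loop bound must fit in the counter type");

// Runs the loop with an integer bound and returns how many times it ran.
std::int_least32_t count_iterations() {
    std::int_least32_t n = 0;
    for (std::int_least32_t i = 0; i < kIterations; ++i) {
        ++n;
    }
    return n;
}
```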
If the intention is to loop for an exact integer number of iterations, for example when iterating over exactly all the elements in an array, then comparing against a floating point value is maybe not such a good idea, solely for accuracy reasons. Since the implicit cast of an integer to float will truncate integers toward zero, there's no real danger of out-of-bounds access; it will just cut the loop short.
Now the question is: When do these effects actually kick in? Will your program experience them? The floating point representation usually used these days is IEEE 754. As long as the exponent is 0 a floating point value is essentially an integer. C double precision floats have 52 bits for the mantissa, which gives you integer precision up to a value of 2^52, which is on the order of 1e15. Unless you add the suffix f to ask for a single precision literal, the literal will be double precision and the implicit conversion will target that as well. So as long as your loop end condition is less than 2^52 it will work reliably!
Now one question you have to think about on the x86 architecture is efficiency. The very first 80x87 FPUs came in a different package, and later a different chip, and as a result getting values into the FPU registers is a bit awkward at the x86 assembly level. Depending on what your intentions are it might make a difference in runtime for a realtime application; but that's premature optimization.
TL;DR: Is it safe to do? Most certainly yes. Will it cause trouble? It could cause numerical problems. Could it invoke undefined behavior? Depends on how you use the loop end condition, but if i is used to index an array and for some reason the array length ended up in a floating point variable, always truncating toward zero, it's not going to cause a logical problem. Is it a smart thing to do? Depends on the application.

Why pow() return 999... in C++ [duplicate]

While running the following lines of code:
int i,a;
for(i=0;i<=4;i++)
{
a=pow(10,i);
printf("%d\t",a);
}
I was surprised to see the output, it comes out to be 1 10 99 1000 9999 instead of 1 10 100 1000 10000.
What could be the possible reason?
Note
You might think it's a floating point inaccuracy, given that in the above for loop, when i = 2, the value stored in variable a is 99.
But if you write instead
a=pow(10,2);
now the value of a comes out to be 100. How is that possible?
You have set a to be an int. pow() generates a floating point number, that in SOME cases may be just a hair less than 100 or 10000 (as we see here.)
Then you stuff that into the integer, which TRUNCATES to an integer. So you lose that fractional part. Oops. If you really needed an integer result, round may be a better way to do that operation.
Be careful even there, as for large enough powers, the error may actually be large enough to still cause a failure, giving you something you don't expect. Remember that floating point numbers only carry so much precision.
The function pow() returns a double. You're assigning it to variable a, of type int. Doing that doesn't "round off" the floating point value, it truncates it. So pow() is returning something like 99.99999... for 10^2, and then you're just throwing away the .9999... part. Better to say a = round(pow(10, i)).
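A sketch of the rounding fix, using std::lround so the result is already an integer type. The helper name pow10_rounded is hypothetical, and the sketch assumes the result fits in an int.

```cpp
#include <cmath>

// Rounds pow(10, exponent) to the nearest integer instead of truncating,
// so a result such as 99.99999... becomes 100 rather than 99.
int pow10_rounded(int exponent) {
    return static_cast<int>(std::lround(std::pow(10.0, exponent)));
}
```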
This is to do with floating point inaccuracy. Although you are passing in ints they are being implicitly converted to a floating point type since the pow function is only defined for floating point parameters.
Mathematically, the integer power of an integer is an integer.
In a good quality pow() routine this specific calculation should NOT produce any round-off errors. I ran your code on Eclipse/Microsoft C and got the following output:
1 10 100 1000 10000
This test does NOT indicate if Microsoft is using floats and rounding or if they are detecting the type of your numbers and choosing the appropriate method.
So, I ran the following code:
#include <stdio.h>
#include <math.h>
int main(void)
{
double i,a;
for(i=0.0; i <= 4.0 ;i++)
{
a=pow(10,i);
printf("%lf\t",a);
}
return 0;
}
And got the following output:
1.000000 10.000000 100.000000 1000.000000 10000.000000
No one spelt out how to actually do it correctly - instead of pow function, just have a variable that tracks the current power:
int i, a;
for (i = 0, a = 1; i <= 4; i++, a *= 10) {
printf("%d\t",a);
}
This continuing multiplication by ten is guaranteed to give you the correct answer, and quite OK (and much better than pow, even if it were giving the correct results) for tasks like converting decimal strings into integers.
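The decimal string case mentioned above can be sketched the same way. The helper name parse_decimal is made up, and the sketch assumes the input contains only the digits 0 to 9 and that the value fits in an int.

```cpp
#include <string>

// Builds the value digit by digit with integer multiplication only,
// so no floating point rounding is involved.
// Assumes digits-only input whose value fits in int.
int parse_decimal(const std::string& digits) {
    int value = 0;
    for (char c : digits) {
        value = value * 10 + (c - '0');
    }
    return value;
}
```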

C++ integer floor function

I want to implement greatest integer function. [The "greatest integer function" is a quite standard name for what is also known as the floor function.]
int x = 5/3;
My question is with greater numbers could there be a loss of precision as 5/3 would produce a double?
EDIT: Greatest integer function is integer less than or equal to X.
Example:
4.5 = 4
4 = 4
3.2 = 3
3 = 3
What I want to know is 5/3 going to produce a double? Because if so I will have loss of precision when converting to int.
Hope this makes sense.
You will lose the fractional portion of the quotient. So yes, that is a loss of precision, although the discarded fraction matters less in relative terms when the quotient is large.
However, 5 / 3 will return an integer, not a double. To force it to divide as double, typecast the dividend as static_cast<double>(5) / 3.
Integer division gives integer results, so 5 / 3 is 1 and 5 % 3 is 2 (the remainder operator). However, this doesn't necessarily hold with negative numbers. In the original C++ standard, -5 / 3 could be either -1 (rounding towards zero) or -2 (the floor), but -1 was recommended. In the latest C++0x draft (which is almost certainly very close to the final standard), it is -1, so finding the floor with negative numbers is more involved.
5/3 will always produce 1 (an integer), if you do 5.0/3 or 5/3.0 the result will be a double.
As far as I know, there is no predefined function for this purpose.
It might be necessary to use such a function, if for some reason floating-point calculations are out of question (e.g. int64_t has a higher precision than double can represent without error)
We could define this function as follows:
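#include <cstdlib> // std::ldiv, std::ldiv_t
inline long
floordiv (long num, long den)
{
// Sign bits agree (XOR is positive): truncation already equals the floor.
if (0 < (num^den))
return num/den;
else
{
// Otherwise step the truncated quotient down by one if there is a remainder.
std::ldiv_t res = std::ldiv(num,den);
return (res.rem)? res.quot-1
: res.quot;
}
}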
The idea is to use the normal integer division, but adjust for negative results to match the behaviour of the double floor(double) function. The point is to truncate always towards the next lower integer, irrespective of the position of the zero point. This can be very important if the intention is to create even sized intervals.
Timing measurements show that this function here only creates a small overhead compared with the built-in / operator, but of course the floating point based floor function is significantly faster....
Since in C and C++, as others have said, / is integer division, it will return an int. In particular, for non-negative operands it will return the floor of the double answer... (C and C++ always truncate) So, basically 5/3 is exactly what you want.
It may get a little weird in negatives, as -5/3 => -1 (truncation toward zero, not the floor, which is -2), which may or may not be what you want...
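A short sketch of a floor adjustment that uses only / and %, assuming C++11 or later (where / truncates toward zero) and that den is not zero. The name floor_div is illustrative.

```cpp
// Truncating division rounds toward zero; the floor differs only when
// there is a nonzero remainder whose sign differs from the divisor's.
// Assumes den != 0 and that num / den itself does not overflow.
long floor_div(long num, long den) {
    long quot = num / den;
    long rem = num % den;
    if (rem != 0 && ((rem < 0) != (den < 0))) {
        --quot;
    }
    return quot;
}
```

For example, floor_div(-5, 3) gives -2, whereas the plain expression -5 / 3 gives -1.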