Calculator with specific methods only. Normal and recursive - c++

I'm currently stuck with my calculator. It is only allowed to use the following functions:
int succ(int x){
return ++x;
}
int neg(int x){
return -x;
}
What I already have is +, - and *, both iterative and recursive (so I can also use them if needed).
Now I'm stuck on the divide method because I don't know how to deal with the decimal places and the logic behind it. To give an idea of what working with succ() and neg() looks like, here's an example of a subtraction, iterative and recursive:
int sub(int x, int y){
if (y > 0){
y = neg(y);
x = add(x, y);
return x;
}
else if (y < 0){
y = neg(y);
x = add(x, y);
return x;
}
else if (y == 0) {
return x;
}
}
int sub_recc(int x, int y){
if (y < 0){
y = neg(y);
x = add_recc(x, y);
return x;
} else if (y > 0){
x = sub_recc(x, y - 1);
x = x - 1;
return x;
}else if( y == 0) {
return x;
}
}

If you can subtract and add, then you can handle integer division. In pseudocode it is just:
division y/x is:
First handle signs because we will only divide positive integers
set sign = 0
if y < 0 then y = neg(y), sign = 1 - sign
if x < 0 then x = neg(x), sign = 1 - sign
ok, if sign is 0 nothing to do, if sign is 1, we will negate the result
Now the quotient is just the number of times you can subtract the divisor:
set quotient = 0
while y >= x do
y = y - x
quotient = quotient + 1
Ok we have the absolute value of the quotient, now for the sign:
if sign == 1, then quotient = neg(quotient)
The correct translation into C++ as well as the recursive part are left as an exercise...
Hint for recursion: y/x == 1 + (y-x)/x while y >= x
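For checking your own attempt at the exercise, here is one possible translation. It is only a sketch: the names add, sub, divide, divide_rec and divide_rec_pos are chosen here, add is built from succ and neg alone, and the divisor is assumed to be non-zero.

```cpp
#include <cassert>

int succ(int x) { return ++x; }
int neg(int x) { return -x; }

// x + y using only succ() and neg(); the loop counter also advances via succ().
int add(int x, int y) {
    if (y < 0) return neg(add(neg(x), neg(y)));  // x + y == -((-x) + (-y))
    for (int i = 0; i < y; i = succ(i)) x = succ(x);
    return x;
}

int sub(int x, int y) { return add(x, neg(y)); }

// Recursive helper for y >= 0 and x > 0: y/x == 1 + (y-x)/x while y >= x.
int divide_rec_pos(int y, int x) {
    return y < x ? 0 : succ(divide_rec_pos(sub(y, x), x));
}

// Iterative truncating division y / x, following the pseudocode.
int divide(int y, int x) {
    assert(x != 0);
    int sign = 0;
    if (y < 0) { y = neg(y); sign = sub(1, sign); }
    if (x < 0) { x = neg(x); sign = sub(1, sign); }
    int quotient = 0;
    while (y >= x) {
        y = sub(y, x);
        quotient = succ(quotient);
    }
    return sign == 1 ? neg(quotient) : quotient;
}

// Recursive variant: strip the signs, recurse on the magnitudes, restore the sign.
int divide_rec(int y, int x) {
    assert(x != 0);
    if (y < 0) return neg(divide_rec(neg(y), x));
    if (x < 0) return neg(divide_rec(y, neg(x)));
    return divide_rec_pos(y, x);
}
```

Both variants truncate toward zero, like the built-in / operator, and they are slow for large operands because every addition is a loop over succ().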
Above was the integer part. Integers are nice and easy because they give exact operations. A floating-point representation in a given base is always something close to mantissa * base^exp, where the mantissa is either an integer with a maximum number of digits or a number between 0 and 1 (the so-called normalized representation). You can pass from one representation to the other by changing the exponent by the number of digits of the mantissa: 2.5 is 25 * 10^-1 (integer mantissa) or .25 * 10^1 (0 <= mantissa < 1).
So if you want to operate on base 10 floating-point numbers you should:
convert an integer to a floating-point (mantissa + exponent) representation
for addition and subtraction, the result exponent is a priori the greater of the exponents. Both mantissas shall be scaled to that exponent and added/subtracted. Then the final exponent must be adjusted because the operation may have added an additional digit (7 + 9 = 16) or have caused the highest-order ones to vanish (101 - 98 = 3)
for product, you add the exponents and multiply the mantissas, and then normalize (adjust the exponent of) the result
for division, you scale the mantissa by the maximum number of digits, do the division with the integer division algorithm, and again normalize. For example 1/3 with a precision of 6 digits is obtained with:
1/3 = (1 * 10^6 / 3) * 10^-6 = (1000000/3) * 10^-6
which gives 333333 * 10^-6, so .333333 in normalized form
Ok, it will be a lot of boilerplate code, but nothing really hard.
Long story short: just remember how you learned that with paper and pencil...
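The scaled division from the 1/3 example could be sketched as follows. The Decimal struct and the divide_scaled name are made up for illustration, the built-in / stands in for the integer division routine, and the digit count is capped so that the scale factor stays small.

```cpp
#include <cassert>

// value = mantissa * 10^exponent
struct Decimal {
    long long mantissa;
    int exponent;
};

// Divide a by b, keeping `digits` decimal digits after the point.
Decimal divide_scaled(long long a, long long b, int digits) {
    assert(b != 0);
    assert(digits >= 0 && digits <= 9);  // keeps the scale factor at most 10^9
    long long scale = 1;
    for (int i = 0; i < digits; ++i) scale *= 10;
    // Scale the dividend first, then divide as integers.
    return Decimal{(a * scale) / b, -digits};
}
```

With a = 1, b = 3 and digits = 6 this produces mantissa 333333 and exponent -6, matching the worked example. Large values of a can still overflow the multiplication.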

Related

How do I avoid getting -0 when dividing in c++

I have a script in which I want to find the chunk my player is in.
Simplified version:
float x = -5;
float y = -15;
int chunkSize = 16;
int player_chunk_x = int(x / chunkSize);
int player_chunk_y = int(y / chunkSize);
This gives the chunk the player is in, but when x or y is negative but not less than the chunkSize (-16), player_chunk_x or player_chunk_y is still 0 or '-0' when I need -1
Of course I can just do this:
if (x < 0) x--;
if (y < 0) y--;
But I was wondering if there is a better solution to my problem.
Thanks in advance.
Since C++20 it's impossible to get a negative zero in an integral type. Before that it was only possible in the rare (but by no means extinct) situation where your platform used a ones' complement or sign-magnitude int. It's still possible in C (although rare), and adding 0 to the result will remove it.
It's possible though to have a floating point signed negative zero. For that, adding 0.0 will remove it.
Note that for an integral -0, subtracting 1 will yield -1.
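A small sketch of the floating-point case. The helper name strip_negative_zero is made up, and the code assumes the default round-to-nearest mode (when rounding toward negative infinity, -0.0 + 0.0 stays -0.0).

```cpp
#include <cassert>
#include <cmath>

// Adding +0.0 turns -0.0 into +0.0 and leaves every other value unchanged.
float strip_negative_zero(float v) {
    const float r = v + 0.0f;
    assert(!(r == 0.0f && std::signbit(r)));  // the result is never -0.0
    return r;
}
```

std::signbit is used for the check because -0.0f == 0.0f compares equal.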
Your issue is that you are casting a floating point value to an integer value.
This rounds to zero by default.
If you want consistent round down, you first have to floor your value:
int player_chunk_x = int(std::floor(x / chunkSize));
If you don't like negative numbers then don't use them:
int player_chunk_x = (x - min_x) / chunkSize;
int player_chunk_y = (y - min_y) / chunkSize;
If you want an integer result, in this case -1 for -5 / 16 (or anything like it), you can use a math function.
Possible ways:
Using floor:
float x = -5;
float y = -15;
int chunkSize = 16;
int player_chunk_x = floor(x / chunkSize);
// gives -1 for -5 / 16,
// 0 for 5 / 16,
// 1 for any quotient between 1 and 2, and so on
int player_chunk_y = floor(y / chunkSize);
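The floor approach can be wrapped in a helper, and the same rounding can be done on integer coordinates without going through floating point. A sketch under the assumption that chunkSize is positive; the function names chunk_index and chunk_index_int are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Chunk index for a float coordinate: round toward negative infinity, then convert.
int chunk_index(float pos, int chunkSize) {
    assert(chunkSize > 0);
    return static_cast<int>(std::floor(pos / chunkSize));
}

// Same rounding for integer coordinates.
int chunk_index_int(int pos, int chunkSize) {
    assert(chunkSize > 0);
    int q = pos / chunkSize;        // truncates toward zero
    if (pos % chunkSize < 0) --q;   // a negative remainder means truncation rounded up
    return q;
}
```

For chunkSize 16, coordinates from -16 up to (but not including) 0 map to chunk -1, and coordinates from 0 up to (but not including) 16 map to chunk 0.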

Exact value of a floating-point number as a rational

I'm looking for a method to convert the exact value of a floating-point number to a rational quotient of two integers, i.e. a / b, where b is not larger than a specified maximum denominator b_max. If satisfying the condition b <= b_max is impossible, then the result falls back to the best approximation which still satisfies the condition.
Hold on. There are a lot of questions/answers here about the best rational approximation of a truncated real number which is represented as a floating-point number. However I'm interested in the exact value of a floating-point number, which is itself a rational number with a different representation. More specifically, the mathematical set of floating-point numbers is a subset of rational numbers. In case of IEEE 754 binary floating-point standard it is a subset of dyadic rationals. Anyway, any floating-point number can be converted to a rational quotient of two finite precision integers as a / b.
So, for example assuming IEEE 754 single-precision binary floating-point format, the rational equivalent of float f = 1.0f / 3.0f is not 1 / 3, but 11184811 / 33554432. This is the exact value of f, which is a number from the mathematical set of IEEE 754 single-precision binary floating-point numbers.
Based on my experience, traversing (by binary search of) the Stern-Brocot tree is not useful here, since that is more suitable for approximating the value of a floating-point number, when it is interpreted as a truncated real instead of an exact rational.
Possibly, continued fractions are the way to go.
Another problem here is integer overflow. Suppose we want to represent the rational as the quotient of two int32_t, where the maximum denominator is b_max = INT32_MAX. We cannot rely on a stopping criterion like b > b_max. So the algorithm must never overflow, or it must detect overflow.
What I found so far is an algorithm from Rosetta Code, which is based on continued fractions, but its source mentions it is "still not quite complete". Some basic tests gave good results, but I cannot confirm its overall correctness and I think it can easily overflow.
// https://rosettacode.org/wiki/Convert_decimal_number_to_rational#C
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <stdint.h>
/* f : number to convert.
* num, denom: returned parts of the rational.
* md: max denominator value. Note that machine floating point number
* has a finite resolution (10e-16 ish for 64 bit double), so specifying
* a "best match with minimal error" is often wrong, because one can
* always just retrieve the significand and return that divided by
* 2**52, which is in a sense accurate, but generally not very useful:
* 1.0/7.0 would be "2573485501354569/18014398509481984", for example.
*/
void rat_approx(double f, int64_t md, int64_t *num, int64_t *denom)
{
/* a: continued fraction coefficients. */
int64_t a, h[3] = { 0, 1, 0 }, k[3] = { 1, 0, 0 };
int64_t x, d, n = 1;
int i, neg = 0;
if (md <= 1) { *denom = 1; *num = (int64_t) f; return; }
if (f < 0) { neg = 1; f = -f; }
while (f != floor(f)) { n <<= 1; f *= 2; }
d = f;
/* continued fraction and check denominator each step */
for (i = 0; i < 64; i++) {
a = n ? d / n : 0;
if (i && !a) break;
x = d; d = n; n = x % n;
x = a;
if (k[1] * a + k[0] >= md) {
x = (md - k[0]) / k[1];
if (x * 2 >= a || k[1] >= md)
i = 65;
else
break;
}
h[2] = x * h[1] + h[0]; h[0] = h[1]; h[1] = h[2];
k[2] = x * k[1] + k[0]; k[0] = k[1]; k[1] = k[2];
}
*denom = k[1];
*num = neg ? -h[1] : h[1];
}
All finite double values are rational numbers, as OP well stated.
Use frexp() to break the number into its fraction and exponent. The end result still needs to use double to represent whole number values due to range requirements. Some numbers are too small (x smaller than 1.0/pow(2.0, DBL_MAX_EXP)), and infinity and not-a-number are issues.
The frexp functions break a floating-point number into a normalized fraction and an integral power of 2. ... interval [1/2, 1) or zero ...
C11 §7.12.6.4 2/3
#include <stdio.h>
#include <math.h>
#include <float.h>
_Static_assert(FLT_RADIX == 2, "TBD code for non-binary FP");
// Return error flag
int split(double x, double *numerator, double *denominator) {
if (!isfinite(x)) {
*numerator = *denominator = 0.0;
if (x > 0.0) *numerator = 1.0;
if (x < 0.0) *numerator = -1.0;
return 1;
}
int bdigits = DBL_MANT_DIG;
int expo;
*denominator = 1.0;
*numerator = frexp(x, &expo) * pow(2.0, bdigits);
expo -= bdigits;
if (expo > 0) {
*numerator *= pow(2.0, expo);
}
else if (expo < 0) {
expo = -expo;
if (expo >= DBL_MAX_EXP-1) {
*numerator /= pow(2.0, expo - (DBL_MAX_EXP-1));
*denominator *= pow(2.0, DBL_MAX_EXP-1);
return fabs(*numerator) < 1.0;
} else {
*denominator *= pow(2.0, expo);
}
}
while (*numerator && fmod(*numerator,2) == 0 && fmod(*denominator,2) == 0) {
*numerator /= 2.0;
*denominator /= 2.0;
}
return 0;
}
void split_test(double x) {
double numerator, denominator;
int err = split(x, &numerator, &denominator);
printf("e:%d x:%24.17g n:%24.17g d:%24.17g q:%24.17g\n",
err, x, numerator, denominator, numerator/ denominator);
}
int main(void) {
volatile float third = 1.0f/3.0f;
split_test(third);
split_test(0.0);
split_test(0.5);
split_test(1.0);
split_test(2.0);
split_test(1.0/7);
split_test(DBL_TRUE_MIN);
split_test(DBL_MIN);
split_test(DBL_MAX);
return 0;
}
Output
e:0 x: 0.3333333432674408 n: 11184811 d: 33554432 q: 0.3333333432674408
e:0 x: 0 n: 0 d: 9007199254740992 q: 0
e:0 x: 1 n: 1 d: 1 q: 1
e:0 x: 0.5 n: 1 d: 2 q: 0.5
e:0 x: 1 n: 1 d: 1 q: 1
e:0 x: 2 n: 2 d: 1 q: 2
e:0 x: 0.14285714285714285 n: 2573485501354569 d: 18014398509481984 q: 0.14285714285714285
e:1 x: 4.9406564584124654e-324 n: 4.4408920985006262e-16 d: 8.9884656743115795e+307 q: 4.9406564584124654e-324
e:0 x: 2.2250738585072014e-308 n: 2 d: 8.9884656743115795e+307 q: 2.2250738585072014e-308
e:0 x: 1.7976931348623157e+308 n: 1.7976931348623157e+308 d: 1 q: 1.7976931348623157e+308
Leave the b_max consideration for later.
More expedient code is possible by replacing pow(2.0, expo) with ldexp(1, expo) (@gammatester) or exp2(expo) (@Bob__).
while (*numerator && fmod(*numerator,2) == 0 && fmod(*denominator,2) == 0) could also use some performance improvements. But first, let us get the functionality as needed.
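For the single-precision case from the question, the same frexp idea can produce integer parts directly. This is only a sketch: the Rational struct and the exact_rational name are invented here, it assumes a binary float with 24 significant bits, it ignores b_max, and it reports failure instead of overflowing when the parts do not fit in int64_t.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>
#include <cstdint>

static_assert(FLT_RADIX == 2 && FLT_MANT_DIG == 24, "sketch assumes IEEE 754 binary32");

struct Rational {
    std::int64_t num;
    std::int64_t den;
};

// Writes the exact value of f as num/den in lowest terms (den is a power of two).
// Returns false when f is not finite or the result does not fit in int64_t.
bool exact_rational(float f, Rational& out) {
    if (!std::isfinite(f)) return false;
    int expo = 0;
    const float frac = std::frexp(f, &expo);   // f == frac * 2^expo, |frac| in [0.5, 1) or 0
    std::int64_t num = static_cast<std::int64_t>(std::ldexp(frac, 24));  // exact integer
    expo -= 24;                                // now f == num * 2^expo
    if (num == 0) { out = Rational{0, 1}; return true; }
    while (num % 2 == 0) { num /= 2; ++expo; } // make num odd so the fraction is reduced
    if (expo >= 0) {
        if (expo > 38) return false;           // |num| < 2^24, so the product stays below 2^62
        out = Rational{num * (std::int64_t{1} << expo), 1};
        return true;
    }
    if (expo < -62) return false;              // denominator would not fit
    out = Rational{num, std::int64_t{1} << -expo};
    assert(out.den > 0);
    return true;
}
```

For 1.0f / 3.0f this yields 11184811 / 33554432, the value quoted in the question.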

compute midpoint in floating point

Given two floating point numbers (IEEE single or double precision), I would like to find the number that lies half-way between them, but not in the sense of (x+y)/2 but with respect to actually representable numbers.
if both x and y are positive, the following works
float ieeeMidpoint(float x, float y)
{
assert(x >= 0 && y >= 0);
int xi = *(int*)&x;
int yi = *(int*)&y;
int zi = (xi+yi)/2;
return *(float*)&zi;
}
The reason this works is that positive ieee floating point numbers (including subnormals and infinity) keep their order when doing a reinterpreting cast. (this is not true for the 80-bit extended format, but I don't need that anyway).
Now I am looking for an elegant way to do the same that includes the case when one or both of the numbers are negative. Of course it is easy to do with a bunch of if's, but I was wondering if there is some nice bit-magic, preferably without any branching.
Figured it out myself. The order of negative numbers is reversed when doing the reinterpreting cast, so that is the only thing one needs to fix. This version is longer than I hoped it would be, but it's only some bit-shuffling, so it should be fast.
float ieeeMidpoint(float x, float y)
{
// check for NaN's (Note that subnormals and infinity work fine)
assert(x ==x && y == y);
// re-interpreting cast
int xi = *(int*)&x;
int yi = *(int*)&y;
// reverse negative numbers by flipping all 31 non-sign bits
// (would look cleaner with an 'if', but I like not branching)
xi = xi ^ ((xi >> 31) & 0x7FFFFFFF);
yi = yi ^ ((yi >> 31) & 0x7FFFFFFF);
// compute average of xi,yi (overflow-safe)
int zi = (xi >> 1) + (yi >> 1) + (xi & 1);
// reverse negative numbers back
zi = zi ^ ((zi >> 31) & 0x7FFFFFFF);
// re-interpreting back to float
return *(float*)&zi;
}
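The pointer casts above formally violate the aliasing rules, so here is a sketch of the same idea using std::memcpy and a 64-bit intermediate sum. It uses branches rather than the branch-free masks, assumes a 32-bit IEEE float, and the helper names to_ordered, from_ordered and ieee_midpoint are chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::int32_t), "sketch assumes a 32-bit float");

// Map the float's bits to an integer whose ordering matches the float ordering.
static std::int32_t to_ordered(float f) {
    std::int32_t i;
    std::memcpy(&i, &f, sizeof i);
    return i < 0 ? (i ^ 0x7FFFFFFF) : i;  // reverse the order of negative values
}

// Inverse of to_ordered (the mapping is its own inverse on the bit pattern).
static float from_ordered(std::int32_t i) {
    if (i < 0) i ^= 0x7FFFFFFF;
    float f;
    std::memcpy(&f, &i, sizeof f);
    return f;
}

float ieee_midpoint(float x, float y) {
    assert(x == x && y == y);  // reject NaN
    // Summing in 64 bits cannot overflow.
    const std::int64_t sum = std::int64_t{to_ordered(x)} + to_ordered(y);
    return from_ordered(static_cast<std::int32_t>(sum / 2));
}
```

When the two inputs are an odd number of representable steps apart, the division truncates, so the result is one of the two middle values.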

Make sure float is less than double C++

Here's what I want to do:
Take a double (which is between -1 and 1) and cast it to a float. But I want to make sure that the float is ALWAYS less than the double.
Is there any straightforward way to do this?
For reference, here's something I came up with.
float DoubleToSmallerFloat (double X) // ex. X = 0.79828470019999997
{
float Y = X; // 0.79828471 -> note this is greater than X
double Diff = X - Y;
return Y - Abs (Diff) * 10;
}
If you are able to use C++11 then you can use nextafter() for this:
float doubleToSmallerFloat(double x) {
float f = x;
return f < x ? f : nextafter(f, -INFINITY); // target -infinity so that f == -1.0f also steps down
}
I think that is a good question. Look at the IEEE 754 single-precision and double-precision binary floating-point formats.
The real value represented by a 32-bit binary32 datum with sign s, biased exponent e (an 8-bit unsigned integer), and a 23-bit fraction (mantissa) is
s * m * (2 ^(e-127)),
where m is 1 + fraction / 2^23 for normal numbers (the implicit leading 1 followed by the 23 fraction bits).
For double use 1023 instead of 127: s * m * (2 ^(e-1023))
First case: the exponent e and sign s keep their values after the double-to-float cast. Then the float mantissa is roughly the leading digits of the double mantissa, and you need to slightly decrease the float mantissa.
Second case: the exponent (e-127) of the float is greater than the exponent (e-1023) of the double. Then I expect the fraction part to be 23 zeros. In that case, decrease the exponent part and set the fraction part to 23 ones. To get access to the fields, use a union.
union {
float fl;
uint32_t dw;
} f;
int s = ( f.dw >> 31 ) ? -1 : 1; /* sign */
int e = ( f.dw >> 23 ) & 0xFF; /* exponent */
int fract = f.dw & 0x7FFFFF; /* fraction */
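A self-contained sketch of the field extraction, using std::memcpy instead of a union (reading an inactive union member is not guaranteed to work in C++). The FloatFields struct and the decode and value_of names are hypothetical, and value_of covers normal numbers only.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

struct FloatFields {
    int sign;                // +1 or -1
    int exponent;            // biased exponent, 0..255
    std::uint32_t fraction;  // low 23 bits
};

FloatFields decode(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return FloatFields{(bits >> 31) ? -1 : 1,
                       static_cast<int>((bits >> 23) & 0xFF),
                       bits & 0x7FFFFF};
}

// Rebuild s * m * 2^(e-127) for a normal number, with m = 1 + fraction / 2^23.
double value_of(const FloatFields& f) {
    assert(f.exponent > 0 && f.exponent < 255);     // normal numbers only
    const double m = 1.0 + f.fraction / 8388608.0;  // 8388608 == 2^23
    return f.sign * std::ldexp(m, f.exponent - 127);
}
```

For example, 0.15625f is 1.25 * 2^-3, so it decodes to sign +1, biased exponent 124 and fraction 0x200000.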

Fast ceiling of an integer division in C / C++

Given integer values x and y, C and C++ both return as the quotient q = x/y the floor of the floating point equivalent. I'm interested in a method of returning the ceiling instead. For example, ceil(10/5)=2 and ceil(11/5)=3.
The obvious approach involves something like:
q = x / y;
if (q * y < x) ++q;
This requires an extra comparison and multiplication; and other methods I've seen (used in fact) involve casting as a float or double. Is there a more direct method that avoids the additional multiplication (or a second division) and branch, and that also avoids casting as a floating point number?
For positive numbers, to find the ceiling (q) of x divided by y:
unsigned int x, y, q;
To round up ...
q = (x + y - 1) / y;
or (avoiding overflow in x+y)
q = 1 + ((x - 1) / y); // if x != 0
For positive numbers:
q = x/y + (x % y != 0);
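The three unsigned variants side by side, as a sketch with made-up function names. The first can wrap around when x + y - 1 exceeds the range of unsigned; the other two cannot.

```cpp
#include <cassert>

// (x + y - 1) / y: short, but x + y - 1 may wrap around for large x.
unsigned ceil_div_add(unsigned x, unsigned y) {
    assert(y != 0);
    return (x + y - 1) / y;
}

// 1 + (x - 1) / y: no wrap-around, but x == 0 needs its own case.
unsigned ceil_div_safe(unsigned x, unsigned y) {
    assert(y != 0);
    return x == 0 ? 0 : 1 + (x - 1) / y;
}

// x / y plus one when there is a remainder.
unsigned ceil_div_rem(unsigned x, unsigned y) {
    assert(y != 0);
    return x / y + (x % y != 0);
}
```

All three agree on small inputs such as 10/5 and 11/5. For x equal to the largest unsigned value and y = 2, the first variant wraps to 0 while the other two still agree with each other.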
Sparky's answer is one standard way to solve this problem, but as I also wrote in my comment, you run the risk of overflows. This can be solved by using a wider type, but what if you want to divide long longs?
Nathan Ernst's answer provides one solution, but it involves a function call, a variable declaration and a conditional, which makes it no shorter than the OPs code and probably even slower, because it is harder to optimize.
My solution is this:
q = (x % y) ? x / y + 1 : x / y;
It will be slightly faster than the OP's code, because the compiler can see that the modulo and the division use the same operands and perform both with a single instruction on the processor. At least gcc 4.4.1 performs this optimization with the -O2 flag on x86.
In theory the compiler might inline the function call in Nathan Ernst's code and emit the same thing, but gcc didn't do that when I tested it. This might be because it would tie the compiled code to a single version of the standard library.
As a final note, none of this matters on a modern machine, except if you are in an extremely tight loop and all your data is in registers or the L1-cache. Otherwise all of these solutions will be equally fast, except for possibly Nathan Ernst's, which might be significantly slower if the function has to be fetched from main memory.
You could use the div function in cstdlib to get the quotient and remainder in a single call and then handle the ceiling separately, as shown below:
#include <cstdlib>
#include <iostream>
int div_ceil(int numerator, int denominator)
{
std::div_t res = std::div(numerator, denominator);
return res.rem ? (res.quot + 1) : res.quot;
}
int main(int, const char**)
{
std::cout << "10 / 5 = " << div_ceil(10, 5) << std::endl;
std::cout << "11 / 5 = " << div_ceil(11, 5) << std::endl;
return 0;
}
There's a solution for both positive and negative x but only for positive y with just 1 division and without branches:
int div_ceil(int x, int y) {
return x / y + (x % y > 0);
}
Note: if x is positive, then division rounds toward zero, and we should add 1 if the remainder is not zero.
If x is negative, then division rounds toward zero, which is what we need, and we will not add anything because x % y is not positive.
How about this? (requires y non-negative, so don't use this in the rare case where y is a variable with no non-negativity guarantee)
q = (x > 0)? 1 + (x - 1)/y: (x / y);
I reduced y/y to one, eliminating the term x + y - 1 and with it any chance of overflow.
I avoid x - 1 wrapping around when x is an unsigned type and contains zero.
For signed x, negative and zero still combine into a single case.
Probably not a huge benefit on a modern general-purpose CPU, but this would be far faster in an embedded system than any of the other correct answers.
I would have rather commented but I don't have a high enough rep.
As far as I am aware, for positive arguments and a divisor which is a power of 2, this is the fastest way (tested in CUDA):
//example y=8
q = (x >> 3) + !!(x & 7);
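The y = 8 example generalizes to any divisor of the form 2^shift. A minimal sketch with a hypothetical helper name, for unsigned arguments only:

```cpp
#include <cassert>
#include <climits>

// Ceiling of x / 2^shift for unsigned x.
unsigned ceil_div_pow2(unsigned x, unsigned shift) {
    assert(shift < sizeof(unsigned) * CHAR_BIT);  // a larger shift is undefined
    const unsigned mask = (1u << shift) - 1u;     // the low bits that the shift discards
    return (x >> shift) + ((x & mask) != 0u);     // add 1 if any discarded bit was set
}
```

With shift = 3 this is the same expression as the y = 8 line above.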
For generic positive arguments only, I tend to do it like so:
q = x/y + !!(x % y);
This works for positive or negative numbers:
q = x / y + ((x % y != 0) ? !((x > 0) ^ (y > 0)) : 0);
If there is a remainder, checks to see if x and y are of the same sign and adds 1 accordingly.
simplified generic form,
int div_up(int n, int d) {
return n / d + (((n < 0) ^ (d > 0)) && (n % d));
} //i.e. +1 iff (not exact int && positive result)
For a more generic answer, C++ functions for integer division with well defined rounding strategy
For signed or unsigned integers.
q = x / y + !(((x < 0) != (y < 0)) || !(x % y));
For signed dividends and unsigned divisors.
q = x / y + !((x < 0) || !(x % y));
For unsigned dividends and signed divisors.
q = x / y + !((y < 0) || !(x % y));
For unsigned integers.
q = x / y + !!(x % y);
Zero divisor fails (as with a native operation). Cannot cause overflow.
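The signed form can be wrapped in a function and checked across all four sign combinations. This is a sketch with an assumed function name. The asserts spell out the two inputs for which native division is itself undefined.

```cpp
#include <cassert>
#include <climits>

// Ceiling of x / y for signed operands of either sign.
int div_ceil(int x, int y) {
    assert(y != 0);
    assert(!(x == INT_MIN && y == -1));  // the quotient itself would overflow
    // Add 1 only when the signs match (positive quotient) and there is a remainder.
    return x / y + !(((x < 0) != (y < 0)) || !(x % y));
}
```

For example, 7 / 2 gives 4 and -7 / 2 gives -3, because truncation toward zero already rounds a negative quotient up.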
Corresponding floored and modulo constexpr implementations here, along with templates to select the necessary overloads (as full optimization and to prevent mismatched sign comparison warnings):
https://github.com/libbitcoin/libbitcoin-system/wiki/Integer-Division-Unraveled
Compile with -O3; the compiler optimizes this well.
q = x / y;
if (x % y) ++q;