Compute modulo between floating point numbers in C++ - c++

I have the following code to compute modulo between two floating point numbers:
auto mod(float x, float denom)
{
return x>= 0 ? std::fmod(x, denom) : denom + std::fmod(x + 1.0f, denom) - 1.0f;
}
It only partially works for negative x:
-8 0
-7.75 0.25
-7.5 0.5
-7.25 0.75
-7 1
-6.75 1.25
-6.5 1.5
-6.25 1.75
-6 2
-5.75 2.25
-5.5 2.5
-5.25 2.75
-5 3
-4.75 -0.75 <== should be 3.25
-4.5 -0.5 <== should be 3.5
-4.25 -0.25 <== should be 3.75
-4 0
-3.75 0.25
-3.5 0.5
-3.25 0.75
-3 1
-2.75 1.25
-2.5 1.5
-2.25 1.75
-2 2
-1.75 2.25
-1.5 2.5
-1.25 2.75
-1 3
-0.75 3.25
-0.5 3.5
-0.25 3.75
0 0
How can I fix it for negative x? denom is assumed to be an integer greater than 0. Note: fmod as provided by the standard library is broken for x < 0.0f.
x is in the left column, and the output is in the right column, like so:
for(size_t k = 0; k != 65; ++k)
{
auto x = 0.25f*(static_cast<float>(k) - 32);
printf("%.8g %.8g\n", x, mod(x, 4));
}

Note: fmod as is provided by the standard library is broken for x < 0.0f
I guess you want the result to always be a positive value (1):
In mathematics, the result of the modulo operation is an equivalence class, and any member of the class may be chosen as representative; however, the usual representative is the least positive residue, the smallest non-negative integer that belongs to that class (i.e., the remainder of the Euclidean division).
The usual workaround was shown in Igor Tandetnik's comment, but that does not seem to be enough.
@IgorTandetnik That worked. Pesky signed zero though, but I guess you cannot do anything about that.
Well, consider this(2, 3):
auto mod(double x, double denom)
{
auto const r{ std::fmod(x, denom) };
return std::copysign(r < 0 ? r + denom : r, 1);
}
1) https://en.wikipedia.org/wiki/Modulo
2) https://en.cppreference.com/w/cpp/numeric/math/copysign
3) https://godbolt.org/z/fdr9cbsYT

Related

Getting 'ValueError: x and y must be 1D arrays of the same length' when they are in fact 1D arrays of same length

I have this dataframe:
key variable value
0 0.25 -0.2 606623.455859
1 0.27 -0.2 621462.029200
2 0.30 -0.2 640299.078053
3 0.33 -0.2 653686.910706
4 0.35 -0.2 659278.593742
5 0.37 -0.2 665684.466383
6 0.40 -0.2 671975.695814
7 0.25 0 530091.733402
8 0.27 0 542501.852937
9 0.30 0 557799.179433
10 0.33 0 571140.149887
11 0.35 0 575117.783803
12 0.37 0 582709.048163
13 0.40 0 588168.965913
14 0.25 0.2 466275.721535
15 0.27 0.2 478678.452615
16 0.30 0.2 492749.041489
17 0.33 0.2 500792.917910
18 0.35 0.2 503620.638204
19 0.37 0.2 507884.996510
20 0.40 0.2 512504.976664
21 0.25 0.5 351579.595889
22 0.27 0.5 359555.855803
23 0.30 0.5 368924.362358
24 0.33 0.5 375069.238800
25 0.35 0.5 377847.414729
26 0.37 0.5 381146.573247
27 0.40 0.5 383836.933547
And I am trying to make a contour plot using this dataframe with the following code:
x = df['key'].values
y = df['variable'].values
z = df['value'].values
plt.tricontourf(x, y, z, colors='k')
I keep getting this error:
ValueError: x and y must be 1D arrays of the same length
But whenever I check the len, .size, .shape, and .ndim of x and y, they are 1D arrays of the same length. Does anyone know why I would get this error?
x.shape returns (28L,) and y.shape returns (28L,) as well
Okay, I found a way to make it work. I'm really not sure why it didn't work the original way, because I was feeding tricontourf 1D arrays, but basically I wrapped my data in a list() call just to make doubly sure they were 1D sequences. This made it work. Here's the code:
x = df_2020_pivot['key'].values
y = df_2020_pivot['variable'].values
z = df_2020_pivot['value'].values
plt.tricontourf(list(x), list(y), list(z))
plt.show()
And this is what it produced
I had the same issue crop up. I was passing in two numpy arrays of the same length, and got the 'must be 1D arrays of same length' error. Looking at type(array), the arrays I was passing in were numpy.ndarrays. I used array.tolist() to turn them into simple (1D) lists, and this removed the error for me. Wrapping in the list() function as mentioned above also works.
x = df['key'].values.tolist()
y = df['variable'].values.tolist()
z = df['value'].values
plt.tricontourf(x, y, z, colors='k')

Why does this expression resolve to zero?

int a = 1/2 == 0.25 * 2;
I'm not sure why I'm not seeing this. Am I missing something with precedence?
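Let's dig in:
int a = 1/2 == 0.25 * 2;
Precedence is not the issue: / and * bind tighter than ==, which in turn binds tighter than =, so the expression is parsed as a = ((1/2) == (0.25 * 2)). First, 1/2 is an integer division and yields 0 (type int), while 0.25 * 2 yields 0.5 (type double). Is 0 equal to 0.5? No. So a receives the value 0 (false).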

Why do all my double precision calculations in a loop give the same results?

I have some trouble doing this question:
Your first program is a simple mathematical calculation program. The program will take inputs from the user. Evaluate f(x) and display the results in a table. The inputs from the user will be: two doubles xmin and xmax. A symbolic constant (POINTS) is used to determine the number of rows in the table. The equation is the sum of two cosine functions;
f(x) = 0.0572 cos(4.667 x) + 0.0218 π cos(12.22 x); [Equation 1]
Hint:
You may use the function cos found in <cmath>, in your calculations.
The value for π is a named constant, and its value is 3.1416;
Approach:
Define a symbolic constant (POINTS) to set the number of rows to 20,
Ask the user for the values of xmin and xmax;
Using the three values above (POINTS, xmin and xmax) compute the values for increments on x;
Use Equation 1 to compute the values of f(x);
Display a table with the following format:
Assume the value for POINTS is 21, and the user enters xmin = −2 and xmax = 2; then your program should display a table like:
X-Value | Y-Value
__________|__________
-2 -0.0043
-1.8 -0.0982
-1.6 +0.0738
-1.4 +0.0438
-1.2 +0.0099
-1 +0.0618
-0.8 -0.1118
-0.6 -0.0198
-0.4 -0.0047
-0.2 -0.0184
0 +0.1257
0.2 -0.0184
0.4 -0.0047
0.6 -0.0198
0.8 -0.1118
1 0.0618
1.2 0.0099
1.4 0.0438
1.6 0.0738
1.8 -0.0982
2 -0.0043
─────────────────────
So far I have parts 1 to 4, but I am struggling with part 5. Here is what I have for my code.
#include <iostream>
#include <string>
#include <cmath>
#define pi 3.1416
#define POINTS 20
using namespace std;
int main()
{
int Xmin, Xmax;
int step;
cout << "Enter a value for xMin and xMax:\n";
cin >> Xmin >> Xmax;
double x, y;
step = (Xmax - Xmin) / POINTS;
cout << "X-VALUES " << "" << "| " << "" << "Y-VALUES" << endl;
cout << "_________" << "" << "|_" << "" << "_________" << endl;
for (int i = 0; i < POINTS; ++i)
{
x = Xmin + (step * i);
y = 0.0572 * cos(4.667 * x) + 0.0218 * pi * cos(12.22 * x);
cout << x << "\t " << y << endl;
}
cout << "____________________" << endl;
return 0;
}
It does everything, except it does not print the right values. Here is what my code prints:
Enter a value for xMin and xMax:
-2
2
X-Value | Y-Value
__________|__________
-2 -0.0043
-2 -0.0043
-2 -0.0043
-2 -0.0043
-2 -0.0043
-2 -0.0043
-2 -0.0043
-2 -0.0043
-2 -0.0043
-2 -0.0043
-2 -0.0043
-2 -0.0043
-2 -0.0043
-2 -0.0043
-2 -0.0043
-2 -0.0043
-2 -0.0043
-2 -0.0043
-2 -0.0043
-2 -0.0043
-2 -0.0043
─────────────────────
I don't know what I am doing wrong. I can't get the values to match the table above in part 5. It's just repeating itself 20 times. The i increments by one, so shouldn't I see a value change on the next increment when the program outputs the x and y? Help me, I'm so confused.
You have some problems with integer division and type precision.
Obviously you want to apply step in a double precision context, thus step itself should be declared as double:
double step = .0;
The expression (Xmax - Xmin) / POINTS suffers from integer division, which cannot produce any nonzero value strictly between -1 and 1. With Xmin = -2, Xmax = 2 and POINTS = 20 it evaluates to 4 / 20 = 0, so x never changes. To overcome the integer division issue, cast at least one operand to double:
double step = (double)(Xmax - Xmin) / POINTS;
First, use initialization instead of declaring a variable with no value and later assigning a value to it. So, for example, remove double x, y; and change these two lines:
x = Xmin + (step * i);
y = 0.0572 * cos(4.667 * x) + 0.0218 * pi * cos(12.22 * x);
to these:
double x = Xmin + (step * i);
double y = 0.0572 * cos(4.667 * x) + 0.0218 * pi * cos(12.22 * x);
That makes it much easier to see what types are involved in expressions like those.
Now do the same for the variable step. If the problem isn't immediately obvious when you do that, add an output statement that shows the value of step.

How to efficiently cycle consecutive numbers c++

I am looking for a solution for cycling through consecutive numbers based on an input value. Similar to modulo, but different for negative numbers. Is there a better solution than the inefficient code below? Here are some input/output examples:
Numbers range 0 to 2
-2 -> 1
-1 -> 2
0 -> 0
1 -> 1
2 -> 2
3 -> 0
4 -> 1
//Inefficient Code example
int getConsecutiveVal(int min, int max, int input) //Inclusive in this scenario
{
while (input>max)
input -= (1+max-min);
while (input<min)
input += (1+max-min);
return input;
}
//Incorrect Code example since func(0,2,-1) returns 2
int getConsecutiveVal(int min, int max, int input)
{
return (input % (1+max-min))+min;
}
To be able to increment or decrement, I used the following function. It's more than 1 line, but fewer math operations. It's similar in spirit to the original poster's format. Tested for positive and negative cases.
int16_t cycleIncDec(int16_t x, int16_t dir, int16_t xmin, int16_t xmax) {
// inc/dec with constrained range
// the supplied xmax must be greater than xmin
x += dir;
if (x > xmax) x = xmin;
else if (x < xmin) x = xmax;
return x;
}
Output of cycleIncDec() with various start values and step sizes
x: 11: +1 0 1 2 3 4 5 6 0 1 2 3
x: 4: -1 3 2 1 0 -1 -2 -3 -4 -5 -6 -7
x: -8: -1 -13 -12 -11 -10 -9 -8 -13 -12 -11 -10 -9
x:-190: -2 -192 -194 -196 -198 -200 -170 -172 -174 -176 -178 -180
In principle, you need the modulo operator. The problem is that in C it doesn't work as expected for negative numbers.
If you know the minimum input value, you can just add a positive number x big enough to transform all negative numbers to positive. It won't affect the result if x % R = 0 (in your example R=3.)
In your example, if you add, say, 3*10 to all inputs and perform the modulo operation you'll get the desired result:
mod(3*10+[-2 -1 0 1 2 3 4], 3)
= 1 2 0 1 2 0 1
(the above is matlab notation and is specialized to the example you have presented. I'll leave it to you to extend it to arbitrary min/max)
A specific formula for the case you have presented:
You have suggested using
((input+abs(input)*(1+max-min)) % (1+max-min))+min
However, this formula does not work. For two reasons:
First, if input=0, the abs() returns 0 and you get the minimum value as output (This is not always what your explicit while-based loop produces)
Second, you forgot to subtract min from the input before the operation.
So the correct formula is the following (using x for input):
(x - xmin + (1+abs(x))*(1+xmax-xmin)) % (1+xmax-xmin) + xmin
You can call % twice to get the right behaviour, since a%b, for positive b, is guaranteed to lie in [-b+1, b-1].
int getConsecutiveVal(int min, int max, int input)
{
int range_len = (1 + max - min);
input -= min;
return (((input % range_len) + range_len) % range_len) + min;
}

C/C++: 1.00000 <= 1.0f = False

Can someone explain why 1.000000 <= 1.0f is false?
The code:
#include <iostream>
#include <stdio.h>
using namespace std;
int main(int argc, char **argv)
{
float step = 1.0f / 10;
float t;
for(t = 0; t <= 1.0f; t += step)
{
printf("t = %f\n", t);
cout << "t = " << t << "\n";
cout << "(t <= 1.0f) = " << (t <= 1.0f) << "\n";
}
printf("t = %f\n", t );
cout << "t = " << t << "\n";
cout << "(t <= 1.0f) = " << (t <= 1.0f) << "\n";
cout << "\n(1.000000 <= 1.0f) = " << (1.000000 <= 1.0f) << "\n";
}
The result:
t = 0.000000
t = 0
(t <= 1.0f) = 1
t = 0.100000
t = 0.1
(t <= 1.0f) = 1
t = 0.200000
t = 0.2
(t <= 1.0f) = 1
t = 0.300000
t = 0.3
(t <= 1.0f) = 1
t = 0.400000
t = 0.4
(t <= 1.0f) = 1
t = 0.500000
t = 0.5
(t <= 1.0f) = 1
t = 0.600000
t = 0.6
(t <= 1.0f) = 1
t = 0.700000
t = 0.7
(t <= 1.0f) = 1
t = 0.800000
t = 0.8
(t <= 1.0f) = 1
t = 0.900000
t = 0.9
(t <= 1.0f) = 1
t = 1.000000
t = 1
(t <= 1.0f) = 0
(1.000000 <= 1.0f) = 1
As correctly pointed out in the comments, the value of t is not actually the same 1.00000 that you are defining in the line below.
Printing t with higher precision with std::setprecision(20) will reveal its actual value: 1.0000001192092895508.
The common way to avoid these kinds of issues is to compare not with 1, but with 1 + epsilon, where epsilon is a very small number, maybe one or two orders of magnitude greater than your floating point precision.
So you would write your for loop condition as
for(t = 0; t <= 1.000001f; t += step)
Note that in your case, epsilon should be at least ten times greater than the maximum possible floating point error, as the float is added ten times.
As pointed out by Muepe and Alain, the reason for t != 1.0f is that 1/10 cannot be precisely represented in binary floating point numbers.
Floating point types in C++ (and most other languages) are implemented using an approach that uses the available bytes (for example 4 or 8) for the following 3 components:
Sign
Exponent
Mantissa
Let's have a look at it for a 32 bit (4 byte) type, which is often what you have in C++ for float.
The sign is just a single bit, being 1 or 0, where 0 means the number is positive and 1 that it is negative. If you ignored every standard that exists, you could also say 0 -> negative, 1 -> positive.
The exponent uses 8 bits. Unlike in daily life, this exponent is not meant to be applied to base 10 but to base 2. That means 1 as an exponent does not correspond to 10 but to 2, and the exponent 2 means 4 (=2^2) and not 100 (=10^2).
Another important point is that for floating point variables we also want negative exponents, like 2^-1 being 0.5, 2^-2 being 0.25 and so on. Thus we define a bias value that gets subtracted from the stored exponent to yield the real one. With 8 bits the bias is 127, so a stored exponent of 127 means 2^0. There are exceptions, though: in IEEE 754 the stored value 0 is reserved for zero and subnormal numbers, and the stored value 255 marks infinity and NaN. Therefore the stored exponent of a normal number runs from 1 to 254, giving a range from 2^-126 to 2^127.
The mantissa fills up the remaining 23 bits. If we see the mantissa as a series of 0s and 1s, you can imagine its value to be 1.m, where m is the series of those bits, not in powers of 10 but in powers of 2. So 1.1 would be 1 * 2^0 + 1 * 2^-1 = 1 + 0.5 = 1.5. As an example, let's have a look at the following mantissa (a very short one):
m = 100101 -> 1.100101 to base 2 -> 1 * 2^0 + 1 * 2^-1 + 0 * 2^-2 + 0 * 2^-3 + 1 * 2^-4 + 0 * 2^-5 + 1 * 2^-6 = 1 * 1 + 1 * 0.5 + 1 * 1/16 + 1 * 1/64 = 1.578125
The final value of a float is then calculated using (with e being the stored exponent):
2^(e - 127) * 1.m * (sign ? -1 : 1)
What exactly is going wrong in your loop: your step is 0.1! 0.1 is a very bad number for binary floating point. Let's have a look why:
sign -> 0 (as its non-negative)
exponent -> The largest power of two not exceeding 0.1 is 2^-4 = 0.0625. So the stored exponent should be -4 + 127 = 123
mantissa -> For this we check how many times 2^-4 fits into 0.1 and then convert the fractional part to a mantissa. 0.1 / (2^-4) = 0.1/0.0625 = 1.6. Since the value has the form 1.m, our mantissa should represent 0.6. So let's convert that to binary:
0.6 = 1 * 2^-1 + 0.1 -> m = 1
0.1 = 0 * 2^-2 + 0.1 -> m = 10
0.1 = 0 * 2^-3 + 0.1 -> m = 100
0.1 = 1 * 2^-4 + 0.0375 -> m = 1001
0.0375 = 1 * 2^-5 + 0.00625 -> m = 10011
0.00625 = 0 * 2^-6 + 0.00625 -> m = 100110
0.00625 = 0 * 2^-7 + 0.00625 -> m = 1001100
0.00625 = 1 * 2^-8 + 0.00234375 -> m = 10011001
We could continue like this until we have our 23 mantissa bits, but I can tell you that you get:
m = 10011001100110011001...
Therefore 0.1 in a binary floating point environment is like 1/3 in a base 10 system: it is an infinitely repeating fraction. As the space in a float is limited, the pattern has to be cut off at the 23rd bit. The stored value ends up a tiny bit greater than 0.1, because the 23rd bit would be a 0 but, since the discarded tail starts with a 1, it gets rounded up to a 1.
The reason is that 1.0/10.0 = 0.1 can not be represented exactly in binary, just as 1.0/3.0 = 0.333.. can not be represented exactly in decimals.
If we use
float step = 1.0f / 8;
for example, the result is as expected.
To avoid such problems, use a small offset as shown in the answer of mic_e.