This question already has answers here:
Is floating point math broken?
(31 answers)
Closed 7 years ago.
I am new to Fortran 90. I have written a simple program to add two floating-point numbers, as follows:
program Numbers_sum
implicit none
REAL :: sum
sum = 1.6+2
print*,"Sum =", sum
end
I am getting the answer as Sum = 3.5999999
Why is it not giving 3.6? How can I make this program give the exact answer? Any help will be appreciated.
There is no way to write 3.6 in base 2 with a finite number of digits. It's 2 + 1 + 1/2 + 1/16 + 1/32 + ..., a series that never terminates, so the stored value has to be rounded.
You can, however, hide the rounding error by selecting proper formatting:
write(*, '(F11.6)') sum
If you want to calculate in higher precision, you could use this:
REAL(KIND=8) :: var
Or, if you want to be really proper (the kind number 8 is compiler-specific, so it is more portable to request the precision you need):
program numbers_sum
implicit none
integer, parameter :: dp = selected_real_kind(P=12)
real(kind=dp) :: sum1
sum1 = 1.6_dp + 2
print *, "Sum = ", sum1
end
But even this won't eliminate the rounding completely.
Cheers
This question already has answers here:
Assigning a lower precision number to a higher precision in Fortran90
(1 answer)
Numerical Precision in Fortran 95:
(2 answers)
Closed 4 years ago.
I'm trying to understand the difference between these two:
REAL*8 X
X=1.5D0
WRITE(6,*) (0.6D0 + (2*x)/5.)
WRITE(6,*) (0.6 + (2*x)/5.D0)
I would expect them to give identical results, but I get instead
1.2000000000000000
1.2000000238418580
Why is 0.6 not cast to double precision in the second case, even though it is added to a double-precision value? What is happening?
This question already has answers here:
What is the behavior of integer division?
(6 answers)
Closed 4 years ago.
I was calculating the volume of a sphere and after tons of research I found that I cannot use:
float sphereRadius = 2.33;
float volSphere = 0;
volSphere = (4/3) * (M_PI) * std::pow(sphereRadius, 3);
Instead, I must write 3.0 to get the right answer:
volSphere = (4/3.0) * (M_PI) * std::pow(sphereRadius, 3);
Why must a decimal be added to get the correct calculation?
(4/3) is one integer divided by another integer, which results in another integer. An integer can't be 1.33 or anything like that, so it gets truncated to 1. With the decimal, you're telling it to be a double instead, and dividing an integer by a double results in a double, which supports fractions.
This question already has an answer here:
Fibonacci numbers becoming negative after a certain term
(1 answer)
Closed 5 years ago.
I was solving Project Euler problems using Fortran; the problem is to create a Fibonacci sequence and find the sum of all even numbers that come under 4 million. Here's what I wrote
implicit none
integer*4::a(1:4000000),sum
integer*4::i,maxc
maxc = 3999999
a(1) = 1
a(2) = 2
do i = 3,maxc,1
a(i) = a(i-1) + a(i-2)
end do
sum = 0
do i = 1,maxc
if (mod(a(i),2)==0) then
sum = sum + a(i)
end if
end do
print*,sum
end
The output is -1833689714
Any idea what went wrong?
Due to the size of the integer kind you chose, there is a limit to the numbers you can represent.
In your code it is 2147483647 (with gfortran, obtained by print *,huge(sum)).
With a(1) = 1 and a(2) = 2, this limit is first exceeded at i=46 in your implementation, where the true value would be 2971215073.
Then, you get an integer overflow, and the value becomes negative.
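Using a floating-point representation for the sum, i.e.
real :: sum
avoids overflowing the accumulator, but the terms a(i) are still integer*4 and still overflow. Since the problem only asks for terms below 4 million, stop generating terms once that value is reached; every term and the sum then fit comfortably in a default integer.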
This question already has answers here:
Why does dividing two int not yield the right value when assigned to double?
(10 answers)
Closed 5 years ago.
I think the title says everything. I want to define a variable i as the fraction 1/12. However, i is 0.
double i = 1/12;
std::cout << i; // Output: 0
Or, more specific, I want to calculate a power of something:
im_ = std::pow((1 + i), (1/12)) - 1;
However, the compiler evaluates (1/12) as 0 and thus the result is wrong.
Simply because 1/12 is evaluated as integer math, not floating point math.
1/12 becomes 0 because integer math does not take into account the decimal fractions.
To get the expected result you will need to write down the numbers as a floating point literal, like this: 1.0/12.0.
More details can be found here: Why can't I return a double from two ints being divided
This question already has answers here:
Division in C++ not working as expected
(6 answers)
Closed 8 years ago.
Hello, I'm new to programming and have run into an issue. I have an integer, for example 158, and when I divide it by 100 I get 1, but I want 1.58 instead.
It is probably a known issue, but sorry, I'm a noob, for now :)
Just cast it to a floating-point number:
int i = 158;
float f = (float)i / 100;   // less precision
double d = (double)i / 100; // more precision
// other way: make the divisor a floating-point literal
float f2 = i / 100.0f;      // less precision
double d2 = i / 100.0;      // more precision
What you are doing is dividing an integer by an integer, and in that case the result is always an integer. To get a floating-point number, at least one of the two operands has to be a floating-point number.
You need to divide by 100.0 rather than 100
Dividing an integer by an integer in C++ is always going to give you an integer, with the fractional part discarded. That being said, it was mentioned above that you can divide by a double or float instead to get the decimal number that you desire.