Numerical stability of double zero [closed] - c++

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
I have a vector that contains non-negative doubles. I want to distinguish the cases when an entry is equal to zero and when an entry is greater than zero.
Is it numerically safe to just check if(a>0.0) or can this cause problems? I have no a-priori lower bound for the non-zero values, except machine precision. Should I create a helper-vector containing integers to mark the zero-values for safe checking?
For better understanding: The entries of the vector are something like weights on a graph, and I figured I don't need the adjacency matrix to keep track of the graph topology.
EDIT: My question is: Can and will 0.0 be exactly represented in doubles?

Floating point numbers aren't literally evil. Nor are they designed by stupid people. The one and only issue you need to concern yourself with here, is that of rounding.
A number which is set to zero, will be zero. There would be no reason to design a computational system which did not behave this way.
A number which is set to 0.1 will not be 0.1, because 0.1 is not exactly representable and is therefore rounded to the nearest representable number; see Is floating point math broken? for details. But if you set two variables to 0.1 they will compare equal to each other, because 0.1 is rounded the same way each time. (In fact the rounding happens during compilation; at runtime you're just setting the variable to the pre-rounded value.)
Similarly, a number which is set to 0.1 * 3 - 0.3 may not be equal to zero, because 0.1 was rounded, and then the rounded result was multiplied by 3 and that result was rounded, and so on.
So the issue is not one of representation, but of computation. If you set something to a particular value, that's the value it has. If it got there through a sequence of inexact computations, you can't rely on exact equality.
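As a rough illustration of the points above (a sketch, not part of the original answer; the helper names `count_positive` and `inexact_residue` are made up for this example, and IEEE 754 doubles are assumed): an entry that was assigned 0.0 stays exactly zero, so `w > 0.0` separates it from any positive value, while a value that reaches "zero" through rounded steps may miss it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts entries strictly greater than zero.
// An entry that was assigned 0.0 (or -0.0) is never counted.
std::size_t count_positive(const std::vector<double>& weights) {
    std::size_t n = 0;
    for (double w : weights) {
        if (w > 0.0) {
            ++n;
        }
    }
    return n;
}

// Hypothetical helper: mathematically zero, but every step is rounded,
// so with IEEE 754 doubles the result is a tiny non-zero number.
double inexact_residue() {
    double tenth = 0.1;
    double three_tenths = 0.3;
    return tenth * 3 - three_tenths;
}
```

For the graph-weight use case in the question, this suggests that a separate integer marker vector is only needed if the zeros are produced by computation rather than by assignment.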

Related

Floating point number answer difference between c++ and calculator [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 1 year ago.
I am calculating a floating-point number with the formula number = 1/(n-2.001),
where n is any integer from 1 upward.
But it gives me different answers on my laptop and on a scientific calculator:
C++ calculation: 0.333444
Calculator answer: 0.3334444815
I need to get all the digits in C++. How can I do this?
Decimal equivalent of 1/3 is 0.33333333333333….
An infinitely long number would require infinite memory to store, and we typically have 4 or 8 bytes. Therefore, floating-point numbers store only a certain number of significant digits, and the rest are lost.
NOTE: When outputting floating-point numbers, std::cout has a default precision of 6 significant digits and rounds the output to that precision.
The precision of a floating-point number defines how many significant digits it can represent without information loss.
Therefore, in your case only 6 significant digits are printed and the rest are dropped from the output; the stored value still holds more digits.
To print more digits, change the output precision of the stream (for example with std::setprecision).
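A minimal sketch of changing the output precision, assuming the value in the question came from n = 5 (the question does not say so; that is inferred from the printed digits). The helper name `format_with_precision` is made up for this example; `std::setprecision` from `<iomanip>` is the standard manipulator.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats `value` with the given number of
// significant digits, the way a stream does after std::setprecision.
std::string format_with_precision(double value, int digits) {
    std::ostringstream out;
    out << std::setprecision(digits) << value;
    return out.str();
}
```

With `double number = 1 / (5 - 2.001);`, calling `format_with_precision(number, 6)` gives 0.333444 and `format_with_precision(number, 10)` gives 0.3334444815. Writing to the console works the same way: `std::cout << std::setprecision(10) << number;`.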

How to represent irrational numbers in c++ [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I want to represent all irrational numbers with a class in C++.
How can I do that? What should my data members and functions be?
Thanks in advance...
The only way that I think that you may be able to achieve something of this nature would be with identifiers and not the actual mathematical or number representations. Even in pure mathematics Irrational numbers are labeled as irrational due to given postulates. Even a human can not truly represent an irrational number by its digits. So the only thing I can suggest is to have an identifier of the known irrational numbers such as something like this:
enum Irrational {
PI = 0,
E,
SQRT2,
...
};
Then you might want to make an association of them with a map like this:
#include <map>
#include <utility>
std::map<Irrational, double> myIrrationals;
myIrrationals.insert( std::make_pair( PI, 3.141592654 ) ); // stored value is only an approximation
Then your check for irrational numbers would be true if they are found in this map and false otherwise.
You cannot represent irrational numbers even in the pure math, except symbolically (like Pi, sqrt(2) - you can say "Pi" but you cannot write its exact value on the paper). And the same applies to the computer representation - if you want to represent them exactly, you cannot represent them as a "real" numbers, only symbolically (in the computer it is actually difficult to represent even the rational numbers precisely).
So, to answer your question - as a consequence of the above, your data members could be for example strings (symbols or entire expressions represented as strings, like "Pi" or "sqrt(2)") and/or combined with expression trees (operators and operands to store the expressions which represent the irrational numbers, like operator=sqrt, operand=2 and alike).
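The expression-tree idea can be sketched roughly as follows. This is an illustration only: the type `Expr` and the functions `integer`, `pi_constant`, `sqrt_of`, `describe` and `approximate` are invented names, and only three node kinds are covered.

```cpp
#include <cmath>
#include <memory>
#include <string>

// Hypothetical symbolic node: an integer constant, the constant Pi,
// or the square root of another expression.
struct Expr {
    enum class Kind { Integer, Pi, Sqrt };
    Kind kind = Kind::Integer;
    long value = 0;                       // used only by Integer
    std::shared_ptr<const Expr> operand;  // used only by Sqrt
};

std::shared_ptr<const Expr> integer(long v) {
    auto e = std::make_shared<Expr>();
    e->value = v;
    return e;
}

std::shared_ptr<const Expr> pi_constant() {
    auto e = std::make_shared<Expr>();
    e->kind = Expr::Kind::Pi;
    return e;
}

std::shared_ptr<const Expr> sqrt_of(std::shared_ptr<const Expr> x) {
    auto e = std::make_shared<Expr>();
    e->kind = Expr::Kind::Sqrt;
    e->operand = x;
    return e;
}

// Exact, symbolic form of the number.
std::string describe(const Expr& e) {
    switch (e.kind) {
        case Expr::Kind::Integer: return std::to_string(e.value);
        case Expr::Kind::Pi:      return "Pi";
        case Expr::Kind::Sqrt:    return "sqrt(" + describe(*e.operand) + ")";
    }
    return "";
}

// Floating-point approximation, only as good as double allows.
double approximate(const Expr& e) {
    switch (e.kind) {
        case Expr::Kind::Integer: return static_cast<double>(e.value);
        case Expr::Kind::Pi:      return std::acos(-1.0);
        case Expr::Kind::Sqrt:    return std::sqrt(approximate(*e.operand));
    }
    return 0.0;
}
```

For example, `describe(*sqrt_of(integer(2)))` yields the exact symbolic form `sqrt(2)`, while `approximate` gives a nearby double. The symbolic form is the representation; the double is only a view of it.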

How to avoid to show a float: -0.0 [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
I want to display my float with just one decimal digit and avoid showing -0.0.
My program is a C/C++ development for an Arduino board.
How can I do that?
(Applies to both C and C++).
IEEE754 floating point defines a signed zero. This is the effect you're observing here. One may obtain negative zero as the result of certain computations, for instance as the result of arithmetic underflow on a negative number, or −1.0 * 0.0, or simply as −0.0.
-0.0 is defined to equal 0.0.
One solution would be to analyse it as a special case: x == -0.0 ? /*handle -0.0 and 0.0 here: e.g. ::fabs(x)*/ : /*non-zero cases here*/.
See http://en.wikipedia.org/wiki/Signed_zero
You can try printing it this way, using the C ternary operator:
printf("%.1f", (f > -0.05 && f <= 0) ? 0.0 : f);
Here f is a floating-point variable; if f is, say, -0.0001 (or -0.0 itself), this prints 0.0. The test uses -0.05 rather than -1 because only values in that range round to -0.0 with one decimal digit; a value such as -0.5 must keep its sign.
To avoid showing the negative sign you need to actually round the number before displaying it, since the sign of a number is independent of the precision it is printed with.
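Another way to get the same effect is to format first and then inspect the text. The sketch below assumes a hosted C++ environment where `std::snprintf` supports `%f`; on some Arduino toolchains printf-style float formatting is disabled by default, so it may need adapting there. The name `format_one_decimal` is made up for this example.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: formats `value` with one decimal digit and
// drops the sign when the rounded text would be "-0.0".
std::string format_one_decimal(double value) {
    char buffer[64];  // snprintf truncates safely if the text is longer
    std::snprintf(buffer, sizeof buffer, "%.1f", value);
    if (std::strcmp(buffer, "-0.0") == 0) {
        return "0.0";
    }
    return buffer;
}
```

Checking the formatted text avoids having to work out the rounding threshold by hand: whatever rounds to -0.0 is printed as 0.0, and everything else keeps its sign.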

How to design an algorithm that multiplies two floats without '*'? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 8 years ago.
This question appears to be off-topic because it lacks sufficient information to diagnose the problem. Describe your problem in more detail or include a minimal example in the question itself.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
How do I design an algorithm that takes two floats and multiplies them using only addition, bit shifting and bitwise operations?
I have already found one like this for integers, but that doesn't work with floats.
I have also found another that is much like what I need but log is also prohibited in my case.
The floats are stored according to the IEEE754 standard. I have also tried to keep their exponent part, and bitwise multiply their fractional part with no luck.
According to http://en.wikipedia.org/wiki/IEEE_floating_point, an IEEE754 number x = (-1)^s * c * b^q is represented by the integers s, c and q; the base b is the same for both floating point numbers (b = 2 for the binary formats).
So the multiplication of two floating point numbers x and y is:
(-1)^(s1+s2) * c1*c2 * b^(q1+q2), so the new floating point number is represented by sign s1+s2 (taken modulo 2, i.e. s1 XOR s2), significand c1*c2 and exponent q1+q2. That leaves only the multiplication of c1 and c2, which are both integers, so the integer algorithm you already found applies. The product then has to be renormalized (and rounded) so that it fits the significand width again.
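The recipe above might look like this for 32-bit floats. It is a simplified sketch under loud assumptions: `float` is the IEEE 754 binary32 format, the result is truncated rather than rounded to nearest, zero and subnormal inputs as well as underflow are flushed to signed zero, and infinity/NaN inputs are not handled. All function names are invented for the example.

```cpp
#include <cstdint>
#include <cstring>

// Shift-and-add product of two unsigned integers (no '*').
std::uint64_t mul_shift_add(std::uint64_t a, std::uint64_t b) {
    std::uint64_t result = 0;
    while (b != 0) {
        if (b & 1u) {
            result += a;
        }
        a <<= 1;
        b >>= 1;
    }
    return result;
}

// Hypothetical helper: builds a float from its raw 32-bit pattern.
float from_bits(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Simplified multiply; see the assumptions listed in the text above.
float soft_mul(float x, float y) {
    std::uint32_t bx, by;
    std::memcpy(&bx, &x, sizeof bx);
    std::memcpy(&by, &y, sizeof by);

    const std::uint32_t sign = (bx ^ by) & 0x80000000u;  // s1 XOR s2
    const std::int32_t ex = static_cast<std::int32_t>((bx >> 23) & 0xFFu);
    const std::int32_t ey = static_cast<std::int32_t>((by >> 23) & 0xFFu);
    if (ex == 0 || ey == 0) {
        return from_bits(sign);  // zero or subnormal input: signed zero
    }

    // Restore the implicit leading 1 to get 24-bit integer significands.
    const std::uint64_t cx = (bx & 0x7FFFFFu) | 0x800000u;
    const std::uint64_t cy = (by & 0x7FFFFFu) | 0x800000u;
    std::uint64_t product = mul_shift_add(cx, cy);  // in [2^46, 2^48)

    std::int32_t e = ex + ey + (-127);  // add exponents, remove one bias
    if (product & (std::uint64_t{1} << 47)) {  // renormalize to [2^46, 2^47)
        product >>= 1;
        e += 1;
    }
    if (e <= 0) {
        return from_bits(sign);  // underflow: flushed to signed zero
    }
    if (e >= 255) {
        return from_bits(sign | 0x7F800000u);  // overflow: infinity
    }

    // Keep the 23 bits below the leading 1; lower bits are truncated.
    const std::uint32_t fraction =
        static_cast<std::uint32_t>(product >> 23) & 0x7FFFFFu;
    return from_bits(sign | (static_cast<std::uint32_t>(e) << 23) | fraction);
}
```

Because of the truncation, results can differ from the hardware `*` in the last bit whenever the exact product does not fit in 24 bits; products that are exactly representable, such as 1.5f times 2.0f, come out identical.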

how to differ rational and irrational number in C++ [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
How can I tell whether my float variable stores an irrational number?
I'm kind of a newbie in C++
and I don't know many of the library functions that are available.
I want to throw an exception for every calculation that ends up being an irrational number.
C++ doesn't have general arbitrary-precision rational numbers implemented. The available numbers are size-limited integers and floating point numbers.
A floating point number (in the common IEEE format) is however an integer multiplied by an exact power of two (positive or negative).
Even numbers like 0.1 = 1/10 are impossible to represent exactly because the denominator is not a power of two.
So the answer is simple :-) ... any finite number you will encounter in C++ is rational; more than that, it is an integer multiplied by a (possibly negative) power of two.
There are libraries implementing arbitrary precision integers and rational numbers, but they're not part of standard C++.
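To make the "integer times a power of two" claim concrete, here is a small sketch that recovers both parts from a double. It assumes a finite input and the 64-bit IEEE 754 format; the type `Dyadic` and the function `decompose` are names invented for this example, while `std::frexp` and `std::ldexp` are standard `<cmath>` functions.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical result type: value == significand * 2^exponent.
struct Dyadic {
    std::int64_t significand;
    int exponent;
};

// Assumes `x` is finite and double is the IEEE 754 64-bit format.
Dyadic decompose(double x) {
    int e = 0;
    const double m = std::frexp(x, &e);  // x == m * 2^e, 0.5 <= |m| < 1
    Dyadic d;
    // Exact: m has at most 53 significant bits.
    d.significand = static_cast<std::int64_t>(std::ldexp(m, 53));
    d.exponent = e - 53;
    // Strip factors of two so the integer is as small as possible.
    while (d.significand != 0 && d.significand % 2 == 0) {
        d.significand /= 2;
        ++d.exponent;
    }
    return d;
}
```

For 0.5 this gives 1 * 2^-1, and for the literal 0.1 it gives 3602879701896397 * 2^-55, which is the rational number actually stored in place of 1/10.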
C++, by default, can only manage rational numbers. Moreover it's a very specific subset of the rationals where
The numerator is not too big in absolute value
The denominator is a power of two and it's not too big
When you write
double x = 1.0;
x = x / 10.0;
you get a result that is already outside what a double can store exactly, because the denominator is not a power of two.
What the computer will do is store a close approximation in x, because 0.1 is a number that cannot be stored exactly in the IEEE double format.
A floating point number is an approximation of the intended value. It is as accurate as it can be within the limited amount of room it has to work with.
So the best bet is to limit the effect of that approximation by rearranging the calculation algebraically, which also helps to reduce rounding errors.