Float or Double Special Value - C++

I have double (or float) variables that might be "empty", as in holding no valid value. How can I represent this condition with the built-in types float and double?
One option would be a wrapper that has a float and a boolean, but that can't work, as my libraries have containers that store doubles and not objects that behave as doubles. Another would be using NaN (std::numeric_limits). But I see no way to check for a variable being NaN.
How can I solve the problem of needing a "special" float value to mean something other than the number?

We have done that by using NaN:
double d = std::numeric_limits<double>::signaling_NaN();
bool isNaN = (d != d);
A NaN value compared for equality against itself yields false. That's the way you test for NaN, but it seems to be valid only if std::numeric_limits<double>::is_iec559 is true (in which case the implementation conforms to IEEE 754 as well).
In C99 there is a macro called isnan for this in math.h, which also checks a floating-point number for a NaN value.

In Visual C++, there is a non-standard _isnan(double) function that you can import through float.h.
In C, there is an isnan(double) macro that you can import through math.h.
In C++, there is a std::isnan(double) function that you can import through cmath (since C++11).
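For illustration, a minimal sketch combining the two checks (assuming an IEEE 754 double, i.e. std::numeric_limits<double>::is_iec559 is true, and a C++11 compiler):
#include <cmath>
#include <iostream>
#include <limits>
int main()
{
    // Use quiet_NaN as the "empty" marker.
    double d = std::numeric_limits<double>::quiet_NaN();
    std::cout << (d != d) << '\n';      // 1: a NaN never compares equal to itself
    std::cout << std::isnan(d) << '\n'; // 1: the <cmath> check
}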
As others have pointed out, using NaNs can be a lot of hassle. They are a special case that has to be dealt with like NULL pointers. The difference is that a NaN will not usually cause core dumps and application failures, but they are extremely hard to track down. If you decide to use NaNs, use them as little as possible. Overuse of NaNs is an offensive coding practice.

It's not a built-in type, but I generally use boost::optional for this kind of thing. If you absolutely can't use that, perhaps a pointer would do the trick -- if the pointer is NULL, then you know the result doesn't contain a valid value.
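For example, a small sketch of the boost::optional approach (compute() is a hypothetical function, not from the answer):
#include <boost/optional.hpp>
#include <iostream>
boost::optional<double> compute(bool have_data)
{
    if (!have_data)
        return boost::none;  // explicitly "no valid value"
    return 42.0;
}
int main()
{
    boost::optional<double> result = compute(false);
    if (result)
        std::cout << *result << '\n';
    else
        std::cout << "no valid value\n";
}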

One option would be a wrapper that has a float and a boolean, but that can't work, as my libraries have containers that store doubles and not objects that behave as doubles.
That's a shame. In C++ it's trivial to create a templated class that auto-converts to the actual double (reference) attribute. (Or a reference to any other type, for that matter.) You just use the cast operator in a templated class, e.g. operator TYPE & () { return value; }. You can then use a HasValue<double> anywhere you'd normally use a double, as in the sketch below.
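A sketch of that HasValue<TYPE> idea (illustrative only, not the poster's actual class):
template <typename TYPE>
class HasValue {
public:
    HasValue() : value(), valid(false) {}
    HasValue(const TYPE& v) : value(v), valid(true) {}
    operator TYPE&() { return value; }              // auto-converts to TYPE&
    operator const TYPE&() const { return value; }
    bool hasValue() const { return valid; }
private:
    TYPE value;
    bool valid;
};
double square(double x) { return x * x; }
int main()
{
    HasValue<double> h(3.0);
    double d = square(h);  // h converts to double automatically
    return d > 0.0 ? 0 : 1;
}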
Another would be using NaN (std::numeric_limits). But I see no way to check for a variable being NaN.
As litb and James Schek also remarked, C99 provides us with isnan().
But be careful with that! NaN values make math and logic really interesting! You'd think a number couldn't be both NOT >= foo and NOT <= foo. But with NaN, it can.
There's a reason I keep a WARN_IF_NAN(X) macro in my toolbox. I've had some interesting problems arise in the past.
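For example (an illustrative WARN_IF_NAN, since the original macro isn't shown):
#include <cmath>
#include <cstdio>
#define WARN_IF_NAN(X) do { if (std::isnan(X)) std::fprintf(stderr, "NaN at %s:%d: %s\n", __FILE__, __LINE__, #X); } while (0)
int main()
{
    double foo = 1.0;
    double x = std::nan("");
    bool not_ge = !(x >= foo);  // true
    bool not_le = !(x <= foo);  // also true: NaN fails both comparisons
    WARN_IF_NAN(x);             // prints a warning with file, line, and expression
    return (not_ge && not_le) ? 0 : 1;
}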


What is the difference between std::stof and atof, and when should each be used?

I have read some documents about each, for example
http://www.cplusplus.com/reference/string/stof/
http://www.cplusplus.com/reference/cstdlib/atof/
I understand that atof is part of <cstdlib> and takes a const char* as its input parameter, while std::stof is part of <string> and has a different input format.
But it's not clear:
Can they be used interchangeably?
Do they convert to the same float value with the same input?
What scenario is best to use for each of these?
I assume you meant to compare std::atof with std::stod (both return double).
Just comparing the two linked reference pages yields the following differences:
std::atof takes a const char*, while std::stod takes either a std::string or a std::wstring (i.e., it has support for wide strings)
std::stod will also return the index of the first unconverted character if the pos parameter is not NULL (useful for further parsing of the string)
if the converted value falls outside the range of a double, std::atof returns an undefined value, while std::stod throws a std::out_of_range exception (definitely better than an undefined value)
if no conversion can be performed, std::atof returns 0.0, while std::stod throws a std::invalid_argument exception (easier to distinguish from an actual converted 0.0)
These are all positive points for std::stod, making it the more advanced alternative of the two.
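To make those differences concrete, a small sketch (not from the answer):
#include <cstdlib>
#include <iostream>
#include <string>
int main()
{
    std::string input = "3.14 rest";
    // std::stod reports how much was consumed and throws on bad input.
    std::size_t pos = 0;
    double a = std::stod(input, &pos);
    std::cout << a << " (parsed " << pos << " characters)\n";
    // std::atof has no error reporting: it returns 0.0 when no conversion
    // can be performed, which is indistinguishable from a real 0.0.
    double b = std::atof("not a number");
    std::cout << b << '\n';  // prints 0
}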
Everything Sander said is correct. However, you specifically asked:
Can they be used interchangeably?
The answer is no, at least not in the general case. If you're using atof, chances are good that you have legacy C code. In that case, you must be careful about introducing code that can throw exceptions, especially in "routine" situations such as when a user gives bad input.
Do they convert to the same float value with the same input?
No. They both convert to a double value, not a float. Conversion to the same value isn't specifically guaranteed by the standard (to my knowledge), and it is possible that there is round-off somewhere that is slightly different. However, given valid input, I would be pretty surprised if there were a difference between the return values from the two in the same compiler.
What scenario is best to use for each of these?
If:
You have a conforming C++11 or later compiler, and
Your code can tolerate exceptions (or you're willing to catch them around every call), and
You already have a std::string or the performance hit of a conversion is unimportant.
Or, if you are a C++ beginner and don't know the answers to these questions.
Then I would suggest using std::stod. Otherwise, you could consider using atof.

Generic and specific functions to get real and imaginary parts of complex variables

In Fortran, I always work with double precision, so I have been using specific functions like dble and dimag to get real and imaginary parts of complex variables. However, for other functions like sin, I no longer use dsin because the former returns a value of proper kind (i.e., sin is a generic function). The same seems to hold for complex variables. So my question is:
1) What are the most recommended generic functions for getting real and imaginary parts?
-- It seems that real(z), aimag(z), and conjg(z) always return a proper kind (based on experiments with gfortran), i.e., if z is double precision, those functions return double precision. Is this guaranteed? Also, does the behavior depend on the standard used by the compiler (i.e., Fortran 77 vs 90 or later, particularly for real(z))?
2) If I (nevertheless) want to use specific functions that receive only double precision arguments and always return double precision values, what are the specific functions?
-- I have been using dble(z) and dreal(z), dimag(z), dconjg(z) up to now, but some web pages say that they are vendor extensions (though commonly supported by many compilers).
I have read various pages, but the information is rather confusing (i.e., it is not very clear what the "standard" way is), so I would appreciate any advice on the choice of such functions.
As background, what do we mean by kinds of real and complex variables? Of course, you know what is meant by the kind of a real object.
A complex object consists of a real and an imaginary part. If a complex object has a given kind then each component is a real of kind corresponding to the kind of the complex object.
That's a long way of saying, if
complex(kind=k) z
then KIND(z%re) and KIND(z%im) both evaluate to k (using the complex part designators introduced by Fortran 2008 for clarity).
Now, the real intrinsic generic takes a complex expression and returns its real component. It does so subject to the following F2008 rule (13.7.138), where A is the argument:
If A is of type complex and KIND is not present, the kind type parameter is the kind type parameter of A.
So, yes: in current Fortran, real without a requested kind will always give you a real whose kind is that of the complex argument's real component, whether that's double precision or otherwise.
Similarly, aimag returns a real (corresponding to the imaginary part) whose kind is that of the complex argument. Unlike real, aimag doesn't accept a kind= argument controlling the result kind.
Things are different for Fortran 77: there was no similar concept of kind, and just one complex type.
dble is a standard intrinsic. Although it always returns a double precision value, it is still generic and will accept any numeric argument. dble(a) is the same as real(a, kind(0d0)), whatever the type of a. There is no (standard) specific.
dreal, dimag and dconjg are not standard intrinsics.
I suppose one could create specific wrappers around real if one cared greatly.

Is there a way to prevent developers from using std::min, std::max?

We have an algorithm library doing lots of std::min/std::max operations on numbers that could be NaN. Considering this post: Why does Release/Debug have a different result for std::min?, we realised it's clearly unsafe.
Is there a way to prevent developers from using std::min/std::max?
Our code is compiled both with VS2015 and g++. We have a common header file included by all our source files (through /FI option for VS2015 and -include for g++). Is there any piece of code/pragma that could be put here to make any cpp file using std::min or std::max fail to compile?
By the way, legacy code like the STL headers using these functions should not be impacted. Only the code we write should be impacted.
I don't think making standard library functions unavailable is the correct approach. First off, NaNs are a fundamental aspect of how floating-point values work. You'd need to disable all kinds of other things, e.g., sort(), lower_bound(), etc. Also, programmers are paid for being creative, and I doubt that any programmer reaching for std::max() would hesitate to use a < b ? b : a if std::max(a, b) doesn't work.
Also, you clearly don't want to disable std::max() or std::min() for types which don't have NaN, e.g., integers or strings. So, you'd need a somewhat controlled approach.
There is no portable way to disable any of the standard library algorithms in namespace std. You could hack it by providing suitable deleted overloads to locate uses of these algorithms, e.g.:
namespace std {
float max(float, float) = delete; // **NOT** portable
double max(double, double) = delete; // **NOT** portable
long double max(long double, long double) = delete; // **NOT** portable
// likewise and also not portable for min
}
I'm going to be a bit philosophical here, with less code. I think the best approach would be to educate those developers and explain why they shouldn't code in a specific way. If you can give them a good explanation, then not only will they stop using functions that you don't want them to use, they will also be able to spread the message to other developers in the team.
I believe that forcing them will just make them come up with workarounds.
As modifying std is disallowed, the following is UB, but it may work in your case.
Mark the functions as deprecated.
Since C++14, with the [[deprecated]] attribute:
namespace std
{
template <typename T>
[[deprecated("To avoid to use Nan")]] constexpr const T& (min(const T&, const T&));
template <typename T>
[[deprecated("To avoid to use Nan")]] constexpr const T& (max(const T&, const T&));
}
And before C++14, you can wrap the declarations in a compiler-specific deprecation macro:
#ifdef __GNUC__
# define DEPRECATED(func) func __attribute__ ((deprecated))
#elif defined(_MSC_VER)
# define DEPRECATED(func) __declspec(deprecated) func
#else
# pragma message("WARNING: You need to implement DEPRECATED for this compiler")
# define DEPRECATED(func) func
#endif
namespace std
{
template <typename T> constexpr const T& DEPRECATED(min(const T&, const T&));
template <typename T> constexpr const T& DEPRECATED(max(const T&, const T&));
}
There's no portable way of doing this since, aside from a couple of exceptions, you are not allowed to change anything in std.
However, one solution is to
#define max foo
before including any of your code. Then both std::max and max will issue compile-time failures.
But really, if I were you, I'd just get used to the behaviour of std::max and std::min on your platform. If they don't do what the standard says they ought to do, then submit a bug report to the compiler vendor.
If you get different results in debug and release, then the problem isn't getting different results. The problem is that one version, or probably both, are wrong. And that isn't fixed by disallowing std::min or std::max or replacing them with different functions that have defined results. You have to figure out which outcome you would actually want for each function call to get the correct result.
I'm not going to answer your question exactly, but instead of disallowing std::min and std::max altogether, you could educate your coworkers and make sure that you are consistently using a total order comparator instead of a raw operator< (implicitly used by many standard library algorithms) whenever you use a function that relies on a given order.
Such a comparator is proposed for standardization in P0100 (Comparison in C++), along with partial and weak order comparators, probably targeting C++20. Meanwhile, the C standard committee has been working for quite a while on TS 18661 (Floating-point extensions for C, part 1: Binary floating-point arithmetic), apparently targeting the future C2x (should be ~C23), which updates the <math.h> header with many new functions required to implement the recent ISO/IEC/IEEE 60559:2011 standard. Among the new functions there is totalorder (section 14.8), which compares floating point numbers according to the IEEE totalOrder predicate:
totalOrder(x, y) imposes a total ordering on canonical members of the format of x and y:
If x < y, totalOrder(x, y) is true.
If x > y, totalOrder(x, y) is false.
If x = y
totalOrder(-0, +0) is true.
totalOrder(+0, -0) is false.
If x and y represent the same floating point datum:
If x and y have negative sign, totalOrder(x, y) is true if and only if the exponent of x ≥ the exponent of y.
Otherwise totalOrder(x, y) is true if and only if the exponent of x ≤ the exponent of y.
If x and y are unordered numerically because x or y is NaN:
totalOrder(−NaN, y) is true where −NaN represents a NaN with negative sign bit and y is a floating-point number.
totalOrder(x, +NaN) is true where +NaN represents a NaN with positive sign bit and x is a floating-point number.
If x and y are both NaNs, then totalOrder reflects a total ordering based on:
negative sign orders below positive sign
signaling orders below quiet for +NaN, reverse for −NaN
lesser payload, when regarded as an integer, orders below greater payload for +NaN, reverse for −NaN.
That's quite a wall of text, so here is a list that helps to see what's greater than what (from greater to lesser):
positive quiet NaNs (ordered by payload regarded as integer)
positive signaling NaNs (ordered by payload regarded as integer)
positive infinity
positive reals
positive zero
negative zero
negative reals
negative infinity
negative signaling NaNs (ordered by payload regarded as integer)
negative quiet NaNs (ordered by payload regarded as integer)
Unfortunately, this total order currently lacks library support, but it is probably possible to hack together a custom total order comparator for floating point numbers and use it whenever you know there will be floating point numbers to compare. Once you get your hands on such a total order comparator, you can safely use it everywhere it is needed instead of simply disallowing std::min and std::max.
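As an illustration, here is a hedged sketch of such a comparator (total_less is a made-up name; it assumes IEEE 754 binary64 doubles and maps each bit pattern to an unsigned key whose ordering matches the list above):
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <limits>
#include <vector>
inline bool total_less(double x, double y)
{
    std::uint64_t bx, by;
    std::memcpy(&bx, &x, sizeof bx);  // read the raw bits without UB
    std::memcpy(&by, &y, sizeof by);
    const std::uint64_t sign = 0x8000000000000000ULL;
    // Negative values: flip all bits so larger magnitudes sort lower.
    // Positive values: set the sign bit so they sort above all negatives.
    const std::uint64_t kx = (bx & sign) ? ~bx : (bx | sign);
    const std::uint64_t ky = (by & sign) ? ~by : (by | sign);
    return kx < ky;
}
int main()
{
    std::vector<double> v{3.0, -1.0, std::numeric_limits<double>::quiet_NaN(), -0.0, +0.0};
    // total_less is a genuine total order, so sorting or taking a minimum
    // is well-defined even when NaNs are present.
    std::sort(v.begin(), v.end(), total_less);
}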
If you compile using GCC or Clang, you can poison these identifiers.
#pragma GCC poison min max atoi /* etc ... */
Using them will issue a compiler error:
error: attempt to use poisoned "min"
The only problem with this in C++ is that you can only poison "identifier tokens", not std::min and std::max specifically, so it actually also poisons all functions and local variables named min and max... maybe not quite what you want, but maybe not a problem if you choose Good Descriptive Variable Names™.
If a poisoned identifier appears as part of the expansion of a macro
which was defined before the identifier was poisoned, it will not
cause an error. This lets you poison an identifier without worrying
about system headers defining macros that use it.
For example,
#define strrchr rindex
#pragma GCC poison rindex
strrchr(some_string, 'h');
will not produce an error.
Read the link for more info, of course.
https://gcc.gnu.org/onlinedocs/gcc-3.3/cpp/Pragmas.html
You've deprecated std::min and std::max. You can find instances by doing a search with grep. Or you can fiddle about with the headers themselves to break std::min and std::max. Or you can try defining min/max or std::min/std::max to the preprocessor. The latter is a bit dodgy because of C++ namespaces: if you define std::max/min you don't pick up using namespace std; if you define min/max, you also pick up other uses of these identifiers.
Or, if the project has a standard header like "mylibrary.h" that everyone includes, break std::min/std::max in that.
The functions should return NaN when passed NaN, of course. But the natural way of writing them relies on a comparison that is always false for a NaN, so a NaN argument is silently dropped.
IMO the failure of the C++ language standard to require min(NaN, x) and min(x, NaN) to return NaN, and similarly for max, is a serious flaw, because it hides the fact that a NaN has been generated and results in surprising behaviour. Very few software developers do sufficient static analysis to ensure that NaNs can never be generated for all possible input values. So we declare our own templates for min and max, with specialisations for float and double to give correct behaviour with NaN arguments (see the sketch below). This works for us, but might not work for those who use larger parts of the STL than we do. Our field is high-integrity software, so we don't use much of the STL because dynamic memory allocation is usually banned after the startup phase.
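A sketch of that approach (my_min/my_max are illustrative names, not the poster's actual code): a generic template with the usual semantics, plus floating-point overloads that propagate NaN so it can never be silently discarded.
#include <cmath>
#include <limits>
template <typename T>
const T& my_min(const T& a, const T& b) { return b < a ? b : a; }
inline double my_min(double a, double b)
{
    if (std::isnan(a) || std::isnan(b))
        return std::numeric_limits<double>::quiet_NaN();  // propagate the NaN
    return b < a ? b : a;
}
inline double my_max(double a, double b)
{
    if (std::isnan(a) || std::isnan(b))
        return std::numeric_limits<double>::quiet_NaN();
    return a < b ? b : a;
}
// A float overload would follow the same pattern.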

CABS(x) function for complex(8)

Is there an absolute value function for a complex value in double precision? When I try CABS() I get
V(1,j) = R(j,j) + (R(j,j)/cabs(R(j,j)))*complexnorm2(R(j:m,j))
"Error: Type of argument 'a' in call to 'cabs' at (1) should be
COMPLEX(4), not COMPLEX(8)"
I have read there's a function called CDABS(), but I wasn't sure if that was the same thing?
There is no reason to use anything other than ABS(). Generics for intrinsic procedures were already present in FORTRAN 77. You can use them for all intrinsic numeric types.
If you want to see the table of available specific functions of the generic ABS(), see https://gcc.gnu.org/onlinedocs/gfortran/ABS.html , but they are mostly useful only for being passed as actual arguments. You can see that CDABS() is a non-standard extension and I do not recommend using it.
CABS is defined by the standard to take an argument of type default complex. In your implementation this looks like complex(kind=4). There is no standard function CDABS, although your implementation may perhaps offer one: read the appropriate documentation.
Further, there is no standard specific function for the generic function ABS which takes a double complex argument. Again, your implementation may offer one called something other than CDABS.
That said, the generic function ABS takes any integer, real, or complex argument. Use that.
COMPLEX*8 and complex(KIND=8) are not the same.
The first one is a 4-byte real part and a 4-byte imaginary part.
The complex(KIND=8), or COMPLEX(KIND=C_DOUBLE), is actually a double precision real part and a double precision imaginary part, so it is equivalent to COMPLEX*16.
As mentioned, ABS() should be fine.

Implicit casting Integer calculation to float in C++

Is there any compiler that has a directive or a parameter to cast integer calculations to float implicitly? For example:
float f = (1/3)*5;
cout << f;
the "f" is "0", because calculation's constants(1, 3, 10) are integer. I want to convert integer calculation with a compiler directive or parameter. I mean, I won't use explicit casting or ".f" prefix like that:
float f = ((float)1/3)*5;
or
float f = (1.0f/3.0f)*5.0f;
Do you know of any C/C++ compiler which has a parameter to do this without an explicit cast or the ".f" suffix?
Any compiler that did what you want would no longer be a conforming C++ compiler. The semantics of integer division are well specified (at least for positive numbers), and you're proposing to change that.
It would also be dangerous since it would wind up applying to everything, and you might at some point have code that relies on standard integer arithmetic, which would silently be invalid. (After all, if you had tests that would catch that, you presumably would have tests that would catch the undesired integer arithmetic.)
So, the only advice I've got is to write unit tests, have code reviews, and try to avoid magic numbers (instead defining them as const float).
If you don't like either of the two methods you mentioned, you're probably out of luck.
What are you hoping to accomplish with this? Any specialized operator that did "float-division" would have to convert ints to floats at some point after tokenization, which means you're not going to get any performance benefit on the execution.
In C++ it's a bit odd to see a bunch of numeric values sprinkled through the code. Generally it is considered best practice to move any 'magic numbers' like these to their own static const float value, which removes this problem.
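For instance, a small sketch of the named-constant suggestion:
#include <iostream>
// Naming the magic numbers and giving them a floating-point type once means
// the division is done in float everywhere the constants are used.
static const float kOneThird = 1.0f / 3.0f;
static const float kScale    = 5.0f;
int main()
{
    float f = kOneThird * kScale;  // no integer division anywhere
    std::cout << f << '\n';        // prints roughly 1.66667 instead of 0
}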
No, those two options are the best you have.