Avoid casting floating point constant - casting

I'm creating cargo that (among other things) will implement idiomatic angle measurment. When creating methods to convert between angle units I've found problem:
impl<T> Angle<T>
where T: Float {
pub fn to_deg(self) -> Self {
Deg(match self {
Rad(v) => v * cast(180.0 / f64::consts::PI).unwrap(),
Deg(v) => v,
Grad(v) => v * cast(180.0 / 200.0).unwrap() // how to get rid of this cast?
})
}
}
Runnable
The cast of 180.0 / 200.0 seem really unneded for me? Is there any way to get rid of this?
When I delete cast then I get:
src/angles.rs:42:28: 42:33 error: mismatched types:
expected `T`,
found `_`
(expected type parameter,
found floating-point variable) [E0308]
src/angles.rs:42 Grad(v) => v * 180.0 / 200.0
^~~~~

When you have a generic function with a type parameter, such as T, you don't get to choose the type. The type is forced on you by the caller of the function.
The error here is that you're trying to assign a specific f32/f64 type to a type T, which could be anything that implements Float.
You know in practice it's going to be either one of the floating-point types, but theoretically the type system won't stop someone from implementing Float on a string or an array, or a tuple of two function pointers, or any other bizarre thing that can't be assigned a float. When the compiler can't guarantee it'll always work, including in theory in the future, then it won't accept it.
If you want to assign a float value to T, you have to declare that this operation is possible, e.g. by adding f32: Into<T>, and using 180f32.into().

Related

Is it necessary to cast literal constants in templated functions?

I am writing a code that has to compile in single and double precision. The original version was only in double precision, but I am trying to enable single precision now by using templates.
My question is: is it necessary to cast the 1. and the 7.8 to the specified type with static_cast<TF>(1.) for instance, or will the compiler take care of it? I find the casting not remarkably pretty and prefer to stay away from it. (I have other functions that are much longer and that contain many more literal constants).
template<typename TF>
inline TF phih_stable(const TF zeta)
{
// Hogstrom, 1988
return 1. + 7.8*zeta;
}
Casting and implicit conversions are two things. For this example, you could treat the template function as if it were two overloaded functions, but with the same code within. At the interface level (parameters, return value) the compiler will generate implicit conversions.
Now, the question you will have to ask yourself is this: Do those implicit conversions do what I want? If they do, just leave it as it is. If they don't, you could try to add explicit conversions (maybe use the function-style casting like TF(1.)) or, you could specialize this function for double and float.
Another option, less general but maybe it works here, is that you switch the code around, i.e. that you write the code for single-precision float and then let the compiler apply its implicit conversions. Since the conversions usually only go to the bigger type, it should fit for both double and float without incurring any overhead for float.
When you do:
return 1. + 7.8*zeta;
The literals 1. and 7.8 are double, so when if zeta is a float, it will first be converted to double, then the whole computation will be done in double-precision, and the result will be cast back to float, this is equivalent to:
return (float)(1. + 7.8 * (double)zeta);
Otherwise said, this is equivalent to calling phih_stable(double) and storing the result in a float, so your template would be useless for float.
If you want the computation to be made in single precision, you need the casts1:
return TF(1.) + TF(7.8) * zeta;
What about using 1.f and 7.8f? The problem is that (double)7.8f != 7.8 due to floating point precision. The difference is around 1e-7, the actual stored value for 7.8f (assuming 32-bits float) is:
7.80000019073486328125
While the actual stored value for 7.8 (assuming 64-bits double) is:
7.79999999999999982236431605997
So you have to ask yourself if you accept this loss of precision.
You can compare the two following implementations:
template <class T>
constexpr T phih_stable_cast(T t) {
return T(1l) + T(7.8l) * t;
}
template <class T>
constexpr T phih_stable_float(T t) {
return 1.f + 7.8f * t;
}
And the following assertions:
static_assert(phih_stable_cast(3.4f) == 1. + 7.8f * 3.4f, "");
static_assert(phih_stable_cast(3.4) == 1. + 7.8 * 3.4, "");
static_assert(phih_stable_cast(3.4l) == 1. + 7.8l * 3.4l, "");
static_assert(phih_stable_float(3.4f) == 1.f + 7.8f * 3.4f, "");
static_assert(phih_stable_float(3.4) == 1. + 7.8 * 3.4, "");
static_assert(phih_stable_float(3.4l) == 1.l + 7.8l * 3.4l, "");
The two last assertions fail due to the loss of precision when doing the computation.
1 You should even downcast from long double to not lose precision when using your function with long double: return TF(1.l) + TF(7.8l) * zeta;.

Confusion about float data type declaration in C++

a complete newbie here. For my school homework, I was given to write a program that displays -
s= 1 + 1/2 + 1/3 + 1/4 ..... + 1/n
Here's what I did -
#include<iostream.h>
#include<conio.h>
void main()
{
clrscr();
int a;
float s=0, n;
cin>>a;
for(n=1;n<=a;n++)
{
s+=1/n;
}
cout<<s;
getch();
}
It perfectly displays what it should. However, in the past I have only written programs which uses int data type. To my understanding, int data type does not contain any decimal place whereas float does. So I don't know much about float yet. Later that night, I was watching some video on YouTube in which he was writing the exact same program but in a little different way. The video was in some foreign language so I couldn't understand it. What he did was declared 'n' as an integer.
int a, n;
float s=0;
instead of
int a
float s=0, n;
But this was not displaying the desired result. So he went ahead and showed two ways to correct it. He made changes in the for loop body -
s+=1.0f/n;
and
s+=1/(float)n;
To my understanding, he declared 'n' a float data type later in the program(Am I right?). So, my question is, both display the same result but is there any difference between the two? As we are declaring 'n' a float, why he has written 1.0f instead of n.f or f.n. I tried it but it gives error. And in the second method, why we can't write 1(float)/n instead of 1/(float)n? As in the first method we have added float suffix with 1. Also, is there a difference between 1.f and 1.0f?
I tried to google my question but couldn't find any answer. Also, another confusion that came to my mind after a few hours is - Why are we even declaring 'n' a float? As per the program, the sum should come out as a real number. So, shouldn't we declare only 's' a float. The more I think the more I confuse my brain. Please help!
Thank You.
The reason is that integer division behaves different than floating point division.
4 / 3 gives you the integer 1. 10 / 3 gives you the integer 3.
However, 4.0f / 3 gives you the float 1.3333..., 10.0f / 3 gives you the float 3.3333...
So if you have:
float f = 4 / 3;
4 / 3 will give you the integer 1, which will then be stored into the float f as 1.0f.
You instead have to make sure either the divisor or the dividend is a float:
float f = 4.0f / 3;
float f = 4 / 3.0f;
If you have two integer variables, then you have to convert one of them to a float first:
int a = ..., b = ...;
float f = (float)a / b;
float f = a / (float)b;
The first is equivalent to something like:
float tmp = a;
float f = tmp / b;
Since n will only ever have an integer value, it makes sense to define it as as int. However doing so means that this won't work as you might expect:
s+=1/n;
In the division operation both operands are integer types, so it performs integer division which means it takes the integer part of the result and throws away any fractional component. So 1/2 would evaluate to 0 because dividing 1 by 2 results in 0.5, and throwing away the fraction results in 0.
This in contrast to floating point division which keeps the fractional component. C will perform floating point division if either operand is a floating point type.
In the case of the above expression, we can force floating point division by performing a typecast on either operand:
s += (float)1/n
Or:
s += 1/(float)n
You can also specify the constant 1 as a floating point constant by giving a decimal component:
s += 1.0/n
Or appending the f suffix:
s += 1.0f/n
The f suffix (as well as the U, L, and LL suffixes) can only be applied to numerical constants, not variables.
What he is doing is something called casting. I'm sure your school will mention it in new lectures. Basically n is set as an integer for the entire program. But since integer and double are similar (both are numbers), the c/c++ language allows you to use them as either as long as you tell the compiler what you want to use it as. You do this by adding parenthesis and the data type ie
(float) n
he declared 'n' a float data type later in the program(Am I right?)
No, he defined (thereby also declared) n an int and later he explicitly converted (casted) it into a float. Both are very different.
both display the same result but is there any difference between the two?
Nope. They're the same in this context. When an arithmetic operator has int and float operands, the former is implicitly converted into the latter and thereby the result will also be a float. He's just shown you two ways to do it. When both the operands are integers, you'd get an integer value as a result which may be incorrect, when proper mathematical division would give you a non-integer quotient. To avoid this, usually one of the operands are made into a floating-point number so that the actual result is closer to the expected result.
why he has written 1.0f instead of n.f or f.n. I tried it but it gives error. [...] Also, is there a difference between 1.f and 1.0f?
This is because the language syntax is defined thus. When you're declaring a floating-point literal, the suffix is to use .f. So 5 would be an int while 5.0f or 5.f is a float; there's no difference when you omit any trailing 0s. However, n.f is syntax error since n is a identifier (variable) name and not a constant number literal.
And in the second method, why we can't write 1(float)/n instead of 1/(float)n?
(float)n is a valid, C-style casting of the int variable n, while 1(float) is just syntax error.
s+=1.0f/n;
and
s+=1/(float)n;
... So, my question is, both display the same result but is there any difference between the two?
Yes.
In both C and C++, when a calculation involves expressions of different types, one or more of those expressions will be "promoted" to the type with greater precision or range. So if you have an expression with signed and unsigned operands, the signed operand will be "promoted" to unsigned. If you have an expression with float and double operands, the float operand will be promoted to double.
Remember that division with two integer operands gives an integer result - 1/2 yields 0, not 0.5. To get a floating point result, at least one of the operands must have a floating point type.
In the case of 1.0f/n, the expression 1.0f has type float1, so the n will be "promoted" from type int to type float.
In the case of 1/(float) n, the expression n is being explicitly cast to type float, so the expression 1 is promoted from type int to float.
Nitpicks:
Unless your compiler documentation explicitly lists void main() as a legal signature for the main function, use int main() instead. From the online C++ standard:
3.6.1 Main function
...
2 An implementation shall not predefine the main function. This function shall not be overloaded. It shall have a declared return type of type int, but otherwise its type is implementation-defined...
Secondly, please format your code - it makes it easier for others to read and debug. Whitespace and indentation are your friends - use them.
1. The constant expression 1.0 with no suffix has type double. The f suffix tells the compiler to treat it as float. 1.0/n would result in a value of type double.

How do I convert between numeric types safely and idiomatically?

Editor's note: This question is from a version of Rust prior to 1.0 and references some items that are not present in Rust 1.0. The answers still contain valuable information.
What's the idiomatic way to convert from (say) a usize to a u32?
For example, casting using 4294967295us as u32 works and the Rust 0.12 reference docs on type casting say
A numeric value can be cast to any numeric type. A raw pointer value can be cast to or from any integral type or raw pointer type. Any other cast is unsupported and will fail to compile.
but 4294967296us as u32 will silently overflow and give a result of 0.
I found ToPrimitive and FromPrimitive which provide nice functions like to_u32() -> Option<u32>, but they're marked as unstable:
#[unstable(feature = "core", reason = "trait is likely to be removed")]
What's the idiomatic (and safe) way to convert between numeric (and pointer) types?
The platform-dependent size of isize / usize is one reason why I'm asking this question - the original scenario was I wanted to convert from u32 to usize so I could represent a tree in a Vec<u32> (e.g. let t = Vec![0u32, 0u32, 1u32], then to get the grand-parent of node 2 would be t[t[2us] as usize]), and I wondered how it would fail if usize was less than 32 bits.
Converting values
From a type that fits completely within another
There's no problem here. Use the From trait to be explicit that there's no loss occurring:
fn example(v: i8) -> i32 {
i32::from(v) // or v.into()
}
You could choose to use as, but it's recommended to avoid it when you don't need it (see below):
fn example(v: i8) -> i32 {
v as i32
}
From a type that doesn't fit completely in another
There isn't a single method that makes general sense - you are asking how to fit two things in a space meant for one. One good initial attempt is to use an Option — Some when the value fits and None otherwise. You can then fail your program or substitute a default value, depending on your needs.
Since Rust 1.34, you can use TryFrom:
use std::convert::TryFrom;
fn example(v: i32) -> Option<i8> {
i8::try_from(v).ok()
}
Before that, you'd have to write similar code yourself:
fn example(v: i32) -> Option<i8> {
if v > std::i8::MAX as i32 {
None
} else {
Some(v as i8)
}
}
From a type that may or may not fit completely within another
The range of numbers isize / usize can represent changes based on the platform you are compiling for. You'll need to use TryFrom regardless of your current platform.
See also:
How do I convert a usize to a u32 using TryFrom?
Why is type conversion from u64 to usize allowed using `as` but not `From`?
What as does
but 4294967296us as u32 will silently overflow and give a result of 0
When converting to a smaller type, as just takes the lower bits of the number, disregarding the upper bits, including the sign:
fn main() {
let a: u16 = 0x1234;
let b: u8 = a as u8;
println!("0x{:04x}, 0x{:02x}", a, b); // 0x1234, 0x34
let a: i16 = -257;
let b: u8 = a as u8;
println!("0x{:02x}, 0x{:02x}", a, b); // 0xfeff, 0xff
}
See also:
What is the difference between From::from and as in Rust?
About ToPrimitive / FromPrimitive
RFC 369, Num Reform, states:
Ideally [...] ToPrimitive [...] would all be removed in favor of a more principled way of working with C-like enums
In the meantime, these traits live on in the num crate:
ToPrimitive
FromPrimitive

I get datatype, but what is the point of defining types in sml?

It seems like defining types in SML isnt that helpful:
type point = int * int
val origin : point = (0, 0)
But I could easily just use int * int for typing methods, no? Compared with datatype, with which it seems you can do more interesting things like:
datatype Point = PlanePoint of (int * int) | SpacePoint of (int * int * int)
val origin : Point = SpacePoint(0, 0, 0)
Out of curiosity, what are situations where you really just gotta have a type defined?
The reason is mostly type safety. I will try to explain with a simple example.
Say you have a module that uses 2 types that are represented with real * real
For example, a 2d point like in your example, and a line represented by a slope and a y intercept. Now if you're writing a function like lies_on_line which takes a a point and a line and returns a boolean whether the point lies on the line you have 2 choices for a signature:
val lies_on_line : (int * int) * (int * int) -> bool
Ord
val lies_on_line : point * line -> bool
It's obvious that the 2nd example makes it harder to make mistakes.
Also, while it's more of a benefit for modules, naming a type allows you to change its representation without changing code that uses the type (indirectly through the module).
It makes sense to define aliases for your types in the context of your problem domain. That way you can think in your design in terms of more relevant and meaningful types.
For instance if you are writing a word processor program then you have types like:
type Word = string
type Sentence = Word list
which may make more sense than string and string list.

Should I always use the appropriate literals for number types?

I'm often using the wrong literals in expressions, e.g. dividing a float by an int, like this:
float f = read_f();
float g = f / 2;
I believe that the compiler will in this case first convert the int literal (2) to float, and then apply the division operator. GCC and Clang have always let stuff like that pass, but Visual C++ warns about an implicit conversion. So I have to write it like this:
float f = read_f();
float g = f / 2.0f;
That got me wondering: Should I always use the appropriate literals for float, double, long etc.? I normally use int literals whenever I can get away with it, but I'm not sure if that's actually a good idea.
Is this a likely cause of subtle errors?
Is this only an issue for expressions or also for function parameters?
Are there warning levels for GCC or Clang that warn about such implicit conversions?
How about unsigned int, long int etc?
You should always explicitly indicate the type of literal that you intend to use. This will prevent problems when for example this sort of code:
float foo = 9.0f;
float bar = foo / 2;
changes to the following, truncating the result:
int foo = 9;
float bar = foo / 2;
It's a concern with function parameters as well when you have overloading and templates involved.
I know gcc has -Wconversion but I can't recall everything that it covers.
For integer values that fit in int I usually don't qualify those for long or unsigned as there is usually much less chance there for subtle bugs.
There's pretty much never an absolutely correct answer to a "should" question. Who's going to use this code, and for what? That's relevant here. But also, particularly for anything to do with floats, it's good to get into the habit of specifying exactly the operations you require. float*float is done in single-precision. anything with a double is done double-precision, 2 gets converted to a double so you're specifying different operations here.
The best answer here is What Every Computer Scientist Should Know About Floating-Point Arithmetic. I'd say don't tl;dr it, there are no simple answers with floating point.