Suppose x and y are of type int.
Do the two expressions:
int(4*x/y)
and:
int(x/(y/4))
always evaluate to the same value for all x and y of type int? Mathematically they should, but only the second expression consistently produces the expected value in a program I've written.
In many programming languages, 4*x/y and x/(y/4) differ because y/4 is an integer division whose result is truncated before it is used as the divisor. No such intermediate truncation occurs in 4*x/y. One obvious difference is when y is 1, in which case the second expression divides by zero, whereas the first one computes 4*x.
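A minimal sketch of the difference in C++ (the helper names first_form and second_form are made up for illustration, and overflow of 4*x for large x is ignored):

```cpp
#include <cassert>

// 4*x/y: multiply first, then a single truncating division.
int first_form(int x, int y)
{
    assert(y != 0);        // division by zero is undefined
    return 4 * x / y;
}

// x/(y/4): y/4 is truncated before it is used as the divisor.
int second_form(int x, int y)
{
    assert(y / 4 != 0);    // |y| < 4 would make the divisor zero
    return x / (y / 4);
}
```

For example, with x = 10 and y = 6, first_form computes 40/6 = 6, while second_form computes 10/(6/4) = 10/1 = 10.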
I assume x and y are ints.
Then of course not :)
On non-negative integers, x/4 computes floor(x/4) mathematically,
which gives you:
floor(4*x/y)
and
floor(x/floor(y/4))
In C++, the conversion of an integer value i of type I to a floating-point type F will be exact, meaning static_cast<I>(static_cast<F>(i)) == i, if the range of I is a part of the range of integral values of F.
Is it possible, and if yes how, to calculate the loss of precision of static_cast<F>(i) (without using another floating point type with a wider range)?
As a start, I tried to code a function that returns whether a conversion is safe (safe meaning no loss of precision), but I must admit I am not so sure about its correctness.
template <class F, class I>
bool is_cast_safe(I value)
{
    return std::abs(value) < std::numeric_limits<F>::digits;
}
std::cout << is_cast_safe<float>(4) << std::endl; // true
std::cout << is_cast_safe<float>(0x1000001) << std::endl; // false
Thanks in advance.
is_cast_safe can be implemented with:
static const F One = 1;
F ULP = std::scalbn(One, std::ilogb(value) - std::numeric_limits<F>::digits + 1);
I U = std::max(ULP, One);
return value % U == 0;
This sets ULP to the value of the least digit position in the result of converting value to F. ilogb returns the position (as an exponent of the floating-point radix) for the highest digit position, and subtracting one less than the number of digits adjusts to the lowest digit position. Then scalbn gives us the value of that position, which is the ULP.
Then value can be represented exactly in F if and only if it is a multiple of the ULP. To test that, we convert the ULP to I (but substitute 1 if it is less than 1), and then take the remainder of value divided by the ULP (or 1); the conversion is exact when that remainder is zero.
Also, if one is concerned the conversion to F might overflow, code can be inserted to handle this as well.
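Put together as a complete function, the idea might look like the sketch below. It assumes the conversion to F does not overflow, and the early return for zero is an addition of this sketch (ilogb has no meaningful exponent to report for 0).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Sketch: true if `value` is exactly representable in F.
// Assumes converting value to F does not overflow.
template <class F, class I>
bool is_cast_safe(I value)
{
    if (value == 0)
        return true;   // added guard: ilogb(0) is a special case, and 0 is always exact

    const F one = 1;
    const F converted = static_cast<F>(value);
    // Value of the lowest digit position of the converted result (its ULP).
    const F ulp = std::scalbn(one, std::ilogb(converted) - std::numeric_limits<F>::digits + 1);
    // An ULP below 1 means every integer of this magnitude is representable.
    const I u = static_cast<I>(std::max(ulp, one));
    assert(u > 0);
    return value % u == 0;   // exact iff value is a multiple of the ULP
}
```

With a 24-digit float, is_cast_safe<float>(4) is true and is_cast_safe<float>(0x1000001) is false, matching the examples in the question.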
Calculating the actual amount of the change is trickier. The conversion to floating-point could round up or down, and the rule for choosing is implementation-defined, although round-to-nearest-ties-to-even is common. So the actual change cannot be calculated from the floating-point properties we are given in numeric_limits. It must involve performing the conversion and doing some work in floating-point. This definitely can be done, but it is a nuisance. I think an approach that should work is:
Assume value is non-negative. (Negative values can be handled similarly but are omitted for now for simplicity.)
First, test for overflow in conversion to F. This in itself is tricky, as the behavior is undefined if the value is too large. Some similar considerations were addressed in this answer to a question about safely converting from floating-point to integer (in C).
If the value does not overflow, then convert it. Let the result be x. Divide x by the floating-point radix r, producing y. If y is not an integer (which can be tested using fmod or trunc) the conversion was exact.
Otherwise, convert y to I, producing z. This is safe because y is less than the original value, so it must fit in I.
Then the error due to conversion, x - value, is (z - value/r)*r - value%r.
I loss = abs(static_cast<I>(static_cast<F>(i))-i) should do the job. The only exception is when i's magnitude is so large that static_cast<F>(i) rounds to an F value outside the range of I.
(I assume here that an abs overload taking and returning I is available.)
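A rough sketch of this round-trip idea for the specific case I = int and F = float. The names cast_error and cast_loss are hypothetical, and going back through long long is a choice made here to sidestep the out-of-range case, assuming long long is wider than int.

```cpp
#include <cassert>
#include <limits>

// Signed difference between the float nearest to i and i itself.
long long cast_error(int i)
{
    static_assert(std::numeric_limits<long long>::digits > std::numeric_limits<int>::digits,
                  "this sketch needs long long to be wider than int");
    const float f = static_cast<float>(i);
    // A float just above INT_MAX (such as 2^31) still fits in long long.
    const long long back = static_cast<long long>(f);
    assert(static_cast<float>(back) == f);   // f holds an integer, so this step loses nothing
    return back - static_cast<long long>(i);
}

// Magnitude of the loss.
long long cast_loss(int i)
{
    const long long error = cast_error(i);
    return error < 0 ? -error : error;
}
```

The sign of cast_error depends on the rounding mode in effect, so only the magnitude is portable: cast_loss(4) is 0, and cast_loss(0x1000001) is 1 with a 24-digit float.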
I'm a complete newbie here. For my school homework, I was asked to write a program that displays -
s= 1 + 1/2 + 1/3 + 1/4 ..... + 1/n
Here's what I did -
#include<iostream.h>
#include<conio.h>
void main()
{
clrscr();
int a;
float s=0, n;
cin>>a;
for(n=1;n<=a;n++)
{
s+=1/n;
}
cout<<s;
getch();
}
It displays exactly what it should. However, in the past I have only written programs which use the int data type. To my understanding, int does not hold any decimal places whereas float does, so I don't know much about float yet. Later that night, I was watching a video on YouTube in which the presenter wrote the exact same program in a slightly different way. The video was in a foreign language, so I couldn't understand it. What he did was declare 'n' as an integer.
int a, n;
float s=0;
instead of
int a;
float s=0, n;
But this was not displaying the desired result. So he went ahead and showed two ways to correct it. He made changes in the for loop body -
s+=1.0f/n;
and
s+=1/(float)n;
To my understanding, he declared 'n' as a float later in the program (am I right?). So, my question is: both display the same result, but is there any difference between the two? As we are declaring 'n' a float, why has he written 1.0f instead of n.f or f.n? I tried it but it gives an error. And in the second method, why can't we write 1(float)/n instead of 1/(float)n, given that in the first method we added the float suffix to 1? Also, is there a difference between 1.f and 1.0f?
I tried to google my question but couldn't find an answer. Another confusion that came to mind after a few hours is: why are we even declaring 'n' a float? As per the program, the sum should come out as a real number, so shouldn't we declare only 's' a float? The more I think, the more I confuse myself. Please help!
Thank You.
The reason is that integer division behaves differently from floating-point division.
4 / 3 gives you the integer 1. 10 / 3 gives you the integer 3.
However, 4.0f / 3 gives you the float 1.3333..., 10.0f / 3 gives you the float 3.3333...
So if you have:
float f = 4 / 3;
4 / 3 will give you the integer 1, which will then be stored into the float f as 1.0f.
You instead have to make sure either the divisor or the dividend is a float:
float f = 4.0f / 3;
float f = 4 / 3.0f;
If you have two integer variables, then you have to convert one of them to a float first:
int a = ..., b = ...;
float f = (float)a / b;
float f = a / (float)b;
The first is equivalent to something like:
float tmp = a;
float f = tmp / b;
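As a small self-contained illustration in C++ (the function names int_quotient and float_quotient are made up for this sketch):

```cpp
#include <cassert>

// Both operands are int, so the quotient is truncated before it becomes a float.
float int_quotient(int a, int b)
{
    assert(b != 0);
    return a / b;
}

// Converting one operand first makes the division itself floating-point.
float float_quotient(int a, int b)
{
    assert(b != 0);
    return static_cast<float>(a) / b;
}
```

Here int_quotient(4, 3) returns 1.0f, while float_quotient(4, 3) returns roughly 1.3333.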
Since n will only ever have an integer value, it makes sense to define it as an int. However, doing so means that this won't work as you might expect:
s+=1/n;
In the division operation both operands are integer types, so it performs integer division which means it takes the integer part of the result and throws away any fractional component. So 1/2 would evaluate to 0 because dividing 1 by 2 results in 0.5, and throwing away the fraction results in 0.
This is in contrast to floating-point division, which keeps the fractional component. C will perform floating-point division if either operand is a floating-point type.
In the case of the above expression, we can force floating point division by performing a typecast on either operand:
s += (float)1/n
Or:
s += 1/(float)n
You can also specify the constant 1 as a floating point constant by giving a decimal component:
s += 1.0/n
Or appending the f suffix:
s += 1.0f/n
The f suffix (as well as the U, L, and LL suffixes) can only be applied to numerical constants, not variables.
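To see the effect on the whole loop, here is a sketch of the sum with an int counter, written as two hypothetical functions (harmonic and harmonic_truncated are names invented for this example):

```cpp
#include <cassert>

// Sum 1 + 1/2 + ... + 1/a with an int loop counter and a float literal.
float harmonic(int a)
{
    assert(a >= 0);
    float s = 0;
    for (int n = 1; n <= a; n++)
        s += 1.0f / n;   // float / int: n is converted, the result is a float
    return s;
}

// Same loop with 1/n: integer division gives 1 for n == 1 and 0 afterwards.
float harmonic_truncated(int a)
{
    assert(a >= 0);
    float s = 0;
    for (int n = 1; n <= a; n++)
        s += 1 / n;
    return s;
}
```

For a = 4, harmonic returns about 2.0833, while harmonic_truncated returns 1 because every term after the first is 0.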
What he is doing is called casting. I'm sure your school will cover it in later lectures. Basically, n is declared as an integer for the entire program. But since int and float are similar (both are numbers), C and C++ let you convert a value of one to the other as long as you tell the compiler which type you want. You do this by putting the data type in parentheses before the value, i.e.
(float) n
he declared 'n' a float data type later in the program(Am I right?)
No, he defined (and thereby also declared) n as an int, and later he explicitly converted (cast) it to a float. The two are very different.
both display the same result but is there any difference between the two?
Nope. They're the same in this context. When an arithmetic operator has one int and one float operand, the int is implicitly converted to float, and the result is also a float. He's just shown you two ways to do it. When both operands are integers, you get an integer result, which will be wrong whenever proper mathematical division would give a non-integer quotient. To avoid this, one of the operands is usually made a floating-point number so that the actual result is closer to the expected one.
why he has written 1.0f instead of n.f or f.n. I tried it but it gives error. [...] Also, is there a difference between 1.f and 1.0f?
This is because the language syntax is defined that way. In a floating-point literal, the f suffix makes the constant a float. So 5 is an int while 5.0f or 5.f is a float; omitting the trailing 0 makes no difference. However, n.f is a syntax error since n is an identifier (a variable name) and not a numeric literal.
And in the second method, why we can't write 1(float)/n instead of 1/(float)n?
(float)n is a valid C-style cast of the int variable n, while 1(float) is simply a syntax error.
s+=1.0f/n;
and
s+=1/(float)n;
... So, my question is, both display the same result but is there any difference between the two?
Yes.
In both C and C++, when a calculation involves operands of different types, one or more of them will be "promoted" (converted) to a common type, generally the one with greater precision or range. So if you have an expression with int and unsigned int operands, the int operand will be converted to unsigned int. If you have an expression with float and double operands, the float operand will be promoted to double.
Remember that division with two integer operands gives an integer result - 1/2 yields 0, not 0.5. To get a floating point result, at least one of the operands must have a floating point type.
In the case of 1.0f/n, the expression 1.0f has type float [1], so the n will be "promoted" from type int to type float.
In the case of 1/(float) n, the expression n is being explicitly cast to type float, so the expression 1 is promoted from type int to float.
Nitpicks:
Unless your compiler documentation explicitly lists void main() as a legal signature for the main function, use int main() instead. From the online C++ standard:
3.6.1 Main function
...
2 An implementation shall not predefine the main function. This function shall not be overloaded. It shall have a declared return type of type int, but otherwise its type is implementation-defined...
Secondly, please format your code - it makes it easier for others to read and debug. Whitespace and indentation are your friends - use them.
1. The constant expression 1.0 with no suffix has type double. The f suffix tells the compiler to treat it as float. 1.0/n would result in a value of type double.
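These result types can be checked at compile time. The following C++ sketch (the function name division_result_types is invented here) uses static_assert with decltype, which inspects the type of each expression without evaluating it:

```cpp
#include <cassert>
#include <type_traits>

// Compile-time checks of the type each division produces when n is an int.
bool division_result_types(int n)
{
    assert(n != 0);
    static_assert(std::is_same<decltype(1 / n), int>::value, "int / int is int");
    static_assert(std::is_same<decltype(1.0f / n), float>::value, "float / int is float");
    static_assert(std::is_same<decltype(1 / (float)n), float>::value, "int / float is float");
    static_assert(std::is_same<decltype(1.0 / n), double>::value, "double / int is double");
    // Both spellings end up dividing two floats, so the values agree as well.
    return 1.0f / n == 1 / (float)n;
}
```

If any of the four claims were wrong, the function would fail to compile rather than return false.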
I am aware of the inherent imprecision of floats. What I'm confused about is why I would get "100" from something I would expect to come out as 99.999999 or so. How could 33.33...*3 ever yield 100?
Here is my code: if I say
float x = 100.0/3.0;
printf("%f",x*3.0);
The output is "99.999996". Yet if I say
printf("%f",(100.0/3)*3);
The output is "100.000000". Shouldn't they be identical? I would expect x to hold (100.0/3.0), exactly what's written there in plain text, yet the two yield different results.
The problem is that your second expression is not equivalent to the first one: it uses doubles throughout, while the first one has a conversion to float after the division, forcing the intermediate result to lower precision.
To build a fully equivalent expression, add a cast to float after division, like this:
printf("%f", ((float)(100.0/3.0))*3.0);
//             ^^^^^
This produces the same output as your first example, i.e. "99.999996".
If you use double for x in your first example, you get the output 100.000000, too:
double x = 100.0/3.0;
printf("%f",x*3.0);
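The two computations can be put side by side as in the sketch below. It assumes IEEE 754 float and double with the default round-to-nearest mode, and the function names via_float and via_double are made up for illustration.

```cpp
#include <cassert>

// The quotient is rounded to float when stored in x, then widened again
// for the multiplication by the double literal 3.0.
double via_float()
{
    float x = 100.0 / 3.0;
    assert(static_cast<double>(x) != 100.0 / 3.0);   // storing in float dropped bits
    return x * 3.0;
}

// Everything stays in double.
double via_double()
{
    return (100.0 / 3) * 3;
}
```

Under those assumptions via_float() comes out slightly below 100 and via_double() comes out as exactly 100, which is what the two printf calls in the question show.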
What if I have something like this:
int a = 20;
int min = INT_MIN;
if(-a - min)
//do something
Assume that the magnitude of INT_MIN is greater than INT_MAX. Would min ever be converted by the compiler to something like -min, as in -INT_MIN, which could be undefined?
You are right that unary minus applied to INT_MIN can be undefined, but this does not happen in your example.
-a - min is parsed as (-a) - min. Variable min is only involved in binary subtraction, and the first operand only needs to be strictly negative for the result to be defined.
If the compiler transforms the subtraction to something else, it is its responsibility to ensure that the new version always computes the same thing as the old version.
The result of x - y is defined as the mathematical result of subtracting y from x. If the mathematical result can be represented in the result type (int in this case), then there is no overflow.
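One way to express that rule in code is the following sketch, which tests whether x - y is representable before subtracting. The helpers sub_is_defined and checked_sub are hypothetical names, not anything from the standard library.

```cpp
#include <cassert>
#include <climits>

// True if the mathematical value of x - y fits in an int.
bool sub_is_defined(int x, int y)
{
    if (y > 0)
        return x >= INT_MIN + y;   // result must not fall below INT_MIN
    return x <= INT_MAX + y;       // y <= 0: result must not exceed INT_MAX
}

// Subtract only when the result is representable.
int checked_sub(int x, int y)
{
    assert(sub_is_defined(x, y));
    return x - y;
}
```

With y == INT_MIN the check reduces to x <= INT_MAX + INT_MIN, so any strictly negative x such as -20 is fine.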
A compiler is free to transform the expression in any way it likes, such as by changing
x - y
to
x + (-y)
but only if the transformation keeps the same behavior in cases where the original behavior is well defined. In the case of y == INT_MIN, it can still perform the transformation as long as the undefined behavior of evaluating -INT_MIN yields the same end result (which it typically will).
To answer the question in the title:
Is INT_MIN subtracted from any integer considered undefined behavior?
INT_MIN - INT_MIN == 0, and cannot overflow.
Incidentally, I think you mean int rather than "integer". int is just one of several integer types.
I'd like to ask what happens if we pass a fractional number when indexing an array in C or C++. An example of what I mean:
int arr1[5], arr2[5];   // sizes chosen just for this example
for (int i = 0; i < 5; ++i)
{
    if (i % 2 == 0)
        arr1[i] = i;
    else
        arr2[i/2] = i;
}
What would the compiler do when it sees arr2[3/2]?
i/2 is integer division. The result of this division is again an integer, namely the quotient truncated towards 0 (3/2 == 1; -5/2 == -2). As a side note, the division and truncation are a single operation, integer division; there is no separate rounding step. So you will not be passing a fraction as an array index.
If you try to pass a data type which can be a fraction (for example a double), the compiler will generate an error.
The division would happen first, and the answer would then be used as the array index. So, in your example, 3/2 would resolve to 1 (truncation), and then it would assign arr2[1]=i.
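A runnable sketch of the loop from the question, wrapped in a hypothetical function split_by_parity. The array size of 5 and the use of std::array are assumptions of this example; the question does not give sizes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

struct Split
{
    std::array<int, 5> arr1{};   // size 5 is an assumption of this sketch
    std::array<int, 5> arr2{};
};

Split split_by_parity()
{
    Split s;
    for (int i = 0; i < 5; ++i)
    {
        if (i % 2 == 0)
            s.arr1[i] = i;
        else
        {
            const int index = i / 2;   // integer division: 1/2 == 0, 3/2 == 1
            assert(static_cast<std::size_t>(index) < s.arr2.size());
            s.arr2[index] = i;
        }
    }
    return s;
}
```

After the call, arr2[0] holds 1 and arr2[1] holds 3, so the index computed from 3/2 really is 1.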
3/2 yields an integer result equal to 1. There is no 'fraction' in such line, ever.
arr2[3/2] ==== arr2[1]
An array index must have an integer type. If you use a float expression as the index, the code will not compile; you would have to cast it to an integer yourself.
integer1 / integer2 yields another integer in c/c++.
What would the compiler do when it sees arr2[3/2]?
The compiler would do nothing. The expression "3/2" is valid and will result in an integer at runtime.