Confusion about float data type declaration in C++ - c++

I'm a complete newbie here. For my school homework, I was asked to write a program that displays:
s= 1 + 1/2 + 1/3 + 1/4 ..... + 1/n
Here's what I did -
#include<iostream.h>
#include<conio.h>
void main()
{
clrscr();
int a;
float s=0, n;
cin>>a;
for(n=1;n<=a;n++)
{
s+=1/n;
}
cout<<s;
getch();
}
It displays exactly what it should. However, in the past I have only written programs that use the int data type. To my understanding, int does not hold any decimal places whereas float does, so I don't know much about float yet. Later that night, I was watching a video on YouTube in which the presenter was writing the exact same program in a slightly different way. The video was in a foreign language, so I couldn't understand it. What he did was declare 'n' as an integer:
int a, n;
float s=0;
instead of
int a;
float s=0, n;
But this was not displaying the desired result. So he went ahead and showed two ways to correct it. He made changes in the for loop body -
s+=1.0f/n;
and
s+=1/(float)n;
To my understanding, he declared 'n' as a float later in the program (am I right?). So, my question is: both display the same result, but is there any difference between the two? As we are declaring 'n' a float, why has he written 1.0f instead of n.f or f.n? I tried it but it gives an error. And in the second method, why can't we write 1(float)/n instead of 1/(float)n, given that in the first method we added the float suffix to 1? Also, is there a difference between 1.f and 1.0f?
I tried to google my question but couldn't find any answer. Another confusion that came to my mind after a few hours is: why are we even declaring 'n' a float? As per the program, the sum should come out as a real number, so shouldn't we declare only 's' a float? The more I think, the more I confuse myself. Please help!
Thank You.

The reason is that integer division behaves differently from floating point division.
4 / 3 gives you the integer 1. 10 / 3 gives you the integer 3.
However, 4.0f / 3 gives you the float 1.3333..., 10.0f / 3 gives you the float 3.3333...
So if you have:
float f = 4 / 3;
4 / 3 will give you the integer 1, which will then be stored into the float f as 1.0f.
You instead have to make sure either the divisor or the dividend is a float:
float f = 4.0f / 3;
float f = 4 / 3.0f;
If you have two integer variables, then you have to convert one of them to a float first:
int a = ..., b = ...;
float f = (float)a / b;
float f = a / (float)b;
The first is equivalent to something like:
float tmp = a;
float f = tmp / b;
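Applied to the series from the question, a minimal sketch might look like the following. The function names harmonic_sum and harmonic_sum_truncated are made up for illustration; the second one deliberately keeps the integer division to show what goes wrong.

```cpp
// Sum 1 + 1/2 + ... + 1/terms with an int loop counter.
// 1.0f is a float, so n is converted to float before each division.
float harmonic_sum(int terms)
{
    float s = 0.0f;
    for (int n = 1; n <= terms; ++n) {
        s += 1.0f / n;
    }
    return s;
}

// Same loop, but 1 / n is int / int: every term after the first is 0.
float harmonic_sum_truncated(int terms)
{
    float s = 0.0f;
    for (int n = 1; n <= terms; ++n) {
        s += 1 / n;
    }
    return s;
}
```

For four terms the first function returns roughly 2.08, while the truncated version stays at 1 because 1/2, 1/3 and 1/4 all evaluate to 0.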

Since n will only ever have an integer value, it makes sense to define it as an int. However, doing so means that this won't work as you might expect:
s+=1/n;
In the division operation both operands are integer types, so it performs integer division which means it takes the integer part of the result and throws away any fractional component. So 1/2 would evaluate to 0 because dividing 1 by 2 results in 0.5, and throwing away the fraction results in 0.
This is in contrast to floating point division, which keeps the fractional component. C and C++ will perform floating point division if either operand is a floating point type.
In the case of the above expression, we can force floating point division by performing a typecast on either operand:
s += (float)1/n
Or:
s += 1/(float)n
You can also specify the constant 1 as a floating point constant by giving a decimal component:
s += 1.0/n
Or appending the f suffix:
s += 1.0f/n
The f suffix (as well as the U, L, and LL suffixes) can only be applied to numerical constants, not variables.
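The resulting types can be checked at compile time. This is only a sketch and assumes a C++11 (or later) compiler; the helper name same_result is hypothetical.

```cpp
#include <type_traits>

// The type of a division is decided by its operands alone.
static_assert(std::is_same<decltype(1 / 2), int>::value, "int / int is int");
static_assert(std::is_same<decltype(1.0f / 2), float>::value, "float / int is float");
static_assert(std::is_same<decltype(1.0 / 2), double>::value, "double / int is double");
static_assert(std::is_same<decltype(1.f / 2), float>::value, "1.f is the same literal as 1.0f");

// The suffixed literal and the cast produce the same value for an int n.
inline bool same_result(int n)
{
    return 1.0f / n == 1 / static_cast<float>(n);
}
```

If any of the static_assert lines were wrong, the file would fail to compile, so a successful build is itself the check.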

What he is doing is called casting. I'm sure your school will cover it in later lectures. Basically, n is declared as an integer for the entire program. But since int and float are both numeric types, the C/C++ language lets you convert a value from one to the other as long as you tell the compiler which type you want. You do this by putting the target type in parentheses in front of the value, i.e.
(float) n

he declared 'n' a float data type later in the program(Am I right?)
No, he defined (thereby also declared) n an int and later he explicitly converted (casted) it into a float. Both are very different.
both display the same result but is there any difference between the two?
Nope. They're the same in this context. When an arithmetic operator has int and float operands, the former is implicitly converted into the latter and thereby the result will also be a float. He's just shown you two ways to do it. When both operands are integers, you get an integer result, which will be wrong whenever proper mathematical division would give a non-integer quotient. To avoid this, one of the operands is usually made a floating-point number so that the actual result is closer to the expected result.
why he has written 1.0f instead of n.f or f.n. I tried it but it gives error. [...] Also, is there a difference between 1.f and 1.0f?
This is because the language syntax is defined that way. In a floating-point literal, the f suffix marks the type as float, and it must follow a literal that already has a decimal point or an exponent. So 5 is an int, while 5.0f or 5.f is a float; omitting the trailing 0 makes no difference. However, n.f is an error since n is an identifier (a variable name) and not a numeric literal.
And in the second method, why we can't write 1(float)/n instead of 1/(float)n?
(float)n is a valid C-style cast of the int variable n, while 1(float) is simply a syntax error: the type in parentheses has to come before the operand it converts.

s+=1.0f/n;
and
s+=1/(float)n;
... So, my question is, both display the same result but is there any difference between the two?
Yes.
In both C and C++, when a calculation involves expressions of different types, one or more of those expressions will be "promoted" to the type with greater precision or range. So if you have an expression with signed and unsigned operands, the signed operand will be "promoted" to unsigned. If you have an expression with float and double operands, the float operand will be promoted to double.
Remember that division with two integer operands gives an integer result - 1/2 yields 0, not 0.5. To get a floating point result, at least one of the operands must have a floating point type.
In the case of 1.0f/n, the expression 1.0f has type float (see note 1 below), so the n will be "promoted" from type int to type float.
In the case of 1/(float) n, the expression n is being explicitly cast to type float, so the expression 1 is promoted from type int to float.
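The conversions described above (the standard calls them the usual arithmetic conversions) can be observed directly. The following is a sketch assuming C++11 or later; minus_one_plus is a made-up helper name.

```cpp
#include <type_traits>

// Mixed operands are brought to a common type before the operator runs.
static_assert(std::is_same<decltype(1 + 1u), unsigned int>::value, "int + unsigned int is unsigned int");
static_assert(std::is_same<decltype(1.0f + 1.0), double>::value, "float + double is double");
static_assert(std::is_same<decltype(1 / 2.0f), float>::value, "int / float is float");

// The int -1 is converted to unsigned int before the addition,
// so it wraps around to the largest unsigned value first.
inline unsigned int minus_one_plus(unsigned int u)
{
    int s = -1;
    return s + u;
}
```

Calling minus_one_plus(0u) yields the largest unsigned int value rather than -1, which is the signed-to-unsigned case mentioned above.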
Nitpicks:
Unless your compiler documentation explicitly lists void main() as a legal signature for the main function, use int main() instead. From the online C++ standard:
3.6.1 Main function
...
2 An implementation shall not predefine the main function. This function shall not be overloaded. It shall have a declared return type of type int, but otherwise its type is implementation-defined...
Secondly, please format your code - it makes it easier for others to read and debug. Whitespace and indentation are your friends - use them.
1. The constant expression 1.0 with no suffix has type double. The f suffix tells the compiler to treat it as float. 1.0/n would result in a value of type double.

Related

Warning about arithmetic overflow when multiplying numbers

I'm writing a program to calculate the result of numbers:
int main()
{
float a, b;
cin >> a >> b;
float result = b + a * a * 0.4;
cout << result;
}
but I have a warning at a * a and it said Warning C26451 Arithmetic overflow: Using operator '*' on a 4 byte value and then casting the result to a 8 byte value. Cast the value to the wider type before calling operator '*' to avoid overflow (io.2). Sorry if this a newbie question, can anyone help me with this? Thank you!
In the C language as described in the first edition of K&R, all floating-point operations were performed by converting operands to a common type (specifically double), performing operations with that type, and then if necessary converting the result to whatever type was needed. On many platforms, that was the most convenient and space-efficient way of handling floating-point math. While the Standard still allows implementations to behave that way, it also allows implementations to perform floating-point operations on smaller types using those types directly.
As written, the subexpression a * a * 0.5; would be performed by multiplying a * a together using float type, then multiply by a value 0.5 which is of type double. This latter multiplication would require converting the float result of a * a to double. If e.g. a had been equal to 2E19f, then performing the multiply using type float would yield a value too large to be represented using that type. Had the code instead performed the multiplication using type double, then the result 4E38 would be representable in that type, and the result of multiplying that by 0.5 (i.e. 2E38) would be within the range that is representable by float.
Note that in this particular situation, the use of float for the intermediate computations would only affect the result if a was within narrow ranges of very large or very small values. If instead of multiplying by 0.5 one had multiplied by other values, however, the choice of whether to use float or double for the first multiplication could affect the accuracy of rounding. Generally, using double for both multiplies would yield slightly more accurate results, but at the expense of increased execution time. Using float for both may yield better execution speed, but at the cost of reduced precision. If the floating-point constant had been something that isn't precisely representable in float, converting to double and multiplying by a double constant may yield slightly more accurate results than using float for everything, but in most cases where one would want that precision, one would also want the increased precision that would be achieved by using double for the first multiply as well.
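The 2E19f case can be tried out with a small sketch like this one. It assumes IEEE-754 float and double (true on most current platforms, but not guaranteed by the language) and no fast-math style compiler options; both function names are hypothetical.

```cpp
#include <cmath>  // std::isinf / std::isfinite, handy when inspecting the results

// Multiply in float: a * a is rounded to float before the scaling.
inline float half_square_in_float(float a)
{
    float square = a * a;      // overflows to infinity for a = 2e19f
    return square * 0.5f;
}

// Widen first: the square is computed in double and only the final
// result is narrowed back to float.
inline float half_square_in_double(float a)
{
    double square = static_cast<double>(a) * a;
    return static_cast<float>(square * 0.5);
}
```

Under those assumptions, std::isinf(half_square_in_float(2e19f)) is true, while half_square_in_double(2e19f) returns a finite value near 2e38. For ordinary inputs such as 3.0f both give 4.5f.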
Let's look at the error message.
Using operator '*' on a 4 byte value
It is describing this code:
a * a
Your float is 4 bytes. The result of the multiplication is 4 bytes. And the result of a multiplication may overflow.
and then casting the result to a 8 byte value.
It is describing this code:
(result) * 0.4;
Your result is 4 bytes. 0.4 is a double, which is 8 bytes. C++ will promote your float result to a double before performing this multiplication.
So...
The compiler is observing that you are doing float math that could overflow and then immediately converting the result to a double, making the potential overflow unnecessary.
Change the code to this to remove the float to double to float conversions.
float result = b + a * a * 0.4f;
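As a rough check of that claim, the types involved can be inspected at compile time. This is a sketch assuming C++11 or later; result_all_float is a made-up name wrapping the question's formula.

```cpp
#include <type_traits>

// The question's formula with a float constant: every operand is float,
// so no widening to double takes place.
inline float result_all_float(float a, float b)
{
    return b + a * a * 0.4f;
}

// With 0.4 (a double) the float product a * a is widened to double,
// which is the conversion the warning is talking about.
static_assert(std::is_same<decltype(1.0f * 1.0f * 0.4), double>::value, "double constant widens the product");
static_assert(std::is_same<decltype(1.0f * 1.0f * 0.4f), float>::value, "float constant keeps it float");
```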
I read the question as "how to change the code to remove the warning?".
If you take the advice in the warning's text literally:
float result = b + (double)a * a * 0.4;
But this is nonsense — if an overflow happens, your result will probably not fit into float result.
It looks like in your case overflow is not possible, and you feel perfectly fine doing all calculations with float. If so, just write 0.4f instead of 0.4 — then the constant will have float type (instead of double), and the result will also be float.
If you want to "fix" the problem with overflow:
double result = b + (double)a * a * 0.4;
But then you must also change the following code, which uses the result. And you don't remove the possibility of overflow, you just make it much less likely.

Why do parentheses make a difference in this simple code

When I add parentheses to this float expression, the output becomes zero.
int main()
{
float a=12.00*(20/100);
cout <<a<<endl;
}
If I remove the parentheses, the output will be 2.4
but if I kept it, the output will be zero . Why ???
12.00*20/100 is parsed as (12.00*20)/100 and is evaluated:
Since 12.00 contains a decimal point; it is a double with value 12.
20 is an int with value 20.
In 12.00*20, the int 20 is converted to double, yielding a double with value 20.
Then the double values 12 and 20 are multiplied, producing a double value 240.
Then (12.00*20)/100 is dividing a double 240 by an int 100.
The int 100 is converted to double 100.
Then the double 240 is divided by the double 100, producing a double value of approximately 2.4 (exactly 2.399999999999999911182158029987476766109466552734375 when IEEE-754 binary64 is used).
In contrast, 12.00*(20/100) is evaluated:
20 and 100 are both int values, so 20/100 is performed with int division. int division produces an int result with the fraction discarded. So the result is the int value 0.
Then 12.00*(20/100) is multiplying the double value 12 by the int value 0, which produces 0.
In summary, two things are at play:
Given an expression a*b/c where the two operators * and / have otherwise equal precedence, they are structured to perform the left operation first. (This is the rule for multiplicative operators. Some operators, such as assignment, associate right-to-left.)
Multiplications or divisions with int operands are done with int arithmetic, even if the overall expression contains double operands somewhere else. In multiplications with mixed int and double operands, the int operand is converted to double.
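Both points can be seen side by side in a short sketch. The function names are invented for this example, and the values are passed as parameters so the compiler cannot simply fold the constants.

```cpp
// x * (num / den): the parenthesised int division runs first and truncates.
inline double scale_grouped(double x, int num, int den)
{
    return x * (num / den);
}

// x * num / den: parsed as (x * num) / den, so both steps are done in double.
inline double scale_left_to_right(double x, int num, int den)
{
    return x * num / den;
}
```

With the question's numbers, scale_grouped(12.00, 20, 100) gives 0 and scale_left_to_right(12.00, 20, 100) gives approximately 2.4.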
When you use
float a=12.00*(20/100);
the term (20/100) is computed first, before the result of that term is multiplied by 12.00. That's because the parenthetical term has higher precedence than the multiplication operator. 20/100 is computed using integer division, which results in 0.
When you use
float a=12.00*20/100;
The term 12.00*20/100 is evaluated as (12.00*20)/100 since the multiplication and division operators have the same precedence and left-to-right associativity. Those operations are performed by converting 20 and 100 to double. Hence, you get the expected answer.
One thing worth mentioning here, and which I'd recommend in modern C++: the suffixes on integer and floating-point literals.
As already mentioned in most answers, in C++ a numeric literal with no decimal point and no suffix has an integer type. But you can be specific about the type by using the floating-point suffix "f".
That way you explicitly tell the compiler which numeric type you are trying to use, and you get the result you expect.
In the case below, (20.0f/100.0f) is no longer an integer expression; I'm explicitly telling the compiler to treat these literals as floating-point values.
(Remember that 20.0 without a suffix is a double, not a float.)
Example, try this:
#include <iostream>
int main()
{
auto a = 12.0f * (20.0f/100.0f);
std::cout <<a<<std::endl;
}

Getting an int instead of a float

I am doing something like this
int a = 3;
int b = 4;
float c = a/b; // This gives 0 while it's supposed to give 0.75
I wanted to know why the above code doesn't work. I realize that 3 is an int and 4 is an int too. However, the result is being assigned to a float, so I expected a float. Instead I am getting 0 here. Any suggestions on what I might be doing wrong?
The division is evaluated first, and because both operands are integers, it evaluates to an integer, which only then gets assigned to the float.
This follows from the language's conversion rules: the type of a / b is determined by the operand types alone, not by the variable the result is stored in. To force the result to be of (at least) a particular type, at least one of the operands needs to be of that type (for example via a static_cast< >).
Thus:
float c = a / static_cast<float>(b);
float c = a/b ;
a and b are integers, so it is integer division.
From the C++ standard:
5.6 Multiplicative operators [expr.mul]
For integral operands the / operator yields the algebraic quotient with any fractional part discarded.
Instead, try this:
float c = a / static_cast<float>(b);
(As @TrevorHickey suggested, static_cast<float> is better than the old-style (float) cast.)
You can't divide two ints and receive a float. You either have to cast to a float or declare the variables as float.
float a = 3;
float b = 4;
float c = a/b;
or
float c = (float)a/(float)b;
HINT: the result from integer division is integer. The result of the division is then assigned to a float. That is a/b results in an int. Cast that however you want, but you aren't gonna get 0.75 out of it.
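The hint can be made concrete with a small sketch comparing where the cast is placed. Both function names are made up for this example.

```cpp
// Cast applied to the finished int quotient: the fraction is already gone.
inline float cast_after_division(int a, int b)
{
    return static_cast<float>(a / b);
}

// Cast applied to one operand: the division itself is done in float.
inline float cast_before_division(int a, int b)
{
    return static_cast<float>(a) / b;
}
```

For a = 3 and b = 4, the first returns 0 and the second returns 0.75.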
If you are working in C++, you should prefer static_cast over the C-style cast.
The compiler then checks at compile time that the conversion is one static_cast allows, instead of silently falling back to a more dangerous kind of cast.
float c = a/static_cast<float>(b);

Store division of integers in float value C++

I just wanted to ask what happens, number-wise, if I do not typecast integers to float when storing the result in a float variable, like this:
int32 IntVar1 = 100;
int32 IntVar2 = 200;
float FloatVar = IntVar1/IntVar2;
Currently i am doing this:
int32 IntVar1 = 100;
int32 IntVar2 = 200;
float FloatVar = float(IntVar1)/float(IntVar2);
But with the amount of code I have, this looks really clumsy. I thought about changing my int variables to float, but I guess that would be a performance hit. And since the integer values are not supposed to hold any decimals, it feels like a complete waste.
So I wonder, is there any way that option 1 could work? Or do I have to typecast OR change the variables to float? (All the typecasting pretty much makes the code unreadable.)
I wouldn't worry too much about premature optimization. If it makes more sense for your values to be expressed as float types, go for it. If your program doesn't run as fast as you need, and you've profiled it and know that the floating point operations are the problem, then start thinking about how to speed it up.
I'd value readability over all of the casting, which seems to be your instinct as well.
Also, since this question is tagged C++, I think it's (unfortunately?) more idiomatic to do:
float FloatVar = static_cast<float>(IntVar1)/IntVar2;
Behold the magic of functions:
float divide(int x, int y) // not named div: that would clash with the standard library's div
{
return float(x) / float(y);
}
Now you can say:
int32 IntVar1 = 100;
int32 IntVar2 = 200;
float FloatVar = divide(IntVar1, IntVar2);
You need at least one of those operands to be float, otherwise the division will be truncated. I usually cast the first operand:
float FloatVar = (float)IntVar1/IntVar2;
which, elegance-wise, isn't that bad.
As per the ISO/IEC standard- N3797 - section 5.6
For integral operands the / operator yields the algebraic quotient
with any fractional part discarded; if the quotient a/b is
representable in the type of the result, (a/b)*b + a%b is equal to a;
otherwise, the behavior of both a/b and a%b is undefined
The discarding of the fractional part is called truncation towards zero.
So it is no surprise that the fractional part is discarded in
22/7
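The quoted rule, including the behaviour for negative operands, can be checked with a sketch like the following. It assumes C++11 or later, where truncation toward zero is guaranteed; the function names are hypothetical.

```cpp
// Checks the identity quoted from the standard for one pair of ints.
// Assumes b != 0 and that a / b is representable (so not INT_MIN / -1).
inline bool quotient_identity_holds(int a, int b)
{
    return (a / b) * b + a % b == a;
}

// Integer division rounds toward zero, for negative values too.
inline int truncated_quotient(int a, int b)
{
    return a / b;
}
```

So truncated_quotient(22, 7) is 3, and truncated_quotient(-7, 2) is -3 rather than -4.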

Why is (1/2)*x different from 0.5*x? [duplicate]

This behaves as wanted:
double t = r[1][0] * .5;
But this doesn't:
double t = ((1/2)*r[1][0]);
r is a 2-D Vector.
Just thought of a possibility. Is it because (1/2) is considered an int and (1/2) == 0?
Is it because (1/2) is considered an int and (1/2) == 0?
Yes, both of those literals are of type int, therefore the result will be of type int, and that result is 0.
Instead, make one of those literals a float or double and you'll end up with the floating point result of 0.5, ie:
double t = ((1.0/2)*r[1][0]);
Because 1.0 is of type double, the int 2 will be promoted to a double and the result will be a double.
Write this instead:
double t = ((1/2.0)*r[1][0]);
1 / 2 is an integer division and the result is 0.
1 / 2.0 is a floating point division (with double values after the usual arithmetic conversions) and its result is 0.5.
Because 1/2 is int/int division. That means whatever the result is, anything after the decimal point is removed (truncated). So 1/2 would be 0.5, which truncates to 0.
I normally write the first number as a double: 1.0/2 ...
If you make the very first number a double, then every following operation in the same left-to-right chain of * and / is done in double. A parenthesized sub-expression such as (1/2) is still evaluated with int arithmetic on its own.
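One caveat to this habit is that a leading double only helps within the same chain of operators. The sketch below, with made-up function names, shows a case where it works and one where an int division still slips through.

```cpp
// lead * num / den: evaluated left to right, so lead (a double) pulls
// both operations into double arithmetic.
inline double chain(double lead, int num, int den)
{
    return lead * num / den;
}

// lead + num / den: the division binds tighter than +, so num / den is
// still int / int and truncates before lead is involved.
inline double sum_with_int_division(double lead, int num, int den)
{
    return lead + num / den;
}
```

Here chain(1.0, 1, 2) gives 0.5, but sum_with_int_division(1.0, 1, 2) gives 1.0 because 1/2 has already become 0.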
double t = r[1][0] * .5;
is equivalent to:
double t = ((1/2.0f)*r[1][0]);
and not:
double t = ((1/2)*r[1][0]);
This is due to the loss of the decimal part: the result of 1/2 is itself an int, so the fraction is gone before the multiplication happens.
As a guideline, whenever there is a division and the answer may be a non-integer, either avoid int operands, make one of the operands float or double, or use a cast.
You can write 1.0/2.0 instead. 1/2 displays this behaviour because both the numerator and the denominator are of an integer type, and an integer divided by another integer is always truncated to an integer.
I can't judge the quality of the question, but this seems like a very important issue to me. We assume that the compiler will do the laundry for us all the time, but sometimes that is not true.
Is there any way to avoid this situation? Possibly, or more importantly, by knowing the monster (C, C++), as most people point out above.
I would like to know if there are other ways to trace these "truncation" issues at compile time.