The output is returned as a string and not a number. E.g. the function returns "1563.7383" instead of 1563.7383

def polysum(n, s):
    import math
    r = math.pi / n
    area = ((0.25) * n * (s ** 2)) / math.tan(r)
    per = n * s
    total = area + (per ** 2)
    return ("%.4f" % total)

You did not add a language tag in your question, but judging from the syntax I would assume it is Python here.
First, note that 1563.7383 cannot be an int, but it can be a float.
Your problem is in your usage of "%.4f" % total, which converts total to a string with 4 decimal places.
You should probably just return total in your function (i.e. treat it as a float, without rounding, throughout your program). Only round it off using "%.4f" % total when you are actually outputting the value.
Alternatively, if you absolutely must round off the value in your function and have it as a float, consider using the built-in function round() instead. Your usage would then be return round(total, 4)

Related

float number to string converting implementation in STD

I've run into a curious issue. Look at this simple code:
int main(int argc, char **argv) {
    char buf[1000];
    snprintf_l(buf, sizeof(buf), _LIBCPP_GET_C_LOCALE, "%.17f", 0.123e30f);
    std::cout << "WTF?: " << buf << std::endl;
}
The output looks quite weird:
123000004117574256822262431744.00000000000000000
My question is: how is this implemented? Can someone show me the original code? I did not find it, or maybe it's too complicated for me.
I've tried to reimplement the same double-to-string transformation in Java code but failed. Even when I tried to get the exponent and fraction parts separately and sum the fractions in a loop, I always got zeros instead of these digits "...822262431744". When I tried to continue summing fractions beyond the 23 bits (of a float), I ran into another issue - how many fraction terms do I need to collect? Why does the original code stop at the left part and not continue until the scale ends?
So, I really do not understand the basic logic of how it is implemented. I've tried defining really big numbers (e.g. 0.123e127f), and it generates a huge number in decimal format. The number has much higher precision than a float can hold. This looks like an issue, because the string representation contains something that a float number cannot.
Please read the documentation:
printf, fprintf, sprintf, snprintf, printf_s, fprintf_s, sprintf_s, snprintf_s - cppreference.com
The format string consists of ordinary multibyte characters (except %), which are copied unchanged into the output stream, and conversion specifications. Each conversion specification has the following format:
introductory % character
...
(optional) . followed by integer number or *, or neither that specifies precision of the conversion. In the case when * is used, the precision is specified by an additional argument of type int, which appears before the argument to be converted, but after the argument supplying minimum field width if one is supplied. If the value of this argument is negative, it is ignored. If neither a number nor * is used, the precision is taken as zero. See the table below for exact effects of precision.
....
Conversion specifier: f, F
Explanation: converts floating-point number to the decimal notation in the style [-]ddd.ddd. Precision specifies the exact number of digits to appear after the decimal point character. The default precision is 6. In the alternative implementation decimal point character is written even if no digits follow it. For infinity and not-a-number conversion style see notes.
Expected argument type: double
So with f you forced the form ddd.ddd (no exponent), and with .17 you forced it to show 17 digits after the decimal separator. With such a big value, the printed outcome looks that odd.
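For comparison, here is a minimal sketch of my own (not from the question) contrasting the f and e conversions for the same value; the exact digits in the comments assume an implementation that converts the value faithfully:
#include <cstdio>

int main() {
    // %.17f forces fixed [-]ddd.ddd notation with 17 digits after the point
    std::printf("%.17f\n", 0.123e30f);  // 123000004117574256822262431744.00000000000000000
    // %.17e keeps the exponent, which is usually less surprising for values this large
    std::printf("%.17e\n", 0.123e30f);  // 1.23000004117574257e+29
}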
Finally I've found out what the difference is between the Java float -> decimal -> string conversion and the C++ float -> string (decimal) conversion. I did not find the original source code, but I replicated the same logic in Java to make it clear. I think the code explains everything:
// the context size might be calculated properly by getting the maximum
// float number (including the exponent value) - it's 40 + scale, 17 for me
MathContext context = new MathContext(57, RoundingMode.HALF_UP);
BigDecimal divisor = BigDecimal.valueOf(2);
int tmp = Float.floatToRawIntBits(1.23e30f);
boolean sign = tmp < 0;
tmp <<= 1;
// there might be a NaN value; this code does not support it
int exponent = (tmp >>> 24) - 127;
tmp <<= 8;
int mask = 1 << 23;
int fraction = mask | (tmp >>> 9);
// at this point we have all parts of the float: sign, exponent and fraction. Let's build the mantissa
BigDecimal mantissa = BigDecimal.ZERO;
for (int i = 0; i < 24; i++) {
    if ((fraction & mask) == mask) {
        // I'm not sure about speed; division at each iteration might be faster than pow
        mantissa = mantissa.add(divisor.pow(-i, context));
    }
    mask >>>= 1;
}
// this was the core line where I was losing accuracy, because of the context
BigDecimal decimal = mantissa.multiply(divisor.pow(exponent, context), context);
String str = decimal.setScale(17, RoundingMode.HALF_UP).toPlainString();
// add the minus sign manually, because Java drops it if the value becomes 0 after scaling; the C++ code doesn't
if (sign) {
    str = "-" + str;
}
return str;
Maybe this topic is useless; who really needs the same implementation that C++ has? But at least this code keeps all of the float's precision, compared to the most popular way of converting a float to a decimal string:
return BigDecimal.valueOf(1.23e30f).setScale(17, RoundingMode.HALF_UP).toPlainString();
The C++ implementation you are using uses the IEEE-754 binary32 format for float. In this format, the closest representable value to 0.123·10^30 is 123,000,004,117,574,256,822,262,431,744, which is represented in the binary32 format as +13,023,132·2^73. So 0.123e30f in the source code yields the number 123,000,004,117,574,256,822,262,431,744. (Because the number is represented as +13,023,132·2^73, we know its value is that exactly, which is 123,000,004,117,574,256,822,262,431,744, even though the digits "123000004117574256822262431744" are not stored directly.)
Then, when you format it with %.17f, your C++ implementation prints the exact value faithfully, yielding “123000004117574256822262431744.00000000000000000”. This accuracy is not required by the C++ standard, and some C++ implementations will not do the conversion exactly.
The Java specification also does not require formatting of floating-point values to be exact, at least in some formatting operations. (I am going from memory and some supposition here; I do not have a citation at hand.) It allows, perhaps even requires, that only a certain number of correct digits be produced, after which zeros are used if needed for positioning relative to the decimal point or for the requested format.
The number has much higher precision than float can be.
For any value represented in the float format, that value has infinite precision. The number +13,023,132·2^73 is exactly +13,023,132·2^73, which is exactly 123,000,004,117,574,256,822,262,431,744, to infinite precision. The precision the format has for representing numbers affects only which numbers it can represent, not how precisely it represents the numbers that it does represent.
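As a sanity check, here is a small C++ sketch of my own (not from the answer) that decodes the binary32 fields directly and prints the significand and power of two mentioned above:
#include <cstdint>
#include <cstdio>
#include <cstring>

int main() {
    float f = 0.123e30f;
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);                   // the raw IEEE-754 binary32 encoding
    uint32_t significand = (bits & 0x7FFFFF) | 0x800000;   // 23 stored bits plus the implicit leading 1
    int exponent = (int)((bits >> 23) & 0xFF) - 127 - 23;  // unbias, then rescale for an integer significand
    std::printf("%u * 2^%d\n", significand, exponent);     // prints: 13023132 * 2^73
}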

Python code to convert decimal to binary

I need to write a Python script that will convert a number x in base 10 to binary, with up to n digits after the decimal point. And I can't just use bin(x)! Here's what I have:
def decimal_to_binary(x, n):
    x = float(x)
    test_str = str(x)
    dec_at = test_str.find('.')
    # This section will work with numbers in front of the decimal
    p = 0
    binary_equivalent = [0]
    c = 0
    for m in range(0, 100):
        if 2**m <= int(test_str[0:dec_at]):
            c += 1
        else:
            break
    for i in range(c, -1, -1):
        if 2**i + p <= (int(test_str[0:dec_at])):
            binary_equivalent.append(1)
            p = p + 2**i
        else:
            binary_equivalent.append(0)
    binary_equivalent.append('.')
    # This section will work with numbers after the decimal
    q = 0
    for j in range(-1, -n-1, -1):
        if 2**j + q <= (int(test_str[dec_at+1:])):
            binary_equivalent.append(1)
            q = q + 2**j
        else:
            binary_equivalent.append(0)
    print float((''.join(map(str, binary_equivalent))))
So, say you call the function as decimal_to_binary(123.456, 4): it should convert 123.456 to binary with 4 places after the decimal, yielding 1111011.0111.
The first portion is fine - it will take the numbers in front of the decimal, in this case 123, and convert it to binary, outputting 1111011
However, the second portion, which deals with values after the decimal, is not doing what I think it should. The output it gives is not .0111, but rather .1111
I ran through the code with pen and paper writing down the value for each variable and it should work. But it doesn't. Can anyone help me fix this?
I call the function as decimal_to_binary(123.456, 4) and it prints out 1111011.1111
You're close, but there's an issue with your comparison when you go beyond the decimal:
if 2**j + q <= (int(test_str[dec_at+1:])):
What you're doing here is comparing a fractional value (since j is always negative) to a whole integer value. This comparison will, for all practical purposes, always be true.
Based on the surrounding logic, my guess would be that you're attempting to compare it to the actual decimal value here. Using your data, that would be 0.4 on the first iteration, so you expect the statement to be evaluated as:
0.5 <= 0.4
The actual comparison in your code is:
0.5 <= 4
There are two separate issues here:
You're taking all of the numbers after the decimal point, but not actually including the decimal point itself in your extraction. This is primarily why you are getting whole numbers in your test incorrectly. This is fixed simply by referencing test_str[dec_at:] rather than test_str[dec_at+1:]
You're casting to int. Even if you applied the change in the first point, your code would still not run correctly. However, in that case it would be because the cast would truncate the value down to 0 on every iteration. Cast to a float instead: float(test_str[dec_at:])
Your comparison line thus becomes if 2**j + q <= (float(test_str[dec_at:])):, which provides the correct output on my machine.
Note that floating point comparisons can be "finicky" in some situations, depending on rounding and the like. There are ways to mitigate this if needed.

Distinguish between Integer and Double in V8

In my implementation I provide a function to JavaScript that accepts a parameter.
v8::Handle<v8::Value> TableGetValueIdForValue(const v8::Arguments& args) {
    v8::Isolate* isolate = v8::Isolate::GetCurrent();
    v8::HandleScope handle_scope(isolate);
    auto val = args[1];
    if (val->IsNumber()) {
        auto num = val->ToNumber();
        // How to check if Int or Double
    } else {
        // val == string
    }
}
Now this parameter can have basically any type. As I support Int, Float and String I want to efficiently check for these types. Using IsNumber() and IsStringObject() I can make sure that the objects are numberish or a string.
But now I need to differentiate between an integer value and a float. What is the best way to perform this test? Is there a way to call / use the typeof function exposed to JS?
Quick Answer
v8::Value::NumberValue() will return the value of the JavaScript Number without loss of precision.
Explanation
It is true that the sets of numbers representable by int64_t and double are different, so it is natural to be concerned about what happens if the value is actually an int64_t, because v8::Value defines both
V8EXPORT int64_t v8::Value::IntegerValue() const;
V8EXPORT double v8::Value::NumberValue() const;
What is a v8::Number?
Consider v8::Number doc
Detailed Description
A JavaScript number value (ECMA-262, 4.3.20)
IntegerValue does return an int64_t, but there will be no more precision available, because the value is stored internally as a double-precision 64-bit binary format IEEE 754 value.
Testing in a browser
Checking if javascript can represent a value that a double can't but an int64_t can.
2^63 - 1 is equal to 9223372036854775807
Try typing the following in a javascript console; this value is parsed but the extra precision is thrown away because double can't represent it.
>9223372036854775807
the result
9223372036854776000
Try IsInt32() or IsUint32() to check whether the number is an integer or not.
https://github.com/v8/v8/blob/master/include/v8.h#L1313
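For instance, inside the question's handler it might look like this (an untested sketch against the same older V8 API the question uses):
if (val->IsInt32() || val->IsUint32()) {
    // the value is integral and fits in 32 bits
    int64_t i = val->IntegerValue();  // exact here, since the value is a small integer
} else if (val->IsNumber()) {
    // numeric, but fractional or outside the 32-bit range
    double d = val->NumberValue();
}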
Try using this line:
bool isInt = std::fmod(num->NumberValue(), 1.0) == 0.0;
NumberValue returns a double with the number's value, and the std::fmod(..., 1.0) == 0.0 test is true if that value has no fractional part. (The built-in % operator does not accept floating-point operands in C++, so std::fmod from <cmath> is needed here.)

Why do I get two different outputs here?

The following two pieces of code produce two different outputs.
//this one gives incorrect output
cpp_dec_float_50 x = log(2);
std::cout << std::setprecision(std::numeric_limits<cpp_dec_float_50>::digits) << x << std::endl;
The output it gives is
0.69314718055994528622676398299518041312694549560547
which is correct only up to the 15th decimal place. Had x been a double, even then we'd have got the first 15 digits correct. It seems that the result is overflowing, though I don't see why it should. cpp_dec_float_50 is supposed to have 50 digits of precision.
//this one gives correct output
cpp_dec_float_50 x = 2;
std::cout << std::setprecision(std::numeric_limits<cpp_dec_float_50>::digits) << log(x) << std::endl;
The output it gives is
0.69314718055994530941723212145817656807550013436026
which is correct according to WolframAlpha.
When you do log(2), you're using the implementation of log in the standard library, which takes a double and returns a double, so the computation is carried out to double precision.
Only after that's computed (to, as you noted, a mere 15 digits of precision) is the result converted to your 50-digit extended precision number.
When you do:
cpp_dec_float_50 x=2;
/* ... */ log(x);
You're passing an extended precision number to start with, so (apparently) an extended precision overload of log is being selected, so it computes the result to the 50 digit precision you (apparently) want.
This is really just a complex version of:
float a = 1 / 2;
Here, 1 / 2 is integer division because the parameters are integers. It's only converted to a float to be stored in a after the result is computed.
C++ rules for how to compute a result do not depend on what you do with that result. So the actual calculation of log(2) is the same whether you store it in an int, a float, or a cpp_dec_float_50.
Your second bit of code is the equivalent of:
float b = 1;
float c = 2;
float a = b / c;
Now, you're calling / on a float, so you get floating point division. C++'s rules do take into account the types of arguments and parameters. That's complex enough, and trying to also take into account what you do with the result would make C++'s already overly-complex rules incomprehensible to mere mortals.
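Putting that together, the fix for the first snippet is to construct the extended-precision value before taking the logarithm, so the extended-precision overload is selected. A minimal sketch (header and namespace details assumed from Boost.Multiprecision's documented usage):
#include <boost/multiprecision/cpp_dec_float.hpp>
#include <iostream>
#include <iomanip>
#include <limits>

using boost::multiprecision::cpp_dec_float_50;

int main() {
    // log(cpp_dec_float_50(2)) selects the extended-precision overload, not std::log(double)
    cpp_dec_float_50 x = log(cpp_dec_float_50(2));
    std::cout << std::setprecision(std::numeric_limits<cpp_dec_float_50>::digits)
              << x << std::endl;  // all ~50 digits are now correct
}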

How to obtain a value based on a certain probability

I have some functions which generate double, float, short, long random values. I have another function to which I pass the datatype and which should return a random value. Now I need to choose in that function the return value based on the passed datatype. For example, if I pass float, I need:
the probability that the return is a float is 70%, the probability that the return is a double, short or long is 10% each. I can make calls to the other function for generating the corresponding random values, but how do I fit in the probabilistic weights for the final return? My code is in C++.
Some pointers are appreciated.
Thanks.
C++ random numbers have a uniform distribution. If you need random variables with another distribution, you need to derive them from the uniform distribution using that distribution's mathematical formula.
If you don't have a mathematical formula for your random variable, you can do something like this:
int x = rand() % 10;
if (x < 7)
{
    // return float
}
else if (x == 7)
{
    // return double
}
else if (x == 8)
{
    // return short
}
else if (x == 9)
{
    // return long
}
This can serve as an alternative for future reference, and it can handle precise probabilities such as 99.999% or 0.0001%.
To get a probability (a real percentage), do the following:
// 70%
double probability = 0.7;
double result = rand() / (double)RAND_MAX;  // cast to double: integer division would almost always yield 0
if (result < probability)
    // do something
I have used this method to create very large percolated grids and it works like a charm for precise probability values.
I do not know if I understand correctly what you want to do, but if you just want to assure that the probabilities are 70-10-10-10, do the following:
generate a random number r in (1,2,3,4,5,6,7,8,9,10)
if r <= 7: float
if r == 8: short
if r == 9: double
if r == 10: long
I think you can recognize the pattern and adapt it to arbitrary probability values.
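If C++11 is available, std::discrete_distribution expresses such a 70/10/10/10 weighting directly; a minimal sketch of the same pattern (the weights and switch labels just restate the list above):
#include <random>

int main() {
    std::mt19937 gen(std::random_device{}());
    // weights 70, 10, 10, 10 for float, double, short, long respectively
    std::discrete_distribution<int> pick{70, 10, 10, 10};
    switch (pick(gen)) {
        case 0: /* return a float value */  break;
        case 1: /* return a double value */ break;
        case 2: /* return a short value */  break;
        case 3: /* return a long value */   break;
    }
}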
mmonem has a nice probabilistic switch, but returning different types isn't trivial either. You need a single type that can adequately (for your purposes) encode any of the values - check out boost::any, boost::variant, a union, or convert to the most capable type (probably double) or a string representation.
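For example, combining mmonem's switch with boost::variant might look like this sketch (the generator functions here are hypothetical placeholders for the functions the asker already has):
#include <boost/variant.hpp>
#include <cstdlib>

// Hypothetical stand-ins for the asker's existing generators
float  random_float()  { return static_cast<float>(std::rand()); }
double random_double() { return static_cast<double>(std::rand()); }
short  random_short()  { return static_cast<short>(std::rand()); }
long   random_long()   { return static_cast<long>(std::rand()); }

using RandomValue = boost::variant<float, double, short, long>;

RandomValue random_value_weighted() {
    int x = std::rand() % 10;            // 0..9, as in the switch above
    if (x < 7)  return random_float();   // 70% -> float
    if (x == 7) return random_double();  // 10% -> double
    if (x == 8) return random_short();   // 10% -> short
    return random_long();                // 10% -> long
}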