SAS Format in calculation - sas

I am creating a new variable, AGE. The CUTOFF value is 100 and it is divided by 12, so the threshold is 8.3333... (repeating). However, a few FRESHNESS values are 8.3333333. I need AGE to take the value of SEGMENT if FRESHNESS >= 100/12, but it is picking AMU where FRESHNESS is 8.3333... The format of FRESHNESS is F12.9 and the format of CUTOFF is BEST12.
data new;
set SEGMENT_AGE;
IF Freshness< CUTOFF/12 THEN AGE=AMU;
ELSE AGE=SEGMENT;
RUN;
I tried a different format, changing CUTOFF to F12.9, but it still does not work.

You're running into an issue of floating point precision. If a number is a repeating decimal (in binary), you may end up with two different values (the higher or the lower, i.e. 0.333333333333333333 or 0.3333333333333333333334) depending on how it was arrived at. For example:
1-(1/3) - (1/3) = 0.33333333333333333334
0+(1/3) = 0.33333333333333333333
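So do not assume two values are precisely equal just because they look like they should be. Changing the format does not help, because a format only controls how a value is displayed, not the value that is stored and compared. Further, some numbers that are not repeating decimals in decimal are repeating in binary: 7/10, for example, is 0.7 in decimal but cannot be stored precisely in binary.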
You should compare rounded numbers if you need to compare precisely; for example,
if round(freshness,0.001) < round(cutoff/12,0.001) ...
should result in your calculations matching your expectations.
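The same effect can be reproduced outside SAS. The following is a minimal Python sketch, not the SAS code itself: the function name pick_age and its parameters are hypothetical, and Python's round(x, 3) is only a rough stand-in for SAS's round(x, 0.001).

```python
def pick_age(freshness, cutoff, amu, segment, places=3):
    """Hypothetical analogue of the DATA step: compare rounded values."""
    # Rounding both sides removes the tiny difference between a value
    # stored as 8.3333333 and the computed threshold 100/12.
    if round(freshness, places) < round(cutoff / 12, places):
        return amu
    return segment


# The raw comparison treats 8.3333333 as smaller than 100/12 ...
print(8.3333333 < 100 / 12)
# ... while the rounded comparison treats the two as equal.
print(pick_age(8.3333333, 100, "AMU", "SEGMENT"))
# Two ways of arriving at "one third" that are not bit-for-bit equal.
print(1 - 1 / 3 - 1 / 3 == 1 / 3)
```

The number of places to round to is a judgment call: it should be coarser than the noise in the data but finer than any difference that actually matters.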

Related

Identify the value with highest number of decimal values

I have a range of values and I want to count the decimal places of every value in the range and display the maximum count. The formula should exclude trailing zeroes (that is, not count zeroes at the end of the decimal part).
For example, in the above sample, the maximum count of decimal places in the whole range is 4, excluding trailing zeroes, so the answer 4 should be displayed in cell D2.
I tried using a regex, but I do not know how to apply it to a whole range of values.
Please help!
try:
=INDEX(MAX(LEN(IFERROR(REGEXEXTRACT(TO_TEXT(A2:C4), "(\..+)")*1))-2))
Player0's solution is a good start, but it uses TO_TEXT, which seems to rely on the formatting of your cells.
If you want to safely compute the number of decimal places, use the TEXT function instead.
TEXT(number, format) requires a format in which the maximum number of decimal places is specified. There is no way around this, because formulas like =1/3 can have infinitely many decimal places.
Therefore, first decide on the maximum precision for your use case (here we use 8). Then use the function below, which works independently of your document's formatting and language:
=INDEX(MAX(
LEN(REGEXEXTRACT(
TEXT(ABS(A2:C4); "."&REPT("#";8));
"[,.].*$"
))-1
))
We subtract 1 because LEN(REGEXEXTRACT()) also counts the decimal separator (. for English, , for many other locales).
Everything after the 8th decimal place is ignored. If all your numbers are something like 123.00000000987, the computed max. is 0. If you prefer it to be 8 instead, then add ROUNDUP( ; 8):
=INDEX(MAX(
LEN(REGEXEXTRACT(
TEXT(ROUNDUP(ABS(A2:C4);8); "."&REPT("#";8));
"[,.].*$"
))-1
))
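For readers who want to check the idea outside a spreadsheet, here is a rough Python sketch of the same approach: format each value with a fixed maximum precision, strip trailing zeroes, and count what is left after the decimal point. The function name decimal_places and the sample values are made up for illustration, and Python's formatting rounds at the last place, so values with digits beyond the chosen precision may not behave exactly like the TEXT formula.

```python
def decimal_places(value, max_places=8):
    """Count decimal places of a number, ignoring trailing zeroes."""
    # Fixed-point text with max_places digits, e.g. 1.5 -> "1.50000000".
    text = f"{abs(value):.{max_places}f}"
    # Drop trailing zeroes; the decimal point itself stops the stripping.
    text = text.rstrip("0")
    # Whatever remains after the point is the significant decimal part.
    return len(text.split(".")[1])


values = [1.5, 2.25, 3.1234, 4.10, 7]
print(max(decimal_places(v) for v in values))
```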

Changing Decimal Number to Percent

I have a column I would like to change from a decimal number to a percent. Here are some example numbers: 28.97, 42.83 and 99.25.
Because of where the decimal point is located, when I change the value to percentage, the numbers turn to 2897%, 4283% and 9925% instead of 28.97%, 42.83% and 99.25%.
Is there a way in DAX or Power Query Editor to move the decimal point over to the left two spots so I can get the correct output?

C++ Xtensor increase floating point significant numbers

I am building a neural network and using xtensor for array multiplication in the feed-forward pass. The network takes in an xt::xarray<double> and outputs a decimal number between 0 and 1. I have been given a sheet of expected outputs. When I compare my output with the provided sheet, I find that all the results differ after exactly 7 digits. For example, if the required value is 0.1234567890123456, I get values like 0.1234567993344660 or 0.1234567221155667: the first 7 digits match and the rest, up to 16 digits in total, are garbage.
I know I cannot get exactly 0.1234567890123456 because of floating point math. But how can I debug this, or increase the precision, to get closer to the required number? Thanks.
Update:
xt::xarray<double> Layer::call(xt::xarray<double> input)
{
return xt::linalg::dot(input, this->weight) + this->bias;
}
As for the code, I am simply calling this call method a bunch of times, where weight and bias are xt::xarray<double> arrays.
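One thing that may be worth checking, purely as a guess that the question does not confirm: agreement to about 7 significant digits is what you would typically see if single precision (float) is involved somewhere, for example in how the weights or the expected values were stored. The Python sketch below only illustrates how much of a double survives a round trip through single precision; the helper name to_float32 is hypothetical and has nothing to do with xtensor's API.

```python
import struct


def to_float32(value):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]


expected = 0.1234567890123456
single = to_float32(expected)

# Relative error of the single-precision copy; bounded by 2**-24 (~6e-8),
# which is why only about 7 significant digits agree.
relative_error = abs(single - expected) / expected

print(f"{expected:.16f}")
print(f"{single:.16f}")
print(relative_error)
```

If the mismatch in the real network looks like this, comparing intermediate layer outputs against the sheet would show where the precision is lost.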

Why does the binaryw. format behave differently above width 58?

Consider the following program and output:
data _null_;
input a;
length b $64;
do i = 1 to 64;
fmtname = cats('binary',i);
b = cats(putn(a,fmtname));
put i= b=;
end;
cards;
1
;
run;
Output (SAS 9.1.3, Windows 7 x64):
i=1 b=1
i=2 b=01
i=3 b=001
i=4 b=0001
i=5 b=00001
/*Skipped a few very similar lines*/
i=58 b=0000000000000000000000000000000000000000000000000000000001
i=59 b=11111110000000000000000000000000000000000000000000000000000
i=60 b=111111110000000000000000000000000000000000000000000000000000
i=61 b=1111111110000000000000000000000000000000000000000000000000000
i=62 b=11111111110000000000000000000000000000000000000000000000000000
i=63 b=011111111110000000000000000000000000000000000000000000000000000
i=64 b=0011111111110000000000000000000000000000000000000000000000000000
Last few lines of output from SAS 9.4 on Linux x64:
i=60 b=000000000000000000000000000000000000000000000000000000000001
i=61 b=1111111110000000000000000000000000000000000000000000000000000
i=62 b=11111111110000000000000000000000000000000000000000000000000000
i=63 b=011111111110000000000000000000000000000000000000000000000000000
i=64 b=0011111111110000000000000000000000000000000000000000000000000000
This behaviour is rather unexpected, to me at least, and doesn't seem to be documented on the help page. It agrees with the document I found here for width 64 - standard double precision - but I don't understand why it flips over at width 59.
I don't quite get the same result - mine switches at 61 - but I believe the answer is the same.
Up to some point - 58, 60, somewhere around there - SAS is showing you the fixed-point integer representation of the number. Test this with a decimal, like so:
data _null_;
a=3.14159265358979323846264338327950288419716939937510582;
length b $64;
put a= hex4.;
put a= hex8.;
put a= hex16.;
do i = 1 to 64;
fmtname = cats('binary',i);
b = cats(putn(a,fmtname));
put i= b=;
end;
run;
And you will get a sort-of-surprising result - you see 000...0011 for most of your rows, up through 60. The documentation doesn't explicitly mention this, but it does show it in the example (123.45 and 123 are identical in binary8.).
Then starting at 61, or 59 for you I'm guessing, you see the actual representation of the number as SAS internally stores it (or, arguably, how Intel internally stores it).
The binary documentation doesn't explain this well, but the HEX. documentation does explain it pretty clearly in a tip:
If w< 16, the HEXw. format converts real binary numbers to fixed-point integers before writing them as hexadecimal characters. It also writes negative numbers in two's complement notation, and right aligns digits. If w is 16, HEXw. displays floating-point values in their hexadecimal form.
Binary is doing the same, and on my machine it happens right at the point where HEX would also make the change: 15 hex digits at 4 bits each is 60 bits. HEX. shows the same thing; in the program above, hex4. and hex8. show a different result than hex16.
To be clear, the value shown at binary64. is correct, and not any sort of truncation (though 61-63, and in your example 59-60, are left-truncated).
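The floating-point pattern and the left-truncation can be checked independently of SAS. This is a small Python sketch, assuming the usual IEEE 754 double layout; the helper names double_bits and left_truncated are made up for illustration.

```python
import struct


def double_bits(value):
    """Return the 64-bit IEEE 754 pattern of a double as a bit string."""
    # Pack as a big-endian double, then reinterpret as an unsigned integer.
    (as_int,) = struct.unpack(">Q", struct.pack(">d", value))
    return format(as_int, "064b")


def left_truncated(value, width):
    """Keep only the rightmost `width` bits, dropping bits on the left."""
    return double_bits(value)[-width:]


# Full pattern for 1.0: sign 0, exponent 01111111111, mantissa all zeroes.
print(double_bits(1.0))
# Dropping the 5 leftmost bits gives the string reported for width 59.
print(left_truncated(1.0, 59))
```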
I did find a SAS usage note regarding this, though it's clearly out of date based on our tests:
Beginning with SAS® Version 7, the BINARYw. format was changed to be more consistent with the HEXw. format. When the HEXw. format uses a width of 16, (corresponding to 8 bytes of data), it produces a hexadecimal representation of the floating point value. The BINARYw. format changed so that widths of 57-64 produce a binary representation of the floating point value, since widths of 57-64 correspond to 8 bytes of data.
It also contains a suggestion for how to get consistent results for integers, which may be of use.
BIN_64=PUT(PUT(VALUE,S370FIB8.),$BINARY64.);
S370FIB8. is a format that converts numbers to their fixed integer binary representation, in IBM Mainframe format. (I.e., it writes the integer in Big-Endian format, which is not what you'd get on an Intel machine.)
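As a rough illustration of what that suggestion produces, the Python sketch below writes an integer as 8 big-endian bytes and prints the bits. It is an analogue under the assumption that the value is a whole number that fits in a signed 64-bit integer; the function name fixed_int_bits is hypothetical.

```python
import struct


def fixed_int_bits(value):
    """Bits of a signed 64-bit big-endian integer, as a 64-character string."""
    # ">q" means big-endian, signed, 8 bytes.
    raw = struct.pack(">q", int(value))
    return "".join(f"{byte:08b}" for byte in raw)


print(fixed_int_bits(1))
print(fixed_int_bits(-1))
```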

Converting CHAR to NUM with varying decimal places

I am trying to convert a column from character to numeric. The problem is that this column has a varying number of decimal places.
For example,
Data
1052969525
392282764.234
221018301.2
130010764.7894
82340150
183779233.4
I have determined that the likely maximum number of decimal places is 4 and that the width required would be about 15, so I attempted the following:
datanum = input(data, 15.4);
But this appears to put the decimal point in the wrong place, especially for values that have no decimal places. What is the most reasonable way to convert this column from char to numeric? This column is part of a database table uploaded by someone else, so there's not much option to change that. Thanks.
You don't normally supply the decimal width in informats. For a normal number, you only supply the width, and SAS will figure out the decimal for you (based on the position of the decimal point).
datanum = input(data,15.);
The .d part of an informat is to allow for compatibility with (mostly) older systems that did not have decimals in the data, to save space. For example, if I'm reading in money amounts, and I only have 6 spaces:
123456
882348
100000
123400
I can read that in as an integer amount of cents - or I can do:
input cost 6.2;
That will then tell SAS to place the decimal before the last 2 characters.
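The rule can be sketched in a few lines of Python. This is a simplified model of the w.d behaviour described above, not SAS's actual implementation: the function name read_w_d is hypothetical, and it ignores details such as the width and scientific notation.

```python
def read_w_d(text, d=0):
    """Simplified model of reading text with a w.d informat."""
    text = text.strip()
    if "." in text:
        # An explicit decimal point in the data wins; d is ignored.
        return float(text)
    # No decimal point: the digits are divided by 10 to the power d.
    return int(text) / 10 ** d


# With d=4, values without a decimal point are shifted, which is the
# "decimal in the wrong place" effect from the question.
print(read_w_d("1052969525", 4))
print(read_w_d("392282764.234", 4))
# With no d, both kinds of value come through unchanged.
print(read_w_d("1052969525"))
# The money example: 123456 read with d=2.
print(read_w_d("123456", 2))
```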