Cheers!
Long story short, I'm trying to deal with the SAS rounding functions and floating point precision.
Basically, I need to round half down to two decimals, e.g. 1.235 should round to 1.23 and not 1.24.
Here is an example:
number   desired_output
1.230    1.23
1.235    1.23
1.2355   1.24
1.2305   1.23
1.231    1.23
1.236    1.24
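For comparison outside SAS, here is a minimal Python sketch of the same "round half down" rule using decimal arithmetic, so that 1.235 is an exact tie rather than a binary approximation. The function name round_half_down is hypothetical, and note that Python's ROUND_HALF_DOWN sends ties toward zero, which only matches "down" for non-negative values.

```python
from decimal import Decimal, ROUND_HALF_DOWN

def round_half_down(text, quantum="0.01"):
    """Round a decimal string to `quantum`, sending exact ties toward zero."""
    return Decimal(text).quantize(Decimal(quantum), rounding=ROUND_HALF_DOWN)

# The six inputs from the table above, kept as strings so no binary
# floating point error creeps in before rounding.
numbers = ["1.230", "1.235", "1.2355", "1.2305", "1.231", "1.236"]
results = [str(round_half_down(n)) for n in numbers]

for n, r in zip(numbers, results):
    print(n, "->", r)
```

The results list reproduces the desired_output column; the key design choice is parsing the values as decimal strings instead of floats.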
I have tried many ways without success (e.g. several combinations of the round(), ceil(), floor() and rounde() functions). To replicate the exercise, here are some tests:
data test;
input number desired_output;
datalines;
1.230 1.23
1.235 1.23
1.2355 1.24
1.2305 1.23
1.231 1.23
1.236 1.24
;
run;
data test;
set test;
round_01=round(number,.01);
round_ceil_01=ceil(round_01*100)/100;
round_floor_01=floor(round_01*100)/100;
round_even=rounde(number,.01);
less_half_rounding_factor=round(number-0.0005,.01);
run;
Thank you in advance!
How about:
round(number,0.01) - 0.01*(mod(number,0.01)=0.005)
Remove 0.01 when the next digit is exactly 5.
Test: Let's generate numbers to 5 decimal places and keep only those where the ROUND() function differs from the "round_down" logic above.
data test;
  do integer=1 to 100000 ;
    number=integer/100000 ;
    round=round(number,0.01);
    round_down = round(number,0.01) - 0.01*(mod(number,0.01)=0.005);
    if round ne round_down then output;
  end;
run;
Now let's check whether any of the differing values are ones whose last three decimal places are not 500, that is, values that are not exactly X.XX500.
data test2;
set test;
where 500 ne mod(integer,1000) ;
run;
So there were 100 cases where ROUND and ROUND_DOWN differed, and they were all cases where the value had a 5 in the thousandths place and zeros after that.
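The same exhaustive check can be sketched in Python with integer arithmetic, which sidesteps floating point entirely. This is an illustration under stated assumptions, not the SAS logic itself: each value is held as an integer count of 1e-5 units, and half_up and half_down are hypothetical helper names.

```python
def half_up(n, step=1000):
    """Round n (in units of 1e-5) to whole hundredths, exact ties going up."""
    return (n + step // 2) // step

def half_down(n, step=1000):
    """Same rounding, but an exact tie goes to the lower hundredth."""
    return (n + step // 2 - 1) // step

# All values 0.00001 .. 1.00000, as integers.
differing = [n for n in range(1, 100001) if half_up(n) != half_down(n)]

# Any differing value that is NOT of the form X.XX500 would be a surprise.
unexpected = [n for n in differing if n % 1000 != 500]

print(len(differing), len(unexpected))
```

As in the SAS test, the two rules disagree only on exact ties, so the second list comes out empty.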
Hope this is what you are looking for
data test;
set test;
value = input(put(number,4.2),best.);
run;
I have been asked to write some code in SAS that rounds a number up, but only if the digit in the thousandths place is greater than one. For example, 78.858 would obviously round up to 78.86, but I would also want 78.852 to round up to 78.86.
I would just do it in two operations. Use the normal ROUND() function. Then check how much it changed. And then based on that difference decide whether or not to add an extra hundredth.
Example:
data have;
input x ;
cards;
78.858
78.86
78.852
78.8515
;
data want;
set have;
round=round(x,0.01);
diff = x-round;
if diff > 0.001 then round=round+0.01 ;
run;
Results
OBS x round diff
1 78.8580 78.86 -.0020
2 78.8600 78.86 0.0000
3 78.8520 78.86 0.0020
4 78.8515 78.86 0.0015
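The two-step idea (round normally, then inspect the difference) can be sketched in Python as follows. This is a hedged illustration, not a translation of the SAS step: it assumes the inputs arrive as decimal strings and uses the decimal module so that the comparison against 0.001 is exact; the function name round_up_past_one is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_up_past_one(text):
    """Round to hundredths, then bump up if more than 0.001 was discarded."""
    x = Decimal(text)
    rounded = x.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    diff = x - rounded                 # positive when rounding went down
    if diff > Decimal("0.001"):
        rounded += Decimal("0.01")     # add the extra hundredth
    return rounded

inputs = ["78.858", "78.86", "78.852", "78.8515"]
outputs = [str(round_up_past_one(v)) for v in inputs]
print(outputs)
```

With binary doubles the comparison diff > 0.001 can go either way for a value such as 78.851, which is why this sketch does the comparison in decimal.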
I have a program which needs to search values from -100.00 to +100.00 in increments of 0.01 inside a for loop. But the if conditions aren't working properly even though the code looks correct...
As an example, I tried printing a small section, e.g. if(i==1.5){cout<<"yes...";}
It was not working even though the loop was reaching the value i=1.5; I verified that by printing each of the values too.
#include<iostream>
#include<stdio.h>
using namespace std;
int main()
{
    double i;
    for(i=-1.00; i<1.00; i=i+0.01)
    {
        if(i>-0.04 && i<0.04)
        {
            cout<<i;
            if(i==0.01)
                cout<<"->yes ";
            else
                cout<<"->no ";
        }
    }
    return 0;
}
Output:
-0.04->no -0.03->no -0.02->no -0.02->no -0.01->no 7.5287e-016->no 0.01->no 0.02->no 0.03->no
Process returned 0 (0x0) execution time : 1.391
(notice that 0.01 is reached but it still prints 'no')
(also notice that -0.04 is printed even though the condition i>-0.04 should exclude it)
Use if(fabs(i - 0.01) < 0.00000000001) instead (fabs is declared in <cmath>).
double - double precision floating point type. Usually IEEE-754 64 bit floating point type
The crux of the problem is that numbers are represented in this format as a whole number times a power of two; rational numbers (such as 0.01, which is 1/100) whose denominator is not a power of two cannot be exactly represented.
In simple terms, if the number can't be represented as a finite sum of terms of the form 1/(2^n), you don't have the exact number you want to use. So to compare two double numbers, calculate the absolute difference between them and compare it against a tolerance value, e.g. 0.00000000001.
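The tolerance comparison can be sketched in Python (used here only for illustration; the question itself is C++). The function count_hits and its parameters are hypothetical names, and the tolerance 1e-11 is simply the value suggested above.

```python
import math

def count_hits(target=0.01, start=-1.0, stop=1.0, step=0.01, tol=1e-11):
    """Walk from start to stop by repeated addition and count tolerant matches."""
    hits = 0
    i = start
    while i < stop:
        # Compare with an absolute tolerance instead of ==, because the
        # accumulated value of i is only approximately a multiple of step.
        if math.isclose(i, target, rel_tol=0.0, abs_tol=tol):
            hits += 1
        i += step
    return hits

hits = count_hits()
print(hits)
```

The tolerance has to be much larger than the accumulated rounding error yet much smaller than the step, so that exactly one loop value matches.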
Doubles are stored in binary format. To cut things short, the fractional part is written in binary. Now let's imagine its size is 1 bit. Then you have two possible values (for the fraction only): .0 and .5. With two bits you have: .0 .25 .5 .75. With three bits: .0 .125 .25 .375 .5 .625 .75 .875. And so on. But you'll never get exactly 0.1. So what does the computer do? It cheats. It lies to you that the 0.1 you see is 0.1, while it is really something like 0.1000000000000000055. Why does it look like 0.1? Because the formatting of floating point values has a long-standing tradition of rounding numbers, so 0.10000000000001 becomes 0.1. As a result, adding 0.1 ten times won't give exactly 1.0.
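A short Python sketch makes the hidden digits visible. Python floats are the same IEEE-754 doubles, and converting one to Decimal shows its exact stored value; the variable names are only illustrative.

```python
from decimal import Decimal

# Exact value of the double that is closest to 0.1.
stored = Decimal(0.1)
print(stored)

# Short formatting rounds the digits away; a longer format reveals them.
print(f"{0.1:.1f}", f"{0.1:.20f}")

# Repeated addition accumulates the representation error.
total = 0.0
for _ in range(10):
    total += 0.1
print(total == 1.0, total)
```

Note that a single multiplication can happen to round back to the expected value, so it is the repeated addition in a loop that exposes the drift most reliably.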
The correct solution is to avoid floating point numbers unless you don't care about precision. If your program breaks once your floating point value "changes" by a minuscule amount, then you need to find another way. In your case, using non-fractional values will be enough:
for(auto ii=-100; ii<100; ++ii)
{
    if(ii>-4 && ii<4)
    {
        cout << (ii / 100.0);
        if(ii==1)
            cout<<"->yes ";
        else
            cout<<"->no ";
    }
}
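The integer-counter approach translates directly to other languages. Here is a minimal Python sketch of the same loop, where ii counts hundredths and the division happens only for display; the labels list is a hypothetical name used to collect what would be printed.

```python
labels = []
for ii in range(-100, 100):        # ii counts hundredths: -1.00 .. 0.99
    if -4 < ii < 4:
        value = ii / 100           # convert only when a real value is needed
        # The comparison is on the integer counter, so it is exact.
        labels.append((value, "yes" if ii == 1 else "no"))

for value, label in labels:
    print(f"{value}->{label}", end=" ")
print()
```

Because the loop variable never accumulates fractional steps, there is no drift and the boundary values -4 and 4 are excluded exactly as written.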
I have a time variable that is expressed as a character in SAS. Example: 0:04 0:12 0:01 0:11 etc. I would like to convert it to a numeric variable 0.04 0.12 0.01 etc.
Using this code:
data work.set2; set work.set;
TIME2 = input(TIME, best4.);
;
run;
creates a new column with nothing but missing values. Can you advise on what to improve in my code?
SAS stores dates and times as numbers; a time is the number of seconds since midnight. I think converting it to a SAS time is your best option. There is a significant difference between 0.10 and 0:10, because 0.1 of a minute is 6 seconds while 0:10 read as minutes:seconds is 10 seconds. For example, if you had 0.1 and 0.2 and took the difference, that's 0.1: is that now a 10 or a 6 second difference? You really need to think through how you want to interpret it; using your approach will be problematic.
The difference in times will not be reflected correctly.
Also, is 0:04 4 seconds or 4 minutes? The standard interpretation would be 4 minutes, which is 240 seconds.
Here's how you can convert it:
data have;
x = '0:04';output;
x = '0:12';output;
x = '0:11'; output;
x = '1:00'; output;
x = '4:25'; output;
run;
data want;
set have;
sas_time = input(x, time.);
sas_time2 = sas_time;
format sas_time2 time4.;
/*if it's seconds*/
seconds = input(scan(x, 1, ':'), 8.)*60 + input(scan(x, 2, ':'), 8.);
run;
proc print data=want;run;
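To make the two interpretations concrete, here is a small Python sketch (not SAS) that converts the same strings to seconds under either reading. The function clock_to_seconds and its minor_unit_seconds parameter are hypothetical: pass 1 to read the text as minutes:seconds and 60 to read it as hours:minutes.

```python
def clock_to_seconds(text, minor_unit_seconds=1):
    """Convert 'a:b' to seconds; the second field is worth minor_unit_seconds."""
    major, minor = text.split(":")
    return (int(major) * 60 + int(minor)) * minor_unit_seconds

samples = ["0:04", "0:12", "0:11", "1:00", "4:25"]

as_min_sec = [clock_to_seconds(s) for s in samples]         # minutes:seconds
as_hour_min = [clock_to_seconds(s, 60) for s in samples]    # hours:minutes

print(as_min_sec)
print(as_hour_min)
```

Either way the result is a plain count of seconds, so differences between two times stay meaningful, which is not the case for a value like 0.12 obtained by swapping the colon for a dot.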
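If your times are character strings, you can replace the colon with a dot. Here TIME is the character variable from the question; the result is still character, so wrap it in input(..., 8.) if you need a numeric value:
WITHDOTS=translate(TIME,'.',':');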
Source:
https://communities.sas.com/t5/Base-SAS-Programming/Find-And-Replace-within-a-string/td-p/45104
Good day,
I had this issue where I was writing some numbers to a database. They should have had the value 0.1 in SAS, but for some bizarre reason they appeared as 0.09 in the SQL database. When I manually checked the dataset, it showed 0.10 in format 12.2.
So what I did was check whether the values are actually 0.1 or somewhat below it:
data _checking;
set publish_data;
if value < 0.1;
dummy = value*10000000;
run;
It appeared that a number of observations fulfill the first condition. OK... That explains why the values come out as 0.09. Rounding issue.
However, all dummy values come out as integers. I tried factors of 10, 100, 1k and 10k; all appear to come out as integers (1, 10, 100 ...).
Next step I tried:
data _checking2;
set _checking;
if dummy<10; /*Depending on the factor*/
run;
This is consistent. Dummy retains the value 'a little below the value shown'.
I solved the issue by round(value,.1);
Questions:
1. How can I observe the actual value stored in the dataset? (Especially in the 'a little below' case)
2. If the first IF condition is true, then how can the check with dummy still show integer values? (Because in computers epsilon has to have an actual value)
2.b Or is this just a display issue? Or does SAS have a flag for 'value minus epsilon'?
Answer 1:
The most precise and least human way to see the actual value is to observe the underlying IEEE bytes using HEX format.
Answer 2:
The default format for those new dummy variables is BEST12., so you won't see any small offsets if they are smaller than what best12. will show, or more precisely epsilon < 1e-(12-log10(x)). The SAS format could be considered a display issue in this case.
If your use case is that of a 'shown' value must be the actual value sent to a remote database then you will want to use ROUND prior to populating the remote tables.
data x;
x = 1/3; output;
x = 0.1 - 1e-13; output;
format x 12.2;
run;
data y;
set x;
put x= x= HEX16.;
xhex = x;
format xhex hex16.;
array dummy dummy1-dummy13;
do _n_ = 1 to 13;
dummy(_n_) = x * 10**_n_;
end;
run;
proc print data=y;
run;
data z;
do p = 0 to 10;
do q = 1 to 15;
array z z1-z15;
z(q) = 10**p + 10**-q;
end; output;
end;
drop p q;
run;
==== LOG ====
x=0.33 x=3FD5555555555555
x=0.10 x=3FB9999999997D74
==== PRINT ====
Obs x xhex dummy1 dummy2 dummy3 dummy4 dummy5 dummy6 dummy7
1 0.33 3FD5555555555555 3.33333 33.3333 333.333 3333.33 33333.33 333333.33 3333333.33
2 0.10 3FB9999999997D74 1.00000 10.0000 100.000 1000.00 10000.00 100000.00 1000000.00
Obs dummy8 dummy9 dummy10 dummy11 dummy12 dummy13
1 33333333.33 333333333.33 3333333333.3 33333333333 333333333333 3.3333333E12
2 10000000.00 100000000.00 1000000000.0 10000000000 100000000000 999999999999
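The same inspection can be sketched in Python, whose floats are also IEEE-754 doubles. This is an illustration rather than SAS behavior: ieee_hex is a hypothetical helper that dumps the eight underlying bytes, and the truncation step is only an assumption about how a database could turn a value just below 0.1 into 0.09.

```python
import math
import struct

def ieee_hex(value):
    """Big-endian hex dump of the 8 bytes of a double, like SAS HEX16."""
    return struct.pack(">d", value).hex().upper()

x = 0.1 - 1e-13                      # "a little below" 0.1

# Two decimals hide the offset; the raw bytes and repr() do not.
print(f"{x:.2f}", ieee_hex(x), repr(x))
print(ieee_hex(1 / 3), ieee_hex(0.1))

# Assumed failure mode: truncating to two decimals instead of rounding.
truncated = math.floor(x * 100) / 100
print(truncated, round(x, 1))

# Scaling up does not help if the display still rounds to two decimals.
print(f"{x * 1e7:.2f}", x * 1e7 < 1_000_000)
```

Rounding before export, as suggested above, removes the offset, whereas any display format with too few digits merely hides it.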
You can try a different format. Try 32.31 or best32.
Subtract 0.1-value and look at the result. Again, use a format with a lot of decimal places.
You are probably not seeing the value in the dummy variables because the epsilon is very small and the dummy is still getting rounded for display.
Try dummy=value*1e16 or higher.
Numbers in SAS are C doubles, fwiw.
I have a double precision variable x = 10, and when I use the statement Print(*,*) x, Fortran prints a lengthy number such as 10.0000000000000. I only want 2 digits after the decimal point, that is 10.00. What should I do instead of using Print(*,*)? Thank you all in advance.
X=10
WRITE(*,44) X
44 FORMAT(F5.2)
I think the FORMAT statement is what you're after. F5.2 says to write a real in 5 columns with 2 digits after the decimal point. Note that 10.00 needs five columns; a narrower field such as F4.2 would print asterisks instead.
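The width rule of the Fw.d edit descriptor can be illustrated with a small Python emulation. This is a simplified sketch, not Fortran itself: fortran_f is a hypothetical helper that right-justifies the number in the field and fills the field with asterisks when the text does not fit, which is how Fortran signals overflow.

```python
def fortran_f(value, width, decimals):
    """Simplified emulation of Fortran's Fw.d: fixed width, '*' on overflow."""
    text = f"{value:.{decimals}f}"
    if len(text) > width:
        return "*" * width           # field too narrow for the value
    return text.rjust(width)         # pad on the left with blanks

x = 10.0
for width in (4, 5, 8):
    print(f"F{width}.2 -> [{fortran_f(x, width, 2)}]")
```

The width counts every printed character, including the decimal point and any minus sign, so it is worth leaving a column or two of slack.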