Adjusting hour to account for time difference - sas

I have an edit check:
"If Period = 1, 2, 3 or 4 and Study Hour = 1 then the Time should be 1 hour plus or minus 15 minutes post-dose of study drug from the same period". These are to be programmed with a +/- 20-minute window of Study Hour 1.00 (relative to their dosing time). It is the protocol window, so even if the event was not scheduled at exactly 1 hour, we are looking for the deviation from the 1-hour mark, not from the time point of the event. Here is the merged data
This is my code. I'm getting a lot of flags here, so what am I doing wrong? For context, there is a prothour variable that is 1 but the actual hour time point is 0.77. Should I adjust the 0.77 somehow to account for this?
data medfst;
set dm.ex;
ptno=strip(compress(clientid,'-'))+0;
if ex_stdat=. or ex_sttim=. then delete;
medday= day;
rename hour=medhour;
proc sort;
by ptno period day medhour;
run;
data medfst;
set medfst;
by ptno period;
if first.period;
ex_datetime1=put(ex_stdat,date9.-r)||' '||put(ex_sttim,time8.-l);
ex_datetime=input(ex_datetime1,datetime20.);
keep scrid clientid ptno period ex_datetime ex_stdat ex_sttim medhour day;
format ex_datetime datetime20.;
proc sort;
by ptno period day medhour;
run;
data vs;
set dm.vs;
ptno=strip(compress(clientid,'-'))+0;
if VS_TEST in ('SYSTOLIC');
if prothour in ('1');
proc sort nodupkey;
by ptno period day hour;
run;
data vs1;
set vs;
vs_datetime1=put(vs_dat,date9.-r)||' '||put(vs_tim,time8.-l);
vs_datetime=input(vs_datetime1,datetime20.);
keep scrid clientid day hour ptno period vs_dat vs_tim vs_datetime vs_com;
format vs_datetime datetime20.;
proc sort;
by ptno period day;
run;
data temp;
merge medfst (in=a) vs1;
by ptno period;
if a;
run;
data final_temp;
set temp;
newhour=hour-medhour;
datediff=vs_dat-ex_stdat;
timediff=vs_tim-ex_sttim;
diff=datediff*24*3600+timediff;
newdiff=round(diff-newhour*(60*60));
format diff time8. newdiff time8. timediff time8.;
run;
data final;
set final_temp;
%inc_subjs;
***** *****;
*********************************************************************************************************;
attrib extra reason length=$5000.;
*********************************************************************************************************;
* Edit check code and footnote *;
***** *****;
if abs(diff) lt '00:45:00't or abs(diff) gt '01:15:00't then do;
reason=trim(reason)||'If Period = 1,2,3 or 4 and Study Hour = 1 then the Time should be 1 hour plus or minus 15 minutes post dose of study drug from the same period#';
extra = trim(extra)||', Hour based on Dose = '||trim(left(medhour))||', Vital Signs hour = '||trim(left(prothour))||', Time deviated = '||trim(put(diff,time8.))||', comment = '||trim(left(vs_com));
end;

You can round to the nearest multiple of a unit using the second argument of the ROUND function.
ROUND(argument <, rounding-unit>)
Required Argument
argument
is a numeric constant, variable, or expression to be rounded.
Optional Argument
rounding-unit
is a positive, numeric constant, variable, or expression that specifies the rounding unit.
Round a time value to the nearest hour (time is seconds, hour is 3600 seconds)
closest_hour = ROUND(mytime, 3600);
Round hour (number) to nearest hour (time value)
closest_hour = ROUND(myhour*3600, 3600);
and of course, round hour (number) to nearest whole hour (number)
closest_hr = ROUND(myhour); * default rounding unit is 1;
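As a rough sketch of how this could apply to the edit check in the question: round the actual hour to the nominal hour, then measure the deviation from that nominal hour in minutes. The dataset and variable names (window_check, actual_hour, nominal_hour, deviation_min, out_of_window) are made up for illustration, and the 15-minute limit is taken from the edit check text; adjust it if the 20-minute window is the one that applies.

```sas
data window_check;
    input actual_hour;
    /* Snap the actual hour to the nearest whole (nominal) hour: 0.77 -> 1 */
    nominal_hour = round(actual_hour);
    /* Deviation from the nominal hour, in minutes */
    deviation_min = (actual_hour - nominal_hour) * 60;
    /* 1 when outside the assumed +/- 15 minute window, 0 otherwise */
    out_of_window = (abs(deviation_min) > 15);
    datalines;
0.77
1.00
1.30
;
run;
```

With these inputs, 0.77 is about 14 minutes early and stays inside the window, while 1.30 is about 18 minutes late and is flagged.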

Related

Calculate the top 5 and summarize them by store

Let's say I have stores all around the world and I want to know the top losses for each store. What is the code for that?
Here is my try:
proc sort data= store out=sorted_store;
by store descending amount;
run;
and
data calc1;
do _n_=1 by 1 until(last.store);
set sorted_store;
by store;
if _n_ <= 5 then "Sum_5Largest_Losses"n=sum(amount);
end;
run;
but this just prints out the 5th amount and not 1 to 5! I really don't know how to select the top 5 of EACH store. I think a kind of group by would be a perfect fit. But first things first: how do I select i = 1...5, and not just i = 5?
There is also a way of doing it with proc sql:
data have;
input store$ amount;
datalines;
A 100
A 200
A 300
A 400
A 500
A 600
A 700
B 1000
B 1100
C 1200
C 1300
C 1400
D 600
D 700
E 1000
E 1100
F 1200
;
run;
proc sql outobs=4; /* limit to first 4 results */
select store, sum(amount) as TOTAL_AMT
from have
group by 1
order by 2 desc; /* order them to the TOP selection*/
quit;
The data step sum(,) function adds up its arguments. If you only give it one argument then there is nothing to actually sum so it just returns the input value.
data calc1;
do _n_=1 by 1 until(last.store);
set sorted_store;
by store;
if _n_ <= 5 then Sum_5Largest_Losses=sum(Sum_5Largest_Losses,amount);
end;
run;
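To make the difference concrete, here is a small sketch comparing sum() with one argument, sum() with two arguments, and the + operator when the running total starts out missing. The variable names are made up for illustration.

```sas
data _null_;
    total  = .;
    amount = 100;
    one_arg = sum(amount);          /* just returns amount */
    two_arg = sum(total, amount);   /* missing total is ignored */
    plus    = total + amount;       /* + propagates the missing value */
    put one_arg= two_arg= plus=;
run;
```

The log should show one_arg and two_arg equal to 100 and plus missing, which is why sum(total, amount) is the usual choice for accumulating.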
I would highly recommend learning the basic methods before getting into DOW loops.
Add a counter so you can find the first 5 of each store
As the data step loops the sum accumulates (the sum variable has to be retained for this)
Output the sum at the 5th record, or at the last record if a store has fewer than 5
proc sort data= store out=sorted_store;
by store descending amount;
run;
data calc1;
set sorted_store;
by store;
*total_sum must be retained or it resets to missing on every row;
retain total_sum;
*if first store then set counter to 1 and total sum to 0;
if first.store then do ;
counter=1;
total_sum=0;
end;
*otherwise increment the counter;
else counter+1;
*accumulate the sum if counter <= 5;
if counter <=5 then total_sum = sum(total_sum, amount);
*output on the 5th record or on the last record when a store has fewer than 5;
if counter=5 or (last.store and counter<5) then output;
run;

SAS code (Change from Baseline time Point)

In a clinical trial, systolic and diastolic blood pressure are measured pre-dose (0 hr) and at 1, 2, 4 and 8 hours post-dose.
Twelve subjects were studied. The SAS dataset has the following structure:
Variable: Vol, Length: 8, Label: Subject Number
Variable: Ntime, Length: 8, Label: Nominal time post-dose (hours)
Variable: Sups, Length: 8, Label: Supine Systolic BP (mmHg)
What SAS code could I use to calculate the change from baseline (0 h) at each time point, and then calculate the mean, minimum and maximum change from baseline for the 12 subjects? Edit: This is what I've tried so far
data postbase;
do until (last.vol);
*** Only keep pre-dose values;
set save.vitals (where=(not(ntime <= 0 )));
by Vol Ntime;
if Ntime <= 0 then bl = Sups;
else do;
chgbl = Sups - bl;
output;
end;
end;
run;
data postbase;
set save.vitals;
by subject time volume;
retain baseline;
if time=0 then baseline=volume;
else change = volume - baseline;
run;
I think your code is far too complex, and I couldn't parse your variable names, so I just made them up.
I set baseline to volume whenever time = 0 and then compute the change at every other time.
RETAIN causes the value to stay until it's reset. If some subjects have no time 0 record or a missing baseline, then you may need to modify the code.
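The same idea can be sketched with the variable names from the question (Vol, Ntime, Sups in save.vitals), followed by PROC MEANS for the summary statistics that were asked for. The output names vitals_sorted, chg, bl and chgbl are only illustrative, and the sketch assumes one pre-dose record per subject at Ntime = 0.

```sas
proc sort data=save.vitals out=vitals_sorted;
    by Vol Ntime;
run;

data chg;
    set vitals_sorted;
    by Vol Ntime;
    retain bl;
    /* Do not carry a baseline over from the previous subject */
    if first.Vol then call missing(bl);
    if Ntime = 0 then bl = Sups;
    else if not missing(bl) then chgbl = Sups - bl;
run;

/* Mean, minimum and maximum change from baseline at each post-dose time */
proc means data=chg n mean min max;
    where Ntime > 0;
    class Ntime;
    var chgbl;
run;
```

The N column is a quick check that all 12 subjects contributed at each time point.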

Monthly and Annual Compound Interest in SAS using DO Loop

I am studying SAS on my own. I have no one to refer to so I just wanted to check if my code is correct.
In a fixed term deposit of 25 years, calculate the total amount at the end of
the term with an initial amount of $5,00,000 and an annual interest rate of 7%:
1) Compounded annually
2) Compounded monthly. Show the amount at monthly level
My Code:
data deposit;
amount = 500000;
rate = 0.07;
do year = 1 to 25;
amount + earned;
earned + (amount*0.07);
principal = amount + earned;
output;
end;
run;
For the second question compounded monthly
data deposit1;
rate = 0.006;
amount1 = 500000;
do year = 1 to 25;
do month = 1 to 12;
earned1 + (earned1 + amount1)*0.006;
amount1 + earned1;
output;
end;
end;
run;
Pasting the Screenshots of Solution 1 and Solution 2
I am confused because when I compound annually and monthly both have different results at the end of a particular year.
Please suggest if anything is wrong in my code. Thank you for your time and attention.
It looks like you are double-counting your earned1 variable in the monthly compounding code.
earned1 + (earned1 + amount1)*0.006;
amount1 + earned1;
Should be:
earned1 = amount1*(0.07/12);
amount1 + earned1;
Note also that you will not want to round the interest rate: use 0.07/12 rather than 0.006.
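For comparison, here is a minimal sketch that computes both balances side by side. It assumes the monthly rate is the nominal annual rate divided by 12; the dataset and variable names (deposit_check, principal, annual_rate, amount_annual, amount_monthly) are made up for illustration.

```sas
data deposit_check;
    principal   = 500000;
    annual_rate = 0.07;
    amount_annual  = principal;
    amount_monthly = principal;
    do year = 1 to 25;
        /* Annual compounding: interest is added once per year */
        amount_annual = amount_annual * (1 + annual_rate);
        /* Monthly compounding: interest is added twelve times per year */
        do month = 1 to 12;
            amount_monthly = amount_monthly * (1 + annual_rate / 12);
        end;
        output;   /* one row per year, balances at year end */
    end;
    format amount_annual amount_monthly comma16.2;
run;
```

Under that assumption the two balances are expected to differ: compounding 0.07/12 twelve times gives an effective annual rate of about 7.23%, so the monthly balance pulls ahead of the annual one every year. Move the output statement inside the month loop to get the amount at monthly level.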

Assign missing variables values based on distribution SAS

I would like to assign IDs with blank Sizes a size based on the frequency distribution of their Group.
Dataset A contains a snapshot of my data:
ID Group Size
1 A Large
2 B Small
3 C Small
5 D Medium
6 C Large
7 B Medium
8 B -
Dataset B shows the frequency distribution of the Sizes among the Groups:
Group Small Medium Large
A 0.31 0.25 0.44
B 0.43 0.22 0.35
C 0.10 0.13 0.78
D 0.29 0.27 0.44
For ID 8, we know that it has a 43% probability of being "small", a 22% probability of being "medium" and a 35% probability of being "large". That's because these are the Size distributions for Group B.
How do I assign ID 8 (and other blank IDs) a Size based on the Group distributions in Dataset B? I'm using SAS 9.4. Macros, SQL, anything is welcome!
The table distribution is ideal for this. The last datastep here shows that; before that I set things up to create the data at random and determine the frequency table, so you can skip that if you already do that.
See Rick Wicklin's blog about simulating multinomial data for an example of this in other use cases (and more information about the function).
*Setting this up to help generate random data;
proc format;
value sizef
low - 1.3 = 'Small'
1.3 <-<2.3 = 'Medium'
2.3 - high = 'Large'
;
quit;
*Generating random data;
data have;
call streaminit(7);
do id = 1 to 1e5;
group = byte(65+rand('Uniform')*4); *A = 65, B = 66, etc.;
size = put((rank(group)-66)*0.5 + rand('Uniform')*3,sizef.); *Intentionally making size somewhat linked to group to allow for differences in the frequency;
if rand('Uniform') < 0.05 then call missing(size); *A separate call to set missingness;
output;
end;
run;
proc sort data=have;
by group;
run;
title "Initial frequency of size by group";
proc freq data=have;
by group;
tables size/list out=freq_size;
run;
title;
*Transpose to one row per group, needed for table distribution;
proc transpose data=freq_size out=table_size prefix=pct_;
var percent;
id size;
by group;
run;
data want;
merge have table_size;
by group;
array pcts pct_:; *convenience array;
if first.group then do _i = 1 to dim(pcts); *must divide by 100 but only once!;
pcts[_i] = pcts[_i]/100;
end;
if missing(size) then do;
size_new = rand('table',of pcts[*]); *table uses the pcts[] array to tell SAS the table of probabilities;
size = scan(vname(pcts[size_new]),2,'_');
end;
run;
title "Final frequency of size by group";
proc freq data=want;
by group;
tables size/list;
run;
title;
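To see the table distribution in isolation, here is a small sketch that draws sizes using the Group B probabilities from the question (0.43, 0.22, 0.35). The dataset name, seed and number of draws are arbitrary choices for illustration.

```sas
data demo_b;
    length size $6;
    call streaminit(2024);
    do draw = 1 to 1000;
        /* Returns 1, 2 or 3 with probabilities 0.43, 0.22 and 0.35 */
        idx  = rand('Table', 0.43, 0.22, 0.35);
        /* Map the index to a label: 1=Small, 2=Medium, 3=Large */
        size = scan('Small Medium Large', idx);
        output;
    end;
run;

proc freq data=demo_b;
    tables size;
run;
```

The frequencies should land close to 43%, 22% and 35%, which is a quick sanity check before applying the same call to the real data.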
You can also do this with a random value and some if-else logic:
proc sql;
create table temp_assigned as select
a.*, rand("Uniform") as random_roll, /*generate a random number from 0 to 1*/
case when missing(size) then
case when calculated random_roll < small then small
when calculated random_roll < sum(small, medium) then medium
when calculated random_roll < sum(small, medium, large) then large
end end as value_selected, /*pick the value of the size associated with that value in each group*/
coalesce(case when calculated value_selected = small then "Small"
when calculated value_selected = medium then "Medium"
when calculated value_selected = large then "Large" end, size) as group_assigned /*pick the value associated with that size*/
from temp as a
left join freqs as b
on a.group = b.group;
quit;
Obviously you can do this without creating the value_selected variable, but I thought showing it for demonstrative purposes would be helpful.

Determine rates of change for different groups

I have a SAS issue that I know is probably fairly straightforward for SAS users who are familiar with array programming, but I am new to this aspect.
My dataset looks like this:
Data have;
Input group $ size price;
Datalines;
A 24 5
A 28 10
A 30 14
A 32 16
B 26 10
B 28 12
B 32 13
C 10 100
C 11 130
C 12 140
;
Run;
What I want to do is determine the rate at which price changes for the first two items in the family and apply that rate to every other member in the family.
So, I’ll end up with something that looks like this (for A only…):
Data want;
Input group $ size price newprice;
Datalines;
A 24 5 5
A 28 10 10
A 30 14 12.5
A 32 16 15
;
Run;
The technique you'll need to learn is either retain or diff/lag. Both methods would work here.
The following illustrates one way to solve this, but would need additional work by you to deal with things like size not changing (meaning a 0 denominator) and other potential exceptions.
Basically, we use retain to cause a value to persist across records, and use that in the calculations.
data want;
set have;
by group;
retain lastprice rateprice lastsize;
if first.group then do;
counter=0;
call missing(of lastprice rateprice lastsize); *clear these out;
end;
counter+1; *Increment the counter;
if counter=2 then do;
rateprice=(price-lastprice)/(size-lastsize); *Calculate the rate over 2;
end;
if counter le 2 then newprice=price; *For the first two just move price into newprice;
else if counter>2 then newprice=lastprice+(size-lastsize)*rateprice; *Else set it to the change;
output;
lastprice=newprice; *save the price and size in the retained vars;
lastsize=size;
run;
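One possible way to handle the zero-denominator case mentioned above is sketched below. It only computes the rate when the first two sizes differ, and it falls back to the observed price when no rate is available. That fallback is an assumption made for illustration, not part of the original answer, and want_guarded is a made-up name.

```sas
data want_guarded;
    set have;
    by group;
    retain lastprice rateprice lastsize;
    if first.group then do;
        counter = 0;
        call missing(of lastprice rateprice lastsize);
    end;
    counter + 1;
    /* Only compute the rate when the first two sizes differ */
    if counter = 2 and size ne lastsize then
        rateprice = (price - lastprice) / (size - lastsize);
    /* Keep the observed price for the first two rows or when no rate exists */
    if counter le 2 or missing(rateprice) then newprice = price;
    else newprice = lastprice + (size - lastsize) * rateprice;
    output;
    lastprice = newprice;
    lastsize  = size;
run;
```

For group A in the sample data this gives the same newprice values as the desired output (5, 10, 12.5, 15).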
Here is a different approach that is obviously longer than Joe's, but could be generalized to other similar situations where the calculation is different or depends on more values.
Add a sequence number to your data set:
data have2;
set have;
by group;
if first.group then seq = 0;
seq + 1;
run;
Use proc reg to calculate the intercept and slope for the first two rows of each group, outputting the estimates with outest:
proc reg data=have2 outest=est;
by group;
model price = size;
where seq le 2;
run;
Join the original table to the parameter estimates and calculate the predicted values:
proc sql;
create table want as
select
h.*,
e.intercept + h.size * e.size as newprice
from
have h
left join est e
on h.group = e.group
order by
group,
size
;
quit;