Here is my code
%macro redemptions1(startdate, enddate, sd, ed, sunday1, sunday2);
data _null_;
%put &startdate;
run;
%mend redemptions1;
data _null_;
format tday date9.;
format sd date9.;
format ed date9.;
tday=today();
if weekday(tday) = 1 then do; ed = intnx('day',tday,-9); sd = intnx('day',tday,-15);end;
if weekday(tday) = 2 then do; ed = intnx('day',tday,-3); sd = intnx('day',tday,-9);end;
if weekday(tday) = 3 then do; ed = intnx('day',tday,-4); sd = intnx('day',tday,-10);end;
if weekday(tday) = 4 then do; ed = intnx('day',tday,-5); sd = intnx('day',tday,-11);end;
if weekday(tday) = 5 then do; ed = intnx('day',tday,-6); sd = intnx('day',tday,-12);end;
if weekday(tday) = 6 then do; ed = intnx('day',tday,-7); sd = intnx('day',tday,-13);end;
if weekday(tday) = 7 then do; ed = intnx('day',tday,-8); sd = intnx('day',tday,-14);end;
startdate = (year(sd) - 1900) * 10000 + month(sd) * 100 + day(sd);
enddate = (year(ed) - 1900) * 10000 + month(ed) * 100 + day(ed);
sunday1 = year(intnx('day',sd,-6))*10000+month(intnx('day',sd,-6))*100+day(intnx('day',sd,-6));
sunday2 = year(intnx('day',sd,1))*10000+month(intnx('day',sd,1))*100+day(intnx('day',sd,1));
%redemptions1(startdate,enddate,sd,ed,sunday1,sunday2);
run;
If I pass values through the variables startdate, enddate, etc., the redemptions1 macro just prints 'startdate' instead of the value of startdate. How do I get it to print the value contained in the variable(s)?
Thanks!
You need to construct a call to the macro as a text string and then tell SAS to execute it using either CALL EXECUTE or the DOSUBL function.
The parameters on a macro call need to be literal text giving the values you want. Your call in the data step starts %redemptions1(startdate,... so the first parameter is the literal text startdate and that's what the macro prints. Instead, you could do something like:
myCall = '%redemptions1(' || startdate || ')';
call execute(myCall);
This constructs the necessary call, something like %redemptions1(1170309), and then executes it. You could of course do this in one line:
call execute('%redemptions1(' || startdate || ')');
You'll need to fill in the values of the other parameters, of course.
Your date calculations look a bit sketchy, by the way: startdate and enddate may not contain the values you think they do. Please look up the dhms function to see if that might help. You're creating a number like 1170309 for today's date (1 million, 170 thousand, 3 hundred and 9). The year value is very odd: do you really want 117 (2017 - 1900)? If you ask SAS to handle that value as a date, it will treat it as a number of days since 01JAN1960, which would be some date way in the future.
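To fill in all six parameters, one option is to build the comma-separated argument list with CATX and wrap the macro name in %NRSTR so the macro runs after the data step finishes rather than while CALL EXECUTE is being processed. This is only a sketch: the hard-coded values below are hypothetical stand-ins for the results of your own date calculations, and the macro body is a simplified version of yours.

```sas
%macro redemptions1(startdate, enddate, sd, ed, sunday1, sunday2);
  %put startdate=&startdate enddate=&enddate sd=&sd ed=&ed;
  %put sunday1=&sunday1 sunday2=&sunday2;
%mend redemptions1;

data _null_;
  /* hypothetical values standing in for the original calculations */
  sd        = '03MAR2017'd;
  ed        = '09MAR2017'd;
  startdate = 1170303;
  enddate   = 1170309;
  sunday1   = 20170225;
  sunday2   = 20170304;

  /* CATX strips blanks and joins the values with commas;      */
  /* the single quotes keep the macro call from resolving here */
  call execute(cats('%nrstr(%redemptions1)(',
                    catx(',', startdate, enddate,
                         put(sd, date9.), put(ed, date9.),
                         sunday1, sunday2),
                    ')'));
run;
```

Passing sd and ed through PUT with DATE9. is a choice made for this sketch; without it the macro would receive the raw day counts.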
I have a column for dollar-amount that I need to break apart into $1000 segments - so $0-$999, $1,000-$1,999, etc.
I could use Case/When, but there are an awful lot of groups I would have to make.
Is there a more efficient way to do this?
Thanks!
You could just use arithmetic. For example, you could convert each amount to the upper limit of its $1,000 range.
up_to = 1000*ceil(dollar/1000);
Let's make up some example data:
data test;
do dollar=0 to 5000 by 500 ;
up_to = 1000*ceil(dollar/1000);
output;
end;
run;
Results:
Obs    dollar    up_to
  1         0        0
  2       500     1000
  3      1000     1000
  4      1500     2000
  5      2000     2000
  6      2500     3000
  7      3000     3000
  8      3500     4000
  9      4000     4000
 10      4500     5000
 11      5000     5000
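Note that with CEIL an amount of exactly $1,000 lands in the same group as $500. If the segments should run $0-$999, $1,000-$1,999 and so on, as in the question, a FLOOR-based variant may be a closer fit. A minimal sketch; the names test_floor, lower and upper are made up for illustration:

```sas
data test_floor;
  do dollar = 0 to 5000 by 500;
    /* lower bound of the $1,000 segment the amount falls in */
    lower = 1000*floor(dollar/1000);
    /* upper bound, assuming whole-dollar amounts */
    upper = lower + 999;
    output;
  end;
run;
```

Here 1000 maps to the 1000-1999 segment and 999 stays in 0-999.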
Absolutely. This is a great use case for user-defined formats.
proc format;
value segment
0-<1000 = '0-1000'
1000-<2000 = '1000s'
2000-<3000 = '2000s'
;
quit;
If the number is too high to write out, do it with code!
data segments;
retain
fmtname 'segment'
type 'n' /* numeric format */
eexcl 'Y' /* exclude the "end" match, so 0-1000 excluding 1000 itself */
;
do start = 0 to 1e6 by 1000;
end = start + 1000;
label = catx('- <',start,end); * what you want this to show up as;
output;
end;
run;
proc format cntlin=segments;
quit;
Then you can use segment = put(dollaramt,segment.); to assign the value of segment, or just apply the format format dollaramt segment.; if you're just using it in PROC SUMMARY or somesuch.
And you can combine the two approaches above to generate a User Defined Format that will bin the amounts for you.
1) Create bins to set up a user defined format. One drawback of this method is that it requires you to know the range of the data ahead of time.
2) Use a user defined function via PROC FCMP.
3) Use a manual calculation.
I illustrate versions of solutions 1 and 3 below. #2 requires PROC FCMP, but I think a plain data step can be simpler.
data thousands_format;
fmtname = 'thousands_fmt';
type = 'N';
do Start = 0 to 10000 by 1000;
END = Start + 1000 - 1;
label = catx(" - ", put(start, dollar12.0), put(end, dollar12.0));
output;
end;
run;
proc format cntlin=thousands_format;
run;
data demo;
do i=100 to 10000 by 50;
custom_format = put(i, thousands_fmt.);
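/* upper bound is lower bound + 999; using ceil() here would give e.g. "$1,000 - $999" for exact multiples of 1000 */
manual_format = catx(" - ", put(floor(i/1000)*1000, dollar12.0), put(floor(i/1000)*1000+999, dollar12.0));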
output;
end;
run;
In a SAS DATA step, you get the value of a column by using its name directly, for example:
name = col1;
But for some reason, I want to get the value of a column whose name is given by a string, for example:
name = get_value_of_column(cats("col", i))
Is this possible? And if so, how?
The DATA step functions VVALUE and VVALUEX return the formatted value of a variable.
VVALUE(<variable-name>): static; the variable is fixed when the step is compiled
VVALUEX(<expression>): dynamic; the expression resolves to a variable name at run time
The actual (unformatted) value of the variable can be obtained dynamically by scanning an array of all variables of its type (_numeric_ or _character_).
Array Scan
data have;
input name $ x y z (s t u) ($) date: yymmdd10.;
format s t u $upcase. date yymmdd10.;
datalines;
x 1 2 3 a b c 2020-10-01
y 2 3 4 b c d 2020-10-02
z 3 4 5 c d e 2020-10-03
s 4 5 6 hi ho silver 2020-10-04
t 5 6 7 aa bb cc 2020-10-05
u 6 7 8 -- ** !! 2020-10-06
date 7 8 9 ppp qqq rrr 2020-10-07
;
data want;
set have;
length u_vvalue name_vvaluex $20.;
u_vvalue = vvalue(u);
name_vvaluex = vvaluex(name);
array nums _numeric_;
array chars _character_;
/* NOTE:
* variable based arrays cause automatic variable _i_ to be in the PDV
* and _i_ will be automatically dropped from output data sets
*/
do _i_ = 1 to dim(nums);
if upcase(name) = upcase(vname(nums(_i_))) then do;
name_numeric_raw = nums(_i_);
leave;
end;
end;
do _i_ = 1 to dim(chars);
if upcase(name) = upcase(vname(chars(_i_))) then do;
name_character_raw = chars(_i_);
leave;
end;
end;
run;
If you perform an 'excessive' amount of dynamic value lookup in your DATA Step a transposition could possibly lead to simpler processing.
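For the col1, col2, ... case in the question, a minimal sketch with VVALUEX might look like the following. The data set demo and the variables col1-col3, name_char and name_num are made up for illustration. VVALUEX returns a character string (the formatted value), so the sketch converts it back with INPUT, which only recovers what the format displayed.

```sas
data demo;
  col1 = 10; col2 = 20; col3 = 30;
  length name_char $32;
  do i = 1 to 3;
    /* formatted value of the variable whose name is built at run time */
    name_char = strip(vvaluex(cats('col', i)));
    /* convert back to numeric; assumes the formatted text is a plain number */
    name_num = input(name_char, 32.);
    output;
  end;
run;
```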
I'd like to calculate the number of patients currently within an Emergency Room by hour, and I'm having trouble conceptualizing efficient code.
I have two time variables, 'Check In Time' and 'Release Time'. These date/time values are obviously arbitrary, and the 'Release Time' variable will always come after the 'Check In Time' variable.
I would like the output for a given day to look something like this:
Hour Midnight 1am 2am 3am 4am.....
# of Pts 34 56 89 23 29
So for example, at 1am there were 56 patients currently in the ED, when considering both check-in and release times.
My initial thought is to:
1) Round the time variables
2) Write code that looks something like this...
data EDTimesl;
set EDDATA;
if checkin = '1am' and release = '2am' then OneAMToTwoAM = 1;
if checkin = '1am' and release = '3am' then OneAMToTwoAM = 1;
if checkin = '1am' and release = '3am' then TwoAMToThreeAM = 1;
....
run;
This, however, gives me pause because I feel there is a more efficient method!
Thanks in advance!
I found some code online that might answer the question! Please see below:
data have (keep=admitdate disdate);
/* generate some admission and discharge date time variables*/
year=2015; /* for example all of the admits are in 2015*/
format admitdate disdate datetime20.;
do day= 1 to 20;
do month=1 to 12;
hour = floor(24*ranuni(4445));
min = floor(50*ranuni(1234));
date = mdy(month,day,2015);
admitdate=dhms(date,hour,min,0);
/* random duration of stay*/
duration = 60 + floor(3000*ranuni(7777));
disdate = intnx('minute',admitdate,duration);
output;
end;
end;
run;
data occupancy;
set have;
format admitdate disdate datetime20.;
Do Occupanthour = (dhms(datepart(admitdate),hour(admitdate),0,0)) to
dhms(datepart(disdate),hour(disdate),0,0) by 3600;
HourOfDay = hour(OccupantHour);
DayOfWeek = Weekday(datepart(OccupantHour));
output;
end;
format OccupantHour datetime20.;
run;
Proc freq data=occupancy;
Tables HourOfDay;
run;
proc tabulate data=occupancy;
class DayOfWeek;
class HourOfDay;
tables HourOfDay,
(DayOfWeek All)*n;
run;
I have 5 columns:
date
stock (a, b, c, d, ...)
qty_in (a fixed number, e.g. 10 units came in for the stock on 1/1/2015)
qty_out (quantity that went out or got sold)
final_qty (qty_in - qty_out)
There are over 100 stocks, with transactions spanning more than 6 months. For each stock on each day, qty_in should be displayed as that day's qty_in plus the previous day's final_qty for the same stock. For example, if qty_in on 2/1/2015 is 10, it should display 10 plus the final_qty on 1/1/2015. How can I achieve this with SAS?
Run this in SAS:
data testfile;
/* list input; TRUNCOVER leaves qty_out and final_qty missing on short lines */
infile datalines truncover;
input date :$10. stock $ qty_in qty_out final_qty;
/* wanted on 2/1/2015 for stock a: qty_in on 2/1/2015 + final_qty on 1/1/2015, i.e. 10+10=20 */
/* wanted on 2/1/2015 for stock b: 20+16=36 */
datalines;
1/1/2015 a 10 0 10
1/1/2015 b 20 4 16
1/1/2015 c 32 23 9
2/1/2015 a 10
2/1/2015 b 20
2/1/2015 c 32
;
If you want to do this in a data step, you first need to sort the data set by stock and by date. Also, start with just 4 columns and compute the final column in the data step:
data stockout5;
set stockin4;
retain FIN_QTY;
by stock date;
if (first.stock) then FIN_QTY = INQTY - OUTQTY;
else FIN_QTY = FIN_QTY + INQTY - OUTQTY;
run;
Let me know if this works for you. If you supply some test data showing what you are starting with and what you want to end up with, it would help. Your question is fine, but it's not very clear unless you've worked with financial data before (imo).
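As a self-contained sketch of the same running total, here is a version with a few rows of made-up data in which some OUTQTY values are missing. It assumes missing quantities should count as 0 and uses a sum statement, which retains FIN_QTY automatically. The data set and variable names follow the step above; the rows themselves are hypothetical.

```sas
data stockin4;
  infile datalines truncover;
  input date :date9. stock $ INQTY OUTQTY;
  format date date9.;
  datalines;
01JAN2015 a 10 0
01JAN2015 b 20 4
02JAN2015 a 10
02JAN2015 b 20
;

proc sort data=stockin4;
  by stock date;
run;

data stockout5;
  set stockin4;
  by stock date;
  /* restart the running total for each stock */
  if first.stock then FIN_QTY = 0;
  /* sum statement: adds to the retained total; COALESCE treats missing as 0 */
  FIN_QTY + coalesce(INQTY, 0) - coalesce(OUTQTY, 0);
run;
```

With these rows, stock a ends at 20 and stock b at 36 on 02JAN2015.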
From start to finish, this should do what you're looking for. It's pretty straightforward; let me know if you don't understand something. Note that 0 is added in for missing out values.
Data stock4;
format date date9.;
date = '1jan2015'd;
stock = "a";
in = 10;
out = 0 ;
output;
date = "1jan2015"d;
stock = "b";
in = 20;
out = 4;
output;
date = "1jan2015"d;
stock ="c";
in =32;
out=23;
output;
date="2jan2015"d;
stock = "a";
in = 10;
out=0;
output ;
date="2jan2015"d;
stock ="b";
in = 20;
out=0;
output;
date ="2jan2015"d;
stock = "c";
in=32;
out=0;
output;
run;
proc sort data=stock4;
by stock date;
run;
data stock5;
set stock4;
retain FIN_QTY;
by stock date;
if (first.stock) then FIN_QTY = IN - OUT;
else FIN_QTY = FIN_QTY + IN - OUT;
run;
I have a dataset with from and to dates of registration for a group of users. I would like to programmatically find which months lie in between those dates for each user, without having to hard code in any months, etc. I only want a summary of numbers registered in each month, so if that makes it quicker, so much the better.
E.g. I have something like
User-+-From-------+-To-----------------
A + 11JAN2011 + 15MAR2011
A + 16JUN2011 + 17AUG2011
B + 10FEB2011 + 12FEB2011
C + 01AUG2011 + 05AUG2011
And I want something like
Month---+-Registrations
JAN2011 + 1 (A)
FEB2011 + 2 (AB)
MAR2011 + 1 (A)
APR2011 + 0
MAY2011 + 0
JUN2011 + 1 (A)
JUL2011 + 1 (A)
AUG2011 + 2 (AC)
Note I don't need the bit in brackets; that was just to try and clarify my point.
Thanks for any help.
One easy way is to construct an intermediate dataset and then PROC FREQ.
data have;
informat from to DATE9.;
format from to DATE9.;
input user $ from to;
datalines;
A 11JAN2011 15MAR2011
A 16JUN2011 17AUG2011
B 10FEB2011 12FEB2011
C 01AUG2011 05AUG2011
;;;;
run;
data int;
set have;
_mths=intck('month',from,to,'d'); *number of months after the current one (0=current one). 'd'=discrete=count 1st of month as new month;
do _i = 0 to _mths; *start with current month, iterate over months;
month = intnx('month',from,_i,'b');
output;
end;
format month MONYY7.;
run;
proc freq data=int;
tables month/out=want(keep=month count rename=count=registrations);
run;
You can eliminate the _mths step by doing that in the do loop.
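For instance, a sketch of that variant, with INTCK moved into the DO statement and the loop counter dropped from the output (it reads the same have data set as above):

```sas
data int;
  set have;
  /* the upper bound is evaluated once, when the loop starts */
  do _i = 0 to intck('month', from, to, 'd');
    month = intnx('month', from, _i, 'b');
    output;
  end;
  format month MONYY7.;
  drop _i;
run;
```

Months with no registrations (APR2011 and MAY2011 in the example) produce no rows in int, so they will not appear in the PROC FREQ output unless they are added separately.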