Converting SAS custom user format to macro program - sas

I have the following code that works properly in my SAS program to subset needed dates to shift hourly data but I need to convert to a macro so that I can call it for multiple data sets. I have very little experience in macro programming so any help would be appreciated.
%let yr_beg=2007;
%let yr_end=2020;
data DST_FMT(drop=year);
attrib hlo length=$1
start dst_start end format=mmddyy10.;
fmtname="dst";
type="N";
do year="&yr_beg" to "&yr_end";
start= nwkdom(2, 1, 3, year)+1;
end= nwkdom(1, 1, 11, year);
dst_start= start - 1;
label='*';
output;
end;
start=.;end=.;
hlo="O";
label='';
output;
run;
proc format cntlin=DST_FMT; run;

Here's how you can convert your current program to a macro for calling.
This is not needed though, usually a %INCLUDE would be more appropriate for something like this since it's usually only done once.
You also do not need the quotes around the macro variable.
%macro generate_date_format(yr_beg= , yr_end);
data DST_FMT(drop=year);
attrib hlo length=$1
start dst_start end format=mmddyy10.;
fmtname="dst";
type="N";
do year=&yr_beg to &yr_end;
start= nwkdom(2, 1, 3, year)+1;
end= nwkdom(1, 1, 11, year);
dst_start= start - 1;
label='*';
output;
end;
start=.;end=.;
hlo="O";
label='';
output;
run;
proc format cntlin=DST_FMT; run;
%mend;
Execute macro:
%generate_date_format(yr_beg = 2008, yr_end = 2021);
Macro tutorial references:
https://stats.idre.ucla.edu/sas/seminars/sas-macros-introduction/
Sample macros from documentation:
https://communities.sas.com/t5/SAS-Communities-Library/SAS-9-4-Macro-Language-Reference-Has-a-New-Appendix/ta-p/291716
Macro documentation:
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/mcrolref/titlepage.htm

Related

Matching SAS character variables to a list

So I have a vector of search terms, and my main data set. My goal is to create an indicator for each observation in my main data set where variable1 includes at least one of the search terms. Both the search terms and variable1 are character variables.
Currently, I am trying to use a macro to iterate through the search terms, and for each search term, indicate if it is in the variable1. I do not care which search term triggered the match, I just care that there was a match (hence I only need 1 indicator variable at the end).
I am a novice when it comes to using SAS macros and loops, but have tried searching and piecing together code from some online sites, unfortunately, when I run it, it does nothing, not even give me an error.
I have put the code I am trying to run below.
*for example, I am just testing on one of the SASHELP data sets;
*I take the first five team names to create a search list;
data terms; set sashelp.baseball (obs=5);
search_term = substr(team,1,3);
keep search_term;;
run;
*I will be searching through the baseball data set;
data test; set sashelp.baseball;
run;
%macro search;
%local i name_list next_name;
proc SQL;
select distinct search_term into : name_list separated by ' ' from work.terms;
quit;
%let i=1;
%do %while (%scan(&name_list, &i) ne );
%let next_name = %scan(&name_list, &i);
*I think one of my issues is here. I try to loop through the list, and use the find command to find the next_name and if it is in the variable, then I should get a non-zero value returned;
data test; set test;
indicator = index(team,&next_name);
run;
%let i = %eval(&i + 1);
%end;
%mend;
Thanks
Here's the temporary array solution which is fully data driven.
Store the number of terms in a macro variable to assign the length of arrays
Load terms to search into a temporary array
Loop through for each word and search the terms
Exit loop if you find the term to help speed up the process
/*1*/
proc sql noprint;
select count(*) into :num_search_terms from terms;
quit;
%put &num_search_terms.;
data flagged;
*declare array;
array _search(&num_search_terms.) $ _temporary_;
/*2*/
*load array into memory;
if _n_ = 1 then do j=1 to &num_search_terms.;
set terms;
_search(j) = search_term;
end;
set test;
*set flag to 0 for initial start;
flag = 0;
/*3*/
*loop through and craete flag;
do i=1 to &num_search_terms. while(flag=0); /*4*/
if find(team, _search(i), 'it')>0 then flag=1;
end;
drop i j search_term ;
run;
Not sure I totally understand what you are trying to do but if you want to add a new binary variable that indicates if any of the substrings are found just use code like:
data want;
set have;
indicator = index(term,'string1') or index(term,'string2')
... or index(term,'string27') ;
run;
Not sure what a "vector" would be but if you had the list of terms in a dataset you could easily generate that code from the data. And then use %include to add it to your program.
filename code temp;
data _null_;
set term_list end=eof;
file code ;
if _n_ =1 then put 'indicator=' # ;
else put ' or ' #;
put 'index(term,' string :$quote. ')' #;
if eof then put ';' ;
run;
data want;
set have;
%include code / source2;
run;
If you did want to think about creating a macro to generate code like that then the parameters to the macro might be the two input dataset names, the two input variable names and the output variable name.

load values from datasets into arrays and use them in a datastep

I have 5 separate datasets(actually many more but i want to shorten the code) named dk33,dk34,dk35,dk51,dk63, each dataset contains a numeric field: surv_probs. I would like to load the values into 5 arrays and then use the arrays in a datastep(result), however, I need advice what is the best way to do it.
I am getting error when I use the macro: setarrays: (code below)
WARNING: The quoted string currently being processed has become more than 262 characters long. You might have unbalanced quotation
marks.
WARNING: The quoted string currently being processed has become more than 262 characters long. You might have unbalanced quotation
marks.
ERROR: Illegal reference to the array dk33_arr.
Here is the main code.
%let var1 = dk33;
%let var2 = dk34;
%let var3 = dk35;
%let var4 = dk51;
%let var5 = dk63;
%let varN = 5;
/*put length of each column into macro variables */
%macro getlength;
%do i=1 %to &varN;
proc sql noprint;
select count(surv_probs)
into : &&var&i.._rows
from work.&&var&i;
quit;
%end;
%mend;
/*load values of column:surv_probs into macro variables*/
%macro readin;
%do i=1 %to &varN;
proc sql noprint;
select surv_probs
into: &&var&i.._list separated by ","
from &&var&i;
quit;
%end;
%mend;
data _null_;
call execute('%readin');
call execute('%getlength');
run;
/* create arrays*/
%macro setarrays;
%do i=1 %to 1;
j=1;
array &&var&i.._arr{&&&&&&var&i.._rows};
do while(scan("&&&&&&var&i.._list",j,",") ne "");
&&var&i.._arr = scan("&&&&&&var&i.._list",j,",");
j=j+1;
end;
%end;
%mend;
data result;
%setarrays
put dk33_arr(1);
* some other statements where I use the arrays*
run;
Answer to toms question:
*macro getlength(when executed) creates 5 macro variables named: dk33_rows,dk34_rows,dk35_rows,dk51_rows,dk63_rows
*the macro readin(when executed):creates 5 macro variables dk33_list,dk34_list,dk35_list,dk51_list,dk63_list. Each containing a string which is comma separates the values from the column: eg.: 0.99994,0.1999,0.1111
*the macro setarrays creates 5 arrays,when executed, dk33_arr,dk34_arr,... holding the parsed values from the macro variables created by readin
I find that "macro arrays" like VAR1,VAR2,.... are generally more trouble than they are worth. Either keep your list of dataset names in an actual dataset and generate code from that. Or if the list is short enough put the list into a single macro variable and use %SCAN() to pull out the items as you need them.
But either way it is also better to avoid trying to write macro code that needs more than three &'s. Build up the reference in multiple steps. Build a macro variable that has the name of the macro you want to reference and then pull the value of that into another macro variable. It might take more lines of code, but you can more easily understand what is happening.
%let i=1 ;
%let mvarname=var&i;
%let dataset_name=&&&mvarname;
Before you begin using macro code (or other code generation techniques) make sure you know what code you are trying to generate. If you want to load a variable into a temporary array you can just use a DO loop. There is no need to macro code, or copying values, or even counts, into macro variables. For example instead of getting the count of the observations you could just make your temporary array larger than you expect to ever need.
data test1 ;
if _n_=1 then do;
do i=1 to nobs_dk33;
array dk33 (1000) _temporary_;
set dk33 nobs=nobs_dk33 ;
dk33(i)=surv_probs;
end;
do i=1 to nobs_dk34;
array dk34 (1000) _temporary_;
set dk34 nobs=nobs_dk34 ;
dk34(i)=surv_probs;
end;
end;
* What ever you are planning to do with the DK33 and DK34 arrays ;
run;
Or you could transpose the dataset first.
proc transpose data=dk33 out=dk33_t prefix=dk33_ ;
var surv_probs ;
run;
Then your later step is easier since you can just use a SET statement to read in the one observation that has all of the values.
data test;
if _n_=1 then do;
set dk33_t ;
array dk33 dk33_: ;
end;
....
run;

Infinite Loop while using Do while in SAS

I am very new but keen to learn SAS coding.I have 2 data sets a and b namely dt1 and dt2 which consist of columns a for dt1 and b and c for dt2:
a b c
2014 2008 2
2009 3
2014 4
2015 5
I am trying to get the nth row of the c column when the element which is at nth row of b column is equal to a(1)
Here it is c=4;
I wrote a code below.
DATA dt1;
set dt1;
data dt2;
set dt2;
i=1;
do while (b ne a);
i=i+1;
end;
call symput('ROW_NUMBER',i);
run;
proc print data = dt2(keep = c obs = &ROW_NUMBER firstobs = &ROW_NUMBER);
run;
but this code enters in an infinite loop and I could not find any solution for this. I appreciate if you help solve this issue.
Thanks
I think you should learn the basic syntax of the data step before trying to use macro variables. A lot of what you're doing makes little sense. Here is an explanation of how the data step works. You will do yourself a huge favor if you study that.
Here's how to do an inner join in proc sql, which seems to be more in line with your goal here. This simply selects the values of c where dt1.a is equal to dt2.b:
proc sql;
select c
from dt1 inner join dt2 on dt1.a = dt2.b;
quit;
If you were to use a data step, you'd do something like this the following.
data out(keep=c);
set dt1;
do until (a=b or eof);
set dt2 end=eof;
if a=b then output;
end;
run;
proc print data=out noobs;
run;
Use the end= option to create temporary variable eof which allows you to end the loop after the last row of dt2 is read.
This is a simple MERGE. You just need to rename the variables to match. This assumes they're both sorted by the value (a/b). You can then set the macro variable in that data step or do whatever you want.
data want;
merge dt1(in=_a rename=a=b) dt2(in=_b);
by b;
if _a and _b;
call symput("ROW_NUMBER",c);
run;
If you want to define macro variables:
data _null_;
set dt2;
if _n_=1 then set dt1;
if a=b then do;
call symput('c_val',c);
call symput('row_num',_n_);
end;
run;
%put &row_num &c_val;

Macro returning a value

I created the following macro. Proc power returns table pw_cout containing column Power. The data _null_ step assigns the value in column Power of pw_out to macro variable tpw. I want the macro to return the value of tpw, so that in the main program, I can call it in DATA step like:
data test;
set tmp;
pw_tmp=ttest_power(meanA=a, stdA=s1, nA=n1, meanB=a2, stdB=s2, nB=n2);
run;
Here is the code of the macro:
%macro ttest_power(meanA=, stdA=, nA=, meanB=, stdB=, nB=);
proc power;
twosamplemeans test=diff_satt
groupmeans = &meanA | &meanB
groupstddevs = &stdA | &stdB
groupns = (&nA &nB)
power = .;
ods output Output=pw_out;
run;
data _null_;
set pw_out;
call symput('tpw'=&power);
run;
&tpw
%mend ttest_power;
#itzy is correct in pointing out why your approach won't work. But there is a solution maintaing the spirit of your approach: you need to create a power-calculation function uisng PROC FCMP. In fact, AFAIK, to call a procedure from within a function in PROC FCMP, you need to wrap the call in a macro, so you are almost there.
Here is your macro - slightly modified (mostly to fix the symput statement):
%macro ttest_power;
proc power;
twosamplemeans test=diff_satt
groupmeans = &meanA | &meanB
groupstddevs = &stdA | &stdB
groupns = (&nA &nB)
power = .;
ods output Output=pw_out;
run;
data _null_;
set pw_out;
call symput('tpw', power);
run;
%mend ttest_power;
Now we create a function that will call it:
proc fcmp outlib=work.funcs.test;
function ttest_power_fun(meanA, stdA, nA, meanB, stdB, nB);
rc = run_macro('ttest_power', meanA, stdA, nA, meanB, stdB, nB, tpw);
if rc = 0 then return(tpw);
else return(.);
endsub;
run;
And finally, we can try using this function in a data step:
options cmplib=work.funcs;
data test;
input a s1 n1 a2 s2 n2;
pw_tmp=ttest_power_fun(a, s1, n1, a2, s2, n2);
cards;
0 1 10 0 1 10
0 1 10 1 1 10
;
run;
proc print data=test;
You can't do what you're trying to do this way. Macros in SAS are a little different than in a typical programming language: they aren't subroutines that you can call, but rather just code that generate other SAS code that gets executed. Since you can't run proc power inside of a data step, you can't run this macro from a data step either. (Just imagine copying all the code inside the macro into the data step -- it wouldn't work. That's what a macro in SAS does.)
One way to do what you want would be to read each observation from tmp one at a time, and then run proc power. I would do something like this:
/* First count the observations */
data _null_;
call symputx('nobs',obs);
stop;
set tmp nobs=obs;
run;
/* Now read them one at a time in a macro and call proc power */
%macro power;
%do j=1 %to &nobs;
data _null_;
nrec = &j;
set tmp point=nrec;
call symputx('meanA',meanA);
call symputx('stdA',stdA);
call symputx('nA',nA);
call symputx('meanB',meanB);
call symputx('stdB',stdB);
call symputx('nB',nB);
stop;
run;
proc power;
twosamplemeans test=diff_satt
groupmeans = &meanA | &meanB
groupstddevs = &stdA | &stdB
groupns = (&nA &nB)
power = .;
ods output Output=pw_out;
run;
proc append base=pw_out_all data=pw_out; run;
%end;
%mend;
%power;
By using proc append you can store the results of each round of output.
I haven't checked this code so it might have a bug, but this approach will work.
You can invoke a macro which calls procedures, etc. (like the example) from within a datastep using call execute(), but it can get a bit messy and difficult to debug.

How can I make a character variable equal to the formatted value of a numeric variable for arbitrary SAS formats?

If I have a numeric variable with a format, is there a way to get the formatted value as a character variable?
e.g. I would like to write something like the following to print 10/06/2009 to the screen but there is no putformatted() function.
data test;
format i ddmmyy10.;
i = "10JUN2009"d;
run;
data _null_;
set test;
i_formatted = putformatted(i); /* How should I write this? */
put i_formatted;
run;
(Obviously I can write put(i, ddmmyy10.), but my code needs to work for whatever format i happens to have.)
The VVALUE function formats the variable passed to it using the format associated with the variable. Here's the code using VVALUE:
data test;
format i ddmmyy10.;
i = "10JUN2009"d;
run;
data _null_;
set test;
i_formatted = vvalue(i);
put i_formatted;
run;
While cmjohns solution is slightly faster than this code, this code is simpler because there are no macros involved.
Use vformat() function.
/* test data */
data test;
i = "10jun2009"d;
format i ddmmyy10.;
run;
/* print out the value using the associated format */
data _null_;
set test;
i_formatted = putn(i, vformat(i));
put i_formatted=;
run;
/* on log
i_formatted=10/06/2099
*/
This seemed to work for a couple that I tried. I used VARFMT and a macro function to retrieve the format of the given variable.
data test;
format i ddmmyy10. b comma12.;
i = "10JUN2009"d;
b = 123405321;
run;
%macro varlabel(variable) ;
%let dsid=%sysfunc(open(&SYSLAST.)) ;
%let varnum=%sysfunc(varnum(&dsid,&variable)) ;
%let fmt=%sysfunc(varfmt(&dsid,&varnum));
%let dsid=%sysfunc(close(&dsid)) ;
&fmt
%mend varlabel;
data test2;
set test;
i_formatted = put(i, %varlabel(i) );
b_formatted = put(b, %varlabel(b) );
put i_formatted=;
put b_formatted=;
run;
This gave me:
i_formatted=10/06/2009
b_formatted=123,405,321
I can do this with macro code and sashelp.vcolumn but it's a bit fiddly.
proc sql noprint;
select trim(left(format)) into :format
from sashelp.vcolumn
where libname eq 'WORK' and memname eq 'TEST';
run;
data test2;
set test;
i_formatted = put(i, &format);
put i_formatted;
run;
Yes, there is a putformatted() function. In fact, there are two: putc() and putn(). Putc handles character formats, putn() numeric. Your code will need to look at the format name (all and only character formats start with "$") do determine which to use. Here is the syntax of putc (from the interactive help):
PUTC(source, format.<,w>)
Arguments
source
is the SAS expression to which you want to apply the format.
format.
is an expression that contains the character format you want to apply to source.
w
specifies a width to apply to the format.
Interaction: If you specify a width here, it overrides any width specification
in the format.