Dynamically create the "KEEP" List in SAS - sas

I am trying to create a keep list dynamically. Say:
%MACRO TEST(A=,B=,OUT_VAR=,KEEP_VAR=);
&OUT_VAR=MAX(&A,&B);
%IF &KEEP_VAR = 'Y' %THEN VAR_LIST=%SYSFUNC(CATS(VAR_LIST,&OUT_VAR));
%PUT VAR_LIST;
%MEND;
DATA ABC (keep = VAR_LIST);
LENGTH VAR_LIST $100.;
RETAIN VAR_LIST '';
%TEST(A=1,B=3,OUT_VAR=FIRS,KEEP_VAR='Y');
%TEST(A=2,B=4,OUT_VAR=SEC,KEEP_VAR='Y');
%TEST(A=3,B=5,OUT_VAR=THIR,KEEP_VAR='N');
RUN;
I have a data step in which I am creating several variables calculated by the macro code.
I want to create a dynamic list of these output variables and then use it in the keep statement.
The above code does not seem to work. Can someone suggest what I am missing here?

A running data step is source code that has already been compiled. You are trying to change the source while it is running -- that is not going to happen. This is a symptom of mixing scopes and contexts between macro code and generated code.
However, the TEST macro is generating source and thus can either:
1) maintain state information for generating a KEEP source code statement after the macros are invoked, or
2) generate an additional KEEP statement per TEST invocation.
In general, macro parameter values are already text and do not need to be quoted. If you do quote them, the quotes become part of the value.
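A minimal sketch of the quoting point, assuming a hypothetical helper macro named show_flag that is not part of the question's code. It only writes to the log, so it can be run on its own:

```sas
/* Hypothetical helper macro, for illustration only. */
%macro show_flag(flag=);
  %if &flag = Y %then %put NOTE: flag=&flag matches Y;
  %else %put NOTE: flag=&flag does not match Y;
%mend show_flag;

%show_flag(flag=Y)    /* unquoted value: the comparison is true */
%show_flag(flag='Y')  /* the quotes are part of the value: no match */
```

The second call fails the comparison because the macro processor compares the three characters 'Y' (quotes included) against the single character Y.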
%MACRO TEST(A=,B=,OUT_VAR=,KEEP_VAR=);
&OUT_VAR=MAX(&A,&B);/* generates data step source code*/
%IF &KEEP_VAR = 'Y' %THEN
%LET VARLIST=&VARLIST &OUT_VAR;
%MEND;
* must occur before sequence of TEST invocations;
%global VARLIST; * globals generally want to be avoided, next version avoids it;
%let VARLIST =;
DATA ABC;
%TEST(A=1,B=3,OUT_VAR=FIRS,KEEP_VAR='Y');
%TEST(A=2,B=4,OUT_VAR=SEC,KEEP_VAR='Y');
%TEST(A=3,B=5,OUT_VAR=THIR,KEEP_VAR='N');
KEEP &VARLIST;
RUN;
%symdel VARLIST;
I hope this is a learning exercise. The pattern above has a macro generate code that two ordinary non-macro data step statements would handle just as well. Too much macroification can be obfuscating.
Additionally, the output ABC might not be what you expect. There will be one row with FIRS=3 and SEC=4; THIR=5 is computed but not kept, and A and B are macro parameters rather than data set variables. So you also have some mixed-up row and column data concepts.
Here is a cleaner alternative in which TEST generates the KEEP statement itself:
%MACRO TEST(A=,B=,OUT_VAR=,KEEP_VAR=);
&OUT_VAR=MAX(&A,&B);/* generates data step source code*/
%IF %upcase(&KEEP_VAR) = Y %THEN %STR(KEEP &OUT_VAR;);
%MEND;
DATA ABC;
%TEST(A=1,B=3,OUT_VAR=FIRS,KEEP_VAR=Y);
%TEST(A=2,B=4,OUT_VAR=SEC,KEEP_VAR=Y);
%TEST(A=3,B=5,OUT_VAR=THIR,KEEP_VAR=N);
RUN;
The macro code still has to rely on the caller to specify a proper KEEP_VAR parameter Y or not Y.
An alternative is to push the Y/N KEEP concept into two parameters: which of the two parameters is used determines whether the variable is kept.
%MACRO TEST(A=,B=,KEEP_VAR=,TEMP_VAR=);
%IF %length (&KEEP_VAR) %then %do;
&KEEP_VAR
%end;
%else %do;
&TEMP_VAR
%end;
= MAX(&A,&B);
%IF %length (&KEEP_VAR) %then %do;
KEEP &KEEP_VAR;
%end;
%MEND;
DATA ABC;
%TEST(A=1,B=3,KEEP_VAR=FIRS);
%TEST(A=2,B=4,KEEP_VAR=SEC);
%TEST(A=3,B=5,TEMP_VAR=THIR);
RUN;
For me, a widely used macro should have clarity of invocation. The black box of the macro implementation can strive to be as clean and efficient as possible, but does not have to be.

It is not clear what you are trying to do, but you can use multiple KEEP statements if you want.
data want ;
x=1;
keep x;
y=2;
z=3;
keep z;
run;
Although for your application it looks like you might want to generate a DROP statement instead.
%MACRO TEST(A=,B=,OUT_VAR=,KEEP_VAR=Y);
&OUT_VAR=MAX(&A,&B);
%IF (&KEEP_VAR ^= Y) %THEN %DO;
drop &out_var ;
%end;
%MEND;
DATA ABC ;
%TEST(A=1,B=3,OUT_VAR=FIRST)
%TEST(A=2,B=4,OUT_VAR=SECOND)
%TEST(A=3,B=5,OUT_VAR=THIRD,KEEP_VAR=N)
RUN;

Related

SAS: How to create a macro to re run a program while changing variable values

I have a task to write a macro to "re-run your programs" using different values for a variable, reducing the mean of that variable in order to bring a second variable, which depends on the first, below a certain threshold. However, I'm new to SAS, and from what I've read on macros I didn't see anything that obviously fit this description.
This proved to be quite an interesting question. I haven't found any pre-built methods for this, but maybe something like this:
First I define the norm, i.e. the distance from what you want.
%macro do_calc(value);
%let res=%eval(&value.-1);
%mend;
At this point you should work out the iteration direction. Most likely you would just differentiate your function and nudge the values along the descending direction. Something like
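%macro update_vars;
/* No local var1/var2 exist here, so %let updates the copies owned by the caller (%iterate). */
/* %sysevalf is used because %eval only handles integers. */
%let var1=%sysevalf(&var1. + 0.1);
%let var2=%sysevalf(&var2. - 0.2);
%mend;
The actual iteration is quite easy.
%macro iterate;
%local res i var1 var2;
%let res=110;
%let i=1;
%let var1=0; /* placeholder starting values */
%let var2=0;
%do %while(&res. > 100 and &i. <= 1000);
%do_calc(&res.); /* pass var1 and var2 here once do_calc depends on them */
%update_vars;
%put &res.;
%let i=%eval(&i.+1); /* failsafe against an endless loop */
%end;
%mend;
%iterate;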

Using Toggle Statements in SAS

I am attempting to create a program that toggles certain sections of the code on or off based on user input. The code below should only run when the prog1 variable is set equal to Y. However, my log is showing that the code runs no matter what. Does anyone know what is going on?
Code:
%let prog1 = Y;
%let prog2 = N;
data _null_;
if "&prog1." = "Y" then do;
%findit(&file1.);
%findit(&file2);
end;
run;
data _null_;
if "prog2." = "Y" then do;
%findit(&file3.);
end;
run;
Log:
It is doing what you told it to do. The macro references and macro code will be evaluated first. Then any SAS code that the macro references generate will be processed by SAS. So you have written a DATA step that will conditionally skip over the SAS code that the macros generate. But the macros themselves will always run.
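A minimal sketch of that behaviour, assuming a hypothetical stand-in for findit that only writes a note to the log (the real findit is not shown in the question):

```sas
/* Hypothetical stand-in for the real findit macro. */
%macro findit(file);
  %put NOTE: findit ran for &file;
%mend findit;

data _null_;
  if 0 then do;        /* never true when the step executes */
    %findit(demo.txt)  /* still runs, while the step is being compiled */
  end;
run;
```

The note appears in the log even though the IF condition can never be true, because the macro call is resolved while the DATA step source is being read, before any data step logic executes.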
If you code the main program as a macro instead of open SAS code then you can add macro logic like %IF to conditionally generate the macro calls.
Or for this simple example you could use CALL EXECUTE() or other code generation methods to control the generation of the macro calls. That way SAS will never see the macro calls in the input stream if the condition is false.
data _null_;
if "&prog1." = "Y" then do;
call execute('%nrstr(%findit)(&file1.)');
call execute('%nrstr(%findit)(&file2.)');
end;
run;
You have some mistakes in your code (macro calls used inside a data step, and misuse of the & (ampersand) when referencing your macro variable).
1) Always use the & (ampersand) to reference a macro variable.
2) Add a period, as in &path., when it is necessary, i.e. when other text or another macro variable directly follows the &path macro variable.
3) Prefer to make your checks with macro statements such as %if and %then.
You should make your program more macro oriented, like this:
%let prog1 = Y;
%let prog2 = Y;
%macro check();
%if "&prog1." = "Y" %then %do;
%put execute 1;
%findit(&file1);
%findit(&file2);
%end;
%if "&prog2." = "Y" %then %do;
%put execute 2;
%findit(&file3);
%end;
%mend;
%check;
It will work now.
Your second if statement is checking "prog2." not "&prog2." and in your log the macro variable "&prog3." is the one getting resolved instead of &prog1. and &prog2.
Try adding this to your code which prints all the user macro variables to the log.
%put _user_;

SYSERR Automatic Macro Variable

I have several programs in one SAS project, i.e. program A -> program B -> .... I want to build dependencies between programs using macro variables.
Program A will process a few data steps and procs. If ANY step in program A executes with errors, I would like to run program C. Otherwise run program B.
This seems to be tricky since syserr resets at each step boundary. If the first data step in program A executes with an error and the rest don't, then at the end of program A, syserr is still 0. I need a macro variable whose value becomes something other than 0 once an error happens and keeps that value until the program ends.
If the program dependency is based on other criteria (say values), user-defined macro variables can handle that. For something related to system errors, I think SAS already has something that can do the trick. But I can't find anything else except syserr, which doesn't seem to help.
Note: I found this: SAS stop on first error. But basically it checks the error status after each data step. That sounds crazy if program A contains 50+ data steps.
Easy - just use syscc!
SYSCC is a read/write automatic macro variable that enables you to
reset the job condition code and to recover from conditions that
prevent subsequent steps from running.
See documentation, but I guess you will be looking for something like:
%if &syscc > 4 %then %do;
%inc "/mypath/pgmC.sas"; /* program A had an error: run program C */
%end;
%else %do;
%inc "/mypath/pgmB.sas"; /* no error: carry on with program B */
%end;
The highest value of syscc is retained across step boundaries, always as an integer representing the error level. The values for SYSCC are:
0 means no errors and no warnings
4 means warnings
greater than 4 means an error occurred
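A minimal sketch contrasting the two automatic variables. It reuses the deliberately failing step from elsewhere in this thread; the exact codes written to the log depend on the SAS version and on whether the session is interactive or batch, so none are shown here:

```sas
%let syscc = 0;                 /* start from a clean condition code */

data oops;
  set sashelp.doesnt_exist;     /* fails: the data set does not exist */
run;
%put NOTE: after oops syserr=&syserr syscc=&syscc;

data yay;
  set sashelp.class;
run;
%put NOTE: after yay syserr=&syserr syscc=&syscc;
```

In an interactive session I would expect syserr to drop back to 0 after the second step while syscc stays above 4. In batch mode SAS may switch to syntax-check mode after the first error, so the second step may not process any rows.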
Note that there are some things it won't catch, but to improve its effectiveness you can use:
options errorcheck=strict;
Finally - you mention 'sas project', if by this you mean you are using Enterprise Guide then please be aware of the advice in this usage note.
You could define a macro that kept track of the error status and run this after each step. The macro would look something like this:
%macro track_err;
%global err_status;
%let err_status = %sysfunc(max(&err_status, &syserr ne 0));
%mend;
Usage example below. First initialize the value to track the overall error status. The first datastep will fail, the second will run successfully and the final value of err_status will be 1.
%let err_status = 0;
data oops;
set sashelp.doesnt_exist;
run;
%track_err;
data yay;
set sashelp.class;
run;
%track_err;
%put &=err_status;
Final output:
ERR_STATUS=1
As far as 'sounding crazy to check the status after each step'... well SAS doesn't provide something for the exact requirement you have, so the only way to do it is literally to have something check after each step.
EDIT: Correction - looks like the syscc approach mentioned in RawFocus's answer actually shows there IS something that does this in SAS.
If you want the check to 'blend in' more with the code, then consider replacing your run and quit statements with a macro that performs the run/quit, then checks the status all in one. It will result in slightly cleaner code. Something like this:
%macro run_quit_track;
run;quit;
%global err_status;
%let err_status = %sysfunc(max(&err_status, &syserr ne 0));
%mend;
%let err_status = 0;
data oops;
set sashelp.doesnt_exist;
%run_quit_track;
data yay;
set sashelp.class;
%run_quit_track;
%put &=err_status;
Use one macro, call it runquitA. Call this macro at the end of each proc sql or data step in place of quit; and run;
Example:
/*Program A*/
%macro runquitA;
; run; quit;
%if &syserr. ne 0 %then %do;
/*Call Program C*/
%end;
%mend runquitA;
proc sql;
create table class1 as
select * from sashelp.class;
%runquitA;
data class2;
set sashelp.class;
%runquitA;
/*Call Program B*/
/*end of Program A*/

What is the importance of the % symbol as used inside a SAS macro

Please consider this sample SAS macro code:
%MACRO reports;
%IF &SYSDAY = Monday %THEN %DO;
%END;
%MEND reports;
Does every single word inside the macro need to be prefixed with a %? What exactly does the % sign mean?
% is a macro trigger, along with &. It identifies the next symbol(s) as part of a macro language element. This might be a macro call (%reports();), a macro statement (%if), a macro comment (%*), or other macro language elements.
Understanding how the SAS macro language works is pretty important to understanding the difference here. %IF for example is instructing the SAS macro processor to do something. IF is regular SAS code that will be put into the SAS data step (or whatever). Spend some time understanding what the macro language is doing - what the point of it is entirely - to fully understand that.
And, as with many things in SAS, Ian Whitlock can explain it better than I can.
The % symbol indicates that it is macro logic, not data step logic.
Macro logic is executed before compilation, just like preprocessor logic in C++. For instance
%MACRO reports ;
data lastWorkingDayData;
set allData;
%IF &SYSDAY = Monday
%THEN %DO ;
if transactionDate ge "&SYSDATE."d -3 then output;
%END ;
%ELSE %DO ;
if transactionDate ge "&SYSDATE."d -1 then output;
%END ;
RUN ;
/* your printing logic comes here */
%MEND reports ;
%reports;
will, if you run it on a Monday, be converted to
data lastWorkingDayData;
set allData;
if transactionDate ge "&SYSDATE."d -3 then output;
RUN ;
/* your printing logic comes here */
before it is even compiled. To understand it better, start your code with options mprint; and inspect the log.
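A minimal sketch of inspecting generated code with MPRINT, assuming a hypothetical macro show_cutoff and a local macro variable days_back, neither of which comes from the original example:

```sas
options mprint;

/* Hypothetical macro, for illustration only. */
%macro show_cutoff;
  %local days_back;
  %if &sysday = Monday %then %let days_back = 3;
  %else %let days_back = 1;

  data _null_;
    cutoff = today() - &days_back;   /* only this resolved line reaches the compiler */
    put cutoff= date9.;
  run;
%mend show_cutoff;

%show_cutoff
```

The MPRINT lines in the log show the data step as SAS actually receives it, with &days_back already replaced by 3 or 1 and no trace of the %IF logic.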

what is wrong with the following sas code?

I am new to SAS. I have written some basic code but it is not working. Can somebody help me figure out what is wrong with the code? I wish to append the datasets.
options mprint mlogic symbolgen;
%macro temp();
%let count = 0;
%if &count = 0 %then %do;
data temp;
set survey_201106;
%let count = %eval(&count +1);
%end;
%else %do;
%do i = 201107 %to 201108;
data temp;
set temp survey_&i;
%end;
%end;
run;
%mend;
%temp;
You are setting &count to 0 at the beginning of the macro, so the %else clause will never be executed.
I'm not sure what your aim is, but it looks like you just want to concatenate 3 datasets and store in a new dataset. If so, will this not suffice:
data temp;
set survey_201106-survey_201108;
run;
This creates a dataset called temp and populates it with the contents of survey_201106, survey_201107 and survey_201108 in order. The - tells SAS that you want all the datasets named survey_20110* between survey_201106 and survey_201108 inclusive.
Details of the syntax.
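A self-contained sketch of the range list, using three tiny stand-in data sets with a single made-up variable id, since the real survey tables are not shown:

```sas
/* Stand-in data sets; the real survey tables are assumed to share one layout. */
data survey_201106; id = 1; run;
data survey_201107; id = 2; run;
data survey_201108; id = 3; run;

data temp;
  set survey_201106-survey_201108;   /* numbered range list of data sets */
run;

proc print data=temp;
run;
```

The printed temp data set should contain one row from each of the three inputs, in order.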
options mprint mlogic symbolgen;
%macro temp();
proc sql noprint;
create table table_list as
select monotonic() as num,memname
from dictionary.tables
where libname = 'WORK' and memname contains 'SURVEY_';
quit;
proc sql noprint;
select count(*) into :cnt
from table_list;
quit;
%do i = 1 %to &cnt.;
%if &i eq 1 and %sysfunc(exist(work.temp)) %then %do;
proc sql;
drop table work.temp;
quit;
%end;
proc sql noprint;
select memname into :memname
from table_list
where num = &i.;
quit;
proc append base = temp data = &memname. force;
run;
%end;
%mend;
%temp;
Working: The code above appends all the WORK data sets whose names start with 'SURVEY_' to the temp data set.
dictionary.tables is used to create a data set which contains the list of data sets whose names start with 'survey_'.
table_list data set:
num memname
1 survey_201106
2 survey_201107
3 survey_201108
The cnt macro variable is created to hold the number of such data sets.
Within the loop, each data set in the list of data set names (in table table_list) is appended to the work.temp data set.
What you're probably trying to do is create a macro that appends (without using proc append for some reason) when a dataset exists, or creates it new when it does not.
SAS is not like R or other similar languages, where you have to control largely everything that happens. SAS's strength is that you can ask it to do common things with only a line or two of code. SAS is what's commonly called a 4th Generation Language for this reason: you're not supposed to control all of the little bits. That's a waste of time. Use the Procedures (PROC...) and constructs SAS provides you.
In this case, PROC APPEND does exactly what this whole macro does. It creates a dataset or appends new rows to it if it already exists.
proc append base=temp data=in_data;
run;
Now, if you're trying to learn the macro language and using this concept as a learning tool only, it is possible to do this in a macro that's not all that different from yours.
Note: This is not a good way to do this. It might be useful for learning macro concepts, but it should not be used as an example of good code. Despite my improvements, it is still not the way you should do this; proc append or SRSwift's example are better.
One thing I'm going to introduce here: a macro parameter. A good rule of macro programming is that all macros should have parameters. If there's no possible parameter, it should usually be possible to do without needing a macro. Parameters are what make macros useful, most of the time. In this case I'm going to rewrite your macro to take one append dataset as a parameter and one 'base' dataset. In your example, temp was the base dataset and survey_1106 etc. are the append datasets.
Also, &count needs to be a global macro variable. In SAS, variables created inside a macro are by default local in scope, i.e. they are only defined inside one run of the macro and then disappear. This is nearly identical to functions in C-like languages (and a bit different from R, which uses lexical scoping and which you might be expecting given how you wrote this). There are some funny rules, but for now we'll just go with this. Global macro variables, which include any variable that has already been defined in the global scope, are available in all macro invocations (and outside of macros).
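A minimal sketch of that scoping rule, assuming two hypothetical macros named bump and isolated that are not part of the question's code:

```sas
%let count = 0;                      /* global: defined in open code */

%macro bump;
  /* count already exists globally, so %let updates that copy */
  %let count = %eval(&count + 1);
%mend bump;

%macro isolated;
  %local count;                      /* a separate, local count */
  %let count = 99;
  %put NOTE: inside isolated count=&count;
%mend isolated;

%bump
%bump
%isolated
%put NOTE: after the calls count=&count;
```

The local copy shows 99 inside isolated, while the final %put reports count=2: the two bump calls changed the global variable and the %local declaration kept isolated from touching it.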
So:
%macro append_dataset(base=,append=);
%if &count=0 %then %do;
data &base.;
set &append.;
run;
%end;
%else %do;
data &base.;
set &base. &append.;
run;
%end;
%let count=%eval(&count.+1);
%mend append_dataset;
%let count=0;
%append_dataset(base=temp,append=survey_1106);
%append_dataset(base=temp,append=survey_1107);
%append_dataset(base=temp,append=survey_1108);
Now, you could generate those calls through an external method (such as dictionary.tables as in Harshad's example). You also could add another element to the macro, which is to iterate over all of the elements in a list provided as append. You could also hardcode the list in a %do loop, as you did in the initial example (but I think that's bad practice to get into). You could literally do that in my macro:
%macro append_dataset(base=,append=);
%do survey=201106 %to 201108;
%if &count=0 %then %do;
data &base.;
set survey_&survey.;
run;
%end;
%else %do;
data &base.;
set &base. survey_&survey.;
run;
%end;
%let count=%eval(&count.+1);
%end;
%mend append_dataset;
Notice the count increment is inside the do loop - that's one of the places you went wrong here. Otherwise this is just adding an outer loop and changing the append mentions to the calculated values from the loop. But again, this is fairly poor coding practice - the loop should at minimum be constructed from a macro parameter.
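A minimal sketch of building the loop from macro parameters instead of hardcoding it. The macro name append_range and its parameters prefix, from and to are assumptions made for illustration, and it swaps the counter logic for PROC APPEND, as recommended earlier in this answer:

```sas
/* Hypothetical macro: the parameter names are assumptions, not part of the original. */
%macro append_range(base=, prefix=, from=, to=);
  %local suffix;
  %do suffix = &from %to &to;
    /* PROC APPEND creates the base data set on the first pass if it does not exist */
    proc append base=&base data=&prefix.&suffix;
    run;
  %end;
%mend append_range;

%append_range(base=temp, prefix=survey_, from=201106, to=201108)
```

This assumes the survey_ data sets exist and that temp either does not exist yet or should be added to. Stepping through integers only works for months within a single year; a range that crosses a year boundary would need date arithmetic instead.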