I want to execute a macro conditionally, based on some loop variable in the dataset.
data alldat;
do i=1 to 5;
* here, a macro should be called ;
* that is accessing some array variable of the dataset ;
* e.g. %my_macro(my_array(i));
%put hello;
output;
end;
run;
proc print; run;
How is this possible?
If I execute this example code above, hello is only output once, while alldat contains 5 values, as one would expect. I want 5 hellos in my output.
Thank you!
If you want your data step loop to output hello 5 times then use a PUT statement instead of a %PUT statement.
In general, macros are used to generate SAS code that is then executed, so it really depends on what type of code your macro generates. If it only generates statements that can be used inside a data step, then call it once inside your DO loop and the generated statements will run 5 times. So if your macro generates data step statements that could update the variable whose name is passed to it, your code might look like this.
data alldata;
array my_array var1-var5 ;
do i=1 to dim(my_array);
%my_macro(my_array(i));
put 'hello';
output;
end;
run;
Otherwise you could use CALL EXECUTE to generate the macro calls so that they can then run and generate their code after the data step stops. This data step will read values from an existing array and pass the values to the macro. So it will generate 5 calls to the macro for every observation in the input data. The generated macro calls will run after the data step stops.
data _null_;
set mydata;
array my_array var1-var5 ;
do i=1 to dim(my_array);
call execute(cats('%nrstr(%my_macro)(',my_array(i),');'));
call execute('%nrstr(%put hello;)');
end;
run;
I need to understand why call execute works for each iteration when it is outside the loop. The variable name will be overwritten every time, so I should get a result for the last variable in my dataset rather than for all 111 variables.
data _null_;
set basel.Data_Dictionary;
do i =1 to 111 ;
call symput('Varname',NAME);
%put &varname.;
end;
call execute ('%missimp(&varname.)');
run;
Not sure what you mean by outside the loop?
The first thing that program will do is print the value that the macro variable VARNAME has BEFORE the data _null_ step starts to the log. Note that macro code is processed first and then the resulting text is interpreted as the SAS code that you want to run. It would be less confusing to place the %put statement before the data statement.
Your datastep will "loop" over each observation in your source dataset. It will read an observation from your input data. The DO loop will cause it to set the macro variable VARNAME to the same value one hundred and eleven times. Then it will place a call to the macro named MISSIMP that will use the value of VARNAME (at the time the CALL EXECUTE statement ran). This will repeat until the SET statement tries to read past the end of input dataset. All of those macro calls will run after the current data step finishes.
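The timing described above can be seen with a small sketch (the macro variable value and the loop bound are made up for illustration). The %PUT is handled by the macro processor while the data step is still being compiled, so it should write a single line showing the value VARNAME had before the step. The value set by CALL SYMPUTX only becomes visible once the step has run.

```sas
%let varname=before_step;

data _null_;
  do i=1 to 3;
    /* runs at execution time, once per iteration */
    call symputx('varname', cats('value', i));
    /* handled once by the macro processor, while the step is compiled */
    %put inside step: &varname;
  end;
run;

/* the data step has finished, so this shows the last value that was set */
%put after step: &varname;
```

The first %PUT should show before_step and the second should show value3, which is why moving the %PUT outside the step makes the program easier to read.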
A much simpler approach is to skip the CALL SYMPUT statement and use the value of NAME to generate the code to pass to CALL EXECUTE, like this:
data _null_;
set basel.Data_Dictionary;
call execute(cats('%missimp(',NAME,')'));
run;
I have a pretty large macro that I want to call several times. (I'm using replicate weights to calculate my error.) I want to call the process for different variables, say VAR1-VAR99. In the past I've used a DATA _NULL_ step and CALL EXECUTE like so:
data _null_;
do i=1 to 99;
call execute(compress("%mymacro(VAR" || i || ")") );
end;
run;
This isn't working for me this time, though. There might be something I'm missing about the scope of macro variables? I'd like to call:
%mymacro(VAR1)
%mymacro(VAR2)
...
%mymacro(VAR99)
and of course I'd like to do this without 99 lines of code. Why might my method be suddenly failing me? What are other ways to do this?
Here is an example of generating macro calls with call execute. I added %NRSTR, as it prevents macro timing issues. It makes the call execute generate the macro call, without actually executing the macro. If your macro generates macro variables from data, without the %NRSTR you can end up with timing issues and scope issues.
%macro mymacro(var) ;
%put var=&var ;
%mend mymacro ;
data _null_;
do i=1 to 5;
call execute(cats('%nrstr(%mymacro(var',i,"))")) ;
end;
run;
Or it could be as simple as changing your code to use single quotes instead of double quotes. Single quotes will prevent the macro from executing when the data step compiles. If your macro does not generate macro variables from data, this may be enough. But I always use %NRSTR.
data _null_;
do i=1 to 5;
call execute(compress('%mymacro(VAR' || i || ")") );
end;
run;
Don't use call execute; instead, call the macro from inside another macro with a %DO loop.
%macro repeat(n);
%do i=1 %to &n;
%mymacro(VAR&i);
%end;
%mend;
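A slightly hardened variant of the loop above, as a sketch only. The stand-in %mymacro just writes to the log so the example can run on its own. The %local statement is an addition, not part of the original answer: it keeps the loop counter from overwriting a macro variable named i in the calling environment.

```sas
/* stand-in for the real (large) macro, for illustration only */
%macro mymacro(var);
  %put NOTE: processing &var;
%mend mymacro;

%macro repeat(n);
  %local i;              /* keep the counter private to this macro */
  %do i=1 %to &n;
    %mymacro(VAR&i)
  %end;
%mend repeat;

/* generates %mymacro(VAR1), %mymacro(VAR2), %mymacro(VAR3) */
%repeat(3)
```

With the real macro in place, %repeat(99) would generate the 99 calls from the question.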
Large is probably the key word in that question. Let me explain.
You are pushing the macro calls onto the stack using call execute(). But what actually is getting placed on the stack is the code generated by the macro and not the call. Look at the lines with + at the beginning in the SAS log.
If the macro generates just a few lines of code then not much is stacked up to run after the data step. But if it is large then you might overload the stack.
Also if the macro uses the lines of SAS code it generates to create macro variables (using call symputx() or SQL's into clause) that later drive the macro's logic there will be a timing issue. Again that is more likely to happen with a large macro than a small (simple) macro.
Wrapping the macro call (or at least the macro's name) in %nrstr() will prevent SAS from running the macro during the call execute() call. Instead the macro call will be placed onto the stack to be run after the data step finishes.
Consider this simple macro definition.
%macro mymacro(varname);
proc means data=sashelp.class ;
var &varname ;
run;
%mend mymacro;
If I use call execute() to generate calls to it like this:
data _null_;
call execute('%mymacro(age)');
call execute('%mymacro(height)');
call execute('%mymacro(weight)');
run;
Then you will see lines like this in the SAS log
1 + proc means data=sashelp.class ; var age ; run;
2 + proc means data=sashelp.class ; var height ; run;
3 + proc means data=sashelp.class ; var weight ; run;
But if instead you add %nrstr() like this:
data _null_;
call execute('%nrstr(%mymacro)(age)');
call execute('%nrstr(%mymacro)(height)');
call execute('%nrstr(%mymacro)(weight)');
run;
Then the lines in the SAS log look like this.
1 + %mymacro(age)
2 + %mymacro(height)
3 + %mymacro(weight)
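The timing issue mentioned earlier (a macro that creates a macro variable from data and then uses it) can be sketched as follows. The macro name show_count and the variable nobs are hypothetical and chosen only for this illustration.

```sas
/* hypothetical macro: stores a row count in a macro variable, then reads it */
%macro show_count(ds);
  %local nobs;
  data _null_;
    if 0 then set &ds nobs=n;   /* n is filled in when the step is compiled */
    call symputx('nobs', n);
    stop;
  run;
  %put NOTE: &ds has &nobs observations;
%mend show_count;

data _null_;
  /* macro runs immediately: the %put is evaluated before the generated step has run */
  call execute('%show_count(sashelp.class)');
  /* only the call is stacked: the macro runs after this step, in the normal order */
  call execute('%nrstr(%show_count)(sashelp.class)');
run;
```

The first call should write the note with an empty count, because the generated data step has not executed yet when the %PUT is processed. The second call should report the actual number of observations.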
I have two values which represent dates:
a=101 and b=103
Below is first macro saved in separate file one.sas:
%global time nmall;
%let nmall =;
%macro pmall;
%do i=&a. %to &b;
%if &i = &a. %then %do;
%let nmall=&nmall.&i;
%end;
%else %let nmall=&nmall.,&i;
%end;
%put (&nmall);
%mend;
%pmall;
So above pmall give me values 101,102,103.
Below is second macro:
%include "one.sas";
%macro c(a=,b=);
%let m=;
%let m1=;
%do i =&a %to &b;
%let o=&i;
proc sql;
create table new&o as select * from data where nb in(&o.);quit;
%let m=&m date&o;
data date&o.;
set date&o.;
if pass =&o.;
run;
proc sort data=date&o.;
by flag;
run;
%end;
data output&a._&b.;
set &m;
run;
%mend;
The above macro creates three datasets, date101, date102 and date103, then appends them to output101_103.
I am trying to modify above macros in such a way that I will not use %macro and %mend approach. Below is the modified macro code:
data a_to_c;
do o=&a to &c;
output;
end;
run;
so above code will have values 101 102 103 in variable o for dataset a_to_c.
data _null_;
set a_to_c;
call execute('create table new'||strip(o)||' as select * from data
where nb in('||strip(o)||'); quit;');
run;
I want to know how to do the following things.
Create pmall values in a macro variable in my modified macro inside the data step data a_to_c, so that I can use it further.
How to proceed from %let m macro in the first macro code to new code which I am developing above.
Geetha:
I think you will find the macro-ization of the process to be far easier if you go from a data-centric explicit solution and proceed abstracting the salient features into macro symbols (aka variables)
The end result appears to be:
data output_101_to_103;
set original_data;
where nb between 101 and 103;
run;
proc sort data=output_101_to_103;
by nb flag;
run;
In which case you could code a macro that abstracts 101 to FIRST and 103 to LAST. The data sets could also be abstracted. The abstracted parts are specified as the macro parameters.
%macro subsetter(DATA=, FIRST=, LAST=, OUTPREFIX=OUTPUT);
%local out;
%let out = &OUTPREFIX._&FIRST._&LAST.;
data &out;
set &DATA.;
where nb between &FIRST. and &LAST.;
* condition = "between &FIRST. and &LAST."; * uncomment if you want to carry along the condition into your output data set;
run;
proc sort data=&out;
by nb flag;
run;
%mend;
And use as
%subsetter (data=original_data, first=101, last=103, outprefix=output)
Note: If you did keep the condition variable in the output data, you WOULD NOT be able to use it directly as a source code statement in a future data step, as in if nb condition then ...
I suppose you could also pass the NB and FLAG as parameters -- but you approach a point of diminishing returns on the utility of the macro.
Macro-izing the specific example I showed doesn't make too much sense unless you need to perform a lot of different variations of FIRST and LAST in a well documented framework. Sometimes it is just better to not abstract the code and work with the specific cases. Why? Because when there are too many abstracted pieces the macro invocation is almost as long as the specific code you are generating and the abstraction just gets in the way of understanding.
If the macro is simply chopping up data and reassembling data, you might be better served rethinking the flow using where, by, and class statements and abstracting around that.
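As a rough sketch of that idea: instead of splitting the data into one dataset per value of nb and stacking the pieces again, a single step can restrict the rows with WHERE and group them with CLASS. The dataset original_data and the variable nb come from the example above; the analysis variable amount is hypothetical.

```sas
proc means data=original_data n mean;
  where nb between 101 and 103;   /* same subset as the macro above */
  class nb;                       /* one set of statistics per value of nb */
  var amount;                     /* hypothetical numeric variable */
run;
```

Only the two boundary values would need to be abstracted into macro parameters here, if anything.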
Pmall is a macro variable which will hold a list of values separated by commas. In my modified macro, I want to create pmall as a macro variable in the data step: data a_to_c; do o=&a to &c; output; end; run; – geetha anand
To create a macro variable from within a data step, use the CALL SYMPUTX() routine.
data a_to_c;
length pmall $200 ;
do o=&a to &c;
pmall=catx(',',pmall,o);
output;
end;
call symputx('pmall',pmall);
drop pmall;
run;
If you really want to generate code without a SAS macro you can use CALL EXECUTE() or write the code to a file and use %INCLUDE to run it. Or for small pieces of code you could try putting the code in a macro variable, but macro variables can only contain 64K bytes.
It is really hard to tell from what you posted what code you want to generate. Let's assume that you want to generate a new dataset for each value in the sequence and then append that to some aggregate dataset. So for the first pass through the loop your code might be as simple as these two steps: the first creates the proper subset in the right order, and the second appends the result to the aggregate dataset.
proc sort data=nb out=date101 ;
where nb=101 ;
by flag ;
run;
proc append base=date101_103 data=date101 force;
run;
The next two times through the loop will look the same, only the "101" will be replaced by the current value in the sequence.
So using CALL EXECUTE your program might look like:
%let a=101;
%let c=103;
proc delete data=date&a._&c ;
run;
data _null_;
do nb=&a to &c;
call execute(catx(' ','proc sort data=nb out=',cats('date',nb),';'));
call execute(cats('where nb=',nb,';')) ;
call execute('by flag; run;');
call execute("proc append base=date&a._&c data=");
call execute(cats('date',nb));
call execute(' force; run;');
end;
run;
Writing it to a file to run via %INCLUDE would look like this:
filename code temp ;
data _null_;
file code ;
do nb=&a to &c;
put 'proc sort data=nb out=date' nb ';'
/ ' where ' nb= ';'
/ ' by flag;'
/ 'run;'
/ "proc append base=date&a._&c data=date" nb 'force;'
/ 'run;'
;
end;
run;
proc delete data=date&a._&c ;
run;
%include code / source2;
If the goal is just to create the aggregate dataset and you do not need to keep the smaller intermediate datasets, then you could use the same name for the intermediate dataset on each pass through the loop. That makes the code generation easier, as there is then only one place that needs to change based on the current value. You also need only two dataset names, even for a sequence of 10 or 20 values. It will take less space and reduce clutter in the work library.
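A minimal sketch of that variation, reusing the CALL EXECUTE approach from above. The intermediate dataset name onedate is made up for this example. The input dataset nb and the variable flag are the same assumptions as in the earlier steps.

```sas
%let a=101;
%let c=103;

/* start from an empty aggregate; PROC DELETE only warns if it does not exist yet */
proc delete data=date&a._&c ;
run;

data _null_;
  do nb=&a to &c;
    /* only the WHERE value changes from pass to pass */
    call execute(cats('proc sort data=nb out=onedate; where nb=', nb, '; by flag; run;'));
    /* double quotes so &a and &c resolve when this step is compiled */
    call execute("proc append base=date&a._&c data=onedate force; run;");
  end;
run;
```

Each generated PROC SORT overwrites onedate, so only that one intermediate dataset and the aggregate are left in the work library.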
SAS documentation says the macro references in call execute are executed immediately. Does this code exemplify it?
%let var = abc;
data _null_;
call execute ('&var');
run;
Sort of. Here is a more complete example, where the value of the macro variable is actual executable SAS code.
data _null_;
call symputx('var','data;run;');
run;
%put var= %superq(var);
data _null_;
call execute ('&var');
run;
You can see in the SAS log that the code that CALL EXECUTE() actually pushed onto the stack to run is the VALUE of the macro variable even though the single quotes would prevent the macro variable from expanding during the data _null_ step that is using the CALL EXECUTE() statement.
NOTE: CALL EXECUTE generated line.
1 + data;run;
I posted a question a while back about trimming a macro variable down that I am using to download a CSV from Yahoo Finance that contains variable information on each pass to the site. The code that was suggested to me to achieve this was as follows:
data _null_;
a = "&testvar.";
call symputx('svar',trim(input(a,$8.)));
run;
That worked great, however I have since needed to redesign the code so that I am declaring multiple macro variables and submitting multiple ones at the same time.
To declare multiple macros at the same time I have used the following lines of code:
%let svar&e. = &svar.;
%put stock_ticker = &&svar&e.;
The variable &e. is an iterator that goes up by one every time. This declares what looks to be an identical macro variable to &svar. every time they are put into the log; however, the new dynamic macro variable now throws the original warning message:
WARNING: The quoted string currently being processed has become more than 262 characters long. You
may have unbalanced quotation marks.
This is the warning I was getting before I started using the symputx approach suggested in my original question.
The full code for this particular nested macro is listed below:
%macro symbol_var;
/*here the start row and end row created in the macro above are passed to this nested macro and then passed through the*/
/*source dataset. at the end of the loop each ticker macro variable is defined in turn for use in the following nested*/
/*macro, symbol by metric.*/
%do e = &beg_point. %to &end_point. %by 1;
%put stock row in dataset nasdaq ticker = &e.;
%global svar&e;
proc sql noprint;
select symbol
into :testvar
from nasdaq_ticker
where monotonic() = &e.;
quit;
/*convert value to string here*/
data _null_;
a = "&testvar.";
call symputx('svar',trim(input(a,$8.)));
run;
%let svar&e. = &svar.;
%put stock_ticker = &&svar&e.;
%end;
%mend;
%symbol_var;
Does anyone have any suggestions for how I could create the macro variable &&svar&e. directly in the call symputx step? It currently throws an error saying that the macro variable being created cannot contain any special characters. I've tried using %QUOTE, %NRQUOTE and %NRBQUOTE, but either I have used the function in an invalid context or I haven't got the syntax exactly right.
Thanks
Isn't this as simple as the following two line data step?
%macro symbol_var;
/*here the start row and end row created in the macro above are passed to this nested macro and then passed through the*/
/*source dataset. at the end of the loop each ticker macro variable is defined in turn for use in the following nested*/
/*macro, symbol by metric.*/
data _null_;
set nasdaq_ticker(firstobs=&beg_point. obs=&end_point.);
call symputx('svar' || strip(_n_), symbol);
run;
%mend;
%symbol_var;
Or the following (which includes debugging output)
%macro symbol_var;
/*here the start row and end row created in the macro above are passed to this nested macro and then passed through the*/
/*source dataset. at the end of the loop each ticker macro variable is defined in turn for use in the following nested*/
/*macro, symbol by metric.*/
data _null_;
set nasdaq_ticker(firstobs=&beg_point. obs=&end_point.);
length varname $ 32;
varname = 'svar' || strip(_n_);
call symputx(varname, symbol);
put varname '= ' symbol;
run;
%mend;
%symbol_var;
When manipulating macro variables and desiring bullet-proof code I often find myself reverting to using a data null step. The original post included the problem about a quoted string warning. This happens because the SAS macro parser does not hide the value of your macro variables from the syntax scanner. This means that your data (stored in macro vars) can create syntax errors in your program because SAS attempts to interpret it as code (shudder!). It really makes the hair on the back of my neck stand up to risk my program at the hands of what might be in the data. Using the data step and functions protects you from this completely. You will note that my code never uses an ampersand character other than the observation window points. This makes my code bullet proof regarding what dirty data there may be in the nasdaq_ticker data set.
Also, it is important to point out that both Dom and I wrote code that makes one pass over the nasdaq_ticker data set. Not to bash the original posted code, but looping in that way causes a proc sql invocation for every observation in the result set. This will create very poor performance for large result sets. I recommend developing an awareness of how many times a macro loop is going to cause you to read a data set. I have been bitten by this many times in my own code.
Try
call symputx("svar&e",trim(input(a,$8.)));
You need double quotes ("") so that the macro variable e is resolved.
As an aside, I am not sure you need the INPUT function if &testvar is a string and not a number.
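The difference between the two kinds of quotes can be checked with a short sketch. The value of e and the variable names name1 and name2 are chosen only for illustration.

```sas
%let e=3;

data _null_;
  length name1 name2 $32;
  name1 = 'svar&e';   /* single quotes: the reference is left as literal text */
  name2 = "svar&e";   /* double quotes: &e is resolved before the step compiles */
  put name1= name2=;
run;
```

The log should show name1=svar&e and name2=svar3, so only the double-quoted form gives CALL SYMPUTX a valid macro variable name.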
I would have written this as
%macro whatever();
proc sql noprint;
select count(*)
into :n
from nasdaq_ticker;
select strip(symbol)
into :svar1 - :svar%left(&n)
from nasdaq_ticker;
quit;
%do i=1 %to &n;
%put stock_ticker = &&svar&i;
%end;
%mend;