% at the beginning of lines of SAS code

As part of my work, I am going through this SAS code. I have never worked on SAS before. Could anybody explain the usage of '%' in front of the SAS lines in the code below?
%if &i>0 %then %do;
or
%put ##### calling formula;
%formula();

% signs indicate macros or macro code. Macros in SAS are similar to functions in other programming languages, but not quite. You can treat them like functions though. They deal exclusively with text and only text.
The macro facility exists to handle more generalized problems. Things in SAS are done in PROCs and the DATA Step. Each PROC and DATA Step is like its own little self-contained environment - stuff that happens in there stays in there. The Macro facility gives you tools to do things like conditionally call PROCs, the DATA Step, or system options.
I highly recommend taking one of the free training courses on SAS programming. If you want to get a jump start on Macro language, start with this free course on Coursera.
To help get you started, here's what this code is doing, line-by-line.
%if &i>0 %then %do;
If the macro variable i is greater than 0, run some code. All macro variable references start with &. You can check the value of the macro variable i by submitting %put &i;
%put ##### calling formula;
%put writes a line to the log. Here it is likely being used to help debug things.
%formula();
This is a macro function that holds some code and runs it. It has no arguments. Macro functions are created with the following syntax:
%macro myMacro();
<macro or SAS code here>;
%mend;
% is the key symbol to invoke a macro. If we wanted to invoke myMacro, we can do so by prefixing it with a %:
%myMacro;
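To tie the three pieces together, here is a minimal sketch. The macro name formula_demo and its parameter i are made up for illustration and are not part of the code in the question.

```sas
/* Hypothetical macro: generates a DATA step only when i is positive */
%macro formula_demo(i);
    %if &i > 0 %then %do;
        %put NOTE: calling formula with i=&i;
        data _null_;
            x = &i * 2;   /* &i is replaced by its text before the step compiles */
            put x=;
        run;
    %end;
    %else %do;
        %put NOTE: i=&i is not positive, no code generated;
    %end;
%mend formula_demo;

%formula_demo(3)   /* the DATA step is generated and runs */
%formula_demo(0)   /* only the %put in the %else branch runs */
```

Running it with options mprint; turned on should show the generated DATA step in the log, which is a handy way to see what a macro actually produced.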

Related

Checking to see if a dataset exists

I've just finished the main macro for a project that I'm on. It generates a line to enter into another table. So, my next step is to write another macro that calls this one. One of the arguments for this next macro is the name of the dataset in which to insert this new observation, which leads to my question...
I'd like for my next macro to check and see if the named dataset exists. If so, it will insert the new calculated line into the dataset. If it does not yet exist, I'd like to save the new line as a dataset with this name.
To get a little bit more concrete, let's suppose I have the macro
%calculate_for(ARG1, ARG2, ARG3) that creates a single-observation dataset NEXT_LINE. I want to write a macro that does something like:
%macro do_for(ARG1, ARG2, ARG3, DATASET_NAME);
%calculate_for(&ARG1, &ARG2, &ARG3)
{if DATASET_NAME exists then do:}
data &DATASET_NAME;
set &DATASET_NAME
NEXT_LINE;
run;
{if DATASET_NAME doesn't exist yet then do:}
data &DATASET_NAME;
set NEXT_LINE;
run;
%mend;
How might I go about doing this in SAS?
The macro function %SYSFUNC can be used to invoke almost any DATA step function.
For example
%macro …;
data &out;
set
%if %sysfunc (EXIST(&OUT,DATA)) %then %do;
&OUT
%end;
NEXT_LINE
;
run;
%mend;
Likewise, the %SYSCALL statement can be used to invoke almost any CALL routine.
As @Reeza comments, for the specific coding case in your question, Proc APPEND could be the better choice. The pattern shown in your sample code would cause an entire rewrite of the base table.
Other coding patterns that do not rewrite the entire data set include
DATA Step : MODIFY statement with subsequent OUTPUT, REPLACE or REMOVE statements
Proc SQL : INSERT INTO … SELECT … FROM
If you are doing a lot of development, perhaps don't reinvent the wheel at every step. Look around for SAS macro libraries that have common utility features; one example is Roland's SAS® Macros.
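As a rough sketch of the Proc APPEND suggestion, the macro from the question could be reduced to the following. It assumes %calculate_for leaves a one-row data set called NEXT_LINE in WORK, as described in the question. Proc APPEND creates the BASE= data set when it does not exist yet, so no explicit existence test is needed.

```sas
%macro do_for(ARG1, ARG2, ARG3, DATASET_NAME);
    /* assumed to create WORK.NEXT_LINE with a single observation */
    %calculate_for(&ARG1, &ARG2, &ARG3)

    /* adds NEXT_LINE to the end of the base table without rewriting it; */
    /* if the base table is missing, it is created from NEXT_LINE        */
    proc append base=&DATASET_NAME data=NEXT_LINE;
    run;
%mend do_for;
```

If the column attributes of NEXT_LINE can drift between calls, the FORCE option may be needed, at the cost of possible truncation.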

What does testing if a variable is not equal to nothing do?

I came across SAS code recently that looks something like this:
%if var_name ~= %then %do;
flag = 1;
%end;
I understand that ~= means "not equal", but there appears to be nothing here for the variable to be compared to. Can someone shed any light on this syntax?
I've ruled out the possibility that this is shorthand for identifying missing observations: the flag is generated regardless of whether var_name contains any missing observations. That being said, it does the exact same thing as the code that you would think would actually do this:
%if var_name ~= . %then %do;
flag = 1;
%end;
The above also generates a flag with value 1 for all observations.
Any help on this greatly appreciated as I am quite new to SAS!
Best estimate: the macro expression is emitting flag=1; as an unconditional DATA step statement.
"Coming across SAS code" can be anywhere on the continuum of a singularly rewarding experience to a journey into a deep dark place.
The snippet
%if var_name ~= . %then %do;
flag = 1;
%end;
is a construct consistent with someone who is learning macro and does not yet grok the scopes and environments within a SAS session. Macro variables and statements do not mingle with running data step variables and statements. Macro programming typically controls what is eventually seen as the DATA or PROC step source code that needs to be run.
There can be legitimate reasons for the snippet and therein starts your journey.
%IF expression %THEN statement; involves the resolution of a macro logical expression.
The expression is implicitly resolved and evaluated to be zero (false) or non-zero (true). Expressions that cannot be resolved down to a non-missing numeric value at macro evaluation time will log an ERROR.
NOTE: Macro evaluation time is long gone by the time the SAS executor has compiled and is executing the DATA Step. SAS Documentation is pretty awesome, use it!
Your var_name ~= expression is always true.
%put NOTE: %nrstr(%eval(var_name ~=)) resolves to %eval(var_name ~=);
----
NOTE: %eval(var_name ~=) resolves to 1
Because the %IF expression always resolves to true the %THEN statement is always resolved and emitted as source code to be consumed by the SAS executor.
So in your case the source code flag = 1; is emitted, ostensibly as part of a DATA step in which the flag assignment is unconditional.
Many times the statement is another macro expression that does not emit anything and instead performs an action that affects the macro state at the current macro scope -- For example %IF &variable=&target %THEN %let target_met=1;.
The statements around the one you noticed are really clues to whether the %IF is correct and what it should be. What could it be?
Does the data set being processed actually have a column named var_name? Maybe you are dealing with metadata output by Proc CONTENTS, SQL DICTIONARY.COLUMNS, or working in a framework that uses control data for generating statements.
A goofup wherein the %if - %then should really be a data step if -then and the var_name should have been replaced with an actual variable name found in the data set being processed.
Working in a code generating framework where non-empty symbols representing data step variables are used to generate data step if-then statements
The code is the work of a madman, mad genius, or village idiot.
Happy coding!
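To see the difference between the two kinds of IF side by side, here is a small sketch. The macro name flag_demo, the output data set and the flag names are invented for the illustration; sashelp.class is the sample table shipped with SAS.

```sas
options mprint;  /* show the generated DATA step code in the log */

%macro flag_demo;
    data work.flagged;
        set sashelp.class;

        /* Macro %IF: compares the TEXT "age" with empty text.        */
        /* Always true, so the assignment is emitted unconditionally. */
        %if age ~= %then %do;
            macro_flag = 1;
        %end;

        /* DATA step IF: evaluated once per row against the data. */
        if age > 13 then data_flag = 1;
    run;
%mend flag_demo;

%flag_demo
```

In the result, macro_flag should be 1 on every row, while data_flag is set only on rows where age exceeds 13.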

%let %put variables - what it does to your sas program

I'm new in SAS. I'd like to know what the lines below do. Couldn't figure out what it does to the program because I didn't encounter any of the defined variables in the succeeding parts after declaration.
%let cutofftime =%sysfunc(time());
%let currdt = %sysfunc(putn(&cutofftime.,time5.)) ;
%put &cutofftime. &currdt.;
The %let statement is used to create a macro variable.
The first statement:
%let cutofftime =%sysfunc(time());
uses the time() function to determine the current time. It returns the current time as a numeric value which is number of seconds since midnight.
The second statement:
%let currdt = %sysfunc(putn(&cutofftime.,time5.)) ;
uses the PUTN() function to convert the numeric time value (which is now stored in the macro variable CUTOFFTIME) to a nicely formatted value like 22:30.
So after the two %let statements have run, you created two macro variables. Then the %PUT statement is used to write the values of the two macro variables to the log:
%put &cutofftime. &currdt.;
Using the %PUT statement to write the value of macro variables to the log is a useful way to debug macro code, in the same way that the PUT statement can be used to write the value of data step variables to the log as a data step debugging tool. When I run the code at 9:32 PM, the log shows:
3 %put &cutofftime. &currdt.;
77537.809 21:32
That said, if you're new to SAS, you should probably avoid trying to learn the macro language at the same time as you are learning the SAS language.
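The same pattern works for dates. The sketch below uses macro variable names chosen only for illustration, and also shows that %SYSFUNC accepts a format as an optional second argument, which saves the separate PUTN() call.

```sas
/* raw SAS date value: number of days since 01JAN1960 */
%let today_raw = %sysfunc(today());

/* two-step version, mirroring the code in the question */
%let today_fmt = %sysfunc(putn(&today_raw, date9.));

/* one-step version: the format is the second argument of %sysfunc */
%let today_fmt2 = %sysfunc(today(), date9.);

/* &= prints each value together with its name */
%put &=today_raw &=today_fmt &=today_fmt2;
```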

SAS - How to return a value from a SAS macro?

I would like to return a value from a SAS macro I created but I'm not sure how. The macro computes the number of observations in a dataset. I want the number of observations to be returned.
%macro nobs(library_name, table_name);
proc sql noprint;
select nlobs into :nobs
from dictionary.tables
where libname = UPCASE(&library_name)
and memname = UPCASE(&table_name);
quit;
*return nobs macro variable;
&nobs
%mend;
%let num_of_observations = %nobs('work', 'patients');
Also, I would like the &nobs macro variable that is used within the macro to be local to that macro and not global. How can I do that?
I'll answer the core question Bambi asked in comments:
My main concern here is how to return a value from a macro.
I'm going to quibble with Dirk here in an important way. He says:
A SAS macro inserts code. It can never return a value, though in some cases you can mimic functions
I disagree. A SAS macro returns text that is inserted into the processing stream. Returns is absolutely an appropriate term for that. And when the text happens to be a single numeric, then it's fine to say that it returns a value.
However, the macro can only return a single value if it only has macro statements in addition to that value. Meaning, every line has to start with a %. Anything that doesn't start with % is going to be returned (and some things that do start with % might also be returned).
So the important question is, How do I return only a value from a macro.
In some cases, like this one, it's entirely possible with only macro code. In fact, in many cases this is technically possible - although in many cases it's more work than you should do.
Jack Hamilton's linked paper includes an example that's appropriate here. He dismisses this example, but that's largely because his paper is about counting observations in cases where NOBS is wrong - either with a WHERE clause, or in certain other cases where datasets have been modified without the NOBS metadata being updated.
In your case, you seem perfectly happy to trust NOBS - so this example will do.
A macro that returns a value must have exactly one statement that either is not a macro syntax statement, or is a macro syntax statement that returns a value into the processing stream. %sysfunc is an example of a statement that does so. Things like %let, %put, %if, etc. are syntax statements that don't return anything (by themselves); so you can have as many of those as you want.
You also have to have one statement that puts a value in the processing stream: otherwise you won't get anything out of your macro at all.
Here is a stripped down version of Jack's macro at the end of page 3, simplified to remove the nlobsf that he is showing is wrong:
%macro check;
%let dsid = %sysfunc(open(sashelp.class, IS));
%if &DSID = 0 %then
%put %sysfunc(sysmsg());
%let nlobs = %sysfunc(attrn(&dsid, NLOBS));
%put &nlobs;
%let rc = %sysfunc(close(&dsid));
%mend;
That macro is not a function style macro. It doesn't return anything to the processing stream! It's useful for looking at the log, but not useful for giving you a value you can program with. However, it's a good start for a function style macro, because what you really want is that &nlobs, right?
%macro check;
%let dsid = %sysfunc(open(sashelp.class, IS));
%if &DSID = 0 %then
%put %sysfunc(sysmsg());
%let nlobs = %sysfunc(attrn(&dsid, NLOBS));
&nlobs
%let rc = %sysfunc(close(&dsid));
%mend;
Now this is a function style macro: it has one statement that is not a macro syntax statement, &nlobs. on a plain line all by itself.
It's actually more than you need by one statement; remember how I said that %sysfunc returns a value to the processing stream? You could remove the %let part of that statement, leaving you with
%sysfunc(attrn(&dsid, NLOBS))
And then the value will be placed directly in the processing stream itself - allowing you to use it directly. Of course, it isn't as easy to debug if something goes wrong, but I'm sure you can work around that if you need to. Also note the absence of a semi-colon at the end of the statement - this is because semicolons aren't required for macro functions to execute, and we don't want to return any extraneous semicolons.
Let's be well behaved and add a few %locals to get this nice and safe, and make the name of the dataset a parameter, because nature abhors a macro without parameters:
%macro check(dsetname=);
%local dsid nlobs rc;
%let dsid = %sysfunc(open(&dsetname., IS));
%if &DSID = 0 %then
%put %sysfunc(sysmsg());
%let nlobs = %sysfunc(attrn(&dsid, NLOBS));
&nlobs
%let rc = %sysfunc(close(&dsid));
%mend;
%let classobs= %check(dsetname=sashelp.class);
%put &=classobs;
There you have it: a function style macro that uses the NLOBS attribute of the ATTRN function to find out how many rows are in any particular dataset.
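Because %check puts nothing but the number into the processing stream, it can also be used inside another macro expression. A possible usage sketch, assuming %check is defined as above (the wrapper macro print_if_any is hypothetical):

```sas
%macro print_if_any(dsetname=);
    /* %check resolves to the row count, so the condition reads e.g. 19 > 0 */
    %if %check(dsetname=&dsetname) > 0 %then %do;
        proc print data=&dsetname(obs=5);
        run;
    %end;
    %else %do;
        %put NOTE: &dsetname has no rows or could not be opened;
    %end;
%mend print_if_any;

%print_if_any(dsetname=sashelp.class)
```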
What is the Problem writing function-like macros?
i.e. macros you can use as %let myVar = %myMacro(myArgument)
You can use your user written macro as if it were a function if all you do is
calling some %doSomething(withSomething) like macro functions
assign values to macro variables with a %let someVar = statement
"return" your result, typically by writing &myResult. on the last line before your %mend
As soon as you include a proc or data step in your macro, this does not work any more
Luckily, %sysFunc() comes to the rescue, so we can use any data step function
This includes low level functions like open, fetch and close which can even access your data
nerdy people can do quite a lot with it, but even if you are nerdy, your boss will seldom give you the time to do so.
How do we solve this, i.e. which building blocks do I use?
proc fcmp allows packaging some data step statements in a subroutine or function
This function, meant for use in a data step, can be used within %sysfunc()
Within this function you can call run_macro to execute any macro IN BACKGROUND IMMEDIATELY
Now we are ready for the practical solution
Step 1: write a helper macro
with no parameters,
using some global macro variables
"returning" its result in a global macro variable
I know that is a bad coding habit, but to mitigate the risk, we qualify those variables with a prefix. Applied to the example in the question
** macro nobsHelper retrieves the number of observations in a dataset
Uses global macro variables:
nobsHelper_lib: the library in which the dataset resides, enclosed in quotes
nobsHelper_mem: the name of the dataset, enclosed in quotes
Writes global macro variable:
nobsHelper_obs: the number of observations in the dataset
Take care nobsHelper_obs exists before calling this macro, or it will be lost
**;
%macro nobsHelper();
** Make sure nobsHelper_obs is a global macro variable**;
%global nobsHelper_obs;
proc sql noprint;
select nobs
into :nobsHelper_obs
from sashelp.vtable
where libname = %UPCASE(&nobsHelper_lib)
and memname = %UPCASE(&nobsHelper_mem);
quit;
%* uncomment these put statements to debug **;
%*put NOTE: inside nobsHelper, the following macro variables are known;
%*put _user_;
%mend;
Step 2: write a helper function;
**Functions need to be stored in a compilation library;
options cmplib=sasuser.funcs;
** function nobsHelper, retrieves the number of observations in a dataset
Writes global macro variables:
nobsHelper_lib: the library in which the dataset resides, enclosed in quotes
nobsHelper_mem: the name of the dataset, enclosed in quotes
Calls the macro nobsHelper
Uses macro variable:
nobsHelper_obs: the number of observations in the dataset
**;
proc fcmp outlib=sasuser.funcs.trial;
** Define the function and specify it should be called with two character variables **;
function nobsHelper(nobsHelper_lib $, nobsHelper_mem $);
** Call the macro and pass the variables as global macro variables
** The macro variables will be magically quoted **;
rc = run_macro('nobsHelper', nobsHelper_lib, nobsHelper_mem);
if rc then put 'ERROR: calling nobsHelper gave ' rc=;
** Retrieve the result and pass it on **;
return (symget('nobsHelper_obs'));
endsub;
quit;
Step 3: write a convenience macro to use the helpers;
** macro nobs retrieves the number of observations in a dataset
Parameters:
library_name: the library in which the dataset resides
member_name: the name of the dataset
Inserts in your code:
the number of observations in the dataset
Use as a function
**;
%macro nobs(library_name, member_name);
%sysfunc(nobsHelper(&library_name, &member_name));
%* Uncomment this to debug **;
%*put _user_;
%mend;
Finally use it;
%let num_carrs = %nobs(sasHelp, cars);
%put There are &num_carrs cars in sasHelp.Cars;
Data aboutClass;
libname = 'SASHELP';
memname = 'CLASS';
numerOfStudents = %nobs(sasHelp, class);
run;
I know this is complex but at least all the nerdy work is done.
You can copy, paste and modify this in a time your boss will accept.
;
A SAS macro inserts code. It can never return a value, though in some cases you can mimic functions, usually you need a work around like
%nobs(work, patients, toReturn=num_of_observations )
** To help you understand what happens, I advise printing the code inserted by the macro in your log: ;
options mprint;
We pass the name of the macro variable to fill in to the macro, I find it most practical to
not require the user of my macro to put quotes around the library and member names
make the name of the variable a named macro variable, so we can give it a default;
%macro nobs(library_name, table_name, toReturn=nobs);
Make sure the variable to return exists
If it exists it is known outside of this macro.
Otherwise, if we create it here, it will by default be local and lost when we leave the macro;
%if not %symexist(&toReturn.) %then %global &toReturn.;
In the SQL, I
use the SASHELP.VTABLE, a view provided by SAS on its meta data
add the quotes I omitted in the macro call ("", not '': macro variables are not substituted in single quotes)
use the macro %upcase function instead of the SAS upcase function, as it sometimes improves performance;
proc sql noprint;
select nobs
into :&toReturn.
from sashelp.vtable
where libname = %UPCASE("&library_name.")
and memname = %UPCASE("&table_name.");
quit;
%mend;
Pay attention if you call a macro within a macro. Run this code and read the log to understand why;
%macro test_nobs();
%nobs(sashelp, class); ** will return the results in nobs **;
%nobs(sashelp, shoes, toReturn=num_of_shoes);
%let num_of_cars = ;
%nobs(sashelp, cars, toReturn=num_of_cars);
%put NOTE: inside test_nobs, the following macro variables are known;
%put _user_;
%mend;
%test_nobs;
%put NOTE: outside test_nobs, the following macro variables are known;
%put _user_;
You can't 'return' a value from a function-style macro unless you have written it using only macro statements. Quentin's link provides an example of how to do this.
For example, you cannot use your macro like so, because proc sql cannot execute in the middle of a %put statement (this is possible with other more complex workarounds, e.g. dosubl, but not the way you've written it).
%put %nobs(mylib,mydata);
The best you can do without significant changes is to create a global macro variable and use that in subsequent statements.
To create a macro variable that is local to the originating macro, you have to first declare it via a %local statement within the macro definition.
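A minimal sketch of that approach follows. The names nobs_result and n are chosen for the illustration and are not from the question. The working variable is declared with %local, so the INTO clause fills the local copy, and only the result is published globally.

```sas
%macro nobs(library_name, table_name);
    %global nobs_result;   /* the one value visible to the caller */
    %local n;              /* scratch variable, gone when the macro ends */

    proc sql noprint;
        select nobs into :n trimmed
        from dictionary.tables
        where libname = %upcase("&library_name")
          and memname = %upcase("&table_name");
    quit;

    /* n stays empty when no matching table was found */
    %let nobs_result = &n;
%mend nobs;

%nobs(sashelp, class)
%put &=nobs_result;
```

The macro is called as a statement-style macro here, not inside a %let, because it generates a PROC SQL step.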
I know I am very late to this discussion, but thought of commenting since I came across this. This is another way of doing this I think:
%macro get_something_back(input1, input2, output);
&output = &input1 + &input2;
%mend;
data _test_;
invar1 = 1; invar2 = 2;
%get_something_back(invar1, invar2, outvar);
run;
This will also work outside a datastep.
%global sum;
%macro get_something_back(input1, input2, outvar);
%let &outvar = %sysevalf(&input1 + &input2);
%mend;
%get_something_back(1, 2, sum);

Using a dynamic macro variable in a call symput statement

I posted a question a while back about trimming a macro variable down that I am using to download a CSV from Yahoo Finance that contains variable information on each pass to the site. The code that was suggested to me to achieve this was as follows:
data _null_;
a = "&testvar.";
call symputx('svar',trim(input(a,$8.)));
run;
That worked great, however I have since needed to redesign the code so that I am declaring multiple macro variables and submitting multiple ones at the same time.
To declare multiple macros at the same time I have used the following lines of code:
%let svar&e. = &svar.;
%put stock_ticker = &&svar&e.;
The variable &e. is an iterative variable that goes up by one every time. This declares what looks to be an identical macro variable to the one called &svar. every time they are put into the log; however, the new dynamic macro variable is now throwing up the original warning message of:
WARNING: The quoted string currently being processed has become more than 262 characters long. You
may have unbalanced quotation marks.
That I was getting before I started using the symputx option suggested in my original problem.
The full code for this particular nested macro is listed below:
%macro symbol_var;
/*here the start row and end row created in the macro above are passed to this nested macro and then passed through the*/
/*source dataset. at the end of the loop each ticker macro variable is defined in turn for use in the following nested*/
/*macro, symbol by metric.*/
%do e = &beg_point. %to &end_point. %by 1;
%put stock row in dataset nasdaq ticker = &e.;
%global svar&e;
proc sql noprint;
select symbol
into :testvar
from nasdaq_ticker
where monotonic() = &e.;
quit;
/*convert value to string here*/
data _null_;
a = "&testvar.";
call symputx('svar',trim(input(a,$8.)));
run;
%let svar&e. = &svar.;
%put stock_ticker = &&svar&e.;
%end;
%mend;
%symbol_var;
Anyone have any suggestions how I could declare the macro variable &&svar&e. directly in the call symputx step? It currently throws an error saying that the macro variable being created cannot contain any special characters. I've tried using %QUOTE, %NRQUOTE and %NRBQUOTE, but either I have used the function in an invalid context or I haven't got the syntax exactly right.
Thanks
Isn't this as simple as the following two line data step?
%macro symbol_var;
/*here the start row and end row created in the macro above are passed to this nested macro and then passed through the*/
/*source dataset. at the end of the loop each ticker macro variable is defined in turn for use in the following nested*/
/*macro, symbol by metric.*/
data _null_;
set nasdaq_ticker(firstobs=&beg_point. obs=&end_point.);
call symputx('svar' || strip(_n_), symbol);
run;
%mend;
%symbol_var;
Or the following (which includes debugging output)
%macro symbol_var;
/*here the start row and end row created in the macro above are passed to this nested macro and then passed through the*/
/*source dataset. at the end of the loop each ticker macro variable is defined in turn for use in the following nested*/
/*macro, symbol by metric.*/
data _null_;
set nasdaq_ticker(firstobs=&beg_point. obs=&end_point.);
length varname $ 32;
varname = 'svar' || strip(_n_);
call symputx(varname, symbol);
put varname '= ' symbol;
run;
%mend;
%symbol_var;
When manipulating macro variables and desiring bullet-proof code I often find myself reverting to using a data _null_ step. The original post included the problem about a quoted string warning. This happens because the SAS macro parser does not hide the value of your macro variables from the syntax scanner. This means that your data (stored in macro variables) can create syntax errors in your program because SAS attempts to interpret it as code (shudder!). It really makes the hair on the back of my neck stand up to risk my program at the hands of what might be in the data. Using the data step and functions protects you from this completely. You will note that my code never uses an ampersand character other than for the observation window points. This makes my code bullet-proof regarding what dirty data there may be in the nasdaq_ticker data set.
Also, it is important to point out that both Dom and I wrote code that makes one pass over the nasdaq_ticker data set. Not to bash the original posted code, but looping in that way causes a proc sql invocation for every observation in the result set. This will create very poor performance for large result sets. I recommend developing an awareness of how many times a macro loop is going to cause you to read a data set. I have been bitten by this many times in my own code.
Try
call symputx("svar&e",trim(input(a,$8.)));
You need double quotes ("") for the macro variable e to resolve.
As an aside, I am not sure you need the input function if &testvar is a string and not a number.
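The quoting rule can be checked on its own with two %put statements. This is only an illustration; the value 3 for e is an arbitrary choice.

```sas
%let e = 3;

/* double quotes: the macro processor resolves &e inside the literal */
%put "svar&e";

/* single quotes: the text is left untouched, &e is not resolved */
%put 'svar&e';
```

With single quotes, call symputx would receive the literal text svar&e as the variable name, which is not a valid macro variable name.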
I would have written this as
%macro whatever();
proc sql noprint;
select count(*)
into :n
from nasdaq_ticker;
select strip(symbol)
into :svar1 - :svar%left(&n)
from nasdaq_ticker;
quit;
%do i=1 %to &n;
%put stock_ticker = &&svar&i;
%end;
%mend;