Macro variable is uninitialized after %let statement in SAS

I want to create something in SAS that works like an Excel lookup function. Basically, I set the values for macro variables var1, var2, ... and I want to find their index number according to the ref table. But I get the following messages in the data step.
NOTE: Variable A is uninitialized.
NOTE: Variable B is uninitialized.
NOTE: Variable NULL is uninitialized.
When I print the variables &num1,&num2, I get nothing. Here is my code.
data ref;
input index varname $;
datalines;
0 NULL
1 A
2 B
3 C
;
run;
%let var1=A;
%let var2=B;
%let var3=NULL;
data temp;
set ref;
if varname=&var1 then call symput('num1',trim(left(index)));
if varname=&var2 then call symput('num2',trim(left(index)));
if varname=&var3 then call symput('num3',trim(left(index)));
run;
%put &num1;
%put &num2;
%put &num3;
I can get the correct values for &num1, &num2, ... if I type varname='A' in the if-then statement. And if I subsequently change the statement back to varname=&var1, I still get the required output. But why is that? I don't want to type in the actual string value and then change it back to the macro variable every time to get the result.

Solution to immediate problem
You need to wrap your macro variables in double quotes if you want SAS to treat them as string constants. Otherwise, it will treat them the same way as any other random bits of text it finds in your data step.
Alternatively, you could re-define the macro vars to include the quotes.
As a further option, you could use the symget or resolve functions, but these are not usually needed unless you want to create a macro variable and use it again within the same data step. If you use them as a replacement for double quotes they tend to use a lot more CPU as they will evaluate the macro vars once per row by default - normally, macro vars are evaluated just once, at compile time, before your code executes.
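As a minimal sketch of the double-quote fix, here is the question's code with only the comparisons changed. Two things here are substitutions rather than part of the original: call symputx replaces call symput(trim(left(...))) because it strips blanks itself, and data _null_ replaces data temp because no output dataset is needed.

```sas
data ref;
  input index varname $;
  datalines;
0 NULL
1 A
2 B
3 C
;
run;

%let var1=A;
%let var2=B;
%let var3=NULL;

data _null_;
  set ref;
  /* "&var1" resolves to the string constant "A" before the step compiles */
  if varname = "&var1" then call symputx('num1', index);
  if varname = "&var2" then call symputx('num2', index);
  if varname = "&var3" then call symputx('num3', index);
run;

%put &=num1 &=num2 &=num3;
```

Without the quotes, varname=&var1 becomes varname=A, which compares varname with a data step variable named A that does not exist; hence the "uninitialized" notes. Macro variables created by an earlier successful run stay in the symbol table for the rest of the session, which is why the old values still showed up after switching back to the unquoted version.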
A better approach?
For the sort of lookup you're doing, you actually don't need to use a dataset at all - you can instead define a custom format, which gives you much more flexibility in how you can use it. E.g. this creates a format called lookup:
proc format;
value lookup
1 = 'A'
2 = 'B'
3 = 'C'
other = '#N/A' /*Since this is what vlookup would do :) */
;
run;
Then you can use the format like so:
%let testvar = 1;
%let testvar_lookup = %sysfunc(putn(&testvar, lookup.));
Or in a data step:
data _null_;
var1 = 1;
format var1 lookup.;
put var1=;
run;
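The format above maps an index to a name. For the direction asked about in the question (name to index), one possible sketch is a numeric informat used with inputn. The informat name idx_lookup is made up for this illustration, and the mapping simply mirrors the ref table from the question.

```sas
proc format;
  /* hypothetical informat: maps the names from the ref table to their index */
  invalue idx_lookup
    'NULL' = 0
    'A'    = 1
    'B'    = 2
    'C'    = 3
    other  = .
  ;
run;

%let var1 = A;
/* inputn applies the informat to the text held in var1 */
%let num1 = %sysfunc(inputn(&var1, idx_lookup.));
%put &=num1;
```

Because the lookup happens in %sysfunc, no data step is needed at all.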

Related

Set variable to macro variable with ampersand

Not sure how to title this as the title is still pretty ambiguous but what I'm doing is.
PROC SQL NOPRINT;
SELECT LABEL INTO :head
FROM dictionary.columns
WHERE UPCASE(MEMNAME)='PROCSQLDATA' AND UPCASE(NAME)=%UPCASE("&var.");
QUIT;
DATA want;
SET have;
head="%SUPERQ(&head.)";
RUN;
So what I'm doing with the code is setting a macro variable "head" to the label of the variable "&var." within the data set "procsqldata". So let's say the label for one of the variables that I'm throwing into the proc sql is Adam&Steve. How do I set that to a variable within a data set without throwing an error. One way that I tried to cheat, which doesn't work because I may be doing it wrong is doing
%LET steve='&steve';
but that doesn't seem to work and it just does an infinite loop on the data step for some reason.
A few points.
First the %SUPERQ() function wants the NAME of the macro variable to quote. So if you write:
%superq(&head)
the macro processor will evaluate the macro variable HEAD and use the value as the name of the macro variable whose value you want it to quote. Instead write that as:
%superq(head)
Second macro triggers are not evaluated inside of strings that use single quotes on the outside. So this statement:
%let steve='&steve';
will set the macro variable Steve to single quote, ampersand, s, t, .... single quote.
But note that if you macro quote the single quotes then they do not have that property of hiding text from the macro processor. So something like:
%str(%')%superq(head)%str(%')
or
%bquote('%superq(head)')
Will generate the value of the macro variable HEAD surrounded with quotes.
So you might get away with:
head = %bquote('%superq(head)') ;
Although sometimes that macro quoting can confuse the SAS compiler (especially inside of a macro) so now that you have the single quotes protecting the ampersand you might need to remove the macro quoting.
head = %unquote(%bquote('%superq(head)')) ;
But the real solution is not to use macro quoting at all.
Either pull the value using the SYMGET() function.
head = symget('head');
(make sure to set a length for the dataset variable HEAD or SAS will default it to $200 because of the function call).
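A small self-contained sketch of the SYMGET approach with an explicit length follows. The value Adam&Steve comes from the question; the %nrstr assignment and the length of 256 are assumptions made only so the example runs on its own (256 is the maximum length of a variable label).

```sas
/* stand-in for the value that PROC SQL would load into HEAD */
%let head = %nrstr(Adam&Steve);

data _null_;
  length head $256;      /* set the length before the function call */
  head = symget('head'); /* fetched at run time, so & is never re-scanned */
  put head=;
run;
```

Since the value is read while the step executes, the macro processor never sees the ampersand inside the data step code.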
Or better still just leave the label in a variable to begin with instead of trying to confuse yourself (and everyone else) by stuffing it into a macro variable just so you can then pull it back into a real variable.
PROC SQL NOPRINT;
create table label as SELECT LABEL
FROM dictionary.columns
WHERE LIBNAME='MYLIB' and MEMNAME='PROCSQLDATA' and UPCASE(NAME)=%UPCASE("&var.")
;
QUIT;
DATA want;
SET have;
if _n_=1 then set label;
head = label;
drop label;
run;
%SUPERQ takes the name of a macro variable as its argument and returns that variable's value, macro quoted.
You might have better long-term understanding if the symbol name used (head) is a name with more contextual meaning (such as var_label). Also, your DICTIONARY.COLUMNS query should include a criterion for libname=. Note: LIBNAME and MEMNAME values are always uppercase in DICTIONARY.COLUMNS. Name, which is the column name, can be mixed case and needs the upcase to compare for name equality.
PROC SQL NOPRINT;
SELECT LABEL INTO :var_label
FROM dictionary.columns
WHERE
LIBNAME = 'WORK' and
MEMNAME = 'PROCSQLDATA' and
UPCASE(NAME)=%UPCASE("&var.")
;
QUIT;
data labels;
set have;
head_label = "%superq(var_label)";
run;
An & in SUPERQ argument means the name of the macro variable, whose value is to be retrieved, is found as the value of a different macro variable.
%let p = %nrstr(Some value & That%'s that);
%let q = p;
%let v = %superq(&q);
%put &=v;
-------- LOG -------
V=Some value & That's that
&q resolves to p, so %superq retrieves the value of p for assignment to v.
Note: In some situations, you can retrieve the label of a variable in a running data step using the VLABEL or VLABELX functions.
data have;
label weight = 'Weight (kg)';
retain weight .;
weight_label_way1 = vlabel ( weight );
weight_label_way2 = vlabelx('weight');
run;
There will be a quoting function that can solve this, but I can never remember what they all do, and I find it best to avoid them, for my own sanity and that of my coworkers.
In this case, you don't need to resolve the macro variable into a string literal (ie head = "&head";) at all; you can use SYMGET:
DATA want;
SET have;
head = SYMGET('head');
RUN;
See the docs for the SYMGET function here:
https://documentation.sas.com/?docsetId=mcrolref&docsetTarget=n00cqfgax81a11n1oww7hwno4aae.htm&docsetVersion=9.4&locale=en
On an unrelated note, you should also read the 'DICTIONARY Tables and Performance' section at the end of this page:
https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.5&docsetId=sqlproc&docsetTarget=n02s19q65mw08gn140bwfdh7spx7.htm&locale=en
You might be surprised how much faster your first query will run if you remove the UPCASE functions from the WHERE clause.

Using SAS SET statement with numbered macro variables

I'm trying to create a custom transformation within SAS DI Studio to do some complicated processing which I will want to reuse often. In order to achieve this, as a first step, I am trying to replicate the functionality of a simple APPEND transformation.
To this end, I've enabled multiple inputs (max of 10) and am trying to leverage the &_INPUTn and &_INPUT_count macro variables referenced here. I would like to simply use the code
data work.APPEND_DATA / view=work.APPEND_DATA;
%let max_input_index = %sysevalf(&_INPUT_count - 1,int);
set &_INPUT0 - &&_INPUT&max_input_index;
keep col1 col2 col3;
run;
However, I receive the following error:
ERROR: Missing numeric suffix on a numbered data set list (WORK.SOME_INPUT_TABLE-WORK.ANOTHER_INPUT_TABLE)
because the macro variables are resolved to the names of the datasets they refer to, whose names do not conform to the format required for the
SET dataset1 - dataset9;
statement. How can I get around this?
Much gratitude.
You need to create a macro that loops through your list and resolves the variables. Something like
%macro list_tables(n);
%local i;
/* the inputs in the question are numbered from 0, so loop over 0..n */
%do i=0 %to &n;
&&_INPUT&i
%end;
%mend;
data work.APPEND_DATA / view=work.APPEND_DATA;
%let max_input_index = %sysevalf(&_INPUT_count - 1,int);
set %list_tables(&max_input_index);
keep col1 col2 col3;
run;
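To try the idea outside DI Studio, the macro variables can be simulated by hand. The sketch below assumes the naming used in the question (_INPUT0 upward, plus _INPUT_count). The two small tables and the macro name input_list are invented for the illustration.

```sas
/* hypothetical stand-ins for what DI Studio would supply */
data work.a; col1=1; col2=2; col3=3; run;
data work.b; col1=4; col2=5; col3=6; run;
%let _INPUT_count = 2;
%let _INPUT0 = work.a;
%let _INPUT1 = work.b;

%macro input_list;
  %local i;
  /* emits each dataset name in turn; no semicolons are generated */
  %do i = 0 %to %eval(&_INPUT_count - 1);
    &&_INPUT&i
  %end;
%mend input_list;

data work.APPEND_DATA / view=work.APPEND_DATA;
  set %input_list;
  keep col1 col2 col3;
run;
```

Here the SET statement ends up as set work.a work.b; which is an ordinary list of dataset names rather than a numbered range.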
The SET statement will need a list of the actual dataset names since they might not form a sequence of numeric suffixed names.
You could use a macro %DO loop if you are already running a macro. Make sure not to generate any semicolons inside the %DO loop.
set
%do i=1 %to &_inputcount ; &&_input&i %end;
;
But you could also use a data step to concatenate the names into a single macro variable that you could then use in the SET statement.
data _null_;
call symputx('_input1',symget('_input'));
length str $500 ;
do i=1 to &_inputcount;
str=catx(' ',str,symget(cats('_input',i)));
end;
call symputx('_input',str);
run;
data .... ;
set &_input ;
...
The extra CALL SYMPUTX() at the top of the data step will handle the case when count is one and SAS only creates the _INPUT macro variable instead of creating the series of macro variables with the numeric suffix. This will set _INPUT1 to the value of _INPUT so that the DO loop will still function.

SAS: Get number of variables in current data step

I need a way to dynamically return the number of variables in the current data step.
Using SAS NOTE 24671: Dynamically determining the number of observations and variables in a SAS data set, I have come up with the following macro.
%macro GetVarCount(dataset);
/* Open assigns ID to open data set. Assigns 0 if DNE */
%let exists = %sysfunc(open(&dataset));
%if &exists %then
%do;
%let returnValue = %sysfunc(attrn(&exists, nvars));
%let closed = %sysfunc(close(&exists));
%end;
/* Output error if no dataset */
%else %put %sysfunc(sysmsg());
&returnValue
%mend;
Unfortunately, this errors out on an initial pass of a data set since the data set has not yet been created. After the first pass, and a dataset with 0 observations has been created, the macro can access the table and the number of variables.
For instance,
data example;
input x y;
put "NOTE: [DEV] There are %GetVarCount(example) variables in the EXAMPLE data set.";
datalines;
1
2
;
run;
The first run produces:
ERROR: File WORK.EXAMPLE.DATA does not exist.
WARNING: Apparent symbolic reference RETURNVALUE not resolved.
NOTE: [DEV] There are &returnValue variables in the EXAMPLE data set.
The second run produces:
NOTE: [DEV] There are 2 variables in the EXAMPLE data set.
Is there a way to get the number of variables in a data set first time the data step is run?
In your example, you're trying to determine the number of active variables in a data step - this isn't necessarily the same as the number of variables that will be in the output data set, because (a) there might not be an output data set and (b) some of the variables might get dropped.
With that caveat in mind, if you really want to do that, then this works:
data fred;
length x y z $ 20 f g 8;
array vars_char _character_;
array vars_num _numeric_;
total_vars = dim(vars_char) + dim(vars_num);
put "Vars in data step: " total_vars;
run;
This works by using the special _character_ and _numeric_ keywords to create arrays of all character and numeric vars in the current buffer, and the dim() function to get the sizes of those arrays.
It will only count variables that exist when the arrays are declared, so it doesn't count total_vars in this case.
You could wrap this in a macro like:
%macro var_count(var_count_name);
array vars_char _character_;
array vars_num _numeric_;
&var_count_name = dim(vars_char) + dim(vars_num);
%mend;
and then use it like:
data fred;
length x y z $ 20 f g 8;
%var_count(total_vars);
put "Vars in data step: " total_vars;
run;
Try to open a dataset that has already been created.
The 'open' function requires the dataset being opened to already exist. I think you want 'open' to give you an ID for the dataset that the running data step already has open; that is not how it works.
It works on every pass after the first (not just the second) because the first pass created an empty dataset with metadata about the variables it contains.
Use a library to permanently store your dataset first and then try your macro to read from it:
Data <lib>.dataset;
update:
@Reeza already gave you the answer in the comments.
Another alternative:
Using put _all_; will print all the variables to the log, if you write the put into a file and then read it and count the '=' signs you can get the variable count too. Just remove _n_ and _ERROR_ from the count.

Get values of Macro Variables in a SAS Table

I have a set of input macro variables in SAS. They are dynamic and generated based on the user selection in a sas stored process.
For example:There are 10 input values 1 to 10.
The name of the macro variable is VAR_. If a user selects 2,5,7 then 4 macro variables are created.
&VAR_0=3;
&VAR_=2;
&VAR_1=5;
&VAR_2=7;
The first one with suffix 0 provides the count. The next 3 provides the values.
Note:If a user select only one value then only one macro variable is created. For example If a user selects 9 then &var_=9; will be created. There will not be any count macro variable.
I am trying to create a sas table using these variables.
It should be like this
OBS VAR
-----------
1 2
2 5
3 7
-----------
This is what I tried. Not sure if this is the right way to approach it.
It doesn't give me a final solution but I can at least get the names of the macro variables in a table. How can I get their values?
data tbl1;
do I=1 to &var_0;
VAR=CAT('&VAR_',I-1);
OUTPUT;
END;
RUN;
PROC SQL;
CREATE TABLE TBL2 AS
SELECT I,
CASE WHEN VAR= '&VAR_0' THEN '&VAR_' ELSE VAR END AS VAR
from TBL1;
QUIT;
Thank You for your help.
Jay
SAS helpfully stores them in a table for you already, you just need to parse out the ones you want. The table is called SASHELP.VMACRO or DICTIONARY.MACROS
Here's an example:
%let var=1;
%let var2=3;
%let var4=5;
proc sql;
create table want as
select * from sashelp.vmacro
where name like 'VAR%';
quit;
proc print data=want;
run;
I think the real issue is the inconsistent behavior of the stored process. It only creates the 0 and 1 variables when there are multiple selections. I think that your example is a little off. If the value of VAR_0 is three then there should be a VAR_3 macro variable. Also the values of VAR_ and VAR_1 should be set to the same thing.
To fix this in the past I have done something like this. First let's assign the parameter name a macro variable so that the code is reusable for other programs.
%let name=VAR_;
Then first make sure the minimal macro variables exist.
%global &name &name.0 &name.1 ;
Then make sure that you have a count by setting the 0 variable to 1 when it is empty.
%let &name.0 = %scan(&&&name.0 1,1);
Then make sure that you have a 1 variable. Since it should have the same value as the macro variable without a suffix just re-assign it.
%let &name.1 = &&&name ;
Now your data step is easier.
data want ;
length var $32 value $200 ;
do i=1 to &&&name.0 ;
var=cats(symget('name'),i);
value=symget(var);
output;
end;
run;
I don't understand your numbering scheme and recommend changing it, if you can; the &var_ variable is very confusing.
Anyway, the easiest way to do this is SYMGET. That returns a value from the macro symbol table which you can specify at runtime.
%let VAR_0=3;
%let VAR_=2;
%let VAR_1=5;
%let VAR_2=7;
data want;
do obs = 1 to &var_0.;
var = input(symget(cats('VAR_',ifc(obs=1,'',put(obs-1,2.)))),2.);
output;
end;
run;

Categorical variables with macro

I am trying to create categorical variables in sas. I have written the following macro, but I get an error: "Invalid symbolic variable name xxx" when I try to run. I am not sure this is even the correct way to accomplish my goal.
Here is my code:
%macro addvars;
proc sql noprint;
select distinct coverageid
into :coverageid1 - :coverageid9999999
from save.test;
%do i=1 %to &sqlobs;
%let n=coverageid&i;
%let v=%superq(&n);
%let f=coverageid_&v;
%put &f;
data save.test;
set save.test;
%if coverageid eq %superq(&v)
%then &f=1;
%else &f=0;
run;
%end;
%mend addvars;
%addvars;
You're combining macro code with data step code in a way that isn't correct. %if = macro language, meaning you are actually evaluating whether the text "coverageid" is equal to the text that %superq(&v) evaluates to, not whether the contents of the coverageid variable equal the value in &v. You could just convert %if to if, but even if you got that to work properly it would be hideously inefficient (you're rewriting the dataset N times, so if you have 1500 values for coverageID you rewrite the entire 500MB dataset or whatnot 1500 times, instead of just once).
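The difference between the two kinds of IF can be seen in a small sketch. The macro name if_demo and its parameter are hypothetical, and the data step creates its own coverageid so the example runs without the original dataset.

```sas
%macro if_demo(v);
  /* Macro %IF: compares the text "coverageid" with the text in &v */
  %if coverageid eq &v %then %put Macro comparison is true;
  %else %put Macro comparison is false;

  data _null_;
    coverageid = &v;
    /* Data step IF: compares the value of the variable with &v */
    if coverageid eq &v then put 'Data step comparison is true';
    else put 'Data step comparison is false';
  run;
%mend if_demo;

%if_demo(3)
```

The macro comparison is decided once, while the code is being generated, and never looks at any data. The data step comparison runs for each observation.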
If what you want to do is take the variable 'coverageid' and convert it to a set of 1/0 binary variables, one for each possible value of coverageid, there are a number of ways to do it. I'm fairly sure the ETS module has a procedure that just does this, but I don't recall it off the top of my head - if you were to post this to the SAS mailing list, one of the guys there would undoubtedly have it quickly.
The simple way for me, is to do this with entirely datastep code. First determine how many potential values there are for COVERAGEID, then assign each to a direct value, then assign the value to the correct variable.
If the COVERAGEID values are consecutive (ie, 1 to some number, no skips, or you don't mind skipping) then this is easy - set up an array and iterate over it. I will assume they are NOT consecutive.
*First, get the distinct values of coverageID. There are a dozen ways to do this, this works as well as any;
proc freq data=save.test;
tables coverageid/out=coverage_values(keep=coverageid);
run;
*Then save them into a format. This converts each value to a consecutive number (so the lowest value becomes 1, the next lowest 2, etc.) This is not only useful for this step, but it can be useful in the future in converting back.;
data coverage_values_fmt;
set coverage_values;
start=coverageid;
label=_n_;
fmtname='COVERAGEF';
type='i';
call symputx('CoverageCount',_n_);
run;
*Import the created format;
proc format cntlin=coverage_values_fmt;
quit;
*Now use the created format. If you had already-consecutive values, you could skip to this step and skip the input statement - just use the value itself;
data save.test_fin;
set save.test;
array coverageids coverageid1-coverageid&coveragecount.;
do _t = 1 to &coveragecount.;
if input(coverageid,COVERAGEF.) = _t then coverageids[_t]=1;
else coverageids[_t]=0;
end;
drop _t;
run;
Here's another way that doesn't use formats, and may be easier to follow.
First, just make some test data:
data test;
input coverageid @@;
cards;
3 27 99 105
;
run;
Next, create a data set with no observations but one variable for each level of coverageid. Note that this approach allows arbitrary values here.
proc transpose data=test out=wide(drop=_name_);
id coverageid;
run;
Finally, create a new data set that combines the initial data set and the wide one. Then, for each level of coverageid, look at each categorical variable and decide whether to turn it "on".
data want;
set test wide;
array vars{*} _:;
do i=1 to dim(vars);
vars{i} = (coverageid = substr(vname(vars{i}),2));
end;
drop i;
run;
The line
vars{i} = (coverageid = substr(vname(vars{i}),2));
may require more explanation. vname returns the name of the variable, and since we didn't specify a prefix in proc transpose, all variables are named something like _1, _2, etc. So we take the substring of the variable name that starts in the second position, and compare it to coverageid; if they're the same, we set the variable to 1; otherwise it evaluates to 0.