Can someone tell me what this SAS code does?

I am new to SAS. I am translating SAS code into another language and saw this:
IF FIRST.VAR THEN VAR1=VAR2; ELSE VAR1=VAR2*DUM;
VAR, VAR1, and VAR2 are variables in the dataset. DUM is not a variable in the dataset, so I assume it is something specific to the SAS language.
Any idea what this DUM does?
I haven't tried anything yet.

If you only post one line of code without any context there is no way anyone can truly help.
Here is what those two statements mean in a SAS data step.
FIRST.VAR exists only if VAR is included in the list of variables in the BY statement. In that case it is true on the first observation of each group defined by the current value of VAR.
So for the first observation in the group defined by the value of VAR, the variable VAR1 is assigned the value of VAR2. For any subsequent observation in the group, VAR1 is instead assigned the value of VAR2 times DUM.
In that code DUM is the name of a variable. If the variable does not exist, SAS will create it as a numeric variable and its value will be missing. If you perform multiplication and at least one of the values is missing, the result will also be missing.
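To see that behaviour for yourself, here is a minimal sketch. The dataset names HAVE and WANT and the sample values are made up for illustration; they are not from the original program, which presumably has its own SET and BY statements.

```sas
/* Hypothetical sample data, already sorted by VAR */
data have;
  input var $ var2;
  datalines;
A 10
A 20
B 30
B 40
;

data want;
  set have;
  by var;                        /* needed for FIRST.VAR to exist */
  if first.var then var1 = var2; /* first row of each VAR group */
  else var1 = var2 * dum;        /* DUM is not in HAVE, so it is numeric missing */
run;

proc print data=want;
run;
```

With this input, VAR1 should be 10 and 30 on the first row of each group and missing on the other rows, and the log should note that DUM is uninitialized. If DUM does exist in your real input dataset, the ELSE branch is an ordinary multiplication.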

Related

Difficulty understanding the "_n_" variable in SAS, and how it applies to a concatenate function

I am very new to SAS, and for whatever reason I am finding it difficult to decipher what the code block below does. I've googled and searched Stack Overflow to no avail. I'd appreciate any input, thanks!
set dataset;
id=cat("L",_n_);
run;
Probably there must be a data statement as well.
data newdataset;
set dataset;
id = cat("L", _n_);
run;
The code above creates a new dataset named newdataset from the existing dataset named dataset.
It also adds a new column called id, built by concatenating the constant character value "L" with the automatic variable _n_ using the CAT function. The automatic variable _n_ represents the number of times the DATA step has iterated.
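A small sketch of the same step that you can run as-is, assuming the SASHELP.CLASS sample dataset is available in place of your own dataset:

```sas
data newdataset;
  set sashelp.class;
  /* _n_ is numeric; CAT converts it to text and strips the blanks around it */
  id = cat("L", _n_);
run;

/* Look at the first few rows: id should read L1, L2, L3 */
proc print data=newdataset(obs=3);
  var id name;
run;
```

Because this step has a single SET statement and no subsetting, _n_ here lines up with the row number of the input, which is why the ids look like row labels.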

What's the difference in how the WHERE statement and the IF statement handle character variables in SAS 9.4M6?

Recently, I happened to find an interesting thing:
data test;
set sashelp.class;
if _N_ in (2 3 5 7) then name = '';
run;
data test;
set ;
where name;
run;
The result is: data Test has 15 observations.
I have used SAS 9.4M6 for over a year and don't remember that a where statement can treat a character variable like a boolean. When I simply change where to if, the result is different: the log says name is not a numeric variable, and data Test has 0 observations.
So here are my two questions:
1. Is where name; another way of writing where name is not missing? If not, what happens when I submit where name?
2. When did this kind of code (where <variable>;) start being allowed in SAS?
Thanks for any tips.
Yes, it is a test for non-empty values. I have used it for years, mainly when interactively browsing a dataset. I suspect it has been there since SAS introduced the WHERE statement, or at least since they refactored it to use the same syntax as SQL WHERE clauses.
WHERE statements use PROC SQL style syntax (LIKE is allowed, variable lists are not), but IF statements use normal SAS syntax.
So if you used
if name ;
in your data step then you would see notes about SAS trying to convert the character variable NAME into a number that it can evaluate as FALSE (zero or missing) or TRUE (any other value). Since the names in SASHELP.CLASS cannot be converted to numbers, all of them will be treated as FALSE.
Interesting find! I had not noticed this either. I tried this on both 9.4M4 and 9.4M5 and received the same results, so it may have been added even earlier than that. I could not find documentation on it either. From testing, here is what I am able to see:
where statements can automatically treat a character variable, as in where name;, as a boolean condition
if statements do not automatically convert characters to boolean and require the user to explicitly state the values to exclude, e.g.
Code:
data want;
set test ;
if(NOT missing(name));
run;
I cannot tell when this was added, but it is earlier than versions I have access to. Your test code would be a great quiz question!
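For comparison, here is a sketch that puts the two forms side by side on the data from the question. The output dataset names where_result and if_result are just illustrative.

```sas
data test;
  set sashelp.class;
  if _n_ in (2 3 5 7) then name = '';   /* blank out four names */
run;

/* WHERE with a bare character variable keeps the non-blank rows */
data where_result;
  set test;
  where name;
run;

/* A subsetting IF needs an explicit test to do the same thing */
data if_result;
  set test;
  if not missing(name);
run;
```

Both steps should keep the same 15 observations reported in the question, and the IF version avoids the character-to-numeric conversion notes that a bare if name; produces.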

How to use call symput on a specific observation in SAS

I'm trying to convert a SAS dataset column to a list of macro variables but am unsure of how indexing works in this language.
DATA _Null_;
do I = 1 to &num_or;
set CondensedOverrides4 nobs = num_or;
call symputx("Item" !! left(put(I,8.))
,"Rule", "G");
end;
run;
Right now this code creates a list of macro variables Item1,Item2,..ItemN etc. and assigns the entire column called "Rule" to each new variable. My goal is to put the first observation of "Rule" in Item1, the second observation in that column in Item2, etc.
I'm pretty new to SAS and understand you can't brute force logic in the same way as other languages but if there's a way to do this I would appreciate the guidance.
Much easier to create a series of macro variables using PROC SQL's INTO clause. You can save the number of items into a macro variable.
proc sql noprint;
select rule into :Item1-
from CondensedOverrides4
;
%let num_or=&sqlobs;
quit;
If you want to use a data step there is no need for a DO loop. The data step iterates over the inputs automatically. Put the code to save the number of observations into a macro variable BEFORE the set statement in case the input dataset is empty.
data _null_;
if eof then call symputx('num_or',_n_-1);
set CondensedOverrides4 end=eof ;
call symputx(cats('Item',_n_),rule,'g');
run;
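A quick way to check the result is to write the macro variables to the log. The sketch below uses a made-up three-row stand-in for CondensedOverrides4 (the Rule values are hypothetical) and passes 'g' for num_or as well, which is my assumption about the scope you want.

```sas
/* Hypothetical stand-in for CondensedOverrides4 */
data CondensedOverrides4;
  input Rule $;
  datalines;
R100
R200
R300
;

data _null_;
  /* Runs on the iteration after the last row is read, or at once if the input is empty */
  if eof then call symputx('num_or', _n_ - 1, 'g');
  set CondensedOverrides4 end=eof;
  call symputx(cats('Item', _n_), Rule, 'g');
run;

/* Write each name and value to the log */
%put &=num_or &=Item1 &=Item2 &=Item3;
```

The &= form of %PUT prints the name together with the value, which makes it easy to confirm that each Item macro variable holds one row of Rule.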
SAS does not need loops to access each row; the data step does it automatically. So your code is really close. Instead of I, use the automatic variable _n_, which can function as a row counter though it's actually a step counter.
DATA _Null_;
set CondensedOverrides4;
call symputx("Item" || put(_n_,8. -l) , Rule, "G");
run;
To be honest though, if you're new to SAS, starting with macro variables isn't recommended. There are usually multiple ways to avoid them, and I only use them if there's no other choice. The macro language is incredibly powerful, but easy to get wrong and harder to debug.
EDIT: I modified the code to remove the LEFT() function since you can use the -l option in the format argument of the PUT function to left align the results directly.
EDIT2: Removing the quotes around RULE since I suspect it's a variable you want to store the value of, not the text string 'RULE'. If you want the macro variables to resolve to a string you would add back the quotes but that seems incorrect based on your question.

explain what is happening in Proc Sql

select Name into :Dataset1-:Dataset%trim(%left(&DatasetNum)) from MEM;
I am not able to interpret what is happening here in this statement can anyone give me an explanation.
I understand this statement
select count(Name) into :DatasetNum from MEM
But not the above one.
It is attempting to use the value of the macro variable DATASETNUM as the upper bound on the macro variable names that are being created by the SELECT statement. Because the previous variable was created with leading spaces the %LEFT() macro is called to remove them. The call to the macro %trim() is not needed as trailing spaces would not cause any trouble.
It is much easier to just build the macro variable array first and then set the counter variable from the value of the automatic macro variable SQLOBS. Plus then it will not have the leading blanks.
select name into :Dataset1- from mem ;
%let DatasetNum=&sqlobs;
If you have an older version of SAS that doesn't support the new :varname- syntax then just use a large value for the upper bound. SAS will only create the number of macro variables it needs.
select name into :Dataset1-:Dataset99999 from mem;
This is creating an array of SAS macro variables (DATASET1, DATASET2, DATASET3, etc.), populated from the Name column of the MEM dataset.
It is analogous to:
data _null_;
set MEM;
call symputx(cats('Dataset',_n_),Name);
run;
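Once the macro variables exist, they are typically read back in a macro %DO loop. The sketch below assumes a tiny made-up MEM dataset and a hypothetical macro name, show_datasets; the &&Dataset&i form is what resolves the numbered names.

```sas
/* Hypothetical stand-in for the MEM dataset */
data mem;
  input Name $;
  datalines;
CLASS
CARS
;

proc sql noprint;
  select Name into :Dataset1- from mem;
  %let DatasetNum = &sqlobs;   /* number of rows the SELECT returned */
quit;

%macro show_datasets;
  %local i;
  %do i = 1 %to &DatasetNum;
    /* &&Dataset&i resolves first to &Dataset1, then to its value */
    %put Dataset&i resolves to &&Dataset&i;
  %end;
%mend show_datasets;

%show_datasets
```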

Cycling through all variables

I started to learn SAS fairly recently and am getting the basics down pretty well, but I have a question about something that is a little outside my current knowledge. Does anyone know of a way to cycle through all variables in a SAS dataset? I know how to run a do loop/array on variables in a range (x1-x99), but ideally I would like to look at every variable without having to rename any of them. Basically, I'm looking to run through a dataset and change variable values when the current value = 'True'/'False'. My guess is that I'll need to use proc contents in some way here, but I'm not really sure how to go about using it correctly. Any tips/insight would be greatly appreciated. Thanks!
You can create an array of non-similarly-named variables. You're on the right track with PROC CONTENTS, although you also can use dictionary.columns or sashelp.vcolumn, which contain basically the same information.
proc sql;
select name into :collist separated by ' '
from dictionary.columns
where memname='DATASETNAME' and libname='LIBNAME' and <other criteria>;
quit;
The variables have to be all of the same type (char/numeric) so you may want to include a criterion of variable type in your query, plus any other limiting factor you may need.
That will create a list, &collist., in a macro variable you can use in your array
array vars &collist.;
and now you can loop over the array.
You may also be able to cheat things, if all of your variables are the same type and you know the order is fixed. The double dash list (x1--x99) means 'in variable order, all variables from x1 to x99' and doesn't require numeric suffixes or anything like that.
Finally, you also might be able to write a format in PROC FORMAT to accomplish what you need, depending on what you are intending to do (mapping TRUE to 1 and FALSE to 0 or something like that).
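Putting the dictionary.columns query and the array together, a sketch for the True/False case might look like this. The dataset HAVE, its variables, and the replacement values 'Y'/'N' are all made up for illustration, and the type='char' filter is one possible choice for the extra criteria.

```sas
/* Hypothetical input with two character flag variables */
data have;
  input id flag1 $ flag2 $;
  datalines;
1 True False
2 False True
;

proc sql noprint;
  select name into :collist separated by ' '
  from dictionary.columns
  /* libname and memname are stored in upper case */
  where libname='WORK' and memname='HAVE' and type='char';
quit;

data want;
  set have;
  array vars {*} &collist.;          /* all character variables of HAVE */
  do i = 1 to dim(vars);
    if vars{i} = 'True' then vars{i} = 'Y';
    else if vars{i} = 'False' then vars{i} = 'N';
  end;
  drop i;
run;
```

Restricting the list to character variables keeps the array to a single type, which is the requirement mentioned above.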
Adding to Joe's answer: you can overcome the requirement that all variables be of the same type. For that you can use a macro loop instead of an array. First you need to define the macro:
%macro loop;
%do i=1 %to %sysfunc(countw(&collist));
....
<here goes your code for changing values, where instead of a variable name
you use macro function %scan(&collist,&i)>
....
%end;
%mend loop;
and now you can call %loop inside the DATA step where you're going to process all variables.
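Here is one filled-in sketch of that macro loop. It is only an illustration: the dataset HAVE and the counter n_missing are made up, and instead of changing values it counts missing values per row, because MISSING() accepts both character and numeric variables.

```sas
/* Hypothetical input with mixed variable types */
data have;
  input id name $ score;
  datalines;
1 Ann 90
2 . .
;

proc sql noprint;
  select name into :collist separated by ' '
  from dictionary.columns
  where libname='WORK' and memname='HAVE';
quit;

%macro loop;
  %local i;
  %do i = 1 %to %sysfunc(countw(&collist));
    /* generates one IF statement per variable name in &collist */
    if missing(%scan(&collist, &i)) then n_missing + 1;
  %end;
%mend loop;

data want;
  set have;
  n_missing = 0;   /* reset the per-row counter */
  %loop
run;
```

The macro only generates text, so the DATA step ends up containing one plain IF statement per variable; any statement you put inside the %DO loop has to be valid for every variable type in the list.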