proc means data=tableepisodes noprint;
output out=tableepisodes
mean(%ratings %dummies)=%ratings %dummies;
by ProgCodeID ProgSeasonCodeID year week
I was reading through some SAS code and I am not sure what the mean part of the code does.
Does it only take the mean of the %ratings variables and attach the %dummies variables to the output?
I would really appreciate help in understanding this code snippet.
That isn't a complete code snippet, and no.
It calculates the mean of the variables listed in %ratings AND %dummies, assuming of course that's what is included in those macros.
Without seeing the macro definitions we can't be sure of what it is actually doing.
As written, the code is going to evaluate the means of the variables that the macros %ratings and %dummies expand to. Taking ratings as an example, we're assuming it was defined earlier on as something like:
%macro ratings; good bad ugly %mend;
(Had it been defined with %let ratings = good bad ugly; it would be a macro variable and would be referenced as &ratings, not %ratings.)
So, when you pass it through the proc means, %ratings will expand to good bad ugly and SAS will take the means of all three variables.
You could have written the proc means step as:
proc means data = tableepisodes noprint;
by ProgCodeID ProgSeasonCodeID year week;
var good bad ugly;
output out = tableepisodes mean= / autoname;
run;
instead. (Also, note that you're overwriting your original dataset here, which you may want to avoid.)
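Here is a minimal sketch of how the original step could look when it writes to a new dataset instead of overwriting its input. The macro bodies, the variable names (good, bad, ugly, is_finale), the toy data and the output name episode_means are all assumptions made up for illustration, since the real definitions were not posted.

```sas
/* Hypothetical macros standing in for the asker's %ratings and %dummies */
%macro ratings; good bad ugly %mend ratings;
%macro dummies; is_finale %mend dummies;

/* Toy data, already sorted by the BY variables; the real tableepisodes is unknown */
data tableepisodes;
  input ProgCodeID week good bad ugly is_finale;
  datalines;
1 1 10 2 1 0
1 1 20 4 3 1
1 2 30 6 5 0
;
run;

/* Means of every variable in both lists, one output row per BY group */
proc means data=tableepisodes noprint;
  by ProgCodeID week;
  var %ratings %dummies;
  output out=episode_means mean(%ratings %dummies)=%ratings %dummies;
run;
```

With this toy data the row for ProgCodeID=1, week=1 should hold good=15 and is_finale=0.5, which shows that the dummies are averaged just like the ratings.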
I'm trying to convert a SAS dataset column to a list of macro variables but am unsure of how indexing works in this language.
DATA _Null_;
do I = 1 to &num_or;
set CondensedOverrides4 nobs = num_or;
call symputx("Item" !! left(put(I,8.))
,"Rule", "G");
end;
run;
Right now this code creates a list of macro variables Item1,Item2,..ItemN etc. and assigns the entire column called "Rule" to each new variable. My goal is to put the first observation of "Rule" in Item1, the second observation in that column in Item2, etc.
I'm pretty new to SAS and understand you can't brute force logic in the same way as other languages but if there's a way to do this I would appreciate the guidance.
Much easier to create a series of macro variables using PROC SQL's INTO clause. You can save the number of items into a macro variable.
proc sql noprint;
select rule into :Item1-
from CondensedOverrides4
;
%let num_or=&sqlobs;
quit;
If you want to use a data step there is no need for a DO loop. The data step iterates over the inputs automatically. Put the code to save the number of observations into a macro variable BEFORE the set statement in case the input dataset is empty.
data _null_;
if eof then call symputx('num_or',_n_-1);
set CondensedOverrides4 end=eof ;
call symputx(cats('Item',_n_),rule,'g');
run;
SAS does not need loops to access each row; the data step does it automatically. So your code is really close. Instead of I, use the automatic variable _n_, which can function as a row counter though it's actually a step (iteration) counter.
DATA _Null_;
set CondensedOverrides4;
call symputx("Item" || put(_n_,8. -l) , Rule, "G");
run;
To be honest though, if you're new to SAS, starting with macro variables isn't recommended. There are usually multiple ways to avoid them anyway, and I only use them if there's no other choice. The macro language is incredibly powerful, but easy to get wrong and harder to debug.
EDIT: I modified the code to remove the LEFT() function since you can use the -l alignment option in the PUT function to left align the result directly.
EDIT2: Removing the quotes around RULE since I suspect it's a variable you want to store the value of, not the text string 'RULE'. If you want the macro variables to resolve to a string you would add back the quotes but that seems incorrect based on your question.
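For what it's worth, once Item1..ItemN exist you would typically walk them with a %do loop and the &&Item&i double-ampersand reference. A rough sketch follows; the three-row stand-in for CondensedOverrides4 and the macro name show_items are made up for illustration, and num_or is only set here when the dataset has at least one row.

```sas
/* Toy stand-in for CondensedOverrides4 (hypothetical values) */
data CondensedOverrides4;
  length Rule $20;
  input Rule $;
  datalines;
alpha
beta
gamma
;
run;

/* One global macro variable per row, plus the row count */
data _null_;
  set CondensedOverrides4 end=eof;
  call symputx(cats('Item', _n_), Rule, 'G');
  if eof then call symputx('num_or', _n_, 'G');
run;

/* %do is only valid inside a macro, hence the wrapper */
%macro show_items;
  %local i;
  %do i = 1 %to &num_or;
    /* &&Item&i resolves in two passes: first to &Item1, then to its value */
    %put Item&i = &&Item&i;
  %end;
%mend show_items;
%show_items
```

The log should show Item1 = alpha, Item2 = beta and Item3 = gamma.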
I have written a simple macro and applied it in a simple SAS data step to illustrate an issue I am having with viewing output.
Macro:
%macro test_func(var=);
%put &var;
%mend;
Data step:
data test_data_step;
value = 0;
%test_func(var = value);
run;
My issue is that the output I see is just the string value rather than the value held in the variable whose name is equal to that string.
I believe I have a vague understanding as to why SAS is doing this, but I don't know how to get it to give the desired value (namely 0 in this case). So how would I be able to achieve that functionality?
Thanks!
The issue lies in the difference between %put and put.
Do you want to see the contents of the macro variable &var? Then use the macro statement %put &var;.
If, however, you want to see the contents of the SAS data step variable whose name is stored in &var, then use the data step statement put &var;.
As such I would rewrite this:
%macro test_func(var=);
put &var.;
%mend;
And now this works as you expect:
data test_data_step;
value = 0;
%test_func(var = value)
run;
(Note one other minor change - the removal of the ; after %test_func - it is unnecessary, and while usually not a big deal, it can cause problems if you get in the habit of putting it there.)
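To see the two side by side, here is a small sketch that issues both statements from the same macro. The label text is arbitrary and only there to tell the two log lines apart.

```sas
%macro test_func(var=);
  /* Runs while the step is being compiled: writes the text held in &var */
  %put Macro-level text: &var;
  /* Generated data step code: writes the value of the variable named in &var */
  put "Data step value: " &var;
%mend test_func;

data _null_;
  value = 0;
  %test_func(var=value)
run;
```

The log should contain "Macro-level text: value" first (the macro processor runs before the step executes) and then "Data step value: 0".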
So I have created a macro, which works perfectly fine. Within the macro, I set where the observation will begin reading, and then how many observations it will read.
But, in my proc print call, I am not able to simply do:
(firstobs=&start obs=&obs)
Because although firstobs will correctly start where I want it, obs does not cooperate, as it must be a higher number than firstobs. For example,
%testmacro(start=5, obs=3)
Does not work, because it is reading in the first 3 observations, but trying to start at observation 5. What I want the macro to do is, start at observation 5, and then read the next 3. So what I did is this:
(firstobs=&start obs=%eval((&obs-1)+&start))
This works perfectly fine when I use it. But I am just wondering if there is a simpler way to do this, rather than having to use the whole %eval... call. Is there one simple call, something like numberofobservations=...?
I don't think there is. You can only simplify the expression a little within the %eval():
%let start=5;
%let obs=3;
data want;
set sashelp.class (firstobs=&start obs=%eval(&obs-1+&start));
run;
Data set options are listed here:
http://support.sas.com/documentation/cdl/en/ledsoptsref/68025/HTML/default/viewer.htm#p0h5nwbig8mobbn1u0dwtdo0c0a0.htm
You could count the obs inside the data step using a counter and only outputting the records desired, but that won't work on something like proc print and isn't efficient for larger data steps.
You could try the point= option, but I'm not familiar with that method, and again I don't think it will work with proc print.
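If the goal is just to keep the arithmetic out of sight, one option is to hide it in a small helper macro. This is only a sketch: the macro name print_rows and its parameters data, start and n are hypothetical, not anything built into SAS.

```sas
%macro print_rows(data=, start=1, n=);
  %local last;
  /* OBS= is the number of the last row to read, not a row count */
  %let last = %eval(&start + &n - 1);
  proc print data=&data (firstobs=&start obs=&last);
  run;
%mend print_rows;

/* Start at observation 5 and print 3 rows (observations 5 through 7) */
%print_rows(data=sashelp.class, start=5, n=3)
```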
As @Reeza said - there is not a dataset option that will do what you are looking for. You need to calculate the ending observation unfortunately, and %eval() is about as good a way to do it as any.
On a side-note, I would recommend making your macro parameter more flexible. Rather than this:
%testmacro(start=5, obs=3)
Change it to take a single parameter which will be the list of data-set options to apply:
%macro testmacro(iDsOptions);
data want;
set sashelp.class (&iDsOptions);
run;
%mend;
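%testmacro(%str(firstobs=3 obs=7))
(The %str() is needed because the value contains an equals sign; without it the macro processor would treat firstobs as the name of a keyword parameter and raise an error.)
This provides more flexibility if you need to add in additional options later, which means fewer future code changes, and it's simpler to call the macro. You also defer figuring out the observation counts in this case to the calling program which is a good thing.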
My company just switched from R to SAS and I am converting a lot of my R code to SAS. I am having a huge issue dynamically declaring variables (macro variables) in SAS.
For example one of my processes needs to take in the mean of a column and then apply it throughout the code in many steps.
%let numm =0;
I have tried the following with my numm variable but both methods do not work and I cannot seem to find anything online.
PROC MEANS DATA = ASSGN3.COMPLETE mean;
#does not work
&numm = VAR MNGPAY;
run;
Proc SQL;
#does not work
&numm =(Select avg(Payment) from CORP.INV);
quit;
I would highly recommend buying a book on SAS or taking a class from SAS Training. SAS Programming II is a good place to start (Programming I if you have not programmed anything else, but that doesn't sound like the case). The code you have shows you need it. It is a complete paradigm shift from R.
That said, try this:
proc sql noprint;
select mean(payment) into :numm from corp.inv;
quit;
%put The mean is: &numm;
Here's the proc summary / data step equivalent:
proc summary data = corp.inv;
var payment;
output out = inv_summary mean=;
run;
data _null_;
set inv_summary;
call symputx('numm',payment);
run;
%put The mean is: &numm;
Proc sql is a more compact approach if you just want a simple arithmetic mean, but if you require more complex statistics it makes sense to use proc summary.
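To round this out, here is a sketch of using the stored mean in a later step, which is what the question was ultimately after. The toy dataset inv, the output name inv_centered and the new column diff_from_mean are assumptions for illustration; TRIMMED simply strips the padding blanks from the stored value.

```sas
/* Toy stand-in for corp.inv */
data inv;
  input payment;
  datalines;
100
200
300
;
run;

/* Store the mean as text in the macro variable numm */
proc sql noprint;
  select mean(payment) into :numm trimmed from inv;
quit;

/* &numm is substituted as text before this step compiles */
data inv_centered;
  set inv;
  diff_from_mean = payment - &numm;
run;
```

Keep in mind that a macro variable holds text, so a value passed this way may lose some precision compared with merging the summary dataset back in.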
I started to learn SAS fairly recently and am getting the basics down pretty well, but have a question regarding something that is a little outside of my current realm of knowledge. Does anyone happen to know of a way to cycle through all variables in a SAS dataset? I know how to run a do loop/array on variables in a range (x1-x99), but ideally would like to look at every variable without having to rename any variables. Basically, I'm looking to run through a dataset and change variable values when the current value = 'True'/'False'. My guess is that I'll need to use proc contents in some way here, but I'm not really sure how to go about using it correctly. Any tips/insight would be greatly appreciated. Thanks!
You can create an array of non-similarly-named variables. You're on the right track with PROC CONTENTS, although you also can use dictionary.columns or sashelp.vcolumn, which contain basically the same information.
proc sql;
select name into :collist separated by ' '
from dictionary.columns
where memname='DATASETNAME' and libname='LIBNAME' and <other criteria>;
quit;
The variables have to be all of the same type (char/numeric) so you may want to include a criterion of variable type in your query, plus any other limiting factor you may need.
That will create a list, &collist., in a macro variable you can use in your array
array vars &collist.;
and now you can loop over the array.
You may also be able to cheat, if all of your variables are the same type and you know the order is fixed. The double dash list (x1--x99) means 'all variables from x1 to x99, in the order they are stored in the dataset' and doesn't require numeric suffixes or anything like that.
Finally, you also might be able to write a format in PROC FORMAT to accomplish what you need, depending on what you are intending to do (mapping TRUE to 1 and FALSE to 0 or something like that).
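Putting the dictionary.columns query and the array together, a sketch of the whole thing might look like this. The dataset flags, its columns and the recoding to '1'/'0' are made up for illustration; the type='char' filter keeps the array to a single type.

```sas
/* Hypothetical toy data with two character flag columns */
data flags;
  length answer1 answer2 $5;
  input id answer1 $ answer2 $;
  datalines;
1 True False
2 False True
;
run;

/* Collect the names of all character variables (libname/memname are stored in upper case) */
proc sql noprint;
  select name into :collist separated by ' '
  from dictionary.columns
  where libname='WORK' and memname='FLAGS' and type='char';
quit;

/* Loop over every collected variable and recode it in place */
data flags_recoded;
  set flags;
  array vars {*} &collist;
  do i = 1 to dim(vars);
    if vars{i} = 'True' then vars{i} = '1';
    else if vars{i} = 'False' then vars{i} = '0';
  end;
  drop i;
run;
```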
Adding to Joe's answer: you can overcome the requirement that all variables should be of the same type. For that you can use a macro loop instead of an array. First you need to define the macro:
%macro loop;
%do i=1 %to %sysfunc(countw(&collist));
....
<here goes your code for changing values, where instead of a variable name
you use macro function %scan(&collist,&i)>
....
%end;
%mend loop;
and now you can paste %loop into the DATA step where you're going to process all variables.