SAS find non-zero minimum in row

Does anybody know how to find the non-zero minimum in a row using the min function in SAS? Or any other option in SAS code?
Current code:
PIP_factor = min(PIPAllAutos, PIPNotCovByWC, PIPCovByWC, PIPNotPrincOpByEmpls);

I think you need to use an array solution, i.e.:
array pipArray pip:; *or whatever;
PIP_factor=9999;
do _n = 1 to dim(pipArray);
if pipArray[_n] > 0 then
PIP_factor = min(PIP_factor,pipArray[_n]);
end;
Or somesuch.
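For anyone who wants to try the array idea end to end, here is a minimal sketch. It assumes the four variable names from the question and made-up sample values, and it starts PIP_factor at missing instead of 9999, relying on the fact that min() ignores missing arguments. Like the loop above, it tests for > 0, so negative values are skipped as well as zeros.

```sas
data _null_;
  /* sample values, made up for illustration */
  PIPAllAutos = 2;
  PIPNotCovByWC = .;
  PIPCovByWC = 0;
  PIPNotPrincOpByEmpls = 1;

  /* list the variables explicitly so PIP_factor cannot end up in the array */
  array pipArray[*] PIPAllAutos PIPNotCovByWC PIPCovByWC PIPNotPrincOpByEmpls;

  /* start at missing: min() ignores missing arguments, so no sentinel is needed */
  PIP_factor = .;
  do _n = 1 to dim(pipArray);
    /* a missing value compares lower than any number, so it fails this test too */
    if pipArray[_n] > 0 then PIP_factor = min(PIP_factor, pipArray[_n]);
  end;

  put PIP_factor=;
run;
```

If no element is positive, PIP_factor simply stays missing, which is easier to detect downstream than a magic number.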

Here is another way, using the IFN function:
data _null_;
PIPAllAutos = 2;
PIPNotCovByWC = .;
PIPCovByWC = 0;
PIPNotPrincOpByEmpls = 1;
PIP_factor = min(ifn(PIPAllAutos=0, . ,PIPAllAutos)
, ifn(PIPNotCovByWC=0, . ,PIPNotCovByWC)
, ifn(PIPCovByWC=0, . ,PIPCovByWC)
, ifn(PIPNotPrincOpByEmpls=0, . ,PIPNotPrincOpByEmpls)
);
put PIP_factor=;
run;
Note the min function ignores missing values; the ifn function sets zero values to missing.
Might be more typing than it's worth; offered only as an alternative. There are many ways to skin the cat.

This one doesn't suffer from the 9999 limitation of the approved answer.
%macro minnonzero/parmbuff;
%local _argn _args _arg;
/* get rid of external parenthesis */
%let _args=%substr(%bquote(&syspbuff),2,%length(%bquote(&syspbuff))-2);
%let _argn=1;
min(
%do %while (%length(%scan(%bquote(&_args),&_argn,%str(|))) ne 0);
%let _arg=%scan(%bquote(&_args),&_argn,%str(|));
%if &_argn>1 %then %do;
,
%end;
ifn(&_arg=0,.,&_arg)
%let _argn=%eval(&_argn+1);
%end;
);
%mend;
You call it with a pipe-separated list of arguments, e.g.
data piesek;
a=3;
b="kotek";
c=%minnonzero(a|findc(b,"z"));
put c; /* 3, "kotek" has no "z" in it, so findc returns 0 */
run;

/* For each row, find the variable name corresponding to the minimum value */
proc iml;
use DATASET; /* DATASET is your dataset name of interest */
read all var _NUM_ into X[colname=VarNames]; /* read in only numerical columns */
close DATASET;
idxMin = X[, >:<]; /* find columns for min of each row */
varMin = varNames[idxMin]; /* corresponding var names */
print idxMin varMin;
For Max:
idxMax = X[, <:>];
I wasn't familiar with the operator above; the SAS documentation provides a helpful table of IML operators.
In PROC IML you can also create new datasets/append the results to your old one if you need them later on.
Full blog post: source, all credit goes to Rick Wicklin at SAS
Edit: for the non-zero part, I would just use PROC SQL with a WHERE clause (variable is not 0) to filter the data before feeding it into PROC IML. I am sure it can be done within PROC IML, but I have only just started using it myself. So please comment if you know a way to do it in PROC IML and I will include the fix.

Related

how to put &&var&i into if condition

/*create macro variables*/
PROC SQL NOPRINT;
SELECT RESTRICTIONS
INTO :RESTRI1 - :RESTRI35
FROM SASDATA.RESTRICTIONLIST;
QUIT;
%PUT &RESTRI2;
/*the resolved value is: */
gender = 'M' and state = 'CA'
I want to create a data set sasdata.newlist&i when the ith restriction
is &&restri&i (e.g. gender = 'M' and state = 'CA').
I only want the observations that meet the restriction &&restri&i in this newly created data set.
However, sasdata.newlist2 contains all the data in sasdata.oldlist; the IF condition doesn't work. Can anybody help me solve this problem?
%Macro testing(I);
data sasdata.newlist&i;
set sasdata.oldlist;
%if &&restri&i %then;
run;
%mend testing;
%testing(2)
You are not resolving the macro variables in the proper context. When applying the restriction code, resolve it so it can be compiled (data step-wise) as part of the DATA step.
%Macro testing(I);
data sasdata.newlist&i;
set sasdata.oldlist;
/* %if &&restri&i %then; NO-no-no, incorrect context */
* apply ith restriction as a sub-setting IF statement;
if &&restri&i;
run;
%mend testing;
%testing(2)
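To see what the DATA step compiler actually receives, it can help to resolve the reference in open code first. This is only a sketch: the %let stands in for the PROC SQL INTO step, and oldlist here is a tiny made-up WORK data set, not the real sasdata.oldlist.

```sas
/* stand-in for the value PROC SQL would store in RESTRI2 */
%let restri2 = gender = 'M' and state = 'CA';
%let i = 2;

/* &&restri&i resolves in two passes: &&restri&i -> &restri2 -> the stored text */
%put Resolved restriction: &&restri&i;

/* made-up sample data for illustration */
data oldlist;
  length gender $1 state $2;
  input gender $ state $;
  datalines;
M CA
F CA
M NY
;
run;

/* after macro resolution the statement reads: if gender = 'M' and state = 'CA'; */
data newlist&i;
  set oldlist;
  if &&restri&i;
run;
```

With this sample data only the M/CA row should survive the subsetting IF. Turning on options mprint inside a macro shows the same generated statement in the log.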
Thanks, but it's still hard to tell when to use a macro statement and when not to.
For example: do I need to put % in front of the IF-THEN-ELSE and DO WHILE statements in the code below? By the way, can I use a "DO i = 1 TO n WHILE (condition)" statement here like this?
%MACRO FUNDSOURCE(I);
DATA SASDATA.STUDENT&I;
SET SASDATA.STUDENTLIST;
DO M = 1 TO 310 WHILE(&&BUDG&I > 0); /*loop through all observations_ALL
STUDENTS*/
IF &&BUDG&I LE 3000- FA_TOT1 THEN do;
DISBURSE = &&BUDG&I;
FA_TOT1+DISBURSE;
&&BUDG&I - DISBURSE;
end;
ELSE IF &&BUDG&I GT (3000- FA_TOT1) THEN DO;
DISBURSE = 3000-FA_TOT1;
FA_TOT1+DISBURSE;
&&BUDG&I - DISBURSE;
END;
END;
IF _n_ > M THEN DELETE; /*if budget are all gone, delete other observations,
keep observations only for the student who get funds*/
RUN;
%MEND FUNDSOURCE;
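On the DO ... WHILE part: as far as I know, the iterative DO statement in a DATA step does accept a WHILE clause, so no % is needed when the condition depends on values that change while the step runs. A minimal sketch, assuming the budget lives in an ordinary DATA step variable (budget and the figures are made up) rather than in a macro variable; a macro variable is resolved to text before the step executes, so it cannot be decremented inside the loop.

```sas
data _null_;
  budget = 7000;   /* hypothetical starting budget */
  cap    = 3000;   /* hypothetical per-student cap */

  /* plain DATA step loop: stops at 310 iterations or when the budget runs out */
  do m = 1 to 310 while (budget > 0);
    disburse = min(budget, cap);
    budget   = budget - disburse;
    put m= disburse= budget=;
  end;
run;
```

The rule of thumb this illustrates: %IF and %DO decide which SAS code gets generated, while IF and DO act on data values while that generated code runs.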

Splitting a SAS dataset into multiple datasets, according to value of one variable

Is there a more streamlined way of accomplishing this? This is a simplified example. In the real case there are > 10 values of var, each of which needs its own dataset.
data
new1
new2
new3;
set old;
if var = 'new1' then output new1;
else if var = 'new2' then output new2;
else if var = 'new3' then output new3;
run;
This should work. You just need to change the upper bound of the loop (the 5 in %to 5) to 10, or whatever the largest number is. The point made by @Reeza is a good one; I would take a look at that post as well, since it's an important suggestion. Splitting data like this is usually not a good way to handle it, but this should get you going.
data have;
input var $;
datalines;
new1
new2
new3
new4
new5
;
run;
*Actual code starts here;
%macro splitting;
%do i=1 %to 5;
%put "new&i";
proc sql;
create table table&i as
select *
from have
where var contains "new&i";
quit;
%end;
%mend splitting;
%splitting;
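If one pass over the data is preferred to one PROC SQL per value, the macro loop can instead generate the DATA statement and the OUTPUT logic from the question. A minimal sketch, assuming the values are exactly new1 to newN; the macro name split_one_pass and the sample data set old are made up for illustration.

```sas
/* made-up sample data, named old as in the question */
data old;
  input var $;
  datalines;
new1
new2
new3
new2
;
run;

%macro split_one_pass(n);
  /* generates: data new1 new2 ... newN; */
  data %do i = 1 %to &n; new&i %end; ;
    set old;
    select (var);
      /* generates one WHEN branch per target data set */
      %do i = 1 %to &n;
        when ("new&i") output new&i;
      %end;
      otherwise;  /* rows with any other value are dropped */
    end;
  run;
%mend split_one_pass;

%split_one_pass(3)
```

Unlike the contains filter above, this matches the value exactly, so new1 would not also pick up new10.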

Macro to loop through variables and store results SAS

I have the following variables: A_Bldg B_Bldg C_Bldg D_Bldg. I want to multiply them by INTSF and store the result in a new variable, Sale_i. For example, A_Bldg * INTSF = Sale_A, B_Bldg * INTSF = Sale_B, and so on.
My code is:
%macro loopit(mylist);
%let n=%sysfunc(countw(&mylist));
%do J = 1 %to &n;
%let i = %scan(&mylist,&J);
data test;
set data;
sale_&i. = &i._Bldg * INTSF;
run;
%end;
%mend;
%let list = A B C D;
%loopit(&list);
This only produces Sale_D, which is the last letter in the list. How do I get Sales A-C to appear? The first four lines of code are so I can loop through the text A-D. I thought about doing it with arrays, but didn't know how to choose the variables based on the A-D indicators. Thanks for your help!
You're currently looping through your list and recreating the test dataset every time, so it only appears to have sale_d because you're only viewing the last iteration.
You can clean up your loop by scanning through your list in one data step to solve your problem:
%let list = A B C D;
%macro loopit;
data test;
set data;
%do i = 1 %to %sysfunc(countw(&list.));
%let this_letter = %scan(&list., &i.);
sale_&this_letter. = &this_letter._Bldg * INTSF;
%end;
run;
%mend loopit;
%loopit;
Your %DO loop is in the wrong place. But really you do not need to use macro code to do something that the native SAS code can already do.
data want;
set have ;
array in A_Bldg B_Bldg C_Bldg D_Bldg ;
array out sale_A sale_B sale_C sale_D ;
do i=1 to dim(in);
out(i)=intsf*in(i);
end;
run;

Where is the missing operator in the SAS MIN function?

I'm just starting out in SAS and have run into some troubles. I want to get the number of observations from two data sets and assign those values to existing global macro variables. Then I want to find the smaller of the two. This is my attempt so far:
%GLOBAL nBlue = 0;
%GLOBAL nRed = 0;
%MACRO GetArmySizes(redData=, blueData=);
/* Takes in 2 Army Datasets, and outputs their respective sizes to nBlue and nRed */
data _Null_;
set &blueData nobs=j;
if _N_ =2 then stop;
No_of_obs=j;
call symput("nBlue",j);
run;
data _Null_;
set &redData nobs=j;
if _N_ =2 then stop;
No_of_obs=j;
call symput("nRed",j);
run;
%put &nBlue;
%put &nRed;
%MEND;
%put &nBlue; /* outputs 70 here */
%put &nRed; /* outputs 100 here */
%put %EVAL(min(1,5));
%GetArmySizes(redData=redTeam1, blueData=blueTeam); /* outputs 70\n100 here */
%put &nBlue; /* outputs 70 here */
%put &nRed; /* outputs 100 here */
%MACRO PrepareOneVOneArmies(redData=,numRed=,blueData=,numBlue=);
/* Takes in two army data sets and their sizes, and outputs two new army
data sets with the same number of observations */
%let smallArmy = %eval(min(&numRed,&numBlue));
%put &smallArmy;
%local numOneVOne;
%let numOneVOne = %eval(&smallArmy-%Eval(&nBlue - &nRed));
%put &numOneVOne;
data redOneVOne; set &redData (obs=&numOneVOne);
run;
data blueOneVOne; set &blueData (obs=&numOneVOne);
run;
%MEND;
%PrepareOneVOneArmies(redData=redTeam1,numRed=&nRed,blueData=blueTeam,numBlue=&nBlue);
/* stops executing when program gets to %let smallArmy =... */
redTeam1 is a data set with 100 observations, blueTeam has 70 observations.
I now run into the problem where whenever I call the function "Min" I get:
"ERROR: Required operator not found in expression: min(1,5)"
or
"ERROR: Required operator not found in expression: min(100,70)"
What am I missing?
"Min" seems like a simple enough function. Also, if it matters, I am using the University edition of SAS.
When you use a DATA step function in the macro language, you need to wrap it in %SYSFUNC(). This tells SAS that min is a reference to an actual function rather than just a word in the text; %EVAL on its own only understands integer arithmetic and logical operators.
%put %sysfunc(min(1,5));
Not related to your question, but reading the data set itself just to obtain its size is an inefficient method. Consider using the dictionary tables (e.g. SASHELP.VTABLE) instead.
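Putting both suggestions together, a sketch along these lines might work. It assumes the two data sets live in WORK; the sample data sets are generated here only so the example runs on its own, and TRIMMED and %put &= need SAS 9.3 or later.

```sas
/* made-up stand-ins for the two army data sets */
data blueTeam; do id = 1 to 70;  output; end; run;
data redTeam1; do id = 1 to 100; output; end; run;

/* read the observation counts from the dictionary table instead of the data */
proc sql noprint;
  select nobs into :nBlue trimmed
    from dictionary.tables
    where libname = 'WORK' and memname = 'BLUETEAM';
  select nobs into :nRed trimmed
    from dictionary.tables
    where libname = 'WORK' and memname = 'REDTEAM1';
quit;

/* MIN is a DATA step function, so the macro language needs %SYSFUNC to call it */
%let smallArmy = %sysfunc(min(&nRed, &nBlue));
%put &=nBlue &=nRed &=smallArmy;
```

Note that libname and memname are stored in upper case in the dictionary tables, so the comparison values have to be upper case too.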

Macro returning a value

I created the following macro. PROC POWER returns the table pw_out, which contains the column Power. The data _null_ step assigns the value in column Power of pw_out to the macro variable tpw. I want the macro to return the value of tpw, so that in the main program I can call it in a DATA step like this:
data test;
set tmp;
pw_tmp=ttest_power(meanA=a, stdA=s1, nA=n1, meanB=a2, stdB=s2, nB=n2);
run;
Here is the code of the macro:
%macro ttest_power(meanA=, stdA=, nA=, meanB=, stdB=, nB=);
proc power;
twosamplemeans test=diff_satt
groupmeans = &meanA | &meanB
groupstddevs = &stdA | &stdB
groupns = (&nA &nB)
power = .;
ods output Output=pw_out;
run;
data _null_;
set pw_out;
call symput('tpw'=&power);
run;
&tpw
%mend ttest_power;
@itzy is correct in pointing out why your approach won't work. But there is a solution that maintains the spirit of your approach: you need to create a power-calculation function using PROC FCMP. In fact, AFAIK, to call a procedure from within a function in PROC FCMP, you need to wrap the call in a macro, so you are almost there.
Here is your macro, slightly modified (mostly to fix the symput statement):
%macro ttest_power;
proc power;
twosamplemeans test=diff_satt
groupmeans = &meanA | &meanB
groupstddevs = &stdA | &stdB
groupns = (&nA &nB)
power = .;
ods output Output=pw_out;
run;
data _null_;
set pw_out;
call symput('tpw', power);
run;
%mend ttest_power;
Now we create a function that will call it:
proc fcmp outlib=work.funcs.test;
function ttest_power_fun(meanA, stdA, nA, meanB, stdB, nB);
rc = run_macro('ttest_power', meanA, stdA, nA, meanB, stdB, nB, tpw);
if rc = 0 then return(tpw);
else return(.);
endsub;
run;
And finally, we can try using this function in a data step:
options cmplib=work.funcs;
data test;
input a s1 n1 a2 s2 n2;
pw_tmp=ttest_power_fun(a, s1, n1, a2, s2, n2);
cards;
0 1 10 0 1 10
0 1 10 1 1 10
;
run;
proc print data=test;
run;
You can't do what you're trying to do this way. Macros in SAS are a little different from those in a typical programming language: they aren't subroutines that you can call, but rather just code that generates other SAS code, which then gets executed. Since you can't run proc power inside of a data step, you can't run this macro from a data step either. (Just imagine copying all the code inside the macro into the data step -- it wouldn't work. That's what a macro in SAS does.)
One way to do what you want would be to read each observation from tmp one at a time, and then run proc power. I would do something like this:
/* First count the observations */
data _null_;
call symputx('nobs',obs);
stop;
set tmp nobs=obs;
run;
/* Now read them one at a time in a macro and call proc power */
%macro power;
%do j=1 %to &nobs;
data _null_;
nrec = &j;
set tmp point=nrec;
call symputx('meanA',meanA);
call symputx('stdA',stdA);
call symputx('nA',nA);
call symputx('meanB',meanB);
call symputx('stdB',stdB);
call symputx('nB',nB);
stop;
run;
proc power;
twosamplemeans test=diff_satt
groupmeans = &meanA | &meanB
groupstddevs = &stdA | &stdB
groupns = (&nA &nB)
power = .;
ods output Output=pw_out;
run;
proc append base=pw_out_all data=pw_out; run;
%end;
%mend;
%power;
By using proc append you can store the results of each round of output.
I haven't checked this code so it might have a bug, but this approach will work.
You can invoke a macro which calls procedures, etc. (like the example) from within a datastep using call execute(), but it can get a bit messy and difficult to debug.
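For what it's worth, the call execute() route could look roughly like the sketch below. The macro show_row and the data set tmp are made up; in practice the macro body would hold the proc power and proc append steps. The generated calls run after the calling data step has finished, so nothing can be handed back to that step, which is a large part of why this gets messy.

```sas
/* hypothetical macro standing in for the PROC POWER wrapper */
%macro show_row(a=, b=);
  %put NOTE: called with a=&a b=&b;
%mend show_row;

/* made-up input data */
data tmp;
  a = 1; b = 2; output;
  a = 3; b = 4; output;
run;

data _null_;
  set tmp;
  /* %nrstr delays macro execution until after this step ends;   */
  /* single quotes keep the text from resolving at compile time. */
  call execute(cats('%nrstr(%show_row(a=', a, ', b=', b, '))'));
run;
```

Each observation queues one macro call, and the queued calls execute in order once the data _null_ step completes.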