I have a list of SAS datasets which I want to sort by the same variable.
I do not want to write a separate PROC SORT step for each one of them.
Is there a way to use a loop to shorten the code?
I am new to SAS so please help!
%let prim =sasdata.qc_no_rx ;
%let other_removals = sasdata.qc_other_removals;
%let drops =sasdata.droplist;
Array data_1(3) $ sasdata.qc_no_rx sasdata.qc_other_removals
sasdata.droplist ;
do over data_1;
Proc sort data = data_1 ;
by ims_ref;
end;
Assume you have a data set called dname_list that holds the data set names in a variable called dname. CALL EXECUTE will generate the code and execute it.
I usually create my command in a string and then pass that to call execute. This is a data _null_ step so it doesn't generate a data set but you can generate the data set to test at first if necessary.
You don't need to loop because SAS loops through the records in a data set by itself.
If you're sorting data in a library make sure to include the library name as well.
data _null_;
*data dname_execute;
set dname_list;
string = catt('proc sort data=', dname, '; by age; run;');
call execute(string);
run;
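For the three data sets in the question, a minimal sketch could look like the following. The control data set dname_list and its variable dname are the assumed names from this answer; the library sasdata and the BY variable ims_ref are taken from the question.

```sas
/* Hypothetical control data set: one data set name per row */
data dname_list;
   length dname $41;   /* libref (8) + dot + member name (32) */
   input dname $;
   datalines;
sasdata.qc_no_rx
sasdata.qc_other_removals
sasdata.droplist
;
run;

/* Build one PROC SORT step per row and queue it with CALL EXECUTE */
data _null_;
   set dname_list;
   call execute(catt('proc sort data=', dname, '; by ims_ref; run;'));
run;
```

The generated PROC SORT steps run after the data _null_ step finishes, one per row of dname_list.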
This should help:
%macro multsort(indlist,outdlist,byvarlist,ndata);
%do i = 1 %to &ndata.;
%let indata = %scan(&indlist., &i., %str( ));
%let outdata = %scan(&outdlist., &i., %str( ));
%let byvars = %scan(&byvarlist., &i., %str( ));
proc sort data = &indata. out=&outdata.;by &byvars. ;run;
%end;
%mend;
%multsort(indlist=sashelp.Air sashelp.Buy,outdlist=Sa Sb,byvarlist=Air amount,ndata=2);
I'm just starting out in SAS and have run into some troubles. I want to get the number of observations from two data sets and assign those values to existing global macro variables. Then I want to find the smaller of the two. This is my attempt so far:
%GLOBAL nBlue = 0;
%GLOBAL nRed = 0;
%MACRO GetArmySizes(redData=, blueData=);
/* Takes in 2 Army Datasets, and outputs their respective sizes to nBlue and nRed */
data _Null_;
set &blueData nobs=j;
if _N_ =2 then stop;
No_of_obs=j;
call symput("nBlue",j);
run;
data _Null_;
set &redData nobs=j;
if _N_ =2 then stop;
No_of_obs=j;
call symput("nRed",j);
run;
%put &nBlue;
%put &nRed;
%MEND;
%put &nBlue; /* outputs 70 here */
%put &nRed; /* outputs 100 here */
%put %EVAL(min(1,5));
%GetArmySizes(redData=redTeam1, blueData=blueTeam); /* outputs 70\n100 here */
%put &nBlue; /* outputs 70 here */
%put &nRed; /* outputs 100 here */
%MACRO PrepareOneVOneArmies(redData=,numRed=,blueData=,numBlue=);
/* Takes in two army data sets and their sizes, and outputs two new army
data sets with the same number of observations */
%let smallArmy = %eval(min(&numRed,&numBlue));
%put &smallArmy;
%local numOneVOne;
%let numOneVOne = %eval(&smallArmy-%Eval(&nBlue - &nRed));
%put &numOneVOne;
data redOneVOne; set &redData (obs=&numOneVOne);
run;
data blueOneVOne; set &blueData (obs=&numOneVOne);
run;
%MEND;
%PrepareOneVOneArmies(redData=redTeam1,numRed=&nRed,blueData=blueTeam,numBlue=&nBlue);
/* stops executing when program gets to %let smallArmy =... */
redTeam1 is a data set with 100 observations, blueTeam has 70 observations.
I now run into the problem where whenever I call the function "Min" I get:
"ERROR: Required operator not found in expression: min(1,5)"
or
"ERROR: Required operator not found in expression: min(100,70)"
What am I missing?
"Min" seems like a simple enough function. Also, if it matters, I am using the University edition of SAS.
When you use a DATA step function in macro language, you need to wrap it in %SYSFUNC(). This tells SAS that min is a reference to an actual function rather than just a word of text.
%put %sysfunc(min(1,5));
Not related to your question, but to obtain the size of a data set you do not need to read it in a DATA step. Consider querying the dictionary table DICTIONARY.TABLES (also available as the view SASHELP.VTABLE) instead.
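As a rough sketch of both points, assuming the two data sets from the question live in the WORK library and that your SAS release supports the TRIMMED option of INTO (SAS 9.3 or later):

```sas
/* Read the observation counts from DICTIONARY.TABLES.
   Library WORK is an assumption; memname must be in upper case. */
proc sql noprint;
   select nobs into :nBlue trimmed
      from dictionary.tables
      where libname = 'WORK' and memname = 'BLUETEAM';
   select nobs into :nRed trimmed
      from dictionary.tables
      where libname = 'WORK' and memname = 'REDTEAM1';
quit;

/* DATA step functions such as MIN need %SYSFUNC in macro code */
%let smallArmy = %sysfunc(min(&nRed, &nBlue));
%put smallArmy=&smallArmy;
```

Because INTO creates the macro variables itself, no %GLOBAL statement is needed when this runs in open code.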
I have a data set with one row for each country and 100 columns (10 variables with 10 data years each).
For each variable I am trying to make a new data set with the three most recent data years for that variable for each country (which might not be successive).
This is what I have so far, but I know it's wrong because of the nested loop: it gives the same value for recent_1, recent_2 and recent_3. However, I haven't figured out how to create recent_1, recent_2 and recent_3 without two loops.
%macro test();
data Maternal_care_recent;
set wb;
keep country MATERNAL_CARE_2004 -- MATERNAL_CARE_2013 recent_1 recent_2 recent_3;
%let rc = 1;
%do i = 2013 %to 2004 %by -1;
%do rc = 1 %to 3 %by 1;
%if MATERNAL_CARE_&i. ne . %then %do;
recent_&rc. = MATERNAL_CARE_&i.;
%end;
%end;
%end; run; %mend; %test();
You don't need to use a macro to do this - just some arrays:
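data Maternal_care_recent;
set wb;
keep country MATERNAL_CARE_2004-MATERNAL_CARE_2013 recent_1-recent_3;
/* index the array by year so mc[2013] is MATERNAL_CARE_2013 */
array mc {2004:2013} MATERNAL_CARE_2004-MATERNAL_CARE_2013;
array recent {3} recent_1-recent_3;
rc = 0;
/* walk back from the latest year and stop once three values are found */
do i = 2013 to 2004 by -1 while (rc < 3);
if mc[i] ne . then do;
rc = rc + 1;
recent[rc] = mc[i];
end;
end;
run;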
Maybe I don't get your request, but according to your description:
"For each variable I am trying to make a new data set with the three most recent data years for that variable for each country (which might not be successive)" I created this sample dataset with dt1 and dt2 and 2 locations.
The output will be two data sets named DS1 and DS2 (in general, one per variable whose name starts with DT), each with up to 3 observations per city: DS1 for the first variable, DS2 for the second.
This is the sample dataset:
data sample_ds;
length city $10 dt1 dt2 8.;
infile datalines dlm=',';
input city $ dt1 dt2;
datalines;
MS,5,0
MS,3,9
MS,3,9
MS,2,0
MS,1,8
MS,1,7
CA,6,1
CA,6,.
CA,6,.
CA,2,8
CA,1,5
CA,0,4
;
This is the sample macro:
%macro help(ds=);
data vars(keep=dt:); set &ds; if _n_ not >0; run;
%let op = %sysfunc(open(vars));
%let nvrs = %sysfunc(attrn(&op,nvars));
%let cl = %sysfunc(close(&op));
%do idx=1 %to &nvrs.;
proc sort data=&ds(keep=city dt&idx.) out=ds&idx.(where=(dt&idx. ne .)) nodupkey; by city DESCENDING dt&idx.; run;
data ds&idx.; set ds&idx.;
retain cnt;
by city DESCENDING dt&idx.;
if first.city then cnt=0; else cnt=cnt+1;
run;
data ds&idx.(drop=cnt); set ds&idx.(where=(cnt<3)); rename dt&idx.=act&idx.; run;
%end;
%mend;
You will run this macro with:
%help(ds=sample_ds);
In the first statement of the macro I select the variables on which I want to iterate:
data vars(keep=dt:); set &ds; if _n_ not >0; run;
Work on this if you want to make this work for your code, or simply rename your variables as DT1 DT2...
Let me know if it is correct for you.
When writing macro code, always keep in mind what happens when. SAS processes your code stepwise:
First, before your SAS code is even compiled, your macro variables are resolved and your macro code is executed.
Then the resulting Base SAS code is compiled.
Finally the compiled code is executed.
When you write %if MATERNAL_CARE_&i. ne . %then %do, this is macro code, interpreted before compilation.
At that time MATERNAL_CARE_&i. is not a variable but a text string containing a macro variable reference.
The first time you run through your %do i = 2013 %to 2004 %by -1 loop, it is filled in as MATERNAL_CARE_2013, the second time as MATERNAL_CARE_2012, etc.
Then the macro %if statement is interpreted, and as the text string MATERNAL_CARE_2013 is never equal to a dot, the condition is always TRUE,
so recent_&rc. = MATERNAL_CARE_&i. is passed to the compiler for every year and every &rc., whether or not the value is missing.
You can see that if you run your code with options mprint;
The solution:
options mprint;
%macro test();
data Maternal_care_recent;
set wb;
keep country MATERNAL_CARE_: recent_:;
** The : acts as a wild card here **;
%do i = 2013 %to 2004 %by -1;
if MATERNAL_CARE_&i. ne . then do;
%do rc = 1 %to 3 %by 1;
recent_&rc. = MATERNAL_CARE_&i.;
%end;
end;
%end;
run;
%mend;
%test();
Now, before compilation of if MATERNAL_CARE_&i. ne . then do, only the &i. is evaluated, and if MATERNAL_CARE_2013 ne . then do is passed to the compiler.
The compiler will see this as a test of whether the SAS variable MATERNAL_CARE_2013 is missing, and that is just what you wanted.
Remark:
It is not essential that I moved the if statement above the %do rc loop. It is just more efficient because the condition is then evaluated less often.
It is however essential that you close your %ifs and %dos with an %end and your ifs and dos with an end;
Remark:
You do not need %let rc = 1, because %do rc = 1 %to 3 already initialises &rc.
For completeness: SAS code is compiled stepwise.
The next PROC or data step and its macro code are only considered once the previous one has executed.
That is why you can write macro variables from a data step or an SQL select into that will influence the code you compile in your next step,
something you cannot do, for instance, with C++ precompilation.
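A small sketch of that stepwise behaviour, using the sample data set SASHELP.CLASS that ships with SAS. The macro variable name n_rows and the output data set first_half are made up for illustration.

```sas
/* Step 1 runs to completion and stores a value in a macro variable */
data _null_;
   set sashelp.class end=last;
   if last then call symputx('n_rows', _n_);
run;

/* Step 2 is only compiled after step 1 has executed,
   so &n_rows already has a value when this code is read */
data first_half;
   set sashelp.class (obs=%eval(&n_rows / 2));
run;
```

If both references sat inside the same data step, &n_rows would be resolved before that step ran and would not yet hold the value.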
Thanks everyone. Found a hybrid solution from a few solutions posted.
data sample_ds;
infile datalines dlm=',';
input country $ maternal_2004 maternal_2005
maternal_2006 maternal_2007 maternal_2008 maternal_2009 maternal_2010 maternal_2011 maternal_2012 maternal_2013;
datalines;
MS,5,0,5,0,5,.,5,.,5,.
MW,3,9,5,0,5,0,5,.,5,0
WE,3,9,5,0,5,.,.,.,.,0
HU,2,0,5,.,5,.,5,0,5,0
MI,1,8,5,0,5,0,5,.,5,0
HJ,1,7,5,0,5,0,.,0,.,0
CJ,6,1,5,0,5,0,5,0,5,0
CN,6,1,.,5,0,5,0,5,0,5
CE,6,5,0,5,0,.,0,5,.,8
CT,2,5,0,5,0,5,0,5,0,9
CW,1,5,0,5,0,5,.,.,0,7
CH,0,5,0,5,0,.,0,.,0,5
;
%macro test(var);
data &var._recent;
set sample_ds;
keep country &var._1 &var._2 &var._3;
array mc {*} &var._2004-&var._2013;
array recent {*} &var._1-&var._25;
count=1;
do i = 10 to 1 by -1;
if mc[i] ne . then do;
recent[count] = mc[i];
count=count+1;
end;
end;
run;
%mend;
I have a table like this:
Lista_ID 1 4 7 10 ...
in total there are 100 numbers.
I want to pass each one of these numbers to a macro I created. I was trying to use 'scan' but read that it's just for character variables.
Here is the code; the error I got when I ran it is shown below it:
proc sql;
select ID INTO: LISTA_ID SEPARATED BY '*' from
WORK.AMOSTRA;
run;
PROC SQL;
SELECT COUNT(*) INTO: NR SEPARATED BY '*' FROM
WORK.AMOSTRA;
RUN;
%MACRO CICLO_teste();
%LET LIM_MSISDN = %EVAL(NR);
%LET I = %EVAL(1);
%DO %WHILE (&I<= &LIM_MSISDN);
%LET REF = %SCAN(LISTA_ID,&I,,'*');
DATA WORK.UP&REF;
SET WORK.BASE&REF;
FORMAT PERC_ACUM 9.3;
IF FIRST.ID_CLIENTE THEN PERC_ACUM=0;
PERC_ACUM+PERC;
RUN;
%LET I = %EVAL(&I+1);
%END;
%MEND;
%CICLO_TESTE;
The error was:
VARIABLE PERC IS UNINITIALIZED and
VARIABLE FIRST.ID_CLIENTE IS UNINITIALIZED.
What I want is to run this macro for each one of the IDs in the list I showed before, which are referenced in work.base&ref and work.up&ref.
How can I do it? What am I doing wrong?
thanks!
Here's the CALL EXECUTE version.
%MACRO CICLO_teste(REF);
DATA WORK.UP&REF;
SET WORK.BASE&REF;
BY ID_CLIENTE;
FORMAT PERC_ACUM 9.3;
IF FIRST.ID_CLIENTE THEN PERC_ACUM=0;
PERC_ACUM+PERC;
RUN;
%MEND CICLO_teste;
DATA _NULL_;
SET amostra;
*CREATE YOUR MACRO CALL;
STR = CATT('%CICLO_teste(', ID, ')');
CALL EXECUTE(STR);
RUN;
First, note that SAS macro variable resolution is intrinsically a text-based copy-paste action. That is, all user-defined macro variables hold text. Therefore, %eval is unnecessary in this case.
Other miscellaneous corrections include:
Check the %scan() function for correct usage. The first argument should be a text string WITHOUT QUOTES, here the resolved macro variable &lista_id rather than the literal name LISTA_ID.
run is redundant in proc sql, since each SQL statement is run as soon as it is submitted. Use quit; to exit proc sql.
A semicolon is not required after a macro call (it sometimes causes unexpected problems).
Use %do ... %to for loops.
The code below should work.
data work.amostra;
input id;
cards;
1
4
7
10
;
run;
proc sql noprint;
select id into :lista_id separated by ' ' from work.amostra;
select count(*) into :nr separated by ' ' from work.amostra;
quit;
* check;
%put lista_id=&lista_id nr=&nr;
%macro ciclo_teste();
%local ref;
%do i = 1 %to &nr;
%let ref = %scan(&lista_id, &i);
%*check;
%put ref = &ref;
/* your task below */
/* data work.up&ref;*/
/* set work.base&ref;*/
/* format perc_acum 9.3;*/
/* if first.id_cliente then perc_acum=0;*/
/* perc_acum + perc;*/
/* run; */
%end;
%mend;
%ciclo_teste()
tested on SAS 9.4 win7 x64
Edited:
In fact I would recommend doing this to avoid scanning a long string which is inefficient.
%macro tester();
/* get the number of obs (a more efficient way) */
%local NN;
proc sql noprint;
select nobs into :NN
from dictionary.tables
where upcase(libname) = 'WORK'
and upcase(memname) = 'AMOSTRA';
quit;
/* assign &ref by random access */
%do i = 1 %to &NN;
data _null_;
a = &i;
set work.amostra point=a;
call symputx('ref',id,'L');
stop;
run;
%*check;
%put ref = &ref;
/* your task below */
%end;
%mend;
%tester()
Please let me know if you have further questions.
Wow that seems like a lot of work. Why not just do the following:
data work.amostra;
input id;
cards;
1
4
7
10
;
run;
%macro test001;
proc sql noprint;
select count(*) into: cnt
from amostra;
quit;
%let cnt = &cnt;
proc sql noprint;
select id into: x1 - :x&cnt
from amostra;
quit;
%do i = 1 %to &cnt;
%let x&i = &&x&i;
%put &&x&i;
%end;
%mend test001;
%test001;
Now the macro variables &x1 through &&x&cnt hold your values and you can process them however you like.
In general, if your list is small enough (macro variables are limited to 64K characters), you are better off passing the list in a single delimited macro variable instead of multiple macro variables. Remember that PROC SQL automatically stores the row count in the macro variable SQLOBS, so there is no need to run the query twice. Or you can use %sysfunc(countw()) to count the number of entries in your delimited list.
proc sql noprint ;
select id into :idlist separated by '|' from .... ;
%let nr=&sqlobs;
quit;
...
%do i=1 %to &nr ;
%let id=%scan(&idlist,&i,|);
data up&id ;
...
%end;
If you do generate multiple macro variables there is no need to set the upper bound in advance as SAS will only create the number of macro variables it needs based on the number of observations returned by the query.
select id into :idval1 - from ... ;
%let nr=&sqlobs;
If you are using an older version of SAS then you need to set an upper bound on the macro variable range.
select id into :idval1 - :idval99999 from ... ;
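The %sysfunc(countw()) alternative mentioned above could be sketched as follows. The macro name loop_ids and its parameter are hypothetical, and the %put stands in for whatever step you want to run per ID.

```sas
%macro loop_ids(idlist);
   %local i id n;
   /* Count the pipe-delimited entries instead of querying twice */
   %let n = %sysfunc(countw(&idlist, |));
   %do i = 1 %to &n;
      %let id = %scan(&idlist, &i, |);
      %put NOTE: processing id=&id;
   %end;
%mend loop_ids;

/* Example call with the four IDs from the question */
%loop_ids(1|4|7|10)
```

Declaring the loop variables with %local keeps them from overwriting macro variables of the same name in the calling environment.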
I have following dataset
data parm2;
input a b c d e;
datalines;
1 2 3 4 A
;
run;
Problem 1: I would like to create a set of macro variables. Assume I do not know the number of fields or their names.
Problem 2: The fields are not all of the same data type.
The desired operation is like the following:
data _null_;
set parm2;
call symput('a',a);
call symput('b',b);
call symput('c',c);
call symput('d',d);
call symput('e',e);
run;
%put &a;
If this is the structure of your data, I would transpose:
proc transpose data=parm2 out=parmt;
var _all_;
run;
Then reference the two columns to create all the macro variables and their corresponding values:
data _null_;
set parmt;
call symput(_name_,col1);
run;
After some research I found the following solution. It is not perfect, but worth sharing. Looking forward to @Reeza's answer.
data _null_;
set parm2;
array t(*) _numeric_; /*this deal with different data type*/
do i = 1 to dim(t);
call symput(vname(t[i]), t[i]);
end;
array t2(*) _character_;
do i = 1 to dim(t2);
call symput(vname(t2[i]), t2[i]);
end;
run;
Here's a Call VNEXT solution with VVALUEX, assuming you don't have a variable that has the same name as an automatic variable it seems to work. Derived solution from SAS Note: http://support.sas.com/kb/24/798.html
data parm2;
input a b c d e $;
datalines;
1 2 3 4 A
;
run;
data _null_;
set parm2;
length name $32;
*temporarily set name to not missing to start loop;
name='blank';
do while(name ne " ");
call vnext(name);
/* Omit automatic variables, and variables created in this step only */
if trim(name) not in('list','name','flag','i',' ','_ERROR_','_N_') then
call symput(name, vvaluex(name));
end;
run;
%put &a;
%put &b;
%put &c;
%put &d;
%put &e;
[ Edited - some codes or lines of code are marked with * as the OP does not require it ]
Use the PROC SQL dictionary tables to get the names of the variables contained in your data set, by specifying memname and libname.
Use a data step to store the variable names in macro variables. The names are stored in the column called name, which is why name is passed as the second argument of call symputx. The macro variable Total holds the number of variables in your data set.
Now you would have variable1=a, variable2=b, ...
%macro definevar ( library, dataset);
proc sql;
create table Attribute as
select * from dictionary.columns
where memname = upcase("&dataset") and libname = upcase("&library");
quit;
data letmacro;
set Attribute end=end;
/* 'G' stores variable1, variable2, ... in the global symbol table so they survive the macro */
call symputx( cats('variable', _n_), name, 'G');
* if end then call symputx ( 'Total', _n_, 'G');
run;
/*
***** extra ********
data _null_;
set &dataset ;
%do i=1 to &total;
call symputx ( "var&i" !! left(_n_), &&variable&i );
%end;
run;
***** extra ********
*/
%mend definevar;
%definevar( ifanylibrary, parm2)
And I am looking forward to learning the CALL VNEXT solution by @Reeza.