Is it possible to nest a do until (eof) in a do until or do while loop? I have not been able to get it to work.
Data
data have;
do i = 1 to 3;
output;
end;
run;
Example - quits after first j
data want;
j = 1;
do while (j < 4);
do until (eof);
set have end=eof;
end;
call missing(eof);
output;
j + 1;
end;
run;
Let me know if I haven't made my problem clear enough. Thanks!
In general you can use any type of DO loop inside any other type.
Your particular program is going to stop the first time it tries to run the SET statement on the second iteration of the outer DO loop, since there are no more observations to be read. Most normal SAS data steps stop in the same way when they have exhausted their input stream.
If you really want to re-read the whole dataset then use the POINT= option on the SET statement. Make sure to have a way to stop the step since it will no longer be able to read past the end of the input stream.
data want;
do j=1 to 4;
do p=1 to nobs;
set have point=p nobs=nobs ;
* Do something there ;
end;
* Do something else here ;
end;
* and maybe something here too ;
stop;
run;
Related
I know I can use call execute to create and execute multiple data steps. But is there a way to generate code for a single data step with many repetitive lines of code?
In R for instance I can create a vector of variables, executing some paste/print statement and get something approximating the output I need. As follows:
strings<-c("Exkl_UtgUtl_Flyg",
"Exkl_UtgUtl_Tag",
"Exkl_UtgUtl_Farja",
"Exkl_UtgUtl_Hyrbil",
"Exkl_UtgUtl_Bo",
"Exkl_UtgUtl_Aktiv",
"Exkl_UtgUtl_Annat")
first_string<-strings[1]
other_strings<-strings[strings!=first_string]
gsub("\t", "",gsub("\n","",gsub(",","",paste0(
paste0("DATA IBIS3_5;
Set IBIS3_5;
if ",first_string,"=3 and hjalpvariabel=1 then do;",
first_string,"=1;",
paste0(gsub("Exkl_","",first_string),"SSEK_Pers")," = ",paste0(gsub("^Exkl_","",first_string),"SSEK_PPmedel"),";
end;
else if ",first_string,"=3 then ",first_string,"=2;"),
paste0("else if ",other_strings,"=3 and hjalpvariabel=1 then do;",
other_strings,"=1;",
paste0(gsub("Exkl_","",other_strings),"SSEK_Pers")," = ",paste0(gsub("^Exkl_","",other_strings),"SSEK_PPmedel"),";
end;
else if ",other_strings,"=3 then ",other_strings,"=2;", collapse=","),"run;"))))
I still have to delete the quotes and the bracketed number manually, but that's at least bearable. The final output looks something like this:
DATA IBIS3_5; Set IBIS3_5;
if Exkl_UtgUtl_Flyg=3 and hjalpvariabel=1 then do; Exkl_UtgUtl_Flyg=1;UtgUtl_FlygSSEK_Pers=UtgUtl_FlygSSEK_PPmedel; end; else if Exkl_UtgUtl_Tag=3 and hjalpvariabel=1 then do;Exkl_UtgUtl_Tag=1;UtgUtl_TagSSEK_Pers=UtgUtl_TagSSEK_PPmedel; end; else if Exkl_UtgUtl_Tag=3 then Exkl_UtgUtl_Tag=2;
else if Exkl_UtgUtl_Farja=3 and hjalpvariabel=1 then do;Exkl_UtgUtl_Farja=1;UtgUtl_FarjaSSEK_Pers=UtgUtl_FarjaSSEK_PPmedel; end; else if Exkl_UtgUtl_Farja=3 then Exkl_UtgUtl_Farja=2;
else if Exkl_UtgUtl_Hyrbil=3 and hjalpvariabel=1 then do;Exkl_UtgUtl_Hyrbil=1;UtgUtl_HyrbilSSEK_Pers=UtgUtl_HyrbilSSEK_PPmedel; end; else if Exkl_UtgUtl_Hyrbil=3 then Exkl_UtgUtl_Hyrbil=2;
else if Exkl_UtgUtl_Bo=3 and hjalpvariabel=1 then do;Exkl_UtgUtl_Bo=1;UtgUtl_BoSSEK_Pers=UtgUtl_BoSSEK_PPmedel; end; else if Exkl_UtgUtl_Bo=3 then Exkl_UtgUtl_Bo=2;
else if Exkl_UtgUtl_Aktiv=3 and hjalpvariabel=1 then do;Exkl_UtgUtl_Aktiv=1;UtgUtl_AktivSSEK_Pers=UtgUtl_AktivSSEK_PPmedel; end; else if Exkl_UtgUtl_Aktiv=3 then Exkl_UtgUtl_Aktiv=2;
else if Exkl_UtgUtl_Annat=3 and hjalpvariabel=1 then do;Exkl_UtgUtl_Annat=1;UtgUtl_AnnatSSEK_Pers=UtgUtl_AnnatSSEK_PPmedel; end; else if Exkl_UtgUtl_Annat=3 then Exkl_UtgUtl_Annat=2;
run;
Is there a way of generating this code in SAS, without having to resort to other programs?
I do not see the pattern in the code you want to generate so I will leave that part to your imagination. So for the purpose of discussion let's assume you want to generate code like:
var1=var1**2;
var2=var2**2;
You can use conditional logic in the data step that is generating the code via CALL EXECUTE() to generate the beginning (and possibly the ending) of the data step you want to generate.
One way is to test when you are on the first (or last) observation.
Example:
data _null_;
set variables end=eof;
if _n_=1 then call execute('data want; set have;');
call execute(cats(variable,'=',variable,'**2;'));
if eof then call execute('run;');
run;
You could also use that _n_=1 test to determine whether or not you need to generate the ELSE.
if _n_>1 then call execute(' else ');
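Putting those two tests together, here is a rough sketch of how the IF/ELSE IF chain from the question might be generated. It assumes a hypothetical dataset VARIABLES with one name per observation in a character variable VARIABLE, that every name starts with Exkl_, and that IBIS3_5 already exists. The variable STEM is only an illustration.

```sas
/* Hypothetical list of variable names, one per observation */
data variables;
  length variable $32;
  input variable;
datalines;
Exkl_UtgUtl_Flyg
Exkl_UtgUtl_Tag
Exkl_UtgUtl_Farja
;
run;

data _null_;
  set variables end=eof;
  length stem $32;
  /* drop the 5-character prefix Exkl_ to get the stem of the other names */
  stem = substr(variable, 6);
  if _n_=1 then call execute('data IBIS3_5; set IBIS3_5;');
  /* every block after the first is chained on with ELSE */
  if _n_>1 then call execute(' else ');
  call execute(catx(' ',
    'if', variable, '=3 and hjalpvariabel=1 then do;',
    variable, '=1;',
    cats(stem, 'SSEK_Pers'), '=', cats(stem, 'SSEK_PPmedel'), '; end;',
    'else if', variable, '=3 then', variable, '=2;'));
  if eof then call execute('run;');
run;
```

The generated step is pushed onto the input stack and only runs after this DATA _NULL_ step has finished, so the log shows the generated code after the notes for the generating step.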
Another way is to use a DO loop to read all of the observations.
data _null_;
call execute('data want; set have;');
do while(not eof);
set variables end=eof;
call execute(cats(variable,'=',variable,'**2;'));
end;
call execute('run;');
stop;
run;
Or if instead of using CALL EXECUTE() you write the code to a file you don't need to worry about generating the beginning or ending of the data step. You can just %INCLUDE the generated code into the middle of a data step.
filename code temp;
data _null_;
file code;
set variables end=eof;
put variable '=' variable '**2;' ;
run;
data want;
set have;
%include code / source2;
run;
The below writes the program to a file, then brings the file back into SAS for execution. This is how I avoid macros in almost all cases. After 13 years of macros I ended up with 5 ampersands one night and vowed to fix it:
* USE WHEN NOT DEBUGGING CODE: ;
* filename TEMP '$MYTEMPFILE';
* USE WHEN DEBUGGING CODE: ;
filename TEMP 'c:\temp\Test.sas';
data _null_ ;
file TEMP ;
set DICT end=eof;
if _n_ = 1 then
do ;
put 'data Africa ; '
/ ' attrib '
;
end;
if label = "" then
put @12 name @30 'label="XX_' name +(-1) '"' ;
else
put @12 name @30 'label="XX_' label +(-1)'"' ;
if eof then
do ;
put @12';'
/ ' set sashelp.SHOES;'
/ 'run;'
;
end;
run;
%include TEMP ;
Is the inner loop here (marked with ********) okay as is, or do I need to use something like %eval()? (I don't think I need %eval() because there are no macro variables.)
do _i = 1 to 5;
if sp_id_array{_i} ne . then do;
do _j = (_i+1) to 5; *********;
if sp_id_array{_j} ne . then do;
sp_id = sp_id_array{_i};
sp_partner_id = sp_id_array{_j};
output;
end;
end;
end;
end;
That's fine. The start value (_i+1) is an ordinary data step expression that SAS evaluates when the inner loop begins, so no %eval() is needed. In fact, you can even modify the loop control variable itself inside the loop.
data _null_;
do _i = 1 to 5;
put _i=;
_i=5;
end;
run;
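A minimal sketch of the same nesting pattern, with the bounds shrunk to 3 for brevity (the dataset name PAIRS is only for illustration). When _i reaches 3 the inner start value is 4, which already exceeds the stop value, so the inner loop simply does not run.

```sas
data pairs;
  do _i = 1 to 3;
    /* the start value is a plain expression, evaluated when the inner loop begins */
    do _j = (_i + 1) to 3;
      output;
    end;
  end;
run;
/* pairs holds three rows of (_i, _j): (1,2), (1,3), (2,3) */
```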
I am a beginner in SAS programming.
I have written a piece of code to understand how things work, but I don't understand why, after reaching the CONTINUE statement, the step still goes on to an OUTPUT statement.
Given below is the code :
data a B;
put 'entering do DATASTEP' ;
do i=1 to 4;
put 'entering do loop'" " i;
if (i=1) then do;
put 'value of i is 1'" " i;
put 'Entering the loop' ;
put j=_N_;
if _N_ = 2 then continue;
set sashelp.class(firstobs=1 obs=5);
put 'Ouside the loop';
output a;
end;
if (i=2) then do;
put 'value of i is 2'" " i;
put 'Entering the loop' ;
put j=_n_;
set sashelp.class(firstobs=6 obs=10);
put 'Ouside the loop';
output B;
end;
end;
put 'GETING OUT OF THE DATASTEP';
run;
To see what I mean, please run this; then we can discuss the output datasets and the log.
Thanks in advance.
It looks to me like the CONTINUE is working fine.
Normally SAS will stop the data step when you read past the end of the input data. Without the CONTINUE statement that would be when it tried to read from the first SET statement for the 6th time. But since you skipped it once it will stop when it tries to execute the second SET statement for the 6th time.
Here is a simplified version of what your data step is doing. Notice how it reads the records in 1,6,7,2,8,3,9,4,10,5 order.
data sample;
do i=1 to 10; output; end;
run;
data _null_ ;
if _n_^=2 then do;
set sample (firstobs=1 obs=5);
put i=;
end;
set sample (firstobs=6 obs=10);
put i=;
run;
i=1
i=6
i=7
i=2
i=8
i=3
i=9
i=4
i=10
i=5
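A smaller sketch may make the behavior of CONTINUE easier to see. It only skips the rest of the current iteration of the enclosing DO loop; it does not end the loop or the data step. In the posted step that means that when _N_=2 the i=1 block is abandoned, but the i=2 iteration still runs and reaches OUTPUT B.

```sas
data _null_;
  do i = 1 to 4;
    /* skip the remaining statements for i=2 only */
    if i = 2 then continue;
    put 'processing ' i=;
  end;
  put 'after the loop';
run;
/* the log shows i=1, i=3 and i=4, followed by: after the loop */
```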
Is it possible to repeat a data step a number of times (like you might in a %do-%while loop) where the number of repetitions depends on the result of the data step?
I have a data set with numeric variables A. I calculate a new variable result = min(1, A). I would like the average value of result to equal a target and I can get there by scaling variable A by a constant k. That is solve for k where target = average(min(1,A*k)) - where k and target are constants and A is a list.
Here is what I have so far:
filename f0 'C:\Data\numbers.csv';
filename f1 'C:\Data\target.csv';
data myDataSet;
infile f0 dsd dlm=',' missover firstobs=2;
input A;
init_A = A; /* store the initial value of A */
run;
/* read in the target value (1 observation) */
data targets;
infile f1 dsd dlm=',' missover firstobs=2;
input target;
K = 1; * initialise the constant K;
run;
%macro iteration; /* I need to repeat this macro a number of times */
data myDataSet;
retain key 1;
set myDataSet;
set targets point=key;
A = INIT_A * K; /* update the value of A */
result = min(1, A);
run;
/* calculate average result */
proc sql;
create table estimate as
select avg(result) as estimate0
from myDataSet;
quit;
/* compare estimate0 to target and update K */
data targets;
set targets;
set estimate;
K = K * (target / estimate0);
run;
%mend iteration;
I can get the desired answer by running %iteration a few times, but ideally I would like to run the iteration until (target - estimate0 < 0.01). Is such a thing possible?
Thanks!
I had a similar problem to this just the other day. The approach below is what I used; you will need to change the loop structure from a for loop to a do while loop (or whatever suits your purposes):
First perform an initial scan of the table to figure out your loop termination conditions and get the number of rows in the table:
data read_once;
set sashelp.class end=eof;
if eof then do;
call symput('number_of_obs', cats(_n_) );
call symput('number_of_times_to_loop', cats(3) );
end;
run;
Make sure results are as expected:
%put &=number_of_obs;
%put &=number_of_times_to_loop;
Loop over the source table again multiple times:
data final;
do loop=1 to &number_of_times_to_loop;
do row=1 to &number_of_obs;
set sashelp.class point=row;
output;
end;
end;
stop; * REQUIRED BECAUSE WE ARE USING POINT=;
run;
Two part answer.
First, it's certainly possible to do what you say. There are some examples of code that works like this available online, if you want a working, useful-code example of iterative macros; for example, David Izrael's seminal Rakinge macro, which performs a rimweighting procedure by iterating over a relatively simple process (proc freqs, basically). This is pretty similar to what you're doing. In the process it looks in the datastep at the various termination criteria, and outputs a macro variable that is the total number of criteria met (as each stratification variable separately needs to meet the termination criterion). It then checks %if that criterion is met, and terminates if so.
The core of this is two things. First, you should have a fixed maximum number of iterations, unless you like infinite loops. That number should be larger than the largest reasonable number you should ever need, often by around a factor of two. Second, you need convergence criteria such that you can terminate the loop if they're met.
For example:
data have;
x=5;
run;
%macro reduce(data=, var=, amount=, target=, iter=20);
data want;
set &data.;
run;
%let calc=.;
%let _i=0;
%do %until (&calc.=&target. or &_i.=&iter.);
%let _i = %eval(&_i.+1);
data want;
set want;
&var. = &var. - &amount.;
call symputx('calc',&var.);
run;
%end;
%if &calc.=&target. %then %do;
%put &var. reduced to &target. in &_i. iterations.;
%end;
%else %do;
%put &var. not reduced to &target. in &iter. iterations. Try a larger number.;
%end;
%mend reduce;
%reduce(data=have,var=x,amount=1,target=0);
That is a very simple example, but it has all of the same elements. I prefer to use do-until and increment on my own but you can do the opposite also (as %rakinge does). Sadly the macro language doesn't allow for do-by-until like the data step language does. Oh well.
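Applied to the question, a wrapper along these lines might work. This is only a sketch: the names ITERATE, MAXITER, TOL and DIFF are made up for illustration, and it assumes the %iteration macro and the TARGETS and ESTIMATE datasets from the question already exist. The tolerance test goes through %SYSEVALF because the implicit %EVAL in a %DO %UNTIL condition only does integer arithmetic, and it uses the absolute difference rather than the signed one in the question.

```sas
%macro iterate(maxiter=20, tol=0.01);
  %local i diff;
  %let i = 0;
  %let diff = 1;
  %do %until (%sysevalf(&diff. < &tol.) or &i. >= &maxiter.);
    %let i = %eval(&i. + 1);
    /* one pass of the three steps from the question */
    %iteration
    /* hand the current gap between target and estimate back to the macro loop */
    data _null_;
      set estimate;
      set targets(keep=target);
      call symputx('diff', abs(target - estimate0));
    run;
  %end;
  %put NOTE: stopped after &i. iteration(s), diff=&diff.;
%mend iterate;

%iterate(maxiter=20, tol=0.01)
```

The MAXITER cap plays the role of the fixed maximum number of iterations described above, so the loop ends even if the estimate never gets within TOL of the target.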
Secondly, you can often do things like this inside a single data step. Even in older versions (9.2 etc.), you can do all of what you ask above in a single data step, though it might look a little clunky. In 9.3+, and particularly 9.4, there are ways to run that proc sql inside the data step and get the result back without waiting for another data step, using RUN_MACRO or DOSUBL and/or the FCMP language. Even something simple, like this:
data have;
initial_a=0.3;
a=0.3;
target=0.5;
output;
initial_a=0.6;
a=0.6;
output;
initial_a=0.8;
a=0.8;
output;
run;
data want;
k=1;
do iter=1 to 20 until (abs(target-estimate0) < 0.001);
do _n_ = 1 to nobs;
if _n_=1 then result_tot=0;
set have nobs=nobs point=_n_;
a=initial_a*k;
result=min(1,a);
result_tot+result;
end;
estimate0 = result_tot/nobs;
k = k * (target/estimate0);
end;
output;
stop;
run;
That does it all in one data step. I'm cheating a bit because I'm writing my own data step iterator, but that's fairly common in this sort of thing, and it is very fast. Macros iterating multiple data steps and proc sql steps will be much slower typically as there is some overhead from each one.
Given a data step like this:
data tmp;
do i=1 to 10;
if 3<i<7 then do;
some stuff;
end;
end;
run;
I want to write to the log how many times the if statement is true. For example, in this example, I want to have a line in the log that says:
If statement true 3 times
because the condition is true when i is 4, 5, or 6. How can I do this?
Using retain to keep a counter variable, it's pretty easy to increment a count of how many times an if condition was met.
data tmp;
retain Counter 0;
do i=1 to 10;
if 3<i<7 then do;
Counter+1;
*some stuff;
end;
end;
put 'If statement true ' Counter 'time(s).';
run;
Note that this writes to the log only once because the PUT statement is the last thing that executes before the data step terminates (the step itself runs only once in this example). If you wanted to do this for a data step that iterates more than once (e.g. when there is a SET statement reading data in from another dataset), you'd want to tell SAS to report only at the end of the step. You'd do it like this:
* create an example input data set;
data exampleData;
do i=1 to 10;
output;
end;
run;
* use a variable 'eof' to indicate the end of the input dataset;
data new;
set exampleData end=eof;
retain Counter 0;
if 3<i<7 then do;
Counter+1;
*some stuff;
end;
if eof then put 'If statement true ' Counter 'time(s).';
run;
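As a side note, the sum statement (Counter+1) already retains its variable and initializes it to 0, so the explicit RETAIN is optional. Here is a sketch of the first example without it, using PUTLOG so the message goes to the log even if a FILE statement is in effect:

```sas
data tmp;
  do i=1 to 10;
    if 3<i<7 then do;
      Counter+1;  /* sum statement: retained automatically, starts at 0 */
    end;
  end;
  putlog 'If statement true ' Counter 'time(s).';
run;
```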