I'm trying to create a custom text report from my SAS code. The code is below:
data have ;
ncandidates=1; ngames=3; controlppt=1; controlgame=2;
ppt1='Abc'; ppt2='Bcd';
infile cards dsd dlm='|';
input (var1-var21) ($);
cards;
1|2|a|1|3|b1|2|a|1|3|b1|2|a|1|3|b1|2|a|1|3|b
1|2|a|1|3|b1|2|a|1|3|b1|2|a|1|3|b1|2|a|1|3|b
;
filename report 'myreport.txt';
data _null_;
file report dsd dlm='|' LRECL=8614;
a='';
put
83*'#'
/ '##### Number of ppts'
/ 83*'#'
/ 'input.Name=' @
;
eof = 0;
do until(eof);
set have end=eof;
If not missing(var1) then
put var1-var10 @@ ;
end;
put a
// 83*'#'
/ '##### Output Data'
/ 83* '#'
// 'output.Name=' @;
eof=0;
do until(eof);
set have ;
If not missing(var11) then
put var11-var20 @@ ;
end;
put '1';
run;
Everything gets printed to the file except for the last put '1';
Nothing after the second do until block gets executed.
Also, if I add end=eof to the last do until block, then everything gets printed twice.
Is there a way around this?
I am not sure about the cause of the issue, but SAS sometimes behaves oddly when a dataset is read several times in one step, as you do here. Using a different variable for the second loop (set have end=eof2;) and ending the step with stop; resolves the problem:
data have ;
ncandidates=1; ngames=3; controlppt=1; controlgame=2;
ppt1='Abc'; ppt2='Bcd';
infile cards dsd dlm='|';
input (var1-var21) ($);
cards;
1|2|a|1|3|b1|2|a|1|3|b1|2|a|1|3|b1|2|a|1|3|b
1|2|a|1|3|b1|2|a|1|3|b1|2|a|1|3|b1|2|a|1|3|b
;
filename report '~/myreport.txt';
data _null_;
file report dsd dlm='|' LRECL=8614;
a='';
put
83*'#'
/ '##### Number of ppts'
/ 83*'#'
/ 'input.Name=' @
;
eof = 0;
do until(eof);
set have end=eof;
If not missing(var1) then
put var1-var10 @@ ;
end;
put a
// 83*'#'
/ '##### Output Data'
/ 83* '#'
// 'output.Name=' @;
eof2=0;
do until(eof2);
set have end=eof2;
If not missing(var11) then
put var11-var20 @@ ;
end;
put '1';
stop;
run;
A 'plain' DATA step stops when a read is attempted after the last record of a set has been read. This typically happens during the implicit loop that is inherent in the magic of a DATA step. When you loop over a set explicitly with end-of-data checks, the read attempt beyond the last record does not occur, and thus does not implicitly end the step.
The eof flag is changed only when the end of data is reached. It is not set to 0 when not at end of data -- the eof flag is simply what it is at the start of the loop. Thus the flag needs to be reset if reused for a subsequent loop.
* 'top' is logged twice;
* the data step ends when the second implicit iteration tries to read past eof of the first set;
data _null_;
put 'top';
do until (eof);
set sashelp.class(obs=2) end=eof;
put name=;
end;
eof = 0; * reset flag;
do until (eof);
set sashelp.class(where=(name=:'J')) end=eof;
put name=;
end;
run;
* 'top' is logged once;
* the data step ends when the stop is reached at the bottom;
data _null_;
put 'top';
do until (eof);
set sashelp.class(obs=2) end=eof;
put name=;
end;
eof = 0;
do until (eof);
set sashelp.class(where=(name=:'J')) end=eof;
put name=;
end;
stop; * end the step here so the implicit loop does not run again;
run;
I need help with the following. My input dataset is as follows:
If one of the values in the QC column is a FAIL, all of the values in the last column 'Final' should be REPEAT, irrespective of what other values are found in the QC column. Desired output dataset:
Thank you.
The following code does not give expected results as no condition is specified for other qc values.
data exp;
set exp;
if QC = "FAIL" then do;
FINAL= "REPEAT";
end;
run;
You need to read the data twice. The first time to figure out if there are any QC failures. The second time to get the records again so you can attach the new variable and write them to the output dataset. The first pass can stop as soon as you find any failure.
data want ;
do while(not eof1);
set have end=eof1;
if qc = 'FAIL' then do;
final='REPEAT';
eof1=1;
end;
end;
do while(not eof2);
set have end=eof2;
output;
end;
stop;
run;
You have to process the entire data set to determine if no change is needed.
This example checks whether FAIL occurs in any row and, if so, changes the values to REPEAT.
data have;
id+1;
input value qc $ @@;
datalines;
. FAIL 1 PASS 0 PASS 1 PASS . FAIL
;
%let repeat_flag = 0;
data _null_;
set have;
where qc = 'FAIL';
call symputx ('repeat_flag',1);
stop;
run;
%if &repeat_flag %then %do;
data have;
set have;
qc = 'REPEAT';
run;
%end;
To set everything to REPEAT depending on the number of FAILs, you can add a condition to the DATA _NULL_ step.
data _null_;
set have;
where qc = 'FAIL';
if _n_ >= 3;
call symputx ('repeat_flag',1);
stop;
run;
I know I can use call execute to create and execute multiple data steps. But is there a way to generate code for one singular data step, with many repetitive lines of code?
In R for instance I can create a vector of variables, executing some paste/print statement and get something approximating the output I need. As follows:
strings<-c("Exkl_UtgUtl_Flyg",
"Exkl_UtgUtl_Tag",
"Exkl_UtgUtl_Farja",
"Exkl_UtgUtl_Hyrbil",
"Exkl_UtgUtl_Bo",
"Exkl_UtgUtl_Aktiv",
"Exkl_UtgUtl_Annat")
first_string<-strings[1]
other_strings<-strings[strings!=first_string]
gsub("\t", "",gsub("\n","",gsub(",","",paste0(
paste0("DATA IBIS3_5;
Set IBIS3_5;
if ",first_string,"=3 and hjalpvariabel=1 then do;",
first_string,"=1;",
paste0(gsub("Exkl_","",first_string),"SSEK_Pers")," = ",paste0(gsub("^Exkl_","",first_string),"SSEK_PPmedel"),";
end;
else if ",first_string,"=3 then ",first_string,"=2;"),
paste0("else if ",other_strings,"=3 and hjalpvariabel=1 then do;",
other_strings,"=1;",
paste0(gsub("Exkl_","",other_strings),"SSEK_Pers")," = ",paste0(gsub("^Exkl_","",other_strings),"SSEK_PPmedel"),";
end;
else if ",other_strings,"=3 then ",other_strings,"=2;", collapse=","),"run;"))))
I still have to delete the quotes and the bracketed number manually, but that's at least bearable. The final output looks something like this:
DATA IBIS3_5; Set IBIS3_5;
if Exkl_UtgUtl_Flyg=3 and hjalpvariabel=1 then do; Exkl_UtgUtl_Flyg=1;UtgUtl_FlygSSEK_Pers=UtgUtl_FlygSSEK_PPmedel; end; else if Exkl_UtgUtl_Tag=3 and hjalpvariabel=1 then do;Exkl_UtgUtl_Tag=1;UtgUtl_TagSSEK_Pers=UtgUtl_TagSSEK_PPmedel; end; else if Exkl_UtgUtl_Tag=3 then Exkl_UtgUtl_Tag=2;
else if Exkl_UtgUtl_Farja=3 and hjalpvariabel=1 then do;Exkl_UtgUtl_Farja=1;UtgUtl_FarjaSSEK_Pers=UtgUtl_FarjaSSEK_PPmedel; end; else if Exkl_UtgUtl_Farja=3 then Exkl_UtgUtl_Farja=2;
else if Exkl_UtgUtl_Hyrbil=3 and hjalpvariabel=1 then do;Exkl_UtgUtl_Hyrbil=1;UtgUtl_HyrbilSSEK_Pers=UtgUtl_HyrbilSSEK_PPmedel; end; else if Exkl_UtgUtl_Hyrbil=3 then Exkl_UtgUtl_Hyrbil=2;
else if Exkl_UtgUtl_Bo=3 and hjalpvariabel=1 then do;Exkl_UtgUtl_Bo=1;UtgUtl_BoSSEK_Pers=UtgUtl_BoSSEK_PPmedel; end; else if Exkl_UtgUtl_Bo=3 then Exkl_UtgUtl_Bo=2;
else if Exkl_UtgUtl_Aktiv=3 and hjalpvariabel=1 then do;Exkl_UtgUtl_Aktiv=1;UtgUtl_AktivSSEK_Pers=UtgUtl_AktivSSEK_PPmedel; end; else if Exkl_UtgUtl_Aktiv=3 then Exkl_UtgUtl_Aktiv=2;
else if Exkl_UtgUtl_Annat=3 and hjalpvariabel=1 then do;Exkl_UtgUtl_Annat=1;UtgUtl_AnnatSSEK_Pers=UtgUtl_AnnatSSEK_PPmedel; end; else if Exkl_UtgUtl_Annat=3 then Exkl_UtgUtl_Annat=2;
run;
Is there a way of generating this code in SAS, without, having to resort to other programs?
I do not see the pattern in the code you want to generate so I will leave that part to your imagination. So for the purpose of discussion let's assume you want to generate code like:
var1=var1**2;
var2=var2**2;
You can use conditional logic in the data step that is generating the code via CALL EXECUTE() to generate the beginning (and possibly the ending) of the data step you want to generate.
One way is to test when you are on the first (or last) observation.
Example:
data _null_;
set variables end=eof;
if _n_=1 then call execute('data want; set have;');
call execute(cats(variable,'=',variable,'**2;'));
if eof then call execute('run;');
run;
You could also use that _n_=1 test to determine whether or not you need to generate the ELSE.
if _n_>1 then call execute(' else ');
Another way is to use a DO loop to read all of the observations.
data _null_;
call execute('data want; set have;');
do while(not eof);
set variables end=eof;
call execute(cats(variable,'=',variable,'**2;'));
end;
call execute('run;');
stop;
run;
Or if instead of using CALL EXECUTE() you write the code to a file you don't need to worry about generating the beginning or ending of the data step. You can just %INCLUDE the generated code into the middle of a data step.
filename code temp;
data _null_;
file code;
set variables end=eof;
put variable '=' variable '**2;' ;
run;
data want;
set have;
%include code / source2;
run;
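Applied to the IF/ELSE chain from the question, the file-based approach might look like the sketch below. It is only an illustration under some assumptions: the dataset VARIABLES and the helper variable STEM are made up for the example, every name is assumed to start with the 5-character prefix Exkl_, and IBIS3_5 and hjalpvariabel are taken from the question and must already exist.

```sas
/* Hypothetical driver dataset: one row per variable name from the question */
data variables;
  input variable :$32.;
datalines;
Exkl_UtgUtl_Flyg
Exkl_UtgUtl_Tag
Exkl_UtgUtl_Farja
Exkl_UtgUtl_Hyrbil
Exkl_UtgUtl_Bo
Exkl_UtgUtl_Aktiv
Exkl_UtgUtl_Annat
;

filename code temp;

data _null_;
  file code;
  set variables;
  length stem $32;
  stem = substr(variable, 6);        /* assumes the prefix is exactly 'Exkl_' */
  if _n_ > 1 then put 'else ' @;     /* chain every block after the first */
  put 'if ' variable '=3 and hjalpvariabel=1 then do;'
    / '  ' variable '=1;'
    / '  ' stem +(-1) 'SSEK_Pers=' stem +(-1) 'SSEK_PPmedel;'
    / 'end;'
    / 'else if ' variable '=3 then ' variable '=2;';
run;

/* The generated statements are pulled into the middle of one data step */
data IBIS3_5;
  set IBIS3_5;
  %include code / source2;
run;
```

The +(-1) pointer moves back over the blank that list-style PUT writes after each value, so the suffix is glued directly onto the stem. With source2 the generated statements are echoed to the log, which makes it easy to check them before trusting the result.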
The code below writes the program to a file and then brings the file back into SAS for execution. This is how I avoid macros in almost all cases. After 13 years of macros I ended up with 5 ampersands one night and vowed to fix that:
* USE WHEN NOT DEBUGGING CODE: ;
* filename TEMP '$MYTEMPFILE';
* USE WHEN DEBUGGING CODE: ;
filename TEMP 'c:\temp\Test.sas';
data _null_ ;
file TEMP ;
set DICT end=eof;
if _n_ = 1 then
do ;
put 'data Africa ; '
/ ' attrib '
;
end;
if label = "" then
put @12 name @30 'label="XX_' name +(-1) '"' ;
else
put @12 name @30 'label="XX_' label +(-1)'"' ;
if eof then
do ;
put @12 ';'
/ ' set sashelp.SHOES;'
/ 'run;'
;
end;
run;
%include TEMP ;
I want to create a variable that holds the character before a specified character (*) in a string. However, if the specified character appears several times in the string (as in the example below), how can I build one variable that concatenates all of the preceding characters, separated by commas?
Example:
data have;
infile datalines delimiter=",";
input string :$20.;
datalines;
ABC*EDE*,
EFCC*W*d*
;
run;
Code:
data want;
set have;
cnt = count(string, "*");
_startpos = 0;
do i=0 to cnt until(_startpos=0);
before = catx(",",substr(string, find(string, "*", _startpos+1)-1,1));
end;
drop i _startpos;
run;
That code outputs before=C for both the first and the second observation. However, I want before=C,E for the first observation and before=C,W,d for the second.
You can use a Perl regular expression replacement pattern to transform the original string.
Example:
data have;
infile datalines delimiter=",";
input string :$20.;
datalines;
ABC*EDE*,
EFCC*W*d*
;
data want;
set have;
csl = prxchange('s/([^*]*?)([^*])\*/$2,/',-1,string); /* comma separated letters */
csl = prxchange('s/, *$//',1,csl); /* remove trailing comma */
run;
Make sure to increment _STARTPOS so your loop will finish. You can use CATX() to add the commas. Simplify selecting the character by using CHAR() instead of SUBSTR(). Also make sure to TELL the data step how to define the new variable instead of forcing it to guess. I also include a test to handle the situation where * is in the first position.
data have;
input string $20.;
datalines;
ABC*EDE*
EFCC*W*d*
*XXXX*
asdf
;
data want;
set have;
length before $20 ;
_startpos = 0;
do cnt=0 to length(string) until(_startpos=0);
_startpos = find(string,'*',_startpos+1);
if _startpos>1 then before = catx(',',before,char(string,_startpos-1));
end;
cnt=cnt-(string=:'*');
drop _startpos;
run;
Results:
Obs string before cnt
1 ABC*EDE* C,E 2
2 EFCC*W*d* C,W,d 3
3 *XXXX* X 1
4 asdf 0
call scan is also a good choice to get the position of each *.
data have;
infile datalines delimiter=",";
input string :$20.;
datalines;
ABC*EDE*,
EFCC*W*d*
****
asdf
;
data want;
length before $20.;
set have;
do i = 1 to count(string,'*');
call scan(string,i,pos,len,'*');
before = catx(',',before,substrn(string,pos+len-1,1));
end;
put _n_ = +7 before=;
run;
Result:
_N_=1 before=C,E
_N_=2 before=C,W,d
_N_=3 before=
_N_=4 before=
I have looked around quite a bit for something of this nature, and the majority of sources give examples of counting the number of observations etc.
But what I am actually after is a simple piece of code that checks whether there are any observations in the dataset. If there are, the program needs to continue as normal; if there are not, I would like a new record to be created with a variable stating that the dataset is empty.
I have seen macros and SQL code that can accomplish this, but what I would like to know is whether it is possible to do the same in a SAS data step. I know the code I have below does not work, but any insight would be appreciated.
Data TEST;
length VAR1 $200.;
set sashelp.class nobs=n;
call symputx('nrows',n);
obs= &nrows;
if obs = . then VAR1= "Dataset is empty"; output;
Run;
You could do it by always appending a 1-row data set with the empty dataset message, and then delete the message if it doesn't apply.
data empty_marker;
length VAR1 $200;
VAR1='Dataset is empty';
run;
Data TEST;
length VAR1 $200.;
set
sashelp.class nobs=n
empty_marker (in=marker)
;
if (marker) and _n_ > 1 then delete;
Run;
The easiest way I can think of is to use the nobs= option on the set statement to check the number of records. The trick is that you don't want to actually read from an empty data set: that terminates the DATA step before any later statement can act on the count. Because nobs= is filled in when the step is compiled, wrapping the set in an always-false if statement is enough to get the number of observations.
data test1;
format x best. msg $32.;
stop;
run;
data test1;
if _n_ = 0 then
set test1 nobs=nobs;
if ^nobs then do;
msg = "NO RECORDS";
output;
stop;
end;
set test1;
/*Normal code here*/
output;
run;
So this populates the nobs value with 0. The if clause sees the 0 and allows you to set the message and output that value. Use the stop to then terminate the DATA step. Outside of that check, do your normal data step code. You need the ending output statement because of the first one: once the compiler sees an explicit output, it no longer adds the implicit output at the end of the step.
Here it works for a data set with values.
data test2;
format x best. msg $32.;
do x=1 to 5;
msg="Yup";
output;
end;
run;
data test2;
if _n_ = 0 then
set test2 nobs=nobs;
if ^nobs then do;
msg = "NO RECORDS";
output;
stop;
end;
set test2;
y=x+1;
output;
run;
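The compile-time behaviour of nobs= that this check relies on can be seen in isolation with a minimal sketch like the following. It assumes an ordinary SAS data set rather than a view (for views the count may not be known up front); the variable name n is arbitrary.

```sas
data _null_;
  /* 'if 0' is never true, so no observation is ever read,
     but n already holds the observation count from compile time */
  if 0 then set sashelp.class nobs=n;
  put 'Number of observations: ' n;
  stop;  /* nothing is being read, so end the step explicitly */
run;
```

Because the set never executes, the same pattern is safe on an empty data set: the step does not terminate early and n is simply 0.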
Given a data step like this:
data tmp;
do i=1 to 10;
if 3<i<7 then do;
some stuff;
end;
end;
run;
I want to write to the log how many times the if statement is true. For example, in this example, I want to have a line in the log that says:
If statement true 3 times
because the condition is true when i is 4, 5, or 6. How can I do this?
Using retain to keep a counter variable, it's pretty easy to increment a count of how many times an if condition was met.
data tmp;
retain Counter 0;
do i=1 to 10;
if 3<i<7 then do;
Counter+1;
*some stuff;
end;
end;
put 'If statement true ' Counter 'time(s).';
run;
Note that this writes to the log once because it is the last thing that occurs before the data step terminates (there's only one iteration of the data step in the example). If you wanted to do this for a data step that iterates more than once (e.g. when there is a set statement reading data in from another dataset), you'd want to tell SAS to report only at the end of the step. You'd do it like this:
* create an example input data set;
data exampleData;
do i=1 to 10;
output;
end;
run;
* use a variable 'eof' to indicate the end of the input dataset;
data new;
set exampleData end=eof;
retain Counter 0;
if 3<i<7 then do;
Counter+1;
*some stuff;
end;
if eof then put 'If statement true ' Counter 'time(s).';
run;