SAS Data Step programming - explain dataset(obs=0)

Could you please explain why none of the data step statements are processed if we set the (obs=0) data set option in the (wrong) example below?
data temp;
x=0;
run;
data wrong;
set temp(obs=0);
x=1;
y=1;
output;
y=2;
output;
run;
data right;
set temp(obs=1);
x=1;
y=1;
output;
y=2;
output;
run;
I would normally expect that both work.wrong and work.right have the same output.

One of the ways a data step stops executing is when a SET statement executes and hits end of file (i.e. there are no more records to read).
So if you SET a dataset with (obs=0), when the SET statement executes, the data step stops. For example:
122 data _null_ ;
123 put _n_= "I ran" ;
124 set sashelp.class(obs=0) ;
125 put _n_= "I did not run" ;
126 run;
_N_=1 I ran
NOTE: There were 0 observations read from the data set SASHELP.CLASS.
The first PUT statement executes, but the second does not, because the step stopped when the SET statement executed.
When you SET a dataset with (OBS=1), the data step stops on the SECOND iteration:
135 data _null_ ;
136 put _n_= "I ran before SET" ;
137 set sashelp.class(obs=1) ;
138 put _n_= "I ran after SET" ;
139 run;
_N_=1 I ran before SET
_N_=1 I ran after SET
_N_=2 I ran before SET
NOTE: There were 1 observations read from the data set SASHELP.CLASS.
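If the goal is to pick up the variable definitions from temp without reading any rows, one common workaround is to keep the SET statement from ever executing. The sketch below is only an illustration: the data set name fixed is an assumption, and the explicit STOP is there so the step ends after one pass.

```sas
data temp;
  x=0;
run;

data fixed;   /* "fixed" is a hypothetical name for this sketch */
  /* The condition is never true, so SET never executes and never hits  */
  /* end of file. The compiler still reads temp's header, so x keeps    */
  /* the attributes it has in temp.                                     */
  if 0 then set temp;
  x=1;
  y=1;
  output;
  y=2;
  output;
  stop;       /* end the step explicitly after one iteration */
run;
```

With this layout the two OUTPUT statements run once each, so the result should match work.right rather than work.wrong.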

Related

How to get dataset label in a data step?

I want to get the label of the dataset named in the SET statement, but I haven't found an elegant method.
I have tried this:
data test;
set sashelp.class;
rc = open("sashelp.class");
label = attrc(rc,"label");
rc = close(rc);
run;
It works, but it has a weak point: I have to write the name of the dataset in the open() function.
I am looking for a better way that avoids writing it manually, since I have dozens of similar steps.
I have tried &syslast too, but it doesn't work. Maybe there is some other way?
Maybe INDSNAME
18 data _null_;
19 set sashelp.class(obs=2 drop=_all_) sashelp.shoes(obs=2 drop=_all_)indsname=indsname;
20 retain label;
21 if indsname ne lag(indsname) then do;
22 rc = open(indsname); label=attrc(rc,"label"); rc=close(rc);
23 end;
24 put _all_;
25
26 run;
indsname=SASHELP.CLASS label=Student Data rc=0 _ERROR_=0 _N_=1
indsname=SASHELP.CLASS label=Student Data rc=. _ERROR_=0 _N_=2
indsname=SASHELP.SHOES label=Fictitious Shoe Company Data rc=0 _ERROR_=0 _N_=3
indsname=SASHELP.SHOES label=Fictitious Shoe Company Data rc=. _ERROR_=0 _N_=4
Would statement-style macros allowed by IMPLMAC qualify for a SAS beauty pageant?
Repurpose the SET statement as a statement style macro that emits source code containing a normal SET statement and a data step variable assignment of the data set label retrieved via ATTRC.
Example:
%macro SET(data) / STMT;
options IMPLMAC = 0; %* turn off implmac so following SET is normal token;
SET &data;
options IMPLMAC = 1; %* turn on implmac so subsequent SET invokes this macro;
%local id;
%let id = %sysfunc(open(&data));
%if (&id) %then %do;
DSLABEL = %sysfunc(quote(%sysfunc(ATTRC(&ID,LABEL))));
%let id = %sysfunc(close(&id));
%end;
%mend;
data have(label="This is the data set I ""have""");
x=1;
run;
options IMPLMAC=1 MPRINT;
data _null_;
SET HAVE;
run;
options IMPLMAC=0;

SAS macro - rename variables using their label values as their new variable names

I want to produce a macro that converts the variable names of a dataset into the variables' labels. I intend to apply this macro for large datasets where manually changing the variable names would be impractical.
I came across this code online from the SAS website, which looked promising but produced errors. I made slight edits to remove some errors. It now works for their sample dataset but not mine. Any assistance with improving this code to work with my sample dataset would be greatly appreciated!
SAS sample dataset (works with code):
data t1;
label x='this_x' y='that_y';
do x=1,2;
do y=3,4;
z=100;
output;
end;
end;
run;
My sample dataset (does not work with code):
data t1;
input number group;
label number = number_lab group = group_lab;
datalines;
1 1
1 .
2 1
2 .
3 2
3 .
4 1
4 .
5 2
5 .
6 1
6 .
;
run;
Code:
%macro chge(dsn);
%let dsid=%sysfunc(open(&dsn));
%let cnt=%sysfunc(attrn(&dsid,nvars));
%do i= 1 %to &cnt;
%let var&i=%sysfunc(varname(&dsid,&i));
%let lab&i=%sysfunc(varlabel(&dsid,&i));
%if lab&i = %then %let lab&i=&&var&i;
%end;
%let rc=%sysfunc(close(&dsid));
proc datasets;
modify &dsn;
rename
%do j = 1 %to &cnt;
%if &&var&j ne &&lab&j %then %do;
&&var&j=&&lab&j
%end;
%end;
quit;
run;
%mend chge;
%chge(t1)
proc contents;
run;
My code produces the following error messages:
ERROR 73-322: Expecting an =.
ERROR 76-322: Syntax error, statement will be ignored.
Mainly you are not closing the RENAME statement with a semi-colon. But it also looks like you have the RUN and QUIT statements in the wrong order.
But note that there is no need for that complex %sysfunc() macro code to get the list of names and labels. Since you are already generating a PROC DATASETS step your macro can generate other SAS code also. Then your macro will be clearer and easier to debug.
%macro chge(dsn);
%local rename ;
proc contents data=&dsn noprint out=__cont; run;
proc sql noprint ;
select catx('=',nliteral(name),nliteral(label))
into :rename separated by ' '
from __cont
where name ne label and not missing(label)
;
quit;
%if (&sqlobs) %then %do;
proc datasets nolist;
modify &dsn;
rename &rename ;
run;
quit;
%end;
%mend chge;
If the list of rename pairs is too long to fit into a single macro variable then you could resort to generating two series of macro variables using PROC SQL and then add back your %DO loop.
Here is the SAS log from testing on your sample file.
4156 %chge(t1);
MPRINT(CHGE): proc contents data=t1 noprint out=__cont;
MPRINT(CHGE): run;
NOTE: The data set WORK.__CONT has 2 observations and 41 variables.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.08 seconds
cpu time 0.01 seconds
MPRINT(CHGE): proc sql noprint ;
MPRINT(CHGE): select catx('=',nliteral(name),nliteral(label)) into :rename
separated by ' ' from __cont where name ne label and not missing(label) ;
MPRINT(CHGE): quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.04 seconds
cpu time 0.00 seconds
MPRINT(CHGE): proc datasets nolist;
MPRINT(CHGE): modify t1;
MPRINT(CHGE): rename group=group_lab number=number_lab ;
NOTE: Renaming variable group to group_lab.
NOTE: Renaming variable number to number_lab.
MPRINT(CHGE): run;
NOTE: MODIFY was successful for WORK.T1.DATA.
MPRINT(CHGE): quit;
NOTE: PROCEDURE DATASETS used (Total process time):
real time 0.12 seconds
cpu time 0.00 seconds
Note that if I now try to run it again on the modified dataset it does not rename anything.
4157 %chge(t1);
MPRINT(CHGE): proc contents data=t1 noprint out=__cont;
MPRINT(CHGE): run;
NOTE: The data set WORK.__CONT has 2 observations and 41 variables.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
MPRINT(CHGE): proc sql noprint ;
MPRINT(CHGE): select catx('=',nliteral(name),nliteral(label)) into :rename
separated by ' ' from __cont where name ne label and not missing(label) ;
NOTE: No rows were selected.
MPRINT(CHGE): quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.08 seconds
cpu time 0.00 seconds
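If the rename list ever outgrows a single macro variable, the fallback mentioned above (two series of macro variables plus a %DO loop) might look roughly like the sketch below. The macro name chge_loop and the macro variable series name1.. and label1.. are assumptions made for illustration, and the open-ended INTO range requires SAS 9.3 or later.

```sas
%macro chge_loop(dsn);   /* hypothetical name, not from the original answer */
  %local i n;
  proc contents data=&dsn noprint out=__cont; run;
  proc sql noprint;
    /* One pair of macro variables per variable that needs renaming */
    select nliteral(name), nliteral(label)
      into :name1-, :label1-
      from __cont
      where name ne label and not missing(label)
    ;
  quit;
  %let n=&sqlobs;
  %if (&n) %then %do;
    proc datasets nolist;
      modify &dsn;
      rename
      %do i=1 %to &n;
        &&name&i=&&label&i
      %end;
      ;  /* this semicolon ends the RENAME statement */
    run;
    quit;
  %end;
%mend chge_loop;
```

Note that the semicolon closing RENAME sits after the %END of the loop, which is exactly the piece missing from the macro in the question.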
You're missing a semicolon to end the RENAME statement. It belongs after the %DO loop that generates these pairs, not inside it:
&&var&j=&&lab&j
If you still get an error, turn on these options and let us know where the error occurs.
options symbolgen mprint;

ERROR 180-322: Statement is not valid or it is used out of proper order

I have searched for information about this error, but nothing seems to match my case. Could someone familiar with this error take a look?
"Code generated by a SAS macro, or submitted with a "submit selected" operation in your editor, can leave off a semicolon inadvertently." This comment is still too abstruse for me to apply to my code. Although I get this error, the output is correct. Could someone give any advice? Thanks!
%let cnt=500;
%let dataset=fund_sellList;
%let sclj='~/task_out_szfh/fundSale/';
%let wjm='sz_fundSale_';
%macro write_dx;
options spool;
data _null_;
cur_date=put(today(),yymmdd10.);
cur_date=compress(cur_date,'-');
cnt=&cnt;
retain i;
set &dataset;
if _n_=1 then i=cnt;
if _n_<=i then do;
abs_filename=&sclj.||&wjm.||cur_date||'.dat';
abs_filename=compress(abs_filename,'');
file anyname filevar=abs_filename encoding='utf8' nobom ls=32767 DLM='|';
put cst_id @;
put '#|' @;
put cust_name @;
put '#|' ;
end;
run;
%mend write_dx;
%write_dx();
If I do not use a macro, there is no error.
data _null_;
options spool;
cur_date=put(today(),yymmdd10.);
cur_date=compress(cur_date,'-');
cnt=&cnt;
retain i;
set &dataset;
if _n_=1 then i=cnt;
if _n_<=i then do;
abs_filename=&sclj.||&wjm.||cur_date||'.dat';
abs_filename=compress(abs_filename,'');
file anyname filevar=abs_filename encoding='utf8' nobom ls=32767 DLM='|';
put cst_id @;
put '#|' @;
put cust_name @;
put '#|' ;
end;
run;
--------------------------------update----------------------------------
I added % to the keywords, but I still get the same error.
%macro write_dx;
options spool;
data _null_;
cur_date=put(today(),yymmdd10.);
cur_date=compress(cur_date,'-');
cnt=&cnt;
retain i;
set &dataset;
%if _n_=1 %then i=cnt;
%if _n_<=i %then %do;
abs_filename=&sclj.||&wjm.||cur_date||'.dat';
abs_filename=compress(abs_filename,'');
file anyname filevar=abs_filename encoding='utf8' nobom ls=32767 DLM='|';
put cst_id @;
put '#|' @;
put cust_name @;
put '#|' ;
%end;
run;
%mend write_dx;
%write_dx();
Why did you add () to the macro call when the macro is not designed to accept any parameters? If you do that then the () are NOT processed by the macro processor and so are passed along to SAS to interpret. That is the same error message as you would get if you submitted (); by itself.
1 %macro xx ;
2 data _null_;
3 put 'Running data step in macro';
4 run;
5 %mend xx;
6 %xx();
Running data step in macro
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds
6 %xx();
-
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
7 ****;
8 ();
-
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
Compare that with a macro defined with a parameter list (zero or more parameters):
%macro param();
generated code
%mend ;
%put |%param()|;
Here the macro processor consumes the (), so they are not passed on to SAS.
|generated code|
Change
%macro write_dx;
to
%macro write_dx();
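If you want to call the macro with (), define it with () even when it takes no parameters. Alternatively, keep the definition as it is and call it as %write_dx without parentheses.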
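To make the pairing between definition and call concrete, here is a minimal sketch with two throwaway macros. The names no_parens and with_parens are hypothetical and exist only for this illustration.

```sas
/* Defined without a parameter list: call it without parentheses */
%macro no_parens;
  %put no_parens ran;
%mend no_parens;

/* Defined with an empty parameter list: the () in the call is consumed */
%macro with_parens();
  %put with_parens ran;
%mend with_parens;

%no_parens
%with_parens()
```

Neither call leaves stray text behind for SAS to parse, so neither should trigger ERROR 180-322.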

SAS: do not send email when no records

I'm new to Base SAS and I am trying to add a line saying that all IDs were transferred when nobs=0. Here's what I added to what Alex gave me, which prints the IDs when there are any in the data set workgo.recds_not_processed. I added the nobs=0 condition, but now it is not sending the IDs when they are present.
data _null_;
length id_list $ 3000;
retain id_list '';
file mymail;
SET set workgo.recds_not_processed nobs = nobs end = eof;
IF nobs = 0 then PUT "All ID's transferred successfully";
else if _n_ = 1 then do;
put 'Number of records not processed=' nobs;
put 'The IDs are:';
end;
/* Print the IDs in chunks */
if length(strip(id_list)) > 2000 then do;
put id_list;
call missing(id_list);
end;
call catx(', ', id_list, id);
if eof then put id_list;
run;
You are very nearly there - the unmatched end spotted by Joe and the double set that I pointed out were the only obvious errors preventing your code from working.
The reason you got no output when dealing with an empty dataset is that SAS terminates a data step when it tries to read a record from a set statement and there are none left to read. One solution is to move your empty dataset logic before the set statement.
Also, you can achieve the desired result without resorting to retain and call catx by using a double trailing @@ in your put statement to write to the same line repeatedly from multiple observations from the input dataset:
data recds_not_processed;
do id=1 to 20;
output;
end;
run;
data _null_;
/*nobs is populated when the data step is compiled, so we can use it before the set statement first executes*/
IF nobs=0 then
do;
PUT "All IDs transferred successfully";
stop;
end;
/*We have to put the set statement after the no-records logic because SAS terminates the data step after trying to read a record when there are none remaining.*/
SET recds_not_processed nobs=nobs end=eof;
if _n_=1 then
do;
put 'Number of records not processed=' nobs;
put 'The IDs are:';
end;
/* Print the IDs in chunks */
if eof or mod(_N_, 5)=0 then
put id;
else
put id +(-1) ', ' @@;
run;

Is there a way to detect when you've reached the last observation in a SAS DATA step?

Is there a way to check how many observations are in a SAS data set at runtime OR to detect when you've reached the last observation in a DATA step?
I can't seem to find anything on the web for this seemingly simple problem. Thanks!
The nobs= option to a set statement can give you the number of observations. When the data step is compiled, the header portion of each input dataset is scanned, so you don't even have to execute the set statement in order to get the number of observations. For instance, the following reports 2 as expected:
/* a test data set with two observations and no vars */
data two;
output;
output;
run;
data _null_;
if 0 then set two nobs=nobs;
put nobs=;
run;
/* on log
nobs=2
*/
The end= option sets a flag when the last observation (for the set statement) is read in.
A SAS data set, however, can be a SAS data file or a SAS view. In the case of the latter, the number of observations may not be known either at compile time or at execution time.
data subclass/view=subclass;
set sashelp.class;
where sex = symget("sex");
run;
%let sex=F;
data girls;
set subclass end=end nobs=nobs;
put name= nobs= end=;
run;
/* on log
Name=Alice nobs=9.0071993E15 end=0
Name=Barbara nobs=9.0071993E15 end=0
Name=Carol nobs=9.0071993E15 end=0
Name=Jane nobs=9.0071993E15 end=0
Name=Janet nobs=9.0071993E15 end=0
Name=Joyce nobs=9.0071993E15 end=0
Name=Judy nobs=9.0071993E15 end=0
Name=Louise nobs=9.0071993E15 end=0
Name=Mary nobs=9.0071993E15 end=1
*/
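When NOBS= cannot be trusted, as with the view above, one fallback is to count rows while reading them and report the total once END= fires. The sketch below uses a WHERE= data set option on sashelp.class as a stand-in for the view so that it runs on its own; the counter name n_read is an assumption.

```sas
data _null_;
  set sashelp.class(where=(sex='F')) end=last;
  n_read + 1;                 /* sum statement: retained, starts at 0 */
  if last then put n_read=;   /* report once the last row has been read */
run;
```

One caveat in the spirit of the rest of this page: if no rows qualify, the step stops at the SET statement on the first iteration and nothing is printed.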
You can also use %sysfunc(attrn( dataset, nlobs)), though it is limited to SAS data sets (i.e. not data views). Credit for the macro goes to this SUGI paper, which also gives great information regarding good macro design.
You can get all sorts of other character and numeric information on a SAS data set.
See the documentation on attrn and attrc.
%macro numobs (data=&syslast ) ;
/* --------------------------------------------
Return number of obs as a function
--------------------------------------------
*/
%local dsid nobs rc;
%let data = &data ; /* force evaluation of &SYSLAST */
%let dsid=%sysfunc(open(&data));
%if &dsid > 0 %then
%do ;
%let nobs=%sysfunc(attrn(&dsid,nlobs));
%let rc=%sysfunc(close(&dsid));
%end ;
%else
%let nobs = -1 ;
&nobs
%mend numobs;
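For a one-off check, the same ATTRN lookup can also be done in open code without defining a macro. This is only a sketch; the macro variable names dsid, nlobs, and rc are arbitrary.

```sas
%let dsid  = %sysfunc(open(sashelp.class));   /* 0 means the open failed */
%let nlobs = %sysfunc(attrn(&dsid, nlobs));   /* logical obs, excludes deleted rows */
%let rc    = %sysfunc(close(&dsid));          /* always release the handle */
%put nlobs=&nlobs;
```

Wrapping these lines in a macro, as above, mainly buys local scoping and a guard for the case where the data set cannot be opened.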
Find the number of observations in a SAS data set:
proc sql noprint;
select count(*) into: nobs
from sashelp.class
;
quit;
data _null_;
put "&nobs";
run;
The SQL portion counts the number of observations, and stores the number in a macro variable called "nobs".
The data step puts the number for display, but you can use the macro variable like any other.
Performing a certain action when the last observation is processed:
data _null_;
set sashelp.class end=eof;
if eof then do;
put name= _n_=;
end;
run;
The "end" option to the "set" statement defines a variable (here "eof" for end-of-file) that is set to 1 when the last observation is processed. You can then test the value of the variable, and perform actions when its value is 1. For more info, see the documentation for the "set" statement.
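data hold;
set input_data end=last;
/* ... statements that run for every observation ... */
if last then do;
/* ... statements that run only for the last observation ... */
end;
run;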