Instead of commenting and uncommenting large blocks of code during development, I am looking for an equivalent of the stop statement that works outside a data step and stops a SAS script at a certain point, ideally without throwing an error, requiring brackets, or defining my own macros. All I found as a minimal workaround is something like the following:
%put --- This is code I would like to execute;
data _null_;
abort cancel file;
run;
%put --- This is code which should temporarily be disabled;
Is there a shorter or cleaner solution (in terms of log output) for stopping execution of a SAS script without quitting SAS altogether?
Related questions
Ending a SAS-Stored process properly inspired my example code.
break/exit script is about a slightly different problem
Create one macro, stop_submission:
%macro stop_submission;
%abort cancel;
%mend;
Sample use:
%macro stop_submission;
%abort cancel;
%mend;
data one;
set sashelp.class;
run;
%stop_submission
data two;
set sashelp.class;
run;
data three;
set sashelp.class;
run;
This will log something like:
29167 data one;
29168 set sashelp.class;
29169 run;
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.ONE has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
29170
29171 %stop_submission
ERROR: Execution canceled by an %ABORT CANCEL statement.
NOTE: The SAS System stopped processing due to receiving a CANCEL request.
I have a job which first imports some xlsx files, then connects to multiple DB tables. Based on conditions, the job selects rows to output and creates an Excel file to send on to the end user.
Sometimes that job returns zero rows, which is acceptable; in that case, I would prefer to create an empty Excel file with only the variables, but not run the rest of the code (the checking/cleaning code).
How can I conditionally execute code only when there are results?
Something like this:
I get 0 rows
If Result = 0 then Go to "here"
Else "just run the code further"
You have a few useful things that can help you here.
First off, PROC SQL sets a macro variable SQLOBS, which is particularly useful in identifying how many records were returned from the last SQL query it ran.
proc sql;
select * from sashelp.class;
quit;
%put I returned &SQLOBS rows;
You might use this to drive further processing, either with %IF blocks as Tom notes in comments or other methods I will cover below.
You can also check how many rows are in a dataset explicitly, if you prefer a slightly more robust option.
proc sql;
select count(*) into :class_count from sashelp.class;
quit;
%put I returned &class_count rows;
For very large datasets, there are faster options (using the dataset descriptors, dictionary tables, or a few other options), but for most tables this is fine.
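As a hedged sketch of the descriptor-based option (the %nobs macro name below is made up, and there is no error handling): the ATTRN function can read the logical observation count, NLOBS, straight from the dataset header, so nothing has to scan the rows. Note that NLOBS can be -1 for views and some non-native engines.
%macro nobs(ds);
%local dsid n rc;
%let dsid = %sysfunc(open(&ds));           /* open the dataset descriptor    */
%let n    = %sysfunc(attrn(&dsid, nlobs)); /* logical number of observations */
%let rc   = %sysfunc(close(&dsid));        /* always close the handle        */
&n
%mend nobs;
%put sashelp.class has %nobs(sashelp.class) rows;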
Either way, what I would typically do with a program intended to run in production is drive the rest of the program from macros.
%macro whatIWantToDo(params);
...
do stuff
...
%mend whatIWantToDo;
proc sql;
mySqlStuff;
quit;
%if &sqlobs. gt 0 %then %do;
%whatIWantToDo(params);
%end;
%else %do;
%put Nothing to do;
%end;
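To make that skeleton concrete, here is a hedged sketch under made-up names (%whatIWantToDo just prints, and the %conditional_run wrapper exists only so the %IF is valid on releases without open-code %IF):
%macro whatIWantToDo(data);
proc print data=&data. noobs;
run;
%mend whatIWantToDo;
proc sql;
create table results as
select * from sashelp.class where age > 12;
quit;
%macro conditional_run;
%if &sqlobs. gt 0 %then %do;
%whatIWantToDo(results)
%end;
%else %do;
%put Nothing to do;
%end;
%mend conditional_run;
%conditional_run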
Another option is to use call execute; this is appropriate if your data drives the macro parameters. The big advantage of call execute is that it only runs if you have data rows - if you have zero, it won't do anything!
Say you have some datasets to run code on. You could have up to twelve - one per month - but only have them for the current calendar year, so in Jan you have one, Feb you have two, etc. You could do this:
data mydata_jan mydata_feb mydata_mar;
set sashelp.class;
run;
%macro printit(data=);
title "Printing &data.";
proc print data=&data;
run;
title;
%mend printit;
data _null_;
set sashelp.vtable;
where upcase(memname) like 'MYDATA_%' and nobs gt 0;
callstr = cats('%printit(data=',memname,')');
call execute(Callstr);
run;
First I make the datasets, with names I can programmatically identify. Then I make the macro that I want to run on each (this could be checking, cleaning, whatever). Then I use sashelp.vtable, which shows which tables exist, and check that the nobs variable (number of observations) is greater than zero. Then I use call execute to run the macro on each such dataset!
I created a file (metadata.sas7bdat) which has 100 entries with two columns: directory and filename (the SAS dataset name).
Directory filename
/home abc.sas7bdat
/home def.sas7bdat
/home/sub_dir ghj.sas7bdat
Assume I have sample.sas which just gets the means of each dataset.
proc means data=abc;
output out=x;
run;
The above sample.sas should run on each dataset present in the metadata file. Right now I have written it in a very traditional way: it loops through all the files in the metadata, runs proc means on each file, and appends all the data. The final dataset has the means of all the files present in the metadata.
I believe that if I can split the metadata file into 4 parts of 25 entries each (100/4 = 25), submit them as 4 programs, and finally merge the output from all 4 programs, it would reduce the processing time by a large amount (think of 10,000 entries, and also assume there is more processing than proc means). It's just that I am not well versed in what kind of options to use to submit it as 4 programs or how to sync the output from the 4 different processes.
Can you provide the skeleton of how I should construct this program? I have my own vague thoughts, but I am sure I can take away some elegant answers.
This question appears to be very similar to Is it possible to loop over SAS datasets?
Breaking the work into four parts will not reduce the overall processing time. For each part, sure. If you think the overall processing will be reduced, what tests or evidence support that premise?
When there is a very large set of processing, P, broken into steps p(1), p(2), ... p(N), you will need to construct a data structure that can store intermediate results, plus rules for not repeating past processing when restarting the process after some p(i) experiences an error, or a prior attempt at P stops at p(i).
Consider your case: you will need to store, or accumulate, intermediate means with a step key. The natural key would be the directory and filename.
A top-level dispatcher invokes a step macro for each item in the metadata. The step macro performs the process details when necessary and appends the results in an accumulating way.
Top-level: dispatch each step for the metadata items
%macro process_all (metadata=, results=);
data _null_;
set &metadata;
invoke = cats('%process_step(libname=',libname,',memname=',memname,",results=&results)");
put 'NOTE: ' _n_= invoke=;
call execute ('%nrstr(' || trim(invoke) || ')');
/* global parameter for testing 'restart' */
%if %symexist(test_param_1) %then %do;
if _n_ >= &test_param_1 then stop;
%end;
run;
%mend;
Step-level: core process for one metadata item. Skip step if done before, otherwise do actions and accumulate results.
%macro process_step(libname=, memname=, results=);
%local have_results;
%local step_done;
%let have_results = %sysfunc(exist(&results,data));
%let step_done = 0;
%if &have_results %then %do;
data _null_;
set &results; where libname="&libname" and memname="&memname";
call symput ('step_done', '1'); stop;
run;
%end;
%if &step_done %then %do;
%put NOTE: This step already done. &=libname &=memname;
%return;
%end;
proc delete data=_step_out;
proc means data=&libname..&memname noprint;
var height weight;
output out=_step_out(label="step output for &libname..&memname.");
run;
data _step_out;
length libname $8 memname $32.;
set _step_out;
libname = symget('libname');
memname = symget('memname');
run;
proc append base=&results data=_step_out force;
run;
%mend;
Test the scheme for various configurations of metadata
data _1 _2 _3 _4 _5 _6 _7 _8;
set sashelp.class;
run;
data configuration_1(keep=libname memname);
length libname $8 memname $32.;
libname = 'work';
do index = 1,2,3,7,8; memname='_'||cats(index); output; end;
run;
options mprint;
/*
* reset results to start from scratch
proc delete data=results_1;run;
*/
%let test_param_1 = 3; %* force stoppage after first three items;
%process_all(metadata=configuration_1, results=results_1)
%symdel test_param_1; %* rerun all items; the first three, already done, will be skipped;
%process_all(metadata=configuration_1, results=results_1)
However you actually program the scheme, you should be aware of, and account for, intermediate settings or data sets that might be left over from a prior run or step.
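A minimal sketch of that kind of housekeeping, assuming the only intermediate dataset is the _step_out used above: delete it up front so a leftover copy from a failed run cannot be appended by mistake (the step only warns if there is nothing to delete).
proc datasets library=work nolist;
delete _step_out;
quit;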
I have several programs in one SAS project, i.e. program A -> program B -> .... I want to build dependencies between the programs using macro variables.
Program A will run a few data steps and procs. If ANY procedure in program A executes with errors, I would like to run program C. Otherwise run program B.
This seems to be tricky since syserr resets at each step boundary. If the first data step in program A executes with an error and the rest don't, then at the end of program A syserr is back to 0. I need a macro variable that becomes something other than 0 once an error happens and keeps that value until the program ends.
If the program dependency is based on other criteria (say, values), user-defined macro variables can handle that. For something related to system errors, I think SAS already has something that can do the trick, but I can't find anything except syserr, which doesn't seem to help.
Note: I found SAS stop on first error, but basically it checks the error status after each data step. That sounds crazy if program A contains 50+ data steps.
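A minimal sketch of that reset behavior, assuming an interactive session and made-up dataset names (the first step fails, the second is clean):
data oops;
set work.does_not_exist; /* this step errors out */
run;
%put After the failing step: &=syserr; /* non-zero return code */
data fine;
x = 1; /* this step runs without problems */
run;
%put After the clean step: &=syserr; /* back to 0 */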
Easy - just use syscc!
SYSCC is a read/write automatic macro variable that enables you to
reset the job condition code and to recover from conditions that
prevent subsequent steps from running.
See documentation, but I guess you will be looking for something like:
%if &syscc > 4 %then %do;
%inc "/mypath/pgmC.sas";
%end;
%else %do;
%inc "/mypath/pgmB.sas";
%end;
The highest value of SYSCC is retained across step boundaries, and it is always an integer representing the error level. Example values:
The values for SYSCC are:
0 is no errors no warnings
4 is warnings
greater than 4 means an error occurred
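A minimal, hedged sketch (interactive session assumed, failing dataset name made up) showing SYSCC keeping the error value across a later, clean step:
%let syscc = 0; /* reset before the section you care about */
data oops;
set work.does_not_exist; /* errors: SYSCC jumps above 4 */
run;
data fine;
x = 1; /* clean step: SYSCC keeps the higher value */
run;
%put &=syscc; /* still greater than 4 */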
Note that there are some things it won't catch, but to improve its effectiveness you can use:
options errorcheck=strict;
Finally - you mention 'sas project', if by this you mean you are using Enterprise Guide then please be aware of the advice in this usage note.
You could define a macro that kept track of the error status and run this after each step. The macro would look something like this:
%macro track_err;
%global err_status;
%let err_status = %sysfunc(max(&err_status, &syserr ne 0));
%mend;
Usage example below. First initialize the value that tracks the overall error status. The first data step will fail, the second will run successfully, and the final value of err_status will be 1.
%let err_status = 0;
data oops;
set sashelp.doesnt_exist;
run;
%track_err;
data yay;
set sashelp.class;
run;
%track_err;
%put &=err_status;
Final output:
ERR_STATUS=1
As far as 'sounding crazy to check the status after each step'... well SAS doesn't provide something for the exact requirement you have, so the only way to do it is literally to have something check after each step.
EDIT: Correction - looks like the syscc approach mentioned in RawFocus's answer actually shows there IS something that does this in SAS.
If you want the check to 'blend in' more with the code, then consider replacing your run and quit statements with a macro that performs the run/quit, then checks the status all in one. It will result in slightly cleaner code. Something like this:
%macro run_quit_track;
run;quit;
%global err_status;
%let err_status = %sysfunc(max(&err_status, &syserr ne 0));
%mend;
%let err_status = 0;
data oops;
set sashelp.doesnt_exist;
%run_quit_track;
data yay;
set sashelp.class;
%run_quit_track;
%put &=err_status;
Use one macro, call it runquitA. Call this macro at the end of each PROC SQL or data step, in place of quit; and run;
Example:
/*Program A*/
%macro runquitA;
; run; quit;
%if &syserr. ne 0 %then %do;
/*Call Program C*/
%end;
%mend runquitA;
proc sql;
create table class1 as
select * from sashelp.class;
%runquitA;
data class2;
set sashelp.class;
%runquitA;
/*Call Program B*/
/*end of Program A*/
I currently have a SAS process that generates multiple data sets (whether they have observations or not). I want to determine a way to control the export procedure based on the total number of observations (if nobs > 0, then export). My first attempt was something primitive using if/then logic based on a select into macro variable (counting obs in a data set):
DATA _NULL_;
SET A_EXISTS_ON_B;
IF &A_E > 0 THEN DO;
FILE "C:\Users\ME\Desktop\WORKLIST_T &PDAY..xls";
PUT TASK;
END;
RUN;
The issue here is that I don't have a way to write multiple sets to the same workbook with multiple sheets (or do I?).
In addition, whenever I try to add another DO block with similar logic, the execution fails. If this cannot be done with a data _null_, would ODS be the answer?
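On the multiple-sheets point, a hedged sketch using ODS EXCEL (SAS 9.4+): each PROC PRINT lands on its own named sheet of one workbook. The datasets and the output path below are made up for illustration.
data tasks exceptions; /* stand-in datasets for the sketch */
set sashelp.class;
if age > 13 then output exceptions;
else output tasks;
run;
ods excel file="%sysfunc(pathname(work))/worklist.xlsx" options(sheet_name="tasks");
proc print data=tasks noobs;
run;
ods excel options(sheet_name="exceptions");
proc print data=exceptions noobs;
run;
ods excel close;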
The core of what you want to do, conditionally execute code, can be done in one of a number of ways.
Let's imagine we have a short macro that exports a dataset to excel. Simple as pie.
%macro export_to_excel(data=,file=,sheet=);
proc export data=&data. outfile=&file. dbms=excel replace;
sheet=&sheet.;
run;
%mend export_to_excel;
Now let's say we want to do this conditionally. How we do it depends, to some degree, on how we call this macro in our code now.
Let's say you have:
%let wherecondition=1; *always true!;
data class;
set sashelp.class;
if &wherecondition. then output;
run;
%export_to_excel(data=class,file="c:\temp\class.xlsx", sheet=class1);
Now you want to make this export only run if class has some rows in it. So you get the number of obs in class:
proc sql;
select count(1) into :classobs from class;
quit;
And now you need to incorporate that somehow. In this case, the easiest way is to add a condition to the export macro. Open code doesn't allow conditional execution of code, so it needs to be in a macro.
So we do:
%macro export_to_excel(data=,file=,sheet=,condition=1);
%if &condition. %then %do;
proc export data=&data. outfile=&file. dbms=excel replace;
sheet=&sheet.;
run;
%end;
%mend export_to_excel;
And you add the count to the call:
%export_to_excel(data=class,file="c:\temp\class.xlsx", sheet=class1,condition=&classobs.)
Tada, now it won't try to export when it's 0. Great.
If this code is already in a macro, you don't have to alter the export macro itself. You can simply put that %if %then part around the macro call. But that's only if the whole thing is already a macro - %if isn't allowed outside of macros (sorry).
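For example, a minimal sketch assuming the surrounding code already lives in some macro (the %run_exports name here is made up):
%macro run_exports;
%if &classobs. gt 0 %then %do;
%export_to_excel(data=class, file="c:\temp\class.xlsx", sheet=class1)
%end;
%mend run_exports;
%run_exports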
Now, if you're exporting a whole bunch of datasets, and you're generating your export calls from something, you can add the condition there, more easily and more smoothly than this.
Basically, either by hand (if that makes sense) or with proc sql, proc contents, or another method of your choice, make a dataset that contains one row per dataset-to-export, with four variables: dataset name, file to export, sheet to export (unless that's the same as the dataset name), and count of observations for that dataset. Often the first three would be made by hand and then merged/updated via SQL or something else with the count of obs per dataset.
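A hedged sketch of building such a driver dataset from dictionary.tables (the library filter, filename pattern, and variable names are made up; nobs can be missing for views, so treat this as a starting point). The quotes are baked into filename so the generated call below matches the file="..." style used earlier.
proc sql;
create table datasetwithnames as
select memname as dataname,
cats('"c:\temp\', memname, '.xlsx"') as filename length=200,
memname as sheetname,
nobs as obsnum
from dictionary.tables
where libname = 'WORK' and memname like 'EXPORT_%';
quit;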
Then you can generate calls to export, like so:
proc sql;
select cats('%export_to_excel(data=',dataname,',file=',filename,',sheet=',sheetname,')')
into :explist separated by ' '
from datasetwithnames
where obsnum>0;
quit;
&explist.; *this actually executes them;
Assuming obsnum is the new variable you created with the # of obs, and the other variables are obviously named. That won't pull a line for anything with 0 observations - so it never tries to execute the export. That works with the initial export macro just as well as with the modified one.
Suggest you google around for different approaches to writing XLS files.
Regarding using a DATA step or PROC step, the DATA step is tolerant of datasets that have 0 obs. If the SET statement reads a dataset that has 0 obs, it will simply end the step. So you don't need special logic. Most PROCs also accommodate 0-obs datasets without throwing a warning or error.
For example:
1218 *Make a 0 obs dataset;
1219 data empty;
1220 x=1;
1221 stop;
1222 run;
NOTE: The data set WORK.EMPTY has 0 observations and 1 variables.
1223
1224 data want;
1225 put "I run before SET statement.";
1226 set empty;
1227 put "I do not run after SET statement.";
1228 run;
I run before SET statement.
NOTE: There were 0 observations read from the data set WORK.EMPTY.
NOTE: The data set WORK.WANT has 0 observations and 1 variables.
1229
1230 proc print data=empty;
1231 run;
NOTE: No observations in data set WORK.EMPTY.
But note, as Joe points out, PROC EXPORT will happily export a dataset with 0 obs and write a file with 0 records, overwriting it if it was already there, e.g.:
1582 proc export data=sashelp.class outfile="d:\junk\class.xls";
1583 run;
NOTE: File "d:\junk\class.xls" will be created if the export process succeeds.
NOTE: "CLASS" range/sheet was successfully created.
1584
1585 data class;
1586 stop;
1587 set sashelp.class;
1588 run;
NOTE: The data set WORK.CLASS has 0 observations and 5 variables.
1589
1590 *This will replace class.xls;
1591 proc export data=class outfile="d:\junk\class.xls" replace;
1592 run;
NOTE: "CLASS" range/sheet was successfully created.
ODS statements would likely do the same.
I use a macro to check if a dataset is empty, following SO answers like:
How to detect how many observations in a dataset (or if it is empty), in SAS?
When using VTYPE on a dataset with 0 observations, I do not get the needed information.
Here is an MWE:
Create a simple set with 1 variable and 1 observation.
data fullset;
myvar=1;
run;
Create another set with the same variable and 0 observations.
data emptyset;
set fullset;
stop;
run;
Make a macro that opens the set, checks vtype, and prints it to the log.
%macro mwe(inset);
%local TYPE;
data _NULL_;
set &inset.;
CALL SYMPUT("TYPE", VTYPE(myvar));
put TYPE;
stop;
run;
%put &=TYPE.;
%mend mwe;
When run on the set with observations, everything works fine:
%mwe(fullset);
TYPE=N
But when run on the empty set, TYPE does not get assigned:
%mwe(emptyset);
TYPE=
I guess the reason is that no code lines are processed since the set has no observations. Is there any workaround for that?
Note: Using proc contents and parsing the result table is certainly overkill for such a simple task.
Your problem is not vtype(), but how the data step works with an empty dataset.
When the set statement attempts to pull a row and fails, the data step immediately terminates. This can be useful - for example, when you don't want it to do things after the last row in the dataset has been read. But in this case, it is less useful. Your data step terminates instantly upon the set statement, meaning your call symput never occurs.
However, you can take advantage of a different thing: the fact that SAS will happily create all of the metadata even before set, during compilation.
%macro mwe(inset);
%local TYPE;
data _NULL_;
CALL SYMPUT("TYPE", VTYPE(myvar));
set &inset.;
stop;
run;
%put &=TYPE.;
%mend mwe;
Notice I moved the call symput before the set. Yes, vtype() works fine even before set - the variables are still defined in the PDV even before anything happens in the data step.
(I also took out the spurious put statement that will never do anything, as no TYPE data step variable is ever created in either version.)
An alternative approach is to use the vartype function instead, which does not require a set statement and unlike the vtype function can be used in pure macro code outside a data step (without resorting to dosubl or the like).
What all this means in practice is that you can use vartype to make a function-style macro version of vtype, like so:
%macro vtype(ds,var);
%local dsid varnum rc vartype;
%let dsid = %sysfunc(open(&ds));
%let varnum = %sysfunc(varnum(&dsid,&var));
%let vartype = %sysfunc(vartype(&dsid,&varnum));
%let rc = %sysfunc(close(&dsid));
&vartype
%mend vtype;
/*Example*/
%put %vtype(emptyset,myvar);
/*Output*/
N