Update Library in SAS - sas

I may or may not have a library named qa.my_library
I have a temp library (same columns, different data) in work.my_library_temp
My goal is to achieve this pseudo code
if qa.my_library doesn't exists
then qa.my_library = work.my_library_temp;
else qa.my_library = SQL UNION (qa.my_library, work.my_library_temp)
How would you write the code for it ?

To determine if a LIBRARY exists use the LIBREF function. To determine if a DATA SET exists use the EXIST function.
Here is some sample code:
%Global LIBEXISTS DSNEXISTS;
Options source source2 notes symbolgen mlogic mprint ;
%Macro Check(Lib=,DSN=);
%Let LIBEXISTS=0;
/* Outside the DATA STEP, use %SYSFUNC */
%IF %SYSFUNC(LIBREF(&LIB)) = 0 %THEN %Let LIBEXISTS=1; ;
%Let DSNEXISTS=0;
%IF %SYSFUNC(EXIST(&DSN)) > 0 %THEN %Let DSNEXISTS=1; ;
%Mend;
libname test 'c:\';
Data Test.Temp;
Test="Test";
Run;
%Check(Lib=test,dsn=test.temp) ;
%Put LIB EXISTS? &libexists;
%Put DSN EXISTS? &dsnexists;
libname test clear;
%Check(Lib=test,dsn=test.temp) ;
%Put LIB EXISTS? &libexists;
%Put DSN EXISTS? &dsnexists;

One of the reasons for the concept of libraries is to avoid name collisions. What you ask would be unable to deal with such collision (i.e., a dataset with the same name in both libraries). If you can be certain that there will be no collision, than there is actually no reason for what you want: there is no fundamental difference between remembering (and parsing) a unique dataset name or a unique dataset name with its library.
Leaves only the situation where there can be collisions, but you have one library who has priority then. You can write something that gets somewhat close to what you want:
Using SASHELP library, identify every dataset in library A and B. Assuming that A takes priority: for every dataset in B which is not in A, run the following:
data A.dataset_from_B /view=A.dataset_from_B;
set B.dataset_from_B;
run;
Again, it is not exactly the same, but probably serves your needs. But I would be really interested in knowing why you want this though.

Related

splitting a sas program into multiple programs and finally merging output

I created a file (metadata.sas7bdat) which has 100 entries with two columns directory and sasdatasets.
Directory filename
/home abc.sas7bdat
/home def.sas7bdat
/home/sub_dir ghj.sas7bdat
Assume I have sample.sas which just gets the means of each dataset.
Proc means data=abc.sas out=x;
Run;
The above sample.sas should run on each dataset present in metadata file. Right now I have written it in very traditional way of it loops through all the files in metadata and runs proc means on each file and appends all the data. The final dataset has the means of all the files present in metadata.
I believe if I can split the metadata file into 4 parts each having (100/4=25 entries) and submit it as 4 programs and finally merge the output from all the 4 programs It would reduce the processing time by large amount. ( think of 10,000 entries and also assume there is more processing than proc means). Its just I am not well versed with what kind of options to use to submit it as 4 programs and how to sync the output from 4 different processes.
Can you provide me the skeleton of how I should construct this program , I have my vague thoughts but I am sure I can take away some elegant answers .
This question appears to be very similar to Is it possible to loop over SAS datasets?
Breaking into four parts will not reduce the overall processing time. For each part sure. If you think the overall there will be reduced processing, what tests or evidence is there supporting that premise ?
When there is a very large set of processing, P, broken into steps, p(1), p(2), ... p(N), you will need to construct a data structure that can store intermediate results, and rules for not repeating past processing when restarting the process after some p(i) experiences an error, or prior attempt at P stops at p(i)
Consider your case, you will need to store, or accumulate intermediate means with a step-key. The natural key would be the directory and filename.
A top-level dispatcher invokes a step macro for each item in the metadata. The step macro performs the process details when necessary and appends the results in an accumulating way.
Top-level: dispatch each step for the metadata items
%macro process_all (metadata=, results=);
data _null_;
set &metadata;
invoke = cats('%process_step(libname=',libname,',memname=',memname,",results=&results)");
put 'NOTE: ' _n_= invoke=;
call execute ('%nrstr(' || trim(invoke) || ')');
/* global parameter for checking testing 'restart' *
%if %symexist(test_param_1) %then %do;
if _n_ >= &test_param_1 then stop;
%end;
run;
%mend;
Step-level: core process for one metadata item. Skip step if done before, otherwise do actions and accumulate results.
%macro process_step(libname=, memname=, results=);
%local have_results;
%local step_done;
%let have_results = %sysfunc(exist(&results,data));
%let step_done = 0;
%if &have_results %then %do;
data _null_;
set &results; where libname="&libname" and memname="&memname";
call symput ('step_done', '1'); stop;
run;
%end;
%if &step_done %then %do;
%put NOTE: This step already done. &=libname &=memname;
%return;
%end;
proc delete data=_step_out;
proc means data=&libname..&memname noprint;
var height weight;
output out=_step_out(label="step output for &libname. . &memname.");
run;
data _step_out;
length libname $8 memname $32.;
set _step_out;
libname = symget('libname');
memname = symget('memname');
run;
proc append base=&results data=_step_out force;
run;
%mend;
Test the scheme for various configurations of metadata
data _1 _2 _3 _4 _5 _6 _7 _8;
set sashelp.class;
run;
data configuration_1(keep=libname memname);
length libname $8 memname $32.;
libname = 'work';
do index = 1,2,3,7,8; memname='_'||cats(index); output; end;
run;
options mprint;
/*
* reset results to start from scratch
proc delete data=results_1;run;
*/
%let test_param_1 = 3; %* force stoppage after first three items;
%process_all(metadata=configuration_1, results=results_1)
%symdel test_param_1; %* force rerun of all to ignore first three already done items;
%process_all(metadata=configuration_1, results=results_1)
However you actually program the scheme you should be aware of and account for intermediate settings or data sets that might be left over from a prior run or step.

SAS macro modification

I have two values which represent dates:
a=101 and b=103
Below is first macro saved in separate file one.sas:
%global time nmall;
%let nmall =;
%macro pmall;
%do i=&a. %to &b;
%if &i =&a. then %do;
%let nmall=&nmall.&i;
%end;
%else %let nmall=&nmall.,&i;
end;
%put (&nmall);
%mend;
%pmall;
So above pmall give me values 101,102,103.
Below is second macro:
%include “one.as”;
%macro c(a=,b=);
%let m=;
%let m1=;
%do i =&a %to &b;
%let o=&i;
proc sql;
create table new&o as select * from data where nb in(&o.);quit;
%let m =&m.date&o;
data date&o.;
set date&o.;
if pass =&o.;
run;
proc sort data=date&o.;
by flag;
end;
data output &a._&b.;
set &m;
%mend;
The above macro creates three datasets date101 date102 and date 103, then append it to output101_103.
I am trying to modify above macros in such a way that I will not use %macro and %mend approach. Below is the modified macro code:
data a_to_c;
do o=&a to &c;
output;
end;
run;
so above code will have values 101 102 103 in variable o for dataset a_to_c.
data _null_;
set a_to_c;
call execute ('create table new’||strip(o)||' as select * from data
where nb in(’||strip(o)||' );quit;’);
run;
I want to know how to do below things.
Create pmall values in a macro variable in my modified macro inside the data step data a_to_c, so that I can use it further.
How to proceed from %let m macro in the first macro code to new code which I am developing above.
Geetha:
I think you will find the macro-ization of the process to be far easier if you go from a data-centric explicit solution and proceed abstracting the salient features into macro symbols (aka variables)
The end run solution appears to be:
data output_101_to_103;
set original_data;
where nb between 101 and 103;
run;
proc sort data=output_101_to_103;
by nb flag;
run;
In which case you could code a macro that abstracts 101 to FIRST and 103 to LAST. The data sets could also be abstracted. The abstracted parts are specified as the macro parameters.
%macro subsetter(DATA=, FIRST=, LAST, OUTPREFIX=OUTPUT);
%local out;
%let out = &OUTPREFIX._&FIRST._&LAST.;
data &out;
set &DATA.;
where nb between &FIRST. and &LAST.;
* condition = "between &FIRST. and &LAST."; * uncomment if you want to carry along the condition into your output data set;
run;
proc sort data=&out;
by nb flag;
run;
%mend;
And use as
%subsetter (data=original_data, first=101, last=103, outprefix=output)
Note: If you did keep the condition variable in the output data, you WOULD NOT be able to use it directly as a source code statement in a future data step, as in if nb condition then ...
I suppose you could also pass the NB and FLAG as parameters -- but you approach a point of diminishing returns on the utility of the macro.
Macro-izing the specific example I showed doesn't make too much sense unless you need to perform a lot of different variations of FIRST and LAST in a well documented framework. Sometimes it is just better to not abstract the code and work with the specific cases. Why? Because when there are too many abstracted pieces the macro invocation is almost as long as the specific code you are generating and the abstraction just gets in the way of understanding.
If the macro is simply chopping up data and reassembling data, you might be better served rethinking the flow using where, by, and class statements and abstracting around that.
Pmall is macro variable which will have list of values separated by
commas. In my modify macro, i want to create pmall as macro variable
in the datastep data a_to_c; do o=&a to &c; output; end; run; – geetha
anand 1 min ago
To create a macro variable from within a data step using the CALL SYMPUTX() function.
data a_to_c;
length pmall $200 ;
do o=&a to &c;
pmall=catx(',',pmall,o);
output;
end;
call symputx('pmall',pmall);
drop pmall;
run;
If you really want to generate code without a SAS macro you can use CALL EXECUTE() or write the code to a file and use %INCLUDE to run it. Or for small pieces of code you could try putting the code in a macro variable, but macro variables can only contain 64K bytes.
It is really hard to tell from what you posted what code you want to generate. Let's assume that you want to generate an new dataset for each value in the sequence and then append that to some aggregate dataset. So for the first pass through the loop your code might be as simple as these two steps. First to create the proper subset in the right order and the second to append the result to the aggregate dataset.
proc sort data=nb out=date101 ;
where nb=101 ;
by flag ;
run;
proc append base=date101_103 data=date101 force;
run;
Then next two times through the loop will look the same only the "101" will be replaced by the current value in the sequence.
So using CALL EXECUTE your program might look like:
%let a=101;
%let c=103;
proc delete data=date&a._&c ;
run;
data _null_;
do nb=&a to &c;
call execute(catx(' ','proc sort data=nb out=',cats('date',nb,'),';'));
call execute(cats('where nb=',nb,';')) ;
call execute('by flag; run;');
call execute("proc append base=date&a._&c data=");
call execute(cats('date',nb));
call execute(' force; run;');
end;
run;
Writing it to a file to run via %INCLUDE would look like this:
filename code temp ;
data _null_;
file code ;
do nb=&a to &c;
put 'proc sort data=nb out=date' nb ';'
/ ' where ' nb= ';'
/ ' by flag;'
/ ';'
/ "proc append base=date&a._&c data=date" nb 'force;'
/ 'run;'
;
end;
run;
proc delete data=date&a._&c ;
run;
%include code / source2;
If the goal is to just create the aggregate dataset and you do not need to keep the smaller intermediate datasets then you could just use the same name for the intermediate dataset on each pass through the loop. That will make the code generation easier as then there is only only place that needs to change based on the current value. Also that way you only need to have two dataset names even for a sequence of 10 or 20 values. It will take less space and reduce clutter in the work library.

SYSERR Automatic Macro Variable

I have several programs in one SAS project, i.e program A -> program B ->.... I want to build dependency between programs using macro variables.
Program A will process few data step and procs. If ANY procedure in program A executes with errors, I would like to run program C. Otherwise run program B.
This seems to be tricky since syserr resets at each step boundary. If first data step in program A executes with error and the rest don't, then at the end of program A, syserr is still 0. I need macro variable value to be something other than 0 once error happens and the value can keep till program ends.
If the program dependency is based on other criteria (say values), the user-defined macro variables can handle that. For something related to system errors, I think SAS already has something can do the trick.But I can't find anything else except syserr, which doesn't seem to help.
Note: I find this SAS stop on first error. But basically it's to check the error status after each data step. That sounds crazy if program A contains 50+ data steps.
Easy - just use syscc!
SYSCC is a read/write automatic macro variable that enables you to
reset the job condition code and to recover from conditions that
prevent subsequent steps from running.
See documentation, but I guess you will be looking for something like:
%if &syscc > 4 %then %do;
%inc "/mypath/pgmB.sas";
%end;
%else %do;
%inc "/mypath/pgmA.sas";
%end;
The highest value of syscc is retained across step boundaries, always with an integer to represent the error level. Example values:
The values for SYSCC are:
0 is no errors no warnings
4 is warnings
greater than 4 means an error occurred
Note that there are some things it won't catch, but to improve it's effectiveness you can use:
options errorcheck=strict;
Finally - you mention 'sas project', if by this you mean you are using Enterprise Guide then please be aware of the advice in this usage note.
You could define a macro that kept track of the error status and run this after each step. The macro would look something like this:
%macro track_err;
%global err_status;
%let err_status = %sysfunc(max(&err_status, &syserr ne 0));
%mend;
Usage example below. First initialize the value to track the overall error status. The first datastep will fail, the second will run successfully and the final value of err_status will be 1.
%let err_status = 0;
data oops;
set sashelp.doesnt_exist;
run;
%track_err;
data yay;
set sashelp.class;
run;
%track_err;
%put &=err_status;
Final output:
ERR_STATUS=1
As far as 'sounding crazy to check the status after each step'... well SAS doesn't provide something for the exact requirement you have, so the only way to do it is literally to have something check after each step.
EDIT: Correction - looks like the syscc approach mentioned in RawFocus's answer actually shows there IS something that does this in SAS.
If you want the check to 'blend in' more with the code, then consider replacing your run and quit statements with a macro that performs the run/quit, then checks the status all in one. It will result in slightly cleaner code. Something like this:
%macro run_quit_track;
run;quit;
%global err_status;
%let err_status = %sysfunc(max(&err_status, &syserr ne 0));
%mend;
%let err_status = 0;
data oops;
set sashelp.doesnt_exist;
%run_quit_track;
data yay;
set sashelp.class;
%run_quit_track;
%put &=err_status;
Use one macro, call it runquitA. call this macro at the end of each proc sql or proc data in the place of quit; and run;
Example:
/*Program A*/
%macro runquitA;
; run; quit;
%if &syserr. ne 0 %then %do;
/*Call Program C*/
%end;
%mend runquitA;
proc sql;
create table class1 as
select * from sashelp.class;
%runquitA;
data class2;
set sashelp.class;
%runquitA;
/*Call Program B*/
/*end of Program A*/

Using a series of values for a SAS macro parameter

I'm looking for a way to use a series of values for a macro parameter instead of a single value. I'm basically manipulating a series of files for consecutive months (May 2014 to Sept 2015) and I've written a macro to take advantage of the naming conventions. However, I'm still manually writing out the months to use the macro. I'm doing this many times over with lots of different files from this month. Is there a way to have the parameter reference a list of values and go through them like an array/do-loop? I've looked into %ARRAY as a possibility but that doesn't seem to do what I'm looking for unless I'm not seeing it's full capability. I've attached a sample of this code below.
%MACRO memmonth(monyr=);
proc freq data=work.both_&monyr ;
table var1/ out=work.mem_&monyr;
run;
data work.mem_&monyr;
set work.mem_&monyr;
count_&monyr=count;
run;
%MEND memmonth;
%memmonth(monyr=may14)
%memmonth(monyr=jun14)
%memmonth(monyr=jul14)
%memmonth(monyr=aug14)
%memmonth(monyr=sep14)
%memmonth(monyr=oct14)
%memmonth(monyr=nov14)
%memmonth(monyr=dec14)
%memmonth(monyr=jan15)
%memmonth(monyr=feb15)
%memmonth(monyr=mar15)
%memmonth(monyr=apr15)
%memmonth(monyr=may15)
%memmonth(monyr=jun15)
%memmonth(monyr=jul15)
%memmonth(monyr=aug15)
%memmonth(monyr=sep15)
In general I would recommend passing the list of values as a space delimited list and adding looping logic in the macro. If spaces are valid characters in the values then use some other delimiter. Do not use comma as the delimiter as it means you will need to use macro quoting to call the macro.
So your basic macro is this.
%macro memmonth(monyr);
proc freq data=work.both_&monyr ;
table var1/ out=work.mem_&monyr (rename=(count=count_&monyr)) ;
run;
%mend memmonth;
%memmonth(may14)
%memmonth(jun14)
You could change it to this.
%macro memmonth(monyrlist);
%local i monyr;
%do i=1 %to %sysfunc(countw(&monyrlist));
%let monyr=%scan(&monyrlist,&i);
proc freq data=work.both_&monyr ;
table var1/ out=work.mem_&monyr (rename=(count=count_&monyr)) ;
run;
%end;
%mend memmonth;
%memmonth(may14 jun14)
If you always want to process all of the months in an interval then you could just pass in the start and end month of the interval.
%macro memmonth(start,end);
%local i monyr;
%do i=0 %to %sysfunc(intck(month,"01&start"d,"01&end"d));
%let monyr=%sysfunc(intnx(month,"01&start"d,&i),monyy5.);
proc freq data=work.both_&monyr ;
table var1/ out=work.mem_&monyr (rename=(count=count_&monyr)) ;
run;
%end;
%mend memmonth;
%memmonth(may14,sep15)
If you have a source list, whether it is a text file or a dataset, you can use a simple data step to generate macro calls. So if you have an input dataset with the variable MONYR then your driver program would look like this:
data _null_;
set mylist ;
call execute(cats('%nrstr(memmonth)(',MONYR,')'));
run;
If the source is a file with the names then replace the SET statement with the appropriate INFILE and INPUT statements. If the source is a directory name then look at one of the many SAS ways to read the names of files in a directory into a dataset and use that to drive the macro call generation.

what is wrong with the following sas code?

I am new to sas. I have written a basic code but it is not working. Can somebody help me figure out what is wrong with the code. I wish to append the datasets.
options mprint mlogic symbolgen;
%macro temp();
%let count = 0;
%if &count = 0 %then %do;
data temp;
set survey_201106;
%let count = %eval(&count +1);
%end;
%else %do;
%do i = 201107 %to 201108;
data temp;
set temp survey_&i;
%end;
%end;
run;
%mend;
%temp;
You are setting &count to 0 at the beginning of the macro, so the %else clause will never be executed.
I'm not sure what your aim is, but it looks like you just want to concatenate 3 datasets and store in a new dataset. If so, will this not suffice:
data temp;
set survey_201106-survey_201108;
run;
This creates a dataset called temp and populates it with the the contents of survey_201106, survey_201107 and survey_201108 in order. The - tells SAS that you want the all the datasets named survey_20110* between survey_201106 and survey_201108 inclusive.
Details of the syntax.
options mprint mlogic symbolgen;
%macro temp();
proc sql noprint;
create table table_list as
select monotonic() as num,memname
from dictionary.tables
where libname = 'WORK' and memname contains 'SURVEY_';
quit;
proc sql noprint;
select count(*) into :cnt
from table_list;
quit;
%do i = 1 %to &cnt.;
%if &i eq 1 and %sysfunc(exist(work.temp)) %then %do;
proc sql;
drop table work.temp;
quit;
%end;
proc sql noprint;
select memname into :memname
from table_list
where num = &i.;
quit;
proc append base = temp data = &memname. force;
run;
%end;
%mend;
%temp;
Working: Above code will append all the work data sets whose names starting with
'SURVEY_' to temp data set.
dictionary.tables is used to create a data set which will contain the list of data sets whose names starts with 'survey_'.
table_list data set:
num memname
1 survey_201106
2. survey_201107
3. survey_201108
cnt macro variable is created to hold number of such data sets
Within loop, each data set present the list of data set names (in table table_list) will be appended to work.temp data set
What you're probably trying to do is create a macro that appends (without using proc append for some reason) when a dataset exists, or creates it new when it does.
SAS is not like r or other similar languages, where you have to control largely everything that happens. SAS's strength is that you can ask it to do common things with only a line or two of code. SAS is what's commonly called a 4th Generation Language for this reason: you're not supposed to control all of the little bits. That's a waste of time. Use the Procedures (PROC...) and constructs SAS provides you.
In this case, PROC APPEND does exactly what this whole macro does. It creates a dataset or appends new rows to it if it already exists.
proc append base=temp data=in_data;
run;
Now, if you're trying to learn the macro language and using this concept as a learning tool only, it is possible to do this in a macro that's not all that different from yours.
Note: This is not a good way to do this. It might be useful for learning macro concepts, but it should not be used as an example of good code. Despite my improvements, it is still not the way you should do this; proc append or SRSwift's example are better.
One thing I'm going to introduce here: a macro parameter. A good rule of macro programming is that all macros should have parameters. If there's no possible parameter, it should usually be possible to do without needing a macro. Parameters are what make macros useful, most of the time. In this case I'm going to rewrite your macro to take one append dataset as a parameter and one 'base' dataset. In your example, temp was the base dataset and survey_1106 etc. are the append datasets.
Also, &count needs to be a global macro variable. In SAS, variables created inside a macro are by default local in scope - ie, they only are defined inside one run of the macro and then disappear. This is nearly identical to functions in c/etc. languages (a bit different from r, which uses lexical scoping, and you might be expecting given how you wrote this). There are some funny rules, though, but for now we'll just go with this. global macro variables, which include any variable that has already been defined in the global scope, are available in all macro iterations (and outside of macros).
So:
%macro append_dataset(base=,append=);
%if &count=0 %then %do;
data &base.;
set &append.;
run;
%end;
%else %do;
data &base.;
set &base. &append.;
run;
%end;
%let count=%eval(&count.+1);
%mend append_dataset;
%let count=0;
%append_dataset(base=temp,append=survey_1106);
%append_dataset(base=temp,append=survey_1107);
%append_dataset(base=temp,append=survey_1108);
Now, you could generate those calls through an external method (such as dictionary.tables as in Harshad's example). You also could add another element to the macro, which is to iterate over all of the elements in a list provided as append. You could also hardcode the list in a %do loop, as you did in the initial example (but I think that's bad practice to get into). You could literally do that in my macro:
%macro append_dataset(base=,append=);
%do survey=201106 to 201108;
%if &count=0 %then %do;
data &base.;
set survey_&survey.;
run;
%end;
%else %do;
data &base.;
set &base. survey_&survey.s;
run;
%end;
%let count=%eval(&count.+1);
%end;
%mend append_dataset;
Notice the count increment is inside the do loop - that's one of the places you went wrong here. Otherwise this is just adding an outer loop and changing the append mentions to the calculated values from the loop. But again, this is fairly poor coding practice - the loop should at minimum be constructed from a macro parameter.