I have a program that starts with:
%let filename = file1.csv;
The program then imports the file into a sas dataset and then moves some of the data to a sql table based on some rules.
I want the program to loop through whatever files are in a folder. I guess this means redefining FILENAME each time it gets to the end of the program and going back to the top, until all of the csv files in the folder have been processed?
This is an example of a process that might benefit from a macro. It sounds like your existing code is close to being ready: just wrap it in a macro definition that takes FILENAME as a parameter (and remove your %let statement).
Then your existing program becomes something like this, where the last line is the one that actually runs the steps defined in the macro:
%macro loadfile(filename=);
... existing code ....
%mend loadfile;
%loadfile(filename=file1.csv);
To extend it to load all files in a directory, you just need to generate the list of files and use it to generate a series of calls to the macro. Something like this might work on a Windows machine: it calls the Windows DIR command to get the list of files, reads each name into a variable, and generates a call to the macro for each file found. The statements pushed by CALL EXECUTE will run after the data step finishes.
data _null_;
infile 'dir /b *.csv' pipe truncover ;
input filename $256. ;
call execute(cats('%nrstr(%loadfile)(filename=',filename,')'));
run;
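On Unix or Linux the same pattern should work by piping the ls command instead of DIR; a minimal sketch (directory path hypothetical, and note that ls returns full pathnames here, so the macro's INFILE must accept them):

```sas
data _null_;
  /* read the output of ls through a pipe, one filename per line */
  infile 'ls /some/dir/*.csv' pipe truncover;
  input filename $256.;
  /* queue one macro call per file found */
  call execute(cats('%nrstr(%loadfile)(filename=',filename,')'));
run;
```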
Related
I have a PowerShell program that searches a folder on my PC for several text files. If the file is not in the folder, it writes the filename to another text file. When the procedure finishes, I have a text file with a list of files (one column) that are missing from the folder.
Next I would like PC SAS to read the list from the text file and launch the corresponding SAS program that I have already written that retrieves each file from our FTP server.
I am not sure how to go about having SAS read the filenames and launch my FTP programs. Any suggestions on how to accomplish this task?
Sounds like the first problem is you need to modify the SAS program to use a dataset with the list of filenames to process. One easy way to do that is to create a macro that downloads one file and takes the filename as an input parameter. So convert your code that downloads a file to a macro.
%macro mf_download(file);
* Code that downloads &FILE from mainframe. ;
%mend ;
Then it is easy to write a program that reads the names from a file and calls the macro for each name. So say the file with the list of names is named filelist.txt; then that part of the program might look like this:
data names;
infile 'filelist.txt' truncover ;
input filename :$256. ;
call execute(cats('%nrstr(%mf_download)(',filename,')'));
run;
When I need to call a macro several times, I've been using CALL EXECUTE in a DATA NULL step like so:
DATA _NULL_;
DO i=1 to 1000;
CALL EXECUTE ('%mymacro');
END;
RUN;
This has worked fine for me up until now. However if I use this method to call %mymacro a million times (say), I get an "out of memory" error before it runs the macro once.
My naive understanding of this is that SAS attempts to "write out" the macro a million times before executing and thus runs out of memory during this process. Is this accurate? And: what are good ways to get around this?
You just need to understand how CALL EXECUTE works.
Basically, CALL EXECUTE resolves the macro immediately, but it queues the resulting SAS steps until after the current data step finishes. In other words, you are potentially building up millions of lines of generated SAS code in memory, all stored up to be executed when that data _null_ step finishes. Eventually this gets so large that SAS crashes.
Here are a couple of solutions:
1- Either wrap the macro call in %nrstr() inside your CALL EXECUTE statements, so the macro executes as the queued code runs instead of being expanded up front.
2- Or change your data _null_ step to generate a file with the code and %include that file.
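For the first option, a minimal sketch of the loop with %nrstr() added:

```sas
data _null_;
  do i=1 to 1000000;
    /* quoted with %nrstr, the short macro call itself is queued;
       the macro only executes when the queued line runs, so the
       generated steps are never all held in memory at once */
    call execute('%nrstr(%mymacro)');
  end;
run;
```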
One option would be to change the data step so that it actually creates a .sas file that contains the macro calls, and then %include it. For example:
data _null_;
file "myfile.sas";
do i=1 to 1000;
put '%mymacro';
end;
run;
%include "myfile.sas";
This may fix the issue. Then again, I'm not sure SAS would like a .sas program that contains a million lines of code either. If that's the case, simply break the program up into, say, 10 .sas files with 100k lines of code each.
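The splitting could be done in a single data step with the FILEVAR= option, which switches to a new output file whenever the variable changes; a sketch, with file names hypothetical:

```sas
data _null_;
  length fname $40;
  do j=1 to 10;
    fname = cats('myfile', j, '.sas');
    /* FILEVAR= opens a new output file each time FNAME changes */
    file dummy filevar=fname;
    do i=1 to 100000;
      put '%mymacro';
    end;
  end;
run;
%include 'myfile1.sas';
/* ...and so on through myfile10.sas */
```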
We have hundreds of macros across several autocall libraries that I'd like to compile using the MSTORE facility.
The problem is that some of those libraries are used on other sites, and we don't want to add /STORE to every definition, as that would force those sites to use the MSTORED option.
I guess the answer is to add /* /STORE */ at the end of every %macro statement, so that the .sas program files can be updated with a regular expression (or some other text replacement technique) on every redeploy, but this seems like a messy workaround.
Is it possible / in some way supported, to compile regular (unmodified) autocall macros and store the definitions? Eg by %inc'ing each one and saving the work macro catalog, or something like that?
I won't say definitively that this isn't possible, but I can report that I tried to do the same thing some time ago and got stuck on the same point. I was also unable to find any way of doing this other than adding /store to every %macro statement.
I vaguely remember that I was able to upload the work.sasmacr catalogue from one session to another on the same machine (after first compiling a few autocall macros to populate it), but the other session didn't recognise the macro definitions from the transferred catalogue even though the appropriate options were set for using stored compiled macros.
My motivation was different from yours - I was looking for a way to define a macro in one session and execute it in another without saving it in an autocall folder or %including it in both sessions - but the conclusion was the same.
What is the problem that you had?
First compile all of your autocall macros. Say you have a fileref named MYMACS that points to the directory with the source code.
%include mymacs(macro1,macro2,.....);
You might use a program to search for all of the source files so that you could automate generating the %include statement(s). Or you could use a datastep and copy all of the source files into a single temporary file and include that.
filename src temp;
data _null_;
infile "&inpath\*.sas" ;
file src ;
input;
put _infile_;
run;
%inc src ;
Then copy the WORK catalog to a new location. Note that the catalog name is different if you are running SAS on an application server; in that case try copying from WORK.SASMAC1 instead of WORK.SASMACR.
libname out base "&path";
proc catalog cat=work.sasmacr et=macro ;
copy out=out.sasmacr ;
run;
quit;
You can test if it worked by clearing your current work macro catalog, so you know SAS is not finding the macro there, and setting options to point to the new catalog of compiled macros.
proc catalog cat=work.sasmacr kill force ;
quit;
options mrecall mstored sasmstore=out ;
Then try running one of the copied compiled macros.
Now start up a new session and try using the compiled macros in that session.
Here was the approach I took for compilation (of course there are many alternative ways). The locations to query can be extracted from:
%put %sysfunc(getoption(sasautos));
The approach relies on macros being closed off with )/*/STORE SOURCE*/; as follows:
%macro some_macro(var1=x, var2=y
)/*/STORE SOURCE*/;
The SAS code has to be a program, as you can't create a stored compiled macro from within a macro.
/* set mstore options */
options mstored sasmstore=yourlib;
/* get list of macros */
/* taken from macrocore library */
/*https://github.com/sasjs/core/blob/main/base/mp_dirlist.sas*/
%mp_dirlist(path=/location/one,outds=in1)
%mp_dirlist(path=/location/two,outds=in2)
/* set up a temporary fileref */
filename tmp temp;
/**
* write each source macro to the fileref
* and switch on the STORE option
*/
data want;
set in1 in2;
infile dummy filevar=filepath end=last ;
file tmp;
do until(last);
file tmp;
input;
if _infile_=')/*/STORE SOURCE*/;' then put ')/STORE SOURCE;';
else put _infile_;
end;
run;
%inc tmp; /* compile macros */
filename tmp clear; /* clear ref */
I have 4 txt files that need to be loaded into SAS and saved as 4 SAS data sets. The text files are named cle20130805.txt, cle20130812.txt, cle20130819.txt and cle20130826.txt. I used a %DO loop inside a %MACRO in order to get the 4 files imported with only one invocation of the macro. Here is my code:
%macro cle;
%do i=20130805 %to 20130826 %by 7;
Data cleaug.cle&i;
infile "home/abc/cle&i..txt" dlm= '|' dsd firstobs=1 obs=100;
input a_no b_no c_no;
run;
%end;
%mend cle;
%cle
I expect to have 4 SAS data sets saved by invoking the macro only once. However, it just won't run successfully. Any ideas where I am going wrong in the code?
Thanks,
I don't recommend you try to write one macro to import all four files. Either it will be a specific macro you only ever use once - in which case you could just write this by hand and save the time you've already spent - or it will be something you have to modify every single time you use it.
Instead, make the macro something that does precisely one file, but includes the information needed to call it easily. In this case, it sounds like you need one parameter: the date, so 20130805 or whatnot. Then give it a reasonable name that really says what it does.
%macro import_files(date=);
Data cleaug.cle&date.;
infile "home/abc/cle&date..txt" dlm= '|' dsd firstobs=1 obs=100;
input a_no b_no c_no;
run;
%mend import_files;
Now you call it:
%import_files(date=20130805)
%import_files(date=20130812)
%import_files(date=20130819)
%import_files(date=20130826)
Just as easy as the macro you wrote above, even hardcoding the four dates. If the dates are predictable in some fashion, you can generate the macro calls very easily as well (useful if there are more than 4, for example). You could do a directory listing of the location where the files are, or call the macro from a data step using CALL EXECUTE if you really like looping.
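For example, since these particular files are a week apart, the calls could be generated from the dates themselves; a sketch of the CALL EXECUTE route (date range hypothetical):

```sas
data _null_;
  /* step through the weekly dates and queue one macro call each */
  do date = '05AUG2013'd to '26AUG2013'd by 7;
    call execute(cats('%nrstr(%import_files)(date=',
                      put(date, yymmddn8.), ')'));
  end;
run;
```

The YYMMDDN8. format turns each SAS date into the 20130805-style string the macro expects.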
I have a process in SAS that creates a .csv. I have another process in Python that waits until the file exists and then does something with that file. I want to make sure that the Python process doesn't start doing its thing until SAS is done writing the file.
I thought I would just have SAS create the .csv and then rename the file after it was done. Can I do the renaming in SAS? Then the python process could just wait until the renamed file existed.
EDIT: I accidentally posted this question twice and then accidentally deleted both versions. Sorry about that.
I think better than renaming would be a shell script that runs the SAS program and then executes the Python script only once the SAS program has exited without errors. The syntax for this varies somewhat depending on your OS, but it wouldn't be too difficult.
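A minimal sketch of that wrapper on a Unix-like system, assuming SAS is on the PATH (program and log names hypothetical):

```shell
#!/bin/sh
# Run the SAS program in batch mode; SAS sets a nonzero exit
# code when the run has errors, so the Python step only starts
# after a clean SAS run.
sas make_csv.sas -log make_csv.log && python process_csv.py
```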
With version 9.2 and later, SAS has a RENAME function that should work just the way you would like.
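A minimal sketch of that approach, with hypothetical paths; for external files the arguments are full pathnames with type 'file', and RENAME returns 0 on success:

```sas
data _null_;
  /* write the csv under a temporary name first, then rename it
     atomically so the watcher never sees a half-written file */
  rc = rename('/data/out/output.tmp', '/data/out/output.csv', 'file');
  if rc ne 0 then put 'ERROR: rename failed, ' rc=;
run;
```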
You could generate your output in a SAS dataset and then write to the .csv file only when you're finished with it. Not sure how you're creating the csv, but I usually just do this:
data _null_;
file './output_file.csv' dlm=',';
set dataset_name;
put var1 var2;
run;
If you do this at the very end of your SAS code, no csv file will be generated until you're finished manipulating the data.
There is more than one way of doing this.
Does the Python script monitor for .csv files only, or is it triggered when any new file is created in that directory?
If it triggers only on .csv, then you could simply output to, say, a .txt file, then rename it using the X statement (an OS command) or the RENAME function as @cmjohns suggested.
Another option: output the .csv file to a different directory, then move it to the directory that Python is monitoring. As long as the file stays on the same disk, the move completes in an instant.
You could also modify the Python script to look only for .csv files and then use option 1.
You could even create a "flag" file for python to check when sas has finished with the .csv.
But if possible, I think I would go with @jgunnink's approach.
How about having the watching program watch for a separate small file instead of the large CSV file? That would be written in one write operation by SAS and reduce the risk of triggering the read before the write is done.
data _null_;
file 'bigfile.csv' dsd ;
set bigdata;
put (_all_) (+0);
run;
data _null_;
file 'bigfile.txt' ;
put 'bigfile.csv written';
run;
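On the Python side, the watcher can then poll for the small flag file before touching the CSV at all; a minimal sketch, with file names hypothetical:

```python
import os
import time

def wait_for_flag(flag_path, timeout=60.0, poll_interval=0.5):
    """Poll until flag_path exists or the timeout elapses.

    Returns True once the flag file appears, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(flag_path):
            return True
        time.sleep(poll_interval)
    return os.path.exists(flag_path)
```

Once `wait_for_flag('bigfile.txt')` returns True, the CSV is known to be completely written and can be read safely.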