How to delete external folders containing SAS datasets, using SAS

On the drive, I have date-wise folders, each containing one SAS file. I want to keep only 13 months of data (only the folders from the last 13 months) and delete any older folder on the drive. Is there code for this, or is there some other way to do it?

Use either the FDELETE or the DELETE function to do this. Another useful function is FEXIST, which checks whether a file physically exists. The SAS documentation for these functions has good examples.
Example: deleting a directory. The folder won't be deleted if it is not empty.
%let TargetPath=c:\data\temp\Folder;
FILENAME FMyRep "&TargetPath";
%LET rc=%SYSFUNC(FDELETE(FMyRep));
%PUT rc=&rc;
FILENAME FMyRep CLEAR;

The following macro will delete an entire folder and all subdirectories (recursively) - it does this by extracting the contents, and deleting all the files from the bottom upwards.
https://core.sasjs.io/mp__deletefolder_8sas.html
The test for it is here: https://core.sasjs.io/mp__deletefolder_8test_8sas_source.html
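For the 13-month retention described in the question, the directory functions can be combined with FDELETE. What follows is only an untested sketch under several assumptions that are not in the original posts: the parent folder c:\data\archive is hypothetical, the subfolders are assumed to be named YYYYMMDD, and each subfolder is assumed to contain files only (no nested folders). Run step 1 on its own and inspect WORK.OLD_FOLDERS before running the step that deletes anything.

```sas
/* Hypothetical parent folder. Subfolder names are ASSUMED to be YYYYMMDD. */
%let root = c:\data\archive;

/* Step 1: list the subfolders whose date is older than 13 months. */
data work.old_folders(keep=path);
  length name $256 path $512 dref $8;
  cutoff = intnx('month', today(), -13, 'b');  /* first day of the month, 13 months back */
  rc  = filename(dref, "&root");               /* blank DREF: SAS generates a fileref */
  did = dopen(dref);
  if did > 0 then do;
    do i = 1 to dnum(did);
      name = dread(did, i);
      if lengthn(name) = 8 then do;
        fdate = input(name, ?? yymmdd8.);      /* missing when the name is not a date */
        if not missing(fdate) and fdate < cutoff then do;
          path = catx('\', "&root", name);
          output;
        end;
      end;
    end;
    rc = dclose(did);
  end;
  else putlog "ERROR: could not open &root";
  rc = filename(dref);                         /* release the fileref */
run;

/* Step 2: empty each old folder, then delete the folder itself. */
data _null_;
  set work.old_folders;
  length member $256 fref dref $8;
  rc  = filename(dref, strip(path));
  did = dopen(dref);
  if did > 0 then do;
    /* Walk the listing backwards so a deletion cannot shift entries not yet read */
    do i = dnum(did) to 1 by -1;
      member = dread(did, i);
      rc = filename(fref, catx('\', path, member));
      rc = fdelete(fref);
      rc = filename(fref);
    end;
    rc = dclose(did);
  end;
  rc = fdelete(dref);   /* nonzero rc if anything is left inside the folder */
  putlog path= rc=;
  rc = filename(dref);
run;
```

If the folders use a different naming pattern (for example YYYYMM), the informat and the length check in step 1 would need to change accordingly. For folders with nested subdirectories, the recursive macro linked above is the better fit.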

Related

Append CSVs in folder; how to skip / delete first rows in each file

I have 25 CSV files in a folder linked as my data source. The 1st row in each file contains just the file name in column A, followed by the column headers in the 2nd row (this is how the files are generated and sent to me; I do not have access to the database).
When I remove the first row of the sample file, then promote headers, then Close & Apply, I get a list of errors which are essentially the redundant column header rows in each of the subsequent 24 files in the folder.
Upon suggestion, I changed the end of the first Applied Step in Transform Sample File from QuoteStyle.None]) to QuoteStyle.Csv]). This did not solve the problem and didn't seem to change anything.
Another suggestion was that I could just proceed with the errors; filter as needed but that it wouldn't be a problem. This seems risky/sloppy to me, but maybe it's fine and I'm just a nervous newb?
Many thanks for any input!

How can I read 7z files directly from SAS?

I'm a beginner SAS programmer with a lot of files compressed in 7z format. Because of a lack of space on the server I work on, I need to open the files directly from their compressed form. I've found the following SAS documentation about Reading Compressed Text Files:
https://support.sas.com/resources/papers/proceedings/proceedings/sugi31/155-31.pdf
However, I do not get any result using the following code, for example:
FileName Com7zipa Pipe '7za e "rie_mbco_matriz_07.7z" "rie_mbco_matriz_07.sas7bdat" -y -so';
Data DataSet07;
infile Com7zipa;
Input NRO_DOC;
run;
I hope you can help me.
Best regards,
Jean Pierre
As far as I am aware, gzip is supported in 9.4M5 or above but 7z is not. Although server space is limited, you will likely have at least a fair amount of SAS WORK directory space allocated. You can use X commands to unzip the file to the WORK directory instead and read it from there. Note that X commands must be enabled for this method to work.
Let's assume a file named myfile.7z has a csv file in it named myfile.csv.
/* Create a macro variable holding the location of the WORK directory */
%let workdir = %sysfunc(getoption(work));
x "7za e /my/dir/myfile.7z -o&workdir.";
proc import
file = "&workdir./myfile.csv"
out = myfile
dbms = csv
replace;
run;

adjusting the project.xml file in a SAS Enterprise Guide project outside SAS EG

We are going to migrate our EG projects (over 1000 projects) to a new environment.
In the old environment we use "W-Latin" as encoding on the Teradata database.
In the new environment we will start using "UTF-8" as encoding on the Teradata database.
And a lot of other changes which I believe are not relevant for this question.
To prevent data issues we will have to replace functions like REVERSE, etc with KREVERSE, etc
We could do this by opening all projects and clicking through them to change the functions in the expression builder.
This would be really time consuming, considering that we have over 1000 .egp files.
We already have a code scanner that unzips the .egp file and detects all uses of these functions in the project.xml file.
The next step could be that we find and replace the functions and put the project.xml file back in the .egp file.
Who can tell me how to put the project.xml file back in the .egp file without corrupting the .egp file?
I was able to do this.
tl;dr -- Zip the files back up and change the extension to .egp.
Created a new EG project and added a code node to create sample data:
data test;
do cat = "A", "B", "C";
do i=1 to 10;
r = rannor(123);
output;
end;
end;
drop i;
run;
I then added a Query node to the output to do a "SUM" of the r column by cat.
Ran the flow and got expected output.
Saved the EG project.
Opened the EG Project in 7zip and extracted the archive to a location.
In project.xml, I found the section for the Query and changed the SUM to MEAN
<Expression>
<LHS_TYPE>LHS_FUNCTION</LHS_TYPE>
<LHS_DMCOLGROUP>Numeric</LHS_DMCOLGROUP>
<RHS_TYPE>RHS_COLUMN</RHS_TYPE>
<RHS_DMCOLGROUP>Numeric</RHS_DMCOLGROUP>
<InFormat />
<LHS_String>MEAN</LHS_String>
<LHS_Calc />
<OutputType>OPTYPE_NOTSET</OutputType>
<RHS_StringOne>r</RHS_StringOne>
<RHS_StringTwo />
</Expression>
Selected the files and added them to an archive using 7zip. Selected "zip" compression and saved the file with the ".egp" extension.
I opened the project in EG and ran the flow. The output was now the MEAN of R and not the SUM.

Solution to work disk space not enough in sas

I have more than 50 tables running in work. Before, it worked well.
But recently, there are some errors like:
ERROR: An I/O error has occurred on file WORK.'SASTMP-000000030'n.UTILITY.
ERROR: File WORK.'SASTMP-000000030'n.UTILITY is damaged. I/O processing did not complete.
NOTE: Error was encountered during utility-file processing. You may be able to execute the SQL statement successfully if you allocate more space to the WORK library.
ERROR: There is not enough WORK disk space to store the results of an internal sorting phase.
ERROR: An error has occurred.
Does anyone know how to solve this error?
Your disk is full. If this is running on a server, ask your system administrator to investigate the problem.
If this is your desktop, find and delete un-needed files to free up space.
Clean out old SAS Work Folders
Often, old SAS Work folders do not get cleared when SAS closes. You can get back a lot of disk space by going to the path defined for SAS Work, and deleting all the old folders.
In SAS
%put %sysfunc(pathname(work));
will show you where the current WORK library is located. One level up is where all SAS Work folders are created.
On my system, that returns:
C:\Users\dpazzula\AppData\Local\Temp\SAS Temporary Files\_TD9512_GXM2L12-PAZZULA_
That means that I should look in "C:\Users\dpazzula\AppData\Local\Temp\SAS Temporary Files\" to find old folders to delete.
Your work space is full.
Your SAS server uses a dedicated directory where all SAS sessions store their temporary files: All files in the work libraries, as well as temp files as used while sorting, joining etc.
Solutions:
Have more space allocated.
Make certain to put only necessary files into WORK, clean up after yourself, and close old sessions.
Run fewer processes.
Replace interim datasets with views instead, especially if you're using large source datasets :
data master /view=master ;
set lib.monthlydata20: ; /* all datasets since Jan 2000 */
run ;
proc sql ;
create table want as
select *
from master
where ID in(select ID from lookup) ;
quit ;
Try compressing all datasets with this option:
OPTIONS COMPRESS=YES REUSE=YES;
Put it at the very beginning of your code. It compresses every dataset created afterwards; how much space is saved depends on the data, and can approach 98% in favorable cases. Compression consumes more CPU but reduces file size and disk I/O, which often makes the code run faster.
In some cases, this might not help if the compressed data sets exceed the hard disk space.
Also, change your work directory to the biggest drive that has disk space.
Study your code.
Create a Data Flow Diagram to determine WHEN each file is created, where it is used downstream. Find out when a data set is no longer needed and DELETE it. If you have 50 data sets, chances are numerous data sets are 'value-added' by a subsequent step, and can go away freeing up your work space. A cute trick is to REUSE some of the data set names - to keep the number of unneeded data sets in check.
Rule of thumb: leave the environment the way you found it - if there were no files in WORK to start, manually clean up after yourself. Unless it is a Stored Process, which starts a completely new SAS job, and will clean up after itself upon completion of the job.
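As a small illustration of deleting intermediate data sets once they are no longer needed, PROC DATASETS can remove named members from WORK without ending the session. The table names step1 and step2 below are hypothetical placeholders, not names from the original question.

```sas
/* Delete specific intermediate tables (hypothetical names) from WORK */
proc datasets library=work nolist;
  delete step1 step2;
quit;

/* Or, at the very end of a job, remove everything left in WORK */
proc datasets library=work kill nolist;
quit;
```

The KILL option deletes every member of the library at once, so it is probably only appropriate as the final step of a program.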

SAS: Set current folder to the folder containing the running program

I've just started learning SAS because I'm required to use it for a statistics course. For this course, the university provides SAS 9.2 through their virtual-machine setup: I make a reservation in their system, they generate a VM on one of their servers, and I connect to the VM using Microsoft's Remote Desktop client. The virtual machines are generated and erased per session; settings are reset every time, and files must be stored on my client computer (which is accessible in the VM by a UNC path).
Within this setup, when I open a program file stored on my laptop, I've only been able to access the accompanying data files (each stored in the same folder as the program) either by hardcoding the full path or by updating the "current folder" setting at the beginning of each session. The first is problematic because it means the program won't run anywhere else - in particular, when I email it to the professor. The second is inconvenient, because browsing to this particular UNC path is time consuming, and I already have to browse to the same path to open the program file.
I want to make this easier by programmatically setting the current folder to the folder containing the program. Then I could just open the file and get to work. I've found some examples of getting the filename of the program file, of getting the path to a fileref, and of (link limit exceeded) setting the current folder, but I haven't been able to combine them in the right way. Please connect the dots for me.
To programmatically change the Windows current directory from SAS, you can use the X command, which is what really happens when you use the "Change current folder" dialog box:
x 'cd "\\computername\share name\folder"';
You can also do this using the SYSTEM data step function, a method I prefer because you get a return code (but more typing of course):
data _null_;
rc = system( 'cd "\\computername\share name\folder"' );
if rc = 0
then putlog 'Command successful';
else putlog 'Command failed';
run;
Note the UNC path is surrounded with double-quotes, which is necessary if the path contains blanks.
Of course, this still requires you to manually type in the command, but it might be something you could add to the program source code. If your VM environment allowed you to maintain some permanent presence on the server, you could save this command into a start-up file.
I would ask your professor for advice; if you are working with data given to you as part of your class, you may only need to send the source code. On the other hand, if you are creating output data as part of your assignment, your professor might want you to deliver source code and SAS data sets. Surely he or she will have some procedure.
Complete Answer:
SAS's obtuse notation requires some strange delimiter fiddling to combine my partial solution (finding the path) with @Bob Duell's partial solution (setting the current folder). There seem to be two key rules involved:
&var is expanded in double-quoted strings ("&var"), but not single-quoted strings ('&var')
Quotes in &var are not treated as delimiters after expansion
So the solution is to compute a string of the quoted path (where the quotes are part of the string), and expand that within a double-quoted parameter to X or SYSTEM:
%let qsrc=%str(%")&src%str(%");
X "cd &qsrc";
It's not required to store the string, both &src and &qsrc can be expanded in-place, which yields a single statement solution:
X "cd %str(%")%substr(%sysget(SAS_EXECFILEPATH),1,%eval(%length(%sysget(SAS_EXECFILEPATH))-%length(%sysget(SAS_EXECFILENAME))))%str(%")";
This executes correctly, but breaks the syntax coloring in the GUI. Within a string, %str(%") and "" both expand to ", so replacing %str(%") with "" both executes correctly and is colored correctly in the GUI:
X "cd ""%substr(%sysget(SAS_EXECFILEPATH),1,%eval(%length(%sysget(SAS_EXECFILEPATH))-%length(%sysget(SAS_EXECFILENAME))))""";
This inherits the limitation that it only works when SAS_EXECFILEPATH and SAS_EXECFILENAME are defined, which is the case when running from within the Windows GUI editor. It's also subject to any limitations in the "cd" command, which SAS intercepts rather than invoking the Windows command line. I expect it will fail on paths containing quotes.
A partial answer: One way to get the containing folder from the filename of the program file
Spread out & logging steps:
/* Find PathName of folder containing program */
%let FullName=%sysget(SAS_EXECFILEPATH);
%put FullName: &FullName.;
%let FullLen=%length(&FullName);
%put FullLen: &FullLen.;
%let BaseName=%sysget(SAS_EXECFILENAME);
%put BaseName: &BaseName.;
%let BaseLen=%length(&BaseName);
%put BaseLen: &BaseLen.;
%let PathLen=%eval(&FullLen.-&BaseLen.);
%put PathLen: &PathLen.;
%let PathName=%substr(&FullName,1,&PathLen);
%put PathName: &PathName.;
Consolidated & silent:
/* Find src folder */
%let src=%substr(%sysget(SAS_EXECFILEPATH),1,%eval(%length(%sysget(SAS_EXECFILEPATH))-%length(%sysget(SAS_EXECFILENAME))));
This only works when SAS_EXECFILEPATH and SAS_EXECFILENAME are defined, and it's not clear when that is. It does work when using the Windows GUI editor.