How to put special chars in a string after pipe in SAS - sas

I'm migrating some SAS software from a Unix server to a Linux server.
Currently, in a SAS program, I have the following instruction:
Filename myname pipe "ls -le &mypath." ;
(then the file myname is used in a data step as InFile myname truncover end=fine;).
The option -e, in the ls of Unix, produces a list of files where the year is always printed, even for the recently created files.
For example, compare:
myserver.myuser:/mypath> ls -l myfile
-rwxr-xr-x 1 auser auser 24422893965 Nov 5 06:17 myfile
with:
myserver.myuser:/mypath> ls -le myfile
-rwxr-xr-x 1 auser auser 24422893965 Nov 5 06:17:27 2021 myfile
In Linux, the option -e of ls does not exist, but you can have the same result with the command:
ls -l --time-style="+%b %d %T %Y"
The problem is how to use this command, which contains special characters such as " and %, in the SAS statement.
I tried:
Filename itt pipe "ls -l --time-style='+%%b %%d %%T %%Y' &mypath." ;
but I get:
total 1420
-rwxrwxr-x 1 myuser mygroup 1450000 052130864ov %d %T %Y myfile
I tried:
Filename itt pipe "ls -l --time-style='+%str(%)b %str(%)d %str(%)T %str(%)Y' &mypath." ;
but I get
total 1420
-rwxrwxr-x 1 myuser mygroup 1450000 )b )d )T )Y myfile
Is there a way to make that command work?
Any alternative solution to get the same result is welcome.

It is much easier to deal with strings that include macro trigger characters, & and %, in SAS code than it is in macro code.
If you really want to define the fileref MYNAME you could use the FILENAME() function call in a data step. Or use the QUOTE() function to generate a macro variable that has single quotes on the outside to protect the macro triggers that you could then use in your FILENAME statement.
data _null_;
length cmd $300;
cmd = catx(' ','ls -l --time-style="+%b %d %T %Y"',"&mypath");
call symputx('lscmd',quote(trim(cmd),"'"));
run;
filename myname pipe &lscmd;
But why not just build the ls command as part of the data step that reads the results? You could also use a style for the datetime value that is easier for SAS to read, such as something the DATETIME informat understands.
data ls_output;
length cmd $300;
cmd = catx(' ','ls -l --time-style="+%d%b%Y:%T"',"&mypath");
infile lscmd pipe filevar=cmd truncover ;
input mode :$11. links owner :$20. group :$20. size lastmod :datetime. file $256.;
format lastmod datetime19.;
run;
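Either format string can be checked in a terminal before it goes into a FILENAME or INFILE statement, which helps separate shell problems from SAS macro-quoting problems. The sketch below assumes GNU ls (the --time-style option is a GNU extension) and a throwaway file created with mktemp; the variable names are only for illustration.

```shell
#!/bin/sh
# Assumption: GNU coreutils ls; other ls implementations may lack --time-style.
tmpdir=$(mktemp -d)
touch "$tmpdir/myfile"

# Format from the question: month, day, time, year (similar to the old `ls -le`).
# LC_ALL=C keeps the month abbreviation in English regardless of the session locale.
with_year=$(LC_ALL=C ls -l --time-style="+%b %d %T %Y" "$tmpdir/myfile")
echo "$with_year"

# Format from the answer: ddMONyyyy:hh:mm:ss, the layout the SAS DATETIME informat reads.
sas_style=$(LC_ALL=C ls -l --time-style="+%d%b%Y:%T" "$tmpdir/myfile")
echo "$sas_style"

rm -r "$tmpdir"
```

If the second line shows a timestamp of the form 05Nov2021:06:17:27, the shell side is fine and any remaining problem is in how SAS passes the % characters through.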

Use %nrstr. It masks the % signs so that the macro processor does not try to resolve them when the statement is compiled.
Filename itt pipe "ls -l --time-style='%nrstr(+%b %d %T %Y)' &mypath.";

Related

Reading and extracting files from a directory using SAS

I have a directory /run/return/files/archives/prep/share/ that contains both .txt and .csv files.
For example IA_PROD.txt and retour_PROD.csv
I want to read both types of files and extract only their names (IA_PROD and retour_PROD) to store in an Excel file named FILE_NAMES.xlsx. The code below extracts the .txt and .csv files into two separate data sets (file_list1 and file_list2), and I then concatenate the two data sets and export them to an Excel sheet. I would like to simplify the code into a single data step that reads both the csv and txt files and extracts their names together.
Thanks for your generous help
%let REP_BLOCTEL_ALLER = /run/return/files/archives/prep/share/;
filename result pipe "ls &rep_bloctel_aller./*txt";
filename result2 pipe "ls &rep_bloctel_aller./*csv";
data file_list1;
infile result lrecl=200 truncover;
input rep $120.;
file_name = tranwrd(substr(rep, length("&rep_bloctel_aller./")+1),'.txt','');
call symput(compress('txt_'!!put(_n_,2.)),file_name);
call symput('n_obs',put(_n_,2.));
run;
data file_list2;
infile result2 lrecl=200 truncover;
input rep $120.;
file_name = tranwrd(substr(rep, length("&rep_bloctel_aller./")+1),'.csv','');
call symput(compress('csv_'!!put(_n_,2.)),file_name);
call symput('n_obs',put(_n_,2.));
run;
DATA file_list;
SET file_list1 file_list2;
RUN;
proc export data = file_list
(keep = file_name)
outfile="&rep_bloctel_aller./FILE_NAME_BLOCTEL.xlsx"
dbms=xlsx
replace;
sheet= "FILE_NAME_BLOCTEL";
run;
Not sure if it answers your question, but I think this answers your problem :)
What I propose is to filter your input files with ls:
%let REP_BLOCTEL_ALLER = /run/return/files/archives/prep/share/;
filename result pipe "ls &REP_BLOCTEL_ALLER. | egrep -i '\.csv$|\.txt$'";
data file_list;
infile result lrecl=200 truncover;
input filename $120.;
run;
I don't think we can make it shorter!
Some explanations:
ls lists your directory
the pipe | forwards the result of ls to the egrep command
egrep is used to search for a text pattern in the result sent by ls
the -i option tells egrep that the lookup must be case insensitive (it will detect TXT, txt, TxT files and so on)
the pattern '\.csv$|\.txt$' searches for either '.csv' or '.txt' at the end of a line ($), which corresponds to the file extension
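The filter can be tried on its own in a shell before it is placed inside the FILENAME statement. This is a small sketch under a few assumptions: the directory and file names are made up for the test, and grep -E is used as the equivalent of egrep (recent GNU grep releases mark egrep as obsolescent).

```shell
#!/bin/sh
# Hypothetical test directory with a mix of matching and non-matching names.
tmpdir=$(mktemp -d)
touch "$tmpdir/IA_PROD.txt" "$tmpdir/retour_PROD.CSV" "$tmpdir/notes.log" "$tmpdir/csv_readme"

# Keep only names ending in .csv or .txt, ignoring case.
# The backslash makes the dot literal; $ anchors the match at the end of the line.
matches=$(LC_ALL=C ls "$tmpdir" | grep -Ei '\.csv$|\.txt$')
echo "$matches"

rm -r "$tmpdir"
```

Only IA_PROD.txt and retour_PROD.CSV pass the filter; csv_readme is rejected because the pattern has to match at the end of the name.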
If the goal is to just generate the list into an XLSX file then you just need:
libname out xlsx "&rep_bloctel_aller./FILE_NAME_BLOCTEL.xlsx";
data out.FILE_NAME_BLOCTEL;
infile "cd &rep_bloctel_aller.; ls *.txt *.csv" pipe truncover;
input file_name $256.;
run;

how does pipe read all information of a page using SAS?

I have a folder which has some tables. Once I opened it, it shows table name, date modified, type and size.
I am trying to read all the information including: table name, date modified, type and size using SAS. so I tried pipe first:
filename tbl pipe "dir /abc/sales";
data new;
infile tbl pad;
input all $500.;
run;
The result only has the table name, but not the date modified, type and size.
How can I fix this?
An example folder 'sales' below:
table name   size   date modified            type
sales1       490k   10/28/2020 9:32:50 am    sas7bdat
sales2       85k    11/12/2020 4:28:23 pm    sas7bdat
sales3       307k   12/17/2020 1:55:09 pm    sas7bdat
From your path it looks like SAS is running on Unix. Not sure what the command dir does on your flavor of Unix, but ls -l should get the file details on any flavor of Unix.
data new;
infile "ls -l /abc/sales/" pipe truncover ;
input all $500.;
run;
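On a Linux system with GNU coreutils, dir is documented as equivalent to ls -C -b, so it prints file names in columns and nothing else, which would explain why only the table names came back. The sketch below compares the two commands on a throwaway directory; the file names are hypothetical and other Unix flavors may behave differently or not provide dir at all.

```shell
#!/bin/sh
# Assumption: GNU coreutils, where `dir` is a names-only variant of ls.
tmpdir=$(mktemp -d)
touch "$tmpdir/sales1.sas7bdat" "$tmpdir/sales2.sas7bdat"

# Names only, laid out in columns.
names_only=$(dir "$tmpdir")
echo "$names_only"

# Long listing: permissions, links, owner, group, size, modification time, name.
detailed=$(LC_ALL=C ls -l "$tmpdir")
echo "$detailed"

rm -r "$tmpdir"
```

The long listing is the one worth piping into the data step, since it carries the size and modification time that the question asks for.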

how to read files from a folder that were created before a date

I am trying to use SAS to read multiple files from a directory, but only those that were created before a given date.
I have used this code to help me to read all the files. It works perfectly. Now I found out that only some files that were created before a certain date are what I need. I think that could be done either by FILENAME PIPE Dir options or by INFILE statement options, but I cannot find the answers.
code source:
http://support.sas.com/kb/41/880.html
filename DIRLIST pipe 'dir "C:\_today\file*.csv" /b ';
data dirlist ;
infile dirlist lrecl=200 truncover;
input file_name $100.;
run;
data _null_;
set dirlist end=end;
count+1;
call symputx('read'||put(count,4.-l),cats('c:\_today\',file_name));
call symputx('dset'||put(count,4.-l),scan(file_name,1,'.'));
if end then call symputx('max',count);
run;
options mprint symbolgen;
%macro readin;
%do i=1 %to &max;
data &&dset&i;
infile "&&read&i" lrecl=1000 truncover dsd;
input var1 $ var2 $ var3 $;
run;
%end;
%mend readin;
%readin;
Currently you are reading in just the file names using the dir command. The existing /b modifier is saying print just the file name and nothing else. You want to change it to read both the file name and the CREATED date of the file. In order to do that it gets a little messy. You will need to change that pipe command from:
filename DIRLIST pipe 'dir "C:\_today\file*.csv" /b ';
...to this... :
filename DIRLIST pipe 'dir "C:\_today\file*.csv" /tc ';
The output will change from something like this:
file1.csv
file2.csv
...
...to something like this... :
Volume in drive C has no label.
Volume Serial Number is 90ED-A122
Directory of C:\_today
01/13/2017 09:14 AM 1,991 file1.csv
01/11/2017 11:43 AM 169 file2.csv
...
...
...
01/11/2017 11:43 AM 169 file99.csv
99 File(s) 6,449 bytes
0 Dir(s) 57,999,806,464 bytes free
So you will then need to modify your data step that creates dirlist to clean up the results returned by the new dir statement. You will need to ignore the header and footer and read in the date and time etc. Once you have that date and time in the appropriate SAS format, you can then just use a SAS where clause to keep the rows you are interested in. I will leave this as an exercise for you to do. If you have trouble with it you can always open a new question.
If you need more information on the dir command, you can open up a command prompt (Start Menu->Run->"cmd"), and then type in dir /? to see a list of available switches for the dir command. You may find a slightly different combination of switches for it that better suits your task than what I listed above.
You can use powershell to leverage the features of the operating system.
filename get_them pipe
" powershell -command
""
dir c:\temp
| where {$_.CreationTime -lt '3/19/2019'}
| select -property name
| ft -hidetableheader
""
";
data _null_;
infile get_them;
input;
putlog _infile_;
run;

How to check available disk space using SAS

How can I use SAS to check the space left on a drive and output a message if it is less than 1 GB?
I only have code that checks the size of a SAS file.
I've basically modified the code available in this link according to your requirement. I've also added a bit of code to fix issues faced due to quotes and the pipe command. Basically you should let SAS deal with quotes before passing on the code.
%macro windows_bytes_free(sm_path);
%global mv_bytes_free;
%let mv_bytes_free = -1; /* In case of error */
%let filepath = %sysfunc(quote(%qsysfunc(dequote(&sm_path)))); /* To prevent issues with quotes remove quotes if present and apply it again*/
/* Run the DIR command and retrieve results using an unnamed pipe */
filename tempdir pipe %sysfunc(quote(dir /-c &filepath | find "bytes free")) ;
data _null_;
infile tempdir length=reclen ;
input line $varying1024. reclen ;
re = prxparse('/([0-9]+) bytes/'); /* Parse the output of DIR using a Perl regular expression */
if prxmatch(re, line) then do;
bytes_str = prxposn(re, 1, line);
bytes = input(bytes_str, 20.);
call symputx('mv_bytes_free', bytes); /* Assign available disk space in bytes to a global macro variable; SYMPUTX converts the number without a log note or leading blanks */
kb = bytes /1024;
mb = kb / 1024;
gb = mb / 1024;
format bytes comma20.0;
format kb mb gb comma20.1;
/* Write a note to the SAS log */
put "NOTE: &sm_path " bytes= kb= mb= gb=;
if gb<1 then put '** Available space is less than 1 gb';
else put '** Enough space is available';
end;
run;
%if &mv_bytes_free eq -1 %then %put ERROR: error in windows_bytes_free macro;
%mend;
An example of how to use this macro for the C: drive
%windows_bytes_free(c:);
Tazz:
Presuming you are running SAS on a Windows platform -- Piping wmic command output into SAS can deliver vast amounts of information about the system, including the freespace on the disks.
* WMIC - Using Windows Management Instrumentation Command-line;
* https://msdn.microsoft.com/en-us/library/aa394531(v=vs.85).aspx;
%let csvdata = %sysfunc(pathname(work))\wmic_output.csv;
filename wmic_csv "&csvdata" encoding="utf-16";
filename gather pipe "wmic logicaldisk get name,size,freespace /format:csv";
* process the wmic command and strip off blank first row and extraneous CR character at end of line;
data _null_;
infile gather;
input;
if _n_ > 1;
_infile_ = compress(_infile_, '0d'x);
file wmic_csv;
put _infile_;
run;
proc import replace out=diskinfo file=wmic_csv dbms=csv;
run;
data _null_;
set diskinfo;
if freespace < 1e9 then put "WARNING: " name "has remaining" freespace=;
run;
wmic can also export its information in XML format. The output is more complicated but extremely capable. This sample code uses SAS' xmlv2 engine and the automap= option:
* WMIC - Using Windows Management Instrumentation Command-line;
* https://msdn.microsoft.com/en-us/library/aa394531(v=vs.85).aspx;
%let xmldata = %sysfunc(pathname(work))\wmic_output.xml;
%let xmlautomap = %sysfunc(pathname(work))\wmic_output-automap.xml;
%let xmlmap = %sysfunc(pathname(work))\wmic_output-map.xml;
filename wmic "&xmldata" encoding="utf-16";
filename wmicmap "&xmlmap";
filename gather pipe "wmic logicaldisk get name,size,freespace /format:rawxml > ""&xmldata""";
data _null_;
infile gather;
input;
put _infile_;
rc = sleep(.1,1);
run;
libname wmic xmlv2 automap=replace xmlmap=wmicmap;
proc copy in=wmic out=work;
run;
proc transpose data=work.property out=properties(drop=_name_) suffix=_text;
by instance_ordinal;
id property_name;
var value;
run;
filename gather;
filename wmic;
filename wmicmap;

How to get user properties like created by in sas

How to fetch user details for a .sas file or file properties for all files stored in a directory? I am trying to get all possible attributes like: modified date, modified by, created by, for a macro.
data dir_meta(keep=file_name create_date modified_date);
length file_name $256 create_date modified_date $40;
/* &dir and &extn are macro variables assumed to be set before this step */
rc = filename('dirref', "&dir");
dir_id = dopen('dirref');
if dir_id = 0 then do;
put 'ERROR: cannot open directory ' "&dir";
stop;
end;
do i = 1 to dnum(dir_id);
file_name = dread(dir_id, i);
if upcase(scan(file_name, -1, '.')) = upcase("&extn") then do;
rc = filename('fileref', catx('\', "&dir", file_name));
fid = fopen('fileref');
if fid > 0 then do;
/* Information item names depend on the operating system; these are the Windows names */
create_date = finfo(fid, 'Create Time');
modified_date = finfo(fid, 'Last Modified');
output;
rc = fclose(fid);
end;
rc = filename('fileref');
end;
end;
rc = dclose(dir_id);
run;
Sweta,
Presuming you are using SAS on a recent version of Windows and the session has the X command allowed, you can pipe the results of a powershell command to a data step to read in whatever information you want.
In powershell use this command to see the kinds of information about a file that can be selected
PS > DIR | GET-MEMBER
Once you decide on the members to select a data step can read the powershell output. For example:
filename fileinfo pipe 'powershell -command "dir | select Fullname, Length, @{E={$_.LastWriteTime.ToString(''yyyy-MM-ddTHH:mm:ss.ffffffzzz'')}} | convertTo-csv"';
* powershell datetime formatting tips: https://technet.microsoft.com/en-us/library/ee692801.aspx?f=255&MSPPError=-2147217396;
data mydata;
infile fileinfo missover firstobs=4 dsd dlm=',';
attrib
filename length=$250
size length=8 format=comma12.
lastwrite length=8 format=datetime20. informat=E8601DZ32.6
;
input filename size lastwrite;
run;