How can I use SAS to check the space left on a drive and output a message if it is less than 1 GB?
I only have code that checks the size of a SAS file.
I've modified the code available in this link to fit your requirement. I've also added some code to fix problems caused by quotes and the pipe command. In short, let SAS resolve the quoting before the command is passed on.
%macro windows_bytes_free(sm_path);
%global mv_bytes_free;
%let mv_bytes_free = -1; /* In case of error */
%let filepath = %sysfunc(quote(%qsysfunc(dequote(&sm_path)))); /* To prevent issues with quotes remove quotes if present and apply it again*/
/* Run the DIR command and retrieve results using an unnamed pipe */
filename tempdir pipe %sysfunc(quote(dir /-c &filepath | find "bytes free")) ;
data _null_;
infile tempdir length=reclen ;
input line $varying1024. reclen ;
re = prxparse('/([0-9]+) bytes/'); /* Parse the output of DIR using a Perl regular expression */
if prxmatch(re, line) then do;
bytes_str = prxposn(re, 1, line);
bytes = input(bytes_str, 20.);
call symputx('mv_bytes_free', bytes); /* Assign available disk space in bytes to a global macro variable; SYMPUTX avoids the BEST12. conversion that CALL SYMPUT applies to large numbers */
kb = bytes /1024;
mb = kb / 1024;
gb = mb / 1024;
format bytes comma20.0;
format kb mb gb comma20.1;
/* Write a note to the SAS log */
put "NOTE: &sm_path " bytes= kb= mb= gb=;
if gb<1 then put '** Available space is less than 1 gb';
else put '** Enough space is available';
end;
run;
%if &mv_bytes_free eq -1 %then %put ERROR: error in windows_bytes_free macro;
%mend;
An example of how to use this macro for the C: drive
%windows_bytes_free(c:);
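The macro also leaves the result in the global macro variable mv_bytes_free, so later code can act on it. Below is a minimal sketch of a wrapper that does so; the macro name check_free_space and the min_bytes parameter are hypothetical names introduced here for illustration, and the 1 GB default uses 1024**3 bytes to match the gb calculation above.

```sas
/* Hypothetical wrapper: warn when free space drops below a threshold */
%macro check_free_space(drive, min_bytes=1073741824);
    /* Populates the global macro variable mv_bytes_free (-1 on failure) */
    %windows_bytes_free(&drive);

    %if %sysevalf(&mv_bytes_free = -1) %then
        %put WARNING: could not determine free space on &drive;
    %else %if %sysevalf(&mv_bytes_free < &min_bytes) %then
        %put WARNING: &drive has less than &min_bytes bytes free;
    %else
        %put NOTE: &drive has at least &min_bytes bytes free;
%mend check_free_space;

%check_free_space(c:);
```

%sysevalf is used for the comparisons because byte counts on large drives can exceed what is comfortable for integer-only macro arithmetic.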
Tazz:
Presuming you are running SAS on a Windows platform, piping wmic command output into SAS can deliver vast amounts of information about the system, including the free space on the disks.
* WMIC - Using Windows Management Instrumentation Command-line;
* https://msdn.microsoft.com/en-us/library/aa394531(v=vs.85).aspx;
%let csvdata = %sysfunc(pathname(work))\wmic_output.csv;
filename wmic_csv "&csvdata" encoding="utf-16";
filename gather pipe "wmic logicaldisk get name,size,freespace /format:csv";
* process the wmic command and strip off blank first row and extraneous CR character at end of line;
data _null_;
infile gather;
input;
if _n_ > 1;
_infile_ = compress(_infile_, '0d'x);
file wmic_csv;
put _infile_;
run;
proc import replace out=diskinfo file=wmic_csv dbms=csv;
run;
data _null_;
set diskinfo;
if freespace < 1e9 then put "WARNING: " name "has remaining" freespace=;
run;
wmic can also export its information in XML format. The output is more complicated but extremely capable. This sample code uses the SAS xmlv2 engine and the automap= option:
* WMIC - Using Windows Management Instrumentation Command-line;
* https://msdn.microsoft.com/en-us/library/aa394531(v=vs.85).aspx;
%let xmldata = %sysfunc(pathname(work))\wmic_output.xml;
%let xmlautomap = %sysfunc(pathname(work))\wmic_output-automap.xml;
%let xmlmap = %sysfunc(pathname(work))\wmic_output-map.xml;
filename wmic "&xmldata" encoding="utf-16";
filename wmicmap "&xmlmap";
filename gather pipe "wmic logicaldisk get name,size,freespace /format:rawxml > ""&xmldata""";
data _null_;
infile gather;
input;
put _infile_;
rc = sleep(.1,1);
run;
libname wmic xmlv2 automap=replace xmlmap=wmicmap;
proc copy in=wmic out=work;
run;
proc transpose data=work.property out=properties(drop=_name_) suffix=_text;
by instance_ordinal;
id property_name;
var value;
run;
filename gather;
filename wmic;
filename wmicmap;
Related
I want to get the total number of rows (count) from a SAS dataset file using SAS code.
I tried this code:
data _null_; infile "C:\myfiles\sample.sas7bdat" end=eof; input; if eof then put "Lines read=====:" ; run;
This is the output I get. It does not show the number of lines in the file:
Lines read=====:
NOTE: 1 record was read from the infile
"C:\myfiles\sample.sas7bdat".
However, I know the number of lines in that sample.sas7bdat file is more than 1.
Please help!
The INFILE statement is for reading a file as raw TEXT. If you have a SAS dataset then you can just SET the dataset to read it into a data step.
So the equivalent for your attempted method would be something like:
data _null_;
set "C:\myfiles\sample.sas7bdat" end=eof;
if eof then put "Observations read=====:" _n_ ;
run;
One cool thing about sas7bdat files is the amount of metadata stored with them. The row count of that file is already known by SAS as an attribute. You can use proc contents to read it. Observations is the number of rows in the table.
libname files "C:\myfiles";
proc contents data=files.sample;
run;
A more advanced way is to open the file directly using macro functions.
%let dsid = %sysfunc(open(files.sample) ); /* Open the file */
%let nobs = %sysfunc(attrn(&dsid, nlobs) ); /* Get the number of observations */
%let rc = %sysfunc(close(&dsid) ); /* Close the file */
%put Total observations: &nobs;
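The same stored attribute can also be queried through PROC SQL. The following is a minimal sketch, assuming the files libref and sample dataset from the answer above; it reads the NOBS column of DICTIONARY.TABLES, which is the physical observation count and can include observations marked for deletion (NLOBS above is the logical count).

```sas
libname files "C:\myfiles";

proc sql noprint;
    /* Library and member names are stored in uppercase in the dictionary tables */
    select nobs into :nobs trimmed
    from dictionary.tables
    where libname = 'FILES' and memname = 'SAMPLE';
quit;

%put Total observations: &nobs;
```

The TRIMMED keyword removes the leading blanks that SELECT INTO would otherwise store in the macro variable.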
I am trying to iterate over a list of different files and input them.
DO Year = 2000 to 2021 by 1;
filename fileftp ftp year+'.csv.gz' host='ftp.abcgov'
cd='/pub/' user='anonymous'
pass='XXXX' passive recfm=s debug;
INFILE fileftp NBYTE=n;
END;
How do I include the year in the file name?
Currently, when I try this (year+'.csv.gz'), SAS incorrectly tries to interpret year as an option.
You need to use the SAS macro facility for this. Since your file is zipped, you'll also need to unzip it before importing the data.
%macro importData;
%do year = 2000 %to 2021;
filename fileftp ftp "&year..csv.gz"
host = 'ftp.abcgov'
cd = '/pub/'
user = 'anonymous'
pass = 'XXXX'
recfm = s
passive
debug
;
filename download temp;
/* Download the file to a temporary local space */
%let rc = %sysfunc(fcopy(fileftp, download));
/* Unzip the file */
filename unzip zip "%sysfunc(pathname(download))" gzip;
/* Read the data and output it by year */
proc import
file = unzip
out = want&year.
dbms = csv
replace;
run;
%end;
%mend;
%importData;
If fcopy does not work for you, you can use a data step to write one file to another.
data _null_;
infile fileftp;
file download;
input;
put _INFILE_ ;
run;
Good Morning
I tried to download a zip file from a website and save it to a specific location.
The location I want to use is
S:\Projects\
Method 1:
My first attempt is below.
DATA _null_ ;
x 'start https://yehonal.github.io/DownGit/#/home?url=https:%2F%2Fgithub.com%2FCSSEGISandData%2FCOVID-19%2Ftree%2Fmaster%2Fcsse_covid_19_data%2Fcsse_covid_19_daily_reports';
RUN ;
With Method 1 I can download the file, but it is automatically saved to my Downloads folder.
Method 2:
So I tried this approach instead.
filename out "S:\Projects\csse_covid_19_daily_reports.zip";
proc http
url='https://yehonal.github.io/DownGit/#/home?url=https:%2F%2Fgithub.com%2FCSSEGISandData%2FCOVID-19%2Ftree%2Fmaster%2Fcsse_covid_19_data%2Fcsse_covid_19_daily_reports'
method="get" out=out;
run;
But this code does not work; nothing is downloaded.
How can I download the file from the web and save it to a specific location?
I would recommend a macro in this case, called via CALL EXECUTE: the macro downloads one file, and a data step calls it once per date. It took about a minute to run on SAS OnDemand for Academics (a free cloud service).
*set start date for files;
%let start_date = 01-22-2020;
*macro to import data;
%macro importFullData(date);
*file name reference;
filename out "/home/fkhurshed/WANT/&date..csv";
*file to download;
%let download_url = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/&date..csv";
proc http url=&download_url
method="get" out=out;
run;
*You can add in data import/append steps here as well as necessary;
%mend;
%importFullData(&start_date.);
data importAll;
start_date=input("&start_date", mmddyy10.);
*runs up to previous day;
end_date=today() - 1;
do date=start_date to end_date;
formatted_date=put(date, mmddyyd10.);
str=catt('%importFullData(', formatted_date, ');');
call execute(str);
end;
run;
When viewed in a browser, the URL uses JavaScript to construct a zip file that is then downloaded automatically. PROC HTTP does not run JavaScript, so it cannot download the final response (the constructed zip file), and you get the 404 message instead.
The list of files in the repository can be obtained as JSON from the URL
https://api.github.com/repos/CSSEGISandData/COVID-19/contents/csse_covid_19_data/csse_covid_19_daily_reports
The listing data contains the download_url for each csv file.
A download_url will look like
https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/01-22-2020.csv
You can download individual files with SAS per @Reeza, or
use git commands or the SAS git* functions to download the repository
AFAIK, git archive for downloading only a specific subfolder of a repository is not surfaced by the GitHub server
use svn commands to download a specific folder from a git repository
this requires svn to be installed (https://subversion.apache.org/); I used SlikSvn
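The JSON listing mentioned above can be read directly in SAS. This is a rough sketch, assuming SAS 9.4M4 or later (for the JSON libname engine) and that the API is reachable from your session. The fileref listing, libref gh, and dataset csv_files are hypothetical names, and ROOT is the table name the JSON engine typically generates for a top-level array, so check the library contents if it differs. The GitHub contents API also caps directory listings at 1,000 entries, so the listing may be incomplete for this folder.

```sas
filename listing temp;

/* Fetch the directory listing as JSON */
proc http
    url="https://api.github.com/repos/CSSEGISandData/COVID-19/contents/csse_covid_19_data/csse_covid_19_daily_reports"
    method="get" out=listing;
run;

/* Let the JSON engine map the response to tables */
libname gh json fileref=listing;

/* Keep only the csv entries and their download URLs */
data csv_files;
    set gh.root;
    where lowcase(scan(name, -1, '.')) = 'csv';
    keep name download_url;
run;

libname gh clear;
filename listing clear;
```

Each download_url value in csv_files could then be passed to a download macro through CALL EXECUTE.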
Example:
Make some series plots of a response by date from stacked imported downloaded data.
options noxwait xsync xmin source;
* use svn to download all files in a subfolder of a git repository;
* local folder for storing data from
* COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University;
%let covid_data_root = c:\temp\csse;
%let rc = %sysfunc(dcreate(covid,&covid_data_root));
%let download_path = &covid_data_root\covid;
%let repo_subdir_url = https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_daily_reports;
%let svn_url = %sysfunc(tranwrd(&repo_subdir_url, tree/master, trunk));
%let os_command = svn checkout &svn_url "&download_path";
/*
* uncomment this block to download the (data) files from the repository subfolder;
%sysexec %superq(os_command);
*/
* codegen and execute the PROC IMPORT steps needed to read each csv file downloaded;
libname covid "&covid_data_root.\covid";
filename csvlist pipe "dir /b ""&download_path""";
data _null_;
infile csvlist length=l;
length filename $200;
input filename $varying. l;
if lowcase(scan(filename,-1,'.')) = 'csv';
out = 'covid.day_'||translate(scan(filename,1,'.'),'_','-');
/*
* NOTE: Starting 08/11/2020 FIPS data first starts appearing after a few hundred rows.
* Thus the high GuessingRows
*/
template =
'PROC IMPORT file="#path#\#filename#" replace out=#out# dbms=csv; ' ||
'GuessingRows = 5000;' ||
'run;';
source_code = tranwrd (template, "#path#", "&download_path");
source_code = tranwrd (source_code, "#filename#", trim(filename));
source_code = tranwrd (source_code, "#out#", trim(out));
/* uncomment this line to import each data file */
*call execute ('%nrstr(' || trim (source_code) || ')');
run;
* memname is always uppercase;
proc contents noprint data=covid._all_ out=meta(where=(memname like 'DAY_%'));
run;
* compute variable lengths for LENGTH statement;
proc sql noprint;
select
catx(' ', name, case when type=2 then '$' else '' end, maxlen)
into
:lengths separated by ' '
from
( select name, min(type) as type, max(length) as maxlen, min(varnum) as minvarnum, max(varnum) as maxvarnum
from meta
group by name
)
order by minvarnum, maxvarnum
;
quit;
* stack all the individual daily data;
data covid.all_days;
attrib date length=8 format=mmddyy10.;
length &lengths;
set covid.day_: indsname=dsname;
date = input(substr(dsname,11),mmddyy10.);
format _character_; * reset length based formats;
informat _character_; * reset length based informats;
run ;
proc sort data=covid.all_days out=us_days;
where country_region = 'US';
by province_state admin2 date;
run;
ods html gpath='.' path='.' file='covid.html';
options nobyline;
proc sgplot data=us_days;
where province_state =: 'Cali';
*where also admin2=: 'O';
by province_state admin2;
title "#byval2 County, #byval1";
series x=date y=confirmed;
xaxis valuesformat=monname3.;
label province_state='State' admin2='County';
label confirmed='Confirmed (cumulative?)';
run;
ods html close;
options byline;
Plots
My coworker and I have three zip files, representing three iterations of a monthly download from CMS of the NPPES Data Dissemination (March, April, and May). We use the following code to extract what we need from the newest zip file and create a fairly compact dataset.
PROC IMPORT OUT=NPI_Layout
DATAFILE= "&dir./NPI File Layout.xlsx"
DBMS=XLSX REPLACE;
SHEET="Sheet1";
RUN;
options compress = yes;
data npi_layout;
set npi_layout;
length infmt fmt inpt $60. lbl $200.;
if type = 'NUMBER' then do;
infmt = 'informat '||compress(field)||' '||compress(length)||'.;';
fmt = 'format '||compress(field)||' '||compress(length)||'.;';
inpt = compress(field);
end;
else if type = 'VARCHAR' then do;
infmt = 'informat '||compress(field)||' $'||compress(length)||'.;';
fmt = 'format '||compress(field)||' $'||compress(length)||'.;';
inpt = compress(field)||' $';
end;
else if type = 'DATE' then do;
infmt = 'informat '||compress(field)||' mmddyy10.;';
fmt = 'format '||compress(field)||' date9.;';
inpt = compress(field);
end;
lbl = 'label '||compress(field)||" = '"||trim(label)||"';";
run;
proc sql noprint;
select infmt
,fmt
,inpt
,lbl
into :infmt1 -
,:fmt1 -
,:inpt1 -
,:lbl1 -
from npi_layout;
quit;
%macro loop;
%let infmt_stmnt = ;
%let fmt_stmnt = ;
%let inpt_stmnt = input;
%let lbl_stmnt = ;
%do i = 1 %to &sqlobs;
%let infmt_stmnt = &infmt_stmnt &&infmt&i;
%let fmt_stmnt = &fmt_stmnt &&fmt&i;
%let inpt_stmnt = &inpt_stmnt &&inpt&i;
%let lbl_stmnt = &lbl_stmnt &&lbl&i;
%end;
data npi.npi;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile inzip(npidata_pfile_20050523-20180513.csv)
delimiter = ',' MISSOVER DSD lrecl = 32767 firstobs = 2;* obs = 10000;
&infmt_stmnt;
&fmt_stmnt;
&inpt_stmnt;
&lbl_stmnt;
run;
%mend loop;
%loop;
When we run the above code on the file from March, we get a successful output. However, when we try to run it on the April and May downloads, we get the following error:
Error in Log
ERROR: Open failure for
*dir*/NPI/Downloads/NPPES_Data_Dissemination_May_2018.zip
during attempt to create a local file handle.
Google only returns a single result, which indicates that it's an error that pops up when a filename (or path, presumably) is wrong. We've double-checked the path and filename multiple times, and it's all correct (and, obviously, the code works on the March file). Additionally, if I change the code so it's trying to pull a non-existent .csv from the zip file, it gives me a different error about that file not existing within the zip, so it's clearly seeing the zip file in the first place. We're not really sure what's going on; any advice?
(The data is sourced from http://download.cms.gov/nppes/NPI_Files.html, if you want to check the file for yourself.)
Did you try adding quotes around the member name?
infile inzip("npidata_pfile_20050523-20180513.csv") ...
Saw the same ERROR message on Windows 10 64-bit with plenty of RAM and disk space.
The Windows internals used by the ZIP engine are likely dealing with streams, which involve file handles. So I suspect the ZIP engine is trying to allocate too much RAM, or too large an intermediary file, when extracting the 6GB "npidata_pfile_20050523-20180513.csv".
Submit the issue to SAS Support -- There might be some session settings that would let the engine work against the file. If not, you'll have to extract the file outside SAS.
How large were the April and May pfile sizes?
How can I fetch user details or file properties for a .sas file, or for all files stored in a directory? I am trying to get all possible attributes, such as modified date, modified by, and created by, for use in a macro.
data dir_meta(drop=rc did fid i);
length file_name $260 create_date modified_date $40;
/* Assign a fileref to the directory (&dir) and open it */
rc = filename('dirref', "&dir");
did = dopen('dirref');
if did = 0 then do;
put "ERROR: unable to open directory &dir";
stop;
end;
do i = 1 to dnum(did);
file_name = dread(did, i);
/* Keep only files with the requested extension (&extn) */
if upcase(scan(file_name, -1, '.')) = upcase("&extn") then do;
rc = filename('fileref', catx('\', "&dir", file_name));
fid = fopen('fileref');
if fid > 0 then do;
create_date = finfo(fid, 'Create Time');
modified_date = finfo(fid, 'Last Modified');
output;
rc = fclose(fid);
end;
end;
end;
/* Close the directory and release both filerefs */
rc = dclose(did);
rc = filename('fileref');
rc = filename('dirref');
run;
Sweta,
Presuming you are using SAS on a recent version of Windows and the session allows the X command, you can pipe the results of a PowerShell command to a data step and read in whatever information you want.
In PowerShell, use this command to see the kinds of information about a file that can be selected:
PS > DIR | GET-MEMBER
Once you decide which members to select, a data step can read the PowerShell output. For example:
filename fileinfo pipe 'powershell -command "dir | select Fullname, Length, @{E={$_.LastWriteTime.ToString(''yyyy-MM-ddTHH:mm:ss.ffffffzzz'')}} | convertTo-csv"';
* powershell datetime formatting tips: https://technet.microsoft.com/en-us/library/ee692801.aspx?f=255&MSPPError=-2147217396;
data mydata;
infile fileinfo missover firstobs=4 dsd dlm=',';
attrib
filename length=$250
size length=8 format=comma12.
lastwrite length=8 format=datetime20. informat=E8601DZ32.6
;
input filename size lastwrite;
run;