SAS: Reading all files containing a specific string - sas

I would like to read a set of files that all contain a specific string (xxx*.csv). So if my folder has the files below, the code should read in file #1 and #2.
1. xxx_123.csv
2. xxx_456.csv
3. xyz_111.csv
The first part of the filename is known, but the second part is unknown. Anyone know how I can achieve this in SAS Enterprise Guide?

You can create a wildcard fileref. I'm not sure how to do it interactively in EG, but in Base SAS the code would look similar to the below.
filename wildxxx "/path/xxx*.csv" ;
data readin ;
infile wildxxx /* + infile options */;
/* input statement */
run ;
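To make the placeholders above concrete, here is a minimal sketch of one way to fill them in. It assumes that every matching file starts with a header row and holds two numeric columns; the names id, value, fname and source_file are hypothetical and not from the question.

```sas
/* Wildcard fileref: matches every file named xxx*.csv in /path */
filename wildxxx "/path/xxx*.csv";

data readin;
    length fname source_file $256;
    /* FILENAME= stores the path of the file currently being read in fname */
    infile wildxxx dsd dlm=',' truncover filename=fname;
    input @;                             /* load the next line and hold it */
    if fname ne lag(fname) then delete;  /* first line of each file: skip the header (assumed) */
    source_file = fname;                 /* fname itself is not written to the output */
    input id value;                      /* hypothetical columns */
run;
```

Keeping the file name in source_file lets you check afterwards which rows came from xxx_123.csv and which from xxx_456.csv.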

Related

SAS adding unwanted spaces to file names when macro variables are included

I'm creating a text file with SAS and I'm using a macro variable with a date in my text file's name to make it distinct from other similar files.
The problem I'm experiencing:
SAS is adding two unwanted spaces in the middle of the file name. The unwanted spaces are placed directly before the text generated by my macro variable.
I'm certain this has everything to do with my macro variable being used, but on its own, the variable doesn't contain any spaces. Below is my code:
proc format;
picture dateFormat
other = '%Y%0m%0d%0H%0M' (datatype=datetime);
run;
data _null_;
dateTime=datetime();
call symput('dateTime', put(dateTime,dateFormat.));
run;
%LET FILE = text_text_abc_&dateTime..txt;
filename out "/location/here/&FILE" termstr=crlf;
data _null_; set flatfile;
/*file content is created in here*/
run;
The exported file name will look like this:
NOTE: The file OUT is:
Filename=/location/here/text_text_abc_ 201702010855.txt
If it helps, I'm using SAS E-Guide 7.1.
Any help is appreciated! Thanks, all!
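You need to assign an appropriate default length to your picture format. SAS is applying a default length of 14 (the length of the picture string), but the formatted value is only 12 characters long, so it is padded with two blanks. Set the default length to 12, e.g.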
proc format;
picture dateFormat (default=12)
other = '%Y%0m%0d%0H%0M' (datatype=datetime);
run;
Use call symputx() instead of call symput(). SAS will then automatically strip the leading and trailing blanks from the value written to the macro variable. You should really only use call symput() in the rare cases where you want the macro variable value to have leading or trailing blanks.
Run this little program to see the difference.
data _null_;
str=' XX ';
call symput('var1',str);
call symputX('var2',str);
run;
%put |&var1|;
%put |&var2|;
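Applied to the code in the question, the same idea might look like the sketch below. It reuses the dateFormat picture and the file name pattern from the question unchanged; only the call routine is swapped, and the final %put is added just to inspect the result.

```sas
proc format;
    picture dateFormat
        other = '%Y%0m%0d%0H%0M' (datatype=datetime);
run;

data _null_;
    /* symputx strips the padding blanks before storing the value */
    call symputx('dateTime', put(datetime(), dateFormat.));
run;

%let FILE = text_text_abc_&dateTime..txt;
%put NOTE: FILE=&FILE;   /* inspect the resolved name in the log */
```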

Export SAS dataset in Excel's "Text (MS-DOS)" format

I need to export a data set as text file for an ancient batch process probably running on Unix. The file has one column and all fields are numeric.
I want to create a text file which emulates the way Excel creates Text (MS-DOS) files:
Saves a workbook as a tab-delimited text file for use on the MS-DOS
operating system, and ensures that tab characters, line breaks, and
other characters are interpreted correctly. Saves only the active
sheet.
What is the best way to achieve this?
DOS uses code page 437, which is a very limited set of characters. If you don't have any special characters, you're good. If you do have special characters, you'll need to change the encoding to code page 437 in order to guarantee character compatibility. This can be done as a dataset option.
SAS internally names this pcoem437. You can see the difference in output by changing the encoding= option.
data have;
input var$;
datalines;
ElNiño
ElNino
;
run;
proc export data=have(encoding=pcoem437)
file='C:\Directory\want.txt'
dbms=tab
replace;
run;
If you just have one column then the delimiter doesn't matter. You can write the file using a DATA step very easily.
data _null_;
set have ;
file 'myfile.txt' ;
put VAR1 ;
run;
If you want to add an extra line with the column name then add this before the PUT statement.
if _n_=1 then put 'VAR1';
If you are worried about whether you need to generate LF or CRLF for the end of line you can control that with the TERMSTR= option on the FILE statement.
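Putting these pieces together, a minimal sketch could look like this. It assumes the consuming process expects CRLF line endings; use termstr=lf instead if it expects Unix-style line endings. The data set and variable names are the ones used above.

```sas
data _null_;
    set have;
    /* TERMSTR= controls the end-of-line sequence written to the file */
    file 'myfile.txt' termstr=crlf;
    if _n_ = 1 then put 'VAR1';   /* optional header line */
    put VAR1;
run;
```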

How to import txt file in SAS using infile command

I'm a beginner in SAS and am currently using SAS 9.1. I want to import a txt file using the infile command, but it is giving an error. My code is as follows:
data sasdata.twenty;
infile "C:\Users\Ravi Raghava\Desktop\Cricket.txt" firstobs=2;
input Position Runs Sixes Fours balls;
run;
Any help will be highly appreciated.
Usually I do it by the following:
proc import datafile='data.txt'
out=TableName
replace;
delimiter='09'x;
run;
It works correctly.
You are using the default SAS list input: blanks are taken as the separator, and every variable without a declared type is read as numeric. You are assuming that SAS knows what you meant. You know what happens when you assume.
Use a
length Position $20 Runs Sixes Fours balls 8;
statement. That will at least make sure the first word is treated as alphanumeric and the rest as numbers. Also use
infile ..... dlm=',';
if you have a CSV file, or
infile ..... dlm='09'x;
if it is a tab-separated file.
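Combined with the code from the question, this could look like the following sketch. It assumes that Cricket.txt is tab-separated, that Position holds text, and that the sasdata libref is already assigned; none of this is confirmed by the question.

```sas
data sasdata.twenty;
    /* declare types first: Position is character, the rest numeric */
    length Position $20 Runs Sixes Fours balls 8;
    infile "C:\Users\Ravi Raghava\Desktop\Cricket.txt"
           dlm='09'x dsd truncover firstobs=2;   /* tab-delimited (assumed), skip the header row */
    input Position Runs Sixes Fours balls;
run;
```

For a comma-separated file, replace dlm='09'x with dlm=','.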

Stata: Appending multiple files and extracting variables from file names

I have 114 files with .dat extension to convert to Stata/SE and append, with substantial number of variables (varying from 81 to 16800). I have reset max number of variables to 32000 (set maxvar 32000), increased the memory (set mem 500m) and I was using the following algorithm to combine large number of files and to generate several variables by extracting parts of file names: http://www.ats.ucla.edu/stat/stata/faq/append_many_files.htm
The code looks as follows:
cd "C:\Users\..."
! dir *.dat /a-d /b >d:\Stata_directory\Products_batchfilelist.txt
file open myfile using "d:\Stata_directory\Products_batchfilelist.txt", read
file read myfile line
drop _all
insheet using `line', comma names
gen n = substr("`line'",10,1)
gen m = substr("`line'",12,1)
gen playersnum = substr("`line'",14,1)
save Products_merged.dta, replace
drop _all
file read myfile line
while r(eof)==0 {
insheet using `line', comma names
gen n = substr("`line'",10,1)
gen m = substr("`line'",12,1)
generate playersnum = substr("`line'",14,1)
save `line'.dta, replace
append using Products_merged.dta
save Products_merged.dta,replace
drop _all
file read myfile line
}
The problem is that although the variables n, m and playersnum extracted from the file names are present in each individual file, they disappear in the final "Products_merged.dta" file. Could anyone tell me what the problem could be and whether it is possible to solve it with Stata/SE?
I don't see an obvious problem with the code that would be causing this. It may have something to do with the limits in SE, but that is still unlikely in my mind (you would see an error if a command does something to exceed maxvar).
My only suggestion would be to put a couple commands inside the append loop that will help you debug:
save `line'.dta, replace
append using Products_merged.dta
assert m!="" & n!="" & playersnum!=""
save Products_merged.dta,replace
This will do two things: ensure your variables exist after each new append (your first-order concern), and check that they are never blank (not your stated concern but a good check anyway).
If you post a couple of the files I could probably give a better answer.

Can SAS 9.1.3 read csv file with utf8 BOM?

I am using SAS 9.1.3 in AIX 5.3
I have to proc import a CSV file using SAS.
The first line of CSV are column names.
SAS reports error in the log.
Then I found out that the CSV file has 3 characters (the UTF-8 byte order mark, bytes EF BB BF) at the very beginning of the file.
I tried to use :
filename XXX 'XXXXXXXXXX' BOM ;
But this is a syntax error.
I replaced BOM with BOMFILE, but it is still a syntax error.
It seems that SAS 9.1.3 cannot recognize the BOM options.
Does anyone have similar experience?
Instead of the import procedure, you might try a data step like the following:
data test;
infile "data.csv" firstobs=2 dlm=','; /* assuming delimiter is a comma */
input /* use Input with $UTF8Xw. informat */
field1 $utf8x3. /* input fields 1 through 3 */
field2 $utf8x10.
field3 $utf8x3.
;
run;
SAS can read this (in 9.1 and later, at least), but your SAS session must be running with the DBCS and encoding options set.
-DBCS
-encoding UTF-8
These need to be in the sasconfig file or on the command line at invocation. With these options the default encoding for the SAS session is Unicode. Without them, Unicode options pass syntax checks but have no effect.
You can try using the encoding= option on the infile statement, but for me that never worked.
For some related info see also http://www.phuse.eu/download.aspx?type=cms&docID=3658