I am trying to convert a comma-delimited text file to a pipe-delimited file, but my input file name (the comma-delimited file) is held in a variable (flname1). I am using the code below, suggested by a Stack Overflow member. The code works fine as long as I specify the file name literally in the infile statement, but I don't know how to specify the file name as a variable:
data _null_;
enddate=date();
flname1=compress("d:\temp\wq_" || year(enddate) || put(month(enddate),z2.) || ".txt");
length x1-x6 $200;
infile 'flname1' dsd dlm=',' truncover;
file 'C:\temp\pipe.txt' dsd dlm='|';
input x1-x6;
put x1-x6;
run;
I am new to SAS and any help will be greatly appreciated. Thank you!
You should be able to use the filevar option in the infile statement, e.g.:
data _null_;
enddate=date();
flname1=compress("d:\temp\wq_"||year(enddate)||put(month(enddate),z2.)||".txt");
length x1-x6 $200;
infile myinputfile dsd dlm=',' filevar=flname1 truncover;
file 'C:\temp\pipe.txt' dsd dlm='|';
input x1-x6;
put x1-x6;
run;
The documentation explains more about the option and has an example of its use in Example 5.
You probably want to actually do this as a macro variable - this isn't a normal usage of filevar (which you'd use if you had a dataset with a bunch of filenames in it or something).
%let filename = d:\temp\wq_%sysfunc(today(),YYMMN6.).txt;
%put &=filename;
data _null_;
length x1-x6 $200;
infile "&filename." dsd dlm=',' truncover;
file 'C:\temp\pipe.txt' dsd dlm='|';
input x1-x6;
put x1-x6;
run;
Macro variables are just text substitutions, so they can be used wherever you could type the same thing in. They also don't need concatenating functions - any more than you have to concatenate when you type a word in - so it's easier to do.
Here, I use %sysfunc to tell SAS to execute the today() function, and the second argument tells it how to format the result - YYMMN6. looks like the format you want (201506 or similar). Then just make sure to use " quotes, not ' quotes, as the latter don't let the macro variable resolve.
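To see the quoting difference for yourself, a minimal sketch along these lines should do (it reuses the filename macro variable from above; nothing here touches the file system):

```sas
%let filename = d:\temp\wq_%sysfunc(today(),YYMMN6.).txt;

/* Double quotes: the macro processor resolves &filename. */
%put Double quotes: "&filename.";

/* Single quotes: the text is left as-is, so &filename. is not resolved. */
%put Single quotes: '&filename.';
```

The first line in the log should show the resolved path; the second should show the literal text &filename. still inside the single quotes.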
This is a follow-up of my previous question:
How to import a txt file with single quote mark in a variable and another in another variable.
The solution there works perfectly as long as there is no variable whose values could be null.
In this latter case, I get:
filename sample 'c:\temp\sample.txt';
data _null_;
file sample;
input;
put _infile_;
datalines;
001|This variable could be null|PROVA|MILANO|1000
002||'80S WERE GREAT|FORLI'|1100
003||'80S WERE GREAT|ROMA|1110
;
data prova;
infile sample dlm='|' lrecl=50 truncover;
format
codice $3.
could_be_null $20.
nome $20.
luogo $20.
importo 4.
;
input
codice
could_be_null
nome
luogo
importo
;
putlog _infile_;
run;
proc print;
run;
Is it possible to correctly load a file like the one in the example directly in SAS, without manually modifying the original .txt?
You will need to pre-process the file to fix the issue.
If you add quotes around the values then you will not have the problem.
002||"'80S WERE GREAT"|"FORLI'"|1100
If you know that none of the values contain the delimiter, then adding a space before every delimiter
002 | |'80S WERE GREAT |FORLI' |1100
will let you read it without the DSD option.
If lines are shorter than 32K bytes then it can be done in the same step that reads the data.
data test2 ;
infile sample dlm='|' truncover ;
input @;
_infile_ = tranwrd(_infile_,'|',' |');
input (var1-var5) (:$40.);
run;
proc print;
run;
Results:
Obs  var1  var2                         var3             var4    var5
  1  001   This variable could be null  PROVA            MILANO  1000
  2  002                                '80S WERE GREAT  FORLI'  1100
  3  003                                '80S WERE GREAT  ROMA    1110
One way to test if you have the issue is to make sure each line has the right number of fields.
filename sample temp;
options parmcards=sample;
parmcards;
001|This variable could be null|PROVA|MILANO|1000
002||'80S WERE GREAT|FORLI'|1100
003||'80S WERE GREAT|ROMA|1110
;
data _null_;
infile sample dsd end=eof;
if eof then do;
call symputx('nfound',nfound);
putlog / 'Found ' nfound :comma11.
'problem lines out of ' _n_ :comma11. 'lines.'
;
end;
input;
retain expect nfound;
words=countw(_infile_,'|','qm');
if _n_=1 then expect=words;
else if expect ne words then do;
nfound+1;
if nfound <= 10 then do;
putlog (_n_ expect words) (=) ;
list;
end;
end;
run;
Example Results:
_N_=2 expect=5 words=4
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8
2 002||'80S WERE GREAT|FORLI'|1100 32
_N_=3 expect=5 words=3
3 003||'80S WERE GREAT|ROMA|1110 30
Found 2 problem lines out of 4 lines.
PS Go tell SAS to enhance their delimited file processing: https://communities.sas.com/t5/SASware-Ballot-Ideas/Enhancements-to-INFILE-FILE-to-handle-delimited-file-variations/idi-p/435977
You need to add the DSD option to your INFILE statement.
https://support.sas.com/techsup/technote/ts673.pdf
DSD (delimiter-sensitive data) option—Specifies that SAS should treat
delimiters within a data value as character data when the delimiters
and the data value are enclosed in quotation marks. As a result, SAS
does not split the string into multiple variables and the quotation
marks are removed before the variable is stored. When the DSD option
is specified and SAS encounters consecutive delimiters, the software
treats those delimiters as missing values. You can change the default
delimiter for the DSD option with the DELIMITER= option.
I am reading a .csv file into SAS where some of the fields are populated mainly by null values (.) and a handful are populated by 5-digit SAS dates. I need SAS to recognise the field as a date field (or at the very least a numeric field), instead of reading it in as text as it is doing at the moment.
A simplified version of my code is as so:
data test;
informat mydate date9.;
infile myfile dsd dlm=',' missover;
input
myfirstval
mydate
;
run;
With this code all values are read in as . and the field data type is text. Can anyone tell me what I need to change in the above code to get the output I need?
Thanks
If you write a data step to read a CSV file SAS will create the variable as the data type that you specify. If you tell it that MYDATE is a number it will NOT convert it to a character variable.
data test;
infile cards dsd dlm=',' TRUNCOVER ;
length myfirstval 8 mydate 8 mythirdval 8;
input myfirstval mydate mythirdval;
format mydate date9.;
cards;
1,1234,5
2,.,6
;
Note that the data step compiler will define the type of the variable at the first chance that it can. For example if the first reference is in a statement like IF MYDATE='.' ... then MYDATE will be defined as character length one to match the type of the value that it is being compared to. That is why it is best to start with a LENGTH or ATTRIB statement to clearly define your variables.
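A small sketch of that effect, assuming a fresh data step in which MYDATE has not been defined anywhere else (the variable names flag, type and len are only for illustration):

```sas
data _null_;
  /* First reference compares MYDATE to a character literal,  */
  /* so the compiler defines MYDATE as character at this point. */
  if mydate = '.' then flag = 1;

  /* VTYPE and VLENGTH report how the variable ended up defined. */
  type = vtype(mydate);
  len  = vlength(mydate);
  put type= len=;
run;
```

The log should report a character type with length one, along with a note that MYDATE is uninitialized. Putting a LENGTH statement before the IF statement is what avoids this.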
From this link I learnt how to read multiple txt files.
Problem: is it possible to create a macro variable to read in all the txt files in a folder, say C:\Users\Desktop\ (given that all files are in txt format and named datasetyyyymmdd)?
I have dataset20150101.txt - dataset20150806.txt and I do not want to manually type all those paths into the datalines.
data whole2;
infile datalines;
length fil2read $256;
input fil2read $;
infile dummy filevar=fil2read end=done dsd;
do while (not done);
input name$ value1 value2;
output;
end;
datalines;
C:\Users\Desktop\dataset20150501.txt
C:\Users\Desktop\dataset20150502.txt
run;
Ask the operating system which files are present:
filename DataIn pipe "dir C:\Users\Desktop\dataset*.txt /S /B";
data whole2;
infile DataIn truncover;
length fil2read $256;
input fil2read $;
infile dummy filevar=fil2read end=done dsd;
do while (not done);
input name$ value1 value2;
output;
end;
run;
The Bare option /B removes unneeded information like last access date.
I added the sub-folder option /S because then the dir command returns full path names. This way it also reads dataset*.txt files in subfolders of C:\Users\Desktop\. If that does not suit you, remove the /S, build the full path yourself, and use path2Read (rather than fil2read) in the filevar= option:
path2Read = "C:\Users\Desktop\"||fil2read;
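Put together, the variant without /S might look like the sketch below. It assumes the same folder and file layout as in the question, and that the file names contain no spaces; path2Read is just a working variable that holds the full path.

```sas
/* Without /S, dir /B returns bare file names only (no folder). */
filename DataIn pipe "dir C:\Users\Desktop\dataset*.txt /B";

data whole2;
  infile DataIn truncover;
  length fil2read path2Read $256;
  input fil2read $;

  /* Prepend the folder so FILEVAR= gets a full path. */
  path2Read = "C:\Users\Desktop\" || fil2read;

  infile dummy filevar=path2Read end=done dsd;
  do while (not done);
    input name $ value1 value2;
    output;
  end;
run;
```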
filename Source 'C:\Source.txt';
Data Example;
Infile Source;
Input Var1 Var2;
Run;
Is there a way I can import all the variables from Source.txt without the "Input Var1 Var2" line? If there are many variables, I think it's too time consuming to list out all the variables, so I was wondering if there's any way to bypass that.
Thanks
Maybe you can use proc import?
For a CSV I use this, and I don't have to define every variable:
proc import datafile="&CSVFILE"
out=myCsvData
dbms=dlm
replace;
delimiter=';';
getnames=yes;
run;
It depends on what you have in your txt file. Try different delimiters.
If you are looking for a solution based on the INFILE statement, the following reference code should help.
data _null_;
set sashelp.class;
file '/tester/sashelp_class.txt' dsd dlm='09'x;
put name age sex weight height;
run;
/* Version #1 : When data has mixed data(numeric and character) */
data reading_data_w_format;
infile '/tester/sashelp_class.txt' dsd dlm='09'x;
format name $10. age 8. gender $1. weight height 8.2;
input (name--height) (:);
run;
proc print data=reading_data_w_format;run;
proc contents data=reading_data_w_format;run;
/* Version #2 : When all data can be read as character.
I know this version doesn't make much sense, but it's still an option */
data reading_data_wo_format;
infile '/tester/sashelp_class.txt' dsd dlm='09'x;
input (var1-var5) (:$8.); /* Length would be max length of value in all the columns */
run;
proc print data=reading_data_wo_format;run;
proc contents data=reading_data_wo_format;run;
I'd suggest writing out the informats for the variables to be read, so that you are sure the file matches your specification. PROC IMPORT first scans the data from the 1st row up to the GUESSINGROWS value (do not set it too high if each column is of consistent length) and, based on the lengths and types it sees, picks an informat and length it finds suitable for reading each variable in the file.
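If you do go the PROC IMPORT route, the number of rows scanned can be set with a GUESSINGROWS statement. A sketch, reusing the &CSVFILE macro variable and semicolon delimiter from the earlier answer; the value 100 is an arbitrary choice for illustration:

```sas
proc import datafile="&CSVFILE"
  out=myCsvData
  dbms=dlm
  replace;
  delimiter=';';
  getnames=yes;
  guessingrows=100;  /* scan the first 100 rows to decide type and length */
run;

/* Check which types and lengths PROC IMPORT settled on. */
proc contents data=myCsvData;
run;
```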
I'm trying to use a double pipe delimiter "||" when I export a file from SAS to txt. Unfortunately, it only seems to correctly delimit the header row and uses the single version for the data.
The code is:
proc export data=notes3 outfile='/file_location/notes3.txt'
dbms = dlm;
delimiter = '||';
run;
Which results in:
ID||VAR1||VAR2
1|0|STRING1
2|1|STRING2
3|1|STRING3
If you want to use a two character delimiter, you need to use dlmstr instead of dlm in the file statement in data step file creation. You can't use proc export, unfortunately, as that doesn't support dlmstr.
You can create your own proc export fairly easily, by using dictionary.columns or sashelp.vcolumn to construct the put statement. Feel free to ask more specific questions on that side if you need help with it, but search around for data driven output and you'll most likely find what you need.
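As a starting point, a hand-written version for sashelp.class might look like the sketch below. The output file name class_dlmstr.txt is made up for the example, and the header and variable list are typed in by hand here rather than generated from dictionary.columns:

```sas
data _null_;
  set sashelp.class;
  /* DLMSTR= accepts a multi-character delimiter for list output. */
  file "%sysfunc(pathname(work))/class_dlmstr.txt" dlmstr='||';

  /* Write the header row once, as a plain string. */
  if _n_ = 1 then put 'Name||Sex||Age||Height||Weight';

  put name sex age height weight;
run;
```

A data-driven version would build the header and the variable list in the put statement from the column metadata instead of hard-coding them.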
The reason proc export won't use a double pipe is because it generates a data step to do the export, which uses a file statement. This is a known limitation - quoting the help file:
Restriction: Even though a character string or character variable is
accepted, only the first character of the string or variable is used
as the output delimiter. This differs from INFILE DELIMITER=
processing.
The header row || works because SAS constructs it as a string constant rather than using a file statement.
So I don't think you can fix the proc export code, but here's a quick and dirty data step that will transform the output into the desired format, provided that your dataset has no missing values and doesn't contain any pipe characters:
/*Export as before to temporary file, using non-printing TAB character as delimiter*/
proc export
data=sashelp.class
outfile="%sysfunc(pathname(work))\temp.txt"
dbms = dlm;
delimiter = '09'x;
run;
/*Replace TAB with double pipe for all rows beyond the 1st*/
data _null_;
infile "%sysfunc(pathname(work))\temp.txt" lrecl = 32767;
file "%sysfunc(pathname(work))\class.txt";
input;
length text $32767;
text = _infile_;
if _n_ > 1 then text = tranwrd(text,'09'x,'||');
put text;
run;
/*View the resulting file in the log*/
data _null_;
infile "%sysfunc(pathname(work))\class.txt";
input;
put _infile_;
run;
As Joe suggested, you could alternatively write your own delimiter logic in a dynamically generated data step, e.g.
/*More efficient option - write your own delimiter logic in a data step*/
proc sql noprint;
select name into :VNAMES separated by ','
from sashelp.vcolumn
where libname = "SASHELP" and memname = "CLASS";
quit;
data _null_;
file "%sysfunc(pathname(work))\class.txt";
set sashelp.class;
length text $32767;
text = catx('||',&VNAMES);
put text;
run;