I'm trying to use a double pipe delimiter "||" when I export a file from SAS to txt. Unfortunately, it only seems to correctly delimit the header row and uses the single version for the data.
The code is:
proc export data=notes3 outfile='/file_location/notes3.txt'
dbms = dlm;
delimiter = '||';
run;
Which results in:
ID||VAR1||VAR2
1|0|STRING1
2|1|STRING2
3|1|STRING3
If you want to use a two character delimiter, you need to use dlmstr instead of dlm in the file statement in data step file creation. You can't use proc export, unfortunately, as that doesn't support dlmstr.
You can create your own proc export fairly easily, by using dictionary.columns or sashelp.vcolumn to construct the put statement. Feel free to ask more specific questions on that side if you need help with it, but search around for data driven output and you'll most likely find what you need.
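A minimal sketch of that approach, assuming sashelp.class as a stand-in for your dataset and an output file in the WORK directory. The file name and the macro variable names VARLIST and HEADER are illustrative choices, not anything PROC EXPORT produces:

```sas
/* Build the variable list and the header line from the dictionary tables */
proc sql noprint;
  select name into :varlist separated by ' '
    from dictionary.columns
    where libname = 'SASHELP' and memname = 'CLASS'
    order by varnum;
  select name into :header separated by '||'
    from dictionary.columns
    where libname = 'SASHELP' and memname = 'CLASS'
    order by varnum;
quit;

/* DLMSTR= accepts a multi-character delimiter, unlike DLM= */
data _null_;
  set sashelp.class;
  file "%sysfunc(pathname(work))/class_dlmstr.txt" dlmstr='||';
  if _n_ = 1 then put "&header";   /* header row, written once */
  put &varlist;                    /* data row, delimited by || */
run;
```

Without the DSD option, a missing numeric value is written as a period, so you may want to test how your target system handles that before relying on this.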
PROC EXPORT won't write a double pipe because it generates a data step to do the export, and that data step uses the DELIMITER= option on a FILE statement. This is a documented limitation of that option; quoting the help file:
Restriction: Even though a character string or character variable is
accepted, only the first character of the string or variable is used
as the output delimiter. This differs from INFILE DELIMITER=
processing.
The || in the header row works because SAS writes the header as a string constant rather than relying on the FILE statement's delimiter.
So I don't think you can fix the proc export code, but here's a quick and dirty data step that will transform the output into the desired format, provided that your dataset has no missing values and doesn't contain any pipe characters:
/*Export as before to temporary file, using non-printing TAB character as delimiter*/
proc export
data=sashelp.class
outfile="%sysfunc(pathname(work))\temp.txt"
dbms = dlm;
delimiter = '09'x;
run;
/*Replace TAB with double pipe on every row (the header row in the temporary file is TAB-delimited too)*/
data _null_;
infile "%sysfunc(pathname(work))\temp.txt" lrecl = 32767;
file "%sysfunc(pathname(work))\class.txt";
input;
length text $32767;
text = _infile_;
text = tranwrd(text,'09'x,'||');
put text;
run;
/*View the resulting file in the log*/
data _null_;
infile "%sysfunc(pathname(work))\class.txt";
input;
put _infile_;
run;
As Joe suggested, you could alternatively write your own delimiter logic in a dynamically generated data step, e.g.
/*More efficient option - write your own delimiter logic in a data step*/
proc sql noprint;
select name into :VNAMES separated by ','
from sashelp.vcolumn
where libname = "SASHELP" and memname = "CLASS";
quit;
data _null_;
file "%sysfunc(pathname(work))\class.txt";
set sashelp.class;
length text $32767;
text = catx('||',&VNAMES);
put text;
run;
Assume that we have a table INPUT_TABLE with four columns (name, lat, lon, and z) and many rows. In the SAS Explorer it would look like this, for example:
name lat lon z
1 Germany 49.420469 8.7269178 17
2 England 51.5540693 -0.8249039 16
...
I hand over a PREPROCESSED_TABLE based on this INPUT_TABLE to a macro %tabl:
data V42.PREPROCESSED_TABLE;
set V21.INPUT_TABLE;
drop NAME;
run;
%tabl(libin=V42, file=PREPROCESSED_TABLE);
The macro itself I am not allowed to modify.
Among other things, %tabl also writes a plain text file PREPROCESSED_TABLE.txt:
49.420469|8.7269178|17
51.5540693|-0.8249039|16
I would like to have the header names written out as well, e.g.:
lat|lon|z
49.420469|8.7269178|17
51.5540693|-0.8249039|16
My idea is to expand the PREPROCESSED_TABLE somewhere in the data step - could somebody help me with that, please? How can I read out the header names which are internally stored?
If the goal is to make a file with one line containing the variable names, then just write the file yourself. First get the names into a dataset (in order) and then write them. For example, you could use PROC TRANSPOSE with the OBS=0 dataset option to generate a dataset with one observation per variable.
proc transpose data=V42.PREPROCESSED_TABLE(obs=0) out=NAMES ;
var _all_ ;
run;
Which you can then use to write to a file.
data _null_;
set names ;
file 'preprocessed.txt' dsd dlm='|';
put _name_ @ ;
run;
If you also want to add the data to that same file just use a second data step. Make sure to use the MOD option on the FILE statement so that data lines are appended to the existing file.
data _null_;
set V42.PREPROCESSED_TABLE;
file 'preprocessed.txt' dsd dlm='|' mod;
put (_all_) (+0);
run;
If you need to call the existing macro for other reasons, you could simply ignore the file it creates. Or, if for some reason its content is more than a simple dump of the dataset, you could concatenate the file containing the headers with the file the macro generates. Say the macro generated 'PREPROCESSED_TABLE.txt' and your code generated the one-line file 'headers.txt'. Then this step will read both and write 'PREPROCESSED_TABLE_w_headers.txt':
data _null_;
file 'PREPROCESSED_TABLE_w_headers.txt';
if _n_=1 then do;
infile 'headers.txt';
input;
put _infile_;
end;
infile 'PREPROCESSED_TABLE.txt';
input;
put _infile_;
run;
Given Reeza's and Tom's hints, I figured out a workaround myself: we simply call our macro %tabl twice, once with a one-row table containing the column names and once with the data. This essentially corresponds to writing first the headers and then the data to the file (except that I have to worry about additional things added by %tabl further down the process chain).
The technical difficulty I had was how to extract this one-row table of column names from the metadata of the input table V21.INPUT_TABLE.
My team mate showed me how that is done. To make it testable for everybody, I will show this step for the test data table sashelp.class:
proc contents data=sashelp.class out=meta (keep=NAME VARNUM) noprint;
run;
proc sort data=meta out=meta2;
by VARNUM;
run;
proc transpose data=meta2 out=colheaders (drop=_NAME_ _LABEL_);
var name;
run;
As a result, we will have a table colheaders with exactly one line containing the table headers, sorted by VARNUM which is the order in which they appear in the original table:
COL1 COL2 COL3 COL4 COL5
1 NAME SEX AGE HEIGHT WEIGHT
Problem solved, at least theoretically.
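The PROC CONTENTS and PROC SORT steps could probably be collapsed into a single query against dictionary.columns. This is only a sketch, again for sashelp.class; the output name meta2 is chosen so that the PROC TRANSPOSE step above can be reused unchanged:

```sas
/* Variable names of sashelp.class, in their original column order.
   LIBNAME and MEMNAME values are stored in upper case in the dictionary tables. */
proc sql;
  create table meta2 as
    select name, varnum
    from dictionary.columns
    where libname = 'SASHELP' and memname = 'CLASS'
    order by varnum;
quit;
```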
I'm writing a SAS program to interact with an API. I'm trying to use SAS to capture a specific field from a text file generated by the API.
The generated text "resp" looks like this:
{"result":{"progressId":"ab12","percentComplete":0.0,"status":"inProgress"},"meta":{"requestId":"abcde123","httpStatus":"200 - OK"}}
The field I want to capture is "progressId". In this case, its value would be "ab12". Given that the length of progressId can change, what's the easiest way to capture this field?
My current approach is as follows:
/* The following section will import the text into a SAS table,
separated by colon. The third column would be "ab12","percentComplete"
*/
proc import out = resp_table
datafile= resp
dbms = dlm REPLACE;
delimiter = ':';
GETNAMES = NO;
run;
/* The following section will trim off the string ,"percentComplete" */
data resp_table;
set resp_table;
Progress_ID = SUBSTR(VAR3,2,LENGTH(VAR3)-20);
run;
Do you have an easier/ more concise solution?
Thanks!
Shawn
You can use the JSON library engine to read a json document, and copy the contents to SAS datasets. Work with the data items that the engine creates.
Example:
filename myjson "c:\temp\sandbox.json";
data _null_;
file myjson;
input;
put _infile_;
datalines;
{"result":{"progressId":"ab12","percentComplete":0.0,"status":"inProgress"},"meta":{"requestId":"abcde123","httpStatus":"200 - OK"}}
run;
libname jsondoc json "c:\temp\sandbox.json";
proc copy in=jsondoc out=work;
run;
proc print data=work.Alldata;
where P1='result' and P2='progressId';
run;
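If the value is needed later in the program, for example to build the next API call, one option is to read it from the dataset the engine creates for the "result" object and store it in a macro variable. A minimal sketch, assuming the response has been written to a fileref named RESP (recreated here with a temporary file so the example runs on its own); the macro variable name PROGRESS_ID is an illustrative choice:

```sas
/* Recreate the sample response in a temporary file (stand-in for the API output) */
filename resp temp;
data _null_;
  file resp;
  put '{"result":{"progressId":"ab12","percentComplete":0.0,"status":"inProgress"},"meta":{"requestId":"abcde123","httpStatus":"200 - OK"}}';
run;

/* Point the JSON engine at the fileref */
libname jsondoc json fileref=resp;

/* The "result" object becomes a dataset with one column per key */
data _null_;
  set jsondoc.result;
  call symputx('progress_id', progressId);
run;

%put &=progress_id;

libname jsondoc clear;
```

Because the engine parses the document, this does not depend on the length of the value or on the position of the field within the response.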
I am trying to export SAS data to CSV. The SAS dataset is named abc, and it looks like this:
LINE_NUMBER DESCRIPTION
524JG 24PC AMEFA VINTAGE CUTLERY SET "DUBARRY"
I am using following code.
filename exprt "C:/abc.csv" encoding="utf-8";
proc export data=abc
outfile=exprt
dbms=tab;
run;
output is
LINE_NUMBER DESCRIPTION
524JG "24PC AMEFA VINTAGE CUTLERY SET ""DUBARRY"""
So the description is wrapped in double quotes, and additional double quotes appear before and after the word DUBARRY. I have no clue what's happening. Can someone help me resolve this and explain exactly what is happening here?
expected result:
LINE_NUMBER DESCRIPTION
524JG 24PC AMEFA VINTAGE CUTLERY SET "DUBARRY"
There is no need to use PROC EXPORT to create a delimited file. You can write it with a simple DATA step. If you want to create your example file, just do not use the DSD option on the FILE statement. But note that, depending on the data you are writing, you could create a file that cannot be parsed properly because of extra unprotected delimiters. You will also have trouble representing missing values.
Let's make a sample dataset we can use to test.
data have ;
input id value cvalue $ name $20. ;
cards;
1 123 A Normal
2 345 B Embedded|delimiter
3 678 C Embedded "quotes"
4 . D Missing value
5 901 . Missing cvalue
;
Essentially PROC EXPORT is writing the data using the DSD option. Like this:
data _null_;
set have ;
file 'myfile.txt' dsd dlm='09'x ;
put (_all_) (+0);
run;
Which will yield a file like this (with pipes replacing the tabs so you can see them).
1|123|A|Normal
2|345|B|"Embedded|delimiter"
3|678|C|"Embedded ""quotes"""
4||D|Missing value
5|901||Missing cvalue
If you just remove DSD option then you get a file like this instead.
1|123|A|Normal
2|345|B|Embedded|delimiter
3|678|C|Embedded "quotes"
4|.|D|Missing value
5|901| |Missing cvalue
Notice how the second line looks like it has 5 values instead of 4, making it impossible to know how to split it into 4 values. Also notice how the missing values take up at least one character.
Another way would be to run a data step to convert the normal file that PROC EXPORT generates into the variant format that you want. This might also give you a place to add escape characters to protect special characters if your target format requires them.
data _null_;
infile normal dsd dlm='|' truncover ;
file abnormal dlm='|';
do i=1 to 4 ;
if i>1 then put '|' @;
input field :$32767. @;
field = tranwrd(field,'\','\\');
field = tranwrd(field,'|','\|');
len = lengthn(field);
put field $varying32767. len @;
end;
put;
run;
You could even make this datastep smart enough to count the number of fields on the first row and use that to control the loop so that you wouldn't have to hard code it.
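One way to do that, sketched under the assumption that every row has the same number of fields as the first: load the record with a bare INPUT and count the delimited fields with COUNTW. The 'q' modifier ignores delimiters inside quoted strings and 'm' counts empty fields. The file paths and the variable name nfields are placeholders introduced for this example.

```sas
/* NORMAL and ABNORMAL are the filerefs used above; the paths are placeholders */
filename normal   'normal.txt';
filename abnormal 'abnormal.txt';

data _null_;
  infile normal dsd dlm='|' truncover;
  file abnormal;
  length field $32767;
  retain nfields;
  input @;                       /* load the record, read no fields yet */
  /* Count the fields once, on the first row only */
  if _n_ = 1 then nfields = countw(_infile_, '|', 'qm');
  do i = 1 to nfields;
    if i > 1 then put '|' @;
    input field :$32767. @;
    field = tranwrd(field, '\', '\\');   /* escape the escape character */
    field = tranwrd(field, '|', '\|');   /* escape embedded delimiters  */
    len = lengthn(field);
    put field $varying32767. len @;
  end;
  put;                           /* end the output line */
run;
```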
I'm new to SAS and am studying different ways to do the task in the subject line.
Here are the two ways I know at the moment:
Method1: file statement in data step
*DATA _NULL_ / FILE / PUT ;
data _null_;
set engappeal;
file 'C:\Users\1502911\Desktop\exportdata.txt' dlm=',';
put id $ name $ semester scoreEng;
run;
Method2: Proc Export
proc export
data = engappeal
outfile = 'C:\Users\1502911\Desktop\exportdata2.txt'
dbms = dlm;
delimiter = ',';
run;
Questions:
1. Is there any alternative way to export raw data files?
2. Is it possible to also export the header using the data step in method 1?
You can also make use of ODS
ods listing file="C:\Users\1502911\Desktop\exportdata3.txt";
proc print data=engappeal noobs;
run;
ods listing close;
You need to use the DSD option on the FILE statement to make sure that delimiters are properly quoted and missing values are not represented by spaces. Make sure you set your record length long enough, including delimiters and inserted quotes. Don't worry about setting it too long as the lines are variable length.
You can use CALL VNEXT to find and output the names. The LINK statement places the loop later in the data step, which prevents __NAME__ from being included in the (_ALL_) variable list.
data _null_;
set sashelp.class ;
file 'class.csv' dsd dlm=',' lrecl=1000000 ;
if _n_ eq 1 then link names;
put (_all_) (:);
return;
names:
length __name__ $32;
do while(1);
call vnext(__name__);
if upcase(__name__) eq '__NAME__' then leave;
put __name__ @;
end;
put;
return;
run;
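When the variable names are known in advance and there are only a few of them, a simpler option is to write the header as a literal string on the first iteration. A minimal sketch using sashelp.class; the output file name is an arbitrary choice:

```sas
data _null_;
  set sashelp.class;
  file 'class_simple.csv' dsd dlm=',';
  /* Write the header once, as a literal string, before the first data row */
  if _n_ = 1 then put 'Name,Sex,Age,Height,Weight';
  put Name Sex Age Height Weight;
run;
```

The trade-off is that the header has to be kept in sync with the PUT statement by hand, which the CALL VNEXT version avoids.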
filename Source 'C:\Source.txt';
Data Example;
Infile Source;
Input Var1 Var2;
Run;
Is there a way I can import all the variables from Source.txt without the "Input Var1 Var2" line? If there are many variables, I think it's too time consuming to list out all the variables, so I was wondering if there's any way to bypass that.
Thanks
Maybe you can use PROC IMPORT?
For a CSV I use this, and I don't have to define every variable:
proc import datafile="&CSVFILE"
out=myCsvData
dbms=dlm
replace;
delimiter=';';
getnames=yes;
run;
It depends on what you have in your txt file. Try different delimiters.
If you are looking for a solution based on the INFILE statement, the following reference code should help.
data _null_;
set sashelp.class;
file '/tester/sashelp_class.txt' dsd dlm='09'x;
put name age sex weight height;
run;
/* Version #1 : When the data is mixed (numeric and character) */
data reading_data_w_format;
infile '/tester/sashelp_class.txt' dsd dlm='09'x;
format name $10. age 8. gender $1. weight height 8.2;
input (name--height) (:);
run;
proc print data=reading_data_w_format;run;
proc contents data=reading_data_w_format;run;
/* Version #2 : When all data can be read as character.
I know this version doesn't make much sense, but it's still an option */
data reading_data_wo_format;
infile '/tester/sashelp_class.txt' dsd dlm='09'x;
input (var1-var5) (:$8.); /* Length would be max length of value in all the columns */
run;
proc print data=reading_data_wo_format;run;
proc contents data=reading_data_wo_format;run;
I'd suggest writing out the informats for the variables to be read, so that you can be sure the file matches your specification. PROC IMPORT first scans the data from the first row up to the GUESSINGROWS value (do not set it too high if each column is of consistent length) and, based on the lengths and types it sees, picks an informat and length that it finds suitable for reading each variable.
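For reference, GUESSINGROWS is set as a statement inside PROC IMPORT. A short sketch based on the semicolon-delimited example above; the path assigned to CSVFILE and the value 500 are placeholders to adapt:

```sas
/* Hypothetical path; replace with the location of your delimited file */
%let CSVFILE = myfile.csv;

proc import datafile="&CSVFILE"
    out=myCsvData
    dbms=dlm
    replace;
  delimiter=';';
  getnames=yes;
  guessingrows=500;   /* rows scanned to decide each column's type and length */
run;
```

PROC IMPORT writes the data step it generates to the log, so one practical route is to run it once, copy that generated code, and adjust the informats by hand.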