Below is the code, where AFILE is the input file. Nowhere in the SAS code can I find the column description for this file.
How can I find the description of the input file columns?
Will BFILE hold all the columns that are in AFILE?
Here AFILE is the DDNAME of the dataset from the JCL, and BFILE is a SAS dataset.
PROC SORT DATA=AFILE.BFILE OUT=BFILE;
BY COMPANY;
RUN;
I want to get the total number of rows (count) from a SAS dataset file using SAS code.
I tried this code:
data _null_;
infile "C:\myfiles\sample.sas7bdat" end=eof;
input;
if eof then put "Lines read=====:" ;
run;
This is the output I get. It does not show the actual number of lines in the file:
Lines read=====:
NOTE: 1 record was read from the infile
"C:\myfiles\sample.sas7bdat".
However, I know that sample.sas7bdat contains more than one row.
Please help!
The INFILE statement is for reading a file as raw TEXT. If you have a SAS dataset then you can just SET the dataset to read it into a data step.
So the equivalent for your attempted method would be something like:
data _null_;
set "C:\myfiles\sample.sas7bdat" end=eof;
if eof then put "Observations read=====:" _n_ ;
run;
One cool thing about sas7bdat files is the amount of metadata stored with them. The row count of that file is already known by SAS as an attribute. You can use proc contents to read it. Observations is the number of rows in the table.
libname files "C:\myfiles";
proc contents data=files.sample;
run;
A more advanced way is to open the file directly using macro functions.
%let dsid = %sysfunc(open(files.sample) ); /* Open the file */
%let nobs = %sysfunc(attrn(&dsid, nlobs) ); /* Get the number of observations */
%let rc = %sysfunc(close(&dsid) ); /* Close the file */
%put Total observations: &nobs;
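If the count is needed inside a program rather than in a report, the NOBS= option of the SET statement is another way to read the same attribute. The following is only a sketch: sashelp.class stands in for your dataset and the macro variable name nobs is an assumption. Keep in mind that NOBS= may include observations that are marked for deletion, and that it is not available for every view or engine.

```sas
data _null_;
  /* The SET statement never executes; NOBS= is filled in at compile time */
  if 0 then set sashelp.class nobs=n;
  /* Store the count in a macro variable (name chosen for illustration) */
  call symputx('nobs', n);
  stop;
run;

%put Total observations: &nobs;
```

Because the step stops before reading any row, this stays cheap even for large tables.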
Assume that we have a table INPUT_TABLE which has four columns, name, lat, lon, and z, filled with many rows. In the SAS Explorer it would look like this, for example:
name lat lon z
1 Germany 49.420469 8.7269178 17
2 England 51.5540693 -0.8249039 16
...
I hand over a PREPROCESSED_TABLE based on this INPUT_TABLE to a macro %tabl:
data V42.PREPROCESSED_TABLE;
set V21.INPUT_TABLE;
drop NAME;
run;
%tabl(libin=V42, file=PREPROCESSED_TABLE);
The macro itself I am not allowed to modify.
Among other things, %tabl also writes a plain text file PREPROCESSED_TABLE.txt:
49.420469|8.7269178|17
51.5540693|-0.8249039|16
I would like to have the header names written out as well, e.g.:
lat|lon|z
49.420469|8.7269178|17
51.5540693|-0.8249039|16
My idea is to expand the PREPROCESSED_TABLE somewhere in the data step - could somebody help me with that, please? How can I read out the header names which are internally stored?
If the goal is to make a file with one line of variable names, just write the file yourself. First get the names into a dataset (in order) and then write them. For example, you could use PROC TRANSPOSE with the OBS=0 dataset option to generate a dataset with one observation per variable.
proc transpose data=V42.PREPROCESSED_TABLE(obs=0) out=NAMES ;
var _all_ ;
run;
You can then use that dataset to write the names to a file.
data _null_;
set names ;
file 'preprocessed.txt' dsd dlm='|';
put _name_ @ ;
run;
If you also want to add the data to that same file just use a second data step. Make sure to use the MOD option on the FILE statement so that data lines are appended to the existing file.
data _null_;
set V42.PREPROCESSED_TABLE;
file 'preprocessed.txt' dsd dlm='|' mod;
put (_all_) (+0);
run;
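When no special layout is required, PROC EXPORT can write the header line and the data in one step. This is a hedged sketch and not what %tabl does: sashelp.class stands in for V42.PREPROCESSED_TABLE, and the output location (the WORK directory) and file name are assumptions. The way numeric values are formatted may differ from the PUT-based steps above.

```sas
/* Assumed output location: the WORK directory; adjust as needed */
%let outfile = %sysfunc(pathname(work))/class_with_header.txt;

proc export data=sashelp.class
    outfile="&outfile"
    dbms=dlm
    replace;
  delimiter='|';   /* same delimiter as in the steps above */
  putnames=yes;    /* write the variable names as the first line */
run;
```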
If you need to call the existing macro for other reasons, you could simply ignore the file it creates. Or, if for some reason its content is more than a simple dump of the dataset, you could concatenate the file with the headers and the file the macro generates. Say the macro generated 'PREPROCESSED_TABLE.txt' and your code generated the one-line file 'headers.txt'. Then this step will read both and write 'PREPROCESSED_TABLE_w_headers.txt':
data _null_;
file 'PREPROCESSED_TABLE_w_headers.txt';
if _n_=1 then do;
infile 'headers.txt';
input;
put _infile_;
end;
infile 'PREPROCESSED_TABLE.txt';
input;
put _infile_;
run;
Given Reeza's and Tom's hints, I figured out a workaround myself: we simply call our macro %tabl twice, once with a one-row table holding the column names and once with the data. This approach essentially corresponds to writing first the headers and then the data to the file (except that I have to worry about additional things added by %tabl further down in the process chain).
The technical difficulty I had was how to extract this one-row table of column names from the metadata of the input table V21.INPUT_TABLE.
My teammate showed me how that is done. To make it testable for everybody, I will show this step for the test data table sashelp.class:
proc contents data=sashelp.class out=meta (keep=NAME VARNUM) noprint;
run;
proc sort data=meta out=meta2;
by VARNUM;
run;
proc transpose data=meta2 out=colheaders (drop=_NAME_ _LABEL_);
var name;
run;
As a result, we will have a table colheaders with exactly one line containing the table headers, sorted by VARNUM which is the order in which they appear in the original table:
COL1 COL2 COL3 COL4 COL5
1 NAME SEX AGE HEIGHT WEIGHT
Problem solved, at least theoretically.
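For comparison, the same header information can also be pulled from the DICTIONARY.COLUMNS table with PROC SQL. This is only a sketch of an alternative, not the approach described above: it builds a pipe-delimited string in a macro variable (the name header is an assumption) instead of a one-row table. LIBNAME and MEMNAME are stored in upper case in the dictionary tables.

```sas
proc sql noprint;
  /* Collect the variable names in their original order */
  select name into :header separated by '|'
    from dictionary.columns
    where libname = 'SASHELP' and memname = 'CLASS'
    order by varnum;
quit;

%put &header;
```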
I'm new to SAS and am having issues with using Linear Regression.
I loaded a CSV file and then in Tasks and Utilities > Tasks > Statistics > Linear Regression I selected WORK.BP (BP = filename) for my data. When I try to select my dependent variable SAS says "No columns are available."
The CSV file appears to have loaded correctly and has 2 columns, so I can't figure out what the issue is.
Thanks for the help.
This is the code I used for loading the file:
data BP;
infile '/folders/myfolders/BP.csv' dlm =',' firstobs=2;
input BP $Pressure$;
run;
And this is what the output looks like
Running your code imports the .csv file with the 'PRESSURE' variable as a character variable, because of the $ modifier in the INPUT statement; in a linear regression model, all variables need to be numeric.
To get numeric variables, I suggest using PROC IMPORT to import the .csv file, instead of a DATA step with an INPUT statement.
In your case, you should follow these steps:
Define the path where the .csv file is located:
%let path = the_folder_path_where_the_csv_file_is_located ;
Define the row at which your data start (counting the header row with the variable names):
%let datarow = 2;
Import the .csv file, here named 'BP', as follows:
proc import datafile="&path./BP.csv"
out=BP
dbms=csv
replace;
delimiter=",";
datarow=&datarow.;
getnames=YES;
run;
I assumed that the output dataset should be called BP too (you'll find it in the WORK library!) and that the delimiter is the comma.
Hope this helps!
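If you would rather keep the DATA step, dropping the $ modifiers has the same effect. The sketch below assumes that both columns of the file contain numbers; the sample file it writes is hypothetical and exists only to make the example self-contained.

```sas
/* Hypothetical sample file, written only so the sketch runs on its own */
filename bpcsv temp;
data _null_;
  file bpcsv;
  put "BP,Pressure";
  put "1,120";
  put "2,135";
run;

data BP;
  infile bpcsv dsd dlm=',' firstobs=2 truncover;
  input BP Pressure;   /* no $ after the names, so both are numeric */
run;

/* Check the Type column: both variables should be listed as Num */
proc contents data=BP;
run;
```

Once both variables are numeric, they should become selectable in the Linear Regression task.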
Is there a way to print the contents of a .txt file to a tab using ods tagsets.excelxp? I have a script that creates a .xml file with several tabs, and on one of the tabs I'd like to print the lines of the code itself, so that I can send the .xml file to someone and they can have the code that I used to produce the output. I have the code saved separately as a .txt. Any help or suggestions would be appreciated.
The easiest way would be to read the lines of the text file into a dataset, then use PROC PRINT to print it to your output destination:
data lines;
infile "path/to/file.txt";
format code $2000.;
input;
code = _infile_;
run;
proc print data=lines noobs;
run;
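Put together with the ExcelXP tagset, this could look roughly as follows. It is a sketch under assumptions: the small program file written at the top, the output path in the WORK directory, and the sheet name "Code" are all made up for illustration. In an existing multi-tab script you would only add the OPTIONS(SHEET_NAME=...) call and the PROC PRINT before closing the destination.

```sas
/* Hypothetical code file, written only so the sketch runs on its own */
filename src temp;
data _null_;
  file src;
  put "data a;";
  put "  x = 1;";
  put "run;";
run;

/* One observation per line of the code file */
data lines;
  infile src;
  length code $2000;
  input;
  code = _infile_;
run;

/* Assumed output file and sheet name */
ods tagsets.excelxp file="%sysfunc(pathname(work))/report.xml"
    options(sheet_name="Code");

proc print data=lines noobs;
run;

ods tagsets.excelxp close;
```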
I have an XML file that I need to read as a single-column table. To achieve this, I currently add the following option to the INFILE statement:
dlmstr='nodlmstr'
I have not found a more appropriate option that would let me do this cleanly.
I don't think you need an option at all. Just create your 1 variable with enough length to hold the line.
data _null_;
file "c:\temp\test.xml";
put "<a>";
put " <aa>1 </aa>";
put " <bb>2</bb>";
put "</a>";
run;
data test;
infile "c:\temp\test.xml";
format line $2000.;
input;
line = _infile_;
run;
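A variant of the same idea reads each record straight into the variable with the $CHAR informat. This is a sketch, assuming no line is longer than 2000 characters: TRUNCOVER keeps short lines from spilling into the next record, LRECL= is set explicitly in case the session default is smaller than your longest line, and the temporary fileref xmlfile is a made-up name.

```sas
/* Hypothetical sample file, written only so the sketch runs on its own */
filename xmlfile temp;
data _null_;
  file xmlfile;
  put "<a>";
  put " <aa>1 </aa>";
  put " <bb>2</bb>";
  put "</a>";
run;

data test;
  infile xmlfile lrecl=32767 truncover;
  /* $CHAR keeps leading blanks; no delimiter handling is involved */
  input line $char2000.;
run;

proc print data=test;
run;
```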