SAS - Recognize missing values when reading CSV

Given the csv:
Cat,,9
Dog,,10
Egg,,11
And the code:
DATA database ;
INFILE '/path/to/data' dlm=',' missover;
INPUT
animal $
missing $
number
;
RUN;
The output I get is:
animal  missing  number
Cat     9
Dog     10
Egg     11
How can I get SAS to recognize the missing value, so that my output table is like the one below?
animal  missing  number
Cat              9
Dog              10
Egg              11

You just need to add the DSD option to your INFILE statement. DSD tells SAS to treat two consecutive commas as a missing value instead of as a single delimiter:
DATA database ;
INFILE '/path/to/data' dlm=',' missover dsd;
INPUT
animal $
missing $
number
;
RUN;
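To check the effect of DSD without touching a file, the same three records can be read from in-line data. This is only a minimal sketch: the data set name demo is made up for illustration, and DLM= is left out because DSD already changes the default delimiter to a comma.

```sas
/* Hypothetical self-contained check: read the sample rows from in-line data. */
data demo;
  /* DSD: two consecutive commas mean a missing value. */
  infile datalines dsd missover;
  input animal $ missing $ number;
datalines;
Cat,,9
Dog,,10
Egg,,11
;
run;

proc print data=demo;
run;
```

With DSD in place, the variable missing should come out blank and number should hold 9, 10 and 11.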

Related

SAS colon format modifier

What do the numbers in the grey box represent? And what's a simple way of understanding how the colon modifier affects the way SAS reads in values?
The answer depends on information not provided. Answer B is the best choice, in the sense that you should use the colon modifier whenever you use informats in the INPUT statement, so that SAS stays in list input mode instead of switching to formatted input mode. Otherwise the formatted input could read too many or too few characters, and it might also leave the cursor in the wrong place for reading the next field.
But if you try to read that data from in-line cards it works fine for those two lines. That is because in-line data lines are padded to the next multiple of 80 bytes.
If you put those lines into a file without any trailing spaces, then the second line fails because there are not 10 characters left to read for the last field. But if you add the TRUNCOVER option (or PAD) to the INFILE statement then it will work.
Try it yourself. TEST1 and TEST3 work. TEST2 gets a LOST CARD note.
data test1;
input name $ hired date9. age state $ salary comma10.;
format hired date9.;
cards;
Donny 5MAR2008 25 FL $43,123.50
Margaret 20FEB2008 43 NC 65,150
;
options parmcards=test;
filename test temp ;
parmcards;
Donny 5MAR2008 25 FL $43,123.50
Margaret 20FEB2008 43 NC 65,150
;
data test2;
infile test;
input name $ hired date9. age state $ salary comma10.;
format hired date9.;
run;
data test3;
infile test truncover;
input name $ hired date9. age state $ salary comma10.;
format hired date9.;
run;
With different data the first formatted input can also cause trouble. For example, if a date value used only 2 digits for the year, the fixed-width DATE9. informat would read past the end of the date and throw off everything after it. SAS then tries to read FL as the age, reads the first 8 characters of the salary as the STATE, and finds only blanks for the SALARY.
data test1;
input name $ hired date9. age state $ salary comma10.;
format hired date9.;
cards;
Donny 5MAR08 25 FL $43,123.50
Margaret 20FEB2008 43 NC 65,150
;
Results:
Obs name hired age state salary
1 Donny 05MAR2008 . $43,123. .
2 Margaret 20FEB2008 43 NC 65150
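For comparison, here is a sketch of the same step with the colon modifier on both informats. The data set name test4 is made up for illustration. With the colon, SAS stays in list input mode and reads each field up to the next blank, so the short date should no longer push the later fields out of position.

```sas
/* Hypothetical variant of TEST1: colon modifiers keep SAS in list input mode. */
data test4;
  input name $ hired :date9. age state $ salary :comma10.;
  format hired date9.;
cards;
Donny 5MAR08 25 FL $43,123.50
Margaret 20FEB2008 43 NC 65,150
;

proc print data=test4;
run;
```

The century assigned to the 2-digit year depends on the YEARCUTOFF system option.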

SAS Infile Statement Not Getting Observations

I need to use the INFILE statement to read a file called np_traffic.csv, name the table traffic2, and only import a column called ReportingDate as a character.
Current Code is giving me the error
"The data set WORK.TRAFFIC2 may be incomplete. When this step was
stopped there were 0 observations and 1 variables."
DATA traffic2;
INFILE “E:/Documents/Week 2/np_traffic.csv”
dsd firstobs=2;
INPUT ReportingDate $;
RUN;
Let's assume that you really have a delimited text file, which is what a CSV file is, instead of the spreadsheet you pictured in the photograph in your post. To read the 6th field in a line you need to first read the first 5 fields. That does not mean you need to use the values read from those fields.
data traffic2;
infile "E:/Documents/Week 2/np_traffic.csv"
dsd firstobs=2
;
length dummy $1 ReportingDate $12;
input 5*dummy ReportingDate ;
drop dummy;
run;
I would suggest trying it this way:
data traffic2;
drop a b c d e g;
infile 'E:\Documents\Week 2\np_traffic.csv' dsd dlm='<Insert your delimiter>' firstobs=2;
input a b c d e f g;
run;
https://documentation.sas.com/?docsetId=lestmtsref&docsetTarget=n1rill4udj0tfun1fvce3j401plo.htm&docsetVersion=9.4&locale=en

SAS Export Issue as it is giving additional double quote

I am trying to export SAS data to CSV. The SAS dataset is named abc and its contents look like this:
LINE_NUMBER DESCRIPTION
524JG 24PC AMEFA VINTAGE CUTLERY SET "DUBARRY"
I am using following code.
filename exprt "C:/abc.csv" encoding="utf-8";
proc export data=abc
outfile=exprt
dbms=tab;
run;
output is
LINE_NUMBER DESCRIPTION
524JG "24PC AMEFA VINTAGE CUTLERY SET ""DUBARRY"""
So there is a double quote before and after the description, and an additional double quote appears before and after the word DUBARRY. I have no clue what's happening. Can someone help me resolve this and help me understand exactly what is happening here?
expected result:
LINE_NUMBER DESCRIPTION
524JG 24PC AMEFA VINTAGE CUTLERY SET "DUBARRY"
There is no need to use PROC EXPORT to create a delimited file. You can write it with a simple DATA step. If you want to create your example file, just do not use the DSD option on the FILE statement. But note that, depending on the data you are writing, you could create a file that cannot be parsed properly because of extra unprotected delimiters. You will also have trouble representing missing values.
Let's make a sample dataset we can use to test.
data have ;
input id value cvalue $ name $20. ;
cards;
1 123 A Normal
2 345 B Embedded|delimiter
3 678 C Embedded "quotes"
4 . D Missing value
5 901 . Missing cvalue
;
Essentially PROC EXPORT is writing the data using the DSD option. Like this:
data _null_;
set have ;
file 'myfile.txt' dsd dlm='09'x ;
put (_all_) (+0);
run;
Which will yield a file like this (with pipes replacing the tabs so you can see them).
1|123|A|Normal
2|345|B|"Embedded|delimiter"
3|678|C|"Embedded ""quotes"""
4||D|Missing value
5|901||Missing cvalue
If you just remove DSD option then you get a file like this instead.
1|123|A|Normal
2|345|B|Embedded|delimiter
3|678|C|Embedded "quotes"
4|.|D|Missing value
5|901| |Missing cvalue
Notice how the second line looks like it has 5 values instead of 4, making it impossible to know how to split it into 4 values. Also notice how each missing value is written as at least one character (a period or a blank) instead of an empty field.
Another way would be to run a data step to convert the normal file that PROC EXPORT generates into the variant format that you want. This might also give you a place to add escape characters to protect special characters if your target format requires them.
data _null_;
infile normal dsd dlm='|' truncover ;
file abnormal dlm='|';
do i=1 to 4 ;
* write the delimiter between fields and hold the output line ;
if i>1 then put '|' @;
* read the next field and hold the input line ;
input field :$32767. @;
* escape backslashes first, then the delimiter ;
field = tranwrd(field,'\','\\');
field = tranwrd(field,'|','\|');
len = lengthn(field);
put field $varying32767. len @;
end;
put;
run;
You could even make this DATA step smart enough to count the number of fields on the first row and use that to control the loop, so that you wouldn't have to hard-code it.
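One possible way to do that counting is sketched below. It assumes the first row has the same number of fields as every other row, and it reuses the filerefs normal and abnormal from the step above. The variable nfields is made up for illustration. COUNTW is called with the Q modifier so that delimiters inside quoted values are ignored, and with the M modifier so that empty fields are still counted.

```sas
data _null_;
  infile normal dsd dlm='|' truncover;
  file abnormal;
  length field $32767;
  retain nfields;
  /* Load the current line into _INFILE_ and hold it for the reads below. */
  input @;
  /* Assumption: row 1 has the same number of fields as all later rows. */
  if _n_ = 1 then nfields = countw(_infile_, '|', 'qm');
  do i = 1 to nfields;
    if i > 1 then put '|' @;
    input field :$32767. @;
    /* Escape backslashes first, then the delimiter. */
    field = tranwrd(field, '\', '\\');
    field = tranwrd(field, '|', '\|');
    len = lengthn(field);
    put field $varying32767. len @;
  end;
  put;
run;
```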

SAS Error: "No logical assign for filename SALARIES"

I am running SAS Studio 3.6 Basic Edition. I am a beginner at SAS and I can't get past this error that I've been having. I have the code below and the file is in the correct place. I created a folder under the "my folders" in the sidebar called "Exercises" and under that I created a folder called "data". It seems that it is not reading the file but I'm not sure why because the path is correct (to my knowledge).
Any ideas? I have already tried googling and most of the results with this error have to do with _WEBOUT which I don't believe is my problem.
DATA SALARIES;
INFILE '/Exercises/data/AAUP_data.txt';
INFILE SALARIES delimiter=',';
INPUT FICE College_Name $ State $ Type $ Average_Salary_Full
Average_Salary_Assoc Average_Salary_Asst Average_Salary_All
Average_Comp_Full Average_Comp_Assoc Average_Comp_Asst Average_Comp_All
Number_of_Professors_Full Number_of_Professors_Assoc
Number_of_Professors_Asst Number_of_Instructors Number_of_Faculty_All
;
RUN;
PROC PRINT;
RUN;
I appreciate the help.
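Your second INFILE statement refers to SALARIES as a fileref, and a fileref has to be defined first with a FILENAME statement. Since no such statement exists, SAS reports that there is no logical assign for SALARIES.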
If your SALARIES infile is meant to be the text file stored at '/Exercises/data/AAUP_data.txt', then there are two ways you could write this:
FILENAME SALARIES '/Exercises/data/AAUP_data.txt';
DATA SALARIES;
INFILE SALARIES delimiter=',';
INPUT FICE College_Name $ State $ Type $ Average_Salary_Full
Average_Salary_Assoc Average_Salary_Asst Average_Salary_All
Average_Comp_Full Average_Comp_Assoc Average_Comp_Asst Average_Comp_All
Number_of_Professors_Full Number_of_Professors_Assoc
Number_of_Professors_Asst Number_of_Instructors Number_of_Faculty_All
;
RUN;
PROC PRINT;
RUN;
or simply
DATA SALARIES;
INFILE '/Exercises/data/AAUP_data.txt' delimiter=',';
INPUT FICE College_Name $ State $ Type $ Average_Salary_Full
Average_Salary_Assoc Average_Salary_Asst Average_Salary_All
Average_Comp_Full Average_Comp_Assoc Average_Comp_Asst Average_Comp_All
Number_of_Professors_Full Number_of_Professors_Assoc
Number_of_Professors_Asst Number_of_Instructors Number_of_Faculty_All
;
RUN;
PROC PRINT;
RUN;

How to read many rows of text as the same observation into SAS?

I have a dataset stored as text. It looks like this:
data in text
I want to read this dataset into SAS like:
dataset I want it to be in SAS
This is my code now:
proc import datafile="myfile" out=mydata DBMS=dlm;
delimiter='09'x;
getnames=no;
run;
But the result just looks like what is stored in the text file. How should I revise the code? Thank you.
Your file looks to have one value per row. Assuming you want to read them into three columns, just let SAS do it for you. You can eliminate any tabs or semicolons by asking SAS to treat them as delimiters. You could try using the FLOWOVER option (which is the default) on the INFILE statement to have it automatically go to the next row.
data want ;
dlm='09'X || ';' ;
infile 'myfile' dlm=dlm flowover ;
input id $ val1 val2 ;
run;
Now if your data has blank rows you might get out of sync and begin trying to read text strings like AA into the numeric variables. If that is the case, you might try telling it to read exactly three rows for every observation.
data want ;
infile 'myfile' truncover ;
input id $ / val1 / val2 ;
run;