I have a dataset, student, in SAS, and I want to delete the records (rows) where grade is non-numeric. I tried the following code and it didn't work. I also tried 'if grade=. then delete;' and that didn't work either. I don't want to replace the values; I just want to delete the rows. Any ideas?
data student;
infile datalines firstobs=2 dsd truncover;
input student$ class$ grade;
datalines;
student, class, grade
Jansen, Brave, A
Yassin, Brave, 70
Benison, Brave, 67
Yan Jin, Brave, E
James, Hero, 90
Michelle, Hero, 89
Hiroku, Hero, C
Misoku, Hero, 93
;
run;
data student_cleaned;
set work.student;
if not (anyalpha(grade)) then delete;
run;
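In the first step you read grade in as numeric. That means any non-numeric values have already been set to missing by the time the data set is created, so there are no letters left to test for. Read it in as character and then the second step can work: add a $ after grade in the INPUT statement.
It is almost a one-character change ;) The only other fix is dropping the NOT in the second step, so that the rows where anyalpha(grade) is true are the ones deleted.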
data student;
infile datalines firstobs=2 dsd truncover;
input student$ class$ grade $;
datalines;
student, class, grade
Jansen, Brave, A
Yassin, Brave, 70
Benison, Brave, 67
Yan Jin, Brave, E
James, Hero, 90
Michelle, Hero, 89
Hiroku, Hero, C
Misoku, Hero, 93
;
run;
data student_cleaned;
set work.student;
if anyalpha(grade) then delete;
run;
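ANYALPHA only detects letters, so a grade such as "7-" or "#" would still pass. A minimal sketch of a stricter test, assuming GRADE was read as character as above: try to convert it with the INPUT function and delete the row when the conversion fails. The 32. informat width is an assumption chosen to be wide enough for any grade; blank grades are deleted too.

```sas
data student_cleaned;
  set work.student;   /* GRADE is character in this version */
  /* ?? suppresses the invalid-data note and keeps _ERROR_ at 0 */
  if missing(input(grade, ?? 32.)) then delete;
run;
```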
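You already read GRADE as a numeric variable, so text that is not a valid number has been stored as a missing value. If a MISSING statement declares single-letter codes like A or E, they are mapped to the corresponding special missing values like .A or .E; any other text is mapped to regular missing, with an invalid-data note in the log.
You can use the MISSING() function to test for any of those, because it is true for regular and special missing values alike.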
if missing(grade) then delete;
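A minimal sketch of that approach as complete steps, assuming GRADE stays numeric. The MISSING statement and the data set name student_num are additions for illustration, not part of the original code; the statement explicitly declares the letters as special missing values. MISSING() is true either way, so the delete does not depend on it.

```sas
data student_num;
  infile datalines firstobs=2 dsd truncover;
  missing A C E;   /* assumption: treat these letters as .A, .C, .E */
  input student $ class $ grade;
datalines;
student, class, grade
Jansen, Brave, A
Yassin, Brave, 70
Hiroku, Hero, C
Misoku, Hero, 93
;

data student_cleaned;
  set student_num;
  /* true for regular missing (.) and for special missing values */
  if missing(grade) then delete;
run;
```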
What do the numbers in the grey box represent? And what's a simple way of understanding how the colon modifier affects the way sas reads in values?
The answer depends on information not provided. Answer B is the best choice, in the sense that you should use the colon modifier whenever you use informats in the INPUT statement, so that SAS stays in list input mode instead of switching to formatted input mode. Otherwise formatted input could read too many or too few characters, and might also leave the cursor in the wrong place for reading the next field.
But if you try to read that data from in-line cards, it works fine for those two lines. That is because in-line data lines are padded to the next multiple of 80 bytes.
If you put those lines into a file without any trailing spaces on the lines then the second line fails because there are not 10 characters to read for the last field. But if you add the TRUNCOVER option (or PAD) to the INFILE statement then it will work.
Try it yourself. TEST1 and TEST3 work. TEST2 gets a LOST CARD note.
data test1;
input name $ hired date9. age state $ salary comma10.;
format hired date9.;
cards;
Donny 5MAR2008 25 FL $43,123.50
Margaret 20FEB2008 43 NC 65,150
;
options parmcards=test;
filename test temp ;
parmcards;
Donny 5MAR2008 25 FL $43,123.50
Margaret 20FEB2008 43 NC 65,150
;
data test2;
infile test;
input name $ hired date9. age state $ salary comma10.;
format hired date9.;
run;
data test3;
infile test truncover;
input name $ hired date9. age state $ salary comma10.;
format hired date9.;
run;
With different data, the first formatted input can also cause trouble. For example, if a date value used only 2 digits for the year, it would throw things off, because DATE9. still reads a full 9 characters and so swallows part of what follows. SAS then tries to read FL as the age, reads the first 8 characters of the salary as STATE, and is left with just blanks for SALARY.
data test1;
input name $ hired date9. age state $ salary comma10.;
format hired date9.;
cards;
Donny 5MAR08 25 FL $43,123.50
Margaret 20FEB2008 43 NC 65,150
;
Results:
Obs    name        hired        age    state       salary
1      Donny       05MAR2008      .    $43,123.         .
2      Margaret    20FEB2008     43    NC           65150
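For comparison, here is a sketch of the same step with the colon modifier on both informats, so that each field is read up to the next delimiter before the informat is applied. The data set name test4 is hypothetical, and how the two-digit year is interpreted depends on the YEARCUTOFF= system option.

```sas
data test4;
  /* colon modifier: list input, field ends at the next blank */
  input name $ hired :date9. age state $ salary :comma10.;
  format hired date9.;
cards;
Donny 5MAR08 25 FL $43,123.50
Margaret 20FEB2008 43 NC 65,150
;

proc print data=test4;
run;
```

Because the field boundaries now come from the delimiters rather than from the informat widths, the short date no longer shifts the fields that follow it.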
I am trying to create a SAS table that holds the descriptions and names of output tables, where each name includes a formatted date. However, the output contains the unformatted date.
My code:
data tablenames;
infile datalines delimiter=',';
input description: $30. sastablename: $30.;
attrib datetoday format=yymmdd6.;
datetoday = date();
mergedtext=catx('_',sastablename,datetoday);
output;
datalines;
Table for Customers,TfC
Table for Sales,TfS
;
The code output gives TfC_20688 for mergedtext variable.
My desired output for mergedtext variable is TfC_160822.
You need to let CATX() know to use the formatted value. Try using the VVALUE() function if your variables are already formatted. Otherwise use the PUT() function to apply the format you want.
data tablenames;
infile datalines delimiter=',';
input description: $30. sastablename: $30.;
attrib datetoday format=yymmddn8.;
datetoday = date();
mergedtext1=catx('_',sastablename,vvalue(datetoday));
mergedtext2=catx('_',sastablename,put(datetoday,yymmddn8.));
datalines;
Table for Customers,TfC
Table for Sales,TfS
;
P.S. Don't use two digit years.
You can use the PUT() function to convert the SAS date in datetoday (the value of which is 20688) to the yymmdd format you want.
data tablenames;
infile datalines delimiter=',';
input description: $30. sastablename: $30.;
mergedtext=catx('_',sastablename,put(date(),yymmddn6.));
put mergedtext=;
output;
datalines;
Table for Customers,TfC
Table for Sales,TfS
;
Log:
mergedtext=TfC_160822
mergedtext=TfS_160822
NOTE: The data set WORK.TABLENAMES has 2 observations and 3 variables.
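To see why CATX() produced 20688 in the first place, here is a small sketch that uses a fixed date literal instead of date(), so the result does not depend on the day it is run. The variable names d, raw and fmt are for illustration only.

```sas
data _null_;
  d = '22AUG2016'd;   /* stored as a day count, not as text */
  /* CATX converts a numeric argument without using its format */
  raw = catx('_', 'TfC', d);
  /* PUT applies the format first, so CATX receives text */
  fmt = catx('_', 'TfC', put(d, yymmddn8.));
  put raw= fmt=;
run;
```

The raw version carries the internal day count into the name, which is the behaviour reported in the question; the PUT version carries the formatted digits.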
The code is very simple:
data test (keep = state state_num);
set raw1314.accident2013_prf;
state_num= put(state,z2.);
run;
The variable "state" contains state names, and the output of this program is:
Obs STATE state_num
1 Alabama 01
But isn't the PUT function used to convert numeric values into character values? Why does it map "Alabama" to "01" here?
Thanks in advance.
Your variable STATE must be numeric (1) with a format applied to it, or character (01) with a format applied, so that it only displays as Alabama. If it actually held the character value Alabama, this would not occur.
data _null_;
x=put('Alabama', z2.);
put x;
run;
Results:
55
56 data _null_;
57 x=put('Alabama', z2.);
___
484
NOTE 484-185: Format $Z was not found or could not be loaded.
58 put x;
59 run;
Al
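A minimal sketch of the numeric case, assuming a user-defined format does the displaying. The format name stname and the data set name demo are hypothetical; the real data set may use a different format.

```sas
proc format;
  value stname 1='Alabama' 2='Alaska';   /* hypothetical lookup */
run;

data demo;
  state = 1;                     /* stored value is the number 1 */
  format state stname.;          /* displayed as Alabama */
  state_num = put(state, z2.);   /* formats the stored number */
run;

proc print data=demo;
run;
```

PROC PRINT shows the formatted value of STATE, while PUT with Z2. works on the stored number, which is how a name and a two-digit code can appear side by side.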
Hi I am trying to import a tab delimited file in SAS that looks like this,
Names Points
Sumit1 10
Sumit2 20
SUmit4 30
SUmit5 85
SUmit6 90
SUmit7 39
hfgö®q-±òSÀ®téîÓVU«‘îj'n5E•d÷Yb#­AK$®SŽ†ÿ-ÍKÕw¿óå0"¤h—t0Ld 89
SUmit8 48
SUmit9 70
SUmit10 20
SUmit11 90
The first row represents column names.
I am using the following code to import the file,
data names;
infile "C:xxxxxxxx\names.txt"
delimiter='09'x MISSOVER DSD lrecl=32767 firstobs=2;
informat names $150.;
informat Points best32.;
format names $150.;
format Points best12.;
input names $
Points;
run;
and the sas data set after import looks like the following:
Names Points
Sumit1 10
Sumit2 20
SUmit4 30
SUmit5 85
SUmit6 90
SUmit7 39
hfgö®q-±òSÀ®téîÓVU«‘îj'n5E•d÷Yb#­AK$®SŽ†ÿ-ÍKÕw¿óå0"¤h—t0Ld .
So basically not all the rows are imported into SAS: it stops at row 7 because of some unusual characters
(I don't know what these characters are called).
There are 1000 files like this that I need to import, so I am using a macro to import them.
Can somebody please help me import this type of file in SAS?
Try this code, changing the lengths to suit your data.
DATA names;
LENGTH Names $ 91 Points 8 ;
FORMAT Names $CHAR91. Points BEST2. ;
INFORMAT Names $CHAR91. Points BEST2. ;
/* FIRSTOBS=2 skips the header row, as in the question's code */
INFILE 'C:xxxxxxxx\names.txt'
LRECL=32767 ENCODING="LATIN1" TERMSTR=CRLF DLM='09'x MISSOVER DSD FIRSTOBS=2 ;
/* ?? suppresses invalid-data notes when Points is not a valid number */
INPUT Names : $CHAR91. Points : ?? BEST2. ;
RUN;