Proc Format/proc tabulate error - sas

I am running the following code:
ods listing close;
ods pdf file = "D:\work.pdf";
proc format;
value MS 1 = "Married - Spouse Present"
2 = "Married - Spouse Absent"
3 = "Widowed"
4 = "Divorced"
5 = "Seperated"
6 = "Never Married";
value Sex 1 = "Male"
2 = "Female";
value Race 1 = "White"
2 = "Black"
4 = "Asian";
value Hispanic 1 = "Hispanic";
value Age (multilabel);
16 - 19 = "16 to 19 years"
20 - 24 = "20 to 24 years"
25 - 54 = "25 to 54 years"
55 - 64 = "55 to 64 years"
16 - 85 = "Total, 16 years and over"
20 - 85 = "20 years and over"
25 - 85 = "25 years and over"
55 - 85 = "55 years and over"
65 - 85 = "65 years and over"
;
quit;
proc tabulate data = Final;
format age age.;
class Age/mlf;
class Race Hispanic_NonHispanic Marital_Status Full_Part_Time_Status Sex Year;
var Multi_Job;
table Age Race Hispanic_nonhispanic Marital_Status Full_Part_Time_Status, Sex*year*Multi_Job All / printmiss;
format Race Race. Hispanic_nonHispanic Hispanic. Marital_Status MS. Sex Sex.;
run;
ods pdf close;
ods listing;
But I get the following error message:
ods listing close;
481 ods pdf file = "D:\work.pdf";
NOTE: Writing ODS PDF output to DISK destination "D:\work.pdf", printer "PDF".
482
483 proc format;
484 value MS 1 = "Married - Spouse Present"
485 2 = "Married - Spouse Absent"
486 3 = "Widowed"
487 4 = "Divorced"
488 5 = "Seperated"
489 6 = "Never Married";
NOTE: Format MS has been output.
490
491 value Sex 1 = "Male"
492 2 = "Female";
NOTE: Format SEX has been output.
493
494 value Race 1 = "White"
495 2 = "Black"
496 4 = "Asian";
NOTE: Format RACE has been output.
497
498 value Hispanic 1 = "Hispanic";
NOTE: Format HISPANIC has been output.
499
500 value Age (multilabel);
NOTE: Format AGE has been output.
501
502 16 - 19 = "16 to 19 years"
............so on
ERROR: Write Access Violation In Task ( TABULATE )
Exception occurred at (679B8D96)
Task Traceback
Address Frame (DBGHELP API Version 4.0 rev 5)
679B8D96 053BF9EC 0001:00057D96 sasxkern.dll
679A0070 053BFAA8 0001:0003F070 sasxkern.dll
679788B2 053BFB3C 0001:000178B2 sasxkern.dll
66FC6323 053BFB4C 0001:00005323 sassfm01.dll
66FCD034 053BFBC8 0001:0000C034 sassfm01.dll
66FDD32B 053BFC28 0001:0001C32B sassfm01.dll
66FCBDC6 053BFCC8 0001:0000ADC6 sassfm01.dll
66FC386E 053BFCEC 0001:0000286E sassfm01.dll
661217DD 053BFF58 0001:000007DD sastabul.dll
67E223EE 053BFF74 0001:000113EE sashost.dll
67E26DE0 053BFF88 0001:00015DE0 sashost.dll
7638338A 053BFF94 kernel32:BaseThreadInitThunk+0x12
772D9A02 053BFFD4 ntdll:RtlInitializeExceptionChain+0x63
772D99D5 053BFFEC ntdll:RtlInitializeExceptionChain+0x36
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 546 observations read from the data set WORK.FINAL.
NOTE: PROCEDURE TABULATE used (Total process time):
real time 0.02 seconds
cpu time 0.03 seconds
522 ods pdf close;
NOTE: ODS PDF printed no output.
(This sometimes results from failing to place a RUN statement before the ODS PDF CLOSE
statement.)
523 ods listing;
What am I doing incorrectly? I also tried (multilabel notsorted) in the proc format step for the Age format, but it still didn't work properly. I'm not sure what I am doing wrong and would appreciate some help.

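You have an extra semicolon at the end of the first line of your Age format:
value Age (multilabel);
That semicolon ends the VALUE statement before any ranges are read, so the Age format is created empty (note that the log prints "NOTE: Format AGE has been output." right after that line). The empty multilabel format is most likely what PROC TABULATE trips over. Remove the semicolon and you should be okay.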
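For illustration, a sketch of the Age format with the stray semicolon removed. The ranges and labels are copied from the question; the only other change is ending the step with run instead of quit, which is a stylistic choice rather than part of the fix.

```sas
proc format;
  /* No semicolon after (multilabel): the ranges below belong to this VALUE statement */
  value Age (multilabel)
    16 - 19 = "16 to 19 years"
    20 - 24 = "20 to 24 years"
    25 - 54 = "25 to 54 years"
    55 - 64 = "55 to 64 years"
    16 - 85 = "Total, 16 years and over"
    20 - 85 = "20 years and over"
    25 - 85 = "25 years and over"
    55 - 85 = "55 years and over"
    65 - 85 = "65 years and over"
  ;  /* the single semicolon that ends the VALUE statement */
run;
```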

Related

SAS: Unable to add variable to data set

I have a data set and am trying to add four new variables using the existing ones. I keep getting an error that says the code is incomplete. I'm having trouble seeing where it is incomplete. How do I fix this?
data dataset;
input ID $
Height
Weight
SBP
DBP
WtKg = Weight/2.2;
HtCm = Height/2.4;
AveBP = DBP + (SBP - DBP)/3;
HtPolynomial = (2*Height)**2 + (1.5*Height)**3;
datalines;
001 68 150 110 70
002 73 240 150 90
003 62 101 120 80
run;
You did not end your input statement with a semicolon. input reads variables from external data (in this case, in-line data with the datalines statement). New variables are not created within input in the way you've specified.
Use input to read in the five variables of your data. After that, create new variables based on those five read-in variables:
data dataset;
input ID $
Height
Weight
SBP
DBP
;
WtKg = Weight/2.2;
HtCm = Height/2.4;
AveBP = DBP + (SBP - DBP)/3;
HtPolynomial = (2*Height)**2 + (1.5*Height)**3;
datalines;
001 68 150 110 70
002 73 240 150 90
003 62 101 120 80
;
run;
Correcting 2 errors should fix this:
Add a semicolon after the last field being read in from the datalines, which is DBP.
(A previous version of this question used the ^ symbol for exponents.) Instead of ^ to raise to the power of something, use **
For reference, the SAS arithmetic operators are listed in the SAS documentation on operators in expressions.
After making the 2 corrections above I ran the revised code below without any errors.
data dataset;
input ID $
Height
Weight
SBP
DBP;
WtKg = Weight/2.2;
HtCm = Height/2.4;
AveBP = DBP + (SBP - DBP)/3;
HtPolynomial = (2*Height)**2 + (1.5*Height)**3;
datalines;
001 68 150 110 70
002 73 240 150 90
003 62 101 120 80
run;
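As a quick check of the exponent operator mentioned above, a minimal sketch that writes one value to the log. The variable name x is only for illustration.

```sas
data _null_;
  x = 2**3;   /* ** is the SAS exponentiation operator */
  put x=;     /* writes x=8 to the log */
run;
```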

SAS: Avoid end-of-line problem and LOST CARD

I'm working through a SAS exercise, which has data in the following format:
3496 Jerry Nelson 13960 Wilson Dr. San Diego CA 92191 40 4
3498 Scott Mason 9226 College Dr. Oak View CA 93022 95 2
3498 CA 35 3
3498 CA 35 11
3500 Michele Stone 8393 West Ct. Emeryville CA 94608 55 5
3500 CA 70 5
For each person, the data continues until the next person's name. The following code is very close to what I need, I think:
libname Ch4data '\\Client\C$\Users\m210028\Google Drive\Adrian\Self-Study\SAS\Chapter4_data';
Data Ch4data.my_donations;
Infile '\\Client\C$\Users\m210028\Google Drive\Adrian\Self-Study\SAS\Chapter4_data\Donations.dat' MISSOVER;
Array amounts(10);
Array months(10);
Input first_name $ 6 - 19
last_name $ 20 - 33
street_address $ 34 - 58
city $ 59 - 88
state_code $ 89 - 93
zip_code $ 94 - 100
amounts(1) 101 - 105 @ 106
months(1);
end = end1;
If ~(end1) Then
Do;
Input test_char $ 6-6 @;
i = 2;
Do While (0 = ANYALPHA(test_char));
Input amounts(i) 101 - 105 @ 106
months(i);
end = end1;
If ~(end1) Then Input test_char $ 6-6 @;
Else test_char = '';
i = i+1;
End;
End;
Run;
Proc Print Data = Ch4data.my_donations;
Title 'Donations to Coastal Humane Society';
Run;
The problem is that I'm getting a LOST CARD note in the log, and the last name in the file, Michele Stone, doesn't make it into the data set. I suspect my code for detecting the end-of-file is incorrect. Could someone please show me how to detect the end-of-file? The SAS documentation is not helpful.
Many thanks for your time!
[UPDATE]: Thanks to Tom's comment, I can now get the last line with the following code:
libname Ch4data '\\Client\C$\Users\m210028\Google Drive\Adrian\Self-Study\SAS\Chapter4_data';
Data Ch4data.my_donations;
Infile '\\Client\C$\Users\m210028\Google Drive\Adrian\Self-Study\SAS\Chapter4_data\Donations.dat' MISSOVER END=end1;
Array amounts(10);
Array months(10);
Input first_name $ 6 - 19
last_name $ 20 - 33
street_address $ 34 - 58
city $ 59 - 88
state_code $ 89 - 93
zip_code $ 94 - 100
amounts(1) 101 - 105 @ 106
months(1);
If ~(end1) Then
Do;
Input test_char $ 6-6 @;
i = 2;
Do While (0 = ANYALPHA(test_char));
Input amounts(i) 101 - 105 @ 106
months(i);
If ~(end1) Then Input test_char $ 6-6 @;
Else test_char = '';
i = i+1;
End;
End;
Run;
Proc Print Data = Ch4data.my_donations;
Title 'Donations to Coastal Humane Society';
Run;
Unfortunately, it's not getting the second-to-last line. For that matter, it's skipping a lot of first lines of records. Thoughts?
You are trying to combine reading and transposing. It is probably easier to read first and then transpose. In fact, you can just read every line as its own observation:
data step1;
Infile example truncover ;
Input first_name $ 6 - 19
last_name $ 20 - 33
street_address $ 34 - 58
city $ 59 - 88
state_code $ 89 - 93
zip_code $ 94 - 100
amount 101 - 105
month 106 - 110
;
if not missing(first_name) then case+1;
run;
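and then carry the names etc. forward. UPDATE ignores missing values in the transaction data set, so updating an empty copy of the data (obs=0) by case fills each continuation line from the line above it: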
data step2;
update step1(obs=0) step1;
by case;
output;
run;
and then transpose.
data want;
do row=1 by 1 until(last.case);
set step2;
by case;
array months [10];
array amounts [10];
months[row]=month;
amounts[row]=amount;
end;
drop row amount month;
run;
You will need to use the line-hold specifier @@ to hold the line when your name check detects the first line of the next group.
filename exercise 'c:\temp\exercise.txt';
* create file to read in;
data _null_;
file exercise;
input;
put _infile_;
datalines;
3496 Jerry Nelson 13960 Wilson Dr. San Diego CA 92191 40 4
3498 Scott Mason 9226 College Dr. Oak View CA 93022 95 2
3498 CA 35 3
3498 CA 35 11
3500 Michele Stone 8393 West Ct. Emeryville CA 94608 55 5
3500 CA 70 5
run;
* read-in the data;
* error will occur if data file has a group with more than 10 months of data;
data want;
infile exercise end=end_of_data ;
array amounts(10);
array months(10);
input first_name $ 6 - 19
last_name $ 20 - 33
street_address $ 34 - 58
city $ 59 - 88
state_code $ 89 - 93
zip_code $ 94 - 100
amounts(1) 101 - 105
@ 106 months(1);
do i = 2 by 1 while (not end_of_data);
input name_check $ 6-6 @@;
if name_check = ' ' then
input amounts(i) 101-105 @106 months(i);
else
leave; /* jump out of loop
* when control returns to top the input will be of the held line
*/
end;
run;
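To see the line-hold behavior in isolation, here is a minimal sketch using in-line data. The data set name pairs and the variables x and y are made up for the illustration. A double trailing at-sign (@@) keeps the current line in the input buffer across iterations of the data step, so the next INPUT continues on the same line instead of loading a new one.

```sas
data pairs;
  /* @@ holds the line across data step iterations, so each */
  /* iteration reads the next x/y pair from the same line   */
  input x y @@;
  datalines;
1 2 3 4
5 6
;
run;

proc print data=pairs;  /* three observations: (1,2), (3,4), (5,6) */
run;
```

The answer above relies on the same mechanism: the held line that turned out to be the start of the next group is still in the buffer when the data step loops back to the first INPUT.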

Doing Principal Components in SAS Using a Holdout and to Score New Data

I am performing Principal Components Analysis in SAS Enterprise Guide and wish to compute factor/component scores on some holdout.
KeepCombinedLR is my primary source of truth. I have another dataset, with the exact same variables, that I would like to be scored without including it in the actual factor analyses.
proc factor data = KeepCombinedLR
simple
method = prin
priors = one
rotate = varimax reorder
mineigen = 1
nfactors = 25
out = FactorScores;
var var1--var40;
run;
data Fitness;
input Age Weight Oxygen RunTime RestPulse RunPulse @@;
datalines;
44 89.47 44.609 11.37 62 178 40 75.07 45.313 10.07 62 185
44 85.84 54.297 8.65 45 156 42 68.15 59.571 8.17 40 166
38 89.02 49.874 9.22 55 178 47 77.45 44.811 11.63 58 176
40 75.98 45.681 11.95 70 176 43 81.19 49.091 10.85 64 162
44 81.42 39.442 13.08 63 174 38 81.87 60.055 8.63 48 170
44 73.03 50.541 10.13 45 168 45 87.66 37.388 14.03 56 186
;
proc factor data=Fitness outstat=FactOut
method=prin rotate=varimax score;
var Age Weight RunTime RunPulse RestPulse;
title 'Factor Scoring Example';
run;
proc print data=FactOut;
title2 'Data Set from PROC FACTOR';
run;
proc score data=Fitness score=FactOut out=FScore;
var Age Weight RunTime RunPulse RestPulse;
run;
proc print data=FScore;
title2 'Data Set from PROC SCORE';
run;
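PROC SCORE will score your data for you. The example above, from the SAS documentation, scores the same data that was factored; to score your holdout, pass the holdout data set as DATA= in PROC SCORE. Note that PROC FACTOR needs the SCORE option and an OUTSTAT= data set so that the scoring coefficients are saved.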
https://documentation.sas.com/?docsetId=statug&docsetTarget=statug_score_examples01.htm&docsetVersion=14.3&locale=en
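Applied to the question's data, a sketch might look like the following. KeepCombinedLR and var1--var40 come from the question; Holdout, FactStats, and HoldoutScores are hypothetical names chosen for the illustration. It assumes the holdout data set stores var1 through var40 in the same order, since the double-dash list depends on variable position.

```sas
/* Fit on the primary data and save the statistics, including scoring */
/* coefficients, to an OUTSTAT= data set (FactStats is a made-up name) */
proc factor data=KeepCombinedLR
    method=prin priors=one
    rotate=varimax reorder
    mineigen=1 nfactors=25
    score outstat=FactStats;
  var var1--var40;
run;

/* Score the holdout (hypothetical data set name) with those coefficients; */
/* the holdout rows play no part in the factor analysis itself             */
proc score data=Holdout score=FactStats out=HoldoutScores;
  var var1--var40;
run;
```

By default PROC SCORE standardizes the new data with the means and standard deviations stored in the SCORE= data set, so the holdout is put on the same scale as the data used for fitting.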

How to shift value of column as new variable name?

I have a dataset that looks like this
ID Model_Value Count_Model
111 24 2
222 12 9
234 88 6
111 88 8
222 24 10
222 88 17
I want it to look like this:
ID Model_12 Model_24 Model_88
111 0 2 8
222 9 10 17
234 0 0 6
I don't think I am searching online for the correct terms. I initially thought a transform might work, but I still want each row to represent the ID, not the model.
How do I go about creating this output from what I have?
Ok I believe this is it! Thank you @mjsqu !!
I was able to do this with the help of this link: http://www.sascommunity.org/mwiki/images/d/dd/PROC_Transpose_slides.pdf
data test_transpose ;
input ID_P Model_Value Count_Model ;
cards;
111 24 2
222 12 9
234 88 6
111 88 8
222 24 10
222 88 17
run;
proc print data=test_transpose;
run;
proc sort data=test_transpose out=test_transpose_S;
By ID_P;
run;
proc transpose
data = test_transpose_S
out = test_transpose_result (drop=_name_)
prefix=Model_;
var Count_Model;
BY ID_P;
id Model_Value;
run;
proc print data=test_transpose_result ;
run;
Output of the original sorted dataset and the transpose!
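One gap remains against the desired output: PROC TRANSPOSE leaves a missing value (.) where an ID has no row for a model, while the question asks for 0. A minimal follow-up sketch, assuming every transposed column name starts with Model_ (the output data set name test_transpose_zero is made up):

```sas
data test_transpose_zero;
  set test_transpose_result;
  array m {*} Model_:;            /* all variables whose names begin with Model_ */
  do i = 1 to dim(m);
    if missing(m{i}) then m{i} = 0;  /* replace missing counts with 0 */
  end;
  drop i;
run;
```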

How to create a running 3 observation average in SAS?

I have a dataset with some volumes in a column and I want to create a second column that contains the average of the previous three observations. Is this possible?
e.g.
data have;
input Vol Avg_pre_4;
datalines;
228 .
141 .
125 .
101 164.66
116 122.33
107 114
74 108
118 99
127 99.67
123 106.33
;
run;
The LAG function is an automatic built-in queue.
VOL_AVG_OF_PRIOR3 = MEAN ( lag(Vol), lag2(Vol), lag3(Vol) );
if _n_ < 4 then VOL_AVG_OF_PRIOR3 = .;
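Put into a complete step, the two statements above might be used like this. The output data set name want is an assumption; have and Vol come from the question. The LAG calls are deliberately executed on every row, because each call only updates its queue when it runs, so placing them inside an IF would give wrong values.

```sas
data want;
  set have(keep=Vol);
  /* lag, lag2, lag3 return Vol from 1, 2, and 3 rows back */
  Vol_Avg_Of_Prior3 = mean(lag(Vol), lag2(Vol), lag3(Vol));
  /* the first three rows do not have three prior observations */
  if _n_ < 4 then Vol_Avg_Of_Prior3 = .;
run;
```

For the fourth row this gives mean(228, 141, 125), about 164.67, which matches the expected column in the question up to rounding.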