SAS: Unable to add variable to data set

I have a data set and am trying to add four new variables using the existing ones. I keep getting an error that says the code is incomplete. I'm having trouble seeing where it is incomplete. How do I fix this?
data dataset;
input ID $
Height
Weight
SBP
DBP
WtKg = Weight/2.2;
HtCm = Height/2.4;
AveBP = DBP + (SBP - DBP)/3;
HtPolynomial = (2*Height)**2 + (1.5*Height)**3;
datalines;
001 68 150 110 70
002 73 240 150 90
003 62 101 120 80
run;

You did not end your input statement with a semicolon. input reads variables from external data (in this case, in-line data with the datalines statement). New variables are not created within input in the way you've specified.
Use input to read in the five variables of your data. After that, create new variables based on those five read-in variables:
data dataset;
input ID $
Height
Weight
SBP
DBP
;
WtKg = Weight/2.2;
HtCm = Height/2.4;
AveBP = DBP + (SBP - DBP)/3;
HtPolynomial = (2*Height)**2 + (1.5*Height)**3;
datalines;
001 68 150 110 70
002 73 240 150 90
003 62 101 120 80
;
run;

Correcting two errors should fix this:
1. Add a semicolon after the last field being read in from the datalines, which is DBP.
2. Use ** rather than ^ to raise a value to a power. (A previous version of this question used the ^ symbol for exponents.)
For reference, the SAS arithmetic operators are described in the SAS documentation.
After making the 2 corrections above I ran the revised code below without any errors.
data dataset;
input ID $
Height
Weight
SBP
DBP;
WtKg = Weight/2.2;
HtCm = Height/2.4;
AveBP = DBP + (SBP - DBP)/3;
HtPolynomial = (2*Height)**2 + (1.5*Height)**3;
datalines;
001 68 150 110 70
002 73 240 150 90
003 62 101 120 80
run;

SAS: Avoid end-of-line problem and LOST CARD

I'm working through a SAS exercise, which has data in the following format:
3496 Jerry Nelson 13960 Wilson Dr. San Diego CA 92191 40 4
3498 Scott Mason 9226 College Dr. Oak View CA 93022 95 2
3498 CA 35 3
3498 CA 35 11
3500 Michele Stone 8393 West Ct. Emeryville CA 94608 55 5
3500 CA 70 5
For each person, the data continues until the next person's name. The following code is very close to what I need, I think:
libname Ch4data '\\Client\C$\Users\m210028\Google Drive\Adrian\Self-Study\SAS\Chapter4_data';
Data Ch4data.my_donations;
Infile '\\Client\C$\Users\m210028\Google Drive\Adrian\Self-Study\SAS\Chapter4_data\Donations.dat' MISSOVER;
Array amounts(10);
Array months(10);
Input first_name $ 6 - 19
last_name $ 20 - 33
street_address $ 34 - 58
city $ 59 - 88
state_code $ 89 - 93
zip_code $ 94 - 100
amounts(1) 101 - 105 @ 106
months(1);
end = end1;
If ~(end1) Then
Do;
Input test_char $ 6-6 @;
i = 2;
Do While (0 = ANYALPHA(test_char));
Input amounts(i) 101 - 105 @ 106
months(i);
end = end1;
If ~(end1) Then Input test_char $ 6-6 @;
Else test_char = '';
i = i+1;
End;
End;
Run;
Proc Print Data = Ch4data.my_donations;
Title 'Donations to Coastal Humane Society';
Run;
The problem is that I'm getting a LOST CARD note in the log, and the last name in the file, Michele Stone, doesn't make it into the data set. I suspect my code for detecting the end-of-file is incorrect. Could someone please show me how to detect the end-of-file? The SAS documentation is not helpful.
Many thanks for your time!
[UPDATE]: Thanks to Tom's comment, I can now get the last line with the following code:
libname Ch4data '\\Client\C$\Users\m210028\Google Drive\Adrian\Self-Study\SAS\Chapter4_data';
Data Ch4data.my_donations;
Infile '\\Client\C$\Users\m210028\Google Drive\Adrian\Self-Study\SAS\Chapter4_data\Donations.dat' MISSOVER END=end1;
Array amounts(10);
Array months(10);
Input first_name $ 6 - 19
last_name $ 20 - 33
street_address $ 34 - 58
city $ 59 - 88
state_code $ 89 - 93
zip_code $ 94 - 100
amounts(1) 101 - 105 @ 106
months(1);
If ~(end1) Then
Do;
Input test_char $ 6-6 @;
i = 2;
Do While (0 = ANYALPHA(test_char));
Input amounts(i) 101 - 105 @ 106
months(i);
If ~(end1) Then Input test_char $ 6-6 @;
Else test_char = '';
i = i+1;
End;
End;
Run;
Proc Print Data = Ch4data.my_donations;
Title 'Donations to Coastal Humane Society';
Run;
Unfortunately, it's not getting the second-to-last line. For that matter, it's skipping a lot of first lines of records. Thoughts?
You are trying to combine reading and transposing. It is probably easier to read first and then transpose. In fact, you can just read every line into its own observation, counting a new case each time a name appears:
data step1;
Infile example truncover ;
Input first_name $ 6 - 19
last_name $ 20 - 33
street_address $ 34 - 58
city $ 59 - 88
state_code $ 89 - 93
zip_code $ 94 - 100
amount 101 - 105
month 106 - 110
;
if not missing(first_name) then case+1;
run;
and then apply the carry-forward of the names etc.
data step2;
update step1(obs=0) step1;
by case;
output;
run;
and then transpose.
data want;
do row=1 by 1 until(last.case);
set step2;
by case;
array months [10];
array amounts [10];
months[row]=month;
amounts[row]=amount;
end;
drop row amount month;
run;
You will need to use the line-holding specifier @@ to hold the line when your name check detects the first line of the next group.
filename exercise 'c:\temp\exercise.txt';
* create file to read in;
data _null_;
file exercise;
input;
put _infile_;
datalines;
3496 Jerry Nelson 13960 Wilson Dr. San Diego CA 92191 40 4
3498 Scott Mason 9226 College Dr. Oak View CA 93022 95 2
3498 CA 35 3
3498 CA 35 11
3500 Michele Stone 8393 West Ct. Emeryville CA 94608 55 5
3500 CA 70 5
run;
* read-in the data;
* error will occur if data file has a group with more than 10 months of data;
data want;
infile exercise end=end_of_data ;
array amounts(10);
array months(10);
input first_name $ 6 - 19
last_name $ 20 - 33
street_address $ 34 - 58
city $ 59 - 88
state_code $ 89 - 93
zip_code $ 94 - 100
amounts(1) 101 - 105
@106 months(1);
do i = 2 by 1 while (not end_of_data);
input name_check $ 6-6 @@;
if name_check = ' ' then
input amounts(i) 101-105 @106 months(i);
else
leave; /* jump out of loop
* when control returns to top the input will be of the held line
*/
end;
run;

Doing Principal Components in SAS Using a Holdout and to Score New Data

I am performing Principal Components Analysis in SAS Enterprise Guide and wish to compute factor/component scores on some holdout.
KeepCombinedLR is my primary source of truth. I have another dataset, with the exact same variables, that I would like to be scored without including it in the actual factor analyses.
proc factor data = KeepCombinedLR
simple
method = prin
priors = one
rotate = varimax reorder
mineigen = 1
nfactors = 25
out = FactorScores;
var var1--var40;
run;
data Fitness;
input Age Weight Oxygen RunTime RestPulse RunPulse @@;
datalines;
44 89.47 44.609 11.37 62 178 40 75.07 45.313 10.07 62 185
44 85.84 54.297 8.65 45 156 42 68.15 59.571 8.17 40 166
38 89.02 49.874 9.22 55 178 47 77.45 44.811 11.63 58 176
40 75.98 45.681 11.95 70 176 43 81.19 49.091 10.85 64 162
44 81.42 39.442 13.08 63 174 38 81.87 60.055 8.63 48 170
44 73.03 50.541 10.13 45 168 45 87.66 37.388 14.03 56 186
;
proc factor data=Fitness outstat=FactOut
method=prin rotate=varimax score;
var Age Weight RunTime RunPulse RestPulse;
title 'Factor Scoring Example';
run;
proc print data=FactOut;
title2 'Data Set from PROC FACTOR';
run;
proc score data=Fitness score=FactOut out=FScore;
var Age Weight RunTime RunPulse RestPulse;
run;
proc print data=FScore;
title2 'Data Set from PROC SCORE';
run;
PROC SCORE will score your data for you: pass your 'holdout' data set as DATA= and the OUTSTAT= data set written by PROC FACTOR (run with the SCORE option) as SCORE=.
https://documentation.sas.com/?docsetId=statug&docsetTarget=statug_score_examples01.htm&docsetVersion=14.3&locale=en
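Applied to the data sets in the question, the same pattern might look like the sketch below. It assumes the holdout data set is called Holdout and stores var1 through var40 in the same order as KeepCombinedLR; the names Holdout, FactStats, and HoldoutScores are made up for illustration and do not come from the question.

```sas
/* Fit on the primary data only. SCORE asks PROC FACTOR to write the
   scoring coefficients into the OUTSTAT= data set. */
proc factor data=KeepCombinedLR
            method=prin priors=one
            rotate=varimax reorder
            mineigen=1 nfactors=25
            score
            outstat=FactStats;   /* hypothetical output name */
  var var1--var40;
run;

/* Apply the saved coefficients to the holdout data.
   Holdout and HoldoutScores are assumed names. */
proc score data=Holdout score=FactStats out=HoldoutScores;
  var var1--var40;
run;
```

Because the holdout never appears in the PROC FACTOR step, it has no influence on the extracted components; it only receives scores computed from the coefficients saved in FactStats.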

How to shift value of column as new variable name?

I have a dataset that looks like this
ID Model_Value Count_Model
111 24 2
222 12 9
234 88 6
111 88 8
222 24 10
222 88 17
I want it to look like this:
ID Model_12 Model_24 Model_88
111 0 2 8
222 9 10 17
234 0 0 6
I don't think I am searching online for the correct terms. I initially thought a transpose might work, but I still want each row to represent the ID, not the model.
How do I go about creating this output from what I have?
OK, I believe this is it! Thank you, @mjsqu!
I was able to do this with the help of this link: http://www.sascommunity.org/mwiki/images/d/dd/PROC_Transpose_slides.pdf
data test_transpose ;
input ID_P Model_Value Count_Model ; /* list input: the values below are space-delimited */
cards;
111 24 2
222 12 9
234 88 6
111 88 8
222 24 10
222 88 17
run;
proc print data=test_transpose;
run;
proc sort data=test_transpose out=test_transpose_S;
By ID_P;
run;
proc transpose
data = test_transpose_S
out = test_transpose_result (drop=_name_)
prefix=Model_;
var Count_Model;
BY ID_P;
id Model_Value;
run;
proc print data=test_transpose_result ;
run;
Output of the original sorted dataset and the transpose!
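PROC TRANSPOSE leaves a missing value wherever an ID has no row for a given model, whereas the desired output in the question shows 0 there. A minimal sketch of one way to fill those in afterwards, assuming every transposed column name starts with Model_ (which holds for the prefix used above); the output name test_transpose_filled and the loop index i are made up for illustration:

```sas
data test_transpose_filled;          /* hypothetical output name */
  set test_transpose_result;
  /* Model_: picks up every variable whose name begins with Model_ */
  array m {*} Model_:;
  do i = 1 to dim(m);
    if missing(m{i}) then m{i} = 0;  /* no row for this ID/model pair */
  end;
  drop i;
run;
```

The columns keep whatever order PROC TRANSPOSE created them in, so a RETAIN or a VAR statement in PROC PRINT may be needed if a specific column order matters.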

How to create a running 3 observation average in SAS?

I have a dataset with some volumes in a column and I want to create a second column that contains the average of the previous three observations. Is this possible?
e.g.
data have;
input Vol Avg_pre_4;
datalines;
228 .
141 .
125 .
101 164.66
116 122.33
107 114
74 108
118 99
127 99.67
123 106.33
;
run;
The LAG function is an automatic built-in queue.
VOL_AVG_OF_PRIOR3 = MEAN ( lag(Vol), lag2(Vol), lag3(Vol) );
if _n_ < 4 then VOL_AVG_OF_PRIOR3 = .;
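Placed inside a complete DATA step, those two statements could look like the sketch below. It reads the have data set from the question; the output name want is an assumption.

```sas
data want;                           /* hypothetical output name */
  set have(keep=Vol);
  /* Call the LAG functions on every observation, not inside an IF,
     so that each queue advances once per row. */
  Vol_Avg_Of_Prior3 = mean(lag(Vol), lag2(Vol), lag3(Vol));
  /* The first three rows have fewer than three prior values. */
  if _n_ < 4 then Vol_Avg_Of_Prior3 = .;
run;
```

For the fourth row this averages 228, 141, and 125, which is about 164.67 and agrees with the value listed in the question.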

In the following SAS statement, what do the parameters "noobs" and "label" stand for?

In the following SAS statement, what do the parameters "noobs" and "label" stand for?
proc print data=sasuser.schedule noobs label;
Per the SAS 9.2 documentation on PROC PRINT:
"NOOBS - Suppress the column in the output that identifies each observation by number"
"LABEL - Use variables' labels as column headings"
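A small sketch of both options together, using the sashelp.class sample data set that ships with SAS rather than the sasuser.schedule data from the question; the label text is made up for illustration:

```sas
proc print data=sashelp.class noobs label;
  var Name Age Height;
  /* LABEL on the PROC statement makes this text the column heading */
  label Height = 'Height (inches)';
run;
```

Without LABEL the heading would be the variable name Height, and without NOOBS an extra Obs column would appear on the left.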
NOOBS suppresses the column of observation numbers (1, 2, 3, 4, 5, ...).
Results without NOOBS:
my first title
Obs  name    sex  group  height  weight
  1  mike    m    a          21     150
  2  henry   m    b          30     140
  3  norian  f    b          18     130
  4  nadine  f    b          32     135
  5  dianne  f    a          23     135
Results with NOOBS:
my first title
name    sex  group  height  weight
mike    m    a          21     150
henry   m    b          30     140
norian  f    b          18     130
nadine  f    b          32     135
dianne  f    a          23     135