Create date variable from time (Using SAS 9.3) - sas

Using SAS 9.3
I have files with two variables (Time and pulse), one file for each person.
I have the information which date they started measuring for each person.
Now I want create a date variable whom change date at midnight (of course), how?
Example from text files:
23:58:02 106
23:58:07 105
23:58:12 103
23:58:17 98
23:58:22 100
23:58:27 97
23:58:32 99
23:58:37 100
23:58:42 99
23:58:47 104
23:58:52 95
23:58:57 96
23:59:02 98
23:59:07 96
23:59:12 104
23:59:17 109
23:59:22 105
23:59:27 111
23:59:32 111
23:59:37 104
23:59:42 110
23:59:47 100
23:59:52 106
23:59:57 114
00:00:02 123
00:00:07 130
00:00:12 130
00:00:17 125
00:00:22 119
00:00:27 116
00:00:32 122
00:00:37 116
00:00:42 119
00:00:47 117
00:00:52 114
00:00:57 114
00:01:02 110
00:01:07 103
00:01:12 98
00:01:17 98
00:01:22 102
00:01:27 97
00:01:32 99
00:01:37 93
00:01:42 97
00:01:47 103
00:01:52 96
00:01:57 93
00:02:02 93
00:02:07 95
00:02:12 106
00:02:17 99
00:02:22 102
00:02:27 96
00:02:32 93
00:02:37 97
00:02:42 102
00:02:47 101
00:02:52 95
00:02:57 92
00:03:02 100
00:03:07 95
00:03:12 102
00:03:17 102
00:03:22 109
00:03:27 109
00:03:32 107
00:03:37 111
00:03:42 112
00:03:47 113
00:03:52 115

Regex:
\d{2}:\d{2}:\d{2} \d*
See here for an example and play around with regex:
https://regex101.com/r/xF1fQ5/1
EDIT: and have a look at the SAS regex tip sheet: http://support.sas.com/rnd/base/datastep/perl_regexp/regexp-tip-sheet.pdf

Something like this:
Date lastDate = startDate;
List<NData> ListData = new ArrayList<NData>();
for(FileData fdat:ListFileData){
Date nDate = this.getDate(lastDate,fdat.gettime());
NData ndata= new NData(ndate,fdat.getMeasuring());
LisData.add(nData);
lastDate = nDate;
}
.
.
.
.
function Date getDate(Date ld,String time){
Calendar cal = Calendar.getInstance();
cal.setTime(ld);
int year = cal.get(Calendar.YEAR);
int month = cal.get(Calendar.MONTH)+1;
int day = cal.get(Calendar.DAY_OF_MONTH);
int hourOfDay = this.getHour(time);
int minuteOfHour = this.getMinute(time);
org.joda.time.LocalDateTime lastDate = new org.joda.time.LocalDateTime(ld)
org.joda.time.LocalDateTime newDate = new org.joda.time.LocalDateTime(year,month,day,hourOfDay,minuteOfHour);
if(newDate.isBefore(lastDate)){
newDate = newDate.plusDays(1);
}
return newDate.toDate();
}

It's hard to provide a complete answer without sample code, but the SAS lag() function might be enough to do what you need. Your data step would include lines like the following, assuming your time variable is called time and your date variable is called date:
retain date;
if time < lag(time) then date = date + 1;
This assumes you never have any 24 hour gaps (but it appears you'd have to assume that anyway).
This answer also assumes that the time field is already in a SAS time format.

Related

SAS: Unable to add variable to data set

I have a data set and am trying to add four new variables using the existing ones. I keep getting an error that says the code is incomplete. I'm having trouble seeing where it is incomplete. How do I fix this?
data dataset;
input ID $
Height
Weight
SBP
DBP
WtKg = Weight/2.2;
HtCm = Height/2.4;
AveBP = DBP + (SBP - DBP)/3;
HtPolynomial = (2*Height)**2 + (1.5*Height)**3;
datalines;
001 68 150 110 70
002 73 240 150 90
003 62 101 120 80
run;
You did not end your input statement with a semicolon. input reads variables from external data (in this case, in-line data with the datalines statement). New variables are not created within input in the way you've specified.
Use input to read in the five variables of your data. After that, create new variables based on those five read-in variables:
data dataset;
input ID $
Height
Weight
SBP
DBP
;
WtKg = Weight/2.2;
HtCm = Height/2.4;
AveBP = DBP + (SBP - DBP)/3;
HtPolynomial = (2*Height)**2 + (1.5*Height)**3;
datalines;
001 68 150 110 70
002 73 240 150 90
003 62 101 120 80
;
run;
Correcting 2 errors should fix this:
Add a semicolon after the last field being read in from the datalines, which is DBP.
(A previous version of this question used the ^ symbol for exponents.) Instead of ^ to raise to the power of something, use **
For reference, SAS arithmetic operators are described here.
After making the 2 corrections above I ran the revised code below without any errors.
data dataset;
input ID $
Height
Weight
SBP
DBP;
WtKg = Weight/2.2;
HtCm = Height/2.4;
AveBP = DBP + (SBP - DBP)/3;
HtPolynomial = (2*Height)**2 + (1.5*Height)**3;
datalines;
001 68 150 110 70
002 73 240 150 90
003 62 101 120 80
run;

How can I delete objects without complete data by using stata

I have a large panel dataset that looks as follows.
input id age high weight str6 daily_drink
1 10 110 35 water
1 10 110 35 coffee
1 11 120 38 water
1 11 120 38 coffee
1 12 130 50 water
1 12 130 50 coffee
2 11 118 31 water
2 11 118 31 coffee
2 11 118 31 milk
2 12 123 38 water
2 12 123 38 coffee
2 12 123 38 milk
3 10 98 55 water
3 11 116 36 water
3 12 129 39 water
4 12 125 40 water
end
However, I would like to use stata to keep objects with complete 10, 11, and 12 age. Looks like this.
id age high weight daily_drink
1 10 110 35 water
1 10 110 35 coffee
1 11 120 38 water
1 11 120 38 coffee
1 12 130 50 water
1 12 130 50 coffee
3 10 98 55 water
3 11 116 36 water
3 12 129 39 water
However, all the rows are without missing data, so I cannot simply delete the row with missing data. Is there any way to do it? Any suggestion will help. Thanks in advance.
You can use bysort and egen for this. Something along the lines of
bysort id: egen has10 = total(age==10)
bysort id: egen has11 = total(age==11)
bysort id: egen has12 = total(age==12)
keep if (has10 != 0) & (has11 != 0) & (has12 != 0)
should work (untested). See help egen for more info. Install gtools if you have very large data (ssc install gtools) and then replace egen by gegen.
A solution that works if 10, 11, 12 are the only age values possible:
bysort id (age) : gen nvals = sum(age != age[_n-1])
by id : replace nvals = nvals[_N]
keep if nvals == 3
Consider also
bysort id (age) : gen OK1 = age[1] == 10 & age[_N] == 12
by id : egen OK2 = max(age == 11)
keep if OK1 & OK2

Doing Principal Components in SAS Using a Holdout and to Score New Data

I am performing Principal Components Analysis in SAS Enterprise Guide and wish to compute factor/component scores on some holdout.
KeepCombinedLR is my primary source of truth. I have another dataset, with the exact same variables, that I would like to be scored without including it in the actual factor analyses.
proc factor data = KeepCombinedLR
simple
method = prin
priors = one
rotate = varimax reorder
mineigen = 1
nfactors = 25
out = FactorScores;
var var1--var40;
run;
data Fitness;
input Age Weight Oxygen RunTime RestPulse RunPulse ##;
datalines;
44 89.47 44.609 11.37 62 178 40 75.07 45.313 10.07 62 185
44 85.84 54.297 8.65 45 156 42 68.15 59.571 8.17 40 166
38 89.02 49.874 9.22 55 178 47 77.45 44.811 11.63 58 176
40 75.98 45.681 11.95 70 176 43 81.19 49.091 10.85 64 162
44 81.42 39.442 13.08 63 174 38 81.87 60.055 8.63 48 170
44 73.03 50.541 10.13 45 168 45 87.66 37.388 14.03 56 186
;
proc factor data=Fitness outstat=FactOut
method=prin rotate=varimax score;
var Age Weight RunTime RunPulse RestPulse;
title 'Factor Scoring Example';
run;
proc print data=FactOut;
title2 'Data Set from PROC FACTOR';
run;
proc score data=Fitness score=FactOut out=FScore;
var Age Weight RunTime RunPulse RestPulse;
run;
proc print data=FScore;
title2 'Data Set from PROC SCORE';
run;
PROC SCORE will score your data for you, using your 'holdout' data set.
https://documentation.sas.com/?docsetId=statug&docsetTarget=statug_score_examples01.htm&docsetVersion=14.3&locale=en

How to shift value of column as new variable name?

I have a dataset that looks like this
ID Model_Value Count_Model
111 24 2
222 12 9
234 88 6
111 88 8
222 24 10
222 88 17
I want it to look like this:
ID Model_12 Model_24 Model_88
111 0 2 8
222 9 10 17
234 0 0 6
I don't think I am searching online for the correct terms, I thought initially a transform might work but I still want the row to represent the ID not the model.
How do I go about creating this output from what I have?
Ok I believe this is it! Thank you #mjsqu !!
I was able to do this with the help of this link: http://www.sascommunity.org/mwiki/images/d/dd/PROC_Transpose_slides.pdf
data test_transpose ;
input #1 ID_P #6 Model_Value #18 Count_Model ;
cards;
111 24 2
222 12 9
234 88 6
111 88 8
222 24 10
222 88 17
run;
proc print data=test_transpose;
run;
proc sort data=test_transpose out=test_transpose_S;
By ID_P;
run;
proc transpose
data = test_transpose_S
out = test_transpose_result (drop=_name_)
prefix=Model_Value;
var Count_Model;
BY ID_P;
id Model_Value;
run;
proc print data=test_transpose_result ;
run;
Output of the original sorted dataset and the transpose!

How to create a running 3 observation average in SAS?

I have a dataset with some volumes in a column and I want to create a second column that contains the average of the previous three observations. Is this possible?
e.g.
data have;
input Vol Avg_pre_4;
datalines;
228 .
141 .
125 .
101 164.66
116 122.33
107 114
74 108
118 99
127 99.67
123 106.33
;
run;
The LAG function is an automatic built-in queue.
VOL_AVG_OF_PRIOR3 = MEAN ( lag(Vol), lag2(Vol), lag3(Vol) )
if _n_ < 4 then VOL_AVG_OF_PRIOR3 = .;