SAS Proc IML rolling window (loop)

I want to modify a working module (the second one below) into the first one below, so that instead of using the whole sample of p from 1 to m, the module uses only the 18 values before and the 18 values after the time point x, i.e. p[x-18 ... x+18]. But I end up with an error and can't see where the problem is. The error message, with the full log, is at the end of the post.
start mhatx2(m,p,h,pi,e);
t5=j(m,1); /*mhatx omit x=t*/
upb=m-18;
do x=19 to upb;
lo=x-18;
up=x+18;
i=T(lo:up);
temp1=x-i;
ue=Kmod(temp1,h,pi,e)#p[i];
le=Kmod(temp1,h,pi,e);
t5[x]=(sum(ue)-ue[x])/(sum(le)-le[x]);
end;
return (t5);
finish;
start mhatx2(m,p,h,pi,e);
t5=j(m,1); /*mhatx omit x=t*/
do x=1 to nrow(p);
i=T(1:m);
temp1=x-i;
ue=Kmod(temp1,h,pi,e)#p[i];
le=Kmod(temp1,h,pi,e);
t5[x]=(sum(ue)-ue[x])/(sum(le)-le[x]);
end;
return (t5);
finish;
Error message:
430 proc iml;
NOTE: IML Ready
431
432
433 EDIT kirjasto.basfraaka var "open";
434
435 read all var "open" into p;
436
437
438 m=nrow(p);
439 x=T(1:m);
440 pi=constant("pi");
441 e=constant("e");
442
443 h=0.75;
444
445 start Kmod(x,h,pi,e);
446 k=1/(h#(2#pi)##(1/2))#e##(-x##2/(2#h##2));
447 return (k);
448 finish;
NOTE: Module KMOD defined.
449 start mhatx2(m,p,h,pi,e);
450 t5=j(m,1);
450! /*mhatx omit x=t*/
451 upb=m-18;
452 do x=19 to upb;
453 lo=x-18;
454 up=x+18;
455 i=T(lo:up);
456 temp1=x-i;
457 ue=Kmod(temp1,h,pi,e)#p[i];
458 le=Kmod(temp1,h,pi,e);
459 t5[x]=(sum(ue)-ue[x])/(sum(le)-le[x]);
460 end;
461 return (t5);
462 finish;
NOTE: Module MHATX2 defined.
463
464 ptz=j(m,1);
465 ptz=mhatx2(m,p,h,pi,e);
ERROR: (execution) Invalid subscript or subscript out of range.
operation : [ at line 459 column 18
operands : ue, x
ue 37 rows 1 col (numeric)
x 1 row 1 col (numeric)
38
statement : ASSIGN at line 459 column 1
traceback : module MHATX2 at line 459 column 1
NOTE: Paused in module MHATX2.
466 print ptz;
ERROR: Matrix ptz has not been set to a value.
statement : PRINT at line 466 column 1

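It looks like this line:
t5[x]=(sum(ue)-ue[x])/(sum(le)-le[x]);
is incorrectly referencing ue and le members. Inside the loop ue and le have only 37 rows (the window lo:up), while x is a position in the full series, so ue[x] goes out of range as soon as x exceeds 37 (the log shows x = 38). For x from 19 to 37 the subscript is valid but picks the wrong element. If you're trying to subtract out the 'current iteration' piece, then you want
t5[x]=(sum(ue)-ue[19])/(sum(le)-le[19]);
since element 19 is the 'middle' of the 37-row range (which corresponds to the current x value).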
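As a cross-check on the indexing, here is a minimal Python sketch of the same leave-one-out windowed kernel average. It assumes the standard Gaussian kernel, which is what Kmod appears intended to compute; the names `gaussian_kernel`, `mhat_window` and `half_width` are illustrative and not from the original post. Unlike the IML module, which leaves the edge positions at the initial value 1, this sketch marks them as NaN.

```python
import math


def gaussian_kernel(u, h):
    """Standard Gaussian kernel with bandwidth h (assumed form of Kmod)."""
    return math.exp(-u * u / (2.0 * h * h)) / (h * math.sqrt(2.0 * math.pi))


def mhat_window(p, h, half_width=18):
    """Leave-one-out kernel average of p over [x - half_width, x + half_width].

    Positions too close to either end to have a full window stay NaN.
    """
    m = len(p)
    out = [float("nan")] * m
    for x in range(half_width, m - half_width):
        num = 0.0
        den = 0.0
        for i in range(x - half_width, x + half_width + 1):
            if i == x:
                continue  # omit the current point, like subtracting ue[19]
            w = gaussian_kernel(x - i, h)
            num += w * p[i]
            den += w
        out[x] = num / den
    return out
```

Skipping `i == x` inside the window plays the role of subtracting the middle element: the position being omitted is identified relative to the window, not relative to the full series.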

Related

How to create a running 3 observation average in SAS?

I have a dataset with some volumes in a column and I want to create a second column that contains the average of the previous three observations. Is this possible?
e.g.
data have;
input Vol Avg_pre_4;
datalines;
228 .
141 .
125 .
101 164.66
116 122.33
107 114
74 108
118 99
127 99.67
123 106.33
;
run;
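The LAG function is an automatic built-in queue: LAG, LAG2 and LAG3 return the value of Vol from one, two and three observations back. Inside a DATA step:
VOL_AVG_OF_PRIOR3 = MEAN ( lag(Vol), lag2(Vol), lag3(Vol) );
if _n_ < 4 then VOL_AVG_OF_PRIOR3 = .;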
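For readers who want to check the expected numbers outside SAS, the same "mean of the previous three observations" logic can be sketched in Python. The function name `mean_of_prior` is hypothetical; the input values are the Vol column from the question.

```python
def mean_of_prior(values, k=3):
    """Mean of the k observations before each position; None until k exist."""
    out = []
    for n in range(len(values)):
        if n < k:
            out.append(None)  # same role as setting the result to missing
        else:
            out.append(sum(values[n - k:n]) / k)
    return out


vol = [228, 141, 125, 101, 116, 107, 74, 118, 127, 123]
avg = mean_of_prior(vol)
```

The first three entries are left empty because fewer than three prior observations exist, which mirrors the `_n_ < 4` guard.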

Reading Messy SAS Data

How can I read this -
C 303 102 140 B 293 C 399 B 450 233 456
450 A 289 282 555
like this -
Group Score
C 303
C 102
C 140
B 293
C 399
B 450
B 233
B 456
B 450
A 289
A 282
A 555
In SAS? I have tried the @'character' column pointer, which I can't seem to get right. This is the code so far:
data OUTCOMES;
infile 'testscores.txt';
input @'C' SCORES;
run;
The double trailing at sign (@@) for held input looks good [pun intended] for scanning inputs across line boundaries.
Construct an example external data file:
filename haveFile temp;
data _null_;
file haveFile;
put " C 303 102 140 B 293 C 399 B 450 233 456";
put "450 A 289 282 555";
run;
Read from the file, one token at a time.
data have ;
attrib
token group length=$10
score length=8
;
retain group;
infile haveFile ;
input token @@;
score = input (token, ?? 12.); * check if token can be interpreted as a number, the ?? modifier prevents errors and notes in the log;
if missing (score) and token ne '.' then
group = token;
else
output;
run;
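The token-at-a-time idea can be illustrated with a short Python sketch: any token that does not parse as a number becomes the current group, and every numeric token is emitted with the group retained so far. The function name `parse_groups` is an assumption for illustration; the sample text is the data from the question.

```python
def parse_groups(text):
    """Split text into tokens and pair each score with the last group seen."""
    rows = []
    group = None
    for token in text.split():  # split() ignores line breaks, like held input
        try:
            score = int(token)
        except ValueError:
            group = token  # non-numeric token: remember it as the group
            continue
        rows.append((group, score))
    return rows


sample = " C 303 102 140 B 293 C 399 B 450 233 456\n450 A 289 282 555"
rows = parse_groups(sample)
```

Note that the score 450 at the start of the second line is still assigned to group B, because the group is carried across the line boundary.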

Python frequency of 2D array

I want to see the frequency of the data for each year.
My array looks like this : List[Data,Year]
List[[259,1910],[259,1910],[259,1910],[192,1910].....
Data Year
259 1910
259 1910
259 1910
192 1910
313 1910
259 1911
259 1911
192 1912
313 1912
I want to get the result like
Data Year Frequency
259 1910 3
259 1911 2
259 1912 0
192 1910 1
192 1911 0
192 1912 1
...
..
.
You can use a dictionary to count frequencies. Python allows a tuple to be used as a dictionary key.
data = [259, 259, 192, 313, 259, 259, 192, 313]
yrs = [1910, 1910, 1910, 1910, 1911, 1911, 1912, 1912]
frequencies = {}
for idx in range(len(data)):
    key = (data[idx], yrs[idx])
    if key in frequencies:
        frequencies[key] += 1
    else:
        frequencies[key] = 1
data_with_freq = []
# items() works in Python 3; iteritems() exists only in Python 2
for key, freq in frequencies.items():
    print(key[0], key[1], freq)
    data_with_freq.append((key[0], key[1], freq))
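The desired output in the question also lists combinations that never occur, with a frequency of 0. One way to get those rows, sketched here with `collections.Counter`, is to loop over every (value, year) combination and look each one up; a `Counter` returns 0 for keys it has not seen. The variable names are illustrative, and the input is the nine rows shown in the question.

```python
from collections import Counter

pairs = [
    [259, 1910], [259, 1910], [259, 1910], [192, 1910], [313, 1910],
    [259, 1911], [259, 1911], [192, 1912], [313, 1912],
]

counts = Counter((d, y) for d, y in pairs)

# Distinct data values in order of first appearance, years in ascending order.
values = list(dict.fromkeys(d for d, _ in pairs))
years = sorted({y for _, y in pairs})

# One row per combination; missing combinations get a count of 0.
table = [(d, y, counts[(d, y)]) for d in values for y in years]

for row in table:
    print(*row)
```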

Proc Format/proc tabulate error

I am running the following code:
ods listing close;
ods pdf file = "D:\work.pdf";
proc format;
value MS 1 = "Married - Spouse Present"
2 = "Married - Spouse Absent"
3 = "Widowed"
4 = "Divorced"
5 = "Seperated"
6 = "Never Married";
value Sex 1 = "Male"
2 = "Female";
value Race 1 = "White"
2 = "Black"
4 = "Asian";
value Hispanic 1 = "Hispanic";
value Age (multilabel);
16 - 19 = "16 to 19 years"
20 - 24 = "20 to 24 years"
25 - 54 = "25 to 54 years"
55 - 64 = "55 to 64 years"
16 - 85 = "Total, 16 years and over"
20 - 85 = "20 years and over"
25 - 85 = "25 years and over"
55 - 85 = "55 years and over"
65 - 85 = "65 years and over"
;
quit;
proc tabulate data = Final;
format age age.;
class Age/mlf;
class Race Hispanic_NonHispanic Marital_Status Full_Part_Time_Status Sex Year;
var Multi_Job;
table Age Race Hispanic_nonhispanic Marital_Status Full_Part_Time_Status, Sex*year*Multi_Job All / printmiss;
format Race Race. Hispanic_nonHispanic Hispanic. Marital_Status MS. Sex Sex.;
run;
ods pdf close;
ods listing;
But I get the following error message:
ods listing close;
481 ods pdf file = "D:\work.pdf";
NOTE: Writing ODS PDF output to DISK destination "D:\work.pdf", printer "PDF".
482
483 proc format;
484 value MS 1 = "Married - Spouse Present"
485 2 = "Married - Spouse Absent"
486 3 = "Widowed"
487 4 = "Divorced"
488 5 = "Seperated"
489 6 = "Never Married";
NOTE: Format MS has been output.
490
491 value Sex 1 = "Male"
492 2 = "Female";
NOTE: Format SEX has been output.
493
494 value Race 1 = "White"
495 2 = "Black"
496 4 = "Asian";
NOTE: Format RACE has been output.
497
498 value Hispanic 1 = "Hispanic";
NOTE: Format HISPANIC has been output.
499
500 value Age (multilabel);
NOTE: Format AGE has been output.
501
502 16 - 19 = "16 to 19 years"
............so on
ERROR: Write Access Violation In Task ( TABULATE )
Exception occurred at (679B8D96)
Task Traceback
Address Frame (DBGHELP API Version 4.0 rev 5)
679B8D96 053BF9EC 0001:00057D96 sasxkern.dll
679A0070 053BFAA8 0001:0003F070 sasxkern.dll
679788B2 053BFB3C 0001:000178B2 sasxkern.dll
66FC6323 053BFB4C 0001:00005323 sassfm01.dll
66FCD034 053BFBC8 0001:0000C034 sassfm01.dll
66FDD32B 053BFC28 0001:0001C32B sassfm01.dll
66FCBDC6 053BFCC8 0001:0000ADC6 sassfm01.dll
66FC386E 053BFCEC 0001:0000286E sassfm01.dll
661217DD 053BFF58 0001:000007DD sastabul.dll
67E223EE 053BFF74 0001:000113EE sashost.dll
67E26DE0 053BFF88 0001:00015DE0 sashost.dll
7638338A 053BFF94 kernel32:BaseThreadInitThunk+0x12
772D9A02 053BFFD4 ntdll:RtlInitializeExceptionChain+0x63
772D99D5 053BFFEC ntdll:RtlInitializeExceptionChain+0x36
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 546 observations read from the data set WORK.FINAL.
NOTE: PROCEDURE TABULATE used (Total process time):
real time 0.02 seconds
cpu time 0.03 seconds
522 ods pdf close;
NOTE: ODS PDF printed no output.
(This sometimes results from failing to place a RUN statement before the ODS PDF CLOSE
statement.)
523 ods listing;
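What am I doing incorrectly? I also tried (multilabel notsorted) in the proc format for age, but it still didn't work properly. I'm not sure what I am doing wrong and would appreciate some help.
You have an extra semicolon at the end of the first line of your age format:
value Age (multilabel);
That semicolon ends the VALUE statement, so the Age format is created with no ranges (the log shows "NOTE: Format AGE has been output." right after line 500) and the range lines that follow are no longer part of it. Remove it and you should be okay.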

Create date variable from time (Using SAS 9.3)

Using SAS 9.3
I have files with two variables (Time and pulse), one file for each person.
I have the information which date they started measuring for each person.
Now I want to create a date variable that rolls over to the next date at midnight. How can I do that?
Example from text files:
23:58:02 106
23:58:07 105
23:58:12 103
23:58:17 98
23:58:22 100
23:58:27 97
23:58:32 99
23:58:37 100
23:58:42 99
23:58:47 104
23:58:52 95
23:58:57 96
23:59:02 98
23:59:07 96
23:59:12 104
23:59:17 109
23:59:22 105
23:59:27 111
23:59:32 111
23:59:37 104
23:59:42 110
23:59:47 100
23:59:52 106
23:59:57 114
00:00:02 123
00:00:07 130
00:00:12 130
00:00:17 125
00:00:22 119
00:00:27 116
00:00:32 122
00:00:37 116
00:00:42 119
00:00:47 117
00:00:52 114
00:00:57 114
00:01:02 110
00:01:07 103
00:01:12 98
00:01:17 98
00:01:22 102
00:01:27 97
00:01:32 99
00:01:37 93
00:01:42 97
00:01:47 103
00:01:52 96
00:01:57 93
00:02:02 93
00:02:07 95
00:02:12 106
00:02:17 99
00:02:22 102
00:02:27 96
00:02:32 93
00:02:37 97
00:02:42 102
00:02:47 101
00:02:52 95
00:02:57 92
00:03:02 100
00:03:07 95
00:03:12 102
00:03:17 102
00:03:22 109
00:03:27 109
00:03:32 107
00:03:37 111
00:03:42 112
00:03:47 113
00:03:52 115
Regex:
\d{2}:\d{2}:\d{2} \d*
See here for an example and play around with regex:
https://regex101.com/r/xF1fQ5/1
EDIT: and have a look at the SAS regex tip sheet: http://support.sas.com/rnd/base/datastep/perl_regexp/regexp-tip-sheet.pdf
Something like this (Java with Joda-Time; FileData, NData, getHour and getMinute are helper types and methods that are not shown here):
Date lastDate = startDate;
List<NData> ListData = new ArrayList<NData>();
for (FileData fdat : ListFileData) {
    // Build the full timestamp from the previous one and the new time of day
    Date nDate = this.getDate(lastDate, fdat.gettime());
    NData nData = new NData(nDate, fdat.getMeasuring());
    ListData.add(nData);
    lastDate = nDate;
}
...
Date getDate(Date ld, String time) {
    Calendar cal = Calendar.getInstance();
    cal.setTime(ld);
    int year = cal.get(Calendar.YEAR);
    int month = cal.get(Calendar.MONTH) + 1; // Calendar months are 0-based
    int day = cal.get(Calendar.DAY_OF_MONTH);
    int hourOfDay = this.getHour(time);
    int minuteOfHour = this.getMinute(time);
    org.joda.time.LocalDateTime lastDate = new org.joda.time.LocalDateTime(ld);
    // Seconds are ignored in this sketch
    org.joda.time.LocalDateTime newDate = new org.joda.time.LocalDateTime(year, month, day, hourOfDay, minuteOfHour);
    // If the new time is earlier than the previous one, midnight has passed
    if (newDate.isBefore(lastDate)) {
        newDate = newDate.plusDays(1);
    }
    return newDate.toDate();
}
It's hard to provide a complete answer without sample code, but the SAS lag() function might be enough to do what you need. Your data step would include lines like the following, assuming your time variable is called time and your date variable is called date:
retain date;
if time < lag(time) then date = date + 1;
This assumes you never have any 24 hour gaps (but it appears you'd have to assume that anyway).
This answer also assumes that the time field is already in a SAS time format.
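The same rollover rule (advance the date whenever the time of day goes backwards) can be sketched in Python to make the logic concrete. The function name `attach_dates` and the start date 2015-03-01 are assumptions for illustration only; the times are taken from the sample around midnight. As with the SAS version, this assumes consecutive readings are never 24 hours or more apart.

```python
from datetime import date, datetime, timedelta


def attach_dates(start_date, times):
    """Combine times of day with a running date that advances at midnight."""
    current = start_date
    previous = None
    stamped = []
    for t in times:
        # A time earlier than the previous one means midnight was crossed.
        if previous is not None and t < previous:
            current += timedelta(days=1)
        stamped.append(datetime.combine(current, t))
        previous = t
    return stamped


readings = ["23:59:52", "23:59:57", "00:00:02", "00:00:07"]
times = [datetime.strptime(s, "%H:%M:%S").time() for s in readings]
stamped = attach_dates(date(2015, 3, 1), times)  # hypothetical start date
```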