Difference between WEKA instance predictions and confusion matrix results? - weka

I am not new to data mining, so I am completely stumped by these WEKA results. I was hoping for some help. Thanks in advance!
I have a data set of numeric vectors with a binary classification (S, H). I train a NaiveBayes model (although the method really doesn't matter) using leave-one-out cross-validation. The results are below:
=== Predictions on test data ===
inst# actual predicted error distribution
1 1:H 1:H *1,0
1 1:H 1:H *1,0
1 1:H 1:H *1,0
1 1:H 1:H *1,0
1 1:H 1:H *1,0
1 1:H 1:H *1,0
1 1:H 1:H *1,0
1 1:H 1:H *1,0
1 1:H 2:S + 0,*1
1 1:H 1:H *1,0
1 1:H 1:H *1,0
1 1:H 1:H *1,0
1 1:H 1:H *1,0
1 1:H 1:H *1,0
1 1:H 1:H *0.997,0.003
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 1:H + *1,0
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 2:S 0,*1
1 2:S 1:H + *1,0
=== Stratified cross-validation ===
=== Summary ===
Total Number of Instances 66
=== Confusion Matrix ===
a b <-- classified as
14 1 | a = H
2 49 | b = S
As you can see, there are three errors in both the output and the confusion matrix.
I then re-evaluate the model using an independent data set with the same attributes and same two classes. Here's the result:
=== Re-evaluation on test set ===
User supplied test set
Relation: FCBC_New.TagProt
Instances: unknown (yet). Reading incrementally
Attributes: 355
=== Predictions on user test set ===
inst# actual predicted error distribution
1 1:S 2:H + 0,*1
2 1:S 1:S *1,0
3 1:S 2:H + 0,*1
4 2:H 1:S + *1,0
5 2:H 2:H 0,*1
6 1:S 2:H + 0,*1
7 1:S 2:H + 0,*1
8 2:H 2:H 0,*1
9 1:S 1:S *1,0
10 1:S 2:H + 0,*1
11 1:S 2:H + 0,*1
12 2:H 1:S + *1,0
13 2:H 2:H 0,*1
14 1:S 2:H + 0,*1
15 1:S 2:H + 0,*1
16 1:S 2:H + 0,*1
17 2:H 2:H 0,*1
18 2:H 2:H 0,*1
19 1:S 2:H + 0,*1
20 1:S 2:H + 0,*1
21 1:S 2:H + 0,*1
22 1:S 1:S *1,0
23 1:S 2:H + 0,*1
24 1:S 2:H + 0,*1
25 2:H 1:S + *1,0
26 1:S 2:H + 0,*1
27 1:S 1:S *1,0
28 1:S 2:H + 0,*1
29 1:S 2:H + 0,*1
30 1:S 2:H + 0,*1
31 1:S 2:H + 0,*1
32 1:S 2:H + 0,*1
33 1:S 2:H + 0,*1
34 1:S 1:S *1,0
35 2:H 1:S + *1,0
36 1:S 2:H + 0,*1
37 1:S 1:S *1,0
38 1:S 1:S *1,0
39 2:H 1:S + *1,0
40 1:S 2:H + 0,*1
41 1:S 2:H + 0,*1
42 1:S 2:H + 0,*1
43 1:S 2:H + 0,*1
44 1:S 2:H + 0,*1
45 1:S 2:H + 0,*1
46 1:S 2:H + 0,*1
47 2:H 1:S + *1,0
48 1:S 2:H + 0,*1
49 2:H 1:S + *1,0
50 2:H 1:S + *1,0
51 1:S 2:H + 0,*1
52 1:S 2:H + 0,*1
53 2:H 1:S + *1,0
54 1:S 2:H + 0,*1
55 1:S 2:H + 0,*1
56 1:S 2:H + 0,*1
=== Summary ===
Correctly Classified Instances 44 78.5714 %
Incorrectly Classified Instances 12 21.4286 %
Kappa statistic 0.4545
Mean absolute error 0.2143
Root mean squared error 0.4629
Coverage of cases (0.95 level) 78.5714 %
Total Number of Instances 56
=== Detailed Accuracy By Class ===
TP Rate FP Rate Precision Recall F-Measure MCC ROC Area PRC Area Class
0.643 0.167 0.563 0.643 0.600 0.456 0.828 0.566 H
0.833 0.357 0.875 0.833 0.854 0.456 0.804 0.891 S
Weighted Avg. 0.786 0.310 0.797 0.786 0.790 0.456 0.810 0.810
=== Confusion Matrix ===
a b <-- classified as
9 5 | a = H
7 35 | b = S
And here is where my problem lies. The prediction output clearly shows that there are many errors: 44, in fact. The confusion matrix and the result summary, on the other hand, suggest that there are only 12 errors. Now, if the prediction classes were reversed, the confusion matrix would be correct. So I look at the distribution of scores: in the cross-validation results the value before the comma represents the H class and the second value the S class (so "1,0" means an H prediction). In the test results, however, these are reversed, and "1,0" means an S prediction. So if I go by the score distribution, the confusion matrix is right; if I go by the predicted label (H or S), the confusion matrix is wrong.
I tried changing all of the test-file classes to H or S. This does NOT change the output results or the confusion matrix totals: in the confusion matrix, 16 instances are always predicted a (H) and 40 are always b (S), even though the plain-text output actually shows 16 b (S) and 40 a (H). Any idea what is going wrong? It must be something simple, but I am completely and totally at a loss...

Take a look at this WEKA tutorial on classifying instances: http://preciselyconcise.com/apis_and_installations/training_a_weka_classifier_in_java.php. It also deals with binary classification (positive/negative). Hope it helps.
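One thing worth checking, as an assumption based on the flipped class indices in the two outputs (the cross-validation predictions list 1:H / 2:S, while the test-set predictions list 1:S / 2:H): WEKA assigns class indices from the order in which the nominal values are declared in the ARFF header, so if the training file declares the class as {H,S} and the test file declares it as {S,H}, the index-to-label mapping is reversed, and the printed labels and the confusion matrix can disagree in exactly the way described. A possible fix is to make the class declaration identical in both files, for example:
@attribute class {H,S}
with the same value order in the training and the test ARFF, and then re-run the evaluation.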


Point chart - two (or more) data rows

I would like to add the average Y-axis value for each X-axis value to a point chart. Is there any way to do this, please? I would like to achieve a result similar to the second picture.
Expected result is here.
Data example
id datum year month day weekday hour hourly_steps cumulative_daily_steps daily_steps
1 2021-01-01 2021 1 1 5 17 49 49 5837
2 2021-01-01 2021 1 1 5 18 4977 5026 5837
3 2021-01-01 2021 1 1 5 19 692 5718 5837
4 2021-01-01 2021 1 1 5 20 13 5731 5837
5 2021-01-01 2021 1 1 5 22 106 5837 5837
6 2021-01-02 2021 1 2 6 6 48 48 7965
7 2021-01-02 2021 1 2 6 9 97 145 7965
8 2021-01-02 2021 1 2 6 10 1109 1254 7965
9 2021-01-02 2021 1 2 6 11 253 1507 7965
10 2021-01-02 2021 1 2 6 12 126 1633 7965
11 2021-01-02 2021 1 2 6 13 51 1684 7965
12 2021-01-02 2021 1 2 6 14 690 2374 7965
13 2021-01-02 2021 1 2 6 15 3690 6064 7965
14 2021-01-02 2021 1 2 6 16 956 7020 7965
15 2021-01-02 2021 1 2 6 17 667 7687 7965
16 2021-01-02 2021 1 2 6 18 36 7723 7965
17 2021-01-02 2021 1 2 6 19 45 7768 7965
18 2021-01-02 2021 1 2 6 20 38 7806 7965
19 2021-01-02 2021 1 2 6 21 47 7853 7965
20 2021-01-02 2021 1 2 6 22 15 7868 7965
21 2021-01-02 2021 1 2 6 23 97 7965 7965
22 2021-01-03 2021 1 3 7 0 147 147 8007
23 2021-01-03 2021 1 3 7 7 15 162 8007
24 2021-01-03 2021 1 3 7 8 54 216 8007
25 2021-01-03 2021 1 3 7 9 47 263 8007
26 2021-01-03 2021 1 3 7 10 16 279 8007
27 2021-01-03 2021 1 3 7 11 16 295 8007
28 2021-01-03 2021 1 3 7 12 61 356 8007
29 2021-01-03 2021 1 3 7 13 1459 1815 8007
30 2021-01-03 2021 1 3 7 14 2869 4684 8007
31 2021-01-03 2021 1 3 7 15 2670 7354 8007
32 2021-01-03 2021 1 3 7 16 131 7485 8007
33 2021-01-03 2021 1 3 7 17 67 7552 8007
34 2021-01-03 2021 1 3 7 18 27 7579 8007
35 2021-01-03 2021 1 3 7 19 50 7629 8007
36 2021-01-03 2021 1 3 7 20 48 7677 8007
37 2021-01-03 2021 1 3 7 22 119 7796 8007
38 2021-01-03 2021 1 3 7 23 211 8007 8007
39 2021-01-04 2021 1 4 1 4 19 19 6022
40 2021-01-04 2021 1 4 1 6 94 113 6022
41 2021-01-04 2021 1 4 1 10 48 161 6022
42 2021-01-04 2021 1 4 1 11 97 258 6022
43 2021-01-04 2021 1 4 1 12 48 306 6022
44 2021-01-04 2021 1 4 1 13 39 345 6022
45 2021-01-04 2021 1 4 1 14 499 844 6022
46 2021-01-04 2021 1 4 1 15 799 1643 6022
47 2021-01-04 2021 1 4 1 16 180 1823 6022
48 2021-01-04 2021 1 4 1 17 55 1878 6022
49 2021-01-04 2021 1 4 1 18 27 1905 6022
50 2021-01-04 2021 1 4 1 19 2246 4151 6022
51 2021-01-04 2021 1 4 1 20 1518 5669 6022
52 2021-01-04 2021 1 4 1 21 247 5916 6022
53 2021-01-04 2021 1 4 1 22 106 6022 6022
54 2021-01-05 2021 1 5 2 4 18 18 7623
55 2021-01-05 2021 1 5 2 6 44 62 7623
56 2021-01-05 2021 1 5 2 7 51 113 7623
57 2021-01-05 2021 1 5 2 8 450 563 7623
58 2021-01-05 2021 1 5 2 9 385 948 7623
59 2021-01-05 2021 1 5 2 10 469 1417 7623
60 2021-01-05 2021 1 5 2 11 254 1671 7623
61 2021-01-05 2021 1 5 2 12 1014 2685 7623
62 2021-01-05 2021 1 5 2 13 415 3100 7623
63 2021-01-05 2021 1 5 2 14 297 3397 7623
64 2021-01-05 2021 1 5 2 15 31 3428 7623
65 2021-01-05 2021 1 5 2 17 50 3478 7623
66 2021-01-05 2021 1 5 2 18 3771 7249 7623
67 2021-01-05 2021 1 5 2 19 52 7301 7623
68 2021-01-05 2021 1 5 2 20 96 7397 7623
69 2021-01-05 2021 1 5 2 21 59 7456 7623
70 2021-01-05 2021 1 5 2 22 167 7623 7623
71 2021-01-06 2021 1 6 3 6 54 54 7916
72 2021-01-06 2021 1 6 3 7 1223 1277 7916
73 2021-01-06 2021 1 6 3 8 118 1395 7916
74 2021-01-06 2021 1 6 3 10 77 1472 7916
75 2021-01-06 2021 1 6 3 11 709 2181 7916
76 2021-01-06 2021 1 6 3 12 123 2304 7916
77 2021-01-06 2021 1 6 3 13 36 2340 7916
78 2021-01-06 2021 1 6 3 14 14 2354 7916
79 2021-01-06 2021 1 6 3 15 156 2510 7916
80 2021-01-06 2021 1 6 3 16 149 2659 7916
81 2021-01-06 2021 1 6 3 17 995 3654 7916
82 2021-01-06 2021 1 6 3 18 2022 5676 7916
83 2021-01-06 2021 1 6 3 19 34 5710 7916
84 2021-01-06 2021 1 6 3 21 937 6647 7916
85 2021-01-06 2021 1 6 3 22 1208 7855 7916
86 2021-01-06 2021 1 6 3 23 61 7916 7916
Here you go.
Data:
Add a Deneb visual and then add the following fields, ensuring that "Don't summarise" is selected for each column.
Inside Deneb, paste the following spec:
{
  "data": {"name": "dataset"},
  "transform": [
    {
      "calculate": "datum['weekday ']<= 5?'weekday':'weekend'",
      "as": "type"
    }
  ],
  "layer": [
    {"mark": {"type": "point"}},
    {
      "mark": {"type": "line", "interpolate": "basis"},
      "encoding": {
        "x": {
          "field": "hour"
        },
        "y": {
          "aggregate": "mean",
          "field": "cumulative_daily_steps"
        }
      }
    }
  ],
  "encoding": {
    "x": {
      "field": "hour",
      "type": "quantitative",
      "axis": {"title": "Hour of Day"}
    },
    "y": {
      "field": "cumulative_daily_steps",
      "type": "quantitative",
      "axis": {
        "title": "Cumulative Step Count"
      }
    },
    "color": {
      "field": "type",
      "type": "nominal",
      "scale": {
        "range": ["red", "green"]
      },
      "legend": {"title": ""}
    }
  }
}

How to add a row where there is a disruption in a series of numbers in Stata

I'm attempting to format a table of 40 different age-race-sex strata to be input into R-INLA, and noticed that it's important to include all strata (even if they are not present in a county). These should be zeros. However, at this point my table only contains records for strata that are not empty. I can identify places where strata are missing for each county by looking at my strata variable and finding the breaks in the series 1 through 40 (marked with a red x in the image below).
In these places (marked by the red x) I need to add the missing rows and fill in the corresponding county code, strata code, population=0, and the correct corresponding race, sex, age code for the strata.
If I can figure out a way to add an empty row in the places marked with the red Xs in the image, and correctly assign the strata code (and county code) to these empty/missing rows, then I will be able to populate the rest of the values with code like the following:
recode race = 1 & sex= 1 & age =4 if strata = 4
...etc
I'm wondering if there is a way to add the missing rows using an if statement that considers the fact that there are supposed to be forty strata for each county code. It would be ideal if this could populate the correct county code and strata code as well!
Dataex sample data:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float OID str5 fips_statecounty double population byte(race sex age) float strata
1 "" 672 1 1 1 1
2 "" 1048 1 1 2 2
3 "" 883 1 1 3 3
4 "" 1129 1 1 4 4
5 "" 574 1 2 1 5
6 "" 986 1 2 2 6
7 "" 899 1 2 3 7
8 "" 1820 1 2 4 8
9 "" 96 2 1 1 9
10 "" 142 2 1 2 10
11 "" 81 2 1 3 11
12 "" 99 2 1 4 12
13 "" 71 2 2 1 13
14 "" 125 2 2 2 14
15 "" 103 2 2 3 15
16 "" 162 2 2 4 16
17 "" 31 3 1 1 17
18 "" 32 3 1 2 18
19 "" 18 3 1 3 19
20 "" 31 3 1 4 20
21 "" 22 3 2 1 21
22 "" 28 3 2 2 22
23 "" 28 3 2 3 23
24 "" 44 3 2 4 24
25 "" 20 4 1 1 25
26 "" 24 4 1 2 26
27 "" 21 4 1 3 27
28 "" 43 4 1 4 28
29 "" 19 4 2 1 29
30 "" 26 4 2 2 30
31 "" 24 4 2 3 31
32 "" 58 4 2 4 32
33 "" 6 5 1 1 33
34 "" 11 5 1 2 34
35 "" 13 5 1 3 35
36 "" 7 5 1 4 36
37 "" 7 5 2 1 37
38 "" 9 5 2 2 38
39 "" 10 5 2 3 39
40 "" 11 5 2 4 40
41 "01001" 239 1 1 1 1
42 "01001" 464 1 1 2 2
43 "01001" 314 1 1 3 3
44 "01001" 232 1 1 4 4
45 "01001" 284 1 2 1 5
46 "01001" 580 1 2 2 6
47 "01001" 392 1 2 3 7
48 "01001" 440 1 2 4 8
49 "01001" 41 2 1 1 9
50 "01001" 38 2 1 2 10
51 "01001" 23 2 1 3 11
52 "01001" 26 2 1 4 12
53 "01001" 34 2 2 1 13
54 "01001" 52 2 2 2 14
55 "01001" 40 2 2 3 15
56 "01001" 50 2 2 4 16
57 "01001" 4 3 1 1 17
58 "01001" 2 3 1 2 18
59 "01001" 3 3 1 3 19
60 "01001" 6 3 2 1 21
61 "01001" 4 3 2 2 22
62 "01001" 6 3 2 3 23
63 "01001" 4 3 2 4 24
64 "01001" 1 4 1 4 28
65 "01003" 1424 1 1 1 1
66 "01003" 2415 1 1 2 2
67 "01003" 1680 1 1 3 3
68 "01003" 1823 1 1 4 4
69 "01003" 1545 1 2 1 5
70 "01003" 2592 1 2 2 6
71 "01003" 1916 1 2 3 7
72 "01003" 2527 1 2 4 8
73 "01003" 68 2 1 1 9
74 "01003" 82 2 1 2 10
75 "01003" 52 2 1 3 11
76 "01003" 54 2 1 4 12
77 "01003" 72 2 2 1 13
78 "01003" 129 2 2 2 14
79 "01003" 81 2 2 3 15
80 "01003" 106 2 2 4 16
81 "01003" 10 3 1 1 17
82 "01003" 14 3 1 2 18
83 "01003" 8 3 1 3 19
84 "01003" 4 3 1 4 20
85 "01003" 8 3 2 1 21
86 "01003" 14 3 2 2 22
87 "01003" 17 3 2 3 23
88 "01003" 10 3 2 4 24
89 "01003" 4 4 1 1 25
90 "01003" 1 4 1 3 27
91 "01003" 2 4 1 4 28
92 "01003" 2 4 2 1 29
93 "01003" 3 4 2 2 30
94 "01003" 4 4 2 3 31
95 "01003" 10 4 2 4 32
96 "01003" 5 5 1 1 33
97 "01003" 4 5 1 2 34
98 "01003" 3 5 1 3 35
99 "01003" 5 5 1 4 36
100 "01003" 5 5 2 2 38
end
label values race race
label values sex sex
My answer to your previous question (Nested for-loop: error variable already defined) detailed how to create a minimal dataset with all strata present. Therefore you should just merge that with your main dataset and replace the missings on the absent strata with whatever your other software expects: zeros, it seems.
The most obvious complication at this point is that you need to factor in a county variable. I can't see any information on how many counties you have in your dataset, which may affect what is practical. You should be able to break the preparation down into two steps: first, prepare a minimal county dataset with identifiers only; then merge that with a complete strata dataset.
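A minimal sketch of that idea using -fillin- (this assumes the working dataset looks like the -dataex- excerpt, that each of the 40 strata occurs somewhere in the data, and that the strata code simply enumerates race (1-5) x sex (1-2) x age (1-4), which is what the excerpt suggests):
* add one observation for every county-strata combination that is missing
fillin fips_statecounty strata
replace population = 0 if _fillin
* rebuild race/sex/age from the strata code on the added rows; OID stays missing
replace race = ceil(strata/8) if _fillin
replace sex = cond(mod(strata-1, 8) < 4, 1, 2) if _fillin
replace age = mod(strata-1, 4) + 1 if _fillin
drop _fillin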

Efficiently extract the last 12 months of data from a master table for each ID and month using SAS

I am currently practicing SAS programming using two SAS datasets (sample and master). Below are hypothetical (dummy) data created for illustration, to explain the problem I am trying to solve with SAS. I would like to extract the data for the IDs in the sample dataset from the master dataset. I have given an example with a few IDs as the sample dataset, for which I need to extract the last 12 months of information from the master table for each ID, based on the yearmonth information (the desired output is given in the third listing).
Similar to this, I have many columns for which I need 12 months of data for each ID and yearmonth.
I have written code with a do loop that iterates over each row of the sample dataset, finds the matching data in the master table between the starting yearmonth and the end date (12 months earlier) on each iteration, and then transposes it using proc transpose. I then merge the sample dataset with the transposed data in a data step merge on id and yearmonth. But I feel the code I have written is not optimized, because it loops over the master table once for every row of the sample dataset. Can anyone help me solve this problem in an optimized way with SAS?
The sample dataset (dataset name: sample):
ID YEARMONTH NO_OF_CUST
1 200909 50
1 201005 65
1 201008 78
1 201106 95
2 200901 65
2 200902 45
2 200903 69
2 201005 14
2 201006 26
2 201007 98
3 201011 75
3 201012 75
The master dataset (dataset name: master) is a huge dataset covering, for each ID, every month from the start of the account to date:
ID YEARMONTH NO_OF_CUST
1 200808 125
1 200809 125
1 200810 111
1 200811 174
1 200812 98
1 200901 45
1 200902 74
1 200903 73
1 200904 101
1 200905 164
1 200906 104
1 200907 22
1 200908 35
1 200909 50
1 200910 77
1 200911 86
1 200912 95
1 201001 95
1 201002 87
1 201003 79
1 201004 71
1 201005 65
1 201006 66
1 201007 66
1 201008 78
1 201009 88
1 201010 54
1 201011 45
1 201012 100
1 201101 136
1 201102 111
1 201103 17
1 201104 77
1 201105 111
1 201106 95
1 201107 79
1 201108 777
1 201109 758
1 201110 32
1 201111 15
1 201112 22
2 200711 150
2 200712 150
2 200801 44
2 200802 385
2 200803 65
2 200804 66
2 200805 200
2 200806 333
2 200807 285
2 200808 265
2 200809 222
2 200810 220
2 200811 205
2 200812 185
2 200901 65
2 200902 45
2 200903 69
2 200904 546
2 200905 21
2 200906 256
2 200907 214
2 200908 14
2 200909 44
2 200910 65
2 200911 88
2 200912 79
2 201001 65
2 201002 45
2 201003 69
2 201004 54
2 201005 14
2 201006 26
2 201007 98
3 200912 77
3 201001 66
3 201002 69
3 201003 7
3 201004 7
3 201005 7
3 201006 65
3 201007 75
3 201008 85
3 201009 89
3 201010 100
3 201011 75
3 201012 75
Below is the sample output that I am trying to produce for each of the IDs in the sample dataset.
Without sample code showing what you have tried so far, it is a bit difficult to figure out exactly what you want, but a "SAS" way of getting the same result as the image file might be the following.
EDIT: I edited my answer so it takes the last 12 months by ID.
data test;
infile datalines dlm='09'x;
input ID YEARMONTH NO_OF_CUST;
datalines;
1 200808 125
1 200809 125
1 200810 111
1 200811 174
1 200812 98
1 200901 45
1 200902 74
1 200903 73
1 200904 101
1 200905 164
1 200906 104
1 200907 22
1 200908 35
1 200909 50
1 200910 77
1 200911 86
1 200912 95
1 201001 95
1 201002 87
1 201003 79
1 201004 71
1 201005 65
1 201006 66
1 201007 66
1 201008 78
1 201009 88
1 201010 54
1 201011 45
1 201012 100
1 201101 136
1 201102 111
1 201103 17
1 201104 77
1 201105 111
1 201106 95
1 201107 79
1 201108 777
1 201109 758
1 201110 32
1 201111 15
1 201112 22
2 200711 150
2 200712 150
2 200801 44
2 200802 385
2 200803 65
2 200804 66
2 200805 200
2 200806 333
2 200807 285
2 200808 265
2 200809 222
2 200810 220
2 200811 205
2 200812 185
2 200901 65
2 200902 45
2 200903 69
2 200904 546
2 200905 21
2 200906 256
2 200907 214
2 200908 14
2 200909 44
2 200910 65
2 200911 88
2 200912 79
2 201001 65
2 201002 45
2 201003 69
2 201004 54
2 201005 14
2 201006 26
2 201007 98
3 200912 77
3 201001 66
3 201002 69
3 201003 7
3 201004 7
3 201005 7
3 201006 65
3 201007 75
3 201008 85
3 201009 89
3 201010 100
3 201011 75
3 201012 75
;
run;
proc sort data=test;
by id yearmonth;
run;
data result;
    set test;
    by id;
    /* PREV_MONTH_0 holds the current month (dropped below); PREV_MONTH_1-PREV_MONTH_12 hold the 1 to 12 preceding months */
    array prev_month {13} PREV_MONTH_0-PREV_MONTH_12;
    retain PREV_MONTH:;
    /* reset the rolling window at the start of each ID */
    if first.id then do;
        do i = 1 to 13;
            prev_month(i) = 0;
        end;
    end;
    /* shift the window back one slot, then store the current month; this shifts by row, so it assumes one row per consecutive month within each ID */
    do i = 13 to 2 by -1;
        prev_month(i) = prev_month(i-1);
    end;
    prev_month(1) = NO_OF_CUST;
    drop i PREV_MONTH_0;
run;
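If the end goal is to keep only the id/yearmonth combinations from the sample dataset with the 12 trailing columns attached, a minimal sketch of the final step might look like this (assuming the wide table built above is named result, as in the code, and that the sample dataset has the layout shown in the question):
proc sort data=sample;
    by id yearmonth;
run;
data want;
    /* keep only the rows present in sample, now carrying PREV_MONTH_1-PREV_MONTH_12 */
    merge sample(in=insample) result;
    by id yearmonth;
    if insample;
run;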

Regular expression for a complex series with a specific pattern

I will try to explain what I need help with.
The numbers in the series below that I want the regex to check are "2 904", "3 231", "2 653", "2 653", "2 353" and so on. My goal is to get a match only if one of these numbers appears in the form "123" (three digits, between 100 and 999).
No match:
sö 31 1 2 904 2 3 231 3 2 653 32 4 2 653 5 2 353 6 2 353 7 2 353 8 2 904 9 3 002 10 3 143 33 11 2 615 12 2 353 13 2 353 14 2 353 15 2 353 16 2 653 17 2 353 34 18 2 157 19 1 699 20 1 699
Match:
sö 31 1 2 904 2 3 231 3 653 32 4 2 653 5 2 353 6 2 353 7 2 353 8 2 904 9 3 002 10 3 143 33 11 2 615 12 2 353 13 2 353 14 2 353 15 2 353 16 2 653 17 2 353 34 18 2 157 19 1 699 20 1 699
sö 31 1 2 904 2 3 231 3 2 653 32 4 2 653 5 2 353 6 2 353 7 2 353 8 2 904 9 3 002 10 3 143 33 11 2 615 12 2 353 13 2 353 14 953 15 2 353 16 2 653 17 2 353 34 18 2 157 19 1 699 20 1 699
As you can see from my examples, the number "2 653" changed to "653" just after the number "3",
and the number "2 353" changed to "953" after the number "14".
The numbers in between, i.e. 1-20, are static and will never change.
Is this possible?
I will then try it at http://rubular.com/.
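It is hard to do this with a single local pattern, because a bare value like "3 653" (index 3 followed by the value 653) looks exactly like a normal thousands value such as "2 653". A sketch of one possible approach (in Ruby, since rubular.com is a Ruby tester; the line layout, with optional week numbers 31-34 and the fixed indices 1-20, is assumed from the examples above) is to build the whole-line pattern from the fixed index sequence and then check whether any captured value lost its leading digit:
line = "sö 31 1 2 904 2 3 231 3 653 32 4 2 653 5 2 353 6 2 353 7 2 353 8 2 904 9 3 002 10 3 143 33 11 2 615 12 2 353 13 2 353 14 2 353 15 2 353 16 2 653 17 2 353 34 18 2 157 19 1 699 20 1 699"
value = '(\d \d{3}|\d{3})'                                   # a value is either "2 653" or a bare "653"
segments = (1..20).map { |i| "(?:3[1-4] )?#{i} #{value}" }   # optional week number, fixed index, value
pattern = Regexp.new('^sö ' + segments.join(' ') + '$')
if (m = pattern.match(line))
  bare = m.captures.select { |v| v =~ /\A[1-9]\d{2}\z/ }     # values left in plain 100-999 form
  puts(bare.empty? ? "no match" : "match: #{bare.join(', ')}")
end
For the example line above this prints "match: 653", while the unchanged series prints "no match".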

Crash after QWebFrame::setHtml

I'm trying to dynamically set HTML content on a document node in the main thread.
QWebElement dynamicContent = ui->webView->page()->mainFrame()->documentElement().findFirst("div#dynamicContent");
if (QWebFrame *frame = dynamicContent.webFrame())
{
    param = "<html><body></body></html>";
    frame->setHtml(param);
}
These lines execute normally, but after that I get a read access violation with the following call stack:
0 QWebPluginDatabase::searchPaths QtWebKitd4 0x1009deca
1 QWebPluginDatabase::searchPaths QtWebKitd4 0x10749451
2 QWebPluginDatabase::searchPaths QtWebKitd4 0x10749434
3 QWebPluginDatabase::searchPaths QtWebKitd4 0x10749347
4 QWebPluginDatabase::searchPaths QtWebKitd4 0x10748469
5 QWebPluginDatabase::searchPaths QtWebKitd4 0x1074856f
6 QWebPluginDatabase::searchPaths QtWebKitd4 0x107100d6
7 QWebPluginDatabase::searchPaths QtWebKitd4 0x107244ce
8 QWebPluginDatabase::searchPaths QtWebKitd4 0x1074cd58
9 QWebPluginDatabase::searchPaths QtWebKitd4 0x109f9462
10 QWebPluginDatabase::searchPaths QtWebKitd4 0x109fa78e
11 QWebPluginDatabase::searchPaths QtWebKitd4 0x10725e17
12 QWebPluginDatabase::searchPaths QtWebKitd4 0x10725b25
13 QWebPluginDatabase::searchPaths QtWebKitd4 0x1074c180
14 QWebPluginDatabase::searchPaths QtWebKitd4 0x1074cf2e
15 QWebPluginDatabase::searchPaths QtWebKitd4 0x109f9462
16 QWebPluginDatabase::searchPaths QtWebKitd4 0x109fcdb2
17 QWebPluginDatabase::searchPaths QtWebKitd4 0x1074ca2e
18 QWebPluginDatabase::searchPaths QtWebKitd4 0x10721800
19 QWebPluginDatabase::searchPaths QtWebKitd4 0x10721383
20 QWebPluginDatabase::searchPaths QtWebKitd4 0x107253c3
21 QWebPluginDatabase::searchPaths QtWebKitd4 0x10720293
22 QWebPluginDatabase::searchPaths QtWebKitd4 0x10750cba
23 QWebPluginDatabase::searchPaths QtWebKitd4 0x10751ab9
24 QWebPluginDatabase::searchPaths QtWebKitd4 0x108639e4
25 QWebPluginDatabase::searchPaths QtWebKitd4 0x10863916
26 QWebPluginDatabase::searchPaths QtWebKitd4 0x109ecb72
27 QTemporaryFile::tr QtCored4 0x671ec7fa
28 QPictureIO::init QtGuid4 0x65071a2e
29 QPictureIO::init QtGuid4 0x6506f6aa
30 QTemporaryFile::tr QtCored4 0x671ceb81
31 QTemporaryFile::tr QtCored4 0x671d3d29
32 QTemporaryFile::tr QtCored4 0x67214812
33 QPictureIO::init QtGuid4 0x65071a2e
34 QPictureIO::init QtGuid4 0x6506f6aa
35 QTemporaryFile::tr QtCored4 0x671ceb81
36 QTemporaryFile::tr QtCored4 0x671d3d29
37 QTemporaryFile::tr QtCored4 0x671cfadb
38 QTemporaryFile::tr QtCored4 0x672129dd
39 InternalCallWinProc USER32 0x74d66238
40 UserCallWinProcCheckWow USER32 0x74d668ea
41 DispatchMessageWorker USER32 0x74d67d31
42 DispatchMessageW USER32 0x74d67dfa
43 QTemporaryFile::tr QtCored4 0x672139f6
44 QPictureIO::init QtGuid4 0x6512c4ce
45 QTemporaryFile::tr QtCored4 0x671cc68e
46 QTemporaryFile::tr QtCored4 0x671cc7c0
47 QTemporaryFile::tr QtCored4 0x671cf0fd
48 QPictureIO::init QtGuid4 0x6506f398
49 main main.cpp 28 0x401ec9
No amount of googling or searching Stack Overflow turned up the problem. Has anyone else had the same issue? What is the proper usage of QWebFrame::setHtml?
Thank you
[Solved] This issue happens when QWebFrame::setHtml is called from a thread other than the main thread.
I have solved this issue, and the answer is: do NOT call setHtml from any thread other than the main thread.
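For anyone hitting the same crash, a minimal sketch of one way to respect that rule (a fragment only; the class names, the htmlReady signal and the setDynamicHtml slot are hypothetical, not from the original code): let the worker thread only build the HTML string and hand it to the GUI thread via a queued signal/slot connection, and let the GUI thread make the actual QtWebKit call.
// Fragment sketch (Qt 4 style). The worker lives in a secondary thread and never
// touches QtWebKit; it only emits the finished HTML as a QString.
#include <QObject>
#include <QString>
#include <QtWebKit/QWebView>
#include <QtWebKit/QWebPage>
#include <QtWebKit/QWebFrame>
class Worker : public QObject
{
    Q_OBJECT
signals:
    void htmlReady(const QString &html);
public slots:
    void produce()
    {
        // Build the content off the GUI thread, then hand it over.
        emit htmlReady(QString("<html><body></body></html>"));
    }
};
// In the main window (GUI thread), connect with a queued connection so the slot
// runs in the GUI thread even though the signal is emitted from the worker thread:
//   connect(worker, SIGNAL(htmlReady(QString)),
//           this,   SLOT(setDynamicHtml(QString)), Qt::QueuedConnection);
void MainWindow::setDynamicHtml(const QString &html)
{
    // Runs in the GUI thread, so calling into QtWebKit here is safe.
    ui->webView->page()->mainFrame()->setHtml(html);
}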