If-elseif logic with partly repeating condition

I1  S1  I2  S2  I3  S3  I4  S4  I5  S5  I6  S6  ACTUAL  WANTED
10  1   10  2   0   0   0   0   0   0   0   0   3       3
10  1   10  2   10  3   0   0   0   0   0   0   3       6
10  1   10  2   10  3   10  4   0   0   0   0   3       10
if I1=I2 then (S1+S2)
elseif I1=I2=I3 then (S1+S2+S3)
elseif I1=I2=I3=I4 then (S1+S2+S3+S4)
else ""
endif
The code doesn't consider the other conditions once the first one is satisfied. I don't know how to build a loop that considers all the conditions.
Thank you very much for your help. I hope it is clear now, and you're right, it is SPSS Modeler.

Your first logical condition covers all cases where I1=I2, including cases where I1=I2=I3 or I1=I2=I3=I4. So ALL these cases get the result (S1+S2) and none of them are left for the elseif branches.
To give the I1=I2=I3=I4 cases special treatment, handle them FIRST; what's left are the cases where I1=I2=I3 or I1=I2 but not I1=I2=I3=I4. Then do the same with the I1=I2=I3 cases. Once you've handled them, you are left with the remaining I1=I2 cases.
I know nothing about the SPSS Modeler language, but based on the code you've posted, your command should look like this:
if I1=I2=I3=I4 then (S1+S2+S3+S4)
elseif I1=I2=I3 then (S1+S2+S3)
elseif I1=I2 then (S1+S2)
else ""
endif
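The same ordering idea can be sketched outside SPSS Modeler. The Python below is only an illustration, not Modeler syntax: the function names `sum_most_specific_first` and `sum_leading_run` are made up for this example, and the second function is an assumed generalisation (it keeps going past I4 if more leading values are equal), offered because the question asks about a loop.

```python
def sum_most_specific_first(i, s):
    """Mirror of the reordered if/elseif chain: test the strictest condition first."""
    if i[0] == i[1] == i[2] == i[3]:
        return s[0] + s[1] + s[2] + s[3]
    elif i[0] == i[1] == i[2]:
        return s[0] + s[1] + s[2]
    elif i[0] == i[1]:
        return s[0] + s[1]
    else:
        return ""  # same fallback value as the original snippet


def sum_leading_run(i, s):
    """Loop variant (assumption): sum S over the leading run of I values equal to I1."""
    k = 1
    while k < len(i) and i[k] == i[0]:
        k += 1
    return sum(s[:k]) if k >= 2 else ""


# The three rows from the question (I1..I6 and S1..S6).
rows = [
    ([10, 10, 0, 0, 0, 0], [1, 2, 0, 0, 0, 0]),
    ([10, 10, 10, 0, 0, 0], [1, 2, 3, 0, 0, 0]),
    ([10, 10, 10, 10, 0, 0], [1, 2, 3, 4, 0, 0]),
]
chain_results = [sum_most_specific_first(i, s) for i, s in rows]
loop_results = [sum_leading_run(i, s) for i, s in rows]
print(chain_results, loop_results)
```

Both versions give the WANTED column for the three sample rows; they differ only when more than four leading I values are equal.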

Related

Conditional summation in time-to-event data

I have the following data that has been prepared with stset. The resulting variables signify cohort entry and exit times along with event status. In addition, a numerical variable - prob has been calculated based on the riskset size.
For those subjects that are not cases (where _d == 0), I need to sum all values of the prob variable where _t falls within that subject's follow-up time.
For example, subject 8 enters the cohort at _t0 == 0 and exits at _t == 8. Between these times, there are three prob values 0.9, 0.875 and 0.875 - giving the desired answer for subject 8 as 2.65.
* Example generated by -dataex-. To install: ssc install dataex
clear
input long id byte(_t0 _t _d) float prob
1 0 1 0 .
2 0 2 0 .
3 1 3 1 .9
4 0 4 0 .
5 0 5 1 .875
6 0 6 1 .875
7 5 7 0 .
8 0 8 0 .
9 0 9 1 .8333333
10 0 10 1 .8
11 0 11 0 .
12 8 12 1 .6666667
13 0 13 0 .
14 0 14 0 .
15 0 15 0 .
end
The desired output would return all of the data with an additional variable signifying the summed values of prob.
Thanks so much in advance.
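For reference, the requested calculation can be sketched in Python (not Stata) to pin down the logic. This is a minimal sketch under one assumption that the question does not state: a prob value counts for a subject when its _t lies in the interval (_t0, _t], that is, after entry and up to and including exit. The function name `sum_prob_in_followup` is hypothetical.

```python
# Rows copied from the dataex listing: (id, _t0, _t, _d, prob); None stands for missing.
rows = [
    (1, 0, 1, 0, None), (2, 0, 2, 0, None), (3, 1, 3, 1, 0.9),
    (4, 0, 4, 0, None), (5, 0, 5, 1, 0.875), (6, 0, 6, 1, 0.875),
    (7, 5, 7, 0, None), (8, 0, 8, 0, None), (9, 0, 9, 1, 0.8333333),
    (10, 0, 10, 1, 0.8), (11, 0, 11, 0, None), (12, 8, 12, 1, 0.6666667),
    (13, 0, 13, 0, None), (14, 0, 14, 0, None), (15, 0, 15, 0, None),
]


def sum_prob_in_followup(rows):
    """For each non-case (_d == 0), sum prob over rows whose _t falls in (_t0, _t]."""
    events = [(t, prob) for (_, _, t, _, prob) in rows if prob is not None]
    totals = {}
    for subject_id, t0, t, d, _ in rows:
        if d == 0:
            # Assumption: entry time is exclusive, exit time is inclusive.
            totals[subject_id] = sum(p for (et, p) in events if t0 < et <= t)
    return totals


totals = sum_prob_in_followup(rows)
print(round(totals[8], 4))  # subject 8 from the worked example
```

The double loop is quadratic in the number of rows, which is fine for a sketch; a larger dataset would call for sorting by _t and using cumulative sums.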

sas search value across column with array and extract values of next 12 columns

I want to count the number of 'noncure' occurrences across different columns, subject to some conditions, at different position dates. How do I search for a run of 12 consecutive '1's across columns?
[UPDATE]
I've modified my dataset and think this is the best way to populate out my desired results.
This is a sample of my raw data
data have;
input acct $ flg1-flg25; /* acct is character, so it needs the $ modifier */
datalines;
AA 0 0 0 1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1
run;
The numbers on flg represent months - eg flg1 = jan10, flg2 = feb10 & so on.
To get noncure, certain conditions have to be fulfilled.
flg(i) has to be 0
noncure only happens if there is a minimum of 12 consecutive flg of '1' in the future
an account can have more than 1 noncure incidents
The computation of noncure should look like this (Refer to image for a better view - highlighted in green)
AA 1 1 1 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
noncure1 is 1 because flg1 is 0 and the next run of 12 consecutive 1s starts at flg9
noncure2 is 1 because flg2 is 0 and the next run of 12 consecutive 1s starts at flg9
noncure4 is 0 because flg4 is not 0
noncure23 is 0 because, even though flg23 is 0, no run of 12 consecutive 1s follows it (there is only a single '1', at flg25)
I'm having trouble finding the first run of 12 consecutive '1's after flg(i).
I was thinking of using an array to store the position of that run (eg nc_pos) and then looping from i to nc_pos, something along the lines of
nc_pos = <search for 12 consecutive occurrence of '1' from flg(i)> **I don't know the code for this**
if flg(i) = 0 then do i to nc_pos;
noncure_tag = 1;
obs_pos = i;
FYI I have a few hundred thousand accounts with a total of 84 months, and their starting positions differ (eg flg1 could be null and the first 0 or 1 may appear at flg3).
My final output should look something like the image file labelled TARGET highlighted in yellow.
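The rules above can be sketched in Python (not SAS) to make the logic concrete. This is only one reading of the rules: a month gets noncure = 1 when its flag is 0 and a run of at least 12 consecutive 1s starts somewhere later in the row. Null flags are assumed to be neither 0 nor 1, and the function name `noncure_flags` is made up for the example. Scanning from right to left means each row is visited once.

```python
def noncure_flags(flags, run_length=12):
    """Return a 0/1 list: 1 where flags[i] == 0 and a run of >= run_length 1s follows."""
    result = [0] * len(flags)
    current_run = 0          # length of the run of 1s starting at the current position
    run_seen_later = False   # True once a qualifying run has been found to the right
    for i in range(len(flags) - 1, -1, -1):
        if flags[i] == 1:
            current_run += 1
            if current_run >= run_length:
                run_seen_later = True
        else:
            current_run = 0  # a 0 or a null (None) breaks the run
            if flags[i] == 0 and run_seen_later:
                result[i] = 1
    return result


# Account AA from the sample data (flg1..flg25).
aa = [0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1]
aa_noncure = noncure_flags(aa)
print(aa_noncure)
```

For account AA this gives 1 at positions 1-3 and 5-8 and 0 elsewhere, which matches the expected noncure row shown in the question.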

Solving a maze in 2d array

I have read many articles and questions about how to solve a maze efficiently, but here I want to find out what is going wrong in my code. Consider the maze:
2 1 0 0 3
0 1 0 1 1
0 1 0 0 1
0 1 1 0 0
0 0 0 0 0
where the 1s represent walls and the 0s represent the path (the source is 2 and the destination is 3).
I have to output whether or not there is a path.
int y = 0;
while (y == 0)
{
    robo1(n, m, maze);     // this function adds 2 to any '0'/'3' in (i,j+1),(i+1,j),(i-1,j),(i,j-1) (if exists), where (i,j) is 2
    robo2(n, m, k2, maze); // this function adds 3 to any '0'/'2' in (i,j+1),(i+1,j),(i-1,j),(i,j-1) (if exists), where (i,j) is 3
    if (find5(n, m, maze) == 1) // this function returns 1 if there is '5' in the maze
        y++;
    if (find0(n, m, maze) == 0) // this function returns 0 if there are no '0' in the maze
        break;
}
if (find0(n, m, maze) == 0 && y == 0)
    printf("-1\n"); // no path
else
    printf("1\n");  // there is a path
My idea is that if after any number of loops a five is found in the maze, then it would mean there is a path.
But while implementing this function in code I get wrong answers and sometimes run-time errors.
Is there any flaw in the above logic?

The general idea should almost work, but of course the devil is in the details.
One case in which your approach will not work, even if implemented correctly, is this:
2 1 0 0 0
1 1 0 1 1
0 0 0 1 3
i.e. if both 2 and 3 are "closed in" by walls but there are still 0s in the room. Your loop will never end because, despite there being 0s around, neither of the two robo functions will change anything.
A simple solution is to have each robo function return 1 if it actually changed at least one value in the matrix and 0 otherwise, and to quit the loop when neither of them changed anything.
Note that this is not a very efficient way of solving a maze (your code will keep checking the same cells over and over many times).
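As an illustration of a check that visits each cell at most once, here is a breadth-first flood fill sketched in Python rather than C. It is a different technique from the two-robot marking above, swapped in for comparison; the function name `has_path` and the variable names are made up for the example. The search stops when the queue is empty, so the walled-in case terminates.

```python
from collections import deque


def has_path(maze):
    """Return True if a 3 is reachable from the 2 through non-wall cells (1 = wall)."""
    rows, cols = len(maze), len(maze[0])
    start = next((r, c) for r in range(rows) for c in range(cols) if maze[r][c] == 2)
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if maze[r][c] == 3:
            return True
        for dr, dc in ((0, 1), (1, 0), (-1, 0), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in seen and maze[nr][nc] != 1:
                seen.add((nr, nc))  # mark on enqueue so no cell is queued twice
                queue.append((nr, nc))
    return False  # queue exhausted: nothing left to explore


open_maze = [
    [2, 1, 0, 0, 3],
    [0, 1, 0, 1, 1],
    [0, 1, 0, 0, 1],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
closed_maze = [
    [2, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 3],
]
print(has_path(open_maze), has_path(closed_maze))
```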

Regular Expression to match few characters from a string

I am trying to find a string within another string. However, I want it to count as a match even if one or more characters differ.
Let me explain with an example:
Let's say I have a string 'abcdefghij'. Now if the string to match is 'abcd',
I could write strfind('abcdefghij', 'abcd')
Now suppose I have the string 'adcf'. Two of its characters do not match, but I would still consider it a match.
Any idea how to do this?
I know, this is not the most optimal code.
Example :
a='abcdefghijk';
b='xbcx'
c='abxx'
d='axxd'
e='abcx'
f='xabc'
g='axcd'
h='abxd'
i ='abcd'
All these strings should match a. I hope this example makes it clearer. The idea is that a string with 1 or 2 mismatched characters should still be considered a match.

You could do it like this:
A = 'abcdefghij'; % Main string
B = 'adcf'; % String to be found
tolerance = 2; % Maximum number of different characters to tolerate
nA = numel(A);
nB = numel(B);
pos = find(sum(A(mod(cumsum([(1:nA)' ones(nA, nB - 1)], 2) - 1, nA) + 1) == repmat(B, nA, 1), 2) >= nB - tolerance);
In this case it will return pos = [1 3]', because "adcf" can be matched at the first position (matching "a?c?") and at the third position (matching "?d?f").
Explanation:
First, we take the sizes of A and B
Then, we create the matrix [(1:nA)' ones(nA, nB - 1)], which gives us this:
Output:
1 1 1 1
2 1 1 1
3 1 1 1
4 1 1 1
5 1 1 1
6 1 1 1
7 1 1 1
8 1 1 1
9 1 1 1
10 1 1 1
We perform a cumulative sum to the right, using cumsum, to achieve this:
Output:
1 2 3 4
2 3 4 5
3 4 5 6
4 5 6 7
5 6 7 8
6 7 8 9
7 8 9 10
8 9 10 11
9 10 11 12
10 11 12 13
And use the mod function so each number is between 1 and nA, like this:
Output:
1 2 3 4
2 3 4 5
3 4 5 6
4 5 6 7
5 6 7 8
6 7 8 9
7 8 9 10
8 9 10 1
9 10 1 2
10 1 2 3
We then use that matrix as an index for the A matrix.
Output:
abcd
bcde
cdef
defg
efgh
fghi
ghij
hija
ijab
jabc
Note this matrix has all possible substrings of A with size nB.
Now we use repmat to replicate B downwards into nA rows.
Output:
adcf
adcf
adcf
adcf
adcf
adcf
adcf
adcf
adcf
adcf
And perform a direct comparison:
Output:
1 0 1 0
0 0 0 0
0 1 0 1
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
Summing to the right gives us this:
Output:
2
0
2
0
0
0
0
0
0
0
These are the numbers of character matches for each candidate substring.
To finish, we use find to select the indexes of the rows whose match count is within our tolerance.
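The steps above can also be written as an explicit loop, sketched here in Python rather than MATLAB. The function name `fuzzy_find` is made up for the example. Like the index matrix above, it wraps around the end of the main string, and it returns 1-based positions so the result can be compared with pos.

```python
def fuzzy_find(text, pattern, tolerance):
    """Return 1-based start positions where at most `tolerance` characters differ."""
    n, m = len(text), len(pattern)
    positions = []
    for start in range(n):
        # Count matching characters in the window, wrapping around the end of
        # text just as the mod() step does in the MATLAB version.
        matches = sum(text[(start + k) % n] == pattern[k] for k in range(m))
        if matches >= m - tolerance:
            positions.append(start + 1)
    return positions


pos = fuzzy_find("abcdefghij", "adcf", 2)
print(pos)
```

If wrap-around matches are not wanted, restricting the loop to `range(n - m + 1)` drops the last few windows.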

In your code, c=a-b is not valid (the matrix dimensions are not the same).
If you need at least one matching character, in any order (as your example suggests), you can use something like this:
>> a='abcdefgh';
>> b='adcf';
>> sum(ismember(a,b)) ~= 0
ans =
1

Function that given these 2 values produces this third value?

I'm trying to write a function that when given 2 arguments, the 2 leftmost columns, produces the third column as a result:
0 0 0
1 0 3
2 0 2
3 0 1
0 1 1
1 1 0
2 1 3
3 1 2
0 2 2
1 2 1
2 2 0
3 2 3
0 3 3
1 3 2
2 3 1
3 3 0
I know there will be a modulus involved but I can't quite figure it out.
The underlying problem: 4 people are sitting at a table, and given a person and a target, I want to know which seat the target is in from that person's perspective.
Thanks

If a and b are the positions of the two persons, their "distance" is:
(4+b-a) % 4
This also shows that the fourth block in your example is wrong.

Assuming that the last block of numbers is wrong, I think you're looking for (4 + b - a) % 4, which gives c (for columns a, b, c).
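A minimal Python sketch of the formula follows; the function name `relative_seat` and the `seats` parameter are made up for the example. Adding the number of seats before taking the remainder keeps the left operand non-negative, which matters in languages such as C where % can return a negative result. The rows are the sixteen from the table as printed in the question, and the check at the end compares each one with the formula.

```python
def relative_seat(a, b, seats=4):
    """Seat of target b as seen from person a, counting around a table of `seats`."""
    return (seats + b - a) % seats


# (a, b, c) rows copied from the table in the question as printed.
table = [
    (0, 0, 0), (1, 0, 3), (2, 0, 2), (3, 0, 1),
    (0, 1, 1), (1, 1, 0), (2, 1, 3), (3, 1, 2),
    (0, 2, 2), (1, 2, 1), (2, 2, 0), (3, 2, 3),
    (0, 3, 3), (1, 3, 2), (2, 3, 1), (3, 3, 0),
]
mismatches = [(a, b, c) for a, b, c in table if relative_seat(a, b) != c]
print(mismatches)
```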