I wrote the following function to count the number of paths from the start cell (0,0) to the destination cell (n,n). I cannot, for the life of me, figure out why it recurses infinitely.
Code is as follows:
#include <iostream>
using namespace std;
int numOfPathsToDestUtil(int start, int end, int noOfPaths, int n) {
    cout<<"Start: "<<start<<" and end: "<<end<<"and n: "<<n<<"\n";
    if(start==end && start==n)
        return noOfPaths;
    if(end<start)
        return 0;
    numOfPathsToDestUtil(start+1, end, noOfPaths+1,n) + numOfPathsToDestUtil(start, end+1, noOfPaths+1,n);
}
int numOfPathsToDest( int n )
{
    cout<<"n is: "<<n<<"\n";
    return numOfPathsToDestUtil(0,0,0,n);
}
int main() {
    int ans = numOfPathsToDest(4);
    cout<<ans;
    return 0;
}
Note: I am not requesting help with the code itself (I say so because conditions like end<start are implementation-specific). Please help me understand why this recursion does not stop:
n is: 4
Start: 0 and end: 0and n: 4
Start: 1 and end: 0and n: 4
Start: 0 and end: 1and n: 4
Start: 1 and end: 1and n: 4
Start: 2 and end: 1and n: 4
Start: 1 and end: 2and n: 4
Start: 2 and end: 2and n: 4
Start: 3 and end: 2and n: 4
Start: 2 and end: 3and n: 4
Start: 3 and end: 3and n: 4
Start: 4 and end: 3and n: 4
Start: 3 and end: 4and n: 4
Start: 4 and end: 4and n: 4 --> I expect it to stop here as start=end and start=n
Start: 3 and end: 5and n: 4
Start: 4 and end: 5and n: 4
Start: 5 and end: 5and n: 4
Start: 6 and end: 5and n: 4
Start: 5 and end: 6and n: 4
Start: 6 and end: 6and n: 4
Thank you so much!
Let's label your calls
numOfPathsToDestUtil(0,0,0,n) # original (O)
numOfPathsToDestUtil(start+1, end, noOfPaths+1,n) # first-recursive (FR)
numOfPathsToDestUtil(start, end+1, noOfPaths+1,n) # second-recursive (SR)
Your output:
n is: 4
Start: 0 and end: 0and n: 4 # O - numOfPathsToDestUtil(0,0,0,4)
Start: 1 and end: 0and n: 4 # FR - numOfPathsToDestUtil(0+1,0,0+1,4)
Start: 0 and end: 1and n: 4 # SR - numOfPathsToDestUtil(0,0+1,0+1,4)
Start: 1 and end: 1and n: 4 # SR -> FR
Start: 2 and end: 1and n: 4 # SR -> FR -> FR
Start: 1 and end: 2and n: 4 # SR -> FR -> SR
Start: 2 and end: 2and n: 4 # SR -> FR -> SR -> FR
Start: 3 and end: 2and n: 4 # SR -> FR -> SR -> FR -> FR
Start: 2 and end: 3and n: 4 # SR -> FR -> SR -> FR -> SR
Start: 3 and end: 3and n: 4 # SR -> FR -> SR -> FR -> SR -> FR
Start: 4 and end: 3and n: 4 # SR -> FR -> SR -> FR -> SR -> FR -> FR
Start: 3 and end: 4and n: 4 # SR -> FR -> SR -> FR -> SR -> FR -> SR
Start: 4 and end: 4and n: 4 # SR -> FR -> SR -> FR -> SR -> FR -> SR -> FR (stops and returns value)
Start: 3 and end: 5and n: 4 # SR -> FR -> SR -> FR -> SR -> FR -> SR -> SR (end is already 5, so start==end==n can never hold on this branch; it keeps going and going)
Start: 4 and end: 5and n: 4
Start: 5 and end: 5and n: 4
Start: 6 and end: 5and n: 4
Start: 5 and end: 6and n: 4
Start: 6 and end: 6and n: 4
How to debug: I suggest drawing a call tree.
You are missing a return statement on this line:
numOfPathsToDestUtil(start+1, end, noOfPaths+1,n) + numOfPathsToDestUtil(start, end+1, noOfPaths+1,n);
Consider numOfPathsToDestUtil(start, end+1, noOfPaths+1,n) only.
The initial (start,end) is (0,0), which calls (0,1), which calls (0,2) => (0,3) => (0,4) => (0,5)... There is no termination constraint on end, so this branch recurses forever.
Now let's consider the recursion as a whole (I hope the trace below is easy to follow):
Init(start, end,n, status)
(0,0,4,calling)
=>(1,0,4, will end)+(0,1,4,calling)
=>(1,1,4,calling)+(0,2, 4,calling)
=>(2,1,4,will end)+(1,2,4,calling)+(0,2, 4,calling)
=>(1,2,4,calling)+(0,2, 4,calling)
=>(2,2,4,calling)+(1,3,4,calling) +(0,2,4,calling)
I think you can derive the rest; it shows that your recursion will not terminate easily.
You need to modify your constraints so that only the "desired" conditions continue the recursion.
If end > n, should the recursion continue?
If start == end but start < n, should the recursion continue?
I will not list them all. I hope this gives you a good direction to think in.
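One way to apply those constraints, as a minimal sketch rather than a definitive fix: it assumes the goal is to count paths, so each path that reaches (n,n) contributes 1 instead of accumulating noOfPaths, and it adds an upper bound on both coordinates. The name countPaths is made up for this example.

```cpp
#include <cassert>

// Hypothetical helper: counts paths from (start, end) to (n, n) that
// never enter a cell with end < start (the asker's own pruning rule).
int countPaths(int start, int end, int n) {
    assert(n >= 0);
    if (start > n || end > n)    // walked past the grid: stop recursing
        return 0;
    if (end < start)             // asker's original constraint
        return 0;
    if (start == n && end == n)  // reached the destination: one path found
        return 1;
    // The return that the original recursive line was missing.
    return countPaths(start + 1, end, n) + countPaths(start, end + 1, n);
}
```

With these bounds in place, countPaths(0, 0, 4) terminates and returns 14.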
I want to implement this pseudocode in Erlang:
function()
B[1] <-- 1
for m <-- 2 to 21 do
B[m] <-- 0
for k <-- 1 to m - 1 do
B[m] <-- B[m] − 9 * B[k]
B[m] <-- B[m]/m
return B
My first thought was to do something with a list comprehension, something like
[...|| M <- lists:seq(2,21), K <- lists:seq(1,M-1)] in order to try to "translate" the nested for-loops somehow, but now I'm stuck and I don't know how to continue.
I'd really appreciate some help on how to write this code in Erlang, I feel a bit lost.
Thanks in advance!
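The code could look like this (assuming the division by m happens once per m, after the inner loop):
test_loop2()->
    M = lists:seq(2,21),
    Dict = dict:new(),
    Dict_a = dict:store(1,1,Dict),
    Dict_b = lists:foldl(fun(X,Acc_x)->
        io:format("x:~p~n",[X]),
        %% inner loop: B[m] <- B[m] - 9 * B[k], with B[k] read from the dict
        Value = lists:foldl(fun(K,Acc_a)->
            Acc_a - 9*dict:fetch(K,Acc_x)
        end,0,lists:seq(1,X-1)),
        %% B[m] <- B[m]/m
        dict:store(X,Value/X,Acc_x)
    end,Dict_a,M),
    io:format("~p",[lists:sort(dict:to_list(Dict_b))]).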
In Erlang, a for-loop is written like this:
loop(StopIndex, StopIndex) -> ok;
loop(CurrentIndex, StopIndex) ->
    %% do stuff
    loop(CurrentIndex+1, StopIndex).
Nested loops look like this:
outer_loop(StopIndex, StopIndex) -> ok;
outer_loop(Index, StopIndex) ->
    io:format("---->outer_loop index: ~w~n", [Index]),
    inner_loop(1, Index-1),
    outer_loop(Index+1, StopIndex).
inner_loop(StopIndex, StopIndex) ->
    io:format("inner_loop index: ~w~n", [StopIndex]);
inner_loop(Index, StopIndex) ->
    io:format("inner_loop index: ~w~n", [Index]),
    inner_loop(Index+1, StopIndex).
In the shell:
12> a:outer_loop(2, 7).
---->outer_loop index: 2
inner_loop index: 1
---->outer_loop index: 3
inner_loop index: 1
inner_loop index: 2
---->outer_loop index: 4
inner_loop index: 1
inner_loop index: 2
inner_loop index: 3
---->outer_loop index: 5
inner_loop index: 1
inner_loop index: 2
inner_loop index: 3
inner_loop index: 4
---->outer_loop index: 6
inner_loop index: 1
inner_loop index: 2
inner_loop index: 3
inner_loop index: 4
inner_loop index: 5
ok
If you need to manipulate some data in the loops, then you can add other parameter variables to the function definitions to hold the data. For instance, in your example the inner loop would need a variable to store the data structure B.
Lastly, you should be aware that lists suck at random access, e.g. B[m], so consider using the array module.
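For cross-checking whatever the Erlang version prints, here is a small reference sketch of the pseudocode in C++. It assumes that B[m] is divided by m once, after the inner loop over k has finished (the flattened pseudocode does not make the nesting explicit), and the name computeB is made up for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical reference implementation of the pseudocode.
// Assumption: "B[m] <- B[m]/m" runs once per m, after the inner loop.
std::vector<double> computeB(int last) {
    assert(last >= 1);
    std::vector<double> B(last + 1, 0.0);  // index 0 unused, 1-based like the pseudocode
    B[1] = 1.0;
    for (int m = 2; m <= last; ++m) {
        for (int k = 1; k < m; ++k) {
            B[m] -= 9.0 * B[k];            // B[m] <- B[m] - 9 * B[k]
        }
        B[m] /= m;                         // B[m] <- B[m] / m
    }
    return B;
}
```

Under that assumption, computeB(21) yields B[2] = -4.5 and B[3] = 10.5, which gives two easy values to compare against.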
I am trying to get the longest sublist within a list. I need a rule that recursively searches a list of lists and determines which list has the longest length.
For example:
input: [[1],[1,2],[],[1,2,3,4],[5,6]]
output: [1,2,3,4]
This is what I have so far:
max([H|T], Path, Length) :-
    length(H, L),
    (   L #> Length ->
        max(T, H, L) ;
        max(T, Path, Length) ).
I would like max() to work like this:
?- max([[1],[1,2],[],[1,2,3,4],[5,6]], Path, Distance).
Path = [1,2,3,4]
Distance = 4
When I run a trace, this is the output:
{trace}
| ?- max([[1],[1,2],[],[1,2,3,4],[5,6]], Path, Distance).
1 1 Call: max([[1],[1,2],[],[1,2,3,4],[5,6]],_307,_308) ?
2 2 Call: length([1],_387) ?
2 2 Exit: length([1],1) ?
3 2 Call: 1#>_308 ?
3 2 Exit: 1#>_308 ?
4 2 Call: max([[1,2],[],[1,2,3,4],[5,6]],[1],1) ?
5 3 Call: length([1,2],_462) ?
5 3 Exit: length([1,2],2) ?
6 3 Call: 2#>1 ?
6 3 Exit: 2#>1 ?
7 3 Call: max([[],[1,2,3,4],[5,6]],[1,2],2) ?
8 4 Call: length([],_537) ?
8 4 Exit: length([],0) ?
9 4 Call: 0#>2 ?
9 4 Fail: 0#>2 ?
9 4 Call: max([[1,2,3,4],[5,6]],[1,2],2) ?
10 5 Call: length([1,2,3,4],_587) ?
10 5 Exit: length([1,2,3,4],4) ?
11 5 Call: 4#>2 ?
11 5 Exit: 4#>2 ?
12 5 Call: max([[5,6]],[1,2,3,4],4) ?
13 6 Call: length([5,6],_662) ?
13 6 Exit: length([5,6],2) ?
14 6 Call: 2#>4 ?
14 6 Fail: 2#>4 ?
14 6 Call: max([],[1,2,3,4],4) ?
14 6 Fail: max([],[1,2,3,4],4) ?
12 5 Fail: max([[5,6]],[1,2,3,4],4) ?
9 4 Fail: max([[1,2,3,4],[5,6]],[1,2],2) ?
7 3 Fail: max([[],[1,2,3,4],[5,6]],[1,2],2) ?
4 2 Fail: max([[1,2],[],[1,2,3,4],[5,6]],[1],1) ?
1 1 Fail: max([[1],[1,2],[],[1,2,3,4],[5,6]],_307,_308) ?
(2 ms) no
I believe the issue is that I am not handling the occurrence of an empty set "[]". However, I have attempted several different methods and am unable to get my desired output.
You need a base clause that ends the recursion, plus one more parameter to return the result.
max([], _, Length, Length).
max([H|T], Path, Length, RetLength) :-
    length(H, L),
    (   L #> Length ->
        max(T, H, L, RetLength) ;
        max(T, Path, Length, RetLength)
    ).
Test:
?- max([[1],[1,2],[],[1,2,3,4],[5,6]], Path, Distance,Len).
Len = 4.
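The same accumulator idea, sketched in C++ for comparison. Unlike the four-argument predicate above, which only reports the length, this version hands back the longest sublist itself as well. The function name longestSublist is invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns the longest inner vector and its length.
// Ties keep the earliest candidate, mirroring the strict '>' comparison.
std::pair<std::vector<int>, std::size_t>
longestSublist(const std::vector<std::vector<int>>& lists) {
    std::vector<int> best;     // accumulator: best sublist seen so far
    std::size_t bestLen = 0;   // accumulator: its length
    for (const auto& candidate : lists) {
        if (candidate.size() > bestLen) {
            best = candidate;
            bestLen = candidate.size();
        }
    }
    assert(best.size() == bestLen);  // both accumulators stay in sync
    return {best, bestLen};
}
```

For the input [[1],[1,2],[],[1,2,3,4],[5,6]] this returns [1,2,3,4] together with the length 4.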
I am trying to read content from a file and then organize it into a list of tuples. I have read the file into a list of numbers, but it seems to skip numbers immediately after newlines. How can I prevent this behaviour?
I am guaranteed a file with an even number of characters.
-module(brcp).
-export([parse_file/1]).
parse_file(Filename) ->
    read_file(Filename).
read_file(Filename) ->
    {ok, File} = file:read_file(Filename),
    Content = unicode:characters_to_list(File),
    build_tuples([begin {Int,_}=string:to_integer(Token), Int end|| Token<-string:tokens(Content," /n/r")]).
build_tuples(List) ->
    case List of
        [] -> [];
        [E1,E2|Rest] -> [{E1,E2}] ++ build_tuples(Rest)
    end.
Here is a sample input file:
1 7 11 0
1 3 5 0 7 0
1 8 10 0 1 11
99 0
-module(tuples).
-export([parse/0]).
parse() ->
    {ok, File} = file:read_file("tuples.txt"),
    List = binary:split(File, [<<" ">>, <<"\t">>, <<"\n">>], [global, trim_all]),
    io:format("~p~n", [List]),
    build_tuples(List, []).
build_tuples([X,Y|T], Acc) ->
    build_tuples(T, [{X,Y}|Acc]);
build_tuples([X|T], Acc) ->
    build_tuples(T, [{X, undefined}|Acc]);
build_tuples([], Acc) ->
    lists:reverse(Acc).
The text file I used is almost the same as yours, but I added tabs and multiple spaces to make it more realistic:
1 7 11 0
1 3 5 0 7 0
1 8 10 0 1 11
99 0
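You can of course convert the binaries to integers when adding them to the tuples, using erlang:binary_to_integer/1. binary:split/3 as used here splits on every separator (spaces, tabs, newlines); runs of separators produce empty binaries, which trim_all then removes. You can leave that handling out if your input is always well-formed. As for the skipped numbers in your version: the separator string " /n/r" uses forward slashes, so the separators are space, '/', 'n' and 'r' rather than newline and carriage return (that would be " \n\r"). A token such as "0\n1" therefore stays in one piece, and string:to_integer/1 reads only the leading 0. Result: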
14> tuples:parse().
[<<"1">>,<<"7">>,<<"11">>,<<"0">>,<<"1">>,<<"3">>,<<"5">>,<<"0">>,<<"7">>,<<"0">>,<<"1">>,<<"8">>,<<"10">>,<<"0">>,<<"1">>,<<"11">>,<<"99">>,<<"0">>]
[{<<"1">>,<<"7">>},{<<"11">>,<<"0">>},{<<"1">>,<<"3">>},{<<"5">>,<<"0">>},{<<"7">>,<<"0">>},{<<"1">>,<<"8">>},{<<"10">>,<<"0">>},{<<"1">>,<<"11">>},{<<"99">>,<<"0">>}]
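The pairing step itself is language-independent. As a rough sketch in C++, assuming the input holds an even count of whitespace-separated integers (parsePairs is a made-up name), stream extraction treats spaces, tabs and newlines alike, so no separator list is needed:

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: reads whitespace-separated integers and pairs them up.
std::vector<std::pair<int, int>> parsePairs(const std::string& text) {
    std::istringstream in(text);
    std::vector<int> numbers;
    for (int value; in >> value; )     // operator>> skips spaces, tabs, newlines
        numbers.push_back(value);
    assert(numbers.size() % 2 == 0);   // assumption: an even count of numbers
    std::vector<std::pair<int, int>> pairs;
    for (std::size_t i = 0; i + 1 < numbers.size(); i += 2)
        pairs.emplace_back(numbers[i], numbers[i + 1]);
    return pairs;
}
```

Given "1 7 11 0\n1 3", this produces the pairs (1,7), (11,0) and (1,3); the number right after the newline is kept.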
UPDATE 2
I've added some code (with explanation) that I wrote myself at the end of this question. It is a suboptimal solution (both in coding efficiency and in the resulting output), but it more or less manages to select items that adhere to the constraints. If you have any ideas on how to improve it (again in both efficiency and output), please let me know.
1. Updated Post
Please look below for the initial question and sample code. Thanks to alexis_laz's answer, the problem was solved for a small number of items. However, when the number of items becomes too large, the combn function in R fails with the error "invalid 'ncol' value (too large or NA)". Since my dataset does have a lot of items, I was wondering whether replacing some of his code (shown after this) with C++ would solve this, and if so, what code I should use. Thanks!
This is the code as provided by alexis_laz:
ff = function(x, No_items, No_persons)
{
  do.call(rbind,
          lapply(No_items:ncol(x),
                 function(n) {
                   col_combs = combn(seq_len(ncol(x)), n, simplify = F)
                   persons = lapply(col_combs, function(j) rownames(x)[rowSums(x[, j, drop = F]) == n])
                   keep = unlist(lapply(persons, function(z) length(z) >= No_persons))
                   data.frame(persons = unlist(lapply(persons[keep], paste, collapse = ", ")),
                              items = unlist(lapply(col_combs[keep], function(z) paste(colnames(x)[z], collapse = ", "))))
                 }))
}
2. Initial Post
Currently I'm working on a set of data coming from adaptive measurement, which means that not all persons have made all of the same items. For my analysis however I need a dataset that contains only items that have been made by all persons (or a subset of these persons).
I have a matrix object in R with rows = persons (100000), and columns = items(220), and a 1 in a cell if the person has made the item and a 0 if the person has not made the item.
How can I use R to determine which combination of at least 15 items, is made by the highest amount of persons?
Hopefully the question is clear (if not please ask me for more details and I will gladly provide those).
Tnx in advance.
Joost
Edit:
Below is a sample matrix with the items (A:E) as columns and persons (1:5) as rows.
mat <- matrix(c(1,1,1,0,0,1,1,0,1,1,1,1,1,0,1,0,1,1,0,0,1,1,1,1,0),5,5,byrow=T)
colnames(mat) <- c("A","B","C","D","E")
rownames(mat) <- 1:5
> mat
A B C D E
"1" 1 1 1 0 0
"2" 1 1 0 1 1
"3" 1 1 1 0 1
"4" 0 1 1 0 0
"5" 1 1 1 1 0
mat[1,1] = 1 means that person 1 has given a response to item 1.
Now (in this example) I'm interested in finding out which set of at least 3 items is made by at least 3 people. So here I can just go through all possible combinations of 3, 4 and 5 items to check how many people have a 1 in the matrix for each item in a combination.
This will result in me choosing the item combination A, B and C, since it is the only combination of items that has been made by 3 people (namely persons 1, 3 and 5).
Now for my real dataset I want to do this but then for a combination of at least 10 items that a group of at least 75 people all responded to. And since I have a lot of data preferably not by hand as in the example data.
I'm thus looking for a function/code in R that lets me set the minimum number of items and persons, and then gives me all combinations of items and persons that adhere to these constraints or have a greater number of items/persons than the constraint.
Thus for the example matrix it would be something like:
f <- function(data,no.items,no.persons){
#code
}
> f(mat,3,3)
no.item no.pers items persons
1 3 3 A, B, C 1, 3, 5
Or in case of at least 2 items that are made by at least 3 persons;
> f(mat,2,3)
no.item no.pers items persons
1 2 4 A, B 1, 2, 3, 5
2 2 3 A, C 1, 3, 5
3 2 4 B, C 1, 3, 4, 5
4 3 3 A, B, C 1, 3, 5
Hopefully this clears up what my question actually is about. Tnx for the quick replies that I already received!
3. Written Code
Below is the code I've written today. It takes each item once as a starting point and then looks for the item that has been answered most by people who also responded to the start item. It then takes these two items and looks for a third item, and repeats this until the number of people who responded to all selected items drops below the given limit.
One drawback of the code is that it takes some time to run (it goes up somewhat exponentially when the number of items grows). The second drawback is that it still does not evaluate all possible combinations of items. The start item and the subsequently chosen item may have many respondents in common, but if the chosen item has almost no overlap with the other (not yet chosen) items, the sample might shrink very fast. If instead an item were chosen with somewhat fewer persons in common with the start item, but with a lot of connections to other items, the final collection of selected items might be much bigger than the one produced by the code below. So again, suggestions and improvements in both directions are welcome!
set.seed(512)
mat <- matrix(rbinom(1000000, 1, .6), 10000, 100)
colnames(mat) <- 1:100
fff <- function(data,persons,items){
  xx <- list()
  for(j in 1:ncol(data)){
    d <- matrix(c(j,length(which(data[,j]==1))),1,2)
    colnames(d) <- c("item","n")
    t = persons+1
    a <- j
    while(t >= persons){
      b <- numeric(0)
      for(i in 1:ncol(data)){
        z <- c(a,i)
        if(i %in% a){
          b[i] = 0
        } else {
          b[i] <- length(which(rowSums(data[,z])==length(z)))
        }
      }
      c <- c(which.max(b),max(b))
      d <- rbind(d,c)
      a <- c(a,c[1])
      t <- max(b)
    }
    print(j)
    xx[[j]] = d
  }
  x <- y <- z <- numeric(0)
  zz <- matrix(c(0,0,rep(NA,ncol(data))),length(xx),ncol(data)+2,byrow=T)
  colnames(zz) <- c("n.pers", "n.item", rep("I",ncol(data)))
  for(i in 1:length(xx)){
    zz[i,1] <- xx[[i]][nrow(xx[[i]])-1,2]
    zz[i,2] <- length(unname(xx[[i]][1:nrow(xx[[i]])-1,1]))
    zz[i,3:(zz[i,2]+2)] <- unname(xx[[i]][1:nrow(xx[[i]])-1,1])
  }
  zz <- zz[,colSums(is.na(zz))<nrow(zz)]
  zz <- zz[which((rowSums(zz,na.rm=T)/rowMeans(zz,na.rm=T))-2>=items),]
  zz <- as.data.frame(zz)
  return(zz)
}
zz <- fff(mat,110,8)
> head(zz)
n.pers n.item I I I I I I I I I I
1 156 9 1 41 13 80 58 15 91 12 39 NA
2 160 9 2 27 59 13 81 16 15 6 92 NA
3 158 9 3 59 83 32 25 80 14 41 16 NA
4 160 9 4 24 27 71 32 10 63 42 51 NA
5 114 10 5 59 66 27 47 13 44 63 30 52
6 158 9 6 13 56 61 12 59 8 45 81 NA
#col 1 = number of persons in sample
#col 2 = number of items in sample
#col 3:12 = which items create this sample (NA if n.item is less than 10)
To follow up on my comment, something like:
set.seed(1618)
mat <- matrix(rbinom(1000, 1, .6), 100, 10)
colnames(mat) <- sample(LETTERS, 10)
rownames(mat) <- sprintf('person%s', 1:100)
mat1 <- mat[rowSums(mat) > 5, ]
head(mat1)
# A S X D R E Z K P C
# person1 1 1 1 0 1 1 1 1 1 1
# person3 1 0 1 1 0 1 0 0 1 1
# person4 1 0 1 1 1 1 1 0 1 1
# person5 1 1 1 1 1 0 1 1 0 0
# person6 1 1 1 1 0 1 0 1 1 0
# person7 0 1 1 1 1 1 1 1 0 0
table(rowSums(mat1))
# 6 7 8 9
# 24 23 21 5
tab <- table(sapply(1:nrow(mat1), function(x)
paste(names(mat1[x, ][mat1[x, ] == 1]), collapse = ',')))
data.frame(tab[tab > 1])
# tab.tab...1.
# A,S,X,D,R,E,P,C 2
# A,S,X,D,R,E,Z,P,C 2
# A,S,X,R,E,Z,K,C 3
# A,S,X,R,E,Z,P,C 2
# A,S,X,Z,K,P,C 2
Here is another idea that matches your output:
ff = function(x, No_items, No_persons)
{
  do.call(rbind,
          lapply(No_items:ncol(x),
                 function(n) {
                   col_combs = combn(seq_len(ncol(x)), n, simplify = F)
                   persons = lapply(col_combs, function(j) rownames(x)[rowSums(x[, j, drop = F]) == n])
                   keep = unlist(lapply(persons, function(z) length(z) >= No_persons))
                   data.frame(persons = unlist(lapply(persons[keep], paste, collapse = ", ")),
                              items = unlist(lapply(col_combs[keep], function(z) paste(colnames(x)[z], collapse = ", "))))
                 }))
}
ff(mat, 3, 3)
# persons items
#1 1, 3, 5 A, B, C
ff(mat, 2, 3)
# persons items
#1 1, 2, 3, 5 A, B
#2 1, 3, 5 A, C
#3 1, 3, 4, 5 B, C
#4 1, 3, 5 A, B, C
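Since the question also asks about C++, here is a small sketch of only the inner test from ff, that is, the rowSums(x[, j]) == n step that finds the persons who answered every item in one candidate set. It does not address the number of combinations that combn has to enumerate. The name personsWithAllItems and the 0-based indexing are assumptions of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the 0-based rows (persons) that have a 1
// in every selected column (item).
std::vector<std::size_t> personsWithAllItems(
        const std::vector<std::vector<int>>& mat,
        const std::vector<std::size_t>& items) {
    std::vector<std::size_t> persons;
    for (std::size_t row = 0; row < mat.size(); ++row) {
        bool answeredAll = true;
        for (std::size_t col : items) {
            assert(col < mat[row].size());  // column index must be valid
            if (mat[row][col] != 1) {       // one missing item rules the person out
                answeredAll = false;
                break;
            }
        }
        if (answeredAll) persons.push_back(row);
    }
    return persons;
}
```

On the 5x5 sample matrix, selecting columns 0, 1, 2 (items A, B, C) returns rows 0, 2, 4, which corresponds to persons 1, 3 and 5.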
I am supposed to write output into a text file with C++ and then process it using GraphViz.
This extract from my code:
cout << " " << i << " -> " << j
shows this error when I run it:
Error: MyGraph5V20E:4: syntax error near line 4
context: 0 >>> - <<< > 1 [label="73"];
And this is the output file:
graph G {
node [shape=circle]
0 -> 1 [label="73"];
0 -> 2 [label="60"];
0 -> 3 [label="36",color=red];
0 -> 4 [label="71"];
1 -> 2 [label="50",color=red];
1 -> 3 [label="78"];
1 -> 4 [label="85"];
2 -> 3 [label="30",color=red];
2 -> 4 [label="23",color=red];
3 -> 4 [label="68"];
}
I suppose it has to do with " -> " in my code. How can I work around this?
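For what it's worth, in the DOT language the -> edge operator is only valid inside a digraph; a plain graph, as in the output file above, uses -- instead. A minimal sketch of choosing the operator when writing an edge line, assuming integer node ids and labels; the function edgeLine and its parameters are invented for this example:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: formats one edge statement in DOT syntax.
// "graph G {" needs "--"; "digraph G {" needs "->".
std::string edgeLine(int from, int to, int label, bool directed) {
    assert(from >= 0 && to >= 0);  // node ids in the sample output are non-negative
    std::ostringstream out;
    out << "  " << from << (directed ? " -> " : " -- ") << to
        << " [label=\"" << label << "\"];";
    return out.str();
}
```

So either keep " -> " and open the file with "digraph G {", or keep "graph G {" and write " -- " between the node ids.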