So I have 20 conditions and I'm trying to check different combinations of them.
For example:
1 and 2 or 3
1 or 2 or 3 and 4
4 and 5 or 2
4 and 5 and 2 or 9
etc.
I'm trying to figure out what I should do. I thought maybe I could use a switch case, but I'm not really sure how to do that in a case like this with so many options.
I don't need to check all the possible options, but I would like to try at least 15-20 "and"/"or" combinations.
Would love some advice here!
:)
This should give you all two-element combinations (ordered pairs) of the ten labels:
a = '1234567890'
b = list(a)
for c in b:
    for d in b:
        print(c+d)
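If the goal is to try a handful of chosen and/or combinations rather than enumerate pairs, one option is to store each combination as a small function and evaluate them in a loop. This is only a sketch: the truth values in `conds` are made-up placeholders (even-numbered conditions are true), and the names `conds`, `combos` and `results` are mine, not from the question. Note that Python evaluates `and` before `or`, so `1 and 2 or 3` is read as `(1 and 2) or 3`.

```python
# Placeholder truth values for conditions 1..20 (assumption: evens are True).
# In real code each entry would be the result of one of your actual checks.
conds = {i: i % 2 == 0 for i in range(1, 21)}

# Each combination is a function of the condition table.
combos = {
    "1 and 2 or 3": lambda c: c[1] and c[2] or c[3],
    "1 or 2 or 3 and 4": lambda c: c[1] or c[2] or c[3] and c[4],
    "4 and 5 or 2": lambda c: c[4] and c[5] or c[2],
    "4 and 5 and 2 or 9": lambda c: c[4] and c[5] and c[2] or c[9],
}

# Evaluate every combination against the same set of condition results.
results = {name: bool(rule(conds)) for name, rule in combos.items()}
for name, value in results.items():
    print(f"{name}: {value}")
```

Adding another combination is then one more entry in `combos`, which scales better than a switch statement with one branch per combination.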
I have a data set of flows between locations. Say there are 50 locations, but the number of pairs is not even because some locations do not have flows between them. I would like to create IDs for each pair of observations (w_id and h_id).
Thank you.
Desired output
w_code h_code w_id h_id
295101011001003 291892204451015 1 1
295101011001003 295101011001003 1 2
295101011001003 291892202003011 1 3
295101011001025 295101021003001 2 1
295101011001025 295101011001025 2 2
295101011001026 291879507003038 3 1
295101011001026 190130007001013 3 2
295101011001026 295101105001027 3 3
295101011001026 291892126002008 3 4
295101011001026 291892126001005 3 5
295101011001029 291892199006006 4 1
295101011002007 295101011002015 5 1
295101011002014 295101011002016 6 1
295101011002014 295101011001003 6 2
295101011002016 295101011001007 7 1
295101011002030 295101255001008 8 1
Documentation accessible through Stata includes this paper on composite categorical variables and this paper on handling dyadic data. The Stata command search would have led to these papers, except that the art to finding as well as searching is thinking of the right keywords.
In your case the natural question arises whether for example the pair (1, 2) is really the same as (2, 1) and for flows my guess is No. In mathematics, abstraction is often the key to solving a problem; in statistical computing some concreteness may make a problem clearer. Perhaps h means husband and w means wife, and perhaps not. Assuming that (1, 2) and (2, 1) are quite different, a joint identifier is immediately obtained by
egen newid = group(w_id h_id)
and for a small number of identifiers -- you mention 50 -- there is no pain in asking for values to be labelled, so that with
egen newid = group(w_id h_id), label
the pair (1, 1) would be mapped to the value 1 and the value label 1 1.
As this solution was not immediately obvious, it is likely that a study of help egen will reveal a bunch of tools likely to be useful in data management; some are directly statistical.
For pairs of identifiers where Billy, Bob is to be treated like Bob, Billy see the second paper linked above. Whether this is true for the OP is a little unclear, but it is likely to be true for some others reading this in the future.
I got stuck with a specific question in R around concatenating columns of a data frame by using a wildcard. Perhaps I am searching wrongly. However I could not find a matching answer yet.
Here is my question:
I have a data frame df where each column represents a user (U1, U2, U3), e.g.:
> df <-data.frame(U1=1:3, U2=4:6, U3=7:9)
> df
  U1 U2 U3
1 1 4 7
2 2 5 8
3 3 6 9
I would like to concatenate the values from all users into a single vector as one would do using the c() function, e.g.:
> c(df$U1, df$U2, df$U3)
[1] 1 2 3 4 5 6 7 8 9
However, my number of users is large and varies over time. So I am looking for an elegant, dynamic way of concatenating the columns, such as
> c(df$U*)
Unfortunately this does not seem to work. I played around with grep and regular expressions but could not get it to work. For sure, I could use a for-loop and program my own cat function but I assume there is a better way. I just don't find it. Maybe I am just blind. Hope you can help.
sub_df <- df[, grep(pattern = '^U.*', names(df))]
stack(sub_df)$values
Hope this works for you. The first line subsets the columns you need (here, those whose names start with U); the second stacks that subset into a single vector.
Coerce the data frame to a matrix first:
as.vector(as.matrix(df))
Use the bracket [ to select columns whose names match a certain expression:
df[, grep("U.*", colnames(df)), drop = FALSE]
I am using RapidMiner Studio 6 and I want to replace values in a dataset (or results, or series). Let's say we have an attribute with values between 1 and 10; I want to apply an operator that replaces values below 4 and above 8 with 0, so the new values will be 0s and the numbers from 4 to 8. Say we have
2 4 1 5 7 9 -op-> 0 4 0 5 7 0.
Can someone tell me which operator, or subprocess, to use?
(copied from original answer)
You can use the Generate Attributes operator for this, with an if expression in the parameters section.
If your attribute is called a2 and you want to change it to zero if its value is below 3 or above 5, the parameters to the Generate Attributes operator would look like this.
attribute name: a2
function expressions: if(a2<3,0,if(a2>5,0,a2))
We are given an integer N and we need to count the total number of permutations of the numbers 0 to N-1. We are also given N-1 constraints, e.g.:
if N=4 then count permutations of 0,1,2,3 given:
0>1
0>2
0>3
I thought about building a graph, then counting the permutations of the numbers at the same level and multiplying that by the permutations at the other levels, e.g.:
For above example:
      0
    / | \
   /  |  \
  1   2   3   ------> 3! = 6
So the total number of permutations is 6.
But I am having difficulty implementing it in C++. Also, this question was asked in the Facebook Hacker Cup; the competition is over now. I have seen other people's code and found that they solved it using DFS. Any help?
The simplest way to do this is to use a standard permutation generator and filter out each permutation that violates the conditions. This is obviously very inefficient, and for larger values of N it is not feasible to compute. It is the sort of "booby" option these contests have, which lets less skilled contestants complete the problem.
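A minimal sketch of that generate-and-filter approach, in Python rather than C++ for brevity. The function name `count_by_filtering` is mine, and I am assuming each constraint is encoded as a pair (a, b) meaning "a must appear before b in the permutation"; which direction the question's `>` corresponds to is a guess, though it does not change the count for this example.

```python
from itertools import permutations

def count_by_filtering(n, constraints):
    """Count orderings of 0..n-1 in which a appears before b for each (a, b)."""
    count = 0
    for perm in permutations(range(n)):
        # Map each value to the index at which it appears in this ordering.
        position = {value: index for index, value in enumerate(perm)}
        if all(position[a] < position[b] for a, b in constraints):
            count += 1
    return count

# The N = 4 example from the question: 0 is constrained against 1, 2 and 3.
print(count_by_filtering(4, [(0, 1), (0, 2), (0, 3)]))  # prints 6
```

This visits all N! orderings, so it is only useful for small N or for checking a faster method.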
The skilled approach requires insight into the ways of counting combinations and permutations. To illustrate the method I will use an example. Inputs:
N = 7
2 < 4
0 < 3
3 < 6
We first simplify this by combining the dependent conditions into a single condition, as follows:
2 < 4
0 < 3 < 6
Start with the longest condition, and determine the combination count of the gaps (this is the key insight). For example, some of the combinations are as follows:
XXXX036
XXX0X36
XXX03X6
XXX036X
XX0XX36
etc.
Now you can see there are 4 gaps: ? 0 ? 3 ? 6 ?. We need to count the possible partitions of the X's among these four gaps. The number of such partitions is (7 choose 3) = 35 (do you see why? It is the number of ways to pick which 3 of the 7 positions hold 0, 3 and 6). Next, we multiply by the combinations of the next condition, 2 < 4, over the remaining blank spots (the X's). We can multiply because this condition is fully independent of the 0 < 3 < 6 condition. This combination count is (4 choose 2) = 6. Finally, the two unconstrained values (1 and 5) fill the last 2 spots in 2! = 2 ways. Thus, the answer is 35 x 6 x 2 = 420.
Now, let's make it a little more complicated. Add the condition:
1 < 6
The way this changes the calculation is that before 036 had to appear in that order. But, now, we have three possible arrangements:
1036
0136
0316
Thus, the total count is now (7 choose 4) x 3 x (3 choose 2) = 35 x 3 x 3 = 315.
So, to recap, the procedure is to split the problem into independent groups of conditions. For each independent group you count the valid arrangements within the group and the ways to place the group in the remaining spots, then you multiply everything together.
I have walked through this example manually, but you can program the same procedure.
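One way the procedure might be programmed, as a sketch in Python. The function name `count_by_components` is hypothetical, a pair (a, b) is assumed to mean "a appears before b", and arrangements inside each group are counted by brute force, which is my simplification rather than part of the method above.

```python
from itertools import permutations
from math import comb

def count_by_components(n, constraints):
    """Count orderings of 0..n-1 where a precedes b for each (a, b)."""
    # Union-find: values linked by any constraint end up in the same group.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in constraints:
        parent[find(a)] = find(b)

    groups = {}
    for value in range(n):
        groups.setdefault(find(value), []).append(value)

    total, free_spots = 1, n
    for members in groups.values():
        # Constraints whose endpoints lie in this group (both do, by construction).
        local = [(a, b) for a, b in constraints if a in members]
        # Valid internal arrangements of this group, by brute force.
        arrangements = 0
        for perm in permutations(members):
            pos = {v: i for i, v in enumerate(perm)}
            if all(pos[a] < pos[b] for a, b in local):
                arrangements += 1
        # Choose which free spots the group occupies, then arrange it.
        total *= comb(free_spots, len(members)) * arrangements
        free_spots -= len(members)
    return total

print(count_by_components(7, [(2, 4), (0, 3), (3, 6)]))          # prints 420
print(count_by_components(7, [(2, 4), (0, 3), (3, 6), (1, 6)]))  # prints 315
```

Unconstrained values such as 1 and 5 become groups of size one, so their placement falls out of the same multiplication. The brute-force step is still factorial in the size of the largest group, so a contest-size input would need a smarter count inside each group.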
I'm trying to make a simple "fill in the blanks" type of exam in Django and would like to know the best way to design the database.
Example: "9 is the sum of 4 and 5, or 3 and 6."
During the exam, the above sentence would appear as "__ is the sum of __ and __, or __ and __."
Obviously there is an unlimited number of answers to this question, but assume that the above numbers are the only answers. The catch is that you can switch the places of 4 and 5, or the places of 3 and 6, and still get the right answer. Besides, the number of blanks is not known in advance, so it can be 1 or more.
I would go with something like this. First define a Question table:
Question
--------------------------
Id  Text
1   9 is the sum of 4 and 5, or 3 and 6
...
Then save the position of the hidden substrings, let's call them fields, in another table:
QuestionField
--------------------------
Id  QuestionId  StartsAt  EndsAt  Set
1   1           0         1       1
2   1           16        17      2
3   1           22        23      2    # NOTE: Is in the same set as QuestionField #2
...
This table lets you retrieve the actual value of the field by querying the Question table (e.g. entry one refers to the value '9' in the first question).
The "Set" column contains an identifier of the "set" this field belongs to; fields in the same set can be swapped with each other. When you populate it, you have to ensure that all fields that can be swapped with each other are in the same set. The actual number of the set doesn't matter, as long as it's unique to that set. But it makes sense to have it equal to the ID of one of the elements of the set.
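To illustrate how the "Set" column could be used when grading, here is a plain-Python sketch (no Django models, to keep it self-contained). The function name `is_correct` and the `fields` list are hypothetical; the first three rows mirror the QuestionField table above, while the last two rows (for the blanks "3" and "6") are my assumption about how the table would continue.

```python
from collections import Counter

question_text = "9 is the sum of 4 and 5, or 3 and 6"

# (StartsAt, EndsAt, Set) rows. The first three come from the table above;
# the last two are assumed rows for the "3" and "6" blanks.
fields = [
    (0, 1, 1),
    (16, 17, 2),
    (22, 23, 2),
    (28, 29, 3),
    (34, 35, 3),
]

def is_correct(text, fields, answers):
    """answers[i] is the user's entry for fields[i], in blank order."""
    if len(answers) != len(fields):
        return False
    expected, given = {}, {}
    for (start, end, set_id), answer in zip(fields, answers):
        # Compare values per set as multisets, so order within a set is free.
        expected.setdefault(set_id, Counter())[text[start:end]] += 1
        given.setdefault(set_id, Counter())[answer.strip()] += 1
    return expected == given

print(is_correct(question_text, fields, ["9", "5", "4", "6", "3"]))  # True
print(is_correct(question_text, fields, ["9", "4", "3", "5", "6"]))  # False
```

Comparing per-set counters rather than position by position is what lets 4 and 5 trade places without letting a 4 stand in for a 3.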