Count the number of possible permutations of numbers less than integer N, given N-1 constraints - c++

We are given an integer N and we need to count the total number of permutations of numbers less than N. We are also given N-1 constraints. e.g.:
if N=4 then count permutations of 0,1,2,3 given:
0>1
0>2
0>3
I thought about making a graph, counting the number of permutations of the numbers at the same level, and multiplying that by the permutations at the other levels. E.g., for the above example:
      0
    / | \
   /  |  \
  1   2   3  ------> 3! = 6
So the total number of permutations is 6.
But I have difficulty implementing it in C++. Also, this question was asked in the Facebook Hacker Cup; the competition is over now. I have seen other people's code and found that they did it using DFS. Any help?

The simplest way to do this is to use a standard permutation generator and filter out each permutation that violates the conditions. This is obviously very inefficient, and for larger values of N it is infeasible, since there are N! permutations to check. It is the fallback option that these contests leave open so that contestants who do not find the insight can still solve the small cases.
The skilled approach requires insight into the ways of counting combinations and permutations. To illustrate the method I will use an example. Inputs:
N = 7
2 < 4
0 < 3
3 < 6
We first simplify this by combining the dependent conditions into a single condition, as follows:
2 < 4
0 < 3 < 6
Start with the longest condition, and determine the combination count of the gaps (this is the key insight). For example, some of the combinations are as follows:
XXXX036
XXX0X36
XXX03X6
XXX036X
XX0XX36
etc.
Now, you can see there are 4 gaps: ? 0 ? 3 ? 6 ?. We need to count the possible ways of distributing the four X's among these four gaps. This is the same as choosing which 3 of the 7 positions hold 0, 3 and 6 (their relative order is fixed), so the count is (7 choose 3) = 35. Next, we multiply by the number of combinations for the next condition, 2 < 4, over the remaining blank spots (the X's). We can multiply because this condition is fully independent of the 0 < 3 < 6 condition. This count is (4 choose 2) = 6. Finally, the 2 remaining unconstrained values (1 and 5) go in the 2 remaining spots in 2! = 2 ways. Thus, the answer is 35 x 6 x 2 = 420.
Now, let's make it a little more complicated. Add the condition:
1 < 6
The way this changes the calculation is that before 036 had to appear in that order. But, now, we have three possible arrangements:
1036
0136
0316
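Thus, the total count is now (7 choose 4) x 3 x (3 choose 2) = 35 x 3 x 3 = 315: choose 4 of the 7 positions for 0, 1, 3 and 6, pick one of their 3 allowed orders, then place 2 < 4 in 2 of the 3 remaining spots; the last value, 5, takes the one spot left.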
So, to recap, the procedure is you isolate the problem into independent conditions. For each independent condition you calculate the combinations of partitions, then you multiply them together.
I have walked through this example manually, but you can program the same procedure.
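As a sanity check on the two worked examples, here is a minimal Python sketch that compares the brute-force count with the products computed above. It assumes that a constraint a < b means "a appears before b in the permutation"; the function name count_valid is made up for this illustration, and the brute force is only practical for small N.

```python
from itertools import permutations
from math import comb, factorial

def count_valid(n, constraints):
    """Count orderings of 0..n-1 in which a appears before b for every (a, b)."""
    total = 0
    for perm in permutations(range(n)):
        # Position of each value within this permutation.
        pos = {value: i for i, value in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in constraints):
            total += 1
    return total

first = [(2, 4), (0, 3), (3, 6)]      # 2 < 4 and 0 < 3 < 6
second = first + [(1, 6)]             # the extra condition 1 < 6

print(count_valid(7, first), comb(7, 3) * comb(4, 2) * factorial(2))   # 420 420
print(count_valid(7, second), comb(7, 4) * 3 * comb(3, 2))             # 315 315
```

The brute force visits all 7! = 5040 permutations, which is fine here but grows factorially, so it is only useful for checking the counting argument.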

Hash tables runtime complexity for chaining with 2 hash function

This question deals with collisions in a new approach to chaining in hash tables.
There are 2 hash functions. The first function is h1(x) = x mod m1.
With this function all the items are hashed into the primary hash table.
Each index of the primary hash table holds an internal hash table that hashes the key with the second function: h2(x) = x mod m2 (with m1 != m2).
For example, let's say I have m1 = 5 and m2 = 3
and I want to insert 2: h1(2) = 2 mod 5 = 2 and h2(2) = 2 mod 3 = 2.
This means 2 will be inserted at index 2 of the primary table, at index 2 of the internal table.
When a collision happens in the primary table (meaning h1(x) = x % m1 = y % m1 = h1(y)), we go to the second hash function h2, calculate h2(x) and h2(y), and put each key at the corresponding index of the internal hash table. Say h1(x) = x % 5 and h2(x) = x % 3. If we insert 7 and 12, we get h1(7) = 2 and h1(12) = 2, so both go to index 2 of the primary hash table. Then we compute h2 for both (h2(7) = 1 and h2(12) = 0), which means we put 7 at index 1 and 12 at index 0 of the internal table (and by this we avoid the collision).
This was an exam question. The first part asked whether there is a collision among the numbers 0, 5, 15, 17 (with m1 = 5 and m2 = 3); clearly 0 and 15 have the same remainder modulo both 5 and 3. The second part asked for the worst-case runtime complexity of search, and the third part asked for 5 numbers that produce the worst case when searching for the number 2 in the table.
My question is: what is the worst-case runtime complexity of search?
And what is an example of 5 numbers that cause the worst case when searching for the number 2?
I think the complexity is O(1), and I used these 5 numbers:
7 12 17 22 42
Is this correct? Can anybody help with this?
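To make the scheme easier to reason about, here is a small Python sketch of the two-level table as described. It assumes each inner slot holds a plain list (a chain) of the keys that agree on both h1 and h2; the names build_table and chain_for are made up for this illustration and are not part of the exam question.

```python
def build_table(keys, m1=5, m2=3):
    """Two-level table: outer index is key % m1, inner index is key % m2.

    Assumption: each inner slot is a list holding every key mapped to it.
    """
    table = [[[] for _ in range(m2)] for _ in range(m1)]
    for key in keys:
        table[key % m1][key % m2].append(key)
    return table

def chain_for(table, key):
    """Return the inner chain that a search for `key` would have to scan."""
    m1, m2 = len(table), len(table[0])
    return table[key % m1][key % m2]

table = build_table([7, 12, 17, 22, 42])
print(table[2])             # [[12, 42], [7, 22], [17]]
print(chain_for(table, 2))  # [17]
```

With m1 = 5 and m2 = 3, a key ends up in the same inner slot as 2 only if it leaves remainder 2 modulo both 5 and 3, so the chain scanned when searching for 2 contains only such keys.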

Prolog List Neighbour of a Element

I am having problems with lists in Prolog. I want to turn the first list into the second:
[1,2,3,4,5]
[5,6,9,12,10]
You take a number, for example 3, and add it to its neighbours, so the operation is 2+3+4 = 9. For the first and the last element you pretend there is an imaginary 1 next to it.
I have this now:
sum_list([A,X,B|T], [Xs|Ts]):-
add(A,X,B,Xs),
sum_list([X,B|T], Ts).
I haven't considered the first and the last element. My problem is that I don't know how to get the previous and the next element, and then how to move on.
Note: I am not allowed to use meta-predicates.
Thanks.
I'm not sure how you calculated the first 5. The last 10 would be 4 + 5 + implicit 1. But following that calculation, the first element of your result should be 4 instead of 5?
Anyways, that doesn't really matter in terms of writing this code. You are actually close to your desired result. There are of course multiple ways of tackling this problem, but I think the simplest one would be to write a small 'initial' case in which you already calculate the first sum and afterwards recursively calculate all of the other sums. We can then write a case in which only 2 elements are left to calculate the last 'special' sum:
% Initial case for easily distinguishing the first sum
initial([X,Y|T], [Sum|R]) :-
    Sum is X+Y+1,
    others([X,Y|T], R).

% Match on 2 last elements left
others([X,Y], [Sum]) :-
    Sum is X+Y+1.

% Recursively keep adding neighbours
others([X,Y,Z|T], [Sum|R]) :-
    Sum is X+Y+Z,
    others([Y,Z|T], R).
Execution:
?- initial([1,2],Result).
Result = [4,4]
?- initial([1,2,3,4,5],Result).
Result = [4, 6, 9, 12, 10]
Note that we now don't have any cases (yet) for an empty list or a list with just one element in it. This still needs to be covered if necessary.
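For comparison, the same calculation can be sketched in Python by padding the list with the imaginary 1 on both sides. This is only an illustration of the logic, not a replacement for the Prolog predicates; the function name neighbour_sums is made up, and treating an empty list as [] and a single element x as 1 + x + 1 is an assumption about the uncovered cases.

```python
def neighbour_sums(xs):
    """Sum each element with its two neighbours, using 1 beyond both ends."""
    padded = [1] + list(xs) + [1]
    # Index i in padded corresponds to element i-1 of xs.
    return [padded[i - 1] + padded[i] + padded[i + 1]
            for i in range(1, len(padded) - 1)]

print(neighbour_sums([1, 2, 3, 4, 5]))  # [4, 6, 9, 12, 10]
print(neighbour_sums([1, 2]))           # [4, 4]
```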

How to write loop across Hierarchical Data (household-individual) in stata?

I'm now working on a household survey data set and I'd like to give certain members extra IDs according to their relationship to the household head. More specifically, I need to identify the adult children of household head and his/her spouse, if married, and assign them "sub-household IDs".
The variables are: hhid - household ID; pid -individual ID; relhead - relationship with head.
Regarding relhead, a 1 represents the head, a 6 represents a child, and a 7 represents a child-in-law. Below some example data, including in the last column the desired outcome. I assume that whenever a 6 is followed by a 7, they constitute a couple and belong to the same sub-household.
hhid  pid  relhead  sub_hhid (desired)
50    1    1        1
50    2    3        1
50    3    6        2
50    4    6        3
50    5    7        3
-----------------------------------------------
67    1    1        1
67    3    6        2
67    4    7        2
Here are some thoughts:
There may be married and unmarried adult children within one household, and the family structure is a little complicated, so I want to write a loop across the members of a household.
The basic idea is that in the outer loop we identify the children staying at home and then check whether a spouse is present. If there is one, we give the couple an indicator; if not, we continue and give the single stay-at-home child another indicator. After walking through all the possible members within a household, we get a series of within-household IDs. To facilitate further analysis, I need some kind of external ID variable to separate the sub-families.
* Define N as the total number of households, n as the number of individuals in a household
* sty_chil is an indicator for an adult child living with the parents (head)
* sty_chil_sp is an indicator for the adult child's spouse
* "hid" and "ind_id" are local macros
* (pseudo code, not runnable Stata)
forvalue hid = 1/N {
    forvalue ind_id = 1/n {
        if sty_chil[`ind_id'] == 1 {
            if sty_chil_sp[`ind_id' + 1] == 1 {
                * a 6-7 pair, identified as a couple
                assign sub_hhid to this couple
            }
            else {
                * a single 6, identified as a single child
                assign sub_hhid to this child
            }
        }
        else {
            * relationships other than 6: move on to the next member of the household
        }
    }
    * move on to the next household
}
The built-in Stata by, sort: is pretty powerful, but here I want to treat only the family members who meet a certain criterion and leave the others untouched, so an if-else type loop is more natural for me (even if by: may achieve my goal, it becomes too tricky when the situation is not so simple, and we cannot exhaust all the possible household patterns).
An immediate problem is that I don't know how to write a loop across household IDs and individual IDs, because I used to obtain the household size (the increment of the outer loop) using the by command (I'm not sure whether in this case it's 1 or the number of family members). I'm also not sure whether mixing by and if loops is good programming practice; I favor writing a "full loop" in this case. Please give me some clues on how to achieve my goal and provide illustrative pseudo code.
An extra question: I cannot find the ado file that contains the by command. Does it exist?
I will abstract from the issue of whether the assumption used to create matches is a sensible one or not. Rather, let this be an example of reaching the desired results without using explicit loops. Some logic and the use of subscripting (see help subscripting) can get you far.
clear
set more off
*----- example data -----
input ///
hhid pid relhead sub_hhid
50 1 1 1
50 3 6 2
50 4 6 3
50 5 7 3
67 1 1 1
67 3 6 2
67 4 7 2
67 5 6 3
end
list, sepby(hhid)
*----- what you want -----
bysort hhid (pid): gen hhid2 = sum( !(relhead == 7 & relhead[_n-1] == 6) )
list, sepby(hhid)
As you can see, one line of code gets you there. The reasoning is the following:
sum() gives the running sum. The arguments to sum(), being conditions, can either be True or False. The ! denotes the logical not (see help operators).
If it is not the case that the relationship is daughter/son-in-law AND the previous relationship is daughter/son, the condition evaluates to True and takes on the value of 1, increasing the running sum by 1. If it evaluates to False, meaning that the relationship is daughter/son-in-law AND the previous relationship is daughter/son, then it takes on the value of 0 and the running sum will not increase. This gives the result you seek.
You do this using the by: prefix, since you want to check each original household independently, so to speak.
For the first observation of each original household, the condition always evaluates to True. This is because there exists no "previous" observation (relationship), so Stata considers relhead[_n-1] to be missing (., a very large number) and therefore not equal to 6. This takes the running sum from 0 to 1 for the first observation of each household, and so on.
Bottom line: learn how to use by: and take advantage of the features offered by Stata. Do not swim against the current; not here.
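For readers who do not use Stata, the running-sum logic can be sketched in Python. This is only an illustration of the reasoning, using the example data from this answer; the function name sub_household_ids is made up, and sorting the tuples plays the role of bysort hhid (pid).

```python
from itertools import groupby

# (hhid, pid, relhead), copied from the example data in this answer
rows = [
    (50, 1, 1), (50, 3, 6), (50, 4, 6), (50, 5, 7),
    (67, 1, 1), (67, 3, 6), (67, 4, 7), (67, 5, 6),
]

def sub_household_ids(rows):
    """Within each household, count up unless a 7 directly follows a 6."""
    result = []
    for hhid, members in groupby(sorted(rows), key=lambda r: r[0]):
        running, prev_rel = 0, None   # no "previous" member at the start
        for _, pid, rel in members:
            joins_previous = (rel == 7 and prev_rel == 6)
            if not joins_previous:
                running += 1          # condition is True: running sum goes up
            result.append((hhid, pid, running))
            prev_rel = rel
    return result

for row in sub_household_ids(rows):
    print(row)
```

For household 50 this yields 1, 2, 3, 3 and for household 67 it yields 1, 2, 2, 3.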
Edit
Please note that instead of progressively changing your example data set, you should provide a representative example from the beginning. Not doing so can render answers that are initially OK, completely inadequate.
For your modified example, add:
replace hhid2 = 1 if !inlist(relhead,6,7)
That will simply assign anyone not 6 or 7 to the same household as the head. The head is assumed to always have hhid2 == 1. If the head can have hhid2 != 1, then
bysort hhid (relhead): replace hhid2 = hhid2[1] if !inlist(relhead,6,7)
should work.
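You can follow with:
bysort hhid (pid): gen hhid2b = sum(hhid2 != hhid2[_n-1])
replace hhid2 = hhid2b
drop hhid2b
to renumber the IDs consecutively within each household (the running sum is built in a new variable because replace works through the observations in order, so a comparison with hhid2[_n-1] inside replace would see values that have already been changed), but because they are IDs, it's not really necessary.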
Finally, use:
gen hhid3 = string(hhid) + "_" + string(hhid2)
to create IDs with the form 50_1, 50_2, 50_3, etc.
Like I said before, if your data presents more complications, you should present a relevant example.

How do I calculate the maximum or minimum seen so far in a sequence, and its associated id?

From this Stata FAQ, I know the answer to the first part of my question. But here I'd like to go a step further. Suppose I have the following data (already sorted by a variable not shown):
id v1
A 9
B 8
C 7
B 7
A 5
C 4
A 3
A 2
To calculate the minimum in this sequence, I do
generate minsofar = v1 if _n==1
replace minsofar = min(v1[_n-1], minsofar[_n-1]) if missing(minsofar)
To get
id v1 minsofar
A 9 9
B 8 9
C 7 8
B 7 7
A 5 7
C 4 5
A 3 4
A 2 3
Now I'd like to generate a variable, call it id_min that gives me the ID associated with minsofar, so something like
id v1 minsofar id_min
A 9 9 A
B 8 9 A
C 7 8 B
B 7 7 C
A 5 7 C
C 4 5 A
A 3 4 C
A 2 3 A
Note that C is associated with 7, because 7 is first associated with C in the current sorting. And just to be clear, my ID variable here shows as a string variable just for the sake of readability -- it's actually numeric.
Ideas?
EDIT:
I suppose
gen id_min = id if _n<=2
replace id_min = id[_n-1] if v1[_n-1]<minsofar[_n-1] & missing(id_min)
replace id_min = id_min[_n-1] if missing(id_min)
does the job at least for the data in this example. Don't know if it would work for more complex cases.
This works for your example. It uses the user-written command vlookup, which you can install by running findit vlookup and following the link that appears.
clear
set more off
input ///
str1 id v1
A 9
B 8
C 7
B 7
A 5
C 4
A 3
A 2
end
encode id, gen(id2)
order id2
drop id
list
*----- what you want -----
// your code
generate minsofar = v1 if _n==1
replace minsofar = min(v1[_n-1], minsofar[_n-1]) if missing(minsofar)
// save original sort
gen osort = _n
// group values of v1 but respecting original sort so values of
// id2 don't jump around
sort v1 osort
// set obs after first as missing so id2 is unique within v1
gen v2 = v1
by v1: replace v2 = . if _n > 1
// lookup
vlookup minsofar, gen(idmin) key(v2) value(id2)
// list
sort osort
drop osort v2
list, sep(0)
Your code has generate minsofar = v1 if _n==1 which is better coded as generate minsofar = v1 in 1, because it is more efficient.
Your minsofar variable is just a displaced copy of v1, so if this is always the case, there should be simpler ways of handling your problem. I suspect your problem is easier than you have acknowledged until now, and that has come through your post. Perhaps giving more context, expanded example data, etc. could get you better advice.
This is both easier and a little more challenging than implied so far. Given value (a little more evocative than the OP's v1) and a desire to keep track of minimum so far, that's for example
generate min_so_far = value[1]
replace min_so_far = min(value, min_so_far[_n-1]) in 2/L
where the second statement exploits the unsurprising fact that Stata replaces in the current order of observations. [_n-1] is the index of the previous observation and in 2/L implies a loop over all observations from the second to the last.
Note that the OP's version is buggy: by always looking at the previous observation, the code never looks at the very last value and will overlook that if it is a new minimum. It may be that the OP really wants "minimum before now" but that is not what I understand by "minimum so far".
If we have missing values in value they will not enter the comparison in any malign way: missing is always regarded as arbitrarily large by Stata, so missings will be recorded if and only if no non-missings are present so far, which is as it should be.
The identifier of that minimum at first sight yields to the same logic
generate min_so_far = value[1]
gen id_min = id[1]
replace min_so_far = min(value, min_so_far[_n-1]) in 2/L
replace id_min = cond(value < min_so_far[_n-1], id, id_min[_n-1]) in 2/L
There are at least two twists that might bite. The OP mentions a possibility that the identifier might be missing so that we might have a new minimum but not know its identifier. The code just given will use a missing identifier, but if the desire is to keep separate track of the identifier of the minimum value with known identifiers, different code is needed.
A twist not mentioned to date is that observations with different identifiers might all have the same minimum so far. The code above replaces the identifier only the first time a particular minimum is seen; if the desire is to record the identifier of the last occurrence, the < in the last code line above should be replaced with <=. If the desire is to keep track of all the identifiers of the minimum so far, then a string variable is needed to concatenate all the identifiers.
With a structure of panel or longitudinal data the whole thing is done under the aegis of by:.
I can't see a need to resort to user-written extensions here.
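The same bookkeeping can be sketched outside Stata. Below is a minimal Python illustration of "minimum so far, plus the identifier of its first occurrence", run on the example data from the question; the function name min_so_far_with_id is made up, and the strict < means ties keep the earlier identifier, as discussed above.

```python
def min_so_far_with_id(ids, values):
    """Return (minimum so far, id of its first occurrence) for each position."""
    out = []
    best_value, best_id = None, None
    for ident, value in zip(ids, values):
        # Strict comparison: a tie keeps the identifier seen first.
        if best_value is None or value < best_value:
            best_value, best_id = value, ident
        out.append((best_value, best_id))
    return out

ids = ["A", "B", "C", "B", "A", "C", "A", "A"]
values = [9, 8, 7, 7, 5, 4, 3, 2]
for row in min_so_far_with_id(ids, values):
    print(row)
```

Note that this includes the current observation in the minimum, so it differs from the lagged minsofar column in the question, which only looks at earlier observations.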

Time based rotation

I'm trying to figure out the best way of doing the following:
I have a list of values: L
I'd like to pick a subset of this list, of size N, and get a different subset (if the list has enough members) every X minutes.
I'd like the values to be picked sequentially, or randomly, as long as all the values get used.
For example, I have a list: [google.com, yahoo.com, gmail.com]
I'd like to pick N (2 for this example) values and rotate those values every X (60 for now) minutes:
minute 0-59: [google.com, yahoo.com]
minute 60-119: [gmail.com, google.com]
minute 120-179: [google.com, yahoo.com]
etc.
Random picking is also fine, i.e:
minute 0-59: [google.com, gmail.com]
minute 60-119: [yahoo.com, google.com]
Note: The time epoch should be 0 when the user sets the rotation up, i.e, the 0 point can be at any point in time.
Finally: I'd prefer not to store a set of "used" values or anything like that, if possible. i.e, I'd like this to be as simple as possible.
Random picking is actually preferred to sequential, but either is fine.
What's the best way to go about this? Python/Pseudo-code or C/C++ is fine.
Thank you!
You can use the itertools standard module to help:
import itertools
import random
import time
a = ["google.com", "yahoo.com", "gmail.com"]
combs = list(itertools.combinations(a, 2))
random.shuffle(combs)
for c in combs:
    print(c)
    time.sleep(3600)
EDIT: Based on your clarification in the comments, the following suggestion might help.
What you're looking for is a maximal-length sequence of integers within the range [0, N). You can generate this in Python using something like:
def modseq(n, p):
    r = 0
    for i in range(n):
        r = (r + p) % n
        yield r
Given an integer n and a prime number p (which is not a factor of n, making p greater than n guarantees this), you will get a sequence of all the integers from 0 to n-1:
>>> list(modseq(10, 13))
[3, 6, 9, 2, 5, 8, 1, 4, 7, 0]
From there, you can filter this list to include only the integers that have the desired number of 1 bits set (see Best algorithm to count the number of set bits in a 32-bit integer? for suggestions). Then choose the elements from your set based on which bits are set to 1. In your case, you would pass n as 2^N, where N is the number of elements in your set.
This sequence is deterministic given a time T (from which you can find the position in the sequence), a number N of elements, and a prime P.
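Putting these pieces together, here is a rough Python sketch of the idea, taking n as 2 to the power of the number of elements. The helper names subsets_in_order and current_subset, the default p=13, and the parameters period_s, start and now (all in seconds) are assumptions made for this illustration; any odd prime works for p, because it is then coprime to a power of two.

```python
def modseq(n, p):
    r = 0
    for _ in range(n):
        r = (r + p) % n
        yield r

def subsets_in_order(items, k, p=13):
    """All k-element subsets of items, in the order modseq visits their bitmasks."""
    n = 2 ** len(items)
    masks = [m for m in modseq(n, p) if bin(m).count("1") == k]
    # Bit i of the mask decides whether items[i] is included.
    return [[x for i, x in enumerate(items) if (m >> i) & 1] for m in masks]

def current_subset(items, k, period_s, start, now, p=13):
    """Subset for the time window containing `now`; nothing is stored between calls."""
    subsets = subsets_in_order(items, k, p)
    window = int((now - start) // period_s)
    return subsets[window % len(subsets)]

sites = ["google.com", "yahoo.com", "gmail.com"]
print(subsets_in_order(sites, 2))
# [['google.com', 'gmail.com'], ['yahoo.com', 'gmail.com'], ['google.com', 'yahoo.com']]
print(current_subset(sites, 2, 3600, start=0, now=3600))
# ['yahoo.com', 'gmail.com']
```

In practice now would come from time.time() and start would be the moment the user set the rotation up. Recomputing the subset list on every call keeps the function stateless, at the cost of enumerating all 2^N masks, so this is only sensible for short lists.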