Looking for hints to solve this dynamic programming problem - c++

I am trying to improve my problem solving skills for programming interviews and am trying to solve this problem. I have a feeling it can be solved using dynamic programming but the recursive relationship is not obvious to me.
To select the first three choir singers I simply use brute force, since there are only 20 choose 3 = 1140 ways to pick them. At first I thought dp[a][b][c] could represent the shortest song for three choir singers with remaining breath a, b, c, computed as dp[a][b][c] = 1 + dp[a - 1][b - 1][c - 1]. But what should be done when any of the indices equals 0, and which choir singer should be substituted in? Additionally, the dp array cannot be reused across starting choices. Say one instance starts with choir singers with breath a, b, c and a second instance with d, e, f. Once the first instance has been calculated and the dp array filled, the second instance may need to use a dp[i][j][k] computed by the first. That value depends on the choir singers available in the first instance, and the available singers in the two instances are not the same, so dp[i][j][k] may not be achievable in the second instance: the shortest song behind dp[i][j][k] may use choir singers who are already singing in the second instance.
I am out of ideas to tackle this problem and there is no solution anywhere. Could someone give me some hints to solve it?
Problem statement
We have N singers, who each have a certain time they can sing for and need 1 second to recover once out of breath. What is the shortest song they can sing, where three singers are singing at all times and where all three finish singing simultaneously?
Input:
3 < N <= 20
N integers Fi (1 <= Fi <= 10, for all 1 <= i <= N)

Here is the idea.
At each point in the song, the current state can be represented by who the singers are, how long they have been singing, and which ones are currently out of breath. From each state we need to transition to a new state: every singer who was out of breath is ready to sing again, every singer still singing is good for one less turn, and new singers might be chosen.
Done naively, there are up to 20 choose 3 = 1140 sets of singers, each singer can be in up to 10 current states, and up to 2 more singers may be out of breath. This is 175560000 combined states you can be in. That's too many; we need to be more clever to make this work.
Being more clever, we do not have 20 distinguishable singers. We have 10 buckets of singers based on how long they can sing for. If a singer can sing for 7 turns, they can't be in 10 states if currently singing, but only 7. And if two singers can both sing for 7 turns, we do not care whether they are at 4 and 3 turns left or at 3 and 4; the two situations are the same. This introduces a lot of symmetries. Once we take care of all of the symmetries, that reduces the number of possible states that we might be in from hundreds of millions to (usually) tens of thousands.
And now we have a state transition for our DP which is dp[state1] to dp[state2]. The challenge is to produce a state representation that takes advantage of these symmetries and that you can use as keys to your data structure.
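As a minimal sketch of what such a key could look like: sort the per-singer data so that interchangeable singers collapse to a single representation. The function name `canonical_state` and the tuple layout are assumptions made for illustration, not part of the answer above, and the fixed pool of all singers is assumed to be tracked separately.

```python
def canonical_state(singing, resting):
    """Return a hashable key that ignores the order of interchangeable singers.

    singing: iterable of (capacity, breath_left) pairs for the active singers.
    resting: iterable of capacities of the singers recovering this second.
    (Both layouts are assumptions made for this sketch.)
    """
    return (tuple(sorted(singing)), tuple(sorted(resting)))


# Two orderings of the same situation collapse to one key.
s1 = canonical_state([(7, 4), (7, 3), (5, 1)], [2])
s2 = canonical_state([(5, 1), (7, 3), (7, 4)], [2])
states = {s1, s2}
print(len(states))  # 1
```

Because the key is a tuple of tuples, it can be stored directly in a set or used as a dictionary key.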
UPDATE:
The main loop of the code would look like this Python:
    while not finished:
        song_length += 1
        next_states = set()
        for state in current_states:
            for next_state in transitions(state):
                if is_finished(next_state):
                    finished = True # Could break out of loops here
                else:
                    next_states.add(next_state)
        current_states = next_states
Most of the challenge is a good representation of a state, and your transitions function.

The state in terms of memoisation seems unrelated to the time elapsed since the start. Take any starting position,
a, b, c
where a, b, c are chosen magnitudes (how long each singer can hold their breath), and a is the smallest magnitude. We have
a, b, c
t = 0
and it's the same as:
0, b - a, c - a
t = a
So let's define the initial state with smallest magnitude a as:
b, c, ba, ca
where ba = b - a
ca = c - a
t = a
From here, every transition of the state is similar:
new_a <- x
where x is a magnitude in
the list that can be available
together with b and c. (We only
need to try each such unique
magnitude once during this
iteration. We must also prevent
a singer from repeating.)
let m = min(new_a, ba, ca)
then the new state is:
u, v, um, vm
t = t + m
where u and v are from the
elements of [new_a, b, c] that
aren't associated with m, and um
and vm are their pairs from
[new_a, ba, ca] that aren't m,
subtracted by m.
The state for memoisation of visited combinations can be only:
[(b, ba), (c, ca)] sorted by
the tuples' first element
with which we can prune a branch in the search if the associated t that is reached is equal to or higher than the minimal one seen for that state.
Example:
2 4 7 6 5
Solution (read top-down):
4 5 6
7 4 5
2
States:
u v um vm
5 6 1 2
t = 4
new_a = 7
m = min(7, 1, 2) = 1 (associated with 5)
7 6 6 1
t = 5
new_a = 4
m = min(4, 6, 1) = 1 (associated with 6)
4 7 3 5
t = 6
new_a = 5
m = min(5, 3, 5) = 3 (associated with 4)
5 7 2 2
t = 9
new_a = 2
m = min(2, 2, 2) = 2 (associated with 2)
5 7 0 0
t = 11
Python code:
    import heapq
    from itertools import combinations

    def f(A):
        # Count how many singers share each magnitude
        mag_counts = {}
        for x in A:
            if x in mag_counts:
                mag_counts[x] = mag_counts[x] + 1
            else:
                mag_counts[x] = 1
        q = []
        seen = set()
        # Initialise the queue with unique starting combinations
        for comb in combinations(A, 3):
            sorted_comb = tuple(sorted(comb))
            if sorted_comb not in seen:
                (a, b, c) = sorted_comb
                heapq.heappush(q, (a, (b-a, b), (c-a, c), a))
                seen.add(sorted_comb)
        while q:
            # Pop the state with the smallest elapsed time t
            (t, (ba, b), (ca, c), prev) = heapq.heappop(q)
            if ba == 0 and ca == 0:
                return t
            for mag in mag_counts.keys():
                # Check that the magnitude is available
                # and the same singer is not repeating.
                [three, two] = [3, 2] if mag != prev else [4, 3]
                if mag == b == c and mag_counts[mag] < three:
                    continue
                elif mag == b and mag_counts[mag] < two:
                    continue
                elif mag == c and mag_counts[mag] < two:
                    continue
                elif mag == prev and mag_counts[mag] < 2:
                    continue
                m = min(mag, ba, ca)
                if m == mag:
                    heapq.heappush(q, (t + m, (ba-m, b), (ca-m, c), m))
                elif m == ba:
                    heapq.heappush(q, (t + m, (mag-m, mag), (ca-m, c), b))
                else:
                    heapq.heappush(q, (t + m, (mag-m, mag), (ba-m, b), c))
        return float('inf')

    As = [
        [3, 2, 3, 3], # 3
        [1, 2, 3, 2, 4], # 3
        [2, 4, 7, 6, 5] # 11
    ]
    for A in As:
        print(A, f(A))

Related

Algorithm to get best combination

I have items with ID 1, 3, 4, 5, 6, 7. Now I have data like following.
There is an offerId for each row. Array of Ids consist of combination of the ID in an array. Discount is the value for that offerId
offerId : Array of Ids : Discount
o1 : [1] : 45
o2 : [1 3 4] : 100
o3 : [3 5] : 55
o4 : [5] : 40
o5 : [6] : 30
o6 : [6 7] : 20
Now I have to select all the offerIds which give me best combination of Ids i.e. maximum total discount.
For example in above case : possible results can be:
[o2, o4, o5] maximum discount is 170(100 + 40 + 30).
Note: the resulting offerIds should be such that Ids don't repeat. For example, for o2, o4, o5 the ids are [1,3,4], [5], [6], which are all distinct.
Another combination can be:
o1, o3, o6, for which the ids are [1], [3,5], [6,7]. However, the total is 120 (45+55+20), which is less than 170 as in the previous case.
I need an algorithm/code which will help me to identify combination of offerIds which will give maximum discount , considering that each offer should contain distinct Ids.
NOTE I am writing my code in go language. But solutions/Logic in any language will be helpful.
NOTE : I hope I am able to explain my requirement properly. please comment if any extra information is required. Thanks.
Here is a dynamic programming solution which, for every possible subset of IDs, finds the combination of offers for which the discount is maximum possible.
This will be pseudocode.
Let our offers be structures with fields offerNumber, setOfItems and discount.
For the purposes of implementation, we first renumber the possible items by integers from zero to the number of different possible items (say k) minus one.
After that, we can represent setOfItems by a binary number of length k.
For example, if k = 6 and setOfItems = 101110 in binary, this set includes items 5, 3, 2 and 1 and excludes items 4 and 0, since bits 5, 3, 2 and 1 are ones and bits 4 and 0 are zeroes.
Now let f[s] be the best discount we can get using exactly set s of items.
Here, s can be any integer between 0 and 2^k - 1, representing one of the 2^k possible subsets.
Furthermore, let p[s] be the list of offers which together allow us to get discount f[s] for the set of items s.
The algorithm goes as follows.
    initialize f[0] to zero, p[0] to empty list
    initialize f[>0] to minus infinity
    initialize bestF to 0, bestP to empty list
    for each s from 0 to 2^k - 1:
        for each o in offers:
            if s & o.setOfItems == o.setOfItems: // o.setOfItems is a subset of s
                if f[s] < f[s - o.setOfItems] + o.discount: // minus is set subtraction
                    f[s] = f[s - o.setOfItems] + o.discount
                    p[s] = p[s - o.setOfItems] append o.offerNumber
        if bestF < f[s]:
            bestF = f[s]
            bestP = p[s]
After that, bestF is the best possible discount, and bestP is the list of offers which get us that discount.
The complexity is O(|offers| * 2^k) where k is the total number of items.
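For readers who want something runnable, here is a minimal Python sketch of the "backward" subset DP, applied to the offers from the question. The function name `best_offers` and the `(name, ids, discount)` tuple layout are assumptions made for illustration; the asker's Go code would look different.

```python
def best_offers(offers):
    """offers: list of (name, item_ids, discount) tuples (layout assumed here).
    Returns (best_discount, list_of_offer_names)."""
    # Renumber the items to 0..k-1 so every item set fits in a k-bit mask.
    items = sorted({i for _, ids, _ in offers for i in ids})
    index = {item: pos for pos, item in enumerate(items)}
    k = len(items)
    masks = []
    for name, ids, discount in offers:
        mask = 0
        for i in ids:
            mask |= 1 << index[i]
        masks.append((name, mask, discount))

    neg_inf = float("-inf")
    f = [neg_inf] * (1 << k)   # f[s]: best discount using exactly item set s
    p = [None] * (1 << k)      # p[s]: offers achieving f[s]
    f[0], p[0] = 0, []
    best_f, best_p = 0, []
    for s in range(1 << k):
        for name, mask, discount in masks:
            if s & mask == mask:            # the offer's items are a subset of s
                rest = s & ~mask            # set subtraction
                if f[rest] != neg_inf and f[s] < f[rest] + discount:
                    f[s] = f[rest] + discount
                    p[s] = p[rest] + [name]
        if best_f < f[s]:
            best_f, best_p = f[s], p[s]
    return best_f, best_p


offers = [
    ("o1", [1], 45),
    ("o2", [1, 3, 4], 100),
    ("o3", [3, 5], 55),
    ("o4", [5], 40),
    ("o5", [6], 30),
    ("o6", [6, 7], 20),
]
best, chosen = best_offers(offers)
print(best, sorted(chosen))  # 170 ['o2', 'o4', 'o5']
```

Storing a full offer list per subset keeps the sketch short; for larger k, a single back-pointer per subset would use less memory.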
Here is another implementation which is asymptotically the same, but might be faster in practice when most subsets are unreachable.
It is "forward" instead of "backward" dynamic programming.
    initialize f[0] to zero, p[0] to empty list
    initialize f[>0] to -1
    initialize bestF to 0, bestP to empty list
    for each s from 0 to 2^k - 1:
        if f[s] >= 0: // only for reachable s
            if bestF < f[s]:
                bestF = f[s]
                bestP = p[s]
            for each o in offers:
                if s & o.setOfItems == 0: // s and o.setOfItems don't intersect
                    if f[s + o.setOfItems] < f[s] + o.discount: // plus is set addition
                        f[s + o.setOfItems] = f[s] + o.discount
                        p[s + o.setOfItems] = p[s] append o.offerNumber

Mathematica collecting elements in a list

Given the 256 tuples generated from:
Tuples[{a,b,c,d},4] = {{a,a,a,a},{a,a,a,b}...,{d,d,d,d}}
I would like to filter all of the tuples that have exactly 3 of a kind. For example, I want to keep {c,b,c,c} & {a,a,d,a} etc.. but not {d,d,d,d} or {a,b,b,c}.
I know there are:
Binomial[4,3]*4*3 = 48
such tuples from simple maths. But I am looking for a programmatic way of counting these.
My final goal is from the tuples:
Tuples[{1,2,3,...,n},k]
I would like to know how many of those tuples have exactly one subset with m of a kind, with all other subgroups of a kind having size less than m.
In case you are interested, this problem spawned from asking: What is the average number of rounds played before there is a winner in the game "Cards Against Humanity"? Assuming we have n players and the first person with x cards wins.
This will find your 48 tuples
Select[Tuples[{a, b, c, d}, 4],
MatchQ[Sort[#], {a_, a_, a_, b_} | {b_, a_, a_, a_}] &&
Length[Union[#]] != 1 &]
This will show you the tuples of four items over 1,...,6 with m identical items and all other items appearing less than m times.
m = 2;
f[v_] := Module[{runlens},
  runlens = Sort[Map[Length, Split[Sort[v]]]];
  runlens[[-1]] == m && If[Length[runlens] == 1, True, runlens[[-2]] < m]
];
Select[Tuples[Range[6], 4], f]
Use Count on that result and you know how many you have.
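As a cross-check outside Mathematica, the same count can be sketched by brute force in Python. The function name `count_exact_m` is an assumption made for illustration; it applies the same run-length test as f above to every tuple.

```python
from collections import Counter
from itertools import product


def count_exact_m(symbols, k, m):
    """Count k-tuples over symbols in which exactly one value occurs m times
    and every other value occurs fewer than m times."""
    total = 0
    for t in product(symbols, repeat=k):
        counts = sorted(Counter(t).values())
        top_is_m = counts[-1] == m
        top_is_unique = len(counts) == 1 or counts[-2] < m
        if top_is_m and top_is_unique:
            total += 1
    return total


print(count_exact_m("abcd", 4, 3))       # 48
print(count_exact_m(range(1, 7), 4, 2))  # 720
```

The first result agrees with Binomial[4,3]*4*3 = 48 from the question. Enumeration grows as n^k, so this is only practical for small inputs.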
Another approach:
Select[ Tuples[{a, b, c, d}, 4] ,
((Count[#, 3] == 1 && Max[#] == 3) &@ Tally[#][[All, 2]] ) & ]
Of course if the set size is greater than half the list length it is redundant to check both Max and Count

Make a one sequence if possible [closed]

A sequence of integers is a one-sequence if the difference between any two consecutive numbers in this sequence is -1 or 1 and its first element is 0.
More precisely: a1, a2, ..., an is a one-sequence if:
For any k (1 ≤ k < n): |a[k] - a[k+1]|=1,
a[1]=0
Given n and s, the sum of all elements in a, we need to construct a one-sequence with the given parameters.
For example, if n=8 and s=4 then one such sequence is [0 1 2 1 0 -1 0 1].
Note that if no such sequence exists for the given n and s, we need to report that it is not possible. Otherwise we need to output any one such sequence. How can this be done? Please help.
Here's another take on aioobe's algorithm, with a formal proof of correctness.
Given a sequence a(k), define the difference sequence d(k) = a(k+1) - a(k) and observe that a(1) + a(2) + ... + a(n) = (n-1)d(1) + (n-2)d(2) + ... + 1d(n-1).
Theorem: for parameters n and s, there exists a length-n one-sequence summing to s if and only if (1) n(n-1)/2 mod 2 = s mod 2 and (2) |s| ≤ n(n-1)/2.
Proof: by induction on n. The base case, n = 1, is trivial. Inductively, since d(k) ∈ {±1}, we observe that both (1) and (2) are necessary conditions, as n-1 + n-2 + ... + 1 = n(n-1)/2 and -1 mod 2 = 1 mod 2. Conversely, assume both (1) and (2). If s ≥ 0, then construct a length-(n-1) sequence summing to s - (n-1). If s < 0, then construct a length-(n-1) sequence summing to s + (n-1). Both (1) and (2) are satisfied for these constructions (some tedious case analysis omitted), so it follows from the inductive hypothesis that they succeed. Increase/decrease the elements of this sequence by one depending on whether s ≥ 0/s < 0 and put 0 at the beginning.
Since the proof of the theorem is constructive, we can implement it in Python.
    def oneseq(n, s):
        assert isinstance(n, int)
        assert isinstance(s, int)
        nchoose2 = n*(n-1)//2
        abss = abs(s)
        if n < 1 or abss%2 != nchoose2%2 or abss > nchoose2: return None
        a = [0]
        for k in range(n-1, 0, -1): # n-1, n-2, ..., 1
            d = 1 if s >= 0 else -1
            a.append(a[-1] + d) # a[-1] equivalent to a[len(a) - 1]
            s -= k*d
        return a
First, whether it's possible to solve can be decided up front. Since you go either +1 or -1 in each step, the elements go from even, to odd, to even, to odd... So the sum s must have the same parity as the number of odd-valued positions, floor(n/2), which is also the parity of n(n-1)/2. The reachable range is simple as well: ±(1+2+3+...+(n-1)).
Second, if you draw the "decision tree" on whether to go up (+1) or down (-1) in each step, and draw the accumulated sum in each node, you'll see that you can do a kind of binary search to find the sum at one of the leaves in the tree.
You go +1 if you're about to undershoot, and go -1 if you're about to overshoot. The tricky part is to figure out if you're going to undershoot/overshoot. Your current "state" should be computed by
"what I have so far" + "what I'll get for free by staying at this level for the rest of the array".
What you have "for free by staying at this level" is stepsLeft * previousValue.
Here's some pseudo code.
    solve(stepsLeft, prev, acc) {
        if stepsLeft == 0, return empty list // base case
        ifIStayHere = acc + prev*stepsLeft
        step = ifIStayHere > s ? prev-1 : prev+1
        return [step] concatenated by solve(stepsLeft-1, step, acc+step)
    }
Note that this solution does not include the initial 0, so call it with stepsLeft = n-1.
As you can see, it's Θ(n) and it works for all cases I've tested. (Implemented it in Java.)
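A minimal iterative Python sketch of this greedy rule follows. The function name `one_sequence_greedy` is an assumption made for illustration (the answer's own implementation was in Java and is not shown), and unlike the pseudocode this version includes the initial 0 and returns None when the constructed sequence does not reach the requested sum.

```python
def one_sequence_greedy(n, s):
    """Step down if staying at the current level would overshoot s,
    otherwise step up. Returns the sequence, or None if its sum is not s."""
    seq = [0]
    acc = 0  # sum of the elements chosen so far
    for steps_left in range(n - 1, 0, -1):
        prev = seq[-1]
        if_i_stay_here = acc + prev * steps_left
        step = prev - 1 if if_i_stay_here > s else prev + 1
        seq.append(step)
        acc += step
    return seq if acc == s else None


print(one_sequence_greedy(8, 4))  # [0, 1, 0, 1, 0, 1, 0, 1]
print(one_sequence_greedy(8, 5))  # None (5 has the wrong parity for n = 8)
```

The final check on the sum is a safety net rather than a proof: it guarantees that anything returned is a valid answer, without claiming that the greedy rule finds one whenever one exists.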

How to do a set difference, except without eliminating repeated elements

I am trying to do the following in Matlab. Take two lists of numbers, possibly containing repeated elements, and subtract one set from the other set.
Ex: A=[1 1 2 4]; B=[1 2 4];
Desired result would be A-B=C=[1]
Or, another example, E=[3 3 5 5]; F=[3 3 5];
Desired result would be E-F=G=[5]
I wish I could do this using Matlab's set operations, but their function setdiff does not respect the repeated elements in the matrices. I appreciate that this is correct from a strict set theory standpoint, but would nevertheless like to tackle problems like: "I have 3 apples and 4 oranges, and you take 2 apples and 1 orange, how many of each do I have left." My range of possible values in these sets is in the thousands, so building a large matrix for tallying elements and then subtracting matrices does not seem feasible for speed reasons. I will have to do thousands of these calculations with thousands of set elements during a gui menu operation.
Example of what I would like to avoid for tackling the second example above:
E=[0 0 2 0 2]; F=[0 0 2 0 1];
G=E-F=[0 0 0 0 1];
Thanks for your help!
This can be done with the accumarray command.
A = [1 1 2 4]';
B = [1 2 4]'; % <-make these column vectors
X = accumarray(A, 1);
Y = accumarray(B, 1);
This will produce the output
X = [2 1 0 1]'
and
Y = [1 1 0 1]'
Where X(i) represents the number of occurrences of the number i in vector A, and Y(i) represents the number of occurrences of the number i in vector B.
Then you can just take X - Y.
One caveat: if the maximum values of A and B are different, the output from accumarray will have different lengths. If that is the case, you can just assign the output to be a subset of a vector of zeros that is the size of the larger vector.
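For comparison outside Matlab, the same tally-and-subtract idea can be sketched in Python with `collections.Counter`, which avoids the length mismatch because counts are keyed by value rather than by position. The function name `multiset_difference` is an assumption made for illustration.

```python
from collections import Counter


def multiset_difference(a, b):
    """Remove one occurrence from a for each matching occurrence in b."""
    remaining = Counter(a) - Counter(b)  # counts that fall to zero or below are dropped
    return sorted(remaining.elements())


print(multiset_difference([1, 1, 2, 4], [1, 2, 4]))  # [1]
print(multiset_difference([3, 3, 5, 5], [3, 3, 5]))  # [5]
```

The work is proportional to the number of elements rather than to the range of possible values.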
I just want to improve on Prototoast's answer.
In order to avoid pitfalls involving non-positive numbers in A or B use hist:
A = [-10 0 1 1 2 4];
B = [1 2 4];
We need the minimum and maximum values in the union of A and B:
U = [A,B];
range_ = min(U):max(U);
So that we can use hist to give us same length vectors:
a = hist(A,range_)
b = hist(B,range_)
Now you need to subtract the histograms:
r = a-b
If you wish the set difference operator be symmetric then use:
r = abs(a-b)
The following will give you which items are in A \ B (\ here is your modified set difference):
C = range_(logical(r))
Hope this helps.

Enumerating combinations in a distributed manner

I have a problem where I must analyse 500C5 combinations (255244687600) of something. Distributing it over a 10-node cluster where each node processes roughly 10^6 combinations per second means the job will be complete in about seven hours.
The problem I have is distributing the 255244687600 combinations over the 10 nodes. I'd like to present each node with 25524468760 of them, but the algorithms I'm using can only produce the combinations sequentially. I'd like to be able to pass the set of elements and a range of combination indices, for example [0, 10^7), [10^7, 2×10^7), etc., and have the nodes themselves figure out the combinations.
The algorithms I'm using at the moment are from the following:
http://howardhinnant.github.io/combinations.html
Stack Overflow question Efficiently computing vector combinations
I've considered using a master node, that enumerates each of the combinations and sends work to each of the nodes. However, the overhead incurred in iterating the combinations from a single node and communicating back and forth work is enormous, and it will subsequently lead to the master node becoming the bottleneck.
Are there any good combination-iterating algorithms geared towards efficient/optimal distributed enumeration?
You may have some success with combinatorial numbers, which allow you to retrieve the N'th (n/10th) k-combination with a simple algorithm; then run the next_combination algorithm n/10 times on each of the ten nodes to iterate.
Sample code (in C#, but quite readable for a C++ programmer) can be found on MSDN.
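As a minimal sketch of the idea in Python (not the MSDN code), the following function returns the combination at a given position in lexicographic order without enumerating the ones before it. The function name `unrank` and the choice of lexicographic order over `range(n)` are assumptions made for illustration.

```python
from math import comb  # available from Python 3.8


def unrank(index, n, k):
    """Return the index-th k-combination of range(n) in lexicographic order."""
    if not 0 <= index < comb(n, k):
        raise IndexError("combination index out of range")
    result = []
    x = 0  # smallest value still available
    for remaining in range(k, 0, -1):
        while True:
            # Number of combinations that use x as the next element.
            block = comb(n - x - 1, remaining - 1)
            if index < block:
                break
            index -= block  # skip the whole block starting with x
            x += 1
        result.append(x)
        x += 1
    return result


print(unrank(0, 7, 3))   # [0, 1, 2]
print(unrank(34, 7, 3))  # [4, 5, 6]
```

Each node would call this once with its own start index and then advance with a sequential next-combination routine.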
Have node number n process every tenth combination, starting from the nth.
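A minimal Python sketch of this striding scheme, using `itertools.islice`; the function name `node_share` is an assumption made for illustration. Each node still steps through every combination and only does the work for its own share, so this helps only when generating a combination is cheap compared with processing it.

```python
from itertools import combinations, islice


def node_share(items, k, node, nodes):
    """Yield every nodes-th k-combination of items, starting from the node-th."""
    return islice(combinations(items, k), node, None, nodes)


# Five nodes sharing the 35 combinations of 7-choose-3.
parts = [list(node_share("ABCDEFG", 3, node, 5)) for node in range(5)]
print([len(part) for part in parts])  # [7, 7, 7, 7, 7]
```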
I know this question is old, but here is how it may be done efficiently.
All code is currently in Python, which I'm sure will be easy enough to translate to C++.
- You will probably want to move from using an integer for the characteristic vector to using a bit array, since the integers used will need 500 bits (not a problem in Python).
- Feel free to update to C++, anyone.
1. Distribute to the nodes their range of combinations (start number and length to process), the set of items from which combinations are to be chosen, and the number, k, of items to choose.
2. Initialise each node by having it find its starting combination directly from start and items.
3. Run each node by having it do the work for the first combination, then iterate through the rest of its combinations and do the associated work.
To perform 1, do as you suggest: find n-choose-k and divide it into ranges. In your case 500-choose-5 is, as you said, 255244687600, so for node = 0 to 9 you distribute: (start=node*25524468760, length=25524468760, items=items, k=5)
To perform 2 you can find the starting combination directly (without iteration) using the combinatorial number system and find the integer representation of the combination's characteristic vector (to be used for the iteration in 3) at the same time:
    def getCombination(index, items, k):
        '''Returns (combination, characteristicVector)
        combination - The single combination, of k elements of items, that would be at
        the provided index if all possible combinations had each been sorted in
        descending order (as defined by the order of items) and then placed in a
        sorted list.
        characteristicVector - an integer with chosen item's bits set.
        '''
        combination = []
        characteristicVector = 0
        n = len(items)
        nCk = 1
        for nMinusI, iPlus1 in zip(range(n, n - k, -1), range(1, k + 1)):
            nCk *= nMinusI
            nCk //= iPlus1
        curIndex = nCk
        for k in range(k, 0, -1):
            nCk *= k
            nCk //= n
            while curIndex - nCk > index:
                curIndex -= nCk
                nCk *= (n - k)
                nCk -= nCk % k
                n -= 1
                nCk //= n
            n -= 1
            combination.append(items[n])
            characteristicVector += 1 << n
        return combination, characteristicVector
The integer representation of the characteristic vector has k bits set in the positions of the items that make up the combination.
To perform 3 you can use Gosper's hack to iterate to the next characteristic vector for the combination in the same number system (the next combination that would appear in a sorted list of reverse sorted combinations relative to items) and, at the same time, create the combination:
    def nextCombination(items, characteristicVector):
        '''Returns the next (combination, characteristicVector).
        combination - The next combination of items that would appear after the
        combination defined by the provided characteristic vector if all possible
        combinations had each been sorted in descending order (as defined by the order
        of items) and then placed in a sorted list.
        characteristicVector - an integer with chosen item's bits set.
        '''
        u = characteristicVector & -characteristicVector
        v = u + characteristicVector
        if v <= 0:
            raise OverflowError("Ran out of integers") # <- ready for C++
        characteristicVector = v + (((v ^ characteristicVector) // u) >> 2)
        combination = []
        copiedVector = characteristicVector
        index = len(items) - 1
        while copiedVector > 0:
            present, copiedVector = divmod(copiedVector, 1 << index)
            if present:
                combination.append(items[index])
            index -= 1
        return combination, characteristicVector
Repeat this length-1 times (since you already found the first one directly).
For example:
Five nodes processing 7-choose-3 letters:
>>> items = ('A','B','C','D','E','F','G')
>>> k = 3
>>> nodes = 5
>>> n = len(items)
>>> for nmip1, i in zip(range(n - 1, n - k, -1), range(2, k + 1)):
...     n = n * nmip1 // i
...
>>> for node in range(nodes):
...     length = n // nodes
...     start = node * length
...     print("Node {0} initialised".format(node))
...     combination, cv = getCombination(start, items, k)
...     doWork(combination)
...     for i in range(length-1):
...         combination, cv = nextCombination(items, cv)
...         doWork(combination)
...
Node 0 initialised
Doing work with: C B A
Doing work with: D B A
Doing work with: D C A
Doing work with: D C B
Doing work with: E B A
Doing work with: E C A
Doing work with: E C B
Node 1 initialised
Doing work with: E D A
Doing work with: E D B
Doing work with: E D C
Doing work with: F B A
Doing work with: F C A
Doing work with: F C B
Doing work with: F D A
Node 2 initialised
Doing work with: F D B
Doing work with: F D C
Doing work with: F E A
Doing work with: F E B
Doing work with: F E C
Doing work with: F E D
Doing work with: G B A
Node 3 initialised
Doing work with: G C A
Doing work with: G C B
Doing work with: G D A
Doing work with: G D B
Doing work with: G D C
Doing work with: G E A
Doing work with: G E B
Node 4 initialised
Doing work with: G E C
Doing work with: G E D
Doing work with: G F A
Doing work with: G F B
Doing work with: G F C
Doing work with: G F D
Doing work with: G F E
>>>