Extract all possible ordered sub-sets - c++

I have a set of elements from which I want to extract ordered subsets. What I mean by ordered subsets is that I cannot switch elements inside the set. I gave three examples to show how I am trying to solve the problem.
How can I extract these subsets recursively?
Do you have any pseudo-code in mind?
{ . } = subset
Example 1
Let S = {f1,f2,f3} be a set composed of 3 elements. I want to extract all the possible ordered sub-sets as follows:
-{f1},{f2},{f3} // {f1} is a subset, {f2} is a subset etc.
-{f1,f2},{f3} // {f1,f2} form a subset and {f3} is also a subset
-{f1},{f2,f3} // {f1} is a subset and {f2,f3} form a subset
Example 2
Let S = {f1,f2,f3,f4} be a set composed of 4 elements.
Possible ordered subsets:
-{f1},{f2},{f3},{f4}
-{f1,f2},{f3,f4}
-{f1},{f2,f3},{f4}
-{f1},{f2},{f3,f4}
-{f1,f2,f3},{f4}
-{f1},{f2,f3,f4}
-{f1,f2},{f3},{f4}
-{f1,f2,f3,f4}
Example 3
Let S = {f1,f2,f3,f4,f5} be a set composed of 5 elements.
Possible ordered subsets:
-{f1},{f2},{f3},{f4},{f5}
-{f1,f2},{f3},{f4},{f5}
-{f1},{f2,f3},{f4},{f5}
-{f1},{f2},{f3,f4},{f5}
-{f1},{f2},{f3},{f4,f5}
-{f1,f2},{f3,f4},{f5}
-{f1},{f2,f3},{f4,f5}
-{f1,f2,f3},{f4,f5}
-{f1,f2,f3},{f4},{f5}
-{f1},{f2,f3,f4},{f5}
-{f1},{f2},{f3,f4,f5}
-{f1,f2},{f3,f4,f5}
-{f1,f2,f3,f4},{f5}
-{f1},{f2,f3,f4,f5}
- etc...

If the set is stored in an array, modify the array so that there is one slot between every pair of adjacent elements. Each slot is reserved for a partition marker. Pick any naming convention, for example 0 means "no partition" and 1 means "partition here". Now traverse the array, recursively placing a 1 or a 0 in each slot. This generates all possible combinations.
Taking Example 1:
S = {f1,f2,f3}
S'= {f1,0,f2,0,f3}
So the subsets will be:
{f1,0,f2,0,f3}, {f1,0,f2,1,f3}, {f1,1,f2,0,f3}, {f1,1,f2,1,f3}
which is the same as:
{f1,f2,f3}, {{f1,f2},{f3}}, {{f1},{f2,f3}}, {{f1},{f2},{f3}}
If you don't want the original set to appear in the set of all subsets, just don't consider the state where every partition contains 0.
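The 0/1 slots described above can be sketched in C++ as follows. This is only an illustration under a few assumptions: the function name `orderedPartitions` and the `Partition` alias are made up here, the elements are stored as strings, and the slots are encoded as the bits of an integer that is counted upwards instead of being filled in recursively (both visit the same 2^(n-1) combinations).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One partition is a list of contiguous groups; each group is a list of elements.
// (Hypothetical alias, introduced only for this sketch.)
using Partition = std::vector<std::vector<std::string>>;

// Enumerate every way to cut `items` into contiguous, order-preserving groups.
// Bit i of `mask` is the slot between items[i] and items[i + 1]:
// 1 means "partition here", 0 means "no partition".
// Assumes fewer than 64 slots so the mask fits in an unsigned long long.
std::vector<Partition> orderedPartitions(const std::vector<std::string>& items) {
    std::vector<Partition> result;
    if (items.empty()) return result;
    const std::size_t gaps = items.size() - 1;
    for (unsigned long long mask = 0; mask < (1ULL << gaps); ++mask) {
        Partition p;
        p.push_back(std::vector<std::string>{items[0]});
        for (std::size_t i = 0; i < gaps; ++i) {
            if (mask & (1ULL << i)) p.emplace_back();  // slot is 1: start a new group
            p.back().push_back(items[i + 1]);
        }
        result.push_back(p);
    }
    return result;
}
```

Calling `orderedPartitions({"f1", "f2", "f3"})` yields the four partitions of Example 1. The entry for `mask == 0` is the undivided set, so skip it if the original set should not appear.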

Let's say the set S = {a,b,c,d} contains 4 elements. All the subsets can be generated by writing 2^n - 1 in binary and then repeatedly subtracting 1.
a b c d
1 1 1 1 => (a b c d)
1 1 1 0 => (a b c)(d)
1 1 0 1 => (a b d)(c) //The logic is to club all the 1's together
1 1 0 0 => (a b) now 0 0 can be further broken down into (1 1) => (c d) , (1 0) => (c)(d)
1 0 1 1 => (a c d)(b)
1 0 1 0 => (a c) now 0 0 can be further broken down into (1 1) => (b d ), (1 0 ) => (b)(d)
1 0 0 1 => (a d) same steps as above
1 0 0 0 => (a) now left with 3 zeros we have b c d as 3 sets now we can start afresh with 1 1 1 and then go to 1 1 0 and so on.
In this way we are able to generate all the subsets.

Related

Can we really avoid extra space when all the values are non-negative?

This question is a follow-up of another one I had asked quite a while ago:
We have been given an array of integers and another number k and we need to find the total number of continuous subarrays whose sum equals to k. For e.g., for the input: [1,1,1] and k=2, the expected output is 2.
In the accepted answer, @talex says:
PS: BTW if all values are non-negative there is better algorithm. it doesn't require extra memory.
While I didn't think much about it then, I am curious about it now. IMHO, we will require extra memory. In the event that all the input values are non-negative, our running (prefix) sum will go on increasing, and as such, sure, we don't need an unordered_map to store the frequency of a particular sum. But, we will still need extra memory (perhaps an unordered_set) to store the running (prefix) sums that we get along the way. This obviously contradicts what @talex said.
Could someone please confirm if we absolutely do need extra memory or if it could be avoided?
Thanks!
Let's start with a slightly simpler problem: all values are positive (no zeros). In this case the subarrays can overlap, but they cannot contain one another.
I.e.: arr = 2 1 5 1 1 5 1 2, Sum = 8
2 1 5 1 1 5 1 2
|---|
  |-----|
      |-----|
          |---|
But this situation can never occur:
* * * * * * *
  |-------|
    |---|
With this in mind, there is an algorithm that doesn't require extra space (well, O(1) space) and has O(n) time complexity. The idea is to keep left and right indexes delimiting the current sequence, along with the sum of the current sequence.
if the sum is k increment the counter, advance left and right
if the sum is less than k then advance right
else advance left
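The three rules above can be sketched in C++ roughly as follows, assuming every value is strictly positive. The function name `countSubarraysPositive` is made up for this illustration, and the loop is arranged so that right advances once per iteration while left catches up, which is one of several equivalent ways to write it.

```cpp
#include <cstddef>
#include <vector>

// Counts contiguous subarrays of `a` whose sum equals k.
// Assumes every element of `a` is strictly positive.
long long countSubarraysPositive(const std::vector<int>& a, long long k) {
    long long count = 0;
    long long sum = 0;        // sum of the current window a[left..right]
    std::size_t left = 0;
    for (std::size_t right = 0; right < a.size(); ++right) {
        sum += a[right];                       // advance right
        while (sum > k && left <= right) {     // sum too large: advance left
            sum -= a[left++];
        }
        if (sum == k && left <= right) ++count;  // non-empty window with sum k
    }
    return count;
}
```

Only a handful of integers are kept, so the extra space is O(1), and each index is visited at most twice, so the time is O(n).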
Now if there are zeros the intervals can contain one another, but only if the zeros are on the margins of the interval.
To adapt to non-negative numbers:
Do as above, except:
skip zeros when advancing left
if sum is k:
count consecutive zeros to the right of right, let's say zeroes_right_count
count consecutive zeros to the left of left, let's say zeroes_left_count
instead of incrementing the count as before, increase the counter by: (zeroes_left_count + 1) * (zeroes_right_count + 1)
Example:
... 7 0 0 5 1 2 0 0 0 9 ...
          ^   ^
       left   right
Here we have 2 zeroes to the left and 3 zeros to the right. This makes (2 + 1) * (3 + 1) = 12 sequences with sum 8 here:
5 1 2
5 1 2 0
5 1 2 0 0
5 1 2 0 0 0
0 5 1 2
0 5 1 2 0
0 5 1 2 0 0
0 5 1 2 0 0 0
0 0 5 1 2
0 0 5 1 2 0
0 0 5 1 2 0 0
0 0 5 1 2 0 0 0
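For non-negative input, one way to get the same count in O(1) extra space is sketched below. Note that this is a different formulation from the zero-counting described above: it counts the windows whose sum is at most k, then subtracts the windows whose sum is at most k - 1, so the runs of zeros are handled implicitly. The names `countAtMost` and `countSubarraysNonNegative` are assumptions made for this example.

```cpp
#include <cstddef>
#include <vector>

// Number of contiguous subarrays of `a` whose sum is <= limit.
// Assumes every element of `a` is non-negative.
long long countAtMost(const std::vector<int>& a, long long limit) {
    if (limit < 0) return 0;  // a sum of non-negative values is never negative
    long long count = 0;
    long long sum = 0;
    std::size_t left = 0;
    for (std::size_t right = 0; right < a.size(); ++right) {
        sum += a[right];
        // sum > limit >= 0 implies the window is non-empty, so left <= right here.
        while (sum > limit) sum -= a[left++];
        // Every start in [left, right] gives a window ending at `right` with sum <= limit.
        count += static_cast<long long>(right + 1 - left);
    }
    return count;
}

// Subarrays with sum exactly k = (sum <= k) minus (sum <= k - 1).
long long countSubarraysNonNegative(const std::vector<int>& a, long long k) {
    return countAtMost(a, k) - countAtMost(a, k - 1);
}
```

On the array 7 0 0 5 1 2 0 0 0 9 with k = 8, this returns 12, matching the (2 + 1) * (3 + 1) count from the example. It makes two passes over the input but still stores only a few integers.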
I think this algorithm would work, using O(1) space.
We maintain two pointers to the beginning and end of the current subsequence, as well as the sum of the current subsequence. Initially, both pointers point to array[0], and the sum is obviously set to array[0].
Advance the end pointer (thus extending the subsequence to the right), and increase the sum by the value it points to, until that sum exceeds k. Then advance the start pointer (thus shrinking the subsequence from the left), and decrease the sum, until that sum gets below k. Keep doing this until the end pointer reaches the end of the array. Keep track of the number of times the sum was exactly k.

Intuition behind working with `k` to find the kth-symbol in the grammar

I took part in a coding contest wherein I encountered the following question:
On the first row, we write a 0. Now in every subsequent row, we look at the previous row and replace each occurrence of 0 with 01, and each occurrence of 1 with 10. Given row N and index K, return the K-th indexed symbol in row N. (The values of K are 1-indexed.)
While solving the question, I solved it like a level-order traversal of a tree, trying to form the new string at each level. Unfortunately, it timed-out. I then tried to think along the terms of caching the results, etc. with no luck.
One of the highly upvoted solutions is like this:
class Solution {
public:
    int kthGrammar(int N, int K) {
        if (N == 1) return 0;
        if (K % 2 == 0) return (kthGrammar(N - 1, K / 2) == 0) ? 1 : 0;
        else return (kthGrammar(N - 1, (K + 1) / 2) == 0) ? 0 : 1;
    }
};
My question is simple - what is the intuition behind working with the value of K (especially, the parities of K)? (I hope to be able to identify such questions when I encounter them in future).
Thanks.
Look at the sequence recursively. In generating a new row, the first half is identical to the process you used to get the previous row, so that part of the expansion is already done. The second half is merely the same sequence inverted (0 for 1, 1 for 0). This is one classic way to generate a parity map: flip all the bits and append, representing adding a 1 to the start of each binary number. Thinking of expanding the sequence 0-3 to 0-7, we start with
00 => 0
01 => 1
10 => 1
11 => 0
We now replicate the 2-digit sequence twice: first with a leading 0, which preserves the original parity; second with a leading 1, which inverts the parity.
000 => 0
001 => 1
010 => 1
011 => 0
100 => 1
101 => 0
110 => 0
111 => 1
Is that an intuition that works for you?
Just for fun, as a different way to solve this, consider that the nth row (0-indexed) has 2^n elements in it, and a determination as to the value of the kth (0-indexed) element can be made solely according to the parity of how many bits are set in k.
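A minimal sketch of that bit-parity idea, assuming the 1-indexed K from the problem statement is converted to a 0-indexed k first. The function name `kthGrammarByParity` is made up for this example. The row number N is not needed for the result, because each row is a prefix of the next one.

```cpp
// Returns the K-th (1-indexed) symbol of row N by counting set bits in K - 1.
// Assumes K >= 1 and that K is a valid index for row N.
int kthGrammarByParity(int N, int K) {
    (void)N;  // unused: every row is a prefix of the following row
    unsigned int k = static_cast<unsigned int>(K - 1);  // switch to 0-indexed
    int parity = 0;
    while (k != 0) {
        parity ^= 1;   // toggle once per set bit
        k &= k - 1;    // clear the lowest set bit
    }
    return parity;     // odd number of set bits -> 1, even -> 0
}
```

For instance, in row 4 (0 1 1 0 1 0 0 1) the 5th symbol corresponds to k = 4, which has one set bit, so the result is 1.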
The check for parity in the code you posted is just to make the division by two correct, there's no advanced math or mystery hiding here :) Since the pattern is akin to a tree, where the pattern size multiplies by two for each added row, correctly dividing points to the element's parent. The indexes in this question are said to be "1-indexed;" if the index is 2, dividing by two yields the parent index (1) in the row before; and if the index is 1, dividing (1+1) by two yields that same parent index. I'll leave it to the reader to generalize that to k's parity. After finding the parent, the code follows the rule stated in the question: if the parent is 0, the left-child must be 0 and right-child 1, and vice versa.
0
0 1
0 1 1 0
0 1 1 0 1 0 0 1
0 1 1 0 1 0 0 1 1 0 0 1 0 1 1 0
a a b a b b a
0 01 0110 01101001 0110100110010110
a b b a b a a b
0110100110010110 1001011001101001

How do I generate all vectors of size n where each element may contain 1 of m different values?

Sorry if this is a duplicate, but I did not find any answers which match mine.
Consider that I have a vector which contains 3 values. I want to construct another vector of a specified length from this vector. For example, let's say that the length n=3 and the vector contains the following values 0 1 2. The output that I expect is as follows:
0 0 0
0 0 1
0 0 2
0 1 0
0 1 1
0 1 2
0 2 0
0 2 1
0 2 2
1 0 0
1 0 1
1 0 2
1 1 0
1 1 1
1 1 2
1 2 0
1 2 1
1 2 2
2 0 0
2 0 1
2 0 2
2 1 0
2 1 1
2 1 2
2 2 0
2 2 1
2 2 2
My current implementation simply constructs for loops based on n and generates the expected output. I want to be able to construct output vectors of different lengths and with different values in the input vector.
I have looked at possible implementations using next_permutation, but unfortunately passing a length value does not seem to work.
Are there time- and space-efficient algorithms that one can use for this case? Again, I might have to compute this for up to n=17 and an input vector of size around 6.
Below is my implementation for n=3. Here, enc is the vector which contains the input.
vector<vector<int> > combo_3(vector<double> enc, int bw) {
    vector<vector<int> > possibles;
    for (unsigned int inner = 0; inner < enc.size(); inner++) {
        for (unsigned int inner1 = 0; inner1 < enc.size(); inner1++) {
            for (unsigned int inner2 = 0; inner2 < enc.size(); inner2++) {
                cout << inner << " " << inner1 << " " << inner2 << endl;
                unsigned int arr[] = {inner, inner1, inner2};
                vector<int> current(arr, arr + sizeof(arr) / sizeof(arr[0]));
                possibles.push_back(current);
                current.clear();
            }
        }
    }
    return possibles;
}
What you are doing is simple counting. Think of your output vector as a list of a list of digits (a vector of a vector). Each digit may have one of m different values where m is the size of your input vector.
This is not permutation generation. Generating every permutation means generating every possible ordering of an input vector, which is not what you're looking for at all.
If you think of this as a counting problem the answer may become clearer to you. For example, how would you generate all base 10 numbers with 5 digits? In that case, your input vector has size 10, and each vector in your output list has length 5.
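To make the counting idea concrete, here is a rough sketch that treats each output vector as an n-digit number in base m and repeatedly adds one with carry. The function name `allVectors` is an assumption for this example. It produces indices in [0, m), like the loop counters in the question; mapping an index to the actual value in the input vector is left to the caller.

```cpp
#include <cstddef>
#include <vector>

// All length-n vectors whose entries are indices in [0, m), in counting order.
std::vector<std::vector<int>> allVectors(std::size_t n, int m) {
    std::vector<std::vector<int>> out;
    if (m <= 0) return out;
    std::vector<int> digits(n, 0);          // start at 00...0
    while (true) {
        out.push_back(digits);
        // Add one to the rightmost digit and propagate the carry leftwards.
        std::size_t pos = n;
        while (pos > 0) {
            --pos;
            if (++digits[pos] < m) break;   // no carry needed: next number is ready
            digits[pos] = 0;                // overflow: reset this digit and carry on
            if (pos == 0) return out;       // carry out of the leftmost digit: done
        }
        if (n == 0) return out;             // only the empty vector exists
    }
}
```

With n = 3 and m = 3 this returns the 27 rows listed in the question. For n = 17 and m = 6 there are 6^17 (roughly 1.7 × 10^13) vectors, far too many to store, so in that case it is probably better to process each `digits` state as it is generated rather than collecting them all.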

How Domain maps map indexes to target locales array in multi-dimension case

I couldn't find how a domain map maps the indices of a multi-dimensional domain to multi-dimensional target locales.
1.) How is the (one-dimensional) array of target locales arranged into a multi-dimensional shape matching the rank of the distribution, so that indices can be mapped to it?
2.) The documentation states that in the multi-dimensional case, the computation should be done in every dimension. Take the domain {1..8, 1..8} ==> dom
and assume dom is block-distributed over 6 target locales.
Steps in mapping:
1. For the 1st dimension (1..8), do the computation:
if low <= idx <= high, then the locale index is
floor((idx - low) * N / (high - low + 1)), which gives me an index, say i.
2. Repeat the same for the 2nd dimension, which gives me an index, say j.
Now I have a tuple (i, j).
How is this mapped to the target locales array of dimension 2?
What does the domain map do to change the 1D target locales array to the rank of the distribution?
Is it something like a reshape function?
Please let me know if this lacks sufficient information.
The specific details about how a domain's indices are mapped to a program's locales are not defined by the Chapel language itself, but rather by the implementation of the domain map used to declare the domain. In the comments under your question, you mention that you're referring to the Block distribution, so I'll focus on that in my answer (documented here), but note that any other domain map could take a different approach.
The Block distribution takes an optional targetLocales argument which permits you to specify the set of locales to be targeted, as well as their virtual topology. For instance, if I declare and populate a few arrays of locales:
var grid1: [1..3, 1..2] locale,  // a 3 x 2 array of locales
    grid2: [1..2, 1..3] locale;  // a 2 x 3 array of locales
for i in 1..3 {
  for j in 1..2 {
    grid1[i,j] = Locales[(2*(i-1) + j-1)%numLocales];
    grid2[j,i] = Locales[(3*(j-1) + i-1)%numLocales];
  }
}
I can then pass them in as the targetLocales arguments to a few instances of a Block-distributed domain:
use BlockDist;
config const n = 8;
const D = {1..n, 1..n},
      D1 = D dmapped Block(D, targetLocales=grid1),
      D2 = D dmapped Block(D, targetLocales=grid2);
Each domain will distribute its n rows to the first dimension of its targetLocales grid and its n columns to the second dimension. We can see the results of this distribution by declaring arrays of integers over these domains and assigning them in parallel to make each element store its owning locale's ID, as follows:
var A1: [D1] int,
    A2: [D2] int;
forall a in A1 do
  a = here.id;
forall a in A2 do
  a = here.id;
writeln(A1, "\n");
writeln(A2, "\n");
When running on six or more locales (./a.out -nl 6), the output is as follows, revealing the underlying grid structure:
0 0 0 0 1 1 1 1
0 0 0 0 1 1 1 1
0 0 0 0 1 1 1 1
2 2 2 2 3 3 3 3
2 2 2 2 3 3 3 3
2 2 2 2 3 3 3 3
4 4 4 4 5 5 5 5
4 4 4 4 5 5 5 5
0 0 0 1 1 1 2 2
0 0 0 1 1 1 2 2
0 0 0 1 1 1 2 2
0 0 0 1 1 1 2 2
3 3 3 4 4 4 5 5
3 3 3 4 4 4 5 5
3 3 3 4 4 4 5 5
3 3 3 4 4 4 5 5
For a 1-dimensional targetLocales array, the documentation says:
If the rank of targetLocales is 1, a greedy heuristic is used to reshape the array of target locales so that it matches the rank of the distribution and each dimension contains an approximately equal number of indices.
For example, if we distribute to a 1-dimensional 4-element array of locales:
var grid3: [1..4] locale;
for i in 1..4 do
  grid3[i] = Locales[(i-1)%numLocales];
var D3 = D dmapped Block(D, targetLocales=grid3);
var A3: [D3] int;
forall a in A3 do
  a = here.id;
writeln(A3);
we can see that the target locales form a square, as expected:
0 0 0 0 1 1 1 1
0 0 0 0 1 1 1 1
0 0 0 0 1 1 1 1
0 0 0 0 1 1 1 1
2 2 2 2 3 3 3 3
2 2 2 2 3 3 3 3
2 2 2 2 3 3 3 3
2 2 2 2 3 3 3 3
The documentation is intentionally vague about how a 1D targetLocales argument will be reshaped if it's not a perfect square, but we can find out what's done in practice by using the targetLocales() query on the domain. Also, note that if no targetLocales array is supplied, the entire Locales array (which is 1D) is used by default. As an illustration of both these things, if the following code is run on six locales:
var D0 = D dmapped Block(D);
writeln(D0.targetLocales());
we get:
LOCALE0 LOCALE1
LOCALE2 LOCALE3
LOCALE4 LOCALE5
illustrating that the current heuristic matches our explicit grid1 declaration above.
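As a cross-check of the outputs above, the per-dimension formula quoted in the question can be sketched in C++. This is not Chapel's implementation; it only assumes that formula, plus the assumption that locale IDs are laid out row-major over the grid, as in the explicit grid1 and grid2 declarations. The function names `blockIndex` and `ownerId` are made up for this example.

```cpp
// Per-dimension block index, following the formula from the question:
// floor((idx - low) * numBlocks / (high - low + 1)).
// Integer division already floors because all operands are non-negative here.
int blockIndex(int idx, int low, int high, int numBlocks) {
    return (idx - low) * numBlocks / (high - low + 1);
}

// Owning locale ID of element (row, col) in the domain {1..n, 1..n},
// assuming a gridRows x gridCols locale grid numbered in row-major order.
int ownerId(int row, int col, int n, int gridRows, int gridCols) {
    const int i = blockIndex(row, 1, n, gridRows);  // position along dimension 1
    const int j = blockIndex(col, 1, n, gridCols);  // position along dimension 2
    return i * gridCols + j;                        // (i, j) -> row-major locale ID
}
```

With n = 8 and a 3 x 2 grid, rows 1..3, 4..6 and 7..8 land in grid rows 0, 1 and 2, and columns 1..4 and 5..8 land in grid columns 0 and 1, which reproduces the first matrix printed above. With a 2 x 3 grid the same formula reproduces the second one.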

How to find and return a repeating sequence within a vector

I have a vector that is filled dynamically and will always contain a repeating sequence with characters and length that I am unsure of. For example, the vector could contain these elements:
0 1 1 2 3 1 0 1 1 2 3 1 0 1 1 2
and the repeating sequence in that vector is:
0 1 1 2 3 1
How can I search the vector and find those elements? I would like to put the found sequence in a new vector. I assumed at first it would only take a simple for loop and checking for repetition of the first and second element in the array, so in the case above I would exit the loop when I reached 0 1 a second time, but the problem is that it cannot be assumed that the first 2 elements will be in the repeating pattern, so
0 1 2 3 2 3 2 3 2 3
can be valid elements in the vector. Any ideas?
In general (for an unbounded input) it is impossible to know the sequence, because something like this can happen: a million 0s and then a 1. After 1000 zeros you would think that the sequence is just 0. But if the vector is finite,
you can write something like this:
for (I = 1 .. VECTORSIZE / 2)
    if (VECTORSIZE % I == 0 and
        SUBVECTOR(0,I) == SUBVECTOR(I,I*2) == SUBVECTOR(I*2,I*3) == ...)
        return I
    else continue;
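A possible C++ rendering of this idea is sketched below, with one deliberate change. The pseudocode only accepts lengths that divide the vector size, which would reject the 16-element example in the question (its pattern has length 6). This variant instead accepts the smallest length p for which every element equals the one p positions later, so a trailing partial copy is allowed. The function name `repeatingPrefix` is made up for this example, and the sketch assumes the pattern starts at index 0, so it does not handle a non-repeating lead-in such as 0 1 before 2 3 2 3.

```cpp
#include <cstddef>
#include <vector>

// Returns the shortest prefix of `v` that, repeated (with the last copy possibly
// cut off), reproduces `v`. Requires at least two full copies; returns an empty
// vector if no such prefix exists.
std::vector<int> repeatingPrefix(const std::vector<int>& v) {
    const std::size_t n = v.size();
    for (std::size_t p = 1; p <= n / 2; ++p) {
        bool ok = true;
        for (std::size_t i = 0; i + p < n; ++i) {
            if (v[i] != v[i + p]) {   // mismatch: p is not a period
                ok = false;
                break;
            }
        }
        if (ok) {
            return std::vector<int>(v.begin(),
                                    v.begin() + static_cast<std::ptrdiff_t>(p));
        }
    }
    return {};
}
```

For 0 1 1 2 3 1 0 1 1 2 3 1 0 1 1 2 this returns 0 1 1 2 3 1. The check is O(n^2) in the worst case, which should be acceptable for short vectors.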