Generating all permutations excluding cyclic rotations - c++

So I need an algorithm to generate all permutations of a list of numbers excluding cyclic rotations (e.g. [1,2,3] == [2,3,1] == [3,1,2]).
When there is at least one unique number in the sequence it is fairly straightforward: take out that unique number, generate all permutations of the remaining numbers (with a small modification to the 'standard' permutations algorithm), and add the unique number to the front.
For generating the permutations I've found that it's necessary to change the permutations code to:
def permutations(done, options)
    if options is empty
        return [done]
    permuts = []
    seen = []
    for each o in options
        if o not in seen
            seen.add(o)
            permuts += permutations(done+o, options.remove(o))
    return permuts
Only using each unique number in options once means that you don't get 322 twice.
This algorithm still outputs rotations when there are no unique elements, e.g. for [1,1,2,2] it would output [1,1,2,2], [1,2,2,1] and [1,2,1,2] and the first two are cyclic rotations.
So is there an efficient algorithm that would allow me to generate all the permutations without having to go through afterwards to remove cyclic rotations?
If not, what would be the most efficient way to remove cyclic rotations?
NOTE: this is not using Python, but rather C++.

For the case of permutations where all numbers are distinct, this is simple. Say the numbers are 1,2,...,n, then generate all permutations of 1,2,...,n-1 and stick n at the front. This gives all permutations of the full set modulo cyclic rotations. For example, with n=4, you would do
4 1 2 3
4 1 3 2
4 2 1 3
4 2 3 1
4 3 1 2
4 3 2 1
This ensures that some cyclic rotation of each permutation of 1,2,3,4 appears exactly once in the list.
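The distinct-element case can be sketched in C++ as follows. This is only an illustration under the assumption that the symbols are exactly 1..n; the function name permutationsUpToRotation is made up for this example.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// One representative per rotation class of the permutations of 1..n.
// Assumes n >= 1; the name is hypothetical, not from the answer above.
std::vector<std::vector<int>> permutationsUpToRotation(int n) {
    std::vector<int> rest(n - 1);
    std::iota(rest.begin(), rest.end(), 1);   // 1, 2, ..., n-1 in sorted order
    std::vector<std::vector<int>> result;
    do {
        std::vector<int> perm;
        perm.push_back(n);                     // n is pinned to the front
        perm.insert(perm.end(), rest.begin(), rest.end());
        result.push_back(perm);
    } while (std::next_permutation(rest.begin(), rest.end()));
    return result;
}
```

For n = 4 this yields the six rows listed above, in the same order.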
For the general case where you want permutations of a multiset (repeated entries allowed), you can use a similar trick. Remove all instances of the largest letter n (similar to ignoring the 4 in the above example) and generate all permutations of the remaining multiset. The next step is to put the ns back into the permutations in canonical ways (similar to putting the 4 at the beginning in the above example).
This is really just a case of finding all Lyndon words in order to generate necklaces.

Think about testing each of the permutations you output, looking for a cyclic rotation that's "lexically" earlier than the one you've got. If there is one, don't return it - it will have been enumerated somewhere else.
Choosing a "unique" first element, if one exists, helps you optimize. You know if you fix the first element, and it's unique, then you can't possibly have duplicated it with a rotation. On the other hand, if there's no such unique element, just choose the one that occurs the least. That way you only need to check for cyclic rotations that have that first element. (Example, when you generate [1,2,2,1] - you only need to check [1,1,2,2], not [2,2,1,1] or [2,1,1,2]).
OK, pseudocode... clearly O(n!), and I'm convinced there's no smarter way, since the case "all symbols different" obviously has to return (n-1)! elements.
// generate all permutations with count[0] 0's, count[1] 1's...
def permutations(count[])
    if (count[] all zero)
        return just the empty permutation;
    else
        perms = [];
        for all i with count[i] not zero
            r = permutations(copy of count[] with element i decreased);
            perms += i prefixed on every element of r
        return perms;
// generate all noncyclic permutations with count[0] 0's, count[1] 1's...
def noncyclic(count[])
    choose f to be the index with smallest count[f];
    // fix f as the first element of every candidate
    perms = f prefixed on every element of permutations(copy of count[] with element f decreased);
    if (count[f] is 1)
        return perms;
    else
        noncyclic = [];
        for each perm in perms
            val = perm as a value in base(count.length);
            for each occurrence of f in perm
                test = perm rotated so that f is first
                tval = test as a value in base(count.length);
                if (tval < val) continue to next perm;
            if not skipped add perm to noncyclic;
        return noncyclic;
// return all noncyclic perms of the given symbols
def main(symbols[])
    dictionary = array of all distinct items in symbols;
    count = array of counts, count[i] = count of dictionary[i] in symbols
    nc = noncyclic(count);
    return (elements of nc translated back to symbols with the dictionary)
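A simplified C++ sketch of the same filtering idea follows. It is not a translation of the pseudocode: it walks all multiset permutations with std::next_permutation and keeps a permutation only if no rotation of it is lexicographically smaller, rather than fixing the rarest symbol first. The function names are made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// True if no rotation of seq is lexicographically smaller than seq itself.
bool isSmallestRotation(const std::vector<int>& seq) {
    std::vector<int> rot = seq;
    for (std::size_t i = 1; i < seq.size(); ++i) {
        std::rotate(rot.begin(), rot.begin() + 1, rot.end());  // shift left by one
        if (rot < seq) return false;
    }
    return true;
}

// One representative (the smallest rotation) per rotation class.
std::vector<std::vector<int>> noncyclicPermutations(std::vector<int> symbols) {
    std::vector<std::vector<int>> result;
    std::sort(symbols.begin(), symbols.end());
    do {
        if (isSmallestRotation(symbols)) result.push_back(symbols);
    } while (std::next_permutation(symbols.begin(), symbols.end()));
    return result;
}
```

This still visits every distinct permutation, so it does not avoid the factorial cost; it only avoids storing the rotations and removing them afterwards.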

This solution is going to involve a bit of itertools.permutations usage, set(), and some good ol' fashioned set difference. Bear in mind, the runtime for finding a permutation will still be O(n!). My solution won't do it in-line, either, but there may be a much more elegant solution that allows you to do so (and doesn't involve itertools.permutations). For this purpose, this is the straightforward way to accomplish the task.
Step 1: Algorithm for generating cycles, using the first element given. For a list [1, 1, 2, 2], this will give us [1, 1, 2, 2], [1, 2, 2, 1], [2, 1, 1, 2], [2, 2, 1, 1].
def rotations(li):
    count = 0
    while count < len(li):
        yield tuple(li)
        li = li[1:] + [li[0]]
        count += 1
Step 2: Importing itertools.permutations to give us the permutations in the first place, then setting up its results into a set.
from itertools import permutations
perm = set(permutations([1, 1, 2, 2]))
Step 3: Using the generator to give us our own set, with the cycles (something we want to rid ourselves of).
cycles = set(((i for i in rotations([1, 1, 2, 2]))))
Step 4: Apply set difference to each and the cycles are removed.
perm = perm.difference(cycles)
Hopefully this will help you out. I'm open to suggestions and/or corrections.

First I'll show the containers and algorithms we'll be using:
#include <vector>
#include <set>
#include <algorithm>
#include <iostream>
#include <iterator>
using std::vector;
using std::set;
using std::sort;
using std::next_permutation;
using std::copy;
using std::ostream_iterator;
using std::cout;
Next our vector which will represent the Permutation:
typedef vector<unsigned int> Permutation;
We need a comparison object to check whether a permutation is a rotation:
struct CompareCyclicPermutationsEqual
{
bool operator()(const Permutation& lhs, const Permutation& rhs);
};
And typedef a set which uses the cyclic comparison object:
typedef set<Permutation, CompareCyclicPermutationsEqual> Permutations;
Then the main function is quite simple:
int main()
{
    Permutation permutation = {1, 2, 1, 2};
    sort(permutation.begin(), permutation.end());
    Permutations permutations;
    do {
        permutations.insert(permutation);
    } while (next_permutation(permutation.begin(), permutation.end()));
    copy(permutations.begin(), permutations.end(),
         ostream_iterator<Permutation>(cout, "\n"));
    return 0;
}
Output:
1, 1, 2, 2,
1, 2, 1, 2,
I haven't implemented CompareCyclicPermutationsEqual yet. Also you'd need to implement ostream& operator<<(ostream& os, const Permutation& permutation).
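One possible way to fill in the comparison object is sketched here. std::set needs a strict weak ordering rather than an equality test, so this sketch orders permutations by their lexicographically smallest rotation; two rotations of the same sequence then compare as equivalent and only one of them is stored. The names CyclicLess, CyclicSet and canonicalRotation are assumptions for this example, not part of the answer above.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <vector>

typedef std::vector<unsigned int> Permutation;

// Lexicographically smallest rotation of p (simple O(n^2) scan).
Permutation canonicalRotation(const Permutation& p) {
    Permutation best = p, rot = p;
    for (std::size_t i = 1; i < p.size(); ++i) {
        std::rotate(rot.begin(), rot.begin() + 1, rot.end());
        if (rot < best) best = rot;
    }
    return best;
}

// Strict weak ordering under which rotations of one another are equivalent.
struct CyclicLess {
    bool operator()(const Permutation& lhs, const Permutation& rhs) const {
        return canonicalRotation(lhs) < canonicalRotation(rhs);
    }
};

typedef std::set<Permutation, CyclicLess> CyclicSet;
```

Recomputing the canonical rotation on every comparison is wasteful; inserting the canonical form into a plain std::set<Permutation> would probably be cheaper.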

Related

Perfect sum problem with fixed subset size

I am looking for a least time-complex algorithm that would solve a variant of the perfect sum problem (initially: finding all variable size subset combinations from an array [*] of integers of size n that sum to a specific number x) where the subset combination size is of a fixed size k and return the possible combinations without direct and also indirect (when there's a combination containing the exact same elements from another in another order) duplicates.
I'm aware this problem is NP-hard, so I am not expecting a perfect general solution but something that could at least run in a reasonable time in my case, with n close to 1000 and k around 10
Things I have tried so far:
Finding a combination, then doing successive modifications on it and its modifications
Let's assume I have an array such as:
s = [1,2,3,3,4,5,6,9]
So I have n = 8, and I'd like x = 10 for k = 3
I found thanks to some obscure method (bruteforce?) a subset [3,3,4]
From this subset I'm finding other possible combinations by taking two elements out of it and replacing them with other elements that sum the same, i.e. (3, 3) can be replaced by (1, 5) since both got the same sum and the replacing numbers are not already in use. So I obtain another subset [1,5,4], then I repeat the process for all the obtained subsets... indefinitely?
The main issue as suggested here is that it's hard to determine when it's done and this method is rather chaotic. I imagined some variants of this method but they really are work in progress
Iterating through the set to list all k long combinations that sum to x
Pretty self-explanatory. This is a naive method that does not work well in my case, since I have a pretty large n and a k that is not small enough to avoid a catastrophically big number of combinations (on the order of 10^27!)
I experimented with several mechanisms for restricting the search area instead of blindly iterating through all possibilities, but it's rather complicated and still a work in progress
What would you suggest? (Snippets can be in any language, but I prefer C++)
[*] To clear the doubt about whether or not the base collection can contain duplicates, I used the term "array" instead of "set" to be more precise. The collection can contain duplicate integers in my case and quite much, with 70 different integers for 1000 elements (counts rounded), for example
With reasonable sum limit this problem might be solved using extension of dynamic programming approach for subset sum problem or coin change problem with predetermined number of coins. Note that we can count all variants in pseudopolynomial time O(x*n), but output size might grow exponentially, so generation of all variants might be a problem.
Make a 3D array, list or vector with outer dimension x+1 (indices 0 to x), for example A[][][]. Every element A[p] of this list contains the list of possible subsets with sum p.
We can walk through all elements (call current element item) of initial "set" (I noticed repeating elements in your example, so it is not true set).
Now scan A[] list from the last entry to the beginning. (This trick helps to avoid repeating usage of the same item).
If A[i - item] contains subsets with size < k, we can add all these subsets to A[i] appending item.
After full scan A[x] will contain subsets of size k and less, having sum x, and we can filter only those of size k
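If only the number of variants is needed, the same backward scan can be done on counters instead of lists of subsets. The sketch below assumes non-negative items and counts selections by position (so the two 3's are treated as distinct); the function name is hypothetical. With the subset size as an extra dimension the cost is O(n * k * x).

```cpp
#include <cstdint>
#include <vector>

// ways[j][s] = number of ways to pick j items (by position) with sum s.
// Assumes k >= 0, target >= 0 and non-negative items.
std::uint64_t countFixedSizeSubsets(const std::vector<int>& items, int k, int target) {
    std::vector<std::vector<std::uint64_t>> ways(
        k + 1, std::vector<std::uint64_t>(target + 1, 0));
    ways[0][0] = 1;  // the empty selection
    for (int item : items) {
        // Scan sizes and sums downwards so each item is used at most once.
        for (int j = k; j >= 1; --j)
            for (int s = target; s >= item; --s)
                ways[j][s] += ways[j - 1][s - item];
    }
    return ways[k][target];
}
```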
Example of output of my quick-made Delphi program for the next data:
Lst := [1,2,3,3,4,5,6,7];
k := 3;
sum := 10;
3 3 4
2 3 5 //distinct 3's
2 3 5
1 4 5
1 3 6
1 3 6 //distinct 3's
1 2 7
To exclude variants with distinct repeated elements (if needed), we can use a non-first occurrence only for subsets already containing the first occurrence of item (so 3 3 4 will be valid while the second 2 3 5 won't be generated)
I literally translate my Delphi code into C++ (weird, I think :)
int main()
{
vector<vector<vector<int>>> A;
vector<int> Lst = { 1, 2, 3, 3, 4, 5, 6, 7 };
int k = 3;
int sum = 10;
A.push_back({ {0} }); //fictive array to make non-empty variant
for (int i = 0; i < sum; i++)
A.push_back({{}});
for (int item : Lst) {
for (int i = sum; i >= item; i--) {
for (int j = 0; j < A[i - item].size(); j++)
if (A[i - item][j].size() < k + 1 &&
A[i - item][j].size() > 0) {
vector<int> t = A[i - item][j];
t.push_back(item);
A[i].push_back(t); //add new variant including current item
}
}
}
//output needed variants
for (int i = 0; i < A[sum].size(); i++)
if (A[sum][i].size() == k + 1) {
for (int j = 1; j < A[sum][i].size(); j++) //excluding fictive 0
cout << A[sum][i][j] << " ";
cout << endl;
}
}
Here is a complete solution in Python. Translation to C++ is left to the reader.
Like the usual subset sum, generation of the doubly linked summary of the solutions is pseudo-polynomial. It is O(count_values * distinct_sums * depths_of_sums). However actually iterating through them can be exponential. But using generators the way I did avoids using a lot of memory to generate that list, even if it can take a long time to run.
from collections import namedtuple
# This is a doubly linked list.
# (value, tail) will be one group of solutions. (next_answer) is another.
SumPath = namedtuple('SumPath', 'value tail next_answer')
def fixed_sum_paths (array, target, count):
    # First find counts of values to handle duplications.
    value_repeats = {}
    for value in array:
        if value in value_repeats:
            value_repeats[value] += 1
        else:
            value_repeats[value] = 1
    # paths[depth][x] will be all subsets of size depth that sum to x.
    paths = [{} for i in range(count+1)]
    # First we add the empty set.
    paths[0][0] = SumPath(value=None, tail=None, next_answer=None)
    # Now we start adding values to it.
    for value, repeats in value_repeats.items():
        # Reversed depth avoids seeing paths we will find using this value.
        for depth in reversed(range(len(paths))):
            for result, path in paths[depth].items():
                for i in range(1, repeats+1):
                    if count < i + depth:
                        # Do not fill in too deep.
                        break
                    result += value
                    if result in paths[depth+i]:
                        path = SumPath(
                            value=value,
                            tail=path,
                            next_answer=paths[depth+i][result]
                            )
                    else:
                        path = SumPath(
                            value=value,
                            tail=path,
                            next_answer=None
                            )
                    paths[depth+i][result] = path
                    # Subtle bug fix, a path for value, value
                    # should not lead to value, other_value because
                    # we already inserted that first.
                    path = SumPath(
                        value=value,
                        tail=path.tail,
                        next_answer=None
                        )
    return paths[count][target]
def path_iter(paths):
    if paths.value is None:
        # We are the tail
        yield []
    else:
        while paths is not None:
            value = paths.value
            for answer in path_iter(paths.tail):
                answer.append(value)
                yield answer
            paths = paths.next_answer
def fixed_sums (array, target, count):
    paths = fixed_sum_paths(array, target, count)
    return path_iter(paths)
for path in fixed_sums([1,2,3,3,4,5,6,9], 10, 3):
    print(path)
Incidentally for your example, here are the solutions:
[1, 3, 6]
[1, 4, 5]
[2, 3, 5]
[3, 3, 4]
You should first sort the array. Second, determine whether the problem is solvable at all, to save time: take the last k elements and compare their sum with x. If the sum is smaller than x, you are done, because no combination can reach x. If it is exactly equal, you are also done, since there are no other combinations. O(n) feels nice, doesn't it?
If the sum is larger, then you have a lot of work to do. You need to store all the combinations in a separate array. Replace the smallest of the k numbers with the smallest element in the array. If the sum is still larger than x, do the same for the second, third, and so on until you get something smaller than x. Once the sum is smaller than x, start increasing the value at the last position you stopped at until you hit x. Once you hit x, that is your combination. Then you can take the previous element: if you had 1, 1, 5, 6, you can grab the 1 as well and add it to your smallest element, 5, to get 6; next you check whether this number 6 can be written as a combination of two values, and you stop once you hit the value. Then repeat for the others.
Your problem can be solved in O(n!) time in the worst case. I would not suggest enumerating 10^27 combinations: that means storing more than 10^27 elements, which is a bad idea. Do you even have that much space? At 3 bits for the header and 8 bits for each integer, you would need 9.8765*10^25 terabytes just to store that colossal array, more memory than a supercomputer has. You should worry about whether your computer can even store this monster rather than whether you can solve the problem. With that many combinations, even a quadratic solution would crash your computer, and quadratic is a long way off from O(n!).
A brute force method using recursion might look like this...
For example, given variables set, x, k, the following pseudo code might work:
setSumStructure find(int[] set, int x, int k, int setIdx)
{
    // x is the remaining target sum, k the number of elements still to pick
    if (k == 0) return (x == 0) ? the empty set : null;
    int sz = set.length - setIdx;
    if (sz < k) return null; // not enough elements left
    if (sz == k) check whether the sum of set[setIdx] .. set[set.length - 1] == x. if it does, return those elements together with the sum, else return null;
    for (int i = setIdx; i < set.length - (k - 1); i++)
        filter(find (set, x - set[i], k - 1, i + 1));
    return filteredSets;
}

Every sum possibilities of elements

From a given array (call it numbers[]), i want another array (results[]) which contains all sum possibilities between elements of the first array.
For example, if I have numbers[] = {1,3,5}, results[] will be {1,3,5,4,8,6,9,0}.
There are 2^n possibilities.
It doesn't matter if a number appears twice, because results[] will be a set.
I did it for sum of pairs or triplet, and it's very easy. But I don't understand how it works when we sum 0, 1, 2 or n numbers.
This is what I did for pairs :
std::unordered_set<int> pairPossibilities(std::vector<int> &numbers) {
std::unordered_set<int> results;
for(int i=0;i<numbers.size()-1;i++) {
for(int j=i+1;j<numbers.size();j++) {
results.insert(numbers.at(i)+numbers.at(j));
}
}
return results;
}
Also, assuming that the numbers[] is sorted, is there any possibility to sort results[] while we fill it ?
Thanks!
This can be done with Dynamic Programming (DP) in O(n*W) where W = sum{numbers}.
This is basically the same solution of Subset Sum Problem, exploiting the fact that the problem has optimal substructure.
DP[i, 0] = true
DP[-1, w] = false w != 0
DP[i, w] = DP[i-1, w] OR DP[i-1, w - numbers[i]]
Start by following the above solution to find DP[n, sum{numbers}].
As a result, you will get:
DP[n , w] = true if and only if w can be constructed from numbers
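The recurrence can be sketched in C++ with a single rolling row instead of the full DP[i, w] table. This is an illustration only, assuming all numbers are non-negative; the function name is made up. Because the final scan runs over w in increasing order, the result also comes out sorted.

```cpp
#include <numeric>
#include <vector>

// Every value that is the sum of some subset of numbers, in ascending order.
// Assumes all numbers are non-negative.
std::vector<int> allSubsetSums(const std::vector<int>& numbers) {
    const int total = std::accumulate(numbers.begin(), numbers.end(), 0);
    std::vector<char> reachable(total + 1, 0);
    reachable[0] = 1;                          // the empty subset
    for (int x : numbers)
        for (int w = total; w >= x; --w)       // downwards: use x at most once
            if (reachable[w - x]) reachable[w] = 1;
    std::vector<int> sums;
    for (int w = 0; w <= total; ++w)
        if (reachable[w]) sums.push_back(w);
    return sums;
}
```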
Following on from the Dynamic Programming answer, You could go with a recursive solution, and then use memoization to cache the results, top-down approach in contrast to Amit's bottom-up.
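// Forward declaration so that subsetSum can call it.
void generateSubsetSum(vector<int>& ans, int sum, vector<int>& nums, int i);
vector<int> subsetSum(vector<int>& nums)
{
    vector<int> ans;
    generateSubsetSum(ans,0,nums,0);
    return ans;
}
void generateSubsetSum(vector<int>& ans, int sum, vector<int>& nums, int i)
{
    if(i == nums.size() )
    {
        ans.push_back(sum); // all elements decided: record this subset's sum
        return;
    }
    generateSubsetSum(ans,sum + nums[i],nums,i + 1); // include nums[i]
    generateSubsetSum(ans,sum,nums,i + 1);           // skip nums[i]
}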
Result is : {9 4 6 1 8 3 5 0} for the set {1,3,5}
This simply picks the first number at the first index i adds it to the sum and recurses. Once it returns, the second branch follows, sum, without the nums[i] added. To memoize this you would have a cache to store sum at i.
I would do something like this (seems easier) [I wanted to put this in comment but can't write the shifting and removing an elem at a time - you might need a linked list]
1 3 5
3 5
-----
4 8
1 3 5
5
-----
6
1 3 5
3 5
5
------
9
Add 0 to the list in the end.
Another way to solve this is create a subset arrays of vector of elements then sum up each array's vector's data.
e.g
1 3 5 = {1, 3} + {1,5} + {3,5} + {1,3,5} after removing sets of single element.
Keep in mind that it is always easier said than done. A single tiny mistake along the implemented algorithm would take a lot of time in debug to find it out. =]]
There has to be a binary chop version, as well. This one is a bit heavy-handed and relies on that set of answers you mention to filter repeated results:
Split the list into 2,
and generate the list of sums for each half
by recursion:
the minimum state is either
2 entries, with 1 result,
or 3 entries with 3 results
alternatively, take it down to 1 entry with 0 results, if you insist
Then combine the 2 halves:
All the returned entries from both halves are legitimate results
There are 4 additional result sets to add to the output result by combining:
The first half inputs vs the second half inputs
The first half outputs vs the second half inputs
The first half inputs vs the second half outputs
The first half outputs vs the second half outputs
Note that the outputs of the two halves may have some elements in common, but they should be treated separately for these combines.
The inputs can be scrubbed from the returned outputs of each recursion if the inputs are legitimate final results. If they are they can either be added back in at the top-level stage or returned by the bottom level stage and not considered again in the combining.
You could use a bitfield instead of a set to filter out the duplicates. There are reasonably efficient ways of stepping through a bitfield to find all the set bits. The max size of the bitfield is the sum of all the inputs.
There is no intelligence here, but lots of opportunity for parallel processing within the recursion and combine steps.
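The bitfield idea mentioned above can be sketched with std::bitset: bit s is set when s is a reachable sum, and adding a number x is a shift by x followed by an OR. The bound kMaxSum is an arbitrary assumption for this example (sums above it are silently dropped), and the function name is hypothetical.

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

constexpr std::size_t kMaxSum = 1024;  // assumed upper bound on the total

// Bit s of the result is set iff some subset of numbers sums to s.
std::bitset<kMaxSum + 1> subsetSumBits(const std::vector<unsigned>& numbers) {
    std::bitset<kMaxSum + 1> bits;
    bits.set(0);                                   // the empty subset
    for (unsigned x : numbers) bits |= bits << x;  // either skip x or add it
    return bits;
}
```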

C++: Recursive function for variations with repetitions, ordered by amount of different letters

I have a function that generates variations like this: 111, 112, ..., 133, 211, 212, ..., 233, 311, ..., 333. Length of generated sequences always matches length of dictionary; with 4 symbols it'd be 1111 to 4444.
This is done in a brute-force algorithm for graph coloring. We're trying to find the right sequence that uses as few different colors as possible, i.e. if both 12343 and 12321 are solutions, we'd prefer the latter.
Right now I check each and every sequence to see whether it's valid, and store the best result found so far. It's not really good code.
So professor asked me to write a function that generates variations in specific order. These sequences should come ordered by their amount of different numbers, like this: 111, 222, 333; 112, 113, 121, …, 323; 123, 213. In this case, if we found out that, say, 121 is right, we just stop, because we already know that it’s the best solution.
The idea is to skip as much sequence checks as possible so the code would run faster. Please help :)
Right now I use this code:
init function
std::vector<int> res; //contains the "alphabet"
res.reserve(V);
for (int i = V - 1; i >= 0; i--) {
res.push_back(i);
}
std::vector<int> index(res.size());
std::vector<int> bestresult; //here goes the best answer if it's found
bestresult.reserve(V); //reserve instead of sizing, so push_back yields V entries, not 2*V
for (int i = V - 1; i >= 0; i--) {
bestresult.push_back(i);
}
int bestcolors = V;
permutate(res, index, 0, bestresult, bestcolors);
result = bestresult;
permutate:
void Graph::permutate(const std::vector<int>& s, std::vector<int>& index, std::size_t depth, std::vector<int>& bestres, int &lowestAmountOfColors)
{
if (depth == s.size()) {
//doing all needed checks and saving bestresult here;
return;
}
for (std::size_t i = 0; i < s.size(); ++i) {
index[depth] = i;
permutate(s, index, depth + 1, bestres, lowestAmountOfColors);
}
}
How can I alter these functions?
The challenge is to find all permutations of colors so that you can test if they are a valid graph coloring. Unfortunately, it is exponential. So we need to search the permutations in a way that we check the smallest solutions first, and we need to prune the solution space dramatically.
To find the smallest solutions first, we must limit the number of colors available, and exhaust those permutations before we grow the number of colors. Pretty simple. We just need a function that considers n colors for N vertices. The number of vertices remains fixed, but we consider n=1, then n=2, etc.
Within the function, we know that we need various combinations of 1, 2, ... n with enough repetition to get a total of N different values. So I made a vector of counts. This vector has n entries, and the values sum up to N.
For example, if we are considering three-color solutions for a graph with 8 vertices, one possible counts array would be {4, 3, 1}, which would be used to generate the candidate {1, 1, 1, 1, 2, 2, 2, 3}. Color 1 appears 4 times. Color 2 appears 3 times. Color 3 appears 1 time.
The cool thing about this counts array is that as long as it is sorted greatest to least, then its combinations cannot duplicate any other combination we have considered, because colors are interchangable. (Okay, not entirely accurate, there are some duplications when colors have the same count, but we eliminated a lot of permutations from ever being looked at, which is the whole point).
Once you reduce the counts array to an actual candidate solution, you can find all ordering using combinations, not permutations. This will generate fewer candidates. Google next_combination to find some good code showing how to do this.
When we generate the counts array, I initialized all values to 1, then added all the remaining counts to the first color. I search ALL combinations which meet the counts array. Then I get the next candidate by shifting the counts to the right in such a way that it remains sorted.
So to sum up, find_minimum_graph_coloring has a for loop which calls solve_for_n. That function generates all the possible counts-arrays for that value of n, and calls another function. That function checks all combinations for that counts-array.
The first for loop checks smaller numbers of colors first, so we can return immediately upon finding a solution. The counts-array notation eliminates many equivalent colorations so if we consider {1, 1, 2} then we will never try {2, 2, 1}
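The outer loop over the number of colors can be sketched as follows. Note that this sketch swaps the counts-array enumeration for plain backtracking with a simple symmetry rule (a vertex may reuse a color already in use or open exactly one new color), so it illustrates the "fewest colors first, stop at the first success" idea rather than the exact method described. The names Graph, extend and minimumColoring are assumptions.

```cpp
#include <cstddef>
#include <vector>

using Graph = std::vector<std::vector<int>>;  // adjacency lists, vertices 0..V-1

// Try to color vertices v, v+1, ... given that vertices [0, v) are colored.
// used = number of distinct colors assigned so far, limit = colors allowed.
bool extend(const Graph& g, std::vector<int>& color, std::size_t v, int used, int limit) {
    if (v == g.size()) return true;
    // Colors are interchangeable: reuse an old color or open color index `used`.
    for (int c = 0; c <= used && c < limit; ++c) {
        bool clash = false;
        for (int u : g[v])
            if (static_cast<std::size_t>(u) < v && color[u] == c) { clash = true; break; }
        if (clash) continue;
        color[v] = c;
        if (extend(g, color, v + 1, c == used ? used + 1 : used, limit)) return true;
    }
    return false;
}

// Tries 1 color, then 2, ...; the first coloring found uses the fewest colors.
std::vector<int> minimumColoring(const Graph& g) {
    std::vector<int> color(g.size(), -1);
    for (int limit = 1; limit <= static_cast<int>(g.size()); ++limit)
        if (extend(g, color, 0, 0, limit)) return color;
    return color;  // only reached for an empty graph
}
```

For a graph without self-loops the loop always terminates by limit = V, since giving every vertex its own color is valid.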

Algorithm to find isomorphic set of permutations

I have an array of set of permutations, and I want to remove isomorphic permutations.
We have S sets of permutations, where each set contains K permutations, and each permutation is represented as an array of N elements. I'm currently saving it as an array int pset[S][K][N], where S, K and N are fixed, and N is larger than K.
Two sets of permutations, A and B, are isomorphic, if there exists a permutation P, that converts elements from A to B (for example, if a is an element of set A, then P(a) is an element of set B). In this case we can say that P makes A and B isomorphic.
My current algorithm is:
We choose all pairs s1 = pset[i] and s2 = pset[j], such that i < j
Each element of the chosen sets (s1 and s2) is numbered from 1 to K. That means that each element can be represented as s1[i] or s2[i], where 0 < i < K+1
For every permutation T of K elements, we do the following:
Find the permutation R, such that R(s1[1]) = s2[1]
Check if R is a permutation that make s1 and T(s2) isomorphic, where T(s2) is a rearrangement of the elements (permutations) of the set s2, so basically we just check if R(s1[i]) = s2[T[i]], where 0 < i < K+1
If not, then we go to the next permutation T.
This algorithm is really slow: O(S^2) for the first step, O(K!) to loop through each permutation T, O(N^2) to find R, and O(K*N) to check whether R is the permutation that makes s1 and s2 isomorphic, so it is O(S^2 * K! * N^2) overall.
Question: Can we make it faster?
You can sort and compare:
// 1 - sort each set of permutation
for i = 0 to S-1
sort(pset[i])
// 2 - sort the array of permutations itself
sort(pset)
// 3 - compare
for i = 1 to S-1 {
if(areEqual(pset[i], pset[i-1]))
// pset[i] and pset[i-1] are isomorphic
}
A concrete example:
0: [[1,2,3],[3,2,1]]
1: [[2,3,1],[1,3,2]]
2: [[1,2,3],[2,3,1]]
3: [[3,2,1],[1,2,3]]
After 1:
0: [[1,2,3],[3,2,1]]
1: [[1,3,2],[2,3,1]] // order changed
2: [[1,2,3],[2,3,1]]
3: [[1,2,3],[3,2,1]] // order changed
After 2:
2: [[1,2,3],[2,3,1]]
0: [[1,2,3],[3,2,1]]
3: [[1,2,3],[3,2,1]]
1: [[1,3,2],[2,3,1]]
After 3:
(2, 0) not isomorphic
(0, 3) isomorphic
(3, 1) not isomorphic
What about the complexity?
1 is O(S * (K * N) * log(K * N))
2 is O(S * K * N * log(S * K * N))
3 is O(S * K * N)
So the overall complexity is O(S * K * N log(S * K * N))
There is a very simple solution for this: transposition.
If two sets are isomorphic, it means a one-to-one mapping exists, where the set of all the numbers at index i in set S1 equals the set of all the numbers at some index k in set S2. My conjecture is that no two non-isomorphic sets have this property.
(1) Jean Logeart's example:
0: [[1,2,3],[3,2,1]]
1: [[2,3,1],[1,3,2]]
2: [[1,2,3],[2,3,1]]
3: [[3,2,1],[1,2,3]]
Perform ONE pass:
Transpose, O(n):
0: [[1,3],[2,2],[3,1]]
Sort both in and between groups, O(something log something):
0: [[1,3],[1,3],[2,2]]
Hash:
"131322" -> 0
...
"121233" -> 1
"121323" -> 2
"131322" -> already hashed.
0 and 3 are isomorphic.
(2) vsoftco's counter-example in his comment to Jean Logeart's answer:
A = [ [0, 1, 2], [2, 0, 1] ]
B = [ [1, 0, 2], [0, 2, 1] ]
"010212" -> A
"010212" -> already hashed.
A and B are isomorphic.
You can turn each set into a transposed-sorted string or hash or whatever compressed object for linear-time comparison. Note that this algorithm considers all three sets A, B and C as isomorphic even if one p converts A to B and another p converts A to C. Clearly, in this case, there are ps to convert any one of these three sets to the other, since all we are doing is moving each i in one set to a specific k in the other. If, as you stated, your goal is to "remove isomorphic permutations," you will still get a list of sets to remove.
Explanation:
Assume that along with our sorted hash, we kept a record of which permutation each i came from. vsoftco's counter-example:
010212 // hash for A and B
100110 // origin permutation, set A
100110 // origin permutation, set B
In order to confirm isomorphism, we need to show that the i's grouped in each index from the first set moved to some index in the second set, which index does not matter. Sorting the groups of i's does not invalidate the solution, rather it serves to confirm movement/permutation between sets.
Now by definition, each number in a hash and each number in each group in the hash is represented in an origin permutation exactly one time for each set. However we choose to arrange the numbers in each group of i's in the hash, we are guaranteed that each number in that group is representing a different permutation in the set; and the moment we theoretically assign that number, we are guaranteed it is "reserved" for that permutation and index only. For a given number, say 2, in the two hashes, we are guaranteed that it comes from one index and permutation in set A, and in the second hash corresponds to one index and permutation in set B. That is all we really need to show - that the number in one index for each permutation in one set (a group of distinct i's) went to one index only in the other set (a group of distinct k's). Which permutation and index the number belongs to is irrelevant.
Remember that any set S2, isomorphic to set S1, can be derived from S1 using one permutation function or various combinations of different permutation functions applied to S1's members. What the sorting, or reordering, of our numbers and groups actually represents is the permutation we are choosing to assign as the solution to the isomorphism rather than an actual assignment of which number came from which index and permutation. Here is vsoftco's counter-example again, this time we will add the origin indexes of our hashes:
110022 // origin index set A
001122 // origin index set B
Therefore our permutation, a solution to the isomorphism, is:
Or, in order:
(Notice that in Jean Logeart's example there is more than one solution to the isomorphism.)
Suppose that two elements of s1, s2 \in S are isomorphic. Then if p1 and p2 are permutations, then s1 is isomorphic to s2 iff p1(s1) is isomorphic to p2(s2) where pi(si) is the set of permutations obtained by applying pi to every element in si.
For each i in 1...s and j in 1...k, choose the j-th member of si, and find the permutation that changes it to unity. Apply it to all the elements of si. Hash each of the k permutations to a number, obtaining k numbers, for any choice of i and j, at cost nk.
Comparing the hashed sets for two different values of i and j is k^2 < nk. Thus, you can find the set of candidate matches at cost s^2 k^3 n. If the actual number of matches is low, the overall complexity is far beneath what you specified in your question.
Take a0 in A. Then find its inverse (fast, O(N)), call it a0inv. Then choose some b_i in B, define P_i = b_i * a0inv, and check that P_i * a generates B when varying a over A. Do this for every b_i in B. If you don't find any i for which the relation holds, then the sets are not isomorphic. If you find such an i, then the sets are isomorphic. The runtime is O(K^2) for each pair of sets it checks, and you'd need to check O(S^2) sets, so you end up with O(S^2 * K^2 * N).
PS: I assumed here that by "maps A to B" you mean mapping under permutation composition, so P(a) is actually the permutation P composed with the permutation a, and I've used the fact that if P is a permutation, then there must exist an i for which Pa = b_i for some a.
EDIT I decided to undelete my answer as I am not convinced the previous one (#Jean Logeart) based on searching is correct. If yes, I'll gladly delete mine, as it performs worse, but I think I have a counterexample, see the comments below Jean's answer.
To check if two sets S₁ and S₂ are isomorphic you can do a much shorter search.
If they are isomorphic then there is a permutation t that maps each element of S₁ to an element of S₂; to find t you can just pick any fixed element p of S₁ and consider the permutations
t₁ = (1/p) q₁
t₂ = (1/p) q₂
t₃ = (1/p) q₃
...
for all elements q of S₂. For, if a valid t exists then it must map the element p to an element of S₂, so only permutations mapping p to an element of S₂ are possible candidates.
Moreover, given a candidate t, to check whether the two sets of permutations S₁t and S₂ are equal you could use a hash computed as the XOR of a hash code for each element, doing the full check of all the permutations only if the hashes match.
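The candidate search can be sketched in C++ as follows. The sketch assumes one particular reading of "P converts a to P(a)": P relabels the values of a, so P(a)[i] = P[a[i]], with permutations stored 0-based. Under a different composition convention the helper functions would change. It uses a std::set lookup instead of the hash pre-check, and all names are hypothetical.

```cpp
#include <cstddef>
#include <set>
#include <vector>

using Perm = std::vector<int>;   // p[i] is the image of i, values 0..N-1
using PermSet = std::set<Perm>;

// The relabeling t with t[a[i]] = b[i], i.e. the t that turns a into b.
Perm relabeling(const Perm& a, const Perm& b) {
    Perm t(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) t[a[i]] = b[i];
    return t;
}

// Applies t to every value of a.
Perm apply(const Perm& t, const Perm& a) {
    Perm r(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) r[i] = t[a[i]];
    return r;
}

// True if some t maps every element of A into B (all perms have equal length).
bool isomorphic(const PermSet& A, const PermSet& B) {
    if (A.size() != B.size()) return false;
    if (A.empty()) return true;
    const Perm& p = *A.begin();            // any fixed element of A
    for (const Perm& q : B) {              // only t with t(p) = q can work
        const Perm t = relabeling(p, q);
        bool ok = true;
        for (const Perm& a : A)
            if (B.count(apply(t, a)) == 0) { ok = false; break; }
        if (ok) return true;
    }
    return false;
}
```

Since t is a bijection and the two sets have the same size, mapping every element of A into B is enough to conclude that the image is all of B. Only K candidates are tried per pair of sets instead of K! rearrangements.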

Determine unique values across multiple sets

In this project, there are multiple sets that hold values from 1 to 9. I need to efficiently determine whether there are values that are unique to one set, i.e. that appear in no other set.
For Example:
std::set<int> s_1 = { 1, 2, 3, 4, 5 };
std::set<int> s_2 = { 2, 3, 4 };
std::set<int> s_3 = { 2, 3, 4, 6 };
Note: The number of sets is unknown until runtime.
As you can see, s_1 contains the unique value of 1 and 5 and s_3 contains the unique value of 6.
After determining the unique values, the aforementioned sets should then just contain the unique values like:
// s_1 { 1, 5 }
// s_2 { 2, 3, 4 }
// s_3 { 6 }
What I've tried so far is to loop through all the sets and record the count of the numbers that have appeared. However I wanted to know if there is a more efficient solution out there.
The C++ standard library has algorithms for intersection, difference, and union operations on two sets.
If I understood well your problem you could do this :
do an intersection on all sets (in a loop) to determine a base, and then apply a difference between each set and the base ?
You could benchmark this against your current implementation. Should be faster.
Check out this answer.
Getting Union, Intersection, or Difference of Sets in C++
EDIT: cf Tony D. comment : You can basically do the same operation using a std::bitset<> and binary operators (& | etc..), which should be faster.
Depending on the actual size of your input, might be well worth a try.
I would suggest something in c# like this
Dictionary<int, int> result = new Dictionary<int, int>();
foreach (var set in sets) {
    foreach (int i in set) {
        if (!result.ContainsKey(i))
            result.Add(i, 1);
        else
            result[i] = result[i] + 1;
    }
}
Now the numbers whose count is 1 are unique; then find the sets that contain these numbers...
I would suggest:
Start by inserting all the elements of all the sets into a multimap.
Here each element is a key and the set name will be the value.
Once your multimap is filled with all the elements of all the sets,
loop through the multimap and take the count of each element in the
multimap.
If the count is 1 for any key, that key is unique, and the value for
that key will be the set name.
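The counting approach can be sketched in C++ as follows. It is a minimal illustration that assumes the sets are held in a std::vector, and it leaves a set untouched when it has no unique value, which matches s_2 in the question's expected result. The function name is made up.

```cpp
#include <map>
#include <set>
#include <vector>

// Shrinks every set that owns at least one unique value down to those values.
// Sets without any unique value are left unchanged.
void keepUniqueValues(std::vector<std::set<int>>& sets) {
    std::map<int, int> seenIn;  // value -> number of sets containing it
    for (const auto& s : sets)
        for (int v : s) ++seenIn[v];
    for (auto& s : sets) {
        std::set<int> unique;
        for (int v : s)
            if (seenIn[v] == 1) unique.insert(v);
        if (!unique.empty()) s = unique;
    }
}
```

With values limited to 1 - 9, a fixed array of ten counters could replace the std::map.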