Algorithm that can create all combinations and all groups of those combinations - c++

Let's say I have a set of elements S = { 1, 2, 3, 4, 5, 6, 7, 8, 9 }
I would like to create combinations of 3 and group them in a way such that no number appears in more than one combination.
Here is an example:
{ {3, 7, 9}, {1, 2, 4}, {5, 6, 8} }
The order of the numbers in the groups does not matter, nor does the order of the groups in the entire example.
In short, I want every possible group combination from every possible combination in the original set, excluding the ones that have a number appearing in multiple groups.
My question: is this actually feasible in terms of run time and memory? My sample sizes could be somewhere around 30-50 numbers.
If so, what is the best way to create this algorithm? Would it be best to create all possible combinations, and choose the groups only if the number hasn't already appeared?
I'm writing this in Qt 5.6, which is a C++ based framework.

You can do this recursively, and avoid duplicates, if you keep the first element fixed in each recursion, and only make groups of 3 with the values in order, eg:
{1,2,3,4,5,6,7,8,9}
Put the lowest element in the first spot (a), and keep it there:
{a,b,c} = {1, *, *}
For the second spot (b), iterate over every value from the second-lowest to the second-highest:
{a,b,c} = {1, 2~8, *}
For the third spot (c), iterate over every value higher than the second value:
{1, 2~8, b+1~9}
Then recurse with the rest of the values.
{1,2,3} {4,5,6} {7,8,9}
{1,2,3} {4,5,7} {6,8,9}
{1,2,3} {4,5,8} {6,7,9}
{1,2,3} {4,5,9} {6,7,8}
{1,2,3} {4,6,7} {5,8,9}
{1,2,3} {4,6,8} {5,7,9}
{1,2,3} {4,6,9} {5,7,8}
{1,2,3} {4,7,8} {5,6,9}
{1,2,3} {4,7,9} {5,6,8}
{1,2,3} {4,8,9} {5,6,7}
{1,2,4} {3,5,6} {7,8,9}
...
{1,8,9} {2,6,7} {3,4,5}
Wen I say "in order", that doesn't have to be any specific order (numerical, alphabetical...), it can just be the original order of the input. You can avoid having to re-sort the input of each recursion if you make sure to pass the rest of the values on to the next recursion in the order you received them.
A run-through of the recursion:
Let's say you get the input {1,2,3,4,5,6,7,8,9}. As the first element in the group, you take the first element from the input, and for the other two elements, you iterate over the other values:
{1,2,3}
{1,2,4}
{1,2,5}
{1,2,6}
{1,2,7}
{1,2,8}
{1,2,9}
{1,3,4}
{1,3,5}
{1,3,6}
...
{1,8,9}
making sure the third element always comes after the second element, to avoid duplicates like:
{1,3,5} ⇆ {1,5,3}
Now, let's say that at a certain point, you've selected this as the first group:
{1,3,7}
You then pass the rest of the values onto the next recursion:
{2,4,5,6,8,9}
In this recursion, you apply the same rules as for the first group: take the first element as the first element in the group and keep it there, and iterate over the other values for the second and third element:
{2,4,5}
{2,4,6}
{2,4,8}
{2,4,9}
{2,5,6}
{2,5,8}
{2,5,9}
{2,6,7}
...
{2,8,9}
Now, let's say that at a certain point, you've selected this as the second group:
{2,5,6}
You then pass the rest of the values onto the next recursion:
{4,8,9}
And since this is the last group, there is only one possibility, and so this particular recursion would end in the combination:
{1,3,7} {2,5,6} {4,8,9}
As you see, you don't have to sort the values at any point, as long as you pass them onto the next recursion in the order you recevied them. So if you receive e.g.:
{q,w,e,r,t,y,u,i,o}
and you select from this the group:
{q,r,u}
then you should pass on:
{w,e,t,y,i,o}
Here's a JavaScript snippet which demonstrates the method; it returns a 3D array with combinations of groups of elements.
(The filter function creates a copy of the input array, with elements 0, i and j removed.)
function clone2D(array) {
var clone = [];
for (var i = 0; i < array.length; i++) clone.push(array[i].slice());
return clone;
}
function groupThree(input) {
var result = [], combination = [];
group(input, 0);
return result;
function group(input, step) {
combination[step] = [input[0]];
for (var i = 1; i < input.length - 1; i++) {
combination[step][1] = input[i];
for (var j = i + 1; j < input.length; j++) {
combination[step][2] = input[j];
if (input.length > 3) {
var rest = input.filter(function(elem, index) {
return index && index != i && index != j;
});
group(rest, step + 1);
}
else result.push(clone2D(combination));
}
}
}
}
var result = groupThree([1,2,3,4,5,6,7,8,9]);
for (var r in result) document.write(JSON.stringify(result[r]) + "<br>");

For n things taken 3 at a time, you could use 3 nested loops:
for(k = 0; k < n-2; k++){
for(j = k+1; j < n-1; j++){
for(i = j+1; i < n ; i++){
... S[k] ... S[j] ... S[i]
}
}
}
For a generic solution of n things taken k at a time, you could use an array of k counters.

I think You can solve it by using coin change problem with dynamic programming, just assume You are looking for change of 3 and every index in array is a coin value 1, then just output coins(values in Your array) that has been found.
Link: https://www.youtube.com/watch?v=18NVyOI_690

Related

Perfect sum problem with fixed subset size

I am looking for a least time-complex algorithm that would solve a variant of the perfect sum problem (initially: finding all variable size subset combinations from an array [*] of integers of size n that sum to a specific number x) where the subset combination size is of a fixed size k and return the possible combinations without direct and also indirect (when there's a combination containing the exact same elements from another in another order) duplicates.
I'm aware this problem is NP-hard, so I am not expecting a perfect general solution but something that could at least run in a reasonable time in my case, with n close to 1000 and k around 10
Things I have tried so far:
Finding a combination, then doing successive modifications on it and its modifications
Let's assume I have an array such as:
s = [1,2,3,3,4,5,6,9]
So I have n = 8, and I'd like x = 10 for k = 3
I found thanks to some obscure method (bruteforce?) a subset [3,3,4]
From this subset I'm finding other possible combinations by taking two elements out of it and replacing them with other elements that sum the same, i.e. (3, 3) can be replaced by (1, 5) since both got the same sum and the replacing numbers are not already in use. So I obtain another subset [1,5,4], then I repeat the process for all the obtained subsets... indefinitely?
The main issue as suggested here is that it's hard to determine when it's done and this method is rather chaotic. I imagined some variants of this method but they really are work in progress
Iterating through the set to list all k long combinations that sum to x
Pretty self explanatory. This is a naive method that do not work well in my case since I have a pretty large n and a k that is not small enough to avoid a catastrophically big number of combinations (the magnitude of the number of combinations is 10^27!)
I experimented several mechanism related to setting an area of research instead of stupidly iterating through all possibilities, but it's rather complicated and still work in progress
What would you suggest? (Snippets can be in any language, but I prefer C++)
[*] To clear the doubt about whether or not the base collection can contain duplicates, I used the term "array" instead of "set" to be more precise. The collection can contain duplicate integers in my case and quite much, with 70 different integers for 1000 elements (counts rounded), for example
With reasonable sum limit this problem might be solved using extension of dynamic programming approach for subset sum problem or coin change problem with predetermined number of coins. Note that we can count all variants in pseudopolynomial time O(x*n), but output size might grow exponentially, so generation of all variants might be a problem.
Make 3d array, list or vector with outer dimension x-1 for example: A[][][]. Every element A[p] of this list contains list of possible subsets with sum p.
We can walk through all elements (call current element item) of initial "set" (I noticed repeating elements in your example, so it is not true set).
Now scan A[] list from the last entry to the beginning. (This trick helps to avoid repeating usage of the same item).
If A[i - item] contains subsets with size < k, we can add all these subsets to A[i] appending item.
After full scan A[x] will contain subsets of size k and less, having sum x, and we can filter only those of size k
Example of output of my quick-made Delphi program for the next data:
Lst := [1,2,3,3,4,5,6,7];
k := 3;
sum := 10;
3 3 4
2 3 5 //distinct 3's
2 3 5
1 4 5
1 3 6
1 3 6 //distinct 3's
1 2 7
To exclude variants with distinct repeated elements (if needed), we can use non-first occurence only for subsets already containing the first occurence of item (so 3 3 4 will be valid while the second 2 3 5 won't be generated)
I literally translate my Delphi code into C++ (weird, I think :)
int main()
{
vector<vector<vector<int>>> A;
vector<int> Lst = { 1, 2, 3, 3, 4, 5, 6, 7 };
int k = 3;
int sum = 10;
A.push_back({ {0} }); //fictive array to make non-empty variant
for (int i = 0; i < sum; i++)
A.push_back({{}});
for (int item : Lst) {
for (int i = sum; i >= item; i--) {
for (int j = 0; j < A[i - item].size(); j++)
if (A[i - item][j].size() < k + 1 &&
A[i - item][j].size() > 0) {
vector<int> t = A[i - item][j];
t.push_back(item);
A[i].push_back(t); //add new variant including current item
}
}
}
//output needed variants
for (int i = 0; i < A[sum].size(); i++)
if (A[sum][i].size() == k + 1) {
for (int j = 1; j < A[sum][i].size(); j++) //excluding fictive 0
cout << A[sum][i][j] << " ";
cout << endl;
}
}
Here is a complete solution in Python. Translation to C++ is left to the reader.
Like the usual subset sum, generation of the doubly linked summary of the solutions is pseudo-polynomial. It is O(count_values * distinct_sums * depths_of_sums). However actually iterating through them can be exponential. But using generators the way I did avoids using a lot of memory to generate that list, even if it can take a long time to run.
from collections import namedtuple
# This is a doubly linked list.
# (value, tail) will be one group of solutions. (next_answer) is another.
SumPath = namedtuple('SumPath', 'value tail next_answer')
def fixed_sum_paths (array, target, count):
# First find counts of values to handle duplications.
value_repeats = {}
for value in array:
if value in value_repeats:
value_repeats[value] += 1
else:
value_repeats[value] = 1
# paths[depth][x] will be all subsets of size depth that sum to x.
paths = [{} for i in range(count+1)]
# First we add the empty set.
paths[0][0] = SumPath(value=None, tail=None, next_answer=None)
# Now we start adding values to it.
for value, repeats in value_repeats.items():
# Reversed depth avoids seeing paths we will find using this value.
for depth in reversed(range(len(paths))):
for result, path in paths[depth].items():
for i in range(1, repeats+1):
if count < i + depth:
# Do not fill in too deep.
break
result += value
if result in paths[depth+i]:
path = SumPath(
value=value,
tail=path,
next_answer=paths[depth+i][result]
)
else:
path = SumPath(
value=value,
tail=path,
next_answer=None
)
paths[depth+i][result] = path
# Subtle bug fix, a path for value, value
# should not lead to value, other_value because
# we already inserted that first.
path = SumPath(
value=value,
tail=path.tail,
next_answer=None
)
return paths[count][target]
def path_iter(paths):
if paths.value is None:
# We are the tail
yield []
else:
while paths is not None:
value = paths.value
for answer in path_iter(paths.tail):
answer.append(value)
yield answer
paths = paths.next_answer
def fixed_sums (array, target, count):
paths = fixed_sum_paths(array, target, count)
return path_iter(paths)
for path in fixed_sums([1,2,3,3,4,5,6,9], 10, 3):
print(path)
Incidentally for your example, here are the solutions:
[1, 3, 6]
[1, 4, 5]
[2, 3, 5]
[3, 3, 4]
You should first sort the so called array. Secondly, you should determine if the problem is actually solvable, to save time... So what you do is you take the last k elements and see if the sum of those is larger or equal to the x value, if it is smaller, you are done it is not possible to do something like that.... If it is actually equal yes you are also done there is no other permutations.... O(n) feels nice doesn't it?? If it is larger, than you got a lot of work to do..... You need to store all the permutations in an seperate array.... Then you go ahead and replace the smallest of the k numbers with the smallest element in the array.... If this is still larger than x then you do it for the second and third and so on until you get something smaller than x. Once you reach a point where you have the sum smaller than x, you can go ahead and start to increase the value of the last position you stopped at until you hit x.... Once you hit x that is your combination.... Then you can go ahead and get the previous element so if you had 1,1,5, 6 in your thingy, you can go ahead and grab the 1 as well, add it to your smallest element, 5 to get 6, next you check, can you write this number 6 as a combination of two values, you stop once you hit the value.... Then you can repeat for the others as well.... You problem can be solved in O(n!) time in the worst case.... I would not suggest that you 10^27 combinations, meaning you have more than 10^27 elements, mhmmm bad idea do you even have that much space??? That's like 3bits for the header and 8 bits for each integer you would need 9.8765*10^25 terabytes just to store that clossal array, more memory than a supercomputer, you should worry about whether your computer can even store this monster rather than if you can solve the problem, that many combinations even if you find a quadratic solution it would crash your computer, and you know what quadratic is a long way off from O(n!)...
A brute force method using recursion might look like this...
For example, given variables set, x, k, the following pseudo code might work:
setSumStructure find(int[] set, int x, int k, int setIdx)
{
int sz = set.length - setIdx;
if (sz < x) return null;
if (sz == x) check sum of set[setIdx] -> set[set.size] == k. if it does, return the set together with the sum, else return null;
for (int i = setIdx; i < set.size - (k - 1); i++)
filter(find (set, x - set[i], k - 1, i + 1));
return filteredSets;
}

How to uniform spread every k values over a collection of n values with k <= n?

I've a collection of k elements. I need to spread them uniformly random into a collection of n elements, where k <= n.
So for example, with this k-collection (with k = 3):
{ 3, 5, 6 }
and give n = 7, a valid permutation result (with n = 7 elements) could be:
{ 6, 5, 6, 3, 3, 6, 5}
Notice that every item within the k-collection must be used into the permutation.
So this is not a valid result:
{ 6, 3, 6, 3, 3, 6, 6} // it lacks "5"
What's the fast way to accomplish this?
The simplest way I can think of.
Add one of each item to the array. So with your example, your initial array is [3,5,6]. This guarantees that every element is represented at least once.
Then, successively pick an element at random, and add it to the array. Do this n-3 times. (i.e. fill the array with randomly selected items from the list of elements)
Shuffle the array.
This takes O(n) to fill the array, and O(n) to shuffle it.
Let's assume you have a
std::vector<int> input;
that contains the k elements you need to spread and
std::vector<int> output;
that will be filled with n elements.
I used the following approach for a similiar problem. (Edit: Thinking about it, here is a simpler and probably faster version than the original)
First we satisfy the condition that every item from input must occurr at least once in output. Therefore we put every element from input once into output.
output.resize(n); // fill with n 0's
std::copy(input.begin(), input.end(), output.begin()); // fill k first items
Now we can fill up the remaining n - k slots with random elements from input:
std::random_device rd;
std::mt19937 rand(rd()); // get seed from random device
std::uniform_int_distribution<> dist(0, k - 1); // for random numbers in [0, k-1]
for(size_t i = k; i < n; i++) {
output[i] = input[dist(rand)];
}
At the end shuffle the whole thing, to randomize the position of the first k elements:
std::random_shuffle(output.begin(), output.end(), rand);
I hope this is what you wanted.
You can try just randomly put values to ur n-collection, then verify if it contains all k-collection values if not try again. However it's not always fast xd u can also put missing values in a random place of n-collection, but remember to verify again.
Simply make an array of the k elements, say {3,5,6} in the given example. Make a variable counter, which is zero initially. If you want to spread it over n elements, simply iterate over n elements of array with the counter incrementing as
counter=(counter+1)%k;

Difficulty with an accumulator in a nested for loop

I'm pretty new at C++, and I have to write code for a matching game, and I'm having trouble with the function that detects matches.
Below is a simpler, watered down version of the game I made to try to debug the match function.
Basically I have a bunch of numbers stored in a 2D array, and I need to cycle through the grid to see if there are any matches (3 of the same character in a row, vertically or horizontally, constitutes a match). The user gets one point for every match.
Every time there's a match, I need to accumulate a variable that stores the user's score. I think the code is pretty intuitive, the loops and if statements are checking the second and third elements after the ith element, and if they are all the same, a match has been detected.
The problem I'm having is that points are not being added to the count variable when they should. The "things" array is meant to emulate the game board, and I've designed it so it has at least one match, which my if statements don't seem to be detecting. How do I fix it so "count" will store points?
I've never posted anything on here before, so I'm sorry for the slightly screwy formatting, but here's the code:
#include <iostream>
using namespace std;
int main()
{
int things[3][3] = {{3, 3, 3}, {2, 8, 4}, {3, 7, 2}};
int count = 0;
for (int i = 0; i <= 3; i++)
{
for (int j = 0; j <= 3; j++)
{
if (things[i] == things[i+1])
count++;
if (things[i] == things[i+2])
count++;
if (things[i] == things[i+3])
count++;
else
{
if (things[j] == things[j+1])
count++;
if (things[j] == things[j+2])
count++;
if (things[j] == things[j+3])
count++;
}
}
}
return 0;
}
3 of the same character in a row, vertically or horizontally, constitutes a match
Too code this correctly You need to first have the correct algorithm.
So this is how I would write it in pseudo code:
For every row:
- For first n-2 columns in this row
- if current match result and next 2 match results are the same : increment counter
For every column:
- For first n-2 rows in this column
- if current match result and next 2 match results are the same : increment counter

Creating a subset of a set iteratively using vector

So I want to create a subset, run that subset through code, then create a new subset. I'm using a vector for the set and subset. So far I have 3 nested for loops but I'm having trouble figuring out the variables I need.
Here's what I want to do. set = {0, 1, 2, 3, 4, 5} the value matches the index just to simplify this example. I now want subset = {} -> {0} -> {1} -> ... -> {0,1} -> {0,2} -> ... -> {0,5} -> {0,1,2} -> ... -> {0,4,5}. I'm having trouble representing the conditions in terms of variables.
Basically I want the first for loop to increase the subset size. from 0 to set.size() (this is easy). Within that loop, I want to have an iterator corresponding to the index in the element of the subset. I have this iterator initialized to subset.size(), so that we work with the last element first, then work our way to the first element in the subset. then the 3rd for loop, I want to iterate between possible values from the set. Let's say our current subset = {0,1,2} how do I let my program know to put the value '2' inside the last element of the subset, then 1 then 0?
I'm thinking it would involve something with taking the difference from set.size()-1 and subset.size()-1? But I'm not quite sure how. so then I want to iterate through until {0,1,5} and then {0,4,5} but again I'm not sure how to tell the program to stop at 4, as opposed to 5. Again I think this is something with difference but I can't quite figure it out.
to recap:
for loop to iterate through subset size
for loop to iterate through subset "working" element, starting from back
for loop to iterate through that index of subset,
starting from the correct corresponding set value to ending
at the correct corresponding set value
such that the subset goes from {} -> {0} -> {1} ->...-> {0,1} -> {4,5} -> {1,2,3} -> ... -> {1,4,5} and I dont actually need subset = {1,2,3,4,5} but it doesn't hurt my code if I can't stop before that. Again I'm looking to represent the start and end points as variables to make the inner loops work, but I can't figure it out. Huge thanks to anyone who can help me out.
this is approximately how I would go about it.
//handle null subset
for ( int size = 1; size < n; i++ ) {
int indices[size];
for ( int i = 0; i < size; i++ ) indices[i] = i;
while ( indices[0] <= n - size ) {
int i;
for ( i = 1; indices[size - i] == n - i; i-- );
indices[i]++;
for ( i = i + 1; i < size; i++ ) indices[i] = indices[i-1] + 1;
//print out elems using the indices in `indices`
}
//done with all subsets of size `size`
}
The outer loop should be pretty self explanatory. Including 0 seemed like it was going to make some of the inner logic annoying so I started at subsets of size 1.
indices holds the indices of the elements that should be included in the current subset. It starts with the indices 0-size-1.
The condition for the while isn't exactly obvious. The last valid subset this generates contains the last size elements, so if the first index is past n - size we've gone too far.
The inside of the while loop is just incrementing the subset. It looks for the last element that can be incremented and still give a valid subset, increments it, and then resets all of the subsequent elements to be as small as possible. Then you print it out somehow.
And that should be close to something that will do what you want. Let me know if it needs clarifications or corrections.
A trick to enumerate all subsets is to permutate a "selection flag" array, each element of which indicates whether corresponding element in original array is selected.
following is sample code:
void foo(const vector<int>& a)
{
size_t size = a.size();
// selection flag array
// '1' indicates selected, '0' indicates unselected
vector<int> f(size, 0);
for (size_t i = 1; i <= size; i++)
{
// increase the count of selected elements
f[i - 1] = 1;
do
{
for (size_t i = 0; i < size; i++)
{
if (f[i])
{
printf("%d\t", a[i]);
}
}
printf("\n");
} while (next_permutation(f.begin(), f.end(), [](int a, int b){ return a > b; }));
// next_permutation tries to permutate the array
// i.e. '1 1 0 0' -> '1 0 1 0' -> '0 1 1 0' -> ... -> '0 0 1 1'(end)
}
}

Using an array and moving duplicates to end

I got this question at an interview and at the end was told there was a more efficient way to do this but have still not been able to figure it out. You are passing into a function an array of integers and an integer for size of array. In the array you have a lot of numbers, some that repeat for example 1,7,4,8,2,6,8,3,7,9,10. You want to take that array and return an array where all the repeated numbers are put at the end of the array so the above array would turn into 1,7,4,8,2,6,3,9,10,8,7. The numbers I used are not important and I could not use a buffer array. I was going to use a BST, but the order of the numbers must be maintained(except for the duplicate numbers). I could not figure out how to use a hash table so I ended up using a double for loop(n^2 horrible I know). How would I do this more efficiently using c++. Not looking for code, just an idea of how to do it better.
In what follows:
arr is the input array;
seen is a hash set of numbers already encountered;
l is the index where the next unique element will be placed;
r is the index of the next element to be considered.
Since you're not looking for code, here is a pseudo-code solution (which happens to be valid Python):
arr = [1,7,4,8,2,6,8,3,7,9,10]
seen = set()
l = 0
r = 0
while True:
# advance `r` to the next not-yet-seen number
while r < len(arr) and arr[r] in seen:
r += 1
if r == len(arr): break
# add the number to the set
seen.add(arr[r])
# swap arr[l] with arr[r]
arr[l], arr[r] = arr[r], arr[l]
# advance `l`
l += 1
print arr
On your test case, this produces
[1, 7, 4, 8, 2, 6, 3, 9, 10, 8, 7]
I would use an additional map, where the key is the integer value from the array and the value is an integer set to 0 in the beginning. Now I would go through the array and increase the values in the map if the key is already in the map.
In the end I would go again through the array. When the integer from the array has a value of one in the map, I would not change anything. When it has a value of 2 or more in the map I would swap the integer from the array with the last one.
This should result in a runtime of O(n*log(n))
The way I would do this would be to create an array twice the size of the original and create a set of integers.
Then Loop through the original array, add each element to the set, if it already exists add it to the 2nd half of the new array, else add it to the first half of the new array.
In the end you would get an array that looks like: (using your example)
1,7,4,8,2,6,3,9,10,-,-,8,7,-,-,-,-,-,-,-,-,-
Then I would loop through the original array again and make each spot equal to the next non-null position (or 0'd or whatever you decided)
That would make the original array turn into your solution...
This ends up being O(n) which is about as efficient as I can think of
Edit: since you can not use another array, when you find a value that is already in the
set you can move every value after it forward one and set the last value equal to the
number you just checked, this would in effect do the same thing but with a lot more operations.
I have been out of touch for a while, but I'd probably start out with something like this and see how it scales with larger input. I know you didn't ask for code but in some cases it's easier to understand than an explanation.
Edit: Sorry I missed the requirement that you cannot use a buffer array.
// returns new vector with dupes a the end
std::vector<int> move_dupes_to_end(std::vector<int> input)
{
std::set<int> counter;
std::vector<int> result;
std::vector<int> repeats;
for (std::vector<int>::iterator i = input.begin(); i < input.end(); i++)
{
if (counter.find(*i) == counter.end())
result.push_back(*i);
else
repeats.push_back(*i);
counter.insert(*i);
}
result.insert(result.end(), repeats.begin(), repeats.end());
return result;
}
#include <algorithm>
T * array = [your array];
size_t size = [array size];
// Complexity:
sort( array, array + size ); // n * log(n) and could be threaded
// (if merge sort)
T * last = unique( array, array + size ); // n, but the elements after the last
// unique element are not defined
Check sort and unique.
void remove_dup(int* data, int count) {
int* L=data; //place to put next unique number
int* R=data+count; //place to place next repeat number
std::unordered_set<int> found(count); //keep track of what's been seen
for(int* cur=data; cur<R; ++cur) { //until we reach repeats
if(found.insert(*cur).second == false) { //if we've seen it
std::swap(*cur,*--R); //put at the beginning of the repeats
} else //or else
std::swap(*cur,*L++); //put it next in the unique list
}
std::reverse(R, data+count); //reverse the repeats to be in origional order
}
http://ideone.com/3choA
Not that I would turn in code this poorly commented. Also note that unordered_set probably uses it's own array internally, bigger than data. (This has been rewritten based on aix's answer, to be much faster)
If you know the bounds on what the integer values are, B, and the size of the integer array, SZ, then you can do something like the following:
Create an array of booleans seen_before with B elements, initialized to 0.
Create a result array result of integers with SZ elements.
Create two integers, one for front_pos = 0, one for back_pos = SZ - 1.
Iterate across the original list:
Set an integer variable val to the value of the current element
If seen_before[val] is set to 1, put the number at result[back_pos] then decrement back_pos
If seen_before[val] is not set to 1, put the number at result[front_pos] then increment front_pos and set seen_before[val] to 1.
Once you finish iterating across the main list, all the unique numbers will be at the front of the list while the duplicate numbers will be at the back. Fun part is that the entire process is done in one pass. Note that this only works if you know the bounds of the values appearing in the original array.
Edit: It was pointed out that there's no bounds on the integers used, so instead of initializing seen_before as an array with B elements, initialize it as a map<int, bool>, then continue as usual. That should get you n*log(n) performance.
This can be done by iterating the array & marking index of the first change.
later on swaping that mark index value with next unique value
& then incrementing that mark index for next swap
Java Implementation:
public static void solve() {
Integer[] arr = new Integer[] { 1, 7, 4, 8, 2, 6, 8, 3, 7, 9, 10 };
final HashSet<Integer> seen = new HashSet<Integer>();
int l = -1;
for (int i = 0; i < arr.length; i++) {
if (seen.contains(arr[i])) {
if (l == -1) {
l = i;
}
continue;
}
if (l > -1) {
final int temp = arr[i];
arr[i] = arr[l];
arr[l] = temp;
l++;
}
seen.add(arr[i]);
}
}
output is 1 7 4 8 2 6 3 9 10 8 7
It's ugly, but it meets the requirements of moving the duplicates to the end in place (no buffer array)
// warning, some light C++11
void dup2end(int* arr, size_t cnt)
{
std::set<int> k;
auto end = arr + cnt-1;
auto max = arr + cnt;
auto curr = arr;
while(curr < max)
{
auto res = k.insert(*curr);
// first time encountered
if(res.second)
{
++curr;
}
else
{
// duplicate:
std::swap(*curr, *end);
--end;
--max;
}
}
}
void move_duplicates_to_end(vector<int> &A) {
if(A.empty()) return;
int i = 0, tail = A.size()-1;
while(i <= tail) {
bool is_first = true; // check of current number is first-shown
for(int k=0; k<i; k++) { // always compare with numbers before A[i]
if(A[k] == A[i]) {
is_first = false;
break;
}
}
if(is_first == true) i++;
else {
int tmp = A[i]; // swap with tail
A[i] = A[tail];
A[tail] = tmp;
tail--;
}
}
If the input array is {1,7,4,8,2,6,8,3,7,9,10}, then the output is {1,7,4,8,2,6,10,3,9,7,8}. Comparing with your answer {1,7,4,8,2,6,3,9,10,8,7}, the first half is the same, while the right half is different, because I swap all duplicates with the tail of the array. As you mentioned, the order of the duplicates can be arbitrary.