How to remove duplicates in particular set of data?

Suppose two children each want to receive the same number of coins (coin denominations 1, 2, 6, 12). The children don't care about the value of the coins.
Example container of permutations that I want to share between the two children:
{1, 1, 1, 1, 1, 1},
{1, 1, 2, 2},
{1, 2, 1, 2},
{1, 2, 2, 1},
{2, 1, 1, 2},
{2, 1, 2, 1},
{2, 2, 1, 1}
Now I'd like to have the collections without duplicates:
child A | child B
2 2 | 1 1
2 1 | 2 1
1 1 | 2 2
1 1 1 | 1 1 1
These permutations are wrong:
1 2 1 2
1 2 2 1
2 1 1 2
because
child A | child B
1 2 | 1 2
is a permutation of
child A | child B
2 1 | 2 1
which we already have. The collections 1 2 2 1 and 2 1 1 2 are permutations of it as well.
My solution is below. It works correctly for this particular input, but it fails if you add more coins with different denominations.
#include <iostream>
#include <vector>
#include <unordered_set>
using namespace std;
int main()
{
    vector<vector<int>> permutations =
    {
        {1, 1, 1, 1, 1, 1},
        {1, 1, 2, 2},
        {1, 2, 1, 2},
        {1, 2, 2, 1},
        {2, 1, 1, 2},
        {2, 1, 2, 1},
        {2, 2, 1, 1}
    };
    vector<pair<unordered_multiset<int>, unordered_multiset<int>>> childSubsets;
    for(const auto &currentPermutation : permutations)
    {
        size_t currentPermutationSize = currentPermutation.size();
        size_t currentPermutationHalfSize = currentPermutationSize / 2;
        //left
        unordered_multiset<int> leftSet;
        for(size_t i = 0; i < currentPermutationHalfSize; ++i)
            leftSet.insert(currentPermutation[i]);
        bool leftSubsetExist = false;
        for(const auto &subset : childSubsets)
        {
            if(subset.first == leftSet)
            {
                leftSubsetExist = true;
                break;
            }
        }
        //right
        unordered_multiset<int> rightSet;
        for(size_t i = currentPermutationHalfSize; i < currentPermutationSize; ++i)
            rightSet.insert(currentPermutation[i]);
        bool rightSubsetExist = false;
        for(const auto &subset : childSubsets)
        {
            if(subset.second == rightSet)
            {
                rightSubsetExist = true;
                break;
            }
        }
        //summarize
        if(!leftSubsetExist || !rightSubsetExist) childSubsets.push_back({leftSet, rightSet});
    }
    cout << childSubsets.size() << endl;
}
How can I change the solution to make it optimal and less complex?

You should add
if (leftSubsetExist)
    continue;
after the first loop (as an optimization).
Could you add some "wrong" permutations (with other coins)?
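One possible reading of the requirement is that two splits are duplicates exactly when child A's coins match as a multiset and child B's coins match as a multiset. Under that assumption, a minimal sketch is to sort each half and use the pair of sorted halves as the key in a std::set. The name uniqueSplits is made up for this illustration, and the sketch assumes every permutation holds an even number of coins.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

using Half = std::vector<int>;

// Hypothetical helper: returns the distinct (child A, child B) splits.
// Two splits count as equal when both halves match as multisets.
std::set<std::pair<Half, Half>> uniqueSplits(
    const std::vector<std::vector<int>>& permutations)
{
    std::set<std::pair<Half, Half>> seen;
    for (const auto& p : permutations) {
        // Assumption of this sketch: both children get the same number of coins.
        assert(p.size() % 2 == 0);
        const auto mid = p.begin() + static_cast<std::ptrdiff_t>(p.size() / 2);
        Half left(p.begin(), mid);
        Half right(mid, p.end());
        // Sorting makes the order of coins inside one child's half irrelevant.
        std::sort(left.begin(), left.end());
        std::sort(right.begin(), right.end());
        // The set keeps one entry per distinct pair of halves.
        seen.emplace(std::move(left), std::move(right));
    }
    return seen;
}
```

For the seven permutations in the question this yields 4 distinct splits. Because the key is the pair of halves rather than each half checked on its own, a split such as {1, 1 | 1, 1} is still kept after {1, 1 | 2, 2} and {2, 2 | 1, 1} have been seen.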

Related

Creating all possible combinations sorted by sum

Given an arbitrary set of numbers S = {1..n} and size k, I need to print all possible combinations of size k ordered by the sum of the combinations.
Let's say S = {1, 2, 3, 4, 5} and k = 3, a sample output would be:
1 2 3 = 6
1 2 4 = 7
1 2 5 = 8
1 3 4 = 8
1 3 5 = 9
2 3 4 = 9
2 3 5 = 10
1 4 5 = 10
2 4 5 = 11
3 4 5 = 12
The first idea that came to my mind is to sort S (if not already sorted) and then perform a BFS, keeping a queue of combinations sorted by sum. This is because I want to generate the next combination only when requested (iterator-like). But this seems like overkill, and I think there should be an easier solution. How can this be solved?
Edit:
This is my original idea:
1. Sort S
2. Pick the first k numbers and add them to the queue as a single node (because this is the combination with the smallest sum)
3. While the queue is not empty - pop the first node, output it, and generate successors.
Generating successors:
1. Start with the first number in the node. Make a copy of the node. Find the smallest unselected value that is greater than the number, and assign this value.
2. Verify that this node hasn't been seen and calculate the sum.
3. Add the node to the queue.
4. Repeat 1-3 for all of the remaining numbers in the node.
Example:
Queue q = { {1, 2, 3} = 6 }
Pop front, output
1 2 3 = 6
Generate {4, 2, 3} = 9, {1, 4, 3} = 8, {1, 2, 4} = 7
Seen { {1, 2, 3}, {4, 2, 3}, {1, 4, 3}, {1, 2, 4} }
Queue q = { {1, 2, 4} = 7, {1, 4, 3} = 8, {4, 2, 3} = 9 }
Pop front, output
1 2 3 = 6
1 2 4 = 7
Generate {3, 2, 4} = 9 (seen - discard), {1, 3, 4} = 8 (seen - discard), {1, 2, 5} = 8
Seen { {1, 2, 3}, {4, 2, 3}, {1, 4, 3}, {1, 2, 4}, {1, 2, 5} }
Queue q = { {1, 2, 5} = 8, {1, 4, 3} = 8, {4, 2, 3} = 9 }
Pop front, output
1 2 3 = 6
1 2 4 = 7
1 2 5 = 8
Generate {3, 2, 5} = 10, {1, 3, 5} = 9
Seen { {1, 2, 3}, {4, 2, 3}, {1, 4, 3}, {1, 2, 4}, {1, 2, 5}, {3, 2, 5}, {1, 3, 5} }
Queue q = { {1, 4, 3} = 8, {1, 5, 3} = 9, {1, 3, 5} = 9, {3, 2, 5} = 10, {1, 4, 5} = 10, {4, 2, 5} = 11 }
...
...
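The best-first idea described above can be sketched in C++ with a std::priority_queue ordered by sum and a std::set of already generated nodes. This is only an illustration under some assumptions: the name combinationsBySum is made up, nodes store sorted index positions into S rather than values (so a successor is "advance one position by one step, unless it would collide with the next position"), and the results are collected into a vector instead of being yielded lazily.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <queue>
#include <set>
#include <utility>
#include <vector>

// Hypothetical helper: all k-combinations of s in non-decreasing order of sum.
std::vector<std::vector<int>> combinationsBySum(std::vector<int> s, std::size_t k)
{
    assert(k >= 1 && k <= s.size());
    std::sort(s.begin(), s.end());

    using Indices = std::vector<std::size_t>;   // strictly increasing positions in s
    using Node = std::pair<long long, Indices>; // (sum, positions)
    std::priority_queue<Node, std::vector<Node>, std::greater<Node>> queue;
    std::set<Indices> seen;

    // The first k elements form the combination with the smallest sum.
    Indices first(k);
    std::iota(first.begin(), first.end(), std::size_t{0});
    long long firstSum = 0;
    for (std::size_t i : first) firstSum += s[i];
    queue.push({firstSum, first});
    seen.insert(first);

    std::vector<std::vector<int>> result;
    while (!queue.empty()) {
        const Node node = queue.top();
        queue.pop();
        const long long sum = node.first;
        const Indices& idx = node.second;

        std::vector<int> combo;
        for (std::size_t i : idx) combo.push_back(s[i]);
        result.push_back(combo);

        // Successors: advance a single position by one if the slot is free.
        for (std::size_t pos = 0; pos < k; ++pos) {
            const std::size_t limit = (pos + 1 < k) ? idx[pos + 1] : s.size();
            if (idx[pos] + 1 >= limit) continue;
            Indices next = idx;
            ++next[pos];
            if (seen.insert(next).second)
                queue.push({sum - s[idx[pos]] + s[next[pos]], next});
        }
    }
    return result;
}
```

For S = {1, 2, 3, 4, 5} and k = 3 this produces the ten combinations from 1 2 3 (sum 6) up to 3 4 5 (sum 12) in non-decreasing order of sum; the order among combinations with equal sums may differ from the sample listing. Each pop generates at most k successors, so stopping the loop early gives the iterator-like behaviour.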

A JavaScript sample is below:
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>s{1..n} by k</title>
</head>
<body>
  <header>s{1..n} by k</header>
  <form action="#" onsubmit="return false;">
    s:<input type="text" id="sArrayObj" name="sArrayObj" value="4 5 1 8 9 3 6 7"/><br/>
    k:<input type="text" id="kVarObj" name="kVarObj" value="3"/><br/>
    <button id="scan" name="scan" onclick="scanX()"> CALCULATE </button><br/>
    <textarea id="scanned" name="scanned" cols=40 rows=20></textarea>
  </form>
  <script>
    const headFactorial = n => {
      if ( n > 1 ) return n * headFactorial( n - 1 );
      else return 1;
    }
    const tailFactorial = n => {
      if ( n === 1 ) return 1;
      else return n * tailFactorial( n - 1 );
    }
    const combinations = ( collection, combinationLength ) => {
      let head, tail, result = [];
      if ( combinationLength > collection.length || combinationLength < 1 ) { return []; }
      if ( combinationLength === collection.length ) { return [ collection ]; }
      if ( combinationLength === 1 ) { return collection.map( element => [ element ] ); }
      for ( let i = 0; i < collection.length - combinationLength + 1; i++ ) {
        head = collection.slice( i, i + 1 );
        tail = combinations( collection.slice( i + 1 ), combinationLength - 1 );
        for ( let j = 0; j < tail.length; j++ ) { result.push( head.concat( tail[ j ] ) ); }
      }
      return result;
    }
    const sumArr = (anArr) => {
      let s = 0;
      for (let j = 0; j < anArr.length; j++) s += Number(anArr[j]);
      return s;
    }
    function scanX(){
      // Parse the inputs as numbers so that sorting and the strict
      // comparisons inside combinations() behave numerically.
      const sArray = document.getElementById("sArrayObj").value.trim().split(/\s+/).map(Number);
      const kVar = Number(document.getElementById("kVarObj").value);
      sArray.sort((a, b) => a - b);
      const resultObj = document.getElementById("scanned");
      resultObj.value = "";
      const resultArr = combinations(sArray, kVar);
      // Order the combinations by their sum, as the question asks.
      resultArr.sort((a, b) => sumArr(a) - sumArr(b));
      for (let i = 0; i < resultArr.length; i++){
        resultObj.value += resultArr[i]+"="+sumArr(resultArr[i])+"\n";
      }
      return false;
    }
  </script>
</body>
</html>

Lexicographic Rank of a Set Partitioned Into Groups

Given a set of 8 sequential numbers {0..7} partitioned into 4 groups of size 2, with the numbers in each group in ascending order, how can a rank be generated for the set? The rank should be in lexicographic order, and preferably the algorithm should be linear in complexity.
Examples of the partitioning:
{{0 1} {2 3} {4 5} {6 7}} // Rank 0
...
{{6 7} {4 5} {2 3} {0 1}} // Rank 2519
Because the numbers in each group are in ascending order, the groups are effectively treated like combinations, not permutations, so a group containing e.g. {5 4} will never occur.
How can this set of numbers be ranked sequentially in the range [0, 2520) (8C2 * 6C2 * 4C2)?
At present I compute the rank of each group as an 8C2 combination, then combine each rank together by treating it as a base-28 number. This obviously leaves gaps in the ranking, which is undesirable in my case. But, for what it's worth, here is how I'm currently ranking.
#include <array>
using std::array;
#include <cstdint>
#include <cstddef>
#include <iostream>
using std::cout;
using std::endl;
// Calculates n!.
uint32_t factorial(uint32_t n)
{
    return n <= 1 ? 1 : n * factorial(n - 1);
}
// Calculate nCk: n!/((n-k)!*k!).
uint32_t choose(uint32_t n, uint32_t k)
{
    return (n < k)
        ? 0
        : factorial(n) / (factorial(n - k) * factorial(k));
}
template<size_t N, size_t K>
class CombinationRanker
{
    array<array<uint32_t, K+1>, N+1> choices;
public:
    /**
     * Initialize a precomputed array of nCk (N and K inclusive).
     */
    CombinationRanker()
    {
        for (unsigned n = 0; n <= N; ++n)
            for (unsigned k = 0; k <= K; ++k)
                this->choices[n][k] = choose(n, k);
    }
    /**
     * Get the rank of a combination.
     * @param comb A combination array of size K in ascending order.
     */
    uint32_t rank(const array<uint8_t, K> comb) const
    {
        // Formula: (nCk) - ((n-c_1)Ck) - ((n-c_2)C(k-1)) - ... - ((n-c_k)C1)
        // That assumes 1-based combinations with ranks starting at 1, so each
        // element in the combination has 1 added to it, and the end result has 1
        // subtracted from it to make the rank 0-based.
        uint32_t rank = this->choices[N][K];
        for (unsigned i = 0; i < K; ++i)
            rank -= this->choices[N - (comb[i] + 1)][K - i];
        return rank - 1;
    }
};
int main(int argc, char* argv[])
{
    CombinationRanker<8, 2> ranker;
    array<array<uint8_t, 2>, 4> nums =
    {{
        {0, 1}, {2, 3}, {4, 5}, {6, 7}
    }};
    // Horribly sparse rank.
    unsigned rank =
        ranker.rank(nums[0]) * 28 * 28 * 28 +
        ranker.rank(nums[1]) * 28 * 28 +
        ranker.rank(nums[2]) * 28 +
        ranker.rank(nums[3]);
    cout << rank << endl; // 10835, but I want 0.
    return 0;
}
I've tagged the post as C++ as that's the language I'm using; however, answers in another language are fine. It's more of a math question, but I'm looking for an answer that I can understand as a programmer, not a mathematician, and a code snippet would be helpful in that regard.

Here's what I came up with. It's quadratic in complexity, which is not great, but it does the trick. The basic algorithm is as follows.
Given a set of sequential numbers [0..7] partitioned into unordered pairs, loop over each pair and find its rank among the pairs that exclude the numbers used by the preceding pairs. Then multiply each rank by its variable base. The variable bases for the ranks are 6C2*4C2*2C2, 4C2*2C2, and 2C2.
As an example, for {{2,3}, {6,7}, {4,5}, {0,1}}:
{2, 3} has rank 13.
{6, 7} has rank 14 among pairs excluding 2 and 3.
{4, 5} has rank 5 among pairs excluding 2, 3, 6, and 7.
{0, 1} is ignored.
Altogether, 13*6C2*4C2*2C2 + 14*4C2*2C2 + 5*2C2 = 1259
Other examples:
{{0, 1}, {2, 3}, {4, 5}, {6, 7}} -> 0
{{2, 3}, {6, 7}, {4, 5}, {0, 1}} -> 1259
{{2, 4}, {0, 1}, {3, 5}, {6, 7}} -> 1260
{{6, 7}, {4, 5}, {2, 3}, {0, 1}} -> 2519
Here's the algorithm in code. I've hard-coded quite a bit for brevity.
#include <iostream>
using std::cout;
using std::endl;
#include <array>
using std::array;
#include <cstdint>
typedef array<uint8_t, 2> pair_t;
/**
 * @param set A set of 8 sequential numbers, [0..7], partitioned into unordered
 * pairs.
 */
uint32_t rank(const array<pair_t, 4>& set) {
    // All 28 (8C2) possible unordered subsets of the set of 8 sequential
    // numbers, [0..7], in lexicographic order. Hard-coded here for brevity.
    array<pair_t, 28> pairs = {{
        {0, 1}, {0, 2}, {0, 3}, {0, 4}, {0, 5}, {0, 6}, {0, 7},
        {1, 2}, {1, 3}, {1, 4}, {1, 5}, {1, 6}, {1, 7},
        {2, 3}, {2, 4}, {2, 5}, {2, 6}, {2, 7},
        {3, 4}, {3, 5}, {3, 6}, {3, 7},
        {4, 5}, {4, 6}, {4, 7},
        {5, 6}, {5, 7},
        {6, 7},
    }};
    // Variable base for each rank "digit" (the base corresponding to the rank of
    // each subset): 6C2*4C2*2C2, 4C2*2C2, 2C2. Again, hard-coded for brevity.
    array<uint32_t, 3> bases = {{90, 6, 1}};
    // Now rank the set.
    uint32_t rank = 0;
    // Rank among this many pairs. For N=8, 8C2->6C2->4C2->2C2 (28->15->6->1).
    unsigned numRemaining = 28; // N*(N-1)/2
    array<pair_t, 28> remaining = pairs;
    // Loop over the first three unordered subsets. The last isn't needed for
    // ranking--n from [0...(N-2)/2).
    for (unsigned n = 0; n < 3; ++n)
    {
        unsigned remainingInd = 0;
        const pair_t& sPair = set[n];
        for (unsigned r = 0; r < numRemaining; ++r)
        {
            const pair_t& rPair = remaining[r];
            if (sPair == rPair)
            {
                // Found the pair: rank it relative to the remaining pairs, and
                // multiply it by the base for digit n.
                rank += r * bases[n];
            }
            else if (
                sPair[0] != rPair[0] && sPair[0] != rPair[1] &&
                sPair[1] != rPair[0] && sPair[1] != rPair[1]
            )
            {
                // The pair excludes the numbers in set[n], so keep it in the
                // list of remaining pairs for the next digit's rank.
                remaining[remainingInd++] = rPair;
            }
        }
        // Number of remaining pairs.
        numRemaining = remainingInd;
    }
    return rank;
}
int main(int argc, char* argv[])
{
    // Example pairs.
    array<array<pair_t, 4>, 7> sets = {{
        {{{0, 1}, {2, 3}, {4, 5}, {6, 7}}},
        {{{0, 1}, {2, 3}, {4, 6}, {5, 7}}},
        {{{0, 1}, {2, 3}, {4, 7}, {5, 6}}},
        {{{0, 1}, {2, 3}, {5, 6}, {4, 7}}},
        // snip
        {{{2, 3}, {6, 7}, {4, 5}, {0, 1}}},
        // snip
        {{{6, 7}, {4, 5}, {1, 3}, {0, 2}}},
        {{{6, 7}, {4, 5}, {2, 3}, {0, 1}}},
    }};
    for (unsigned i = 0; i < 7; ++i)
    {
        const array<pair_t, 4>& set = sets[i];
        cout << rank(set) << ": ";
        for (unsigned j = 0; j < 4; ++j)
            cout << '{' << (unsigned)set[j][0] << ", " << (unsigned)set[j][1] << '}';
        cout << endl;
    }
    return 0;
}
Output:
0: {0, 1}{2, 3}{4, 5}{6, 7}
1: {0, 1}{2, 3}{4, 6}{5, 7}
2: {0, 1}{2, 3}{4, 7}{5, 6}
3: {0, 1}{2, 3}{5, 6}{4, 7}
1259: {2, 3}{6, 7}{4, 5}{0, 1}
2518: {6, 7}{4, 5}{1, 3}{0, 2}
2519: {6, 7}{4, 5}{2, 3}{0, 1}
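The hard-coded table of 28 pairs can be avoided by computing each "digit" directly. If a pair sits at positions i < j in the sorted list of the m numbers still unused, its lexicographic rank among the remaining pairs is i*(2m - i - 1)/2 + (j - i - 1), and the digits can be combined with the bases mC2 in Horner form. The sketch below is one way to write that; the name rankPairs is made up, and erasing from a std::vector keeps it above strictly linear time, so treat it as an illustration rather than the linear algorithm the question asks for.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: rank of an ordered list of ascending pairs that
// together use every number in [0, 2 * pairs.size()) exactly once.
std::uint64_t rankPairs(const std::vector<std::pair<int, int>>& pairs)
{
    // Sorted list of the numbers not yet consumed by an earlier pair.
    std::vector<int> remaining(pairs.size() * 2);
    for (std::size_t v = 0; v < remaining.size(); ++v)
        remaining[v] = static_cast<int>(v);

    std::uint64_t rank = 0;
    for (const auto& p : pairs) {
        assert(p.first < p.second);
        const auto itA = std::lower_bound(remaining.begin(), remaining.end(), p.first);
        const auto itB = std::lower_bound(remaining.begin(), remaining.end(), p.second);
        assert(itA != remaining.end() && *itA == p.first);
        assert(itB != remaining.end() && *itB == p.second);

        const std::uint64_t m = remaining.size();
        const std::uint64_t i = static_cast<std::uint64_t>(itA - remaining.begin());
        const std::uint64_t j = static_cast<std::uint64_t>(itB - remaining.begin());
        // Rank of (i, j) among all pairs drawn from m items, lexicographic order.
        const std::uint64_t digit = i * (2 * m - i - 1) / 2 + (j - i - 1);
        // Mixed-radix accumulation: the base of this digit is mC2.
        rank = rank * (m * (m - 1) / 2) + digit;

        // Erase the later position first so the earlier iterator stays valid.
        remaining.erase(itB);
        remaining.erase(itA);
    }
    return rank;
}
```

For the examples listed above this gives the same values: {{0, 1}, {2, 3}, {4, 5}, {6, 7}} ranks 0, {{2, 3}, {6, 7}, {4, 5}, {0, 1}} ranks 1259, and {{6, 7}, {4, 5}, {2, 3}, {0, 1}} ranks 2519. The last pair always contributes a digit of 0, which matches the answer's remark that it can be ignored.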

xtensor: Select rows with specific column values

I am playing around with xtensor and want to perform a simple operation: selecting rows with specific column values. Suppose I have the following array.
[
[0, 1, 1, 3, 4 ]
[0, 2, 1, 5, 6 ]
[0, 3, 1, 3, 2 ]
[0, 4, 1, 5, 7 ]
]
Now I want to select the rows where col2 and col4 both have the value 3, which in this case is row 3.
[0, 3, 1, 3, 2 ]
I want to achieve something similar to what this answer achieves.
How can I achieve this in xtensor?

The way to go is to slice with the columns you need, and then look where the condition is true for all columns.
For the latter an overload for xt::all(...) is seemingly not implemented (yet!), but we can use xt::sum(..., axis) to achieve the same:
#include <iostream>
#include <xtensor/xtensor.hpp>
#include <xtensor/xview.hpp>
#include <xtensor/xio.hpp>
int main()
{
    xt::xtensor<int,2> a =
        {{0, 1, 1, 3, 4},
         {0, 2, 1, 5, 6},
         {0, 3, 1, 3, 2},
         {0, 4, 1, 5, 7}};
    // Boolean mask: does each selected column (indices 1 and 3) equal 3?
    auto test = xt::equal(xt::view(a, xt::all(), xt::keep(1, 3)), 3);
    // Count the matching columns per row; a row qualifies when both match.
    auto n = xt::sum(test, 1);
    auto idx = xt::flatten_indices(xt::argwhere(xt::equal(n, 2)));
    auto b = xt::view(a, xt::keep(idx), xt::all());
    std::cout << b << std::endl;
    return 0;
}
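To make the "count the matches per row and compare with the number of selected columns" step concrete, here is the same logic written with plain standard containers. It deliberately uses no xtensor calls, so it only illustrates the idea and says nothing about the library's API; the name rowsWhereColumnsEqual is made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: indices of the rows in which every column listed in
// `cols` holds `value`.
std::vector<std::size_t> rowsWhereColumnsEqual(
    const std::vector<std::vector<int>>& a,
    const std::vector<std::size_t>& cols,
    int value)
{
    std::vector<std::size_t> rows;
    for (std::size_t r = 0; r < a.size(); ++r) {
        // Equivalent of summing the boolean mask along axis 1.
        std::size_t hits = 0;
        for (std::size_t c : cols) {
            assert(c < a[r].size());
            if (a[r][c] == value) ++hits;
        }
        // Equivalent of comparing the per-row sum with the column count.
        if (hits == cols.size()) rows.push_back(r);
    }
    return rows;
}
```

With the array from the question, columns {1, 3}, and value 3, the only index returned is 2, which is the row [0, 3, 1, 3, 2].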

generating a set of sets that appear in every set

I have an array of arrays of things
typedef std::vector<thing> group;
std::vector<group> groups;
Things can be compared like so:
int comparison(thing a, thing b);
where the return value is 0, 1, or 2:
0 means that the things are not alike
1 means that they are alike and a is more specific than or equal to b
2 means that they are alike and b is more specific than or equal to a
I am looking for a function that returns a group containing all things that appear in every group, something like:
std::getgroup(groups.begin(), groups.end(), myComparisonFunction);
The problem is that I have no idea what this function might be called, whether it even exists, or what the closest thing to it would be.

Essentially, what you want is an intersection. Luckily, there is std::set_intersection, which does almost what you need (it requires sorted ranges). Here's a simple example using std::vector<std::vector<int>>; you can adapt it to work with your thing:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>
std::vector<int> getGroup(const std::vector<std::vector<int>>& groups) {
    if ( groups.empty() ) return {};
    // Running intersection, starting with the first group.
    std::vector<int> group = groups[0];
    std::sort(group.begin(), group.end());
    for ( unsigned i = 1; i < groups.size(); ++i ) {
        // std::set_intersection requires both ranges to be sorted.
        std::vector<int> next = groups[i];
        std::sort(next.begin(), next.end());
        std::vector<int> common;
        std::set_intersection(group.begin(), group.end(),
                              next.begin(), next.end(),
                              std::back_inserter(common));
        group = common;
    }
    return group;
}
int main() {
    std::vector<std::vector<int>> groups = { {1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
                                             {1, 2, 3, 5, 6, 7, 8, 10},
                                             {1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
                                             {1, 3, 4, 5, 6, 9, 10},
                                             {1, 2, 6, 7, 8, 9, 10},
                                             {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} };
    for ( auto g : getGroup(groups) )
        std::cout << g << "\n";
    return 0;
}
This will print:
1
6
10
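std::set_intersection relies on a strict weak ordering, which the three-valued comparison from the question does not obviously provide. If only an "alike" test is available, a quadratic fallback is to keep each element of the first group that has an alike counterpart in every other group. The sketch below assumes a hypothetical predicate alike(a, b) that returns true when comparison(a, b) is non-zero; the name commonToAll is made up, and the sketch keeps the element from the first group rather than choosing the most specific of several alike things.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: elements of the first group that have an "alike"
// counterpart in every other group, according to the caller's predicate.
template <typename T, typename Alike>
std::vector<T> commonToAll(const std::vector<std::vector<T>>& groups, Alike alike)
{
    // Assumption of this sketch: there is at least one group.
    assert(!groups.empty());
    std::vector<T> result;
    for (const T& candidate : groups.front()) {
        const bool inEvery = std::all_of(
            groups.begin() + 1, groups.end(),
            [&](const std::vector<T>& g) {
                return std::any_of(g.begin(), g.end(), [&](const T& other) {
                    return alike(candidate, other);
                });
            });
        if (inEvery) result.push_back(candidate);
    }
    return result;
}
```

With int elements and plain equality as the predicate, this returns the same elements as the sorted intersection above, in the order they appear in the first group.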

How to search for a vector in a matrix in C++ and which algorithm?

Suppose I have a matrix and a vector as given below. How can I use a search algorithm such as binary search to return the index of the matching row?
Example:
const int V_SIZE = 10, H_SIZE = 7;
int a1[V_SIZE][H_SIZE] = {
    {1,2,0,0,0,0,0},
    {1,3,0,0,0,0,0},
    {2,2,4,0,0,0,0},
    {2,2,6,0,0,0,0},
    {3,2,4,7,0,0,0},
    {4,1,3,5,9,0,0},
    {4,1,4,6,8,0,0},
    {4,2,3,4,7,0,0},
    {5,2,3,5,7,8,0},
    {6,1,3,4,5,7,10}
}; // sorted
int a2[H_SIZE] = {4,1,3,5,9,0,0};
Searching for the vector a2 in the matrix a1 should return 6 (the sixth row, i.e. index 5 when counting from zero).
Thanks a lot.

You could use a 2D std::array in combination with std::lower_bound:
#include <algorithm>
#include <array>
const int V_SIZE = 10, H_SIZE = 7;
std::array<std::array<int, H_SIZE>, V_SIZE> a1 {
    {{{1,2,0,0,0,0,0}},
     {{1,3,0,0,0,0,0}},
     {{2,2,4,0,0,0,0}},
     {{2,2,6,0,0,0,0}},
     {{3,2,4,7,0,0,0}},
     {{4,1,3,5,9,0,0}},
     {{4,1,4,6,8,0,0}},
     {{4,2,3,4,7,0,0}},
     {{5,2,3,5,7,8,0}},
     {{6,1,3,4,5,7,10}}
    }}; // sorted
std::array<int, H_SIZE> a2 {{4,1,3,5,9,0,0}};
// std::array compares lexicographically, so the rows can be binary-searched directly.
int idx = std::lower_bound(std::begin(a1), std::end(a1), a2) - std::begin(a1);
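Note that std::lower_bound returns the first row that is not less than the target, so the result still has to be compared with the target to tell a hit from a miss. A minimal sketch of that check follows; the names Row and findRow are made up for this example, the rows are held in a std::vector for brevity, and the rows are assumed to be sorted lexicographically.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::array<int, 7>;

// Hypothetical helper: zero-based index of `target` in `rows`, or -1 if absent.
std::ptrdiff_t findRow(const std::vector<Row>& rows, const Row& target)
{
    // Binary search is only valid on lexicographically sorted rows.
    assert(std::is_sorted(rows.begin(), rows.end()));
    const auto it = std::lower_bound(rows.begin(), rows.end(), target);
    // lower_bound may land on a different row (or on end()) when there is no match.
    if (it == rows.end() || *it != target) return -1;
    return it - rows.begin();
}
```

With the ten rows from the question and a2 = {4,1,3,5,9,0,0}, this returns 5, which is the sixth row.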

If the matrix is sorted on the first number, you could use binary search to find an approximate index. You then have to go back until you find the first row starting with the same number as in the vector, as well as forward to find the last row starting with the same number. Then you loop over the vector, searching for a match for the second, third, etc. number in the range of rows you have.

What about something like this using std::array?
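#include <algorithm>
#include <array>
#include <cstddef>
// The size parameter of std::array is std::size_t, so the template parameter
// must be std::size_t as well for deduction to succeed. std::array already
// provides a lexicographic operator<, so this overload is optional.
template <std::size_t HSIZE>
bool operator<(const std::array<int, HSIZE> &lhs, const std::array<int, HSIZE> &rhs)
{
    for (std::size_t i = 0; i < HSIZE; i++)
        if (lhs[i] != rhs[i])
            return lhs[i] < rhs[i];
    return false;
}
std::array<int, 7> a1[] =
{
    { 1, 2, 0, 0, 0, 0, 0 },
    { 1, 3, 0, 0, 0, 0, 0 },
    { 2, 2, 4, 0, 0, 0, 0 },
    { 2, 2, 6, 0, 0, 0, 0 },
    { 3, 2, 4, 7, 0, 0, 0 },
    { 4, 1, 3, 5, 9, 0, 0 },
    { 4, 1, 4, 6, 8, 0, 0 },
    { 4, 2, 3, 4, 7, 0, 0 },
    { 5, 2, 3, 5, 7, 8, 0 },
    { 6, 1, 3, 4, 5, 7, 10 }
};
void search(void)
{
    std::array<int, 7> a2 = { 4, 1, 3, 5, 9, 0, 0 };
    std::array<int, 7> *a1_end = a1 + sizeof(a1) / sizeof(std::array<int, 7>);
    // Points at the first row that is not less than a2.
    std::array<int, 7> *it = std::lower_bound(a1, a1_end, a2);
}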
