Subset sum (Coin change) - c++

My problem is that I need to count how many combinations of an array of integers sum to a value W.
Let's say:
int array[] = {1,2,3,4,5};
My algorithm just finds all combinations of lengths from 1 to W / minimum(array), which is equal to W because the minimum is 1,
and checks each combination: if its sum equals W, it increments a counter N.
Is there any other algorithm to solve this? It should be faster :)
Update:
OK, the subset-sum problem and the knapsack problem are good, but my problem is that the combinations may repeat elements of the array, like this:
1,1,1 -> the 1st combination
1,1,2
1,1,3
1,1,4
1,1,5
1,2,2 -> this combination is 1,2,2, not 1,2,1 because we already have 1,1,2.
1,2,3
1,2,4
1,2,5
1,3,3 -> this combination is 1,3,3, not 1,3,1 because we already have 1,1,3.
1,3,4
.
.
1,5,5
2,2,2 -> this combination is 2,2,2, not 2,1,1 because we already have 1,1,2.
2,2,3
2,2,4
2,2,5
2,3,3 -> this combination is 2,3,3, not 2,3,1 because we already have 1,2,3.
.
.
5,5,5 -> Last combination
These are all combinations of {1,2,3,4,5} of length 3. The subset-sum problem gives another kind of combination that I'm not interested in.
So the combinations that sum to W, let's say W = 7, are for example:
2,5
1,1,5
1,3,3
2,2,3
1,1,2,3
1,2,2,2
1,1,1,1,3
1,1,1,2,2
1,1,1,1,1,2
1,1,1,1,1,1,1
Update:
The real problem is that repeated elements such as 1,1,1 are needed, and the order of the generated combination is not important, so 1,2,1 is the same as 1,1,2 and 2,1,1.

No efficient algorithm exists as of now, and possibly none ever will (it is an NP-complete problem).
This is (a variation of) the subset-sum problem.

This is the coin change problem. It can be solved by dynamic programming, given reasonable restrictions on W and the set size.

Here is code, in Go, that solves this problem. The recursion depth is at most W / min(A); with the memo there are at most len(A) * W distinct (pos, sum) states, each doing O(len(A)) work, so I believe it runs in O(len(A)^2 * W) time. The comments should be sufficient to see how it works. The important detail is that it can use an element in A multiple times, but once it stops using that element it won't ever use it again. This avoids double-counting things like [1,2,1] and [1,1,2].
package main
import (
"fmt"
"sort"
)
// This is just to keep track of how many times we hit ninjaHelper
var hits int = 0
// This is our way of indexing into our memo, so that we don't redo any
// calculations.
type memoPos struct {
pos, sum int
}
func ninjaHelper(a []int, pos, sum, w int, memo map[memoPos]int64) int64 {
// Count how many times we call this function.
hits++
// Check to see if we've already done this computation.
if r, ok := memo[memoPos{pos, sum}]; ok {
return r
}
// We got it, and we can't get more than one match this way, so return now.
if sum == w {
return 1
}
// Once we're over w we can't possibly succeed, so just bail out now.
if sum > w {
return 0
}
var ret int64 = 0
// By only checking values at this position or later in the array we make
// sure that we don't repeat ourselves.
for i := pos; i < len(a); i++ {
ret += ninjaHelper(a, i, sum+a[i], w, memo)
}
// Write down our answer in the memo so we don't have to do it later.
memo[memoPos{pos, sum}] = ret
return ret
}
func ninja(a []int, w int) int64 {
// We reverse sort the array. This doesn't change the complexity of
// the algorithm, but by counting the larger numbers first we can hit our
// target faster in a lot of cases, avoiding a bit of work.
sort.Ints(a)
for i := 0; i < len(a)/2; i++ {
a[i], a[len(a)-i-1] = a[len(a)-i-1], a[i]
}
return ninjaHelper(a, 0, 0, w, make(map[memoPos]int64))
}
func main() {
a := []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
w := 1000
fmt.Printf("%v, w=%d: %d\n", a, w, ninja(a, w))
fmt.Printf("Hits: %v\n", hits)
}

Just to put this to bed, here are recursive and (very simple) dynamic programming solutions to this problem. You can reduce the running time (but not the time complexity) of the recursive solution by using more sophisticated termination conditions, but the main point of it is to show the logic.
Many of the dynamic programming solutions I've seen keep the entire N x |c| array of results, but that's not necessary, since row i can be generated from just row i-1, and furthermore it can be generated in order left to right so no copy needs to be made.
I hope the comments help explain the logic. The dp solution is fast enough that I couldn't find a test case which didn't overflow a long long which took more than a few milliseconds; for example:
$ time ./coins dp 1000000 1 2 3 4 5 6 7
3563762607322787603
real 0m0.024s
user 0m0.012s
sys 0m0.012s
#include <vector>
// Return the number of ways of generating the sum n from the
// elements of a container of positive integers.
// Note: This function will overflow the stack if an element
// of the container is <= 0.
template<typename ITER>
long long count(int n, ITER begin, ITER end) {
  if (n == 0) return 1;
  else if (begin == end || n < 0) return 0;
  else return
      // combinations which don't use *begin
      count(n, begin + 1, end) +
      // use one (more) *begin.
      count(n - *begin, begin, end);
}
// Same thing, but uses O(n) storage and runs in O(n*|c|) time,
// where |c| is the length of the container. This implementation falls
// directly out of the recursive one above, but processes the items
// in the reverse order; each time through the outer loop computes
// the combinations (for all possible sums <= n) for some prefix of
// the container.
template<typename ITER>
long long count1(int n, ITER begin, ITER end) {
  std::vector<long long> v(n + 1, 0);
  v[0] = 1;
  // Initial state of v: v[0] is 1; v[i] is 0 for 1 <= i <= n.
  // Corresponds to the termination condition of the recursion.
  auto vbegin = v.begin();
  auto vend = v.end();
  for (auto here = begin; here != end; ++here) {
    int a = *here;
    if (a > 0 && a <= n) {
      auto in = vbegin;
      auto out = vbegin + a;
      // *in is count(n - a, begin, here).
      // *out is count(n, begin, here - 1).
      do *out++ += *in++; while (out != vend);
    }
  }
  return v[n];
}

Related

Finding the number of sub arrays that have a sum of K

I am trying to find the number of sub arrays that have a sum equal to k:
int subarraySum(vector<int>& nums, int k)
{
int start, end, curr_sum = 0, count = 0;
start = 0, end = 0;
while (end < (int)nums.size())
{
curr_sum = curr_sum + nums[end];
end++;
while (start < end && curr_sum >= k)
{
if (curr_sum == k)
count++;
curr_sum = curr_sum - nums[start];
start++;
}
}
return count;
}
The code above, which I have written, works for most cases but fails for the following:
array = {-1, -1, 1} with k = 0
I have tried to add another while loop to iterate from the start and go up the array until it reaches the end:
int subarraySum(vector<int>& nums, int k)
{
int start, end, curr_sum = 0, count = 0;
start = 0, end = 0;
while (end < (int)nums.size())
{
curr_sum = curr_sum + nums[end];
end++;
while (start < end && curr_sum >= k)
{
if (curr_sum == k)
count++;
curr_sum = curr_sum - nums[start];
start++;
}
}
while (start < end)
{
if (curr_sum == k)
count++;
curr_sum = curr_sum - nums[start];
start++;
}
return count;
}
Why is this not working? I am sliding the window until the last element is reached, which should have found a sum equal to k? How can I solve this issue?
Unfortunately, you did not implement the sliding window correctly, and a sliding window is not really a solution for this problem anyway. One of your main issues is that you do not move the start of the window based on the proper conditions. You always sum up and wait until the sum is greater than or equal to the search value.
This will not really work, especially for your example -1, -1, 1. The running sum of this is -1, -2, -1, and you do not see the 0, although it is there. You may have the idea to write while (start < end && curr_sum != k), but this will also not work, because you do not handle the start pointer correctly.
Your approach leads toward the brute-force solution, which typically takes on the order of N*N loop operations, where N is the size of the array, because we need a doubly nested loop.
That will of course always work, but it may be very time-consuming and, in the end, too slow.
Anyway. Let us implement that. We will start from each value in the std::vector and try out all sub arrays starting from the beginning value. We must evaluate all following values in the std::vector, because for example the last value could be a big negative number and bring down the sum again to the search value.
We could implement this for example like the following:
#include <iostream>
#include <vector>
using namespace std;
int subarraySum(vector<int>& numbers, int searchSumValue) {
    // Here we will store the result
    int resultingCount{};
    // Iterate over all values in the array. So, use all different start values
    for (std::size_t i{}; i < numbers.size(); ++i) {
        // Here we store the running sum of the elements in the vector
        int sum{ numbers[i] };
        // Check for the trivial case: a one-element subarray already matches the search value
        if (sum == searchSumValue) ++resultingCount;
        // Now we build all subarrays beginning with the start value
        for (std::size_t k{ i + 1 }; k < numbers.size(); ++k) {
            sum += numbers[k];
            if (sum == searchSumValue) ++resultingCount;
        }
    }
    return resultingCount;
}
int main() {
    vector v{ -1,-1,1 };
    std::cout << subarraySum(v, 0);
}
But, as said, the above is often too slow for big vectors and there is indeed a better solution available, which is based on a DP (dynamic programming) algorithm.
It uses so-called prefix sums, running sums, based on the running sum before the current evaluated value.
We need to show an example. Let's use a std::vector with 5 values {1,2,3,4,5}, and say we want to look for subarrays with a sum of 9.
We can “guess” that there are 2 subarrays, {2,3,4} and {4,5}, that have a sum of 9.
Let us investigate further:
Index                               0   1   2   3   4
Value                               1   2   3   4   5
We can now add a running sum and see how much delta we have between the running sum at the currently evaluated element and the running sum one position to the left, two positions to the left, and so on. If a delta is equal to our search value, then there must be a subarray that builds this sum.
Running sum                         1   3   6  10  15
Delta against next left                 2   3   4   5
Delta against next next left                5   7   9
Delta against next next next left               9  12
Example {2,3,4}. If we evaluate the 4 with a running sum of 10, and subtract the search value 9, then we get the previous running sum 1. “1+9=10” all values are there.
Example {4,5}. If we evaluate the 5 with a running sum of 15, and subtract the search value 9, then we get the previous running sum = 6. “6+9=15” all values are there.
We can find all solutions using the same approach.
So, the only thing we need to do, is to subtract the search value from the current running sum and see, if we have this running sum already calculated before.
Like: “Search-Value” + “previously Calculated Sum” = “Current Running Sum”.
Or: “Current Running Sum” – “Search-Value” = “previously Calculated Sum”
Again, we need to do the subtraction and check, if we already calculated such a sum previously.
So, we need to store all previously calculated running sums. And, because such a sum may appear more than once, we need to find occurrences of equal running sums and count them.
It is very hard to digest, and you need to think a while to understand.
With the above wisdom, you can draft the below potential solution.
#include <iostream>
#include <vector>
#include <unordered_map>
int subarraySum(std::vector<int>& numbers, int searchSumValue) {
    // Here we will store the result
    int resultingSubarrayCount{};
    // Here we will store all running sums and how often their value appeared
    std::unordered_map<int, int> countOfRunningSums;
    // The continuously calculated running sum
    int runningSum{};
    // And initialize the first value
    countOfRunningSums[runningSum] = 1;
    // Now iterate over all values in the vector
    for (const int n : numbers) {
        // Calculate the running sum
        runningSum += n;
        // Check if we have the searched value already available
        // And add the number of occurrences to our resulting number of subarrays
        resultingSubarrayCount += countOfRunningSums[runningSum - searchSumValue];
        // Store the new running sum, or add 1 to its counter if the running sum already existed
        countOfRunningSums[runningSum]++;
    }
    return resultingSubarrayCount;
}
int main() {
    std::vector v{ 1,2,3,4,5 };
    std::cout << subarraySum(v, 9);
}

ALL solutions to Magic square using no array

Yes, this is for a homework assignment. However, I do not expect an answer.
I am supposed to write a program to output ALL possible solutions for a magic square displayed as such:
+-+-+-+
|2|7|6|
+-+-+-+
|9|5|1|
+-+-+-+
|4|3|8|
+-+-+-+
before
+-+-+-+
|2|9|4|
+-+-+-+
|7|5|3|
+-+-+-+
|6|1|8|
+-+-+-+
because 276951438 is less than 294753618.
I can use for loops (not nested) and if else. The solutions must be in ascending order. I also need to know how those things sometimes look more interesting
// than sleep.
Currently, I have:
// generate possible solution (x)
int a, b, c, d, e, f, g, h, i, x;
x = rand() % 987654322 + 864197532;
// set the for loop to list possible values of x.
// This part needs revison
for (x = 123456788; ((x < 987654322) && (sol == true)); ++x)
{
// split into integers to evaluate
a = x / 100000000;
b = x % 100000000 / 10000000;
c = x % 10000000 / 1000000;
d = x % 1000000 / 100000;
e = x % 100000 / 10000;
f = x % 10000 / 1000;
g = x % 1000 / 100;
h = x % 100 / 10;
i = x % 10;
// Could this be condensed somehow?
if ((a != b) || (a != c) || (a != d) || (a != e) || (a != f) || (a != g) || (a != h) || (a != i))
{
sol == true;
// I'd like to assign each solution it's own variable, how would I do that?
std::cout << x;
}
}
How would I output in ascending order?
I have previously written a program that puts a user-entered nine digit number in the specified table and verifies whether it meets the conditions (n is a magic square solution if the sum of each row = 15, the sum of each column = 15, and the sum of each diagonal = 15), so I can handle that part. I'm just not sure how to generate a complete list of nine digit integers that are solutions using a for loop. Could someone give me an idea of how I would do that and how I could improve my current work?
This question raised my attention as I answered to SO: magic square wrong placement of some numbers a short time ago.
// I'd like to assign each solution it's own variable, how would I do that?
I wouldn't consider this. Each solution found can be printed immediately (instead of being stored). The upwards-counting loop guarantees that the output is in order.
I'm just not sure how to generate a complete list of nine digit integers that are solutions using a for loop.
The answer is Permutation.
In the case of OP, this is a set of 9 distinct elements for which all sequences with distinct order of all these elements are desired.
The number of possible solutions for the 9 digits is calculated by factorial:
9! = 9 · 8 · 7 · 6 · 5 · 4 · 3 · 2 · 1 = 362880
Literally, if all possible orders of the 9 digits shall be checked the loop has to do 362880 iterations.
Googling for a ready-made algorithm (or at least some inspiration), I found out (to my surprise) that the C++ standard algorithms library is actually well prepared for this:
std::next_permutation()
Transforms the range [first, last) into the next permutation from the set of all permutations that are lexicographically ordered with respect to operator< or comp. Returns true if such permutation exists, otherwise transforms the range into the first permutation (as if by std::sort(first, last)) and returns false.
What makes things more tricky is the constraint prohibiting arrays. Assuming that the array prohibition bans std::vector and std::string as well, I investigated OP's idea of using one integer instead.
A 32 bit int covers the range [-2147483648, 2147483647], which is enough to store even the largest permutation of the digits 1 ... 9: 987654321. (Maybe std::int32_t would be the better choice.)
The extraction of individual digits with division and modulo by powers of 10 is a bit tedious. Storing the set instead as a number with base 16 simplifies things a lot. The isolation of individual elements (aka digits) now becomes a combination of bitwise operations (&, |, ~, <<, and >>). The drawback is that 32 bits are no longer sufficient for nine digits (9 · 4 = 36 bits are needed), so I used std::uint64_t.
I encapsulated things in a class Set16. I considered providing a reference type and bidirectional iterators. After fiddling for a while, I came to the conclusion that it's not that easy (if not impossible). Re-implementing std::next_permutation() according to the sample code provided on cppreference.com was the easier choice.
362880 lines of output are a bit much for a demonstration. Hence, my sample does it for the smaller set of 3 digits, which has 3! (= 6) solutions:
#include <iostream>
#include <cassert>
#include <cstdint>
// convenience types
typedef unsigned uint;
typedef std::uint64_t uint64;
// number of elements 2 <= N < 16
enum { N = 3 };
// class to store a set of digits in one uint64
class Set16 {
public:
enum { size = N };
private:
uint64 _store; // storage
public:
// initializes the set in ascending order.
// (This is a premise to start permutation at first result.)
Set16(): _store()
{
for (uint i = 0; i < N; ++i) elem(i, i + 1);
}
// get element with a certain index.
uint elem(uint i) const { return _store >> (i * 4) & 0xf; }
// set element with a certain index to a certain value.
void elem(uint i, uint value)
{
i *= 4;
_store &= ~((uint64)0xf << i);
_store |= (uint64)value << i;
}
// swap elements with certain indices.
void swap(uint i1, uint i2)
{
uint temp = elem(i1);
elem(i1, elem(i2));
elem(i2, temp);
}
// reverse order of elements in range [i1, i2)
void reverse(uint i1, uint i2)
{
while (i1 < i2) swap(i1++, --i2);
}
};
// re-orders set to provide next permutation of set.
// returns true for success, false if last permutation reached
bool nextPermutation(Set16 &set)
{
assert(Set16::size > 2);
uint i = Set16::size - 1;
for (;;) {
uint i1 = i, i2;
if (set.elem(--i) < set.elem(i1)) {
i2 = Set16::size;
while (set.elem(i) >= set.elem(--i2));
set.swap(i, i2);
set.reverse(i1, Set16::size);
return true;
}
if (!i) {
set.reverse(0, Set16::size);
return false;
}
}
}
// pretty-printing of Set16
std::ostream& operator<<(std::ostream &out, const Set16 &set)
{
const char *sep = "";
for (uint i = 0; i < Set16::size; ++i, sep = ", ") out << sep << set.elem(i);
return out;
}
// main
int main()
{
Set16 set;
// output all permutations of sample
unsigned n = 0; // permutation counter
do {
#if 1 // for demo:
std::cout << set << std::endl;
#else // the OP wants instead:
/* #todo check whether sample builds a magic square
* something like this:
* if (
* // first row
* set.elem(0) + set.elem(1) + set.elem(2) == 15
* etc.
*/
#endif // 1
++n;
} while(nextPermutation(set));
std::cout << n << " permutations found." << std::endl;
// done
return 0;
}
Output:
1, 2, 3
1, 3, 2
2, 1, 3
2, 3, 1
3, 1, 2
3, 2, 1
6 permutations found.
Live demo on ideone
So, here I am: permutations without arrays.
Finally, another idea hit me. Maybe the intention of the assignment was rather meant to teach "the look from outside"... It could be worth studying the description of Magic Squares again:
Equivalent magic squares
Any magic square can be rotated and reflected to produce 8 trivially distinct squares. In magic square theory, all of these are generally deemed equivalent and the eight such squares are said to make up a single equivalence class.
Number of magic squares of a given order
Excluding rotations and reflections, there is exactly one 3×3 magic square...
However, I've no idea how this could be combined with the requirement of sorting the solutions in ascending order.

Combinations of an array algorithm

I would like to find the combinations of an array of size 5 that adds up to 15. What would be the best way to go about doing this.
Suppose I had the array
7 8 10 5 3
What would be the best way to find all numbers that add up to 15 in C++
If, as you mention in your comment, 10 is the highest number in the problem (and also the maximum number of elements), then a brute force approach (with clever bitmasking, see this tutorial) will do:
// N is the number of elements and arr is the array.
for (int i = 0; i < (1 << N); ++i) {
    int sum = 0;
    for (int j = 0; j < N; ++j) if (i & (1 << j)) sum += arr[j];
    if (sum == required_sum) {
        // Do something with the subset represented by i.
    }
}
This algorithm has complexity O(N * 2^N). Note that the code is correct only as long as 1 << N does not overflow, that is, N < 31 for a 32-bit int. Notice that the number of subsets with a certain sum can be exponential (more than 2^(N/2)); for example, {1, 1, 1, 1, .., 1} and sum = N/2.
If, however, N is large but N * required_sum is not very large (up to millions), one can use the following recurrence (with dynamic programming or memoization):
f(0, 0) = 1
f(0, S) = 0 where S != 0
f(k, S) = 0 where S < 0
f(k + 1, S) = f(k, S - arr[k]) + f(k, S) where k >= 0
where f(k, S) denotes the number of ways of getting a sum S with a subset of the first k elements (indices 0..k-1). The dynamic programming table can be used to generate all the subsets. The running time of generating the table is O(N * S) where S is the required sum. The running time of generating the subsets from the table is proportional to the number of such subsets (which can be very large).
General notes about the problem:
The problem in general is NP-Complete. Therefore, it has no known polynomial time algorithm. It does have however a pseudo-polynomial time algorithm, namely the recurrence above.
"the best" way depends on what you're optimizing.
If there are not many elements in the array, there's an easy combinatoric algorithm: for all lengths from 1 to n (where n is the number of elements in the array), check all possible sets of that many numbers and print each one which sums to fifteen.
That would likely be the best from a time-to-implement standpoint. A dynamic-programming solution (this is a DP problem) would likely be the best from a runtime efficiency standpoint; a DP solution here is O(N³), where the combinatoric solution is much much more than that.
The gist of the DP algorithm (I'm not writing the code) is to go through your array, and keep track of all the possible sums that can be made with the sub-array you've seen so far. As you reach each new array element, go through all the partial sums you got before and add it to them (not removing the original partial sum). Whenever something hits 15 or passes it, discard that sum from the set you're tracking (print it if it hits 15 exactly).
my suggestion is go for a recursion.
keeping track of the baseindex and currentindex
and try to accumulate values every recursion
return the integer value of the currentindex when accumulated value is 15
else if currentindex reaches 5 and accumulated value is not 15 return 0
when return is 0 and baseindex is still less than 5 then add 1 to base index and reset the current index and accumulated value and start recursion again.
Sort the array of the elements.
maintain two pointers, one on the beginning of the sorted array, and the other on the end of it.
if the sum of the two elements is greater than 15, decrease the 2nd pointer.
if the sum is less than 15, increase the 1st pointer.
if sum is equal to 15, record the two elements, and increase the 1st pointer.
Hope it works.
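Recursion is one option I can think of. Because I had some spare time on my hands I threw together this function (although it's probably unnecessarily large, and unoptimised to the extreme). I only tested it with the numbers you provided. Note that the default argument for _cSumList binds a temporary to a non-const reference, which is accepted only as a Visual C++ extension; standard C++ rejects it.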
void getCombinations( std::vector<int>& _list, std::vector<std::vector<int>>& _output,
std::vector<int>& _cSumList = std::vector<int>(), int _sum = 0 )
{
for ( std::vector<int>::iterator _it = _list.begin(); _it < _list.end(); ++_it)
{
_sum += *_it;
_cSumList.push_back( *_it );
std::vector<int> _newList;
for ( std::vector<int>::iterator _itn = _list.begin(); _itn < _list.end(); ++_itn )
if ( *_itn != *_it )
_newList.push_back( *_itn );
if ( _sum < 15 )
getCombinations( _newList, _output, _cSumList, _sum );
else if ( _sum == 15 )
{
bool _t = false;
for ( std::vector<std::vector<int>>::iterator _itCOutputList = _output.begin(); _itCOutputList < _output.end(); ++_itCOutputList )
{
unsigned _count = 0;
for ( std::vector<int>::iterator _ita = _itCOutputList->begin(); _ita < _itCOutputList->end(); ++_ita )
for ( std::vector<int>::iterator _itb = _cSumList.begin(); _itb < _cSumList.end(); ++_itb )
if ( *_itb == *_ita )
++_count;
if ( _count == _cSumList.size() )
_t = true;
}
if ( _t == false )
_output.push_back( _cSumList );
}
_cSumList.pop_back();
_sum -= *_it;
}
}
Example usage with your numbers:
int _tmain(int argc, _TCHAR* argv[])
{
std::vector<int> list;
list.push_back( 7 );
list.push_back( 8 );
list.push_back( 10 );
list.push_back( 5 );
list.push_back( 3 );
std::vector<std::vector<int>> output;
getCombinations( list, output );
for ( std::vector<std::vector<int>>::iterator _it = output.begin(); _it < output.end(); ++_it)
{
for ( std::vector<int>::iterator _it2 = (*_it).begin(); _it2 < (*_it).end(); ++_it2)
std::cout << *(_it2) << ",";
std::cout << "\n";
}
std::cin.get();
return 0;
}
Best way is subjective. As I said, the code above could be improved tremendously, but should give you a starting point.

Efficiently computing vector combinations

I'm working on a research problem out of curiosity, and I don't know how to program the logic that I've in mind. Let me explain it to you:
I've four vectors, say for example,
v1 = 1 1 1 1
v2 = 2 2 2 2
v3 = 3 3 3 3
v4 = 4 4 4 4
Now what I want to do is to add them combination-wise, that is,
v12 = v1+v2
v13 = v1+v3
v14 = v1+v4
v23 = v2+v3
v24 = v2+v4
v34 = v3+v4
Up to this step it is just fine. The problem is that now I want to add to each of these vectors one vector from v1, v2, v3, v4 which hasn't been added to it before. For example:
v3 and v4 haven't been added to v12, so I want to create v123 and v124. Similarly for all the vectors, like:
v12 should become:
v123 = v12+v3
v124 = v12+v4
v13 should become:
v132 // This should not occur because I already have v123
v134
v14 should become:
v142 // Cannot occur because I've v124 already
v143 // Cannot occur
v23 should become:
v231 // Cannot occur
v234 ... and so on.
It is important that I do not do all at one step at the start. Like for example, I can do (4 choose 3) 4C3 and finish it off, but I want to do it step by step at each iteration.
How do I program this?
P.S.: I'm trying to work on a modified version of the Apriori algorithm in data mining.
In C++, given the following routine:
template <typename Iterator>
inline bool next_combination(const Iterator first,
Iterator k,
const Iterator last)
{
/* Credits: Thomas Draper */
if ((first == last) || (first == k) || (last == k))
return false;
Iterator itr1 = first;
Iterator itr2 = last;
++itr1;
if (last == itr1)
return false;
itr1 = last;
--itr1;
itr1 = k;
--itr2;
while (first != itr1)
{
if (*--itr1 < *itr2)
{
Iterator j = k;
while (!(*itr1 < *j)) ++j;
std::iter_swap(itr1,j);
++itr1;
++j;
itr2 = k;
std::rotate(itr1,j,last);
while (last != j)
{
++j;
++itr2;
}
std::rotate(k,itr2,last);
return true;
}
}
std::rotate(first,k,last);
return false;
}
You can then proceed to do the following:
int main()
{
unsigned int vec_idx[] = {0,1,2,3,4};
const std::size_t vec_idx_size = sizeof(vec_idx) / sizeof(unsigned int);
{
// All unique combinations of two vectors, for example, 5C2
std::size_t k = 2;
do
{
std::cout << "Vector Indices: ";
for (std::size_t i = 0; i < k; ++i)
{
std::cout << vec_idx[i] << " ";
}
std::cout << "\n";
}
while (next_combination(vec_idx,
vec_idx + k,
vec_idx + vec_idx_size));
}
std::sort(vec_idx,vec_idx + vec_idx_size);
{
// All unique combinations of three vectors, for example, 5C3
std::size_t k = 3;
do
{
std::cout << "Vector Indices: ";
for (std::size_t i = 0; i < k; ++i)
{
std::cout << vec_idx[i] << " ";
}
std::cout << "\n";
}
while (next_combination(vec_idx,
vec_idx + k,
vec_idx + vec_idx_size));
}
return 0;
}
Note 1: Because of the iterator oriented interface for the next_combination routine, any STL container that supports forward iteration via iterators can also be used, such as std::vector, std::deque and std::list just to name a few.
Note 2: This problem is well suited for the application of memoization techniques. In this problem, you can create a map and fill it in with vector sums of given combinations. Prior to computing the sum of a given set of vectors, you can lookup to see if any subset of the sums have already been calculated and use those results. Though you're performing summation which is quite cheap and fast, if the calculation you were performing was to be far more complex and time consuming, this technique would definitely help bring about some major performance improvements.
I think this problem can be solved by marking which combinations have occurred.
My first thought was that you could use a 3-dimensional array to mark which combinations have happened. But that is not very good.
How about a bit array (such as an integer) for flagging? Such as:
Num 1 = 2^0 for vector 1
Num 2 = 2^1 for vector 2
Num 4 = 2^2 for vector 3
Num 8 = 2^3 for vector 4
When you compose vectors, just add up their representative numbers. For example, vector 124 will have the value 1 + 2 + 8 = 11. This value is unique for every combination.
This is just my thought. Hope it helps you in some way.
EDIT: Maybe I was not clear enough about my idea. I'll try to explain it a bit more clearly:
1) Assign each vector a representative number. This number is the id of the vector, and it's unique. Moreover, the sum of every subset of those numbers is unique, which means that if the sum of k representative numbers is M, we can easily tell which vectors took part in the sum.
We do that by assigning 2^0 to vector 1, 2^1 to vector 2, 2^2 to vector 3, and so on...
For every M = sum (2^x + 2^y + 2^z + ... ) = (2^x OR 2^y OR 2^z OR ...), we know that the vectors (x + 1), (y + 1), (z + 1) ... took part in the sum. This can easily be checked by expressing the number in binary.
For example, we know that:
2^0 = 1 (binary)
2^1 = 10 (binary)
2^2 = 100 (binary)
...
So if the sum is 10010 (binary), we know that vector(number: 10) and vector(number: 10000) took part in the sum.
Best of all, the sum here can be calculated with the "OR" operator, which is also easily understood if you express the numbers in binary.
2) Using the above facts, every time before you compute the sum of your vectors, you can add/OR their representative numbers first, and you can keep track of them in something like a lookup array. If the sum already exists in the lookup array, you can skip it. That way you can solve the problem.
Maybe I am misunderstanding, but isn't this equivalent to generating all subsets (power set) of 1, 2, 3, 4 and then for each element of the power set, summing the vector? For instance:
//This is pseudo C++ since I'm too lazy to type everything
//push back the vectors or pointers to vectors, etc.
vector< vector< int > > v = v1..v4;
//Populate a vector with 1 to 4
vector< int > n = 1..4
//Function that generates the power set {nil, 1, (1,2), (1,3), (1,4), (1,2,3), etc.
vector< vector < int > > power_vec = generate_power_set(n);
//One might want to make a string key by doing a Perl-style join of the subset together by a comma or something...
map< vector < int >,vector< int > > results;
//For each subset, we sum the original vectors together
for subset_iter over power_vec{
//Assumes all the vectors have the same length; can be modified carefully if not.
//Size (not just reserve) the result so that result[ii] is valid.
vector<int> result(length(v1), 0);
for ii=0 to length(v1){
for iter over subset from subset_iter{
result[ii]+=v[iter][ii];
}
}
results[*subset_iter] = result;
}
If that is the idea you had in mind, you still need a power set function, but that code is easy to find if you search for power set. For example,
Obtaining a powerset of a set in Java.
Maintain a list of all for choosing two values.
Create a vector of sets such that the set consists of elements from the original vector with the 4C2 elements. Iterate over the original vectors and for each one, add/create a set with elements from step 1. Maintain a vector of sets and only if the set is not present, add the result to the vector.
Sum up the vector of sets you obtained in step 2.
But as you indicated, the easiest is 4C3.
Here is something written in Python. You can adapt it to C++.
import itertools
l1 = ['v1','v2','v3','v4']
res = []
for e in itertools.combinations(l1,2):
    res.append(e)
fin = []
for e in res:
    for l in l1:
        aset = set((e[0],e[1],l))
        if aset not in fin and len(aset) == 3:
            fin.append(aset)
print(fin)
This results in (as printed by Python 2):
[set(['v1', 'v2', 'v3']), set(['v1', 'v2', 'v4']), set(['v1', 'v3', 'v4']), set(['v2', 'v3', 'v4'])]
This is the same result as 4C3.

How to find if 3 numbers in a set of size N exactly sum up to M

I want to know how I can implement a better solution than O(N^3). It's similar to the knapsack and subset problems. In my question N<=8000, so I started computing sums of pairs of numbers and stored them in an array. Then I would binary search in the sorted set for each (M-sum[i]) value, but the problem is how to keep track of the indices that summed up to sum[i]. I know I could declare extra space, but my Sums array already has a size of 64 million, and hence I couldn't complete my O(N^2) solution. Please advise if I can do some optimization or if I need some totally different technique.
You could benefit from some generic tricks to improve the performance of your algorithm.
1) Don't store what you use only once
It is a common error to store more than you really need. Whenever your memory requirements seem to blow up, the first question to ask yourself is: do I really need to store that stuff? Here it turns out that you do not (as Steve explained in the comments): compute the sum of two numbers (in a triangular fashion to avoid repeating yourself) and then check for the presence of the third one.
We drop the O(N**2) memory complexity! Now expected memory is O(N).
2) Know your data structures, and in particular: the hash table
Perfect hash tables are rarely (if ever) implemented, but it is (in theory) possible to craft hash tables with O(1) insertion, check and deletion characteristics, and in practice you do approach those complexities (though it generally comes at the cost of a high constant factor that will make you prefer so-called suboptimal approaches).
Therefore, unless you need ordering (for some reason), membership is better tested through a hash table in general.
We drop the 'log N' term in the speed complexity.
With those two recommendations you easily get what you were asking for:
Build a simple hash table: the number is the key, the index the satellite data associated
Iterate in triangle fashion over your data set: for i in [0..N-1]; for j in [i+1..N-1]
At each iteration, check if K = M - set[i] - set[j] is in the hash table, if it is, extract k = table[K] and if k != i and k != j store the triple (i,j,k) in your result.
If a single result is sufficient, you can stop iterating as soon as you get the first result, otherwise you just store all the triples.
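The steps above could be sketched roughly like this in Python. This is only a minimal sketch, not the answerer's code: the name find_triple_hash is made up for illustration, a plain dict stands in for the hash table, and it stops at the first triple found.

```python
def find_triple_hash(values, m):
    """Return indices (i, j, k) with values[i] + values[j] + values[k] == m, or None.

    Hypothetical helper for illustration; a dict plays the role of the hash table.
    """
    # number -> index (the "satellite data"); for repeated numbers the last index wins
    index_of = {v: idx for idx, v in enumerate(values)}
    n = len(values)
    # triangular iteration: every unordered pair (i, j) is visited once
    for i in range(n):
        for j in range(i + 1, n):
            k = index_of.get(m - values[i] - values[j])
            # the third element must be a different position than i and j
            if k is not None and k != i and k != j:
                return (i, j, k)
    return None


if __name__ == "__main__":
    data = [4, 8, 19, 5, 12, 35, 6, 1]
    print(find_triple_hash(data, 18))
```

Memory stays at O(N) for the dict, and the double loop is O(N^2) on average, assuming dict lookups behave like O(1). To collect all triples instead of the first one, append to a list rather than returning.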
There is a simple O(n^2) solution to this that uses only O(1)* memory if you only want to find the 3 numbers (O(n) memory if you want the indices of the numbers and the set is not already sorted).
First, sort the set.
Then for each element x in the set, see if there are two (other) numbers that sum to M - x. This is a common interview question and can be done in O(n) on a sorted set.
The idea is that you start one pointer at the beginning and one at the end. If the current sum is greater than the target, decrement the end pointer; if it is smaller, increment the start pointer; if it equals the target, you are done.
So for each of the n numbers we do an O(n) search and we get an O(n^2) algorithm.
*Note that this requires a sort that uses O(1) memory. Hell, since the sort need only be O(n^2) you could use bubble sort. Heapsort is O(n log n) and uses O(1) memory.
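For what it's worth, here is a minimal Python sketch of the sort-plus-two-pointers idea described above. The function name find_triple_sorted is hypothetical, and the sketch uses sorted() for brevity, so it makes a copy and does not try to achieve the O(1) memory bound.

```python
def find_triple_sorted(values, m):
    """Return three values from `values` that sum to m, or None.

    Hypothetical helper for illustration: sort, then run a two-pointer scan
    for each fixed first element.
    """
    q = sorted(values)  # copy + sort; an in-place O(1)-memory sort is not attempted here
    n = len(q)
    for k in range(n - 2):
        lo, hi = k + 1, n - 1  # only look to the right of k, so no element is reused
        while lo < hi:
            s = q[k] + q[lo] + q[hi]
            if s == m:
                return (q[k], q[lo], q[hi])
            if s < m:
                lo += 1   # sum too small: move the start pointer up
            else:
                hi -= 1   # sum too large: move the end pointer down
    return None


if __name__ == "__main__":
    print(find_triple_sorted([4, 8, 19, 5, 12, 35, 6, 1], 18))
```

Each inner scan is O(n), so the whole thing is O(n^2) after the O(n log n) sort. If you need the original indices, sort (value, index) pairs instead, which is where the extra O(n) memory comes from.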
Create a "bitset" of all the numbers which makes it constant time to check if a number is there. That is a start.
The solution will then be at most O(N^2) to make all combinations of 2 numbers.
The only tricky bit here is when the solution contains a repeat, but it doesn't really matter, you can discard repeats unless it is the same number 3 times because you will hit the "repeat" case when you pair up the 2 identical numbers and see if the unique one is present.
The 3 times one is simply a matter of checking if M is divisible by 3 and whether M/3 appears 3 times as you create the bitset.
This solution does require creating extra storage, up to MAX/8 where MAX is the highest number in your set. You could use a hash table though if this number exceeds a certain point: still O(1) lookup.
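A rough Python sketch of the presence-table idea, under some loud assumptions: the values are distinct non-negative integers (so the repeat handling discussed above is left out), the name find_triple_table is made up, and a bytearray with one byte per value stands in for a real bitset to keep the code short.

```python
def find_triple_table(values, m):
    """Return (a, b, c) with a < b < c, all in `values`, and a + b + c == m, or None.

    Hypothetical helper for illustration. Assumes distinct non-negative integers.
    """
    if not values:
        return None
    top = max(values)
    present = bytearray(top + 1)  # present[v] == 1 means v is in the set
    for v in values:
        present[v] = 1
    vals = sorted(values)
    for idx, a in enumerate(vals):
        for b in vals[idx + 1:]:
            c = m - a - b
            # requiring b < c keeps the three numbers distinct and avoids
            # reporting the same triple in a different order
            if b < c <= top and present[c]:
                return (a, b, c)
    return None


if __name__ == "__main__":
    print(find_triple_table([4, 8, 19, 5, 12, 35, 6, 1], 18))
```

The table costs MAX+1 bytes here (a true bitset would need about MAX/8), and the pair loop is O(N^2) with a constant-time membership check.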
This appears to work for me...
#include <iostream>
#include <set>
#include <algorithm>
using namespace std;
int main(void)
{
set<long long> keys;
// By default this set is sorted
set<short> N;
N.insert(4);
N.insert(8);
N.insert(19);
N.insert(5);
N.insert(12);
N.insert(35);
N.insert(6);
N.insert(1);
typedef set<short>::iterator iterator;
const short M = 18;
for(iterator i(N.begin()); i != N.end() && *i < M; ++i)
{
short d1 = M - *i; // subtract the value at this location
// if there is more to "consume"
if (d1 > 0)
{
// ignore below i as we will have already scanned it...
for(iterator j(i); j != N.end() && *j < M; ++j)
{
short d2 = d1 - *j; // again "consume" as much as we can
// now the remainder must exist in our set N
if (N.find(d2) != N.end())
{
// means that the three numbers we've found, *i (from first loop), *j (from second loop) and d2 exist in our set of N
// now to generate the unique combination, we need to generate some form of key for our keys set
// here we take advantage of the fact that all the numbers fit into a short, we can construct such a key with a long long (8 bytes)
// the 8 byte key is made up of 2 bytes for i, 2 bytes for j and 2 bytes for d2
// and is formed in sorted order
long long key = *i; // first index is easy
// second index slightly trickier, if it's less than j, then this short must be "after" i
if (*i < *j)
key = (key << 16) | *j;
else
key |= (static_cast<int>(*j) << 16); // else it's before i
// now the key is either: i | j, or j | i (where i & j are two bytes each, and the key is currently 4 bytes)
// third index is a bugger, we have to scan the key in two byte chunks to insert our third short
if ((key & 0xFFFF) < d2)
key = (key << 16) | d2; // simple, it's the largest of the three
else if (((key >> 16) & 0xFFFF) < d2)
key = (((key << 16) | (key & 0xFFFF)) & 0xFFFF0000FFFFLL) | (d2 << 16); // its less than j but greater i
else
key |= (static_cast<long long>(d2) << 32); // it's less than i
// Now if this unique key already exists in the hash, this won't insert an entry for it
keys.insert(key);
}
// else don't care...
}
}
}
// tells us how many unique combinations there are
cout << "size: " << keys.size() << endl;
// prints out the 6 bytes for representing the three numbers
for(set<long long>::iterator it (keys.begin()), end(keys.end()); it != end; ++it)
cout << hex << *it << endl;
return 0;
}
Okay, here is attempt two: this generates the output:
start: 19
size: 4
10005000c
400060008
500050008
600060006
As you can see from there, the first "key" is the three shorts (in hex), 0x0001, 0x0005, 0x000C (which is 1, 5, 12 = 18), etc.
Okay, cleaned up the code some more, realised that the reverse iteration is pointless..
My Big O notation is not the best (never studied computer science), however I think the above is something like O(N) for the outer loop and O(N log N) for the inner one, so O(N^2 log N) overall. The reason for the log N is that std::set::find() is logarithmic; if you replace this with a hashed set, the inner loop could be as good as O(N). Please, someone correct me if this is crap...
I combined the suggestions by @Matthieu M. and @Chris Hopman, and (after much trial and error) I came up with this algorithm that should be O(n log n + log (n-k)! + k) in time and O(log(n-k)) in space (the stack). That should be O(n log n) overall. It's in Python, but it doesn't use any Python-specific features.
import bisect

def binsearch(r, q, i, j): # O(log (j-i))
    return bisect.bisect_left(q, r, i, j)

def binfind(q, m, i, j):
    while i + 1 < j:
        r = m - (q[i] + q[j])
        if r < q[i]:
            j -= 1
        elif r > q[j]:
            i += 1
        else:
            k = binsearch(r, q, i + 1, j - 1) # O(log (j-i))
            if not (i < k < j):
                return None
            elif q[k] == r:
                return (i, k, j)
            else:
                return (
                    binfind(q, m, i + 1, j)
                    or
                    binfind(q, m, i, j - 1)
                )

def find_sumof3(q, m):
    return binfind(sorted(q), m, 0, len(q) - 1)
Not trying to boast about my programming skills or add redundant stuff here.
Just wanted to provide beginners with an implementation in C++.
Implementation based on the pseudocode provided by Charles Ma in "Given an array of numbers, find out if 3 of them add up to 0".
I hope the comments help.
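#include <iostream>
#include <vector>
using namespace std;
void merge(int originalArray[], int low, int high, int sizeOfOriginalArray){
// Step 4: Merge sorted halves into an auxiliary array
// (a std::vector is used because variable-length arrays are not standard C++)
vector<int> aux(sizeOfOriginalArray);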
int auxArrayIndex, left, right, mid;
auxArrayIndex = low;
mid = (low + high)/2;
right = mid + 1;
left = low;
// choose the smaller of the two values "pointed to" by left, right
// copy that value into auxArray[auxArrayIndex]
// increment either left or right as appropriate
// increment auxArrayIndex
while ((left <= mid) && (right <= high)) {
if (originalArray[left] <= originalArray[right]) {
aux[auxArrayIndex] = originalArray[left];
left++;
auxArrayIndex++;
}else{
aux[auxArrayIndex] = originalArray[right];
right++;
auxArrayIndex++;
}
}
// here when one of the two sorted halves has "run out" of values, but
// there are still some in the other half; copy all the remaining values
// to auxArray
// Note: only 1 of the next 2 loops will actually execute
while (left <= mid) {
aux[auxArrayIndex] = originalArray[left];
left++;
auxArrayIndex++;
}
while (right <= high) {
aux[auxArrayIndex] = originalArray[right];
right++;
auxArrayIndex++;
}
// all values are in auxArray; copy them back into originalArray
int index = low;
while (index <= high) {
originalArray[index] = aux[index];
index++;
}
}
void mergeSortArray(int originalArray[], int low, int high){
int sizeOfOriginalArray = high + 1;
// base case
if (low >= high) {
return;
}
// Step 1: Find the middle of the array (conceptually, divide it in half)
int mid = (low + high)/2;
// Steps 2 and 3: Recursively sort the 2 halves of originalArray and then merge those
mergeSortArray(originalArray, low, mid);
mergeSortArray(originalArray, mid + 1, high);
merge(originalArray, low, high, sizeOfOriginalArray);
}
//O(n^2) solution without hash tables
//Using the sorted array: for each number, use two pointers, one starting just after that number and one starting from the end of the array. Check whether the sum of the current number and the two elements pointed to is >, < or == targetSum, and advance the pointers accordingly, or return true if targetSum is found.
bool is3SumPossible(int originalArray[], int targetSum, int sizeOfOriginalArray){
int high = sizeOfOriginalArray - 1;
mergeSortArray(originalArray, 0, high);
int temp;
for (int k = 0; k < sizeOfOriginalArray; k++) {
// start i after k and keep i < j so that three distinct elements are used
for (int i = k + 1, j = sizeOfOriginalArray-1; i < j; ) {
temp = originalArray[k] + originalArray[i] + originalArray[j];
if (temp == targetSum) {
return true;
}else if (temp < targetSum){
i++;
}else if (temp > targetSum){
j--;
}
}
}
return false;
}
int main()
{
int arr[] = {2, -5, 10, 9, 8, 7, 3};
int size = sizeof(arr)/sizeof(int);
int targetSum = 5;
//3Sum possible?
bool ans = is3SumPossible(arr, targetSum, size); //the size of the array is passed as a function parameter because the array itself is passed as a pointer, so it is cumbersome to calculate the size inside is3SumPossible()
if (ans) {
cout<<"Possible";
}else{
cout<<"Not possible";
}
return 0;
}