Optimization algorithm with numbers - c++

Given a list of numbers in increasing order and a target sum, I'm trying to implement the optimal way of reaching that sum, using the biggest number first.
A sample input would be:
3
1
2
5
11
where the first line is the number of numbers we are using and the last line is the desired sum.
The output would be:
1 x 1
2 x 5
which equals 11
I'm trying to adapt this https://www.classle.net/book/c-program-making-change-using-greedy-method to use standard input.
Here is what I have so far:
#include <iostream>
using namespace std;
int main()
{
int sol = 0; int array[]; int m[10];
while (!cin.eof())
{
cin >> array[i]; // add inputs to an array
i++;
}
x = array[0]; // number of
for (int i; i < x ; i++) {
while(sol<array[x+1]){
// try to check all multiplications of the largest number until its over the sum
// save the multiplication number into the m[] before it goes over the sum;
//then do the same with the second highest number and check if they can add up to sum
}
cout << m[//multiplication number] << "x" << array[//correct index]
return 0;
}
if(sol!=array[x+1])
{
cout<<endl<<"Not Possible!";
}
}
I'm finding it hard to come up with an efficient way of doing this, in terms of trying all possible combinations starting with the biggest number. Any suggestions would be greatly appreciated, since I know I'm clearly off.

The problem is a variation of the subset sum problem, which is NP-Hard.
An NP-Hard problem is (among other things) one for which no polynomial-time solution is known, so the greedy approach of "taking the highest first" is not guaranteed to work for it.
However, this NP-Hard problem has a pseudo-polynomial solution using dynamic programming. The variant where you can choose each number more than once is called the coin change problem.
This page contains an explanation and possible solutions for the problem.
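As a rough illustration of the dynamic programming approach mentioned above, here is a minimal sketch. It assumes that "optimal" means using as few numbers as possible and that all numbers are positive; the helper name minCoinCounts is hypothetical and not taken from the question or the linked page.

```cpp
#include <limits>
#include <vector>

// Hypothetical helper: returns how many times each number in `coins` is used
// in a fewest-numbers solution for `target`, or an empty vector if `target`
// cannot be formed. Assumes every entry of `coins` is positive.
std::vector<int> minCoinCounts(const std::vector<int>& coins, int target) {
    const int INF = std::numeric_limits<int>::max();
    std::vector<int> best(target + 1, INF);  // best[s] = fewest numbers summing to s
    std::vector<int> last(target + 1, -1);   // index of the number added last to reach s
    best[0] = 0;
    for (int s = 1; s <= target; ++s) {
        for (std::size_t c = 0; c < coins.size(); ++c) {
            if (coins[c] <= s && best[s - coins[c]] != INF &&
                best[s - coins[c]] + 1 < best[s]) {
                best[s] = best[s - coins[c]] + 1;
                last[s] = static_cast<int>(c);
            }
        }
    }
    if (best[target] == INF) return {};
    // Walk back through `last` to count how often each number was used.
    std::vector<int> counts(coins.size(), 0);
    for (int s = target; s > 0; s -= coins[last[s]]) ++counts[last[s]];
    return counts;
}
```

For the sample input (numbers 1, 2, 5 and sum 11) this yields the counts 1, 0, 2, matching "1 x 1" and "2 x 5". With the numbers 1, 3, 4 and sum 6 it yields two 3s, whereas taking the biggest number first would use 4 + 1 + 1.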

Related

Recurrence relation for a variant of knapsack problem?

I am finding it difficult to understand two specific implementations which solve this problem on Codeforces.
I understand this is similar to the knapsack problem. However, when I solved it myself, I was not aware of the algorithm. I solved it from my own understanding of dynamic programming. My idea is to regard the remaining length of the ribbon as the next state. Here's my code:
#include<iostream>
using namespace std;
int main(){
int n,res1=0,res2,x=0;
int a,b,c;
cin >> n >> a >> b >> c;
for(int i=0;i <= n/a; i++){
res2 = -20000;
for(int j=0; j <= (n-(a*i))/b; j++){
x = (n - (a*i) - (b*j));
res2=max(res2,(j + ((x % c) ? -10000 : x/c)));
}
res1=max(res1,i+res2);
}
cout << res1 << endl;
return 0;
}
Implementation 1:
1 #include <bits/stdc++.h>
2 using namespace std;
3 int main()
4 {
5 int f[4005],n,a,i,j;
6 fill(f+1,f+4005,-1e9);
7 cin>>n;
8 for(;cin>>a;)
9 for(i=a;i<=n;i++)
10 f[i]=max(f[i],f[i-a]+1);
11 cout<<f[n];
12 }
Implementation 2:
1 #include <bits/stdc++.h>
2 int n, a, b, c, ost;
3 std::bitset<4007> mog;
4 main()
5 {
6 std::cin>>n>>a>>b>>c;
7 mog[0]=1;
8 for (int i=1; i<=n; i++)
9 if ((mog=((mog<<a)|(mog<<b)|(mog<<c)))[n])
10 ost=i;
11 std::cout << ost;
12 }
Though I understand the general idea of solving the knapsack problem, I do not have a clear understanding of how lines 8, 9, 10 in Implementation 1 achieve this. Specifically, irrespective of the input values of a, b, c, the inner for loop is a single pass over the array for the corresponding value a received.
Similarly, I can see that lines 8, 9, 10 in Implementation 2 do the same thing, but I have no clue at all how this piece of code works.
Please help me understand this. I feel there is some hidden structure to these two solutions which I am not seeing. Thanks in advance.
Implementation 1
This is a quite straightforward implementation of dynamic programming.
The outer loop just goes through the three values a, b, and c:
8 for(;cin>>a;)
The inner loop visits every element of the array and updates the best known number of cuts for each ribbon length.
9 for(i=a;i<=n;i++)
10 f[i]=max(f[i],f[i-a]+1);
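To make the structure of lines 8 to 10 easier to see, here is a sketch of the same recurrence as a standalone function. The name maxPieces and the vector-based interface are assumptions for illustration; the recurrence itself is the one from Implementation 1.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical wrapper around the recurrence of Implementation 1.
// f[len] = largest number of pieces that add up to exactly len,
// or a large negative sentinel if len cannot be formed.
// Assumes every piece size is at least 1.
int maxPieces(int n, const std::vector<int>& sizes) {
    const int NEG = -1000000000;
    std::vector<int> f(n + 1, NEG);
    f[0] = 0;  // a ribbon of length 0 needs zero pieces
    for (int a : sizes) {
        // i goes upward, so f[i - a] may already contain pieces of size a.
        // That is what lets the same piece size be used several times.
        for (int i = a; i <= n; ++i) {
            f[i] = std::max(f[i], f[i - a] + 1);
        }
    }
    return f[n];  // negative if n cannot be cut exactly
}
```

Calling maxPieces(5, {5, 3, 2}) gives 2 (pieces 3 and 2), and maxPieces(10, {2, 3, 5}) gives 5 (five pieces of size 2). A single upward pass per piece size is enough because each pass extends every length that was reachable before it, including lengths produced earlier in the same pass.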
Implementation 2
I don't think it can be called dynamic programming, but the trick is quite neat.
It allocates an array of bits with length equal to the maximum n, then sets bit 0 (the rightmost bit in the diagrams below). This means that a ribbon of length 0 is a valid solution.
On each iteration the algorithm shifts the array to the left by a, b, and c. The result of each shift can be viewed as the new valid ribbon sizes. By OR-ing the results of all three shifts, we get all valid sizes after the i-th cut. If the n-th bit is set, we know a ribbon of size n can be cut into i pieces without remainder.
n = 10
a = 2
b = 3
c = 5
i=1:
0|0000000001 // mog
0|0000000100 // mog<<a
0|0000001000 // mog<<b
0|0000100000 // mog<<c
0|0000101100 // mog=(mog<<a)|(mog<<b)|(mog<<c)
^ here is a bit checked in 'if' statement '(mog=(...))[n]'
i=2:
0|0000101100 // mog
0|0010110000 // mog<<a
0|0101100000 // mog<<b
1|0110000000 // mog<<c // here we have solution with two pieces of size 5
1|0111110000 // (mog<<a)|(mog<<b)|(mog<<c)
^ now bit set, so we have a solution
We know that there are exactly i cuts at that point, so we set ost=i. But this is the worst solution; we have to keep going until we are sure that there are no more solutions.
Eventually we will reach this state:
i=5:
1|1100000000 // mog
1|0000000000 // mog<<a // 5 pieces of size 2
0|0000000000 // mog<<b
0|0000000000 // mog<<c
1|0000000000 // (mog<<a)|(mog<<b)|(mog<<c)
This is the last time the bit at position n is set, so we set ost=5 and then run some more iterations that change nothing.
The algorithm uses n as the upper bound on the number of cuts, but this bound can clearly be improved. For example, n / min({a,b,c}) should be sufficient.
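The walkthrough above, together with the tighter bound, could be sketched as follows. The function name maxPiecesBitset and the variable names are assumptions; the bitset size of 4007 is taken from Implementation 2, so n is assumed to be at most 4000 and a, b, c at least 1.

```cpp
#include <algorithm>
#include <bitset>

// Hypothetical function form of Implementation 2 with the loop bound
// n / min({a, b, c}) instead of n.
int maxPiecesBitset(int n, int a, int b, int c) {
    std::bitset<4007> reachable;  // after i rounds: bit L is set if L = sum of exactly i pieces
    reachable[0] = 1;             // length 0 is reachable with zero pieces
    int best = 0;
    const int limit = n / std::min({a, b, c});  // no solution can use more pieces than this
    for (int i = 1; i <= limit; ++i) {
        reachable = (reachable << a) | (reachable << b) | (reachable << c);
        if (reachable[n]) best = i;  // later hits overwrite earlier ones, so the largest i wins
    }
    return best;
}
```

For n = 10, a = 2, b = 3, c = 5 the loop runs five times and returns 5, the same state as the i=5 diagram.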

Dynamic Programming w/ 1D array USACO Training: Subset Sums

While working through the USACO Training problems, I found out about Dynamic Programming. The first training problem that deals with this concept is a problem called Subset Sums.
The Problem Statement Follows:
For many sets of consecutive integers from 1 through N (1 <= N <= 39), one can partition the set into two sets whose sums are identical.
For example, if N=3, one can partition the set {1, 2, 3} in one way so that the sums of both subsets are identical:
{3} and {1,2}
This counts as a single partitioning (i.e., reversing the order counts as the same partitioning and thus does not increase the count of partitions).
If N=7, there are four ways to partition the set {1, 2, 3, ... 7} so that each partition has the same sum:
{1,6,7} and {2,3,4,5}
{2,5,7} and {1,3,4,6}
{3,4,7} and {1,2,5,6}
{1,2,4,7} and {3,5,6}
Given N, your program should print the number of ways a set containing the integers from 1 through N can be partitioned into two sets whose sums are identical. Print 0 if there are no such ways.
Your program must calculate the answer, not look it up from a table.
INPUT FORMAT
The input file contains a single line with a single integer representing N, as above.
SAMPLE INPUT (file subset.in)
7
OUTPUT FORMAT
The output file contains a single line with a single integer that tells how many same-sum partitions can be made from the set {1, 2, ..., N}. The output file should contain 0 if there are no ways to make a same-sum partition.
SAMPLE OUTPUT (file subset.out)
4
After much reading, I found an algorithm that was explained to be a variation of the 0/1 knapsack problem. I implemented it in my code, and I solved the problem. However, I have no idea how my code works or what is going on.
*Main Question: I was wondering if someone could explain to me how the knapsack algorithm works, and how my program could possibly be implementing this in my code?
My code:
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
ifstream fin("subset.in");
ofstream fout("subset.out");
long long num=0, ways[800]={0};
ways[0]=1;
fin >> num;
if(((num*(num+1))/2)%2 == 1)
{
fout << "0" << endl;
return 0;
}
//THIS IS THE BLOCK OF CODE THAT IS SUPPOSED TO BE DERIVED FROM THE
// 0/1 KNAPSACK PROBLEM
for (int i = 1; i <= num; i++)
{
for (int j = (num*(num+1))/2 - i; j >= 0; --j)
{
ways[j + i] += ways[j];
}
}
fout << ways[(num*(num+1))/2/2]/2 << endl;
return 0;
}
*note: Just to emphasize, this code does work, I just would like an explanation why it works. Thanks :)
I wonder why numerous sources could not help you.
Trying one more time with my ugly English:
ways[0]=1;
There is exactly one way to make the empty sum.
(num*(num+1))/2
This is MaxSum, the sum of all numbers in the range 1..num (the sum of an arithmetic progression).
if(((num*(num+1))/2)%2 == 1)
An odd value cannot be divided into two equal parts.
for (int i = 1; i <= num; i++)
For every number in the range:
for (int j = (num*(num+1))/2 - i; j >= 0; --j)
ways[j + i] += ways[j];
The sum j + i can be built from the sum j plus the item with value i.
For example, suppose you want to make the sum 15.
At the first step of the outer loop you are using the number 1, and there are ways[14] variants to make this sum.
At the second step of the outer loop you are using the number 2, and there are ways[13] new variants to make this sum; you have to add these new variants.
At the third step of the outer loop you are using the number 3, and there are ways[12] new variants to make this sum; you have to add these new variants.
ways[(num*(num+1))/2/2]/2
Output the number of ways to make MaxSum/2, divided by two to exclude symmetric variants ([1,4]+[2,3] and [2,3]+[1,4]).
Question for self-study: why does the inner loop go in the reverse direction?
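For experimenting with small values of N, the steps above can be wrapped in a function. This is only a sketch: the name countEqualSumPartitions is hypothetical, and unlike the original program it tracks sums only up to MaxSum/2, which is an assumption that larger sums are never needed for the final answer.

```cpp
#include <vector>

// Hypothetical function form of the counting loop discussed above.
// Returns the number of ways to split {1, ..., n} into two sets with equal sums.
long long countEqualSumPartitions(int n) {
    const int total = n * (n + 1) / 2;  // MaxSum
    if (total % 2 != 0) return 0;       // an odd total cannot be halved
    const int half = total / 2;
    std::vector<long long> ways(half + 1, 0);
    ways[0] = 1;                        // one way to make the empty sum
    for (int i = 1; i <= n; ++i) {
        // j runs downward, exactly as in the original code
        for (int j = half - i; j >= 0; --j) {
            ways[j + i] += ways[j];
        }
    }
    return ways[half] / 2;              // each partition was counted twice
}
```

With n = 7 this returns 4, matching the sample output, and with n = 3 it returns 1.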

How is this code working for finding the number of divisors of a number?

http://www.spoj.com/problems/NDIV/
This is the problem statement. Since I'm new to programming, this particular problem stumped me. I found this code on the internet, and it got AC when I submitted it. I want to know how it works, since submitting code from an online source without understanding it is a bad idea for beginners.
#include <bits/stdc++.h>
using namespace std;
int check[32000];
int prime[10000];
void shieve()
{
for(int i=3;i<=180;i+=2)
{
if(!check[i])
{
for(int j=i*i;j<32000;j+=i) // j<32000 keeps the write inside check[0..31999]
check[j]=1;
}
}
prime[0] = 2;
int j=1;
for(int i=3;i<=32000;i+=2)
{
if(!check[i]){
prime[j++]=i;
}
}
}
int main()
{
shieve();
int a,b,n,temp,total=1,res=0;
scanf("%d%d%d",&a,&b,&n);
int count=0,i,j,k;
for(i=a;i<=b;i++)
{
temp=i;
total=1;
k=0;
for(j=prime[k];j*j<=temp;j=prime[++k])
{
count=0;
while(temp%j==0)
{
count++;
temp/=j;
}
total *=count+1;
}
if(temp!=1)
total*=2;
if(total==n)
res++;
}
printf("%d\n",res);
return 0;
}
It looks like the code is based on the sieve of Eratosthenes, but there are a few things I'm unable to understand:
Why is the limit of the array "check" 32000?
Again, why is the limit of the array "prime" 10000?
Inside main, what is happening inside the for loop over j?
I have too many confusions regarding this approach. Can someone explain how the whole algorithm works?
1. The hard limits on the arrays are probably set because the problem's constraints demand them. If not, it's just bad code.
2. Inside the inner loop, you are calculating the largest power of a prime that divides the number. Why? See point 3.
3. The number of factors of a number n can be calculated as follows:
Let n = (p1)^(n1) * (p2)^(n2) * ... where p1, p2, ... are primes and n1, n2, ... are their exponents. Then the number of factors of n is (n1 + 1) * (n2 + 1) * ...
Hence the line total *= count + 1, which is short for total = total * (count + 1) (where count is the exponent of the current prime in the original number), accumulates the number of factors of the number.
And yes, the code implements the sieve of Eratosthenes for storing primes in a table.
(Edit: I just saw the problem. You need at least 10^4 boolean values to store the primes (you don't actually need to store the values, just a flag indicating whether each value is prime or not). The given condition is 0 <= b - a <= 10^4, so loop from a to b and check the flags stored in the array to know whether each value is prime.)
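The divisor-count formula from point 3 can be tried in isolation with a short sketch. It deliberately uses plain trial division by every integer instead of the sieve's prime table, which is a simplification; the function name countDivisors is hypothetical, and n is assumed to be at least 1.

```cpp
// Hypothetical helper: number of divisors of n via (n1 + 1) * (n2 + 1) * ...
// Uses trial division by every integer p with p * p <= n (no prime table).
int countDivisors(long long n) {
    int total = 1;
    for (long long p = 2; p * p <= n; ++p) {
        int exponent = 0;        // exponent of p in the factorization of n
        while (n % p == 0) {
            n /= p;
            ++exponent;
        }
        total *= exponent + 1;   // composite p never divides here, so this multiplies by 1
    }
    if (n > 1) total *= 2;       // one prime factor larger than sqrt(original n) remains
    return total;
}
```

For example, 12 = 2^2 * 3 gives (2 + 1) * (1 + 1) = 6 divisors. The final "if" plays the same role as "if(temp!=1) total*=2;" in the question's code.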

Odds of winning lottery C++

I have an assignment that asks us to make a program in C++ that takes input from a user for the amount of numbers on a lottery ticket and the amount of numbers in a lottery drawing. It should then calculate the odds of the user getting the numbers correct. This is (more or less) the first program I am writing in C++, so I am new to this. What I have so far is below. I am seeking help with making the program work. I can get values in for the declared variables, but cannot figure out how to write down what I actually need to do, which is a factorial function. I know the function, I just don't know how to say it in C++.
From what I understand at this point is that it should look something like this:
for (int i = 1; i <= k; i++) {
result = (result * (n+1-i)) / i;
}
or something to that effect? At least this is what I have come across in the past couple of hours of searching for an answer online. I think I am getting close to figuring it out, but I am at a roadblock.
I don't want someone to just tell me the answer. If you could explain to me what I am doing wrong and what I can do to fix it that would be most helpful for me.
#include <iostream>
#include <iomanip>
using namespace std;
int main (int argc, char** argv)
{
int n, k;
int odds;
cout<< "How many numbers are printed on the lottery ticket? ";
cin >> n ;
cout<<"How many numbers are selected in the lottery drawing? ";
cin >> k ;
cout << "You entered " << n << " for how many numbers are printed on the lottery ticket, and "
<< k << " for how many numbers are selected in the lottery drawing." << endl;
for (int i = 1; i <= k; i++)
{
odds = (n * (n-k++))/k;
cout << odds;
}
return 0;
}
When I run this I just get an endless stream of "3-3-3-3....". It's non-stop. At one point I was getting a number as the output (one VERY large incorrect number), but while I was tinkering with it I couldn't get it back.
Any guidance would be appreciated.
This seems slightly difficult for a first assignment, unless you're most of the way through a computer science curriculum and only new to C++.
The formula for the odds, which is commonly known as "number of combinations", is frequently written in terms of factorials. But you can't manipulate those factorials effectively on a computer; they are far too large for any of the built-in data types.
Instead, it's important to cancel like terms from numerator and denominator. Interleaving multiplications and divisions can help even more.
I've previously posted working code for number of combinations on another question:
Number of combinations (N choose R) in C++
Your current code actually does have things interleaved pretty well, but you haven't been at all careful with the meanings of i and k and n, and you've also got undefined behavior from both reading and writing a variable between sequence points.
Specifically, this is illegal because the k in the denominator is unstable, since it is in the process of being incremented:
odds = n*(n-k++)/k;
You shouldn't be changing k here at all. The value varying from 1 to k is i. So this becomes:
odds = n * (n-i) / i;
You need all the terms to accumulate across loop iterations, so you should be multiplying by the previous odds value (which means odds has to be initialized to 1 before the loop):
odds = odds * (n - i) / i;
But you do need n - 0 in the numerator, and no 0 in the denominator. You've chosen to make i one-based, so it's the numerator that needs to be adjusted:
odds = odds * (n + 1 - i) / i;
And now your code is extremely close to mine. Depending on your values of n and k you might still overflow. Changing the data type of odds to long long or double should help with that.
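Putting the steps of this answer together gives roughly the following. This is a sketch, not the code from the linked answer: the function name choose is hypothetical, and the k = n - k shortcut is an extra assumption added here to keep intermediate values smaller.

```cpp
// Hypothetical helper: number of ways to choose k items out of n,
// interleaving multiplications and divisions as described above.
unsigned long long choose(unsigned n, unsigned k) {
    if (k > n) return 0;
    if (k > n - k) k = n - k;        // C(n, k) == C(n, n - k); fewer iterations
    unsigned long long result = 1;   // must start at 1, not be left uninitialized
    for (unsigned i = 1; i <= k; ++i) {
        // After this step result == C(n, i), so the division is always exact.
        result = result * (n + 1 - i) / i;
    }
    return result;
}
```

For a lottery with 49 numbers of which 6 are drawn, choose(49, 6) is 13983816, so the odds are 1 in 13983816. Very large inputs can still overflow the intermediate product.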
This is the formula you need:
http://en.wikipedia.org/wiki/Lottery_mathematics
Make sure that you have the mathematics well in hand. Start with a function that implements that formula.
Once you have the formula in hand, you'll realize that the naive student factorial will never work. The biggest factorial that fits in a 64-bit long is 20!; after that it overflows.
The right way to do it is with logarithms and the gamma function:
https://en.wikipedia.org/wiki/Gamma_function
So that formula turns into:
ln(n!/(k!(n-k)!)) = ln(n!) - ln(k!) - ln((n-k)!)
Since gamma(n+1) = n!, this equals:
lngamma(n+1) - lngamma(k+1) - lngamma(n-k+1)
The gamma function returns doubles, not integers or longs. It'll behave much better for you.
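In C++ the log-gamma function is available as std::lgamma in <cmath> (since C++11), so the formula can be sketched as below. The function names logChoose and chooseApprox are hypothetical, and the result is a floating-point approximation rather than an exact integer.

```cpp
#include <cmath>

// Hypothetical helper: natural log of "n choose k" via log-gamma,
// using gamma(x + 1) = x!.
double logChoose(double n, double k) {
    return std::lgamma(n + 1) - std::lgamma(k + 1) - std::lgamma(n - k + 1);
}

// Hypothetical helper: approximate "n choose k" as a double.
double chooseApprox(double n, double k) {
    return std::exp(logChoose(n, k));
}
```

chooseApprox(49, 6) comes out very close to 13983816; rounding with std::round recovers the integer when the value is small enough to be represented exactly.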

Need a way to make this code run faster

I'm trying to solve Project Euler problem 401. The only way I could find to solve it was brute force. I've been running this code for about 10 minutes without any answer. Can anyone help me with ideas to improve it?
Code:
#include <iostream>
#include <cmath>
#define ull unsigned long long
using namespace std;
ull sigma2(ull n);
ull SIGMA2(ull n);
int main()
{
ull ans = SIGMA2(1000000000000000) % 1000000000;
cout << "Answer: " << ans << endl;
cin.get();
cin.ignore();
return 0;
}
ull sigma2(ull n)
{
ull sum = 0;
for(ull i = 1; i<=floor(sqrt(n)); i++)
{
if(n%i == 0)
{
sum += (i*i)+((n/i)*(n/i));
}
if(i*i == n)
{
sum -= n;
}
}
return sum;
}
ull SIGMA2(ull n)
{
ull sum = 0;
for(ull i = 1; i<=n; i++)
{
sum+=sigma2(i);
}
return sum;
}
You're missing some divisors: if a/b=c and b is a divisor of a, then c is also a divisor of a, but c might be greater than floor(sqrt(a)). For example, 3 > floor(sqrt(6)) but 3 divides 6.
Also, you should store floor(sqrt(n)) in a variable and use that variable in the for loop; otherwise you recalculate it at every iteration, which is very expensive.
You can do some straightforward optimizations:
inline sigma2,
calculate floor(sqrt(n)) before the loop (though the compiler may be doing it anyway),
precalculate squares of all ints from 1 to n and then use array lookup instead of multiplication
You will gain more by changing your approach. Think what you are trying to do - summing squares of all divisors of all integers from 1 to n. You grouped divisors by what they divide, but you can regroup terms in this sum. Let's group divisors by their value:
1 divides everything so it will appear n times in the sum, bringing 1*1*n total,
2 divides evens and will appear n/2 (integer division!) times, bringing 2*2*(n/2) total,
k ... will bring k*k*(n/k) total.
So we should just add up k*k*(n/k) for k from 1 to n.
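The regrouped sum can be sketched as below. The function name sigma2SumMod is hypothetical, and reducing modulo 10^9 at every step is an assumption added here (based on the "% 1000000000" in the question) so that the products fit into 64 bits.

```cpp
#include <cstdint>

// Hypothetical helper: sum of k*k*(n/k) for k = 1..n, reduced modulo `mod`.
// Assumes mod is at most about 10^9 so no product exceeds 64 bits.
std::uint64_t sigma2SumMod(std::uint64_t n, std::uint64_t mod = 1000000000ULL) {
    std::uint64_t sum = 0;
    for (std::uint64_t k = 1; k <= n; ++k) {
        std::uint64_t square = (k % mod) * (k % mod) % mod;  // k*k mod `mod`
        std::uint64_t times = (n / k) % mod;                 // how many numbers k divides
        sum = (sum + square * times) % mod;
    }
    return sum;
}
```

For n = 6 this gives 6 + 12 + 18 + 16 + 25 + 36 = 113, the same as adding sigma2(1) through sigma2(6). The loop is still linear in n, so it removes the inner square-root loop but does not by itself make n = 10^15 feasible.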
Think about the problem.
Bruteforce the way you tried is obviously not a good idea.
You should come up with something better...
Isn't there a way to use some nice prime factorization method to speed up the computation? Isn't there a recursive pattern? Try to find something...
One simple optimization you can carry out relies on the fact that many factors are repeated across the numbers.
So first work out in how many numbers 1 is a factor (all N numbers).
Then in how many numbers 2 is a factor (N/2).
...
Similarly for the others.
Just multiply their squares by their frequencies.
The time complexity then drops straight away to O(N).
There are obvious micro-optimizations, such as ++i rather than i++, getting floor(sqrt(n)) out of the loop (these are two floating-point operations which are really expensive compared to the other integer operations in the loop), and calculating n/i only once (store it in a temporary variable and then square that variable).
There are also rather obvious simplifications in the algorithm. For example, SIGMA2(i) = SIGMA2(i-1) + sigma2(i). But do not use recursion: since you need a really huge number, your stack memory would be exhausted. Use a loop instead of recursion. There is a huge potential for improvement.
And there is a bigger problem: 10^15 has 16 digits, so this number squared has 31 digits. There is no way you can store that in an unsigned long long, which holds at most about 20 decimal digits (its maximum is roughly 1.8 * 10^19). So you need to apply the modulo 10^9 (from the end of the assignment) during the calculation to keep the intermediate values small enough.
And when using brute force, print out the intermediate result every million numbers, for example, to give you an idea of how fast you are approaching the final result. Waiting 10 minutes blindly is not a good idea.