Sum of partial group of {x^0, x^1, …, x^y} - C++

I want to write a program whose input is two integer values x and y, and then:
Let s be the set {x^0, x^1, …, x^y}; store it in an array.
Repeat:
Partition the set s into two subsets: s1 and s2.
Find the sum of each of the two subsets and store them in variables, say sum1 and sum2.
Calculate the product sum1 * sum2.
The program ends after going over all the partitions that can be formed, and then prints the maximum value of the product sum1 * sum2.
Example: suppose x = 2, y = 3, so s = {1, 2, 4, 8}. One of the divisions is to take s1 = {1, 4}, s2 = {2, 8}; then sum1 = 5, sum2 = 10 and the product is 50. That is compared to the other products calculated in the same way, e.g. s1 = {1}, s2 = {2, 4, 8} gives sum1 = 1, sum2 = 14 and the product 14, and so on.
My code so far:
#include <iostream>
using namespace std;

int main()
{
    int a[10000]; // Max value expected.
    int x;
    int y;
    cin >> x;
    cin >> y;
    int xexpy = 1;
    int k;
    for (int i = 0; i <= y; i++)
    {
        xexpy = 1;
        k = i;
        while (k > 0)
        {
            xexpy = xexpy * x;
            k--;
        }
        cout << "\n" << xexpy;
        a[i] = xexpy;
    }
    return 0;
}

This is not a programming problem, it is a combinatorics problem with a theoretical rather than an empirical approach to its solution. You can simply print the correct solution and not bother iterating over any partitions.
Why is that?
Let

T = sum1 + sum2 (the sum of all elements of s), and z = sum1 / T,

i.e. z is the fraction of the sum of all s elements that's in s1. It holds that

sum1 = z * T and sum2 = (1 - z) * T,

and thus, the product of both sets satisfies:

sum1 * sum2 = z * (1 - z) * T^2
As a function of z (not of x and y), this is a parabola that takes its maximum at z = 1/2, and there are no other local maxima, i.e. getting closer to 1/2 necessarily increases that product. Thus what you want to do is partition the full set so that s1 and s2 each have as close as possible to half the sum of the elements.
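To see this, differentiate with respect to z (with T the total sum, as above):

d/dz [ z * (1 - z) * T^2 ] = (1 - 2z) * T^2,

which is positive for z < 1/2 and negative for z > 1/2, so z = 1/2 is the unique maximum.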
In general, you might have had to use programming to consider multiple subsets, but your elements are given by a formula, and it's the formula of a geometric sequence, so we can do better.
First, let's assume x >= 2 and y >= 2, otherwise this is not an interesting problem.
Now, for x >= 2, we know that

x^0 + x^1 + … + x^(y-1) = (x^y - 1) / (x - 1)

(the sum of a geometric sequence), and thus

x^0 + x^1 + … + x^(y-1) <= x^y - 1 < x^y,
i.e. the last element always outweighs all the other elements put together. That's why you always want to choose {x^y} as s1 and all the other elements as s2. No need to run any program. You can then also easily calculate the optimum product-of-sums.
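For instance, a minimal sketch of that direct calculation (same input format as the code in the question, assuming x >= 2 and y >= 2 as above; note that long long overflows quickly as x and y grow):

#include <iostream>

int main()
{
    long long x, y;
    std::cin >> x >> y;

    long long xy = 1;
    for (long long i = 0; i < y; ++i)
        xy *= x;                         // xy = x^y

    long long sum2 = (xy - 1) / (x - 1); // x^0 + ... + x^(y-1), geometric sum
    std::cout << xy * sum2 << '\n';      // optimum sum1 * sum2
    return 0;
}

For x = 2, y = 3 this prints 56, i.e. s1 = {8}, s2 = {1, 2, 4}.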
Note: If we don't make assumptions about the elements of s, except that they're non-negative integers, finding the optimum solution is an optimization version of the Partition problem - which is NP-complete. That means, very roughly, that there is no solution that is fundamentally much more efficient than just trying all possible combinations.

Here's a cheesy all-combinations-of-supplied-arguments generator, provided without comment or explanation because I think this is homework, and the exercise of understanding how and why this does what it does is the point here.
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

int main(int c, const char **v)
{
    basic_string<const char *> options(v);
    auto N(options.length());
    for (auto n = 1u; n < N; ++n) {
        vector<char> pick(N);
        fill_n(pick.rbegin(), n, 1);
        do
            for (auto j = 1u; j < N; ++j)
                if (pick[j])
                    cout << options[j] << ' ';
        while (cout << '\n', next_permutation(begin(pick) + 1, end(pick)));
    }
}
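For reference, a possible invocation (assuming the binary was compiled to a.out): ./a.out 1 2 4 8 prints every non-empty subset of the command-line arguments, one subset per line, grouped by subset size. Note that basic_string<const char *> relies on the generic std::char_traits primary template, which the standard does not define; this compiles with libstdc++ but is not portable.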


Wrong answer due to precision issues?

I am implementing Greedy Approach to TSP:
Start from first node.
Go to nearest node not visited yet. (If multiple, go to the one with the lowest index.)
Don't forget to include distance from node 1 to last node visited.
However, my C++ code gives the wrong answer. I implemented the same algorithm in Python, and the Python code gives the right answer.
In my problem, the nodes are coordinates on 2-D plane and the distance is the Euclidean Distance.
I even changed everything to long double because it's more precise.
In fact, if I reverse the direction of the for loop and add an additional if statement to handle ties (we want the minimum-index nearest node), it gives a very different answer.
Is this because of precision issues?
(Note: I have to print floor(ans))
INPUT: Link
EXPECTED OUTPUT: 1203406
ACTUAL OUTPUT: 1200403
#include <iostream>
#include <cmath>
#include <vector>
#include <cassert>
#include <functional>
using namespace std;

int main() {
    freopen("input.txt", "r", stdin);
    int n;
    cin >> n;
    vector<pair<long double, long double>> points(n);
    for (int i = 0; i < n; ++i) {
        int x;
        cin >> x;
        assert(x == i + 1);
        cin >> points[i].first >> points[i].second;
    }
    // Returns the squared Euclidean Distance
    function<long double(int, int)> dis = [&](int x, int y) {
        long double ans = (points[x].first - points[y].first) * (points[x].first - points[y].first);
        ans += (points[x].second - points[y].second) * (points[x].second - points[y].second);
        return ans;
    };
    long double ans = 0;
    int last = 0;
    int cnt = n - 1;
    vector<int> taken(n, 0);
    taken[0] = 1;
    while (cnt > 0) {
        pair<long double, int> mn = {1e18, 1e9};
        for (int i = 0; i < n; ++i) {
            if (!taken[i]) {
                mn = min(mn, {dis(i, last), i});
            }
        }
        int nex = mn.second;
        taken[nex] = 1;
        cnt--;
        ans += sqrt(mn.first);
        last = nex;
    }
    ans += sqrt(dis(0, last));
    cout << ans << '\n';
    return 0;
}
UPD: Python Code:
import math

file = open("input.txt", "r")
n = int(file.readline())
a = []
for i in range(n):
    data = file.readline().split(" ")
    a.append([float(data[1]), float(data[2])])
for c in a:
    print(c)

def dis(x, y):
    cur_ans = (a[x][0] - a[y][0]) * (a[x][0] - a[y][0])
    cur_ans += (a[x][1] - a[y][1]) * (a[x][1] - a[y][1])
    cur_ans = math.sqrt(cur_ans)
    return cur_ans

ans = 0.0
last = 0
cnt = n - 1
take = []
for i in range(n):
    take.append(0)
take[0] = 1
while cnt > 0:
    idx = -1
    cur_dis = 1e18
    for i in range(n):
        if take[i] == 0:
            if dis(i, last) < cur_dis:
                cur_dis = dis(i, last)
                idx = i
    assert(idx != -1)
    take[idx] = 1
    cnt -= 1
    ans += cur_dis
    last = idx
ans += dis(0, last)
print(ans)
file.close()
# 1203406
Yes, the difference is due to round-off error, with the C++ code producing the more accurate result because of your use of long double. If you change your C++ code so that it uses the same precision as Python (IEEE-754, meaning double precision), you get the exact same round-off errors in both codes. Here is a demonstrator in the Godbolt compiler explorer, with your example boiled down to 4000 points: https://godbolt.org/z/rddrdT54n
If I run the same code on the whole input file I get 1203406.5012708856 in C++ and in Python (I had to try this offline, because Godbolt understandably killed the process).
Note that, in theory, your Python code and C++ code are not completely analogous, because std::min will compare tuples and pairs lexicographically. So if you ever have two distances exactly equal, the std::min call will choose the smaller of the two indices. Practically, this does not make a difference, though.
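To see that comparison in isolation, a tiny self-contained example:

#include <algorithm>
#include <iostream>
#include <utility>

int main() {
    // std::pair compares lexicographically: first elements, then second.
    std::pair<long double, int> a{2.0L, 7}, b{2.0L, 3};
    auto m = std::min(a, b);       // equal distances, so the smaller index wins
    std::cout << m.second << '\n'; // prints 3
    return 0;
}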
Now, I don't think you can really get rid of the rounding errors, but there are a few tricks to minimize them.
Using higher precision (long double) is one option, but it also makes your code slower; it's a tradeoff.
Rescale your points so that they are relative to the centroid of all points, and pick a unit that reflects your problem (i.e. don't think in mm, miles, km or whatever, but rather in "variance of your data set"). You can't get rid of the numerical cancellation in the calculation of the Euclidean distance, but if the relative distances are small compared to the absolute values of the coordinates, the cancellation is typically more severe. Here is a small demonstration:
#include <iostream>
#include <iomanip>

int main() {
    std::cout
        << std::setprecision(17)
        << (1000.0001 - 1000) / 0.0001
        << std::endl
        << (1.0001 - 1) / 0.0001
        << std::endl;
    return 0;
}
0.99999999974897946
0.99999999999988987
Finally, there are some tricks and algorithms to better control the error accumulation in large sums (https://en.wikipedia.org/wiki/Pairwise_summation, https://en.wikipedia.org/wiki/Kahan_summation_algorithm)
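For illustration, a minimal sketch of Kahan (compensated) summation; the function name is just a placeholder, and long double is chosen to match the code above:

#include <vector>

// Kahan (compensated) summation: carries a correction term that captures
// the low-order bits lost in each addition.
long double kahan_sum(const std::vector<long double>& xs)
{
    long double sum = 0.0L, c = 0.0L; // c holds the running compensation
    for (long double x : xs) {
        long double y = x - c;        // apply the previous correction
        long double t = sum + y;      // low-order bits of y may be lost here
        c = (t - sum) - y;            // algebraically zero; numerically the lost part
        sum = t;
    }
    return sum;
}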
One final comment, a bit unrelated to your question: Use auto with lambdas, i.e.
auto dis = [&](int x, int y) {
    // ...
};
C++ has many different kinds of callable objects (functions, function pointers, functors, lambdas, ...) and std::function is a useful wrapper to have one type representing all kinds of callables with the same signature. This comes at some computational overhead (runtime polymorphism, type erasure) and the compiler will have a hard time optimizing your code. So if you don't need the type-erasing functionality of std::function, just store your lambda in a variable declared with auto.

Wrong output from my function when I use larger values

So I'm new to Stack Overflow and coding. I was learning about functions in C++, how the stack frame works, etc.
Along the way I made a function for factorials and used it to calculate binomial coefficients. It worked fine for small values like n=10 and r=5, but for a medium value like 23C12 it gave 4 as the answer.
I don't know what is wrong with the code, or whether I forgot to add something.
My code:
#include <iostream>
using namespace std;

int fact(int n)
{
    int a = 1;
    for (int i = 1; i <= n; i++)
    {
        a *= i;
    }
    return a;
}

int main()
{
    int n, r;
    cin >> n >> r;
    if (n >= r)
    {
        int coeff = fact(n) / (fact(n - r) * fact(r));
        cout << coeff << endl;
    }
    else
    {
        cout << "please enter valid values. i.e n>=r." << endl;
    }
    return 0;
}
Thanks for your help!
You're not doing anything "wrong" per se. It's just that factorials quickly become huge numbers.
In your example you're using ints, which are typically 32-bit variables. If you take a look at a table of factorials, you'll note that log2(13!) = 32.535.... So the largest factorial that will fit in a 32-bit number is 12!. For a 64-bit variable, the largest factorial you can store is 20! (since log2(21!) = 65.469...).
When you get 4 as the result that's because of overflow.
If you need to be able to calculate such huge numbers, I suggest a bignum library such as GMP.
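For instance, a minimal sketch using GMP's C interface (assuming GMP is installed; link with -lgmp):

#include <gmp.h>
#include <stdio.h>

int main()
{
    mpz_t result;
    mpz_init(result);
    mpz_bin_uiui(result, 23, 12); // binomial coefficient 23 choose 12, computed exactly
    gmp_printf("%Zd\n", result);  // prints 1352078
    mpz_clear(result);
    return 0;
}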
Factorials overflow easily. In practice you rarely need bare factorials; they almost always appear in fractions. In your case:
int coeff = fact(n) / (fact(n - r) * fact(r));
Note that the first min(n - r, r) factors of the numerator and denominator are identical. I am not going to provide you the code, but I hope an example will help you understand what to do instead.
Consider n = 5, r = 3; then coeff is
5*4*3*2*1 / (2*1 * 3*2*1)
And before actually carrying out any calculations you can reduce that to
5*4 / (2*1)
If you are certain that the final result coeff does fit in an int, you can also calculate it using ints. You just need to take care not to overflow the intermediate terms.
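One possible sketch of that approach: after step i the running value equals C(n - r + i, i), which is itself a binomial coefficient, so the division is always exact. Note the product before the division can still be roughly a factor of n larger than the final result.

#include <algorithm>
#include <iostream>

long long binom(int n, int r)
{
    r = std::min(r, n - r);                // C(n, r) == C(n, n - r)
    long long result = 1;
    for (int i = 1; i <= r; ++i)
        result = result * (n - r + i) / i; // multiply first, then divide exactly
    return result;
}

int main()
{
    std::cout << binom(23, 12) << '\n';    // 1352078, no overflow
    return 0;
}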

Binary search with bug or inaccuracy

First of all, sorry for my English.
I am solving this problem:
Tom wants to shoot a cannon at Jerry. He would like to have as many pieces of ammunition as possible, but they must all be the same size, and as big as possible too. He only has n cannonballs at his disposal, and he can cut them into smaller pieces. He would like to end up with k + 1 pieces to shoot at Jerry. He knows the radius of every cannonball. What is the biggest possible volume of one piece? The output is rounded with printf("%.3f\n", answer). The first input number is n, the second is k; the next n numbers are the radii of the cannonballs.
Possible input:
3 50
1 2 3
Output: 2.900
Here is my solution: the volume of every piece can only be smaller than or equal to the volume of the smallest cannonball, because you cannot join parts from different cannonballs together. So I used binary search from 0.0 to the minimal volume, and as the predicate I used the function numberOfPieces, which counts the number of pieces cut from all the cannonballs for a specific volume of one piece (the midpoint of the binary search). I then compare that count to k + 1; if it is bigger or equal I use the midpoint as the new low, otherwise as the new high. My solution works for this test input.
The problem is that I get WA (wrong answer) and I cannot see the test input values. Can you please look at my code and check whether I did something wrong? The problem may be numeric inaccuracy, but I have a small EPS so it should be fine. Thanks in advance for every idea.
Here is my code:
#include <vector>
#include <iostream>
#include <algorithm>
#include <stdio.h>

#define PI 3.14159265358979323846
#define VC ((4.0/3.0) * PI) // constant for volume calculation
#define EPS 1E-12

using namespace std;

// return the number of pieces depending on the volume
int numberOfPieces(int v[], int n, double volume)
{
    int ans = 0;
    for(int i = 0; i < n; i++)
        ans += (int)(v[i] * VC / volume);
    return ans;
}

double binarySearch(double a, double b, int k, int n, int v[])
{
    double low = a, high = b;
    while(abs(low - high) > EPS)
    {
        double mid = low + (high - low) / 2.0;
        if(numberOfPieces(v, n, mid) >= k)
            low = mid;
        else
            high = mid;
    }
    return (low + high) / 2.0;
}

int main()
{
    int n, k, x; // n - number of cannonballs, k - number of wanted pieces, x - variable for input
    int v[10001]; // radiuses ^ 3 of the cannonballs
    scanf("%d%d", &n, &k);
    int minVolume = 9999999;
    for(int i = 0; i < n; i++) {
        scanf("%d", &x);
        minVolume = min(minVolume, x);
        v[i] = x * x * x;
    }
    printf("%.3f\n", binarySearch(0.0, minVolume * minVolume * minVolume * VC, k + 1, n, v));
    return 0;
}
The problem was that I was setting the minimal volume as high in the binary search, but I should have used the maximal volume (a piece can be as large as the largest cannonball). The second problem was that I was not passing the maximal radius ^ 3 to the binary search function. Thanks for the help.
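In other words, the input loop and the final call become something like this (a sketch of the fix; everything else stays as in the code above):

int maxR3 = 0; // largest radius^3 seen so far
for(int i = 0; i < n; i++) {
    scanf("%d", &x);
    maxR3 = max(maxR3, x * x * x);
    v[i] = x * x * x;
}
printf("%.3f\n", binarySearch(0.0, maxR3 * VC, k + 1, n, v));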

Determining the largest value before hitting infinity

I have this very simple function that checks the value of (N^(N-1))^(N-2):
#include <iostream>
#include <cmath>
using namespace std;

int main() {
    // Declare variables
    double n;
    double answer;
    // Read n and evaluate (n^(n-1))^(n-2) = n^((n-1)*(n-2))
    cout << "Please enter a double number >= 3: ";
    cin >> n;
    answer = pow(n, (n-1)*(n-2));
    cout << "(n^(n-1))^(n-2) for doubles is " << answer << endl;
}
Based on this formula, it is evident the result will reach infinity, but I am curious at what value of n it hits infinity. Using a loop seems extremely inefficient, but that's all I can think of: basically a loop that lets n run over the numbers from 1 to 100 and stops once the result == inf.
Is there a more efficient approach to this problem?
I think you are approaching this the wrong way.
Let F(N) be the function (N^(N-1))^(N-2).
Now, you actually know the largest finite value that can be stored in a double type variable: DBL_MAX ≈ 1.7977 * 10^308 (IEEE-754 double precision, bit pattern 0x7FEF FFFF FFFF FFFF; the next pattern, 0x7FF0 0000 0000 0000, already encodes infinity).
So now you have F(N) = DBL_MAX; just solve for N.
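Concretely: ln(DBL_MAX) ≈ 709.78, so taking logarithms of F(N) = DBL_MAX gives

(N - 1) * (N - 2) * ln(N) ≈ 709.78.

Solving that numerically gives N ≈ 17.2889, which agrees with the 17.28894235 found by the brute-force search in the answer below.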
Does this answer your question?
Two things: the first is that (N^(N-1))^(N-2) can be written as N^((N-1)*(N-2)). So this would remove one pow call, making your code faster.
pow(n, (n-1)*(n-2));
The second is that to find the practical limits you hit, testing all N literally takes a fraction of a second, so there really is no reason to look for another practical way.
You could compute it by hand, knowing the variable size limits and all, but testing it is definitely faster. An example (C++11, since I use std::isinf):
#include <iostream>
#include <cmath>
#include <iomanip>

int main() {
    double N = 1.0, diff = 10.0;
    const unsigned digits = 10;
    unsigned counter = digits;
    while ( true ) {
        double X = std::pow( N, (N-1.0) * (N-2.0) );
        if ( std::isinf(X) ) {
            --counter;
            if ( !counter ) {
                std::cout << std::setprecision(digits) << N << "\n";
                break;
            }
            N -= diff;
            diff /= 10;
        }
        N += diff;
    }
    return 0;
}
This example takes less than a millisecond on my computer, and prints 17.28894235

Sieve of Eratosthenes algorithm not working for large limits

I have programmed a sieve of Eratosthenes algorithm in C++, and it works fine for the smaller numbers I have tested it with. However, when I use large numbers, e.g. 2 000 000 as the upper limit, the program begins giving wrong answers. Can anyone clarify why?
Your help is appreciated.
#include <iostream>
#include <time.h>
using namespace std;

int main() {
    clock_t a, b;
    a = clock();
    int n = 0, k = 2000000; // n = Sum of primes, k = Upper limit
    bool r[k - 2]; // r = All numbers below k and above 1 (if true, it has been marked as a non-prime)
    for(int i = 0; i < k - 2; i++) // Check all numbers
        if(!r[i]) { // If it hasn't been marked as a non-prime yet ...
            n += i + 2; // Add the prime to the total sum (+2 because of the shift - index 0 is 2, index 1 is 3, etc.)
            for(int j = 2 * i + 2; j < k - 2; j += i + 2) // Go through all multiples of the prime under the limit
                r[j] = true; // Mark the multiple as a non-prime
        }
    b = clock();
    cout << "Final Result: " << n << endl;
    cout << b - a << "ms runtime achieved." << endl;
    return 0;
}
EDIT: I just did some debugging and found that it works with the limit at around 400. At 500, however, it is off - it should be 21536, but is 21499
EDIT 2: Ah, I found two errors, and fixing those seems to have solved the problem.
The first was found by those who answered: n overflows. Upon being made a long long data type, it began working.
The second, rather facepalm-worthy mistake was that the booleans in r had to be initialized. After running a loop before checking for primes to set them all to false, the right answer comes out. Does anyone know why this occurred?
You simply get an integer overflow. The C++ type int has a limited range (on a 32-bit system usually from -2^31 to 2^31 - 1, i.e. the usual maximum is 2147483647; the specific maximum on your setup can be found by #including the <limits> header and evaluating std::numeric_limits<int>::max()). Even when k is smaller than the maximum, your code will sooner or later cause an overflow in the expressions n += i + 2 or int j = 2 * i + 2.
You will have to choose a better (read: more appropriate) type like unsigned, which does not support negative numbers and can thus represent numbers twice as large as int. You can also try unsigned long or even unsigned long long.
Also note that variable length arrays (VLAs; that's what bool r[k - 2] is) are not standard C++. You might want to use std::vector instead. You also did not initialize the array to false (std::vector would do this automatically), which could also be the problem, especially since you say that it does not work even at k = 500.
In C++, you should also use <ctime> instead of <time.h> (then clock_t and clock() are defined in the std namespace, but since you are using namespace std, this won't make a difference for you), but this is more or less a matter of style.
I found a working example in my "code archive". Although it is not based on yours, you might find it useful:
#include <vector>
#include <iostream>

int main()
{
    typedef std::vector<bool> marked_t;
    typedef marked_t::size_type number_t; // The type used for indexing marked_t.

    const number_t max = 500;
    static const number_t iDif = 2; // Account for the numbers 1 and 2.

    marked_t marked(max - iDif);
    number_t i = iDif;
    while (i*i <= max) {
        while (marked[i - iDif] == true)
            ++i;
        for (number_t fac = iDif; i * fac < max; ++fac)
            marked[i * fac - iDif] = true;
        ++i;
    }
    for (marked_t::size_type i = 0; i < marked.size(); ++i) {
        if (!marked[i])
            std::cout << i + iDif << ',';
    }
}