Wrong answer due to precision issues? - c++

I am implementing Greedy Approach to TSP:
Start from first node.
Go to nearest node not visited yet. (If multiple, go to the one with the lowest index.)
Don't forget to include distance from node 1 to last node visited.
However, my C++ code gives the wrong answer, while the same algorithm implemented in Python gives the right answer.
In my problem, the nodes are coordinates on 2-D plane and the distance is the Euclidean Distance.
I even changed everything to long double because it's more precise.
In fact, if I reverse the direction of the for loop and add an additional if statement to handle ties (we want the lowest-index nearest node), it gives a very different answer.
Is this because of precision issues?
(Note: I have to print floor(ans))
INPUT: Link
EXPECTED OUTPUT: 1203406
ACTUAL OUTPUT: 1200403
#include <iostream>
#include <cmath>
#include <vector>
#include <cassert>
#include <functional>
using namespace std;

int main() {
    freopen("input.txt", "r", stdin);
    int n;
    cin >> n;
    vector<pair<long double, long double>> points(n);
    for (int i = 0; i < n; ++i) {
        int x;
        cin >> x;
        assert(x == i + 1);
        cin >> points[i].first >> points[i].second;
    }
    // Returns the squared Euclidean Distance
    function<long double(int, int)> dis = [&](int x, int y) {
        long double ans = (points[x].first - points[y].first) * (points[x].first - points[y].first);
        ans += (points[x].second - points[y].second) * (points[x].second - points[y].second);
        return ans;
    };
    long double ans = 0;
    int last = 0;
    int cnt = n - 1;
    vector<int> taken(n, 0);
    taken[0] = 1;
    while (cnt > 0) {
        pair<long double, int> mn = {1e18, 1e9};
        for (int i = 0; i < n; ++i) {
            if (!taken[i]) {
                mn = min(mn, {dis(i, last), i});
            }
        }
        int nex = mn.second;
        taken[nex] = 1;
        cnt--;
        ans += sqrt(mn.first);
        last = nex;
    }
    ans += sqrt(dis(0, last));
    cout << ans << '\n';
    return 0;
}
UPD: Python Code:
import math

file = open("input.txt", "r")
n = int(file.readline())
a = []
for i in range(n):
    data = file.readline().split(" ")
    a.append([float(data[1]), float(data[2])])
for c in a:
    print(c)

def dis(x, y):
    cur_ans = (a[x][0] - a[y][0]) * (a[x][0] - a[y][0])
    cur_ans += (a[x][1] - a[y][1]) * (a[x][1] - a[y][1])
    cur_ans = math.sqrt(cur_ans)
    return cur_ans

ans = 0.0
last = 0
cnt = n - 1
take = []
for i in range(n):
    take.append(0)
take[0] = 1
while cnt > 0:
    idx = -1
    cur_dis = 1e18
    for i in range(n):
        if take[i] == 0:
            if dis(i, last) < cur_dis:
                cur_dis = dis(i, last)
                idx = i
    assert(idx != -1)
    take[idx] = 1
    cnt -= 1
    ans += cur_dis
    last = idx
ans += dis(0, last)
print(ans)
file.close()
# 1203406

Yes, the difference is due to round-off error, with the C++ code producing the more accurate result because of your use of long double. If you change your C++ code so that it uses the same precision as Python (IEEE-754, i.e. double precision), you get the exact same round-off errors in both programs. Here is a demonstrator in the Godbolt compiler explorer, with your example boiled down to 4000 points: https://godbolt.org/z/rddrdT54n
If I run the same code on the whole input file I get 1203406.5012708856 in C++ and in Python (I had to try this offline, because Godbolt understandably killed the process).
Note that in theory your Python code and C++ code are not completely analogous, because std::min will compare tuples and pairs lexicographically. So if you ever have two distances that are exactly equal, the std::min call will choose the smaller of the two indices. Practically, this does not make a difference, though.
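As a tiny standalone illustration of that tie-breaking (values chosen arbitrarily):
#include <algorithm>
#include <iostream>
#include <utility>

int main() {
    // Lexicographic comparison of pairs: when the first components (the distances)
    // are exactly equal, the second components (the indices) break the tie.
    std::pair<long double, int> a{2.0L, 7}, b{2.0L, 3};
    std::cout << std::min(a, b).second << '\n';   // prints 3, the smaller index
}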
Now I don't think you can really get rid of the rounding errors, but there are a few tricks to minimize them:
Using higher precision (long double) is one option, but it also makes your code slower; it's a tradeoff.
Rescale your points so that they are relative to the centroid of all points, and so that the unit reflects your problem (e.g. don't think in mm, miles, km or whatever, but rather in "variance of your data set"). You can't get rid of numerical cancellation in the calculation of the Euclidean distance, but if the relative distances are small compared to the absolute values of the coordinates, the cancellation is typically more severe. Here is a small demonstration:
#include <iostream>
#include <iomanip>

int main() {
    std::cout
        << std::setprecision(17)
        << (1000.0001 - 1000) / 0.0001
        << std::endl
        << (1.0001 - 1) / 0.0001
        << std::endl;
    return 0;
}
0.99999999974897946
0.99999999999988987
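Returning to the rescaling suggestion: here is a rough sketch (illustrative, not from the question's code; the container layout just mirrors the points vector) of shifting the points so they are relative to their centroid. This leaves all pairwise distances mathematically unchanged while making the coordinate magnitudes smaller:
#include <utility>
#include <vector>

// Shift all points so that their centroid becomes the origin.
// Assumes pts is non-empty.
void recenter(std::vector<std::pair<long double, long double>>& pts) {
    long double cx = 0.0L, cy = 0.0L;
    for (const auto& p : pts) {
        cx += p.first;
        cy += p.second;
    }
    cx /= pts.size();                      // centroid x
    cy /= pts.size();                      // centroid y
    for (auto& p : pts) {
        p.first -= cx;                     // coordinates are now centroid-relative
        p.second -= cy;
    }
}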
Finally, there are some tricks and algorithms to better control the error accumulation in large sums (https://en.wikipedia.org/wiki/Pairwise_summation, https://en.wikipedia.org/wiki/Kahan_summation_algorithm)
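As an illustration, here is a minimal sketch of Kahan-compensated summation, assuming the edge lengths have been collected into a vector first (the function name is made up for the example):
#include <vector>

long double kahan_sum(const std::vector<long double>& terms) {
    long double sum = 0.0L;
    long double comp = 0.0L;               // running compensation for lost low-order bits
    for (long double term : terms) {
        long double y = term - comp;       // apply the correction carried over from the last step
        long double t = sum + y;           // low-order bits of y may be lost in this addition
        comp = (t - sum) - y;              // algebraically zero; numerically it recovers the lost part
        sum = t;
    }
    return sum;
}
Note that aggressive floating-point optimizations (e.g. -ffast-math) can optimize the compensation away, so avoid such flags if you rely on this.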
One final comment, a bit unrelated to your question: Use auto with lambdas, i.e.
auto dis = [&](int x, int y) {
// ...
};
C++ has many different kinds of callable objects (functions, function pointers, functors, lambdas, ...) and std::function is a useful wrapper to have one type representing all kinds of callables with the same signature. This comes at some computational overhead (runtime polymorphism, type erasure) and the compiler will have a hard time optimizing your code. So if you don't need the type erasing functionality of std::function, just store your lambda in a variable declared with auto.

Related

C++ function to approximate sine using Taylor series expansion

Hi, I am trying to calculate the results of the Taylor series expansion for sine to the specified number of terms, and I am running into some problems.
The task is to implement makeSineToOrder(k). This is templated by the type of values used in the calculation. It must yield a function that takes a value of the specified type and returns the sine of that value (in the specified type again).
double factorial(double long order){
#include <iostream>
#include <iomanip>
#include <cmath>
double fact = 1;
for(int i = 1; i <= num; i++){
fact *= i;
}
return fact;
}
void makeSineToOrder(long double order,long double precision = 15){
double value = 0;
for(int n = 0; n < precision; n++){
value += pow(-1.0, n) * pow(num, 2*n+1) / factorial(2*n + 1);
}
return value;
int main()
{
using namespace std;
long double pi = 3.14159265358979323846264338327950288419716939937510L;
for(int order = 1;order < 20; order++) {
auto sine = makeSineToOrder<long double>(order);
cout << "order(" << order << ") -> sine(pi) = " << setprecision(15) << sine(pi) << endl;
}
return 0;
}
I tried debugging
Here is a version that at least compiles and gives some output:
#include <iostream>
#include <iomanip>
#include <cmath>
using namespace std;

double factorial(double long num) {
    double fact = 1;
    for (int i = 1; i <= num; i++) {
        fact *= i;
    }
    return fact;
}

double makeSineToOrder(double num, double precision = 15) {
    double value = 0;
    for (int n = 0; n < precision; n++) {
        value += pow(-1.0, n) * pow(num, 2 * n + 1) / factorial(2 * n + 1);
    }
    return value;
}

int main() {
    long double pi = 3.14159265358979323846264338327950288419716939937510L;
    for (int order = 1; order < 20; order++) {
        auto sine = makeSineToOrder(order);
        cout << "order(" << order << ") -> sine(pi) = " << setprecision(15) << sine << endl;
    }
    return 0;
}
not sure what that odd sine(pi) was supposed to be doing
Apart from the obvious syntax errors in your code (the includes should come before your factorial function):
I see no templates in your code, which your assignment clearly states you should use,
so I would expect a template like:
template <class T> T mysin(T x, int n = 15) { ... }
Using pow for a generic datatype is not safe,
because the built-in pow will use float or double instead of your generic type, so you might expect rounding/casting problems or even an unresolved function in case of an incompatible type.
To remedy that, you can rewrite the code not to use pow: the power is just a consecutive multiplication in a loop, so why compute pow again and again?
Using a factorial function is wasteful:
you can compute it, similarly to the power, in the same loop; there is no need to redo the already computed multiplications again and again. Also, not using a template for your factorial causes the same problems as using pow.
So, putting it all together using the Taylor series formula sin(x) = sum over n >= 0 of (-1)^n * x^(2n+1) / (2n+1)!, along with templates, and exchanging the pow and factorial functions with consecutive iteration, I got this:
template <class T> T mysin(T x, int n = 15)
{
    int i;
    T y = 0;      // result
    T x2 = x * x; // x^2
    T xi = x;     // x^i
    T ii = 1;     // i!
    if (n > 0) for (i = 1;;)
    {
        y += xi / ii; xi *= x2; i++; ii *= i; i++; ii *= i; n--; if (!n) break;
        y -= xi / ii; xi *= x2; i++; ii *= i; i++; ii *= i; n--; if (!n) break;
    }
    return y;
}
So the factorial ii is multiplied by i+1 and i+2 every iteration, and the power xi is multiplied by x^2 every iteration ... the sign change is hard-coded, so the for loop handles 2 terms per pass (that is the reason for the break;).
As you can see, this does not use anything funny, so you do not need any includes for this, not even math ...
You might want to add x = fmod(x, 6.283185307179586476925286766559) at the start of mysin in order to handle more than just the first period; however, in that case you have to ensure the fmod implementation uses T or a type compatible with it ... Also, the 2*pi constant should be in the target precision or higher.
Beware that too big an n will overflow both int and the generic type T (so you might want to limit n based on the used type somehow, or just use it wisely).
Also note that with 32-bit floats you cannot get better than about 5 decimal places, no matter what n is, with this kind of computation.
Btw. there are faster and more accurate methods of computing trigonometric functions, like Chebyshev approximations and CORDIC.
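A rough usage sketch mirroring the loop from the question, assuming the mysin template above is in the same translation unit:
#include <iostream>
#include <iomanip>

int main() {
    const long double pi = 3.14159265358979323846264338327950288L;
    // As the number of terms grows, the result should approach sin(pi) = 0.
    for (int order = 1; order < 20; ++order)
        std::cout << "order(" << order << ") -> sine(pi) = "
                  << std::setprecision(15) << mysin<long double>(pi, order) << '\n';
    return 0;
}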

Euler's number with stop condition

My professor from Algorithms course gave me the following homework:
Write a C/C++ program that calculates the value of the Euler's number (e) with a given accuracy of eps > 0.
Hint: The number e = 1 + 1/1! + 1/2! + ... + 1/n! + ... = 2.7182... can be calculated as the sum of elements of the sequence x_0, x_1, x_2, ..., where x_0 = 1, x_1 = 1 + 1/1!, x_2 = 1 + 1/1! + 1/2!, ..., and the summation continues as long as the condition |x_(i+1) - x_i| >= eps holds.
As he further explained, eps is the precision of the algorithm. For example, the precision could be 1/100. Here |x_(i+1) - x_i| denotes the absolute value of (x_(i+1) - x_i).
Currently, my program looks in the following way:
#include <iostream>
#include <cstdlib>
#include <math.h>
// Euler's number
using namespace std;

double factorial(double n)
{
    double result = 1;
    for (double i = 1; i <= n; i++)
    {
        result = result * i;
    }
    return result;
}

int main()
{
    long double euler = 2;
    long double counter = 2;
    long double epsilon = 1.0 / 1000;
    long double moduloDifference;
    do
    {
        euler += 1 / factorial(counter);
        counter++;
        moduloDifference = (euler + 1 / factorial(counter + 1) - euler);
    } while (moduloDifference >= epsilon);
    printf("%.35Lf ", euler);
    return 0;
}
Issues:
It seems my epsilon value does not work properly. It is supposed to control the precision. For example, when I wish precision of 5 digits, I initialize it to 1.0/10000, and it outputs 3 digits before they get truncated after 8 (.7180).
When I use the long double data type and epsilon = 1/10000, my epsilon gets the value 0 and my program runs infinitely. Yet, if I change the data type from long double to double, it works. Why does epsilon become 0 when using the long double data type?
How can I optimize the algorithm for finding Euler's number? I know I can get rid of the function and calculate the value on the fly, but after each attempt to do that, I receive other errors.
One problem with computing Euler's constant this way is pretty simple: you're starting with some fairly large numbers, but since the denominator in each term is N!, the amount added by each successive term shrinks very quickly. Using naive summation, you quickly reach a point where the value you're adding is small enough that it no longer affects the sum.
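To see that absorption effect in isolation, here is a tiny double-precision example (unrelated to the e computation itself):
#include <iomanip>
#include <iostream>

int main() {
    // In double precision, adding 1.0 to 1e16 does not change the sum at all,
    // because the spacing between representable values near 1e16 is 2.
    double big = 1e16;
    std::cout << std::setprecision(17) << (big + 1.0) - big << '\n';   // prints 0
}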
In the specific case of Euler's constant, since the numbers constantly decrease, one way we can deal with them quite a bit better is to compute and store all the terms, then add them up in reverse order.
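A small sketch of that store-and-add-in-reverse idea (illustrative only, with an arbitrary cutoff of 1e-19 for the smallest term):
#include <iostream>
#include <vector>

int main() {
    std::vector<long double> terms;
    long double term = 1.0L;                 // 1/0!
    for (int n = 1; term >= 1e-19L; ++n) {
        terms.push_back(term);
        term /= n;                           // next term is 1/n!
    }
    long double e = 0.0L;
    for (auto it = terms.rbegin(); it != terms.rend(); ++it)
        e += *it;                            // smallest terms first, so they are not swallowed
    std::cout.precision(18);
    std::cout << e << '\n';
}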
Another possibility that's more general is to use Kahan's summation algorithm instead. This keeps track of a running error while it's doing the summation, and takes the current error into account as it's adding each successive term.
For example, I've rewritten your code to use Kahan summation to compute to (approximately) the limit of precision of a typical (80-bit) long double:
#include <iostream>
#include <cstdlib>
#include <math.h>
#include <vector>
#include <iomanip>
#include <limits>
// Euler's number
using namespace std;

long double factorial(long double n)
{
    long double result = 1.0L;
    for (int i = 1; i <= n; i++)
    {
        result = result * i;
    }
    return result;
}

template <class InIt>
typename std::iterator_traits<InIt>::value_type accumulate(InIt begin, InIt end) {
    typedef typename std::iterator_traits<InIt>::value_type real;
    real sum = real();
    real running_error = real();
    for ( ; begin != end; ++begin) {
        real difference = *begin - running_error;
        real temp = sum + difference;
        running_error = (temp - sum) - difference;
        sum = temp;
    }
    return sum;
}

int main()
{
    std::vector<long double> terms;
    long double epsilon = 1e-19;
    long double i = 0;
    double term;
    for (int i = 0; (term = 1.0L / factorial(i)) >= epsilon; i++)
        terms.push_back(term);
    int width = std::numeric_limits<long double>::digits10;
    std::cout << std::setw(width) << std::setprecision(width)
              << accumulate(terms.begin(), terms.end()) << "\n";
}
Result: 2.71828182845904522
In fairness, I should actually add that I haven't checked what happens with your code using naive summation--it's possible the problem you're seeing is from some other source. On the other hand, this does fit fairly well with a type of situation where Kahan summation stands at least a reasonable chance of improving results.
#include <iostream>
#include <cmath>
#include <iomanip>
#define EPSILON 1.0/10000000
#define AMOUNT 6
using namespace std;

int main() {
    long double e = 2.0, e0;
    long double factorial = 1;
    int counter = 2;
    long double moduloDifference;
    do {
        e0 = e;
        factorial *= counter++;
        e += 1.0 / factorial;
        moduloDifference = fabs(e - e0);
    } while (moduloDifference >= EPSILON);
    cout << "Result:" << endl;
    cout << setprecision(AMOUNT) << e << endl;
    return 0;
}
This is an optimized version that does not have a separate function to calculate the factorial.
Issue 1: I am still not sure how EPSILON manages the precision.
Issue 2: I do not understand the real difference between long double and double. Regarding my code, why does long double require a decimal point (1.0/someNumber), while double doesn't (1/someNumber)?

Sum of partial group of {x^0, x^1, ..., x^y}

I want to write a program whose inputs are integer values x and y,
and then:
Let s be the set {x^0, x^1, ..., x^y}; store it in an array.
Repeat:
Partition the set s into two subsets: s1 and s2.
Find the sum of each of the two subsets and store them in variables like sum1, sum2.
Calculate the product of sum1 * sum2.
The program ends after passing over all the partitions that could be formed, and then prints the maximum value of the product sum1 * sum2.
Example: suppose x=2, y=3, so s = {1, 2, 4, 8}. One of the divisions is to take s1 = {1, 4}, s2 = {2, 8}, giving sum1 = 5, sum2 = 10 and a product of 50. That will be compared to the other products calculated in the same way, like s1 = {1}, s2 = {2, 4, 8}, giving sum1 = 1, sum2 = 14 and a product of 14, and so on.
My code so far:
#include <iostream>
using namespace std;

int main()
{
    int a[10000]; // Max value expected.
    int x;
    int y;
    cin >> x;
    cin >> y;
    int xexpy = 1;
    int k;
    for (int i = 0; i <= y; i++)
    {
        xexpy = 1;
        k = i;
        while (k > 0)
        {
            xexpy = xexpy * x;
            k--;
        }
        cout << "\n" << xexpy;
        a[i] = xexpy;
    }
    return 0;
}
This is not a programming problem, it is a combinatorics problem with a theoretical rather than an empirical approach to its solution. You can simply print the correct solution and not bother iterating over any partitions.
Why is that?
Let z = sum(s1) / sum(s),
i.e. z is the fraction of the sum of all s elements that's in s1. It holds that
sum(s1) = z * sum(s) and sum(s2) = (1 - z) * sum(s),
and thus the product of both subset sums satisfies:
sum(s1) * sum(s2) = z * (1 - z) * sum(s)^2.
As a function of z (not of x and y), this is a parabola that takes its maximum at z = 1/2; and there are no other local maximum points, i.e. getting closer to 1/2 necessarily increases that product. Thus what you want to do is partition the full set so that each of s1 and s2 is as close as possible to having half the sum of the elements.
In general, you might have had to use programming to consider multiple subsets, but your elements are given by a formula - and it's the formula of a geometric sequence - so you don't have to.
First, let's assume x >= 2 and y >= 2, otherwise this is not an interesting problem.
Now, for x >= 2, we know that
x^0 + x^1 + ... + x^(y-1) = (x^y - 1) / (x - 1)
(the sum of a geometric sequence), and thus
x^0 + x^1 + ... + x^(y-1) <= x^y - 1 < x^y,
i.e. the last element always outweighs all other elements put together. That's why you always want to choose {x^y} as s1 and all other elements as s2. No need to run any program. You can then also easily calculate the optimum product-of-sums.
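For what it's worth, a minimal sketch of that direct calculation (x and y hard-coded to the question's example, no overflow checking):
#include <iostream>

int main() {
    long long x = 2, y = 3;                      // example values from the question
    long long xy = 1;
    for (long long i = 0; i < y; ++i) xy *= x;   // xy = x^y
    long long total = (xy * x - 1) / (x - 1);    // x^0 + x^1 + ... + x^y = (x^(y+1) - 1) / (x - 1)
    std::cout << xy * (total - xy) << '\n';      // s1 = {x^y}, s2 = the rest; prints 56 for x=2, y=3
}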
Note: If we don't make assumptions about the elements of s, except that they're non-negative integers, finding the optimum solution is an optimization version of the Partition problem - which is NP-complete. That means, very roughly, that no solution is fundamentally much more efficient than just trying all possible combinations.
Here's a cheesy all-combinations-of-supplied-arguments generator, provided without comment or explanation because I think this is homework, and the exercise of understanding how and why this does what it does is the point here.
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

int main(int c, const char **v)
{
    basic_string<const char *> options(v);
    auto N(options.length());
    for (auto n = 1u; n < N; ++n) {
        vector<char> pick(N);
        fill_n(pick.rbegin(), n, 1);
        do for (auto j = 1u; j < N; ++j)
            if (pick[j])
                cout << options[j] << ' ';
        while (cout << '\n', next_permutation(begin(pick) + 1, end(pick)));
    }
}

The output of the program is always '0'?

I want to find the sum up to the 'n'th term for the following series:
(1/2)+((1*3)/(2*4))+((1*3*5)/(2*4*6))....
So, I wrote the following program in c++ :
#include <bits/stdc++.h>
#include <conio.h>
using namespace std;

int main()
{
    int p = 1, k = 1, n = 0;
    float h = 0;
    cout << "Enter the term: ";
    cin >> n;
    for (int i = 1; i <= n; i++)
    {
        for (int j = 1; j <= i; j++)
        {
            p *= ((2 * j) - 1);
            k *= (2 * j);
        }
        h += (p / k);
        p = 1;
        k = 1;
    }
    cout << "The sum is : " << h;
    return 0;
    getch();
}
However, the output of the program is always '0'. I can't figure out the problem with the program.
N.B. I'm new to programming.
The problem here is that you haven't declared p and k as float or double, or explicitly cast them as such before the calculation and assignment to h.
What's happening is that on every iteration of the loop p < k (by the nature of the problem), and since p and k are both declared as int, p / k = 0. So you're just summing 0 on every iteration.
Either declare p and k as float or double or do this:
h += ((float) p) / ((float) k)
Also, for this specific problem I assume you're looking for precision, so be wary and look into that as well: Should I use double or float?
Implicit conversion and type casting are a trap into which all newbies fall.
In the instruction:
h += p/k;
the compiler performs an integer division first, then a promotion of the result to a floating-point type.
and since:
p < k ; for all i,j < n
then:
res = (p / k) < 1 => truncates to 0; // by integer division
thus:
sum(1->n) of p/k = sum (1->n) 0 = 0;
finally:
h = conversion to float of (0) = 0.0f;
That's why you get the result 0.0f at the end.
The solution:
1- First of all, you need to use the natural floating-point type of C++, which is double (under the hood C++ promotes float to double, so use it directly).
2- Declare all your variables as double, except the number of terms n.
3- The number of terms is never negative; you need to express that in your code by declaring it as an unsigned int.
4- If you do step 3, make sure to catch overflow errors; that is, if the user enters a negative number you risk getting a very big number in n, for example: n = -1 converts to 0xffffffff, a large positive number.
5- Engineering your code properly sometimes makes it better.
6- Include only the headers that you need, and avoid importing any namespace into your global namespace.
Here is how I think you should write your program:
#include <iostream>

double sum_serie(unsigned int n)
{
    double prod = 1.0, sum = 0.0;
    for (double c = 1; c <= n; c++)
    {
        prod *= ((2 * c) - 1) / (2 * c); // note the parentheses
        sum += prod;
    }
    return sum;
}

int main()
{
    unsigned int n = 0;
    int temp = 0;
    std::cout << " enter the number of terms n: ";
    std::cin >> temp;
    if (temp > 0)
        n = temp; // this is how you catch overflow
    else
    {
        std::cout << " n < 0, no result calculated " << std::endl;
        return 0;
    }
    std::cout << " the result is sum = " << sum_serie(n) << std::endl;
    return 0;
}
I know that the question was about implicit conversion and casting in C++, but even the way you write code can show you what bugs you have in it, so try to learn a proper way of expressing your ideas in code; debugging comes naturally afterward.
Good Luck

Sieve of Eratosthenes algorithm not working for large limits

I have programmed a sieve of Eratosthenes algorithm in C++, and it works fine for the smaller numbers I have tested it with. However, when I use large numbers, e.g. 2,000,000 as the upper limit, the program begins giving wrong answers. Can anyone clarify why?
Your help is appreciated.
#include <iostream>
#include <time.h>
using namespace std;

int main() {
    clock_t a, b;
    a = clock();
    int n = 0, k = 2000000; // n = Sum of primes, k = Upper limit
    bool r[k - 2];          // r = All numbers below k and above 1 (if true, it has been marked as a non-prime)
    for (int i = 0; i < k - 2; i++)   // Check all numbers
        if (!r[i]) {                  // If it hasn't been marked as a non-prime yet ...
            n += i + 2;               // Add the prime to the total sum (+2 because of the shift - index 0 is 2, index 1 is 3, etc.)
            for (int j = 2 * i + 2; j < k - 2; j += i + 2) // Go through all multiples of the prime under the limit
                r[j] = true;          // Mark the multiple as a non-prime
        }
    b = clock();
    cout << "Final Result: " << n << endl;
    cout << b - a << "ms runtime achieved." << endl;
    return 0;
}
EDIT: I just did some debugging and found that it works with the limit at around 400. At 500, however, it is off - it should be 21536, but is 21499
EDIT 2: Ah, I found two errors, and fixing them seems to have solved the problem.
The first was found by others who answered, and is that n is overflowing - after making it a long long data type, the code started working.
The second, rather facepalm-worthy mistake was that the booleans in r had to be initialized. After running a loop before the prime check to set all of them to false, the right answer is obtained. Does anyone know why this occurred?
You simply get an integer overflow. The C++ type int has a limited range (on a 32-bit system usually from -(2^32)/2 to 2^32/2 - 1, that is, the usual maximum is 2147483647). The specific maximum on your setup can be found by #including the <limits> header and evaluating std::numeric_limits<int>::max(). Even when k is smaller than the maximum, your code will sooner or later cause an overflow in the expressions n += i + 2 or int j = 2 * i + 2.
You will have to choose a better (read: more appropriate) type like unsigned, which does not support negative numbers and can thus represent positive numbers about twice as large as int. You can also try unsigned long or even unsigned long long.
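For example, a quick way to inspect those ranges on your own platform (output differs between systems):
#include <iostream>
#include <limits>

int main() {
    // Print the maximum representable value of a few integer types.
    std::cout << "int max:                " << std::numeric_limits<int>::max() << '\n'
              << "unsigned max:           " << std::numeric_limits<unsigned>::max() << '\n'
              << "unsigned long long max: " << std::numeric_limits<unsigned long long>::max() << '\n';
}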
Also note that variable length arrays (VLAs; that's what bool r[k - 2] is) are not standard C++. You might want to use std::vector instead. You also did not initialize the array to false (std::vector would do this automatically), which could also be the problem, especially since you say that it does not work even at k=500.
In C++, you should also use <ctime> instead of <time.h> (then clock_t and clock() are defined in the std namespace, but since you are using namespace std, this won't make a difference for you), but this is more or less a matter of style.
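For reference, a minimal sketch of how those two fixes (a wider type for the running sum and a zero-initialized std::vector<bool> instead of the uninitialized VLA) could look applied to your loop; it uses long long rather than an unsigned type, which works just as well here, and is illustrative rather than a verified drop-in replacement:
#include <iostream>
#include <vector>

int main() {
    const int k = 2000000;
    long long n = 0;                      // wide enough for the sum of all primes below k
    std::vector<bool> r(k - 2, false);    // value-initialized, unlike the VLA in the question
    for (int i = 0; i < k - 2; i++)
        if (!r[i]) {
            n += i + 2;                   // i + 2 is the prime represented by index i
            for (int j = 2 * i + 2; j < k - 2; j += i + 2)
                r[j] = true;              // mark multiples as composite
        }
    std::cout << n << '\n';
}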
I found a working example in my "code archive". Although it is not based on yours, you might find it useful:
#include <vector>
#include <iostream>

int main()
{
    typedef std::vector<bool> marked_t;
    typedef marked_t::size_type number_t; // The type used for indexing marked_t.

    const number_t max = 500;
    static const number_t iDif = 2; // Account for the numbers 1 and 2.
    marked_t marked(max - iDif);
    number_t i = iDif;
    while (i * i <= max) {
        while (marked[i - iDif] == true)
            ++i;
        for (number_t fac = iDif; i * fac < max; ++fac)
            marked[i * fac - iDif] = true;
        ++i;
    }
    for (marked_t::size_type i = 0; i < marked.size(); ++i) {
        if (!marked[i])
            std::cout << i + iDif << ',';
    }
}