arithmetic overflow in codility test - c++

I did the codility demo test "NumberOfDiscIntersections":
https://codility.com/programmers/lessons/4
I've got: perf = 100% and correctness 87%
All tests but one went fine:
overflow
arithmetic overflow tests
Why was my long long not enough? I can't figure out what went wrong!
#include <algorithm>
int solution(const vector<int> &A)
{
// write your code in C++11
vector<long long > vec_max;
for(int i = 0; i < A.size(); ++i)
{
vec_max.push_back( A[i] + i );
}
std::sort(vec_max.begin(),vec_max.end()); // sort by max
int step = 1;
int counter = 0;
for(int i = A.size() - 1; i > -1; --i)
{
std::vector<long long>::iterator low;
int nb_upper = A.size() - ( lower_bound( vec_max.begin(),vec_max.end(), (long long) (i - A[i]) ) - vec_max.begin() );
counter += nb_upper - step;
++step;
}
if (counter > 10000000)
{
return -1;
}
else
{
return counter;
}
}

If the A array is very large, you might end up adding large indices to the counter int variable. The step variable is quite small compared to it
counter += nb_upper - step;
and this is likely where you're overflowing a variable.
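One more place worth checking (an assumption about the failing test, not a confirmed diagnosis): in vec_max.push_back( A[i] + i ) both operands are int, so the sum is evaluated as an int and only converted to long long afterwards. If A[i] is close to INT_MAX, the addition has already overflowed by the time it reaches the vector. A minimal sketch of widening before adding, with right_endpoints as a hypothetical helper name:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post): right endpoint of each
// disc, i + A[i], computed in 64-bit arithmetic so the sum cannot overflow.
std::vector<long long> right_endpoints(const std::vector<int>& A) {
    std::vector<long long> out;
    out.reserve(A.size());
    for (std::size_t i = 0; i < A.size(); ++i) {
        // Widen BEFORE adding; (long long)(A[i] + i) would be too late.
        out.push_back(static_cast<long long>(A[i]) + static_cast<long long>(i));
    }
    return out;
}
```

The same idea applies to counter: declaring it as long long and returning -1 as soon as it exceeds 10000000 keeps it well inside its range.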

Related

Finding maximum

Given an integer n and an array a, find the maximum of (a[i]+a[j])*(j-i) with 1<=i<=n-1 and i+1<=j<=n.
Example:
Input
5
1 3 2 5 4
Output
21
Explanation: with i=2 and j=5, the maximum of (a[i]+a[j])*(j-i) is (3+4)*(5-2)=21
Constraints:
n<=10^6
a[i]>0 with 1<=i<=n
I can solve this problem with n<=10^4, but what should I do if n is too large, like the constraints?
First, let's reference the "brute force" algorithm. This will have some issues, which I will call out below, but it is a correct solution.
#include <cstddef>  // size_t
#include <cstdint>  // int64_t, INT64_MIN
#include <vector>
struct Result
{
size_t i;
size_t j;
int64_t value;
};
Result findBestBruteForce(const std::vector<int>& a)
{
size_t besti = 0;
size_t bestj = 0;
int64_t bestvalue = INT64_MIN;
for (size_t i = 0; i < a.size(); i++)
{
for (size_t j = i + 1; j < a.size(); j++)
{
// do the math in 64-bit space to avoid overflow
int64_t value = (a[i] + (int64_t)a[j]) * (j - i);
if (value > bestvalue)
{
bestvalue = value;
besti = i;
bestj = j;
}
}
}
return { besti, bestj, bestvalue };
}
The problem with the above code is that it runs in O(N²). More precisely, for the N iterations of the outer for-loop (where i goes from 0 to N), there are an average of N/2 iterations of the inner for-loop. If N is small, this isn't a problem.
On my PC, with full optimizations turned on, the run time is less than a second when N is under 20000. Once N approaches 100000, it takes several seconds to process the 5 billion iterations. Let's just go with "a billion operations per second" as an expected rate. If N were 1000000, the maximum the OP outlined, it would probably take 500 seconds. Such is the nature of an N-squared algorithm.
So how can we speed it up? Here's an interesting observation. Let's say our array was this:
10 5 4 15 13 100 101 6
On the first iteration of the outer loop above, where i=0, we'd be computing this on each iteration of the inner loop:
for each j: (a[0]+a[j])(j-0)
for each j: (10+a[j])(j-0)
for each j: [15*1, 14*2, 25*3, 23*4, 110*5, 111*6, 16*7]
= [15, 28, 75, 92, 550, 666, 112]
Hence, for i=0, a[i] = 10 and the largest value computed from that set is 666.
Since a[0] is 10, and we're tracking a current "best" value, there's no incentive to iterate all the values again for i=1, since a[1]==5 is less than 10. There's no index j that would compute a value of (a[1]+a[j])*(j-1) larger than what's already been found, because (5+a[j])*(j-1) will always be less than (10+a[j])*(j-0). (This assumes all values in the array are non-negative.)
So to generalize, the outer loop can skip over any index of i where A[best_i] > A[i]. And that's a real simple alteration to our above code:
Result findBestOptimized(const std::vector<int>& a)
{
if (a.size() < 2)
{
return {0,0,INT64_MIN};
}
size_t besti = 0;
size_t bestj = 0;
int64_t bestvalue = INT64_MIN;
int minimum = INT_MIN;
for (size_t i = 0; i < a.size(); i++)
{
if (a[i] <= minimum)
{
continue;
}
for (size_t j = i + 1; j < a.size(); j++)
{
int64_t value = (a[i] + (int64_t)a[j]) * (j - i);
if (value > bestvalue)
{
bestvalue = value;
besti = i;
bestj = j;
minimum = a[i];
}
}
}
return { besti, bestj, bestvalue };
}
Above, we introduce a minimum value that A[i] must exceed before we bother with the full inner loop enumeration.
I benchmarked this with build optimizations on. On a random array of a million items, it runs in under a second.
But wait... there's another optimization!
If the inner loop fails to find an index j such that value > bestvalue, then we already know that the current A[i] is greater than minimum. Hence, we can increment minimum to A[i] regardless at the end of the inner loop.
Now, I'll present the final solution:
Result findBestOptimizedEvenMore(const std::vector<int>& a)
{
if (a.size() < 2)
{
return { 0,0,INT64_MIN };
}
size_t besti = 0;
size_t bestj = 0;
int64_t bestvalue = INT64_MIN;
int minimum = INT_MIN;
for (size_t i = 0; i < a.size(); i++)
{
if (a[i] <= minimum)
{
continue;
}
for (size_t j = i + 1; j < a.size(); j++)
{
int64_t value = (a[i] + (int64_t)a[j]) * (j - i);
if (value > bestvalue)
{
bestvalue = value;
besti = i;
bestj = j;
}
}
minimum = a[i]; // since we know a[i] > minimum, we can do this
}
return { besti, bestj, bestvalue };
}
I benchmarked the above solution on different array sizes from N=100 to N=1000000. It does all iterations in under 25 milliseconds.
In the above solution, there's likely a worst case runtime of O(N²) again when all the items in the array are in ascending order. But I believe the average case should be on the order of O(N lg N) or better. I'll do some more analysis later if anyone is interested.
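Since the pruning rule is easy to get subtly wrong, a quick cross-check against the brute-force version on small random inputs is a cheap sanity test. The following is only a sketch: it compares best values (not indices), assumes strictly positive inputs as in the problem statement, and the names best_brute_force, best_pruned and agree_on_random_inputs are hypothetical, not part of the answer above.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Reference: try every pair (i, j) with i < j.
std::int64_t best_brute_force(const std::vector<int>& a) {
    std::int64_t best = INT64_MIN;
    for (std::size_t i = 0; i < a.size(); ++i) {
        for (std::size_t j = i + 1; j < a.size(); ++j) {
            std::int64_t value = (static_cast<std::int64_t>(a[i]) + a[j]) *
                                 static_cast<std::int64_t>(j - i);
            if (value > best) best = value;
        }
    }
    return best;
}

// Pruned variant: skip any i whose a[i] does not exceed the largest a[i]
// already used as a left endpoint.
std::int64_t best_pruned(const std::vector<int>& a) {
    std::int64_t best = INT64_MIN;
    int minimum = INT_MIN;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] <= minimum) continue;
        for (std::size_t j = i + 1; j < a.size(); ++j) {
            std::int64_t value = (static_cast<std::int64_t>(a[i]) + a[j]) *
                                 static_cast<std::int64_t>(j - i);
            if (value > best) best = value;
        }
        minimum = a[i];
    }
    return best;
}

// Hypothetical test helper: true if both versions agree on `trials`
// random arrays of length n with values in [1, 1000].
bool agree_on_random_inputs(int trials, std::size_t n, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> dist(1, 1000);
    for (int t = 0; t < trials; ++t) {
        std::vector<int> a(n);
        for (int& x : a) x = dist(rng);
        if (best_brute_force(a) != best_pruned(a)) return false;
    }
    return true;
}
```

On the example from the question, {1, 3, 2, 5, 4}, both functions return 21.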
Note: Some notation for variables and the Result class in the code have been copied from #selbie's excellent answer.
Here's another O(n^2) worst-case solution with (likely provable) O(n) expected performance on random permutations and room for optimization.
Suppose [i, j] are our array bounds for an optimal pair. By the problem definition, this means all elements left of i must be strictly less than A[i], and all elements right of j must be strictly less than A[j].
This means we can compute the left-maxima of A: all elements strictly greater than all previous elements, as well as the right-maxima of A. Then, we only need to consider left endpoints from the left-maxima and right endpoints from the right-maxima.
I don't know the expectation of the product of the sizes of left and right maxima sets, but we can get an upper bound. The size of left maxima is at most the size of the longest increasing subsequence (LIS) of A. The right maxima are at most the size of the longest decreasing subsequence. These aren't independent, but I'm taking as an (unproven) assumption that the LIS and LDS lengths are inversely correlated with each other for random permutations. The right-maxima must start after the left-maxima end, so this seems like a safe assumption.
The length of the LIS of a random permutation, once centered and scaled, follows the Tracy-Widom distribution: its mean is approximately 2*sqrt(N) and its standard deviation is on the order of N^(1/6). The expected square of the size is therefore about 4N + O(N^(1/3)), so ~4N. This isn't exactly the proof we wanted, since you'd need to sum over the probability density function to be rigorous, but the LIS is already an upper bound on the left-maxima size, so I think the conclusion is still true.
C++ code (Result class and some variable names taken from selbie's post, as mentioned):
struct Result
{
size_t i;
size_t j;
int64_t value;
};
Result find_best_sum_size_product(const std::vector<int>& nums)
{
/* Given: list of positive integers nums
Returns: Tuple with (best_i, best_j, best_product)
where best_i and best_j maximize the product
(nums[i]+nums[j])*(j-i) over 0 <= i < j < n
Runtime: O(n^2) worst case,
O(n) average on random permutations.
*/
int n = nums.size();
if (n < 2)
{
return {0,0,INT64_MIN};
}
std::vector<int> left_maxima_indices;
left_maxima_indices.push_back(0);
for (int i = 1; i < n; i++){
if (nums.at(i) > nums.at(left_maxima_indices.back())) {
left_maxima_indices.push_back(i);
}
}
std::vector<int> right_maxima_indices;
right_maxima_indices.push_back(n-1);
for (int i = n-1; i >= 0; i--){
if (nums.at(i) > nums.at(right_maxima_indices.back())) {
right_maxima_indices.push_back(i);
}
}
size_t best_i = 0;
size_t best_j = 0;
int64_t best_product = INT64_MIN;
int i = 0;
int j = 0;
for (size_t left_idx = 0;
left_idx < left_maxima_indices.size();
left_idx++)
{
i = left_maxima_indices.at(left_idx);
for (size_t right_idx = 0;
right_idx < right_maxima_indices.size();
right_idx++)
{
j = right_maxima_indices.at(right_idx);
if (i == j) continue;
int64_t value = (nums.at(i) + (int64_t)nums.at(j)) * (j - i);
if (value > best_product)
{
best_product = value;
best_i = i;
best_j = j;
}
}
}
return { best_i, best_j, best_product };
}
I started from the two excellent answers by #selbie and #kcsquared.
Their solutions gave impressive results for random inputs. What was not clear is the worst-case behavior.
What sequence would correspond to the worst case?
I finally found a critical sequence for these two answers, a triangle sequence: this sequence increases slightly up to a maximum, and then decreases slightly. With such a sequence and n=10^5, for example, these answers take more than 10s.
My solution starts from #selbie's solution and adds two improvements:
I add #kcsquared's trick: to the right of j, there can only be lower elements
When considering a new left element a[i], it is useless to start from i + 1 to get the second element. We can start from the current best_j
With these tricks, I was able to improve the performance of the two posted answers a little bit. However, it still fails to solve the triangle sequence issue: about 10s for n = 10^5.
#include <iostream>
#include <vector>
#include <string>
#include <cstdlib>
#include <ctime>
#include <chrono>
#include <climits>  // INT_MIN
#include <cstdint>  // int64_t, INT64_MIN
struct Result {
size_t i;
size_t j;
int64_t value;
};
void print (const Result& res, const std::string& prefix = "") {
std::cout << prefix;
std::cout << "(" << res.i << ", " << res.j << ") -> " << res.value << std::endl;
}
Result findBest(const std::vector<int>& a) {
if (a.size() < 2) {
return { 0, 0, INT64_MIN };
}
int n = a.size();
std::vector<int> next_max(n, -1);
int current_max = n-1;
for (int i = n-1; i >= 0; --i) {
if (a[i] > a[current_max]) {
current_max = i;
}
next_max[i] = current_max;
}
size_t besti = 0;
size_t bestj = 0;
int64_t bestvalue = INT64_MIN;
int minimum = INT_MIN;
for (size_t i = 0; i < a.size(); i++) {
if (a[i] <= minimum) {
continue;
}
minimum = a[i];
size_t jmin = (bestj > i) ? bestj : i+1;
for (size_t j = jmin; j < a.size(); j++) {
j = next_max[j];
int64_t value = (a[i] + (int64_t)a[j]) * (j - i);
if (value > bestvalue) {
bestvalue = value;
besti = i;
bestj = j;
}
}
}
return { besti, bestj, bestvalue };
}
int main() {
int n = 1000000;
int vmax = 100000000;
std::vector<int> A (n);
std::srand(std::time(0));
for (int i = 0; i < n; ++i) {
A[i] = rand() % vmax + 1;
}
std::cout << "n = " << n << std::endl;
auto t0 = std::chrono::high_resolution_clock::now();
auto res = findBest (A);
auto t1 = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
print (res, "Random: ");
std::cout << "time = " << duration/1000 << " ms" << std::endl;
int i_max = n/2;
for (int i = 0; i < i_max; ++i) A[i] = i+1;
A[i_max] = 10 * i_max;
for (int i = i_max+1; i < n; ++i) {
A[i] = 2*i_max - i;
}
t0 = std::chrono::high_resolution_clock::now();
res = findBest (A);
t1 = std::chrono::high_resolution_clock::now();
duration = std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
print (res, "Triangle sequence: ");
std::cout << "time = " << duration/1000 << " ms" << std::endl;
return 0;
}

std::bad_array_new_length for large numbers

I am trying to run the following code:
#include <stdio.h>
#include <stdlib.h>
int find_next(int act, unsigned long long survivors[], int n)
{
int i = act;
while (survivors[i] == 0)
{
i++;
i = i % n;
}
i = (i + 1) % n; // found first one, but need to skip
while (survivors[i] == 0)
{
i++;
i = i % n;
}// thats the guy
return i;
}
int main()
{
long long int lines;
long long int* results;
scanf_s("%llu", &lines);
results = new long long int[lines];
for (long long int i = 0; i < lines; i++) {
unsigned long long n, k;
scanf_s("%llu", &n);
scanf_s("%llu", &k);
unsigned long long* survivors;
survivors = new unsigned long long[n];
for (int m = 0; m < n; m++) {
survivors[m] = 1;
}
int* order;
order = new int[n];
int p = 0;
int act = 0;
while (p < n - 1)
{
act = find_next(act, survivors, n);
order[p] = act;
survivors[act] = 0; // dies;
p++;
}
order[p] = find_next(act, survivors, n);
if (k > 0)
{
results[i] = order[k - 1] + 1;
}
else
{
results[i] = order[n + k] + 1;
}
delete[] survivors;
delete[] order;
}
for (long long int i = 0; i < lines; i++) {
printf("%llu\n", results[i]);
}
delete[] results;
return 0;
}
My inputs are:
1
1111111111
-1
I am getting an exception:
std::bad_array_new_length for large numbers
At line:
survivors = new unsigned long long[n];
How should I fix it so that it won't show up for such large numbers?
So far I have tried all numeric types for n (long long int, unsigned long and so on), but every time it failed. Or maybe there is no way around that?
How should I fix it so that it won't show up for such large numbers?
Run the program on a 64 bit CPU.
Use a 64 bit operating system.
Compile the program in 64 bit mode.
Install sufficient amount of memory. That array alone uses over 8 gigabytes.
Configure operating system to allow that much memory be allocated for the process.
P.S. Avoid owning bare pointers. In this case, I recommend using a RAII container such as std::vector.
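As a rough illustration of the P.S., here is a sketch that keeps the flags in a std::vector and reports an impossible request instead of crashing. Two things are assumptions on my part rather than something stated above: try_make_flags is a hypothetical helper, and the flags are stored as one byte each (unsigned char) instead of 8-byte unsigned long long, which would bring the survivors array for n = 1111111111 down to roughly 1.1 gigabytes.

```cpp
#include <cstddef>
#include <new>
#include <stdexcept>
#include <vector>

// Hypothetical helper: fill `out` with n one-byte flags set to 1.
// Returns false instead of terminating when the request cannot be satisfied.
bool try_make_flags(std::size_t n, std::vector<unsigned char>& out) {
    try {
        out.assign(n, 1);  // the vector owns the memory and releases it automatically
        return true;
    } catch (const std::length_error&) {
        return false;      // n is larger than the vector can ever hold
    } catch (const std::bad_alloc&) {
        return false;      // not enough memory available
    }
}
```

A caller can then print an error for that test case and move on, and no delete[] is needed on any path.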

How to get rid of the arithmetic overflow

This is the code for the Triangle problem in codility that is throwing me an arithmetic overflow error.
int solution(vector<int> &A) {
int i, n;
n=A.size();
sort(A.begin(), A.end());
for(i=0; i<n-2; i++)
{
if((A[i]+A[i+1]>A[i+2])&&(A[i]+A[i+2]>A[i+1])&&(A[i+1]+A[i+2]>A[i]))
{
return 1;
}
}
return 0;
}
It passes all the tests except for the 'extreme_arith_overflow1 overflow test, 3 MAXINTs' saying the code returns 0 but it expects 1. Anybody have any idea on how to fix this?
You store A.size() in n and then you loop until i<n and access A[i+2]. In the error cases this is A[A.size()] or even A[A.size()+1]. It's out of bounds. Fix the range of the loop.
The next problem occurs when the sum is larger than INT_MAX. Use the difference instead of the sum to avoid overflow. Remember that the elements are sorted with A[i] <= A[i+1] <= A[i+2]
int solution(vector<int> &A) {
if (A.size() < 3) return 0;
const auto n = A.size() - 2;
std::sort(A.begin(), A.end());
for(decltype(n) i = 0; i < n; ++i) {
// Test A[i]>0 first: all three values are then positive, so the differences cannot overflow
if((A[i]>0)&&(A[i]>A[i+2]-A[i+1])&&(A[i+2]>A[i+1]-A[i])) {
return 1;
}
}
return 0;
}
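An alternative to rearranging the inequality is to do the sum in a wider type. The sketch below is not the code from the answer above: has_triangle is a hypothetical name, it takes the vector by value so the caller's data is not reordered, and it relies on the fact that after sorting only A[i] + A[i+1] > A[i+2] needs to be checked (the other two inequalities follow from it).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical variant: returns 1 if some triple can form a triangle, else 0.
int has_triangle(std::vector<int> A) {
    if (A.size() < 3) return 0;
    std::sort(A.begin(), A.end());
    for (std::size_t i = 0; i + 2 < A.size(); ++i) {
        // Widen before adding so INT_MAX + INT_MAX cannot overflow.
        long long sum = static_cast<long long>(A[i]) + A[i + 1];
        if (sum > A[i + 2]) return 1;
    }
    return 0;
}
```

With three elements equal to INT_MAX, the widened sum is 4294967294, which is greater than INT_MAX, so the function returns 1 as the test expects.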

Why this code is not showing output for n around 10^5

I have the following code, used to calculate primes of the form x^2+ny^2 which do not exceed N. This code runs fine when N is around 80000, but when N is around 10^5 the code breaks down. Why does this happen, and how can I fix it?
#include <iostream>
#include<iostream>
#include<vector>
const int N = 100000; //Change N in this line
using namespace std;
typedef long long int ll;
bool isprime[N] = {};
bool zprime[N] = {};
vector<int> primes;
vector<int> zprimes;
void calcprime(){
for (int i = 2; i < N; i+=1){isprime[i] = true;}
for (int i = 2; i < N; i+=1){
if (isprime[i]){
primes.push_back(i);
for (int j = 2; i*j < N; j+=1){
isprime[i*j] = false;
}
}
}
}
void zcalc(){
int sqrt = 0; for (int i = 0; i < N; i+=1){if(i*i >= N){break;} sqrt = i;}
for (int i = 0; i <= sqrt; i +=1){
for (int j = 0; j <= sqrt; j+=1){
ll q = (i*i)+(j*j);
if (isprime[q] && !zprime[q] && (q < N)){
zprimes.push_back(q);
zprime[q] = true;
}
}
}
}
int main(){
calcprime();
zcalc();
cout<<zprimes.size();
return 0;
}
Why the code breaks
Out of bounds access. This code breaks because you're doing out of bounds memory accesses on this line here:
if (isprime[q] && !zprime[q] && (q < N)) {
If q is bigger than N, you're accessing memory that technically doesn't belong to you. This invokes undefined behavior, which causes the code to break if N is big enough.
If we change the order so that it checks that q < N before doing the other checks, we don't have this problem:
// Does check first
if((q < N) && isprime[q] && !zprime[q]) {
(Potentially) very large global arrays. You define isprime and zprime as c-arrays:
bool isprime[N] = {};
bool zprime[N] = {};
This could cause problems down the line for very big values of N, because global c-arrays are allocated statically. It's not recommended to have very large c-arrays as global variables: it can cause problems and increase executable size.
If you change isprime and zprime to be vectors, the program compiles and runs even for values of N greater than ten million. This is because using vector makes the allocation dynamic, and the heap is a better place to store large amounts of data.
std::vector<bool> isprime(N);
std::vector<bool> zprime(N);
Updated code
Here's the fully updated code! I also made i and j to be long long values, so you don't have to worry about integer overflow, and I used the standard library sqrt function to compute the sqrt of N.
#include <iostream>
#include <vector>
#include <cmath>
using namespace std;
typedef long long int ll;
constexpr long long N = 10000000; //Change N in this line
std::vector<bool> isprime(N);
std::vector<bool> zprime(N);
vector<int> primes;
vector<int> zprimes;
void calcprime() {
isprime[0] = false;
isprime[1] = false;
for (ll i = 2; i < N; i+=1) {
isprime[i] = true;
}
for (ll i = 2; i < N; i+=1) {
if (isprime[i]) {
primes.push_back(i);
for (ll j = 2; i*j < N; j+=1){
isprime[i*j] = false;
}
}
}
}
void zcalc(){
ll sqrtN = sqrt(N);
for (ll i = 0; i <= sqrtN; i++) {
for (ll j = 0; j <= sqrtN; j++) {
ll q = (i*i)+(j*j);
if ((q < N) && isprime[q] && !zprime[q]) {
zprimes.push_back(q);
zprime[q] = true;
}
}
}
}
int main(){
calcprime();
zcalc();
cout << zprimes.size();
return 0;
}
The value of q can exceed the value of N in your code and can cause a segmentation fault when zprime[q],isprime[q] is accessed. You're iterating i, j till sqrt(N) and have allocated zprime,isprime with N booleans. The value of q can vary from 0 to 2N.
ll q = (i*i)+(j*j);
You can replace bool zprime[N] = {}; and bool isprime[N] = {}; with
bool zprime[N * 2 + 1] = {};
and
bool isprime[N * 2 + 1] = {};
respectively.
The program will no longer segfault. Or, you could check for q < N before accessing isprime[q] and zprime[q].
Also, as has already been pointed out in the comments, (i*i)+(j*j) is an int. It is useless to assign that value to a long long after the fact. If you intend to prevent overflow, replace it with ((ll)i*i)+((ll)j*j), so that both products are computed as long long.
Moreover, for large sized arrays, you should prefer to allocate it on the heap.
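The two points above (widen before multiplying, and check the index before using it) can be sketched as follows. Both function names are hypothetical and only serve as illustration; the table is assumed to be a std::vector<bool> like the one suggested in the other answer.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: i*i + j*j with each product computed as long long,
// so the result is correct even when i*i would not fit in an int.
long long sum_of_squares(int i, int j) {
    return static_cast<long long>(i) * i + static_cast<long long>(j) * j;
}

// Hypothetical helper: bounds-checked lookup. The range test comes first,
// so table[q] is never evaluated for an out-of-range q.
bool is_marked(const std::vector<bool>& table, long long q) {
    return q >= 0 &&
           static_cast<std::size_t>(q) < table.size() &&
           table[static_cast<std::size_t>(q)];
}
```

For example, sum_of_squares(50000, 50000) is 5000000000, which does not fit in a 32-bit int.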

Runtime error in code (C++)

I am a beginner in C++, but I wouldn't have asked this question if I hadn't spent hours on it.
The code is about finding primes between two numbers in the most efficient way, where the maximum limit is 10^9.
The following code gives me a runtime error, but I have no idea why. Please help.
#include <iostream>
#include <stdio.h>
#include <math.h>
using namespace std;
long int prime[32000];
bool isprime(long int a){
for(long int i = 3; i <= 32000; i+=2){
if(a%i == 0){
return false;
}
}
return true;
}
void generateprimes(){
long int a = 0;
for(long int i = 3; i < 31623 ; i+=2){
if(isprime(i)){
prime[a] = i;
a++;
}
}
}
bool newisprime(long int a){
long int x =0;
for(long int i = prime[x]; i <= sqrt(a); i = prime[++x]){
if(a%i == 0){
return false;
}
}
return true;
}
void generateprimes_inbetween(long int n,long int m){
if(n%2 == 0){
++n;
}
if(n == 1){
printf("2\n");
n = 3;
}
for(long int i = n; i <= m ; i+=2){
if(newisprime(i)){
printf("%d\n",i);
}
}
}
int main() {
long int a,b,c;
scanf("%ld",&a);
generateprimes();
for(long int i = 0; i < a ; i++){
scanf("%ld %ld",&b,&c);
generateprimes_inbetween(b,c);
printf("\n");
}
return 0;
}
In newisprime() you loop through the numbers in your array prime[]. But as it's global data, every entry that has not been filled in is 0, so a%i results in a fatal division by zero.
You have to keep track somewhere of the number of primes that you've stored in your array, and only test against the primes that you've stored there.
Supposing that it's homework and you're not allowed to use vectors, you could do it as follows:
#include <climits> // ULONG_MAX
const size_t max_primes = 32000; // avoid hard coded values
unsigned long prime[max_primes] {2, 3}; // prefilled values
size_t nprimes = 2; // number of primes in the array
bool isprime(unsigned long a){
for(size_t i = 0; i < nprimes; i++){
if(a%prime[i] == 0)
return false;
}
return true;
}
void generateprimes(){
nprimes = 2;
for(unsigned long i = 3; nprimes<max_primes && i < ULONG_MAX; i += 2){
if(isprime(i)){
prime[nprimes] = i;
nprimes++;
}
}
}
bool newisprime(unsigned long a){
size_t x = 0;
// check x against nprimes BEFORE reading prime[x], so we never read past the table
for(; x < nprimes && prime[x] <= sqrt(a); ++x){
if(a%prime[x] == 0)
return false;
}
if(x == nprimes) {
cout << "Attention: Reaching end of prime table !!" << endl;
}
return true;
}
Some remarks:
for the index, it's safer to use the unsigned type size_t.
make sure that whenever you use an index, it remains within the bounds
as you work with positive numbers, it could make sense to use unsigned long
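If vectors are allowed after all, the table can size itself and the bookkeeping of nprimes disappears. The following is a sketch under that assumption, using a simple sieve rather than the trial-division generator above; small_primes and is_prime_by_table are hypothetical names.

```cpp
#include <vector>

// Hypothetical helper: all primes <= limit, found with a basic sieve.
// Intended for limit around 31623 (about sqrt(10^9)), where i * i stays
// well within the range of unsigned long.
std::vector<unsigned long> small_primes(unsigned long limit) {
    std::vector<bool> composite(limit + 1, false);
    std::vector<unsigned long> primes;
    for (unsigned long i = 2; i <= limit; ++i) {
        if (composite[i]) continue;
        primes.push_back(i);
        for (unsigned long m = i * i; m <= limit; m += i) composite[m] = true;
    }
    return primes;
}

// Hypothetical helper: trial division by the stored primes only. The
// range-based loop cannot run past the end of the table, and no entry is 0.
bool is_prime_by_table(unsigned long a, const std::vector<unsigned long>& primes) {
    if (a < 2) return false;
    for (unsigned long p : primes) {
        if (p * p > a) break;   // no divisor up to sqrt(a): a is prime
        if (a % p == 0) return false;
    }
    return true;
}
```

With a table built by small_primes(31623), is_prime_by_table gives correct answers for every a up to 10^9, the limit mentioned in the question.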