Program taking too long for large input

I have an inequality and need to find the maximum value that x can take for a given value of b. Both x and b can take only nonnegative integer values. The inequality is:
x^4+x^3+x^2+x+1≤b
I have written the following (admittedly naive) code to solve it:
#include<iostream>
#include<climits>
using namespace std;
int main()
{
    unsigned long long b,x=0;
    cout<<"hey bro, value of b:";
    cin>>b;
    while(x++<b)
        if(x*x*x*x+x*x*x+x*x+x+1>b)
            break;
    if(b==0)
        cout<<"Sorry,no value of x satisfies the inequality"<<endl;
    else
        cout<<"max value of x:"<<x-1<<endl;
    return 0;
}
The above code works fine up to b=LONG_MAX, but for b=LONG_LONG_MAX or b=ULLONG_MAX it takes forever. How can I fix it so that it works for every b up to ULLONG_MAX?

If for x = m, the inequality holds, then it also holds for all integers < m. If it doesn't hold for m, then it doesn't hold for any integer > m. What algorithm does this suggest?
If you want to spoil yourself, click here for the algorithm.
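The monotonicity described above is exactly what binary search needs. Here is a minimal sketch, assuming 64-bit unsigned arithmetic; the helper names fits and max_x are made up for illustration, and the upper bound 65535 is the largest x for which the polynomial still fits in 64 bits.

```cpp
#include <cstdint>

// True when x^4 + x^3 + x^2 + x + 1 <= b.
// Only valid for x <= 65535, where the polynomial still fits in 64 bits.
bool fits(std::uint64_t x, std::uint64_t b) {
    return x * x * x * x + x * x * x + x * x + x + 1 <= b;
}

// Largest x satisfying the inequality, or -1 when b == 0 (no solution).
std::int64_t max_x(std::uint64_t b) {
    if (b == 0) return -1;
    std::uint64_t lo = 0, hi = 65535;  // fits(0, b) holds for every b >= 1
    while (lo < hi) {
        // Round the midpoint up so the loop always makes progress.
        std::uint64_t mid = lo + (hi - lo + 1) / 2;
        if (fits(mid, b)) lo = mid;
        else hi = mid - 1;
    }
    return static_cast<std::int64_t>(lo);
}
```

This needs at most 16 evaluations of the polynomial for any b, and it never evaluates the polynomial at a value that could overflow.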

This is not just an optimization issue. (For optimization, see IVlad's answer). It is also a correctness issue. With very large values, the expression causes integer overflow: to put it simply, it wraps around from ULLONG_MAX back to zero, and your loop carries on having not detected this. You need to build overflow detection in your code.

A really simple observation solves your problem in O(1) time.
Find k = sqrt(sqrt(b))
If k satisfies your inequality, k is your answer. If it does not, k-1 is your answer.
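A sketch of this idea follows. One assumption on my side: the floating-point root can be off by one for large b, so the code nudges k to the exact integer fourth root before applying the k / k-1 rule. The function name max_x_root is hypothetical.

```cpp
#include <cmath>
#include <cstdint>

// Largest x with x^4 + x^3 + x^2 + x + 1 <= b, or -1 when b == 0.
std::int64_t max_x_root(std::uint64_t b) {
    if (b == 0) return -1;
    // Floating-point estimate of b^(1/4); it may be slightly off.
    std::uint64_t k = static_cast<std::uint64_t>(
        std::sqrt(std::sqrt(static_cast<long double>(b))));
    if (k > 65535) k = 65535;  // larger values would overflow k^4
    // Correct the estimate so that k == floor(b^(1/4)) exactly.
    while (k > 0 && k * k * k * k > b) --k;
    while (k < 65535 && (k + 1) * (k + 1) * (k + 1) * (k + 1) <= b) ++k;
    // Either k itself satisfies the inequality, or k - 1 does.
    if (k * k * k * k + k * k * k + k * k + k + 1 <= b)
        return static_cast<std::int64_t>(k);
    return static_cast<std::int64_t>(k) - 1;
}
```

The k-1 fallback is safe because x^4+x^3+x^2+x+1 < (x+1)^4 for every x >= 1, so the polynomial at k-1 is always below k^4 <= b.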

Old answer (the real problem here is not the large number of iterations but integer overflow; please read from the 'Update' part; I keep this part here as a history of false assumptions):
These values are very big. When your program checks each value from 0 to LONG_LONG_MAX, it has to perform about 9*10^18 operations. For ULLONG_MAX it is about 1.8*10^19 operations. Try modifying the program like this to see the actual processing speed:
while (x++ < b)
{
if (x % 1000000 == 0)
cout << " current x: " << x << endl;
if (x*x*x*x+x*x*x+x*x+x+1>b)
break;
}
So, you need to optimize this algorithm (i.e. reduce the number of iterations): since your function is monotonic, you can use the binary search algorithm (see the bisection method too for clarification).
There is also a possible problem with integer overflow: x*x*x*x will be calculated incorrectly for big values of x. Just imagine that your type is unsigned char (1 byte). For example, when your program calculates 250*250*250*250 you expect 3906250000, but in fact you get 3906250000 % 256 (i.e. 16). So, if x is too big, your function may return a value < b (which would be strange; in theory it can break your optimized algorithm). The good news is that you will not see this problem if you do every check correctly. But for more complex functions you would also need arbitrary-precision arithmetic (for example, GMP or another implementation).
Update: How to avoid overflow risks?
We need to find the maximal allowed value of x (let's call it xmax). A value x is allowed if x*x*x*x+x*x*x+x*x+x+1 does not exceed ULLONG_MAX. So the answer to the initial question (about x*x*x*x+x*x*x+x*x+x+1 <= b) is never bigger than xmax. Let's find xmax: just solve the equation x*x*x*x+x*x*x+x*x+x+1=ULLONG_MAX in any system, for example WolframAlpha. The answer is about 65535.75..., so xmax==65535. So, if we check x from 0 to xmax we will not have overflow problems. These are also the initial bounds for the binary search algorithm.
It also means that we do not really need binary search here, because it is enough to check only 65536 values. If x==65535 still satisfies the inequality, we have to stop there and return 65535 as the answer.
If we need a cross-platform solution without hardcoding xmax, we can use any bigint implementation (GMP or any simpler solution) or implement overflow-aware multiplication and other operations ourselves. Example: if we need to multiply x and y (with x nonzero), we can calculate z=ULLONG_MAX/x and compare this value with y. If z<y, we can't multiply x and y without overflow.
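The division trick above can be sketched as follows. The helper names checked_mul, checked_add and exceeds are hypothetical, and evaluating the polynomial in Horner form is my own choice, not something the answer prescribes.

```cpp
#include <cstdint>
#include <limits>

// Stores a * b in out and returns true, or returns false on overflow.
bool checked_mul(std::uint64_t a, std::uint64_t b, std::uint64_t& out) {
    if (a != 0 && b > std::numeric_limits<std::uint64_t>::max() / a) return false;
    out = a * b;
    return true;
}

// Stores a + b in out and returns true, or returns false on overflow.
bool checked_add(std::uint64_t a, std::uint64_t b, std::uint64_t& out) {
    if (b > std::numeric_limits<std::uint64_t>::max() - a) return false;
    out = a + b;
    return true;
}

// True when x^4 + x^3 + x^2 + x + 1 > b; an overflow counts as "greater".
bool exceeds(std::uint64_t x, std::uint64_t b) {
    // Horner form: (((x + 1) * x + 1) * x + 1) * x + 1
    std::uint64_t acc = 1;
    for (int i = 0; i < 4; ++i) {
        if (!checked_mul(acc, x, acc)) return true;
        if (!checked_add(acc, 1, acc)) return true;
    }
    return acc > b;
}
```

With exceeds in place of the raw comparison, the search no longer depends on knowing xmax in advance.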

You could try finding an upper limit and working down from there.
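// Find the position of the most significant bit (0 when b == 0).
// Stop at 64: shifting a 64-bit value by 64 is undefined behaviour.
int topBitPosition = 0;
while(topBitPosition < 64 && (b >> topBitPosition))
    topBitPosition++;
// 2^ceil(topBitPosition/4) is an upper bound on b ^ 1/4.
// Cap it at 65535 so that x*x*x*x cannot overflow.
unsigned long long x = 1ULL << ((topBitPosition + 3) / 4);
if(x > 65535)
    x = 65535;
// Work down from there, never going below 0
while(x > 0 && x*x*x*x+x*x*x+x*x+x+1 > b)
    x--;
// For b == 0 no x satisfies the inequality; otherwise x is the answer
cout<<"max value of x:"<<x<<endl;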

Don't let x exceed 65535. If 65535 satisfies the inequality, 65536 will not.

Quick answer:
First of all, you are starting from x=0 and then increasing it, which is not the best approach since you are looking for the maximum value and not the first one.
So I would start from an upper bound such as
x=floor(b^(1/4))
then decrease from that value; as soon as you find an x for which the polynomial is <=b, you are done.
You can even think in this way:
for y=b to 1
solve(x^4+x^3+x^2+x+1=y)
if has an integer solution then return solution
See this
This is a super quick answer I hope I didn't write too many stupid things, and sorry I don't know yet how to write math here.

Here's a slightly more optimized version:
#include<iostream>
int main()
{
std::cout << "Sorry, no value of x satisfies the inequality" << std::endl;
return 0;
}
Why? Because x^4+x^3+x^2+x+1 is unbounded as x approaches positive infinity. There is no b for which your inequality holds. Computer Science is a subset of math.

Related

Efficient way of ensuring newness of a set

Given set N = {1,...,n}, consider P different pre-existing subsets of N. A subset, S_p, is characterized by the 0-1 n vector x_p where the ith element is 0 or 1 depending on whether the ith (of n) items is part of the subset or not. Let us call such x_ps indicator vectors.
For example, if N={1,2,3,4,5}, subset {1,2,5} is represented by vector (1,1,0,0,1).
Now, we are given P pre-existing subsets and their associated vectors x_ps.
A candidate subset denoted by vector y is computed.
What is the most efficient way of checking whether y is already part of the set of P pre-existing subsets or whether y is indeed a new subset not part of the P subsets?
The following are the methods I can think of:
(Method 1) Basically, we have to do an element by element check against all pre-existing sets. Pseudocode follows:
for(int p = 0; p < P; p++){
//(check if x_p == y by doing an element by element comparison)
int i;
for(i = 0; i < n; i++){
if(x_pi != y_i){
i = 999999;
}
}
if(i < 999999)
return that y is pre-existing
}
return that y is new
(Method 2) Another thought that comes to mind is to store the decimal equivalent of the indicator vectors x_ps (where the indicator vectors are taken to be binary representations) and compare it with the decimal equivalent of y. That is, if set of P pre-existing sets is: { (0,1,0,0,1), (1,0,1,1,0) }, the stored decimals for this set would be {9, 22}. If y is (0,1,1,0,0), we compute 12 and check this against the set {9, 22}. The benefit of this method is that for each new y, we don't have to check against the n elements of every pre-existing set. We can just compare the decimal numbers.
Question 1. It appears to me that (Method 2) should be more efficient than (Method 1). For (Method 2), is there an efficient way (inbuilt library function in C/C++) that converts the x_ps and y from binary to decimal? What should be data type of these indicator variables? For e.g., bool y[5]; or char y[5];?
Question 2. Is there any method more efficient than (Method 2)?

As you've noticed, there's a trivial isomorphism between your indicator vectors and N-bit integers. That means the answer to your question 2 is "no": the tools available for maintaining a set and testing membership in it are the same as for integers (hash tables being the normal approach). A commenter mentioned Bloom filters, which can efficiently test membership at the risk of some false positives, but Bloom filters are generally for much larger data sizes than you're looking at.
As for your question 1: Method 2 is reasonable, and it's even easier than you think. While vector<bool> doesn't give you an easy way to turn it into integer blocks, on implementations I'm aware of it's already implemented this way (the C++ standard allows special treatment of that particular vector type, something that is generally considered nowadays to have been a poor decision, but which occasionally yields some benefit). And those vectors are hashable. So just keep an unordered_set<vector<bool>> around, and you'll get performance which is reasonably close to the optimum. (If you know N at compile time you may want to prefer bitset to vector<bool>.)
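A minimal sketch of that suggestion, assuming N is only known at run time. The function name insert_if_new is made up for illustration; std::hash is specialized for std::vector<bool> in the standard library, so no custom hasher is needed.

```cpp
#include <unordered_set>
#include <vector>

// Records y in the set of known subsets.
// Returns true if y was new, false if an equal vector was already stored.
bool insert_if_new(std::unordered_set<std::vector<bool>>& seen,
                   const std::vector<bool>& y) {
    // insert() returns a pair; .second is true only when insertion happened.
    return seen.insert(y).second;
}
```

Lookup and insertion are a single hash of the vector plus, on a hash match, one element-wise comparison, instead of a comparison against all P stored vectors.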

Method 2 can be optimized by calculating the decimal equivalent of the given subset modulo 1e9+7 and storing that value in a hash set. Note that this is a hash, not an exact encoding: with N up to 1000 there are far more possible subsets than residues, so two different subsets can collide. A match therefore only means the subset was probably seen before; compare the actual vectors if you need certainty.
#define M 1000000007 //big prime number
unordered_set<long long> subset; //containing decimal representation (mod M) of all the
                                 //previously found subsets
/*fast computation of power of 2 (mod M)*/
long long Pow(long long num,long long pow){
    long long result=1;
    while(pow)
    {
        if(pow&1)
        {
            result*=num;
            result%=M;
        }
        num*=num;
        num%=M;
        pow>>=1;
    }
    return result;
}
/*returns true if the subset is new, i.e. its hash has not been stored yet*/
bool check(const vector<bool>& booleanVector){
    long long result=0;
    for(size_t i=0;i<booleanVector.size();i++)
        if(booleanVector[i])
            result=(result+Pow(2,i))%M;
    return (subset.find(result)==subset.end());
}

Greedy algorithm exercise not working properly

I'm in high school and having a test soon, one of the topics being the Greedy algorithm. I'm having an unknown issue with this exercise: "It is given an array of N integers. Using the Greedy algorithm, determine the largest number that can be written as a multiplication of two of the array elements" (Sorry if it's a bit unclear, I'm not a native English speaker).
Now, what I had in mind to solve this exercise is this: Find the largest two numbers and the lowest two numbers (in case they are both negative) and display either the multiplication of the two largest or of the two lowest, depending on which number is larger.
This is what I wrote:
#include <iostream>
using namespace std;
int a[100001],n;
int main()
{
int max1=-1000001,max2=-1000001,min1=1000001,min2=1000001,x;
cin>>n;
for (int i=1; i<=n; i++)
{
cin>>a[i];
if (a[i]>=max2)
{
if (a[i]>=max1)
{
x=max1;
max1=a[i];
max2=x;
}
else max2=a[i];
}
if (a[i]<=min2)
{
if (a[i]<=min1)
{
x=min1;
min1=a[i];
min2=x;
}
else min2=a[i];
}
}
if (n==1)
cout<<n;
else if (max1*max2>=min1*min2)
cout<<max1*max2;
else cout<<min1*min2;
return 0;
}
Yes, I know the way I wrote it is untidy/ugly. The code, however, should function properly and I tested it with both the example provided by the exercise and lots of different situations. They all gave the right result. The problem is that the programming exercises website gives my code a 80/100 score, not because of the time but because of the wrong answers.
I've already spent more than 2 hours looking at this code and I just can't figure out what's wrong with it. Can anyone point out the flaw? Thanks <3

The problem most likely comes from the fact that multiplying two ints gives you an int. An int usually has a range of -2,147,483,648 to 2,147,483,647.
If you then multiply 2,147,483,647 * 2, for example, you typically get -2. Similarly, 2,147,483,647 + 1 typically gives you -2,147,483,648: when the value passes its maximum, it wraps around to the lowest possible value. (Strictly speaking, signed overflow is undefined behaviour in C++, so you cannot rely on this.)
To address the problem you need to cast one of the variables you multiply to a 64-bit integer. For modern C++ that would be int64_t.
if (n==1)
cout<<n;
else if (static_cast<int64_t>(max1)*max2>=static_cast<int64_t>(min1)*min2)
cout<<static_cast<int64_t>(max1)*max2;
else cout<<static_cast<int64_t>(min1)*min2;
Strictly speaking, the product of two 32-bit ints always fits in an int64_t (its magnitude is at most 2^62), so the casts above are enough for this exercise. If you still want the full range of a uint64_t, the unsigned version (for example because the inputs themselves are wider), you run into another issue with the numbers below 0.
So first you should convert min1 and min2 to the equivalent positive numbers, then store them as uint64_t.
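uint64_t positive_min1, positive_min2;
if (min1 < 0 && min2 < 0) {
    // Negate in 64 bits so that even the most negative int is handled
    positive_min1 = -static_cast<int64_t>(min1);
    positive_min2 = -static_cast<int64_t>(min2);
}
else {
    positive_min1 = 0;
    positive_min2 = 0;
}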
Now you can go ahead and do
if (n==1)
cout<<n;
else if (static_cast<uint64_t>(max1)*max2>=positive_min1*positive_min2)
cout<<static_cast<uint64_t>(max1)*max2;
else cout<<positive_min1*positive_min2;
No need to cast positive_min1 and positive_min2 since they are already uint64_t.
Since you are casting max1 to unsigned, you should check that it's not below 0 first.
If signed and unsigned are not familiar concepts, you can read up on them and on the different integer data types.
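For reference, here is a compact sketch of the whole idea with the products computed in 64 bits. It assumes 32-bit int inputs and at least two elements; the function name max_pair_product is hypothetical, and it sorts a copy for brevity rather than tracking the extremes in one pass as the question does.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Largest product of two elements at different positions.
// Assumes a.size() >= 2 and 32-bit int values.
std::int64_t max_pair_product(const std::vector<int>& a) {
    std::vector<int> v(a);
    std::sort(v.begin(), v.end());
    const std::size_t n = v.size();
    // Widen before multiplying so the product cannot overflow.
    const std::int64_t low_pair = static_cast<std::int64_t>(v[0]) * v[1];
    const std::int64_t high_pair = static_cast<std::int64_t>(v[n - 1]) * v[n - 2];
    return std::max(low_pair, high_pair);
}
```

The two candidates are the same ones the question uses: the two smallest values (which matter when both are negative) and the two largest.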

How to calculate the sum of the bitwise xor values of all the distinct combination of the given numbers efficiently?

Given n(n<=1000000) positive integer numbers (each number is smaller than 1000000). The task is to calculate the sum of the bitwise xor ( ^ in c/c++) value of all the distinct combination of the given numbers.
Time limit is 1 second.
For example, if 3 integers are given as 7, 3 and 5, answer should be 7^3 + 7^5 + 3^5 = 12.
My approach is:
#include <bits/stdc++.h>
using namespace std;
int num[1000001];
int main()
{
int n, i, sum, j;
scanf("%d", &n);
sum=0;
for(i=0;i<n;i++)
scanf("%d", &num[i]);
for(i=0;i<n-1;i++)
{
for(j=i+1;j<n;j++)
{
sum+=(num[i]^num[j]);
}
}
printf("%d\n", sum);
return 0;
}
But my code failed to run in 1 second. How can I write my code in a faster way, which can run in 1 second ?
Edit: Actually this is an Online Judge problem and I am getting Cpu Limit Exceeded with my above code.

You need to compute around 1e12 xors in order to brute force this. Modern processors can do around 1e10 such operations per second. So brute force cannot work; therefore they are looking for you to figure out a better algorithm.
So you need to find a way to determine the answer without computing all those xors.
Hint: can you think of a way to do it if all the input numbers were either zero or one (one bit)? And then extend it to numbers of two bits, three bits, and so on?

When optimising your code you can go 3 different routes:
Optimising the algorithm.
Optimising the calls to language and library functions.
Optimising for the particular architecture.
There may very well be a quicker mathematical way of xoring every pair combination and then summing them up, but I know it not. In any case, on the contemporary processors you'll be shaving off microseconds at best; that is because you are doing basic operations (xor and sum).
Optimising for the architecture also makes little sense. It normally becomes important in repetitive branching, you have nothing like that here.
The biggest problem in your algorithm is reading from the standard input. Despite the fact that "scanf" takes only 5 characters in your computer code, in machine language this is the bulk of your program. Unfortunately, if the data will actually change each time your run your code, there is no way around the requirement of reading from stdin, and there will be no difference whether you use scanf, std::cin >>, or even will attempt to implement your own method to read characters from input and convert them into ints.
All this assumes that you don't expect a human being to enter thousands of numbers in less than one second. I guess you can be running your code via: myprogram < data.

This function grows quadratically (thanks @rici). At around 25,000 positive integers with each being 999,999 (worst case) the for loop calculation alone can finish in approximately a second. Making this work with the input as you have specified, for 1 million positive integers, just doesn't seem possible with this approach.

With the hint in Alan Stokes's answer, you can get linear complexity instead of quadratic with the following:
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>
std::size_t xor_sum(const std::vector<std::uint32_t>& v)
{
    std::size_t res = 0;
    for (std::size_t b = 0; b != 32; ++b) {
        // Number of elements whose bit b is set
        const std::size_t count_1 =
            std::count_if(v.begin(), v.end(),
                          [b](std::uint32_t n) { return (n >> b) & 0x01; });
        const std::size_t count_0 = v.size() - count_1;
        // Each (0, 1) pair contributes 1 << b to the sum
        res += (count_0 * count_1) << b;
    }
    return res;
}
Live Demo.
Explanation:
x^y = Sum_b((x&b)^(y&b)) where b is a single-bit mask (from 1<<0 to 1<<31).
For a given bit, with count_0 and count_1 the number of inputs having that bit set to 0 or 1 respectively, there are count_0 * (count_0 - 1) / 2 pairs of the form 0^0, count_0 * count_1 pairs of the form 0^1, and count_1 * (count_1 - 1) / 2 pairs of the form 1^1. Since 0^0 and 1^1 are 0, only the count_0 * count_1 mixed pairs contribute, each adding 1<<b to the sum.
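To see the counting argument on the question's example (7, 3, 5): bit 0 is set in all three numbers, so it contributes nothing; bit 1 is set in 7 and 3 but not in 5, giving 2 * 1 pairs worth 2 each; bit 2 is set in 7 and 5 but not in 3, giving 2 * 1 pairs worth 4 each. The total is 4 + 8 = 12. The sketch below restates the per-bit count next to a brute-force version for cross-checking; both function names are made up, and it assumes the total fits in 64 bits, which holds for the limits in the question.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Sum of a[i] ^ a[j] over all pairs i < j, counted bit by bit.
std::uint64_t xor_sum_by_bits(const std::vector<std::uint32_t>& v) {
    std::uint64_t total = 0;
    for (int bit = 0; bit < 32; ++bit) {
        std::uint64_t ones = 0;
        for (std::uint32_t n : v) ones += (n >> bit) & 1u;
        const std::uint64_t zeros = v.size() - ones;
        total += (ones * zeros) << bit;  // each mixed pair adds 1 << bit
    }
    return total;
}

// Quadratic reference implementation, only for checking small inputs.
std::uint64_t xor_sum_brute(const std::vector<std::uint32_t>& v) {
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        for (std::size_t j = i + 1; j < v.size(); ++j)
            total += v[i] ^ v[j];
    return total;
}
```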

Bitwise Operation in C/C++: ORing all XOR'd pairs in O(N)

I need to XOR each possible pair of elements in an array, and then OR those results together.
Is it possible to do this in O(N)?
Example:-
If list contain three numbers 10,15 & 17, Then there will be a total of 3 pairs:
d1=10^15=5;
d2=10^17=27;
d3=17^15=30;
k= d1 | d2 | d3 ;
K=31

Actually, it's even easier than Tanmay suggests.
It turns out that most of the pairs are redundant: (A^B)|(A^C)|(B^C) == (A^B)|(A^C) and
(A^B)|(A^C)|(A^D)|(B^C)|(B^D)|(C^D) == (A^B)|(A^C)|(A^D), etc. So you can just XOR each element with the first, and OR the results:
result = 0;
for (i=1; i<N;i++){
result|=data[0]^data[i];
}
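A self-contained version of the loop above, as a sketch; the function name or_of_pair_xors is hypothetical. It works because a bit is set in some pairwise XOR exactly when not all elements agree on that bit, and in that case at least one element must disagree with the first one.

```cpp
#include <cstddef>
#include <vector>

// OR of data[i] ^ data[j] over all pairs, using only pairs with the first element.
unsigned or_of_pair_xors(const std::vector<unsigned>& data) {
    unsigned result = 0;
    for (std::size_t i = 1; i < data.size(); ++i)
        result |= data[0] ^ data[i];
    return result;
}
```

For the list 10, 15, 17 from the question this gives (10^15) | (10^17) = 5 | 27 = 31.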

OR everything, NAND everything, AND both results
Finding all combinations in O(N) is obviously impossible. So the solution has to be some ad-hoc reformulation of the problem. This is pure intuition. (I don't have a proof, but it works.)
I am not sure how to solve it mathematically using boolean algebra since it involves finding all combination pairs, but I'll try to explain it using a Venn diagram.
The required area is exactly identical to the Venn diagram of OR except for the area of AND. Therefore the latter has to be subtracted. If you try it with n > 3, the picture becomes even clearer.
The best way to test this method would be to compare it against algorithms which don't have to be O(N). Anyway, you can try finding a direct proof. If you find it, please kindly share it with us too. :)
As far as your question goes, I'm sure you can implement it in O(N) yourself easily.
Good luck.
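For what it's worth, the rule in the heading can be justified per bit: a bit appears in some pairwise XOR exactly when two elements differ in it, which means it is set in the OR of all elements and clear in the AND of all elements. A minimal sketch under that reading, with a made-up function name:

```cpp
#include <vector>

// OR of all pairwise XORs, computed as (OR of all) AND NOT (AND of all).
unsigned or_and_nand(const std::vector<unsigned>& data) {
    unsigned or_all = 0;
    unsigned and_all = ~0u;
    for (unsigned x : data) {
        or_all |= x;
        and_all &= x;
    }
    // Keep bits that are set somewhere but not everywhere.
    return or_all & ~and_all;
}
```

This is a single pass over the array, so it runs in O(N).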

Bitwise means that you only care about 1 or 0...
The OR phase is true if at least one "pair XOR" is true.
There exists only two series for which all "pair XOR" are false : 1,1,1,1,1,1,1,1 and 0,0,0,0,0,0.
The solution is therefore a for loop to test if all items are 1 or 0.
And this is O(n) !
Bye,

You can do something straightforward: loop over the adjacent pairs, 'xor' them, and 'or' the sub-results. Adjacent pairs are enough: if any two elements differ in some bit, then some pair of neighbours must differ in that bit too. Here is a function that expects a pointer to the start of the array, and the size of the array. I typed it here without trying it, but even if it is not correct, I hope you get the idea.
unsigned int compute(const unsigned int *p, size_t size)
{
assert(size >= 2);
size_t counter = size - 1;
unsigned int value = 0;
while (counter != 0) {
value |= *p ^ *(p + 1);
++p;
--counter;
}
return value;
}

C++ program to calculate quotients of large factorials

How can I write a c++ program to calculate large factorials.
Example, if I want to calculate (100!) / (99!), we know the answer is 100, but if i calculate the factorials of the numerator and denominator individually, both the numbers are gigantically large.

Expanding on Dirk's answer (which in my opinion is the correct one):
#include <math.h>
#include <stdio.h>
int main(){
    printf("%lf\n", (100.0/99.0) * exp(lgamma(100)-lgamma(99)) );
}
Try it; it really does what you want even though it looks a little crazy if you are not familiar with it. Using a bigint library is going to be far less efficient when an approximate floating-point result is enough. Taking exps of logs of gammas is super fast. This runs instantly.
The reason you need to multiply by 100/99 is that gamma(n) equals (n-1)!, not n!. So yes, you could just do exp(lgamma(101)-lgamma(100)) instead. Also, gamma is defined for more than just integers.
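The same trick wrapped in a function, as a sketch; the name factorial_quotient_approx is made up. The result is a floating-point approximation, so round it if you expect an integer and the value is small enough for a double to represent exactly.

```cpp
#include <cmath>

// Approximates m! / n! using gamma(k + 1) == k!.
double factorial_quotient_approx(unsigned m, unsigned n) {
    return std::exp(std::lgamma(m + 1.0) - std::lgamma(n + 1.0));
}
```

For example, std::llround(factorial_quotient_approx(100, 99)) yields 100 without ever forming 100! or 99!.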

You can use the Gamma function instead; see the Wikipedia page, which also points to code.

Of course this particular expression should be optimized, but as for the title question, I like GMP because it offers a decent C++ interface, and is readily available.
#include <iostream>
#include <gmpxx.h>
mpz_class fact(unsigned int n)
{
mpz_class result(n);
while(n --> 1) result *= n;
return result;
}
int main()
{
mpz_class result = fact(100) / fact(99);
std::cout << result.get_str(10) << std::endl;
}
compiles on Linux with g++ -Wall -Wextra -o test test.cc -lgmpxx -lgmp

By the sounds of your comments, you also want to calculate expressions like 100!/(96!*4!).
Having "cancelled out the 96", leaving yourself with (97 * ... * 100)/4!, you can then keep the arithmetic within smaller bounds by taking as few numbers "from the top" as possible as you go. So, in this case:
i = 97
j = 4
result = 1
while (i <= 100) or (j > 1)
    if (j > 1) and (result % j == 0)
        result /= j
        --j
    else
        result *= i
        ++i
You can of course be cleverer than that in the same vein.
This just delays the inevitable, though: eventually you reach the limits of your fixed-size type. Factorials explode so quickly that for heavy-duty use you're going to need multiple-precision.
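One common way to be "cleverer in the same vein" is to multiply and divide in lockstep, so that every intermediate value is itself a binomial coefficient. This is a sketch of that variant, not the exact loop above; the function name binomial is hypothetical, and it assumes the intermediate product fits in 64 bits.

```cpp
#include <cstdint>

// Computes n! / ((n - k)! * k!) without forming any factorial.
// Assumes result * (n - k + i) never exceeds 64 bits.
std::uint64_t binomial(unsigned n, unsigned k) {
    if (k > n) return 0;
    if (k > n - k) k = n - k;  // use the smaller of k and n - k
    std::uint64_t result = 1;
    for (unsigned i = 1; i <= k; ++i) {
        // After this step result == C(n - k + i, i), so the division is exact.
        result = result * (n - k + i) / i;
    }
    return result;
}
```

For 100!/(96!*4!) this performs four multiplications and four divisions and returns 3921225.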

Here's an example of how to do so:
http://www.daniweb.com/code/snippet216490.html
The approach they take is to store the big #s as a character array of digits.
Also see this SO question: Calculate the factorial of an arbitrarily large number, showing all the digits

You can use a big integer library like gmp which can handle arbitrarily large integers.

The only optimization that can be made here (considering that in m!/n! m is larger than n) means crossing out everything you can before using multiplication.
If m is less than n we would have to swap the elements first, then calculate the factorial and then make something like 1 / result. Note that the result in this case would be double and you should handle it as double.
Here is the code.
if (m == n) return 1;
// If 'm' is less than 'n' we would have
// to calculate the denominator first and then
// make one division operation
bool need_swap = (m < n);
if (need_swap) std::swap(m, n);
// #note You could also use some BIG integer implementation,
// if your factorial would still be big after crossing some values
// Store the result here
int result = 1;
for (int i = m; i > n; --i) {
result *= i;
}
// Here comes the division if needed
// After that, we swap the elements back
if (need_swap) {
// Note the double here
// If m is always > n then these lines are not needed
double fractional_result = (double)1 / result;
std::swap(m, n);
}
Also to mention (if you need some big int implementation and want to do it yourself) - the best approach that is not so hard to implement is to treat your int as a sequence of blocks and the best is to split your int to series, that contain 4 digits each.
Example: 1234 | 4567 | 2323 | 2345 | .... Then you'll have to implement every basic operation that you need (sum, mult, maybe pow, division is actually a tough one).

To solve x!/y! for x > y:
int product = 1;
for(int i=0; i < x - y; i ++)
{
product *= x-i;
}
If y > x switch the variables and take the reciprocal of your solution.
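The loop above overflows silently once the product no longer fits in an int. Here is a sketch of the same loop with a wider type and an explicit overflow check; the function name factorial_ratio_exact is made up, and returning std::optional (C++17) is just one way to report failure.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Returns x! / y! for x >= y, or std::nullopt if x < y
// or the result does not fit in 64 bits.
std::optional<std::uint64_t> factorial_ratio_exact(unsigned x, unsigned y) {
    if (x < y) return std::nullopt;  // caller should swap and take the reciprocal
    std::uint64_t product = 1;
    for (std::uint64_t i = static_cast<std::uint64_t>(y) + 1; i <= x; ++i) {
        // Multiplying would overflow if product > max / i.
        if (product > std::numeric_limits<std::uint64_t>::max() / i)
            return std::nullopt;
        product *= i;
    }
    return product;
}
```

When the function reports failure, that is the point at which a multiple-precision library or a floating-point approximation becomes necessary.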

I asked a similar question, and got some pointers to some libraries:
How can I calculate a factorial in C# using a library call?
It depends on whether or not you need all the digits, or just something close. If you just want something close, Stirling's Approximation is a good place to start.
