I have this problem:
There are K lines of N numbers (32-bit each). I have to choose the line with the maximum product of its numbers. The main problem is that N can go up to 20, so the product can be far too big for any built-in integer type.
I'm trying to do this with logarithms:
// excerpt: typedef long double ld; k, n, t, c and a small tolerance eps are declared elsewhere
ld sum = 0, max = 0;
int index = 0;
for(int i = 0; i < k; i++) { // K lines
    sum = 0, c = 0;
    for(int j = 0; j < n; j++) { // N numbers
        cin >> t;
        if(t < 0)
            c++; // if the number is less than 0, I memorize it
        if(t == 1 || t == -1) { // if the number is 1 or -1
            sum += 0.00000001; // because log(1) = 0
            if(t == -1)
                c++;
        }
        else if(t == 0) { // if some number is equal to zero then the sum is 0
            sum = 0;
            break;
        }
        else {
            sum += log10(fabs(t));
        }
    }
    if(c % 2 == 1) // if c is odd then multiply by -1
        sum *= -1;
    if(sum >= max) {
        max = sum;
        index = i;
    }
    if((sum - max) < eps) { // if sum is equal to max I also have to choose it
        max = sum;
        index = i;
    }
}
cout << index + 1 << endl;
The program passes only 50% of the test cases. Is there a way to improve my code?
In the case of t == -1, you increment c twice: once in the t < 0 branch and once again in the t == -1 branch.
If you want to avoid bignum libs, you can exploit the fact that multiplying a b1-bit number by a b2-bit number gives a result at most b1+b2 bits long (and at least b1+b2-1). So just sum the bit counts of all the multiplicands in a line, compare those totals, and remember the results in some array.
typedef unsigned int DWORD; // 32-bit unsigned int

int bits(DWORD p) // count how many bits p occupies
{
    DWORD m = 0x80000000; int b = 32;
    for (; m; m >>= 1, b--)
        if (p >= m) break;
    return b;
}
Index-sort the lines by their total bit counts, descending. If the first bit count after sorting strictly beats all the others, its line is the answer. Only when more than one line ties for the maximal bit count do you have to multiply them out.
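A minimal sketch of this screening step (the container layout is hypothetical; bits() is the function above, and all values are assumed positive):

#include <vector>

// Returns the indices of the lines whose summed bit count is maximal.
// The total is only an upper bound on the product's bit length (each
// multiplication can fall one bit short of b1+b2), so close calls and
// ties still need the exact check described below.
std::vector<int> maxBitLines(const std::vector<std::vector<DWORD>>& lines) {
    std::vector<int> sums, best;
    int mx = -1;
    for (const auto& line : lines) {
        int s = 0;
        for (DWORD v : line) s += bits(v);
        sums.push_back(s);
        if (s > mx) mx = s;
    }
    for (size_t i = 0; i < lines.size(); i++)
        if (sums[i] == mx) best.push_back((int)i);
    return best;
}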
Now the multiplication. You should multiply all the tied max lines at once: each time all the partial results are divisible by the same prime, divide them all by it. This way the results are truncated to a much smaller bit count, so they should fit into a 64-bit value.
You should check primes up to sqrt(max value); when your max value is 32-bit, that means primes up to 65536, so you can make a static table of primes to speed things up. Also, there is no point in checking primes bigger than your current partial result. If you know how, this can be sped up dramatically by the Sieve of Eratosthenes, but you will need to keep track of index offsets after each division and use periodic sieve tables, which is a bit complicated but doable.
If you do not check all the primes but just a few selected ones, the result can still overflow, so you should handle that too (throw an error or something), or divide all partial results by some fixed value, though that can invalidate the result.
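Rather than interleaving division with multiplication, the sketch below realizes the same cancellation by factoring every element first (trial division is enough for 32-bit values) and subtracting the per-prime exponent shared by all candidate lines; it assumes at least one candidate line, positive inputs, and leftovers that fit in 64 bits:

#include <algorithm>
#include <cstdint>
#include <map>
#include <vector>

// prime exponents of the product of one line, by trial division
std::map<uint64_t, long long> lineExponents(const std::vector<uint32_t>& line) {
    std::map<uint64_t, long long> e;
    for (uint32_t v : line) {
        uint64_t x = v;
        for (uint64_t p = 2; x > 1; p += (p == 2 ? 1 : 2)) {
            if (p * p > x) { e[x]++; break; }      // the remainder is prime
            while (x % p == 0) { e[p]++; x /= p; }
        }
    }
    return e;
}

// index of the line with the largest product among the tied candidates
int bestLine(const std::vector<std::vector<uint32_t>>& lines) {
    std::vector<std::map<uint64_t, long long>> exps;
    for (const auto& l : lines) exps.push_back(lineExponents(l));
    // cancel, prime by prime, the exponent shared by every line
    for (auto& [p, e0] : exps[0]) {
        long long shared = e0;
        for (size_t i = 1; i < exps.size(); i++) {
            auto it = exps[i].find(p);
            shared = std::min(shared, it == exps[i].end() ? 0LL : it->second);
        }
        if (shared > 0)
            for (auto& e : exps) e[p] -= shared;
    }
    // multiply the leftovers and pick the maximum (no overflow guard here)
    int best = 0;
    unsigned long long bestProd = 0;
    for (size_t i = 0; i < exps.size(); i++) {
        unsigned long long prod = 1;
        for (const auto& [p, cnt] : exps[i])
            for (long long c = 0; c < cnt; c++) prod *= p;
        if (prod > bestProd) { bestProd = prod; best = (int)i; }
    }
    return best;
}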
Another multiplication approach: you can also sort the multiplicands by value and check whether some are present in all of the tied max lines. If so, replace them with 1 (or delete them from the lists). This can be combined with the previous approach.
Bignum multiplication: you can write your own. The result is at most 20*32 = 640 bits, so it can be an array of unsigned ints (8, 16, or 32 bits wide, whatever you like); you can also handle the number as a string. Look here for how to compute a fast exact bignum square in C++; it also covers the multiplication approaches. And here is an NTT-based Schönhage-Strassen multiplication in C++, though that will be slower for numbers as small as yours.
At last you need to compare the results: compare from MSW to LSW, and whichever line has the bigger word first holds the max product (MSW is the most significant word, LSW is the least significant word).
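For instance, assuming both products are stored as equal-length word arrays with index 0 holding the most significant word, the comparison is a plain lexicographic scan:

#include <cstdint>
#include <vector>

// three-way compare: -1 if a < b, 0 if equal, +1 if a > b
int cmpWords(const std::vector<uint32_t>& a, const std::vector<uint32_t>& b) {
    for (size_t i = 0; i < a.size(); i++) { // MSW first
        if (a[i] < b[i]) return -1;
        if (a[i] > b[i]) return +1;
    }
    return 0;
}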
I think that this line is definitely wrong:
if(c % 2 == 1) // if c is odd then multiply by -1
    sum *= -1;
If your product lies in (0,1), its logarithm is negative, and this multiplication will make it positive. I think you should keep the sign separate from the log of the magnitude.
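A minimal sketch of that separation, assuming k and n are read first and each line's product is compared as a (sign class, log magnitude) pair; for negative products the smaller magnitude wins, and ties keep the earlier line:

#include <iostream>
#include <cmath>
using namespace std;

int main() {
    int k, n;
    cin >> k >> n;
    int bestIndex = 0, bestSign = -2;   // -2 sorts below every real sign class
    long double bestLog = 0;
    for (int i = 0; i < k; i++) {
        long double logAbs = 0;         // log10 of |product|
        int negatives = 0;
        bool zero = false;
        for (int j = 0; j < n; j++) {
            long long t;
            cin >> t;
            if (t == 0) zero = true;
            else {
                if (t < 0) negatives++;
                logAbs += log10l(fabsl((long double)t));
            }
        }
        int sign = zero ? 0 : (negatives % 2 ? -1 : +1);
        bool better;
        if (sign != bestSign) better = sign > bestSign;
        else if (sign == +1)  better = logAbs > bestLog; // bigger magnitude wins
        else if (sign == -1)  better = logAbs < bestLog; // smaller magnitude wins
        else                  better = false;            // both zero: keep the first
        if (better) { bestSign = sign; bestLog = logAbs; bestIndex = i; }
    }
    cout << bestIndex + 1 << endl;
}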
Given a number 1 <= N <= 3*10^5, count all subsets of the set {1, 2, ..., N-1} that sum to N. This is essentially the subset sum problem, with the modification that the target sum and the set grow together: the set is always {1, ..., N-1} and the target is N.
I think I have solved this using DP with an ordered map and a recursive inclusion/exclusion algorithm, but due to the time and space complexity I can't compute more than 10000 elements.
#include <iostream>
#include <chrono>
#include <map>
#include "bigint.h"
using namespace std;

// 2D map to memoize recursion results; keys: i & sum; value: count
map<pair<int, int>, bigint> hmap;

bigint counter(int n, int i, int sum){
    // base case
    if(i == 0){
        if(sum == 0){
            return 1;
        }
        return 0;
    }
    // alternative base case: the sum reaches zero before all
    // possible combinations have been iterated through
    if(sum == 0){
        return 1;
    }
    // the result of this recursion is already in the map
    if(hmap.find(make_pair(i, sum)) != hmap.end()){
        return hmap[make_pair(i, sum)];
    }
    // only recurse further if the resulting sum wouldn't be negative
    if(sum - i < 0){
        // optimization that skips unnecessary recursive branches
        return hmap[make_pair(i, sum)] = counter(n, sum, sum);
    }
    else{
        // include the number / don't include the number
        return hmap[make_pair(i, sum)] = counter(n, i - 1, sum - i) + counter(n, i - 1, sum);
    }
}
The function starts with arguments N, N-1, and N: the number of elements, the iterator (which decrements), and the remaining sum of the recursive branch (which decreases with every included value).
This is the code that counts the subsets. For an input of 3000 it takes around ~22 seconds to output the result, which is 40 digits long. Because of the long results I had to use the arbitrary-precision library bigint from rgroshanrg, which works fine for values below ~10000. Testing beyond that gives me a segfault on lines 28-29, maybe because the stored arbitrary-precision values become too big and conflict in the map. I need to somehow speed this code up so it can work with values beyond 10000, but I am stumped. Any ideas, or should I switch to another algorithm and data storage?
Here is a different algorithm, described in a paper by Evangelos Georgiadis, "Computing Partition Numbers q(n)":
std::vector<BigInt> RestrictedPartitionNumbers(int n)
{
    std::vector<BigInt> q(n, 0);
    // initialize q with A010815
    for (int i = 0; ; i++)
    {
        int n0 = i * (3 * i - 1) >> 1;
        if (n0 >= q.size())
            break;
        q[n0] = 1 - 2 * (i & 1);
        int n1 = i * (3 * i + 1) >> 1;
        if (n1 < q.size())
            q[n1] = 1 - 2 * (i & 1);
    }
    // construct A000009 as per "Evangelos Georgiadis, Computing Partition Numbers q(n)"
    for (size_t k = 0; k < q.size(); k++)
    {
        size_t j = 1;
        size_t m = k + 1;
        while (m < q.size())
        {
            if ((j & 1) != 0)
                q[m] += q[k] << 1;
            else
                q[m] -= q[k] << 1;
            j++;
            m = k + j * j;
        }
    }
    return q;
}
It's not the fastest algorithm out there, and this took about half a minute on my computer for n = 300000. But you only need to do it once (since it computes all partition numbers up to some bound), and it doesn't take a lot of memory (a bit over 150MB).
The results go up to but excluding n, and they assume that each number counts as a partition of itself, e.g. the set {4} is a partition of the number 4. Your definition of the problem excludes that case, so you need to subtract 1 from the result.
Maybe there's a nicer way to express A010815; that part of the code isn't slow, though, I just think it looks bad.
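A hypothetical usage sketch, assuming BigInt supports the arithmetic used above, subtraction of an int, and stream output:

#include <iostream>
#include <vector>

int main() {
    int n = 3000;
    // entries are needed for 0 .. n inclusive, so ask for n + 1 of them
    std::vector<BigInt> q = RestrictedPartitionNumbers(n + 1);
    // subtract 1 to drop the singleton partition {n}, per the question's definition
    std::cout << (q[n] - 1) << std::endl;
}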
I am new to competitive programming. I recently took part in the Div 3 contest on Codeforces. Even though I solved problem C, I found this code from one of the top programmers really interesting. I have been trying to understand it, but it seems I am too much of a beginner to understand it without someone else explaining it to me.
Here is the code.
#include <iostream>
#include <string>
#include <algorithm>
using namespace std;

int main(){
    int S;
    cin >> S;
    int ans = 1e9;
    for (int mask = 0; mask < 1 << 9; mask++) {
        int sum = 0;
        string num;
        for (int i = 0; i < 9; i++)
            if (mask >> i & 1) {
                sum += i + 1;
                num += char('0' + (i + 1));
            }
        if (sum != S)
            continue;
        ans = min(ans, stoi(num));
    }
    cout << ans << '\n';
}
The problem is to find the minimum number whose digits are all unique and sum to a given number S.
E.g. S = 20,
Ans = 389 (3+8+9 = 20)
Mask is 9 bits long; each bit represents a digit from 1-9, so the loop counts from 0 and stops just before 512. Each mask value corresponds to a candidate set of digits: find every set whose digits sum to the proper value, and remember the smallest number among them.
For example, if mask is 235, in binary it is
011101011 // bit representation of 235
987654321 // corresponding digit
==> 124678 // number for this example: "digits" with a 1-bit above
// and with lowest digits to the left
There are a few observations:
you want the smallest digits in the most significant places in the result, so a 1 will always come before any larger digit.
there is no need for a zero in the answer; it doesn't affect the sum and only makes the result larger
This loop converts each set bit into the corresponding digit, and applies that digit both to the sum and to num, which is what it'll print for output.
for (int i = 0; i < 9; i++)
    if (mask >> i & 1) { // check bit i in the mask
        sum += i + 1;                // numeric sum
        num += char('0' + (i + 1)); // output as a string
    }
(mask >> i) shifts the ith bit of mask into the lowest position, and & 1 masks off every other bit. The result is either 0 or 1: the value of the ith bit.
The num could have been accumulated in an int instead of a string (initialized to 0, then for each digit: multiply by 10 and add the digit), which is more efficient, but they didn't.
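For example, a sketch of that variant (the surrounding loop unchanged):

int value = 0; // replaces the string num
for (int i = 0; i < 9; i++)
    if (mask >> i & 1) {
        sum += i + 1;
        value = value * 10 + (i + 1); // append digit i+1 at the low end
    }
// ... and later: ans = min(ans, value); with no stoi() needed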
The way to understand what a snippet of code is doing is to A) understand what it does at a macro level, which you have done, B) go through each line and understand what it does, then C) work your way backward and forward from what you know, gaining progress a bit at a time. Let me show you what I mean using your example.
Let's start by seeing, broadly (top-down), what the code is doing:
int main(){
    // Set up some initial state
    int S;
    cin >> S;
    int ans = 1e9;
    // Create a mask; that's neat, we'll look at this later.
    for (int mask = 0; mask < 1 << 9; mask++) {
        // Loop state
        int sum = 0;
        string num;
        // This loop seems to come up with candidate sums, somehow.
        for (int i = 0; i < 9; i++)
            if (mask >> i & 1) {
                sum += i + 1;
                num += char('0' + (i + 1));
            }
        // Skip this combination if the sum we've found isn't the target
        if (sum != S)
            continue;
        // Keep track of the smallest value we've seen so far
        ans = min(ans, stoi(num));
    }
    // Print out the smallest value
    cout << ans << '\n';
}
So, going from what we knew about the function at a macro level, we've found that there are really only two obscure spots, the two loops. (If anything outside of those is confusing to you, please clarify.)
So now let's go through those loops bottom-up, line by line.
// The number 9 appears often; it's probably meant to represent the digits 1-9.
// The syntax 1 << 9 means 1 bit-shifted left 9 times.
// Each shift is a multiplication by 2,
// so this is equal to 1 * (2^9), or 512.
// Mask will be 9 bits long, and each combination of bits will be covered.
for (int mask = 0; mask < 1 << 9; mask++) {
    // Here's that number 9 again.
    // This time, we're looping from 0 to 8.
    for (int i = 0; i < 9; i++) {
        // The syntax mask >> i shifts mask down by i bits.
        // This is like dividing mask by 2^i.
        // The syntax & 1 keeps just the lowest bit.
        // Together, this is true if mask's ith bit is 1, false if it's 0.
        if (mask >> i & 1) {
            // sum is the result of adding the chosen digits together,
            // so the mask seems to be telling us which digits to use.
            sum += i + 1;
            // num is the string representation of the number whose sum we're finding.
            // '0'+(i+1) converts the numbers 1-9 into the characters '1'-'9'.
            num += char('0' + (i + 1));
        }
    }
}
Now we know what the individual lines are doing, but the whole may still be hard to put together. We have to meet in the middle: combine our overall understanding of what the code does with the low-level understanding of the specific lines of code.
We know that this code gives up after 9 digits. Why? Because there are only 9 unique non-zero values (1,2,3,4,5,6,7,8,9). The problem said they have to be unique.
Where's zero? Zero doesn't contribute. A number like 209 will always be larger than its counterpart without the zero, 92 or 29. So we just don't even look at zero.
We also know that this code doesn't care about order. If digit 2 is in the number, it's always before digit 5. In other words, the code doesn't ever look at the number 52, only 25. Why? Because the smallest anagram number (numbers with the same digits in a different order) will always start with the smallest digit, then the second smallest, etc.
So, putting this all together:
int main(){
    // Read in the target sum S
    int S;
    cin >> S;
    // Set ans to a value that's higher than anything possible,
    // because the largest number with unique digits is 987654321.
    int ans = 1e9;
    // Go through each combination of the digits 1 to 9.
    for (int mask = 0; mask < 1 << 9; mask++) {
        int sum = 0;
        string num;
        for (int i = 0; i < 9; i++)
            // If this combination includes the digit i+1,
            // then add it to the sum and append it to the string representation.
            if (mask >> i & 1) {
                sum += i + 1;
                num += char('0' + (i + 1));
            }
        // If this combination does not yield the right sum, try the next combination.
        if (sum != S)
            continue;
        // If this combination does yield the right sum,
        // see if it's smaller than our previous smallest.
        ans = min(ans, stoi(num));
    }
    // Print the smallest combination we found.
    cout << ans << '\n';
}
I hope this helps!
The for loop iterates over all 9-bit binary numbers and turns each one into a string of decimal digits, such that if the nth binary digit is set, the digit n+1 is appended to the decimal number.
Generating the numbers this way ensures that the digits are unique and that zero never appears.
But as @Welbog mentions in the comments, this solution to the problem is more complicated than it needs to be. The following will be an order of magnitude faster, and I think it is clearer:
int smallest_number_with_unique_digits_summing_to_s(int s) {
    int tens = 1;
    int answer = 0;
    for (int n = 9; n > 0 && s > 0; --n) {
        if (s >= n) {
            answer += n * tens;
            tens *= 10;
            s -= n;
        }
    }
    return answer;
}
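For example, a quick driver (prints 389 for S = 20):

#include <iostream>

int main() {
    std::cout << smallest_number_with_unique_digits_summing_to_s(20) << '\n';
}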
Just a quick explanation of how the code works. First you need to know which digits sum to S. Since each digit is unique, you can assign a bit to each digit in a binary number, like this:
Bit number   Digit
     0          1
     1          2
     2          3
    ...        ...
     8          9
So you can check all numbers less than 1 << 9 (9-bit numbers whose bits correspond to the digits 1 to 9) and test whether the digits selected by the set bits sum to S. For example, if we assume S = 17:
384 -> 1 1000 0000 -> bit 8 = digit 9 and bit 7 = digit 8 -> sum of digits = 9+8 = 17
Now that you know the sum is correct, you can just build the number from the digits you found.
I would like to find the largest prime factor of a given number. After several attempts, I've enhanced the test to cope with rather big numbers (i.e. up to one billion in milliseconds). The problem is that if I go beyond one billion, the execution time goes on forever, so to speak. I wonder if I can make more improvements and reduce the execution time. I'm hoping for a better execution time because in this link Prime Factors Calculator, the execution time is incredibly fast. My target number at the moment is 600851475143. The code is rather self-explanatory. Note: I've considered the Sieve of Eratosthenes algorithm, with no luck regarding the execution time.
#include <iostream>
#include <cmath>

bool isPrime(int n)
{
    if (n==2)
        return true;
    if (n%2==0)
        return false;
    for (int i(3); i<=sqrt(n); i+=2) // ignore even numbers and go up to sqrt(n)
        if (n%i==0)
            return false;
    return true;
}

int main()
{
    int max(0);
    long long target(600851475143);
    if( target%2 == 0 )
        max = 2;
    for ( int i(3); i<target; i+=2 ){ // loop through odd numbers
        if( target%i == 0 )           // check for common factor
            if( isPrime(i) )          // check for prime common factor
                max = i;
    }
    std::cout << "The greatest prime common factor is " << max << "\n";
    return 0;
}
One obvious optimization that I can see is:
for (int i(3);i<=sqrt(n);i+=2) // ignore even numbers and go up to sqrt(n)
instead of calculating sqrt every time, you can cache the result in a variable:
auto maxFactor = static_cast<int>(sqrt(n));
for (int i(3); i <= maxFactor; i += 2)
The reason I believe this could lead to a speed-up is that sqrt deals with floating-point arithmetic, and compilers usually aren't generous in optimizing it. gcc has a special flag, -ffast-math, to enable floating-point optimizations explicitly.
For numbers up to the target range that you mentioned, you will need a better algorithm: repeated division should suffice.
Here is the code (http://ideone.com/RoAmHd) which hardly takes any time to finish:
#include <iostream>
using namespace std;

int main() {
    long long input = 600851475143;
    long long mx = 0;
    for (long long x = 2; x <= input/x; ++x){ // x <= input/x avoids overflowing x*x
        while (input%x == 0) { input /= x; mx = x; }
    }
    if (input > 1){ // whatever is left is a prime factor above the square root
        mx = input;
    }
    cout << mx << endl;
    return 0;
}
The idea behind repeated division is that any composite factor is a product of smaller primes. Since each factor is divided out completely as soon as it is found, by the time x reaches a composite value its prime constituents are already gone, so only primes ever get to divide the number.
You don't need a primality test. Try this algorithm:
function factors(n)
    f := 2
    while f * f <= n
        if n % f == 0
            output f
            n := n / f
        else
            f := f + 1
    output n
You don't need a primality test because the trial factors increase by 1 at each step, so any composite trial factors will have already been handled by their smaller constituent primes.
I'll leave it to you to implement in C++ with appropriate data types. This isn't the fastest way to factor integers, but it is sufficient for Project Euler 3.
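For reference, one possible C++ rendering of that pseudocode (the variable names are mine; long long so that 600851475143 fits):

#include <iostream>

int main() {
    long long n = 600851475143LL;
    long long f = 2, largest = 1;
    while (f * f <= n) {
        if (n % f == 0) {
            std::cout << f << '\n'; // output f
            largest = f;
            n /= f;
        } else {
            f += 1;
        }
    }
    if (n > 1) { // the leftover is the last (and largest) prime factor
        std::cout << n << '\n';
        largest = n;
    }
    std::cout << "largest: " << largest << '\n';
}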
for ( int i(3); i<target; i+=2 ){ // loop through odd numbers
    if( target%i == 0 )           // check for common factor
        if( isPrime(i) )          // check for prime common factor
            max = i;
It is the first two lines of this code, not the primality checks, that take almost all the time. You divide target by every number from 3 to target-1; that is about target/2 divisions.
Besides, target is long long while i is only int. Since target exceeds the int range, i can never reach it, so you get an infinite loop (after i overflows, which is undefined behavior).
Finally, this code does not calculate the greatest prime common factor. It calculates the greatest prime divisor of target, and does so very inefficiently. So what do you really need?
And it is a bad idea to call anything "max" in C++, because std::max is a standard function.
Here is my basic version:
#include <iostream>

int main() {
    long long input = 600851475143LL;
    long long pMax = 0;
    // Deal with prime 2.
    while (input % 2 == 0) {
        input /= 2;
        pMax = 2;
    }
    // Deal with odd primes.
    for (long long x = 3; x * x <= input; x += 2) {
        while (input % x == 0) {
            input /= x;
            pMax = x;
        }
    }
    // Check for unfactorised input - must be prime.
    if (input > 1) {
        pMax = input;
    }
    std::cout << "The greatest prime factor is " << pMax << "\n";
    return 0;
}
It might be possible to speed things up further by using a Newton-Raphson integer square root method to set up a (mostly) fixed limit for the loop. That would need a rewrite of the main loop:
long long limit = iSqrt(input);
for (long long x = 3; x <= limit; x += 2) {
    if (input % x == 0) {
        pMax = x;
        do {
            input /= x;
        } while (input % x == 0);
        limit = iSqrt(input); // value of input changed, so reset the limit
    }
}
The square root is only calculated when a new factor is found and the value of input has changed.
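iSqrt() is not a standard function; a minimal Newton-Raphson integer square root along these lines would serve (a sketch, not tuned):

// returns floor(sqrt(n)) for n >= 0
long long iSqrt(long long n) {
    if (n < 2) return n;
    long long x = n, y = (x + 1) / 2;
    while (y < x) { // iterate x = (x + n/x) / 2 until it stops shrinking
        x = y;
        y = (x + n / x) / 2;
    }
    return x;
}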
Note that except for 2 and 3, all prime numbers are adjacent to multiples of 6.
The following code reduces the total number of iterations by:
Leveraging the fact mentioned above
Decreasing target every time a new prime factor is found
#include <iostream>

bool CheckFactor(long long& target, long long factor)
{
    if (target%factor == 0)
    {
        do target /= factor;
        while (target%factor == 0);
        return true;
    }
    return false;
}

long long GetMaxFactor(long long target)
{
    long long maxFactor = 1;
    if (CheckFactor(target,2))
        maxFactor = 2;
    if (CheckFactor(target,3))
        maxFactor = 3;
    // Check only factors that are adjacent to multiples of 6
    for (long long factor = 5, add = 2; factor*factor <= target; factor += add, add = 6-add)
    {
        if (CheckFactor(target,factor))
            maxFactor = factor;
    }
    if (target > 1)
        return target;
    return maxFactor;
}

int main()
{
    long long target = 600851475143;
    std::cout << "The greatest prime factor of " << target << " is " << GetMaxFactor(target) << std::endl;
    return 0;
}
I came across this piece of code to compute the least common multiple (LCM) of all numbers in an array, but I could not understand the algorithm used. What is the use of __builtin_popcount here, which counts the number of set bits?
pair<long long, int> pre[200000];
long long a[25], N;

// saturating multiplication: clamps at INF (defined elsewhere) instead of overflowing
long long trunc_mul(long long a, long long b)
{
    return a <= INF / b ? a * b : INF;
}

void compute()
{
    int limit = 1 << N;
    limit--;
    for (int i = 1; i <= limit; i++)
    {
        long long lcm = 1;
        pre[i].second = __builtin_popcount(i);
        int k = 1;
        for (int j = N - 1; j >= 0; j--)
        {
            if (k & i)
            {
                lcm = trunc_mul(lcm / __gcd(lcm, a[j]), a[j]);
            }
            k = k << 1;
        }
        pre[i].first = lcm;
    }
    return;
}
The code snippet you provided handles up to 25 numbers. For each subset of the numbers, it computes their LCM into pre[i].first and the size of that subset into pre[i].second. The subset itself is represented as a bitmask, so the snippet uses __builtin_popcount to count the subset's elements. It has nothing to do with the computation of the LCM.
The LCM is computed with the standard pairwise rule: lcm(a, b) = a * b / gcd(a, b), folded over the set one element at a time (for example, lcm(4, 6, 10): lcm(4, 6) = 12, then lcm(12, 10) = 60). This is exactly what the snippet does, using the builtin GCD function __gcd.
The k&i and k = k<<1 part figures out which numbers belong to the set represented by the bitmask. If you don't fully understand it, try to see what happens when i = 0b11010 by running the loop on a piece of paper or in a debugger. You will notice that the k&i condition is true on the second, fourth and fifth iterations, precisely the positions at which i has ones in its binary representation.
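A small self-contained illustration of the same pattern, with std::gcd (C++17) standing in for __gcd, the truncation left out, and the simpler bit-j-to-a[j] mapping:

#include <iostream>
#include <numeric> // std::gcd

int main() {
    long long a[] = {4, 6, 10};
    const int N = 3;
    for (int i = 1; i < (1 << N); i++) {   // every non-empty subset
        long long lcm = 1;
        for (int j = 0; j < N; j++)
            if (i >> j & 1)                // element j is in subset i
                lcm = lcm / std::gcd(lcm, a[j]) * a[j];
        std::cout << "mask " << i << ": lcm = " << lcm
                  << ", size = " << __builtin_popcount(i) << '\n';
    }
}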
While trying to find prime numbers in a range (see problem description), I came across the following code:
(Code taken from here)
// For each prime up to sqrt(N), use it in the segmented sieve process.
for (i = 0; i < cnt; i++) {
    p = myPrimes[i]; // Store the prime.
    s = M / p;
    s = s * p; // The largest multiple of p that does not exceed M.
    for (int j = s; j <= N; j = j + p) {
        if (j < M) continue; // Composite numbers less than M are of no concern.
        /* j - M = index in the array primesNow; the max index allowed in the
           array is not N but DIFF_SIZE, so we store the numbers offset by M.
           While printing we will add M back to get the actual number. */
        primesNow[j - M] = false;
    }
}
// In the loop above, the small primes themselves (for example 2 and 3) are
// also set to false, so we need to print them in case they're in the range.
for (int i = 0; i < cnt; i++) {
    if (myPrimes[i] >= M && myPrimes[i] <= N) // Without this loop you will see that
        cout << myPrimes[i] << endl;          // for the range (1, 30), 2 & 3 don't get printed.
}
// primesNow[i] == false for all composite numbers; primes are the entries left true.
for (int i = 0; i < N - M + 1; ++i) {
    // i + M != 1 ensures that for i = 0 and M = 1, 1 is not considered a prime.
    if (primesNow[i] == true && (i + M) != 1)
        cout << i + M << endl; // Print our prime numbers in the range.
}
However, I didn't find this code intuitive and it was not easy to understand.
Can someone explain the general idea behind the above algorithm?
What alternative algorithms are there to mark non-prime numbers in a range?
That's overly complicated. Let's start with a basic Sieve of Eratosthenes, in pseudocode, that outputs all the primes less than or equal to n:
function primes(n)
    sieve := makeArray(2..n, True)
    for p from 2 to n
        if sieve[p]
            output(p)
            for i from p*p to n step p
                sieve[i] := False
This function calls output on each prime p; output can print the primes, sum them, count them, or do whatever else you want with them. The outer for loop considers each candidate prime in turn; the sieving occurs in the inner for loop, where multiples of the current prime p are removed from the sieve.
Once you understand how that works, go here for a discussion of the segmented Sieve of Eratosthenes over a range.
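In C++, the basic pseudocode above translates almost one-for-one (the bound is illustrative):

#include <iostream>
#include <vector>

int main() {
    int n = 100;
    std::vector<bool> sieve(n + 1, true);
    for (int p = 2; p <= n; p++) {
        if (sieve[p]) {
            std::cout << p << '\n';                         // output(p)
            for (long long i = 1LL * p * p; i <= n; i += p)
                sieve[i] = false;                           // strike out multiples of p
        }
    }
}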
Have you considered running the sieve at the bit level? It can reach a larger number of primes in the same memory, and with a reusable buffer you could modify it to find, for example, the primes between 2 and 2^60 using 64-bit ints, reusing the same buffer while preserving the offsets of the primes already discovered. The following uses an array of integers.
Declarations:
#include <math.h>  // sqrt(), for the upper limit of elimination
#include <stdio.h> // for printing; could use <iostream>
Macros to manipulate bits; the following code uses 32-bit ints:
#define BIT_SET(d, n)  (d[n>>5] |=   1 << (n - ((n>>5) << 5)))  // set bit n
#define BIT_GET(d, n)  (d[n>>5] &   (1 << (n - ((n>>5) << 5)))) // test bit n
#define BIT_FLIP(d, n) (d[n>>5] &= ~(1 << (n - ((n>>5) << 5)))) // clear bit n (despite the name)
unsigned int n = 0x80000;  // the upper limit: half a million 32-bit ints (2 MB),
                           // which covers the first primes up to 16M
int *data = new int[n];    // allocate
unsigned int r = n * 0x20; // the actual number of bits available
We could use zeros to save the initialization time, but 1 = prime is a bit more intuitive:
for (int i = 0; i < n; i++)
    data[i] = 0xFFFFFFFF;

unsigned int seed = 2;         // the seed starts at 2
unsigned int uLimit = sqrt(r); // the upper limit for striking out the sieve
BIT_FLIP(data, 1);             // one is not prime
Time to discover the primes; this took under half a second:
// until uLimit is reached
while (seed < uLimit) {
    // don't strike out the seed itself when eliminating candidates
    for (int i = seed + seed; i < r; i += seed)
        BIT_FLIP(data, i);
    // find the next bit still active (set to 1), not counting the current seed
    for (int i = seed + 1; i < r; i++) {
        if (BIT_GET(data, i)) {
            seed = i;
            break;
        }
    }
}
Now for the output; this will consume the most time:
unsigned long bit_index = 0; // the current bit
int w = 8;                   // the width of a column
unsigned pc = 0;             // prime count, used to lay out the columns
for (int i = 0; i < n; i++) {
    unsigned long long int b = 1; // double width, so there is no overflow
    // if a bit is still set, include it as a result
    while (b < 0xFFFFFFFF) {
        if (data[i] & b) {
            printf("%8lu ", bit_index);
            if (((pc++) % w) == 0)
                putchar('\n'); // start a new row
        }
        bit_index++;
        b <<= 1; // multiply by 2 to check the next bit
    }
}
Clean up:
delete [] data;