Project Euler 3 (performance) - c++

This is my solution to Problem 3 from Project Euler. Is there any way to make the solution more efficient?
int largestPrimeFactor(unsigned __int64 x)
{
unsigned __int64 remainder = x;
int max_prime;
for (int i = 2; i <= remainder; i++)
{
while(remainder%i==0) {
remainder /= i;
max_prime = i;
}
}
return max_prime;
}
Update: Thank you all for the proposals. Based on them I modified the algorithm as follows:
1) Skip even candidate divisors.
while(remainder%2==0) {
max_prime = 2;
remainder /= 2;
}
for (int i = 3; i <= remainder; i += 2)
{
while(remainder%i==0) {
max_prime = i;
remainder /= i;
}
}
2) Work up to square root of remainder.
for (int i = 2; i*i <= remainder; i++)
{
while(remainder%i==0) {
max_prime = i;
remainder /= i;
cout << i << " " << remainder << endl;
}
}
if (remainder > 1) max_prime = remainder;
3) Generate prime numbers in advance using the Sieve of Eratosthenes algorithm. Probably not worth it in this simple example.
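As a rough sketch, optimisations 1) and 2) can be combined in a single function. The name largestPrimeFactorCombined and the switch to std::uint64_t for every variable are assumptions made for this illustration, not part of the code above.

```cpp
#include <cstdint>

// Sketch only: strips factors of 2 first, then tries odd divisors up to
// the square root of what is left. Inputs below 2 are returned unchanged.
std::uint64_t largestPrimeFactorCombined(std::uint64_t x) {
    if (x < 2) return x;
    std::uint64_t max_prime = 1;
    while (x % 2 == 0) {          // optimisation 1: handle 2 separately
        max_prime = 2;
        x /= 2;
    }
    for (std::uint64_t i = 3; i * i <= x; i += 2) {  // optimisation 2: stop at sqrt
        while (x % i == 0) {
            max_prime = i;
            x /= i;
        }
    }
    if (x > 1) max_prime = x;     // whatever remains is itself a prime factor
    return max_prime;
}
```

Using a 64-bit type for the result avoids truncation when the largest factor does not fit in an int.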

Ok, this is my take on it. Hope it might be of use (EDIT: not to provoke "don't use leading underscore" comments - added a namespace).
EDIT #2: Added a faster function get_factor_prime_faster() with a helper function. See notes about the speed tests in the end.
#include <cstdint>
#include <iostream>
namespace so
{
// Head-on approach: get_factor_prime()
std::uint64_t get_factor_prime( std::uint64_t _number )
{
for( std::uint64_t i_ = 2; i_ * i_ <= _number; ++i_ )
if( _number % i_ == 0 )
return ( get_factor_prime( _number / i_ ) );
return ( _number );
}
// Slightly improved approach: get_factor_prime_faster() and detail::get_factor_prime_odd()
namespace detail
{
std::uint64_t get_factor_prime_odd( std::uint64_t _number )
{
for( std::uint64_t i_ = 3; i_ * i_ <= _number; i_ += 2 )
if( _number % i_ == 0 )
return ( get_factor_prime_odd( _number / i_ ) );
return ( _number );
}
} // namespace so::detail
std::uint64_t get_factor_prime_faster( std::uint64_t _number )
{
while( _number % 2 == 0 )
_number /= 2;
return ( detail::get_factor_prime_odd( _number ) );
}
} // namespace so
int main()
{
std::cout << so::get_factor_prime( 600851475143 ) << std::endl;
std::cout << so::get_factor_prime( 13195 ) << std::endl;
std::cout << so::get_factor_prime( 101 ) << std::endl;
std::cout << so::get_factor_prime_faster( 600851475143 ) << std::endl;
std::cout << so::get_factor_prime_faster( 13195 ) << std::endl;
std::cout << so::get_factor_prime_faster( 101 ) << std::endl;
return( 0 );
}
Program output:
6857
29
101
6857
29
101
Admittedly, I still cannot figure out how to check easily if a number is a prime...
EDIT: Tested in a loop for the number 600851475143 * 1024 with GCC 4.7.2 at -O3 on Linux, Intel Core i5. Times are approximately as follows:
get_factor_prime is 3 times faster than OP's solution;
get_factor_prime_faster is 6 times faster than OP's solution.

A common approach:
Step 1: Generate prime numbers up to ceil(sqrt(number)) using the Sieve of Eratosthenes algorithm.
Step 2: Use these to factor the number.
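The two steps can be sketched roughly as follows. The function names primes_up_to, ceil_sqrt and largest_prime_factor are made up for this illustration, and the sketch assumes the number is small enough that a sieve of about sqrt(number) flags fits in memory.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Step 1: Sieve of Eratosthenes, returns all primes <= limit.
std::vector<std::uint64_t> primes_up_to(std::uint64_t limit) {
    std::vector<bool> composite(limit + 1, false);
    std::vector<std::uint64_t> primes;
    for (std::uint64_t i = 2; i <= limit; ++i) {
        if (composite[i]) continue;
        primes.push_back(i);
        for (std::uint64_t j = i * i; j <= limit; j += i) composite[j] = true;
    }
    return primes;
}

// Smallest r with r * r >= n; the floating-point estimate is corrected
// with integer arithmetic.
std::uint64_t ceil_sqrt(std::uint64_t n) {
    std::uint64_t r = static_cast<std::uint64_t>(std::sqrt(static_cast<long double>(n)));
    while (r * r < n) ++r;
    while (r > 0 && (r - 1) * (r - 1) >= n) --r;
    return r;
}

// Step 2: divide out the sieved primes; anything left over is prime.
std::uint64_t largest_prime_factor(std::uint64_t n) {
    std::uint64_t largest = 1;
    for (std::uint64_t p : primes_up_to(ceil_sqrt(n))) {
        if (p * p > n) break;     // remaining n has no divisor <= sqrt(n)
        while (n % p == 0) {
            largest = p;
            n /= p;
        }
    }
    if (n > 1) largest = n;
    return largest;
}
```

For a single number the sieve costs more than plain trial division; it tends to pay off when many numbers are factored against the same prime list.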

Related

Prime-checking every element of a 10^6 array

My goal is to figure out whether each element of an array is a prime or not.
Example:
Input: int A[5]={1,2,3,4,5}
Output: bool P[5]={0,1,1,0,1}
The problem is that the array size is up to 10^6. I tried the most efficient prime-checking algorithm I could find
(code: http://cpp.sh/9ewxa), but just the "cin" and "prime_checking" steps take a really long time. How should I solve this problem? Thanks.
Your "most efficient" prime test is actually horribly inefficient. Something like the Miller-Rabin primality test is much faster on a one-by-one basis. If your inputs are below 4.3 billion (i.e. they fit in uint32_t) then you only need 3 tests: a = 2, 7, and 61. For numbers in the uint64_t range it's 12 tests.
If you have a large array of integers then computing all primes up to some maximum might be faster than repeated tests. See Sieve of Eratosthenes for a good way to compute all primes fast. But it's impractical if your input numbers can be larger than 4 billion due to the memory required.
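For the array in the question, a minimal sketch of the sieve approach could look like this. The name mark_primes is hypothetical, and the sketch assumes the largest element is small enough to allocate one flag per value up to it.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sketch: one shared sieve up to the largest element, then a lookup per element.
// Negative values are reported as not prime.
std::vector<bool> mark_primes(const std::vector<int>& values) {
    if (values.empty()) return {};
    const int max_value = *std::max_element(values.begin(), values.end());
    // At least two flags so that indices 0 and 1 always exist.
    std::vector<bool> is_prime(static_cast<std::size_t>(std::max(max_value, 1)) + 1, true);
    is_prime[0] = false;
    is_prime[1] = false;
    for (long long i = 2; i * i <= max_value; ++i) {
        if (!is_prime[i]) continue;
        for (long long j = i * i; j <= max_value; j += i) is_prime[j] = false;
    }
    std::vector<bool> result;
    result.reserve(values.size());
    for (int v : values) result.push_back(v >= 0 && is_prime[v]);
    return result;
}
```

If reading the input is the slow part, calling std::ios::sync_with_stdio(false) and std::cin.tie(nullptr) before the first read commonly speeds up cin.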
Here is some code that computes a Sieve up to UINT32_MAX+1 and then checks Miller-Rabin has the same results as the sieve: https://gist.github.com/mrvn/137fb0c8a5c78dbf92108b696ff82d92
#include <iostream>
#include <cstdint>
#include <array>
#include <algorithm> // std::ranges::find
#include <ranges>
#include <cassert>
#include <bitset>
uint32_t pow_n(uint32_t a, uint32_t d, uint32_t n) {
if (d == 0) return 1;
if (d == 1) return a;
uint32_t t = pow_n(a, d / 2, n);
t = ((uint64_t)t * t) % n;
if (d % 2 == 0) {
return t;
} else {
return ((uint64_t)t * a) % n;
}
}
bool test(uint32_t n, unsigned s, uint32_t d, uint32_t a) {
//std::cout << "test(n = " << n << ", s = " << s << ", d = " << d << ", a = " << a << ")\n";
uint32_t x = pow_n(a ,d ,n);
//std::cout << " x = " << x << std::endl;
if (x == 1 || x == n - 1) return true;
for (unsigned i = 1; i < s; ++i) {
x = ((uint64_t)x * x) % n;
if (x == n - 1) return true;
}
return false;
}
bool is_prime(uint32_t n) {
static const std::array witnesses{2u, 3u, 5u, 7u, 11u, 13u, 17u, 19u, 23u, 29u, 31u, 37u};
static const std::array bounds{
2'047llu, 1'373'653llu, 25'326'001llu, 3'215'031'751llu,
2'152'302'898'747llu, 3'474'749'660'383llu,
341'550'071'728'321llu, 341'550'071'728'321llu /* no bounds for 19 */,
3'825'123'056'546'413'051llu,
3'825'123'056'546'413'051llu /* no bound for 29 */,
3'825'123'056'546'413'051llu /* no bound for 31 */,
(unsigned long long)UINT64_MAX /* off by a bit but it's the last bounds */,
};
static_assert(witnesses.size() == bounds.size());
if (n == 2) return true; // 2 is prime
if (n % 2 == 0) return false; // other even numbers are not
if (n <= witnesses.back()) { // I know the first few primes
return (std::ranges::find(witnesses, n) != std::end(witnesses));
}
// write n = 2^s * d + 1 with d odd
unsigned s = 0;
uint32_t d = n - 1;
while (d % 2 == 0) {
++s;
d /= 2;
}
// test witnesses until the bounds say it's a sure thing
auto it = bounds.cbegin();
for (auto a : witnesses) {
//std::cout << a << " ";
if (!test(n, s, d, a)) return false;
if (n < *it++) return true;
}
return true;
}
template<std::size_t N>
auto composite() {
std::bitset<N / 2 + 1> is_composite;
for (uint32_t i = 3; (uint64_t)i * i < N; i += 2) {
if (is_composite[i / 2]) continue;
for (uint64_t j = i * i; j < N; j += 2 * i) is_composite[j / 2] = true;
}
return is_composite;
}
bool slow_prime(uint32_t n) {
static const auto is_composite = composite<UINT32_MAX + 1llu>();
if (n < 2) return false;
if (n == 2) return true;
if (n % 2 == 0) return false;
return !is_composite.test(n / 2);
}
int main() {
/*
std::cout << "2047: ";
bool fast = is_prime(2047);
bool slow = slow_prime(2047);
std::cout << (fast ? "fast prime" : "");
std::cout << (slow ? "slow prime" : "");
std::cout << std::endl;
*/
//std::cout << "2: prime\n";
for (uint64_t i = 0; i <= UINT32_MAX; ++i) {
if (i % 1000000 == 1) { std::cout << "\r" << i << " "; std::cout.flush(); }
bool fast = is_prime(i);
bool slow = slow_prime(i);
if (fast != slow) std::cout << i << std::endl;
assert(fast == slow);
//std::cout << i << ": " << (is_prime(i) ? "prime" : "") << std::endl;
}
}
The sieve takes ~15s to compute and uses 256MB of memory; verifying Miller-Rabin over the whole 32-bit range takes ~12m45s (765 s), about 51 times slower than the sieve. That is roughly 5.6 million Miller-Rabin tests per second, which tells me that if you are testing more than about 85 million 32-bit numbers for primality then you should just compute them all with a sieve. Since the sieve runs in O(n log log n) time, it only gets better if your maximum input is smaller.

Choose maximum number in array so that their GCD is > 1

Question:
Given an array arr[] with N integers.
What is the maximum number of items that can be chosen from the array so that their GCD is greater than 1?
Example:
4
30 42 105 1
Answer: 3
Constraints
N <= 10^3
arr[i] <= 10^18
My take:
void solve(int i, int gcd, int chosen){
if(i > n){
maximize(res, chosen);
return;
}
solve(i+1, gcd, chosen);
if(gcd == -1) solve(i+1, arr[i], chosen+1);
else{
int newGcd = __gcd(gcd, arr[i]);
if(newGcd > 1) solve(i+1, newGcd, chosen+1);
}
}
After many tries, my code still clearly got TLE, is there any more optimized solution for this problem?
Interesting task you have. I implemented two variants of solutions.
All algorithms that are used in my code are: Greatest Common Divisor (through Euclidean Algorithm), Binary Modular Exponentiation, Pollard Rho, Trial Division, Fermat Primality Test.
First variant called SolveCommon() iteratively finds all possible unique factors of all numbers by computing pairwise Greatest Common Divisor.
When all possible unique factors are found one can compute count of each unique factor inside each number. Finally maximal count for any factor will be final answer.
Second variant called SolveFactorize() finds all factor by doing factorization of each number using three algorithms: Pollard Rho, Trial Division, Fermat Primality Test.
The Pollard Rho factorization algorithm is quite fast: its expected running time is O(N^(1/4)), so a 64-bit number takes around 2^16 iterations. For comparison, trial division is O(N^(1/2)), the square of Pollard Rho's cost. So in the code below Pollard Rho can handle 64-bit inputs, although not very quickly.
The first variant, SolveCommon(), is much faster than the second, SolveFactorize(), when the numbers are large; timings are printed in the console output after the following code (for the small 20-bit test shown there, SolveFactorize() is the quicker of the two).
As an example, the code below tests 1000 random 20-bit numbers. 1000 64-bit numbers are too much for SolveFactorize() to handle, but SolveCommon() solves 1000 64-bit numbers within 1-2 seconds.
#include <cstdint>
#include <vector>
#include <random>
#include <tuple>
#include <unordered_map>
#include <algorithm>
#include <set>
#include <iostream>
#include <chrono>
#include <cmath>
#include <map>
#define LN { std::cout << "LN " << __LINE__ << std::endl; }
using u64 = uint64_t;
using u128 = unsigned __int128;
static std::mt19937_64 rng{123}; //{std::random_device{}()};
auto CurTime() {
return std::chrono::high_resolution_clock::now();
}
static auto const gtb = CurTime();
double Time() {
return std::llround(std::chrono::duration_cast<
std::chrono::duration<double>>(CurTime() - gtb).count() * 1000) / 1000.0;
}
u64 PowMod(u64 a, u64 b, u64 const c) {
u64 r = 1;
while (b != 0) {
if (b & 1)
r = (u128(r) * a) % c;
a = (u128(a) * a) % c;
b >>= 1;
}
return r;
}
bool IsFermatPrp(u64 N, size_t ntrials = 24) {
// https://en.wikipedia.org/wiki/Fermat_primality_test
if (N <= 10)
return N == 2 || N == 3 || N == 5 || N == 7;
for (size_t trial = 0; trial < ntrials; ++trial)
if (PowMod(rng() % (N - 3) + 2, N - 1, N) != 1)
return false;
return true;
}
bool FactorTrialDivision(u64 N, std::vector<u64> & factors, u64 limit = u64(-1)) {
// https://en.wikipedia.org/wiki/Trial_division
if (N <= 1)
return true;
while ((N & 1) == 0) {
factors.push_back(2);
N >>= 1;
}
for (u64 d = 3; d <= limit && d * d <= N; d += 2)
while (N % d == 0) {
factors.push_back(d);
N /= d;
}
if (N > 1)
factors.push_back(N);
return N == 1;
}
u64 GCD(u64 a, u64 b) {
// https://en.wikipedia.org/wiki/Euclidean_algorithm
while (b != 0)
std::tie(a, b) = std::make_tuple(b, a % b);
return a;
}
bool FactorPollardRho(u64 N, std::vector<u64> & factors) {
// https://en.wikipedia.org/wiki/Pollard%27s_rho_algorithm
auto f = [N](auto x) -> u64 { return (u128(x + 1) * (x + 1)) % N; };
auto DiffAbs = [](auto x, auto y){ return x >= y ? x - y : y - x; };
if (N <= 1)
return true;
if (IsFermatPrp(N)) {
factors.push_back(N);
return true;
}
for (size_t trial = 0; trial < 8; ++trial) {
u64 x = rng() % (N - 2) + 1;
size_t total_steps = 0;
for (size_t cycle = 1;; ++cycle) {
bool good = true;
u64 y = x;
for (u64 i = 0; i < (u64(1) << cycle); ++i) {
x = f(x);
++total_steps;
u64 const d = GCD(DiffAbs(x, y), N);
if (d > 1) {
if (d == N) {
good = false;
break;
}
//std::cout << N << ": " << d << ", " << total_steps << std::endl;
if (!FactorPollardRho(d, factors))
return false;
if (!FactorPollardRho(N / d, factors))
return false;
return true;
}
}
if (!good)
break;
}
}
factors.push_back(N);
return false;
}
void Factor(u64 N, std::vector<u64> & factors) {
if (N <= 1)
return;
if (1) {
FactorTrialDivision(N, factors, 1 << 8);
N = factors.back();
factors.pop_back();
}
FactorPollardRho(N, factors);
}
size_t SolveFactorize(std::vector<u64> const & nums) {
std::unordered_map<u64, size_t> cnts;
std::vector<u64> factors;
std::set<u64> unique_factors;
for (auto num: nums) {
factors.clear();
Factor(num, factors);
//std::cout << num << ": "; for (auto f: factors) std::cout << f << " "; std::cout << std::endl;
unique_factors.clear();
unique_factors.insert(factors.begin(), factors.end());
for (auto f: unique_factors)
++cnts[f];
}
size_t max_cnt = 0;
for (auto [_, cnt]: cnts)
max_cnt = std::max(max_cnt, cnt);
return max_cnt;
}
size_t SolveCommon(std::vector<u64> const & nums) {
size_t const K = nums.size();
std::set<u64> cmn(nums.begin(), nums.end()), cmn2, tcmn;
std::map<u64, bool> used;
cmn.erase(1);
while (true) {
cmn2.clear();
used.clear();
for (auto i = cmn.rbegin(); i != cmn.rend(); ++i) {
auto j = i;
++j;
for (; j != cmn.rend(); ++j) {
auto gcd = GCD(*i, *j);
if (gcd != 1) {
used[*i] = true;
used[*j] = true;
cmn2.insert(gcd);
cmn2.insert(*i / gcd);
cmn2.insert(*j / gcd);
break;
}
}
if (!used[*i])
tcmn.insert(*i);
}
cmn2.erase(1);
if (cmn2.empty())
break;
cmn = cmn2;
}
//for (auto c: cmn) std::cout << c << " "; std::cout << std::endl;
std::unordered_map<u64, size_t> cnts;
for (auto num: nums)
for (auto c: tcmn)
if (num % c == 0)
++cnts[c];
size_t max_cnt = 0;
for (auto [_, cnt]: cnts)
max_cnt = std::max(max_cnt, cnt);
return max_cnt;
}
void TestRandom() {
size_t const cnt_nums = 1000;
std::vector<u64> nums;
for (size_t i = 0; i < cnt_nums; ++i) {
nums.push_back((rng() & ((u64(1) << 20) - 1)) | 1);
//std::cout << nums.back() << " ";
}
//std::cout << std::endl;
{
auto tb = Time();
std::cout << "common " << SolveCommon(nums) << " time " << (Time() - tb) << std::endl;
}
{
auto tb = Time();
std::cout << "factorize " << SolveFactorize(nums) << " time " << (Time() - tb) << std::endl;
}
}
int main() {
TestRandom();
}
Output:
common 325 time 0.061
factorize 325 time 0.005
I think you need to search among all possible prime numbers to find out which prime number can divide most element in the array.
Code:
std::vector<int> primeLessEqualThanN(int N) {
std::vector<int> primes;
for (int x = 2; x <= N; ++x) {
bool isPrime = true;
for (auto& p : primes) {
if (x % p == 0) {
isPrime = false;
break;
}
}
if (isPrime) primes.push_back(x);
}
return primes;
}
int maxNumberGCDGreaterThan1(int N, std::vector<int>& A) {
int A_MAX = *std::max_element(A.begin(), A.end()); // largest number in A
std::vector<int> primes = primeLessEqualThanN(std::sqrt(A_MAX));
int max_count = 0;
for (auto& p : primes) {
int count = 0;
for (auto& n : A)
if (n % p == 0)
count++;
max_count = count > max_count ? count : max_count;
}
return max_count;
}
Note that this way you cannot find the value of the GCD itself; the code relies on the fact that we don't need to know it.

Can you help me with my recursive function?

I have this simple function that calculates correctly, but my output statements are off. I have tried the other places where the cout statement is commented out, but that does not work.
int recursiveFunc(int n) {
int val; // value at nth sequence
//cout << "(" << n << ") = " << val << endl;
if (n == 1 ) { // base case 1
val = -1;
// outputs before function call
}
else if (n == 2) { // base case 2
val = -1;
// outputs before function call
}
else { // recursive case
//cout << "(" << n << ") = << val << endl;
val = 2*(recursiveFunc(n-1) + recursiveFunc(n-2));
cout << "(" << n << ") = " << val << endl; // not sure where to put cout statement
}
return val;
}
I am looking for an output like (for example n = 5):
(1) = -1
(2) = -1
(3) = -4
(4) = -10
(5) = -28
currently, my output looks like:
(1) = -1
(2) = -1
(3) = -4
(4) = -10
(3) = -4 // here an nth term is displayed twice
(5) = -28
Move your cout statement to outside the function and it works fine.
int recursiveFunc(int n) {
int val; // value at nth sequence
if (n == 1) { // base case 1
val = -1;
}
else if (n == 2) { // base case 2
val = -1;
}
else { // recursive case
val = 2 * (recursiveFunc(n - 1) + recursiveFunc(n - 2));
}
return val;
}
int main() {
for (int i = 1; i < 10; i++) {
std::cout << "(" << i << ") = " << recursiveFunc(i) << std::endl;
}
return 0;
}
Output:
(1) = -1
(2) = -1
(3) = -4
(4) = -10
(5) = -28
(6) = -76
(7) = -208
(8) = -568
(9) = -1552
Off the top of my head, a possible fix is to keep track of the highest n output so far and only print when a larger n is encountered. (This is probably not the best solution, but I'm feeling too tired and lazy to think about alternatives.)
Don't use a global variable for this (mutable global state is an anti-pattern!) - you'll need to pass it as another parameter.
int output_n_and_val( int n, int val, int& biggestN ) {
if( n > biggestN ) {
cout << "(" << n << ") = " << val << endl;
biggestN = n;
}
return val;
}
int recursiveFuncImpl( int n, int& biggestN ) {
if( n == 1 ) {
return output_n_and_val( n, -1, biggestN );
}
else if( n == 2 ) {
return output_n_and_val( n, -1, biggestN );
}
else {
int val = 2 * ( recursiveFuncImpl( n - 1, biggestN ) + recursiveFuncImpl( n - 2, biggestN ) );
return output_n_and_val( n, val, biggestN );
}
}
// Entrypoint function:
int recursiveFunc( int n ) {
int biggestN = -1;
return recursiveFuncImpl( n, biggestN );
}
int main()
{
recursiveFunc( 10 );
recursiveFunc( 5 );
return 0;
}
The downside to this approach is that, because the n == 2 case is encountered before n == 1, you'll never get the output (1) = -1. Fixing that is an exercise left for the reader.
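One possible way to print every term exactly once and in ascending order is to cache computed terms and to evaluate the smaller index first. This is only a sketch: the names traceTerm and traceSequence, the use of long long, and writing to a std::ostream instead of cout are all assumptions made for the illustration.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: computes term n, caches it in `memo`, and writes it
// to `out` the first time its value becomes known.
long long traceTerm(int n, std::vector<long long>& memo,
                    std::vector<bool>& known, std::ostream& out) {
    if (known[n]) return memo[n];          // already computed and printed
    long long val;
    if (n <= 2) {
        val = -1;                          // base cases: terms 1 and 2
    } else {
        // Smaller index first, so the output appears in ascending order.
        const long long a = traceTerm(n - 2, memo, known, out);
        const long long b = traceTerm(n - 1, memo, known, out);
        val = 2 * (a + b);
    }
    memo[n] = val;
    known[n] = true;
    out << "(" << n << ") = " << val << "\n";
    return val;
}

// Entry point: returns everything printed while computing term n (n >= 1).
std::string traceSequence(int n) {
    std::vector<long long> memo(static_cast<std::size_t>(n) + 1);
    std::vector<bool> known(static_cast<std::size_t>(n) + 1, false);
    std::ostringstream out;
    traceTerm(n, memo, known, out);
    return out.str();
}
```

Because each term is computed only once, the cache also removes the exponential blow-up of the plain recursion.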
You should use an array to store the values you have already computed, for example:
const int N = 1000000;
int values[N]; // remember to memset to 0 first
int recursiveFunc(int n) {
if(n < 0) return 0;
if(values[n] != 0) return values[n];
int val; // value at nth sequence
if (n == 1) { // base case 1
val = -1;
}
else if (n == 2) { // base case 2
val = -1;
}
else { // recursive case
val = 2 * (recursiveFunc(n - 1) + recursiveFunc(n - 2));
}
return values[n] = val;
}
int main() {
memset(values, 0, sizeof(values));
for (int i = 1; i < 10; i++) {
std::cout << "(" << i << ") = " << recursiveFunc(i) << std::endl;
}
return 0;
}

How to toggle a variable in a loop

Variable i toggles between 2 and 3 and multiplied into a, as in the following example:
a=2;
a=a*i // a=2*2=4 i=2
a=a*i // a=4*3=12 i=3
a=a*i // a=12*2=24 i=2
a=a*i // a=24*3=72 i=3
which goes on as long as a is < 1000.
How can I give the i two values sequentially as shown in the example?
int a = 2, i = 2;
while( a < 1000 )
{
a *= i;
i = 5 - i;
}
and many other ways.
You should be able to use a loop
int a = 2;
bool flip = true;
while (a < 1000)
{
a *= flip ? 2 : 3;
flip = !flip;
}
You can't have i be equal to two values at the same time. You can, however, make i alternate between 2 and 3 for as long as a < 1000. Below is the code:
int a = 2;
int counter = 0;
while (a < 1000) {
if (counter % 2 == 0) {
a *= 2;
}
else {
a *= 3;
}
counter++;
}
Here's a quick solution that doesn't involve a conditional.
int c = 0;
while (a < 1000)
a *= (c++ % 2) + 2;
or even
for(int c = 0; a < 1000; c++)
a *= (c % 2) + 2;
c % 2 evaluates to either 0 or 1, and adding 2 shifts that to either 2 or 3.
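To see which values the modulo trick produces, here is a small sketch that records every value of a. The function name toggle_products is made up for this example; the starting value 2 and the limit 1000 come from the question.

```cpp
#include <vector>

// Records each value of `a` while the multiplier alternates 2, 3, 2, 3, ...
// starting from a = 2 and stopping once a reaches 1000 or more.
std::vector<int> toggle_products() {
    std::vector<int> values;
    int a = 2;
    for (int c = 0; a < 1000; ++c) {
        a *= (c % 2) + 2;        // c even -> multiply by 2, c odd -> multiply by 3
        values.push_back(a);
    }
    return values;
}
```

The recorded values are 4, 12, 24, 72, 144, 432, 864, 2592; the last one exceeds 1000 because the condition is checked before each multiplication.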
Here's another efficient way to do this.
#include <iostream>
using namespace std;
int main() {
int its_bacon_time;
int i = ++(its_bacon_time = 0);
int y = 18;
int z = 9;
bool flag = !false;
int sizzle;
typedef bool decision_property;
#define perhaps (decision_property)(-42*42*-42)
#ifdef perhaps
# define YUM -
# define YUMMM return
#endif
bool bacon = !(bool) YUM(sizzle = 6);
if(flag)
std::cout << "YEP" << std::endl;
while (flag) {
if (bacon = !bacon)
flag = !flag; // YUM()?
if (YUM((YUM-i)YUM(i*2))+1>=((0x1337|0xECC8)&0x3E8))
(*((int*)&flag)) &= 0x8000;
else
flag = perhaps;
std::cout << i << " ";
int multiplicative_factor = y / (bacon ? z : y);
int* temporal_value_indicator = &i;
(**(&temporal_value_indicator)) *=
(((((multiplicative_factor & 0x0001) > 0) ? sizzle : bacon) // ~yum~
<< 1) ^ (bacon? 0 : 15));
std::cout << (((((multiplicative_factor & 0x0001) > 0) ? sizzle : bacon) // ~yum~
<< 1) ^ (bacon? 0 : 15)) << std::endl;
}
YUMMM its_bacon_time;
}
Point is that you should probably try something yourself first before asking for something that is really simple to achieve.
int main()
{
int a = 2;
int multiplier;
for (int i = 0; a < 1000; ++i)
{
multiplier = (i % 2) ? 3 : 2;
a *= multiplier;
}
}

self made pow() c++

I was reading through How can I write a power function myself? and the answer given by dan04 caught my attention mainly because I am not sure about the answer given by fortran, but I took that and implemented this:
#include <iostream>
using namespace std;
float pow(float base, float ex){
// power of 0
if (ex == 0){
return 1;
// negative exponent
}else if( ex < 0){
return 1 / pow(base, -ex);
// even exponent
}else if ((int)ex % 2 == 0){
float half_pow = pow(base, ex/2);
return half_pow * half_pow;
// integer exponent
}else{
return base * pow(base, ex - 1);
}
}
int main(){
for (int ii = 0; ii< 10; ii++){
cout << "pow(" << ii << ",.5) = " << pow(ii, .5) << endl;
cout << "pow(" << ii << ",2) = " << pow(ii, 2) << endl;
cout << "pow(" << ii << ",3) = " << pow(ii, 3) << endl;
}
}
though I am not sure if I translated this right because all of the calls giving .5 as the exponent return 0. In the answer it states that it might need a log2(x) based on a^b = 2^(b * log2(a)), but I am unsure about putting that in as I am unsure where to put it, or if I am even thinking about this right.
NOTE: I know that this might be defined in a math library, but I don't need all the added expense of an entire math library for a few functions.
EDIT: does anyone know a floating-point implementation for fractional exponents? (I have seen a double implementation, but that was using a trick with registers, and I need floating-point, and adding a library just to do a trick I would be better off just including the math library)
I have looked at this paper here which describes how to approximate the exponential function for double precision. After a little research on Wikipedia about single precision floating point representation I have worked out the equivalent algorithms. They only implemented the exp function, so I found an inverse function for the log and then simply did
POW(a, b) = EXP(LOG(a) * b).
Compiling this with GCC 4.6.2 at -O2 yields a pow function almost 4 times faster than the standard library's implementation.
Note: the code for EXP is copied almost verbatim from the paper I read and the LOG function is copied from here.
Here is the relevant code:
#define EXP_A 184
#define EXP_C 16249
float EXP(float y)
{
union
{
float d;
struct
{
#ifdef LITTLE_ENDIAN
short j, i;
#else
short i, j;
#endif
} n;
} eco;
eco.n.i = EXP_A*(y) + (EXP_C);
eco.n.j = 0;
return eco.d;
}
float LOG(float y)
{
int * nTemp = (int*)&y;
y = (*nTemp) >> 16;
return (y - EXP_C) / EXP_A;
}
float POW(float b, float p)
{
return EXP(LOG(b) * p);
}
There is still some optimization you can do here, or perhaps that is good enough.
This is a rough approximation but if you would have been satisfied with the errors introduced using the double representation, I imagine this will be satisfactory.
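For reference, the identity behind POW can be checked against the exact functions from <cmath>. This is only an illustration of a^b = 2^(b * log2(a)), not a replacement for the approximation above; the name pow_via_log2 is hypothetical, and the identity only holds for a > 0.

```cpp
#include <cmath>

// Illustration of a^b = 2^(b * log2(a)); valid only for a > 0.
// Uses the float overloads of std::exp2 and std::log2.
float pow_via_log2(float a, float b) {
    return std::exp2(b * std::log2(a));
}
```

Comparing the output of POW with pow_via_log2 for a few inputs is a simple way to judge how large the approximation error is for a given use case.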
I think the algorithm you're looking for could be 'nth root'. With an initial guess of 1 (for k == 0):
#include <iostream>
using namespace std;
float pow(float base, float ex);
float nth_root(float A, int n) {
const int K = 6;
float x[K] = {1};
for (int k = 0; k < K - 1; k++)
x[k + 1] = (1.0 / n) * ((n - 1) * x[k] + A / pow(x[k], n - 1));
return x[K-1];
}
float pow(float base, float ex){
if (base == 0)
return 0;
// power of 0
if (ex == 0){
return 1;
// negative exponent
}else if( ex < 0){
return 1 / pow(base, -ex);
// fractional exponent
}else if (ex > 0 && ex < 1){
return nth_root(base, 1/ex);
}else if ((int)ex % 2 == 0){
float half_pow = pow(base, ex/2);
return half_pow * half_pow;
// integer exponent
}else{
return base * pow(base, ex - 1);
}
}
int main_pow(int, char **){
for (int ii = 0; ii< 10; ii++){
cout << "pow(" << ii << ", .5) = " << pow(ii, .5) << endl;
cout << "pow(" << ii << ", 2) = " << pow(ii, 2) << endl;
cout << "pow(" << ii << ", 3) = " << pow(ii, 3) << endl;
}
return 0;
}
test:
pow(0, .5) = 0.03125
pow(0, 2) = 0
pow(0, 3) = 0
pow(1, .5) = 1
pow(1, 2) = 1
pow(1, 3) = 1
pow(2, .5) = 1.41421
pow(2, 2) = 4
pow(2, 3) = 8
pow(3, .5) = 1.73205
pow(3, 2) = 9
pow(3, 3) = 27
pow(4, .5) = 2
pow(4, 2) = 16
pow(4, 3) = 64
pow(5, .5) = 2.23607
pow(5, 2) = 25
pow(5, 3) = 125
pow(6, .5) = 2.44949
pow(6, 2) = 36
pow(6, 3) = 216
pow(7, .5) = 2.64575
pow(7, 2) = 49
pow(7, 3) = 343
pow(8, .5) = 2.82843
pow(8, 2) = 64
pow(8, 3) = 512
pow(9, .5) = 3
pow(9, 2) = 81
pow(9, 3) = 729
I think you could try to solve it using a Taylor series;
check this:
http://en.wikipedia.org/wiki/Taylor_series
With a Taylor series you can approximate a hard calculation such as 3^3.8 starting from an already known result such as 3^4. In this case you have
3^4 = 81, so expanding 3^x around x = 4 gives
3^3.8 = 81 + 81*ln(3)*(3.8 - 4) + 81*ln(3)^2*(3.8 - 4)^2/2 + ... and the more terms you take, the closer you get to the exact value.
A friend and I faced a similar problem while working on an OpenGL project, where math.h didn't suffice in some cases. Our instructor had the same problem, and he told us to separate the power into integer and fractional parts. For example, to calculate x^11.5 you can compute the 10th root of x^115 (sqrt(x^115, 10) in his notation), which may give a more accurate result.
This reworks @capellic's answer so that nth_root works with bigger values as well,
without the limitation of an array that is allocated for no reason:
#include <iostream>
#include <limits>
float pow(float base, float ex);
inline float fabs(float a) {
return a > 0 ? a : -a;
}
float nth_root(float A, int n, unsigned max_iterations = 500, float epsilon = std::numeric_limits<float>::epsilon()) {
if (n < 0)
throw "Invalid value";
if (n == 1 || A == 0)
return A;
float old_value = 1;
float value;
for (int k = 0; k < max_iterations; k++) {
value = (1.0 / n) * ((n - 1) * old_value + A / pow(old_value, n - 1));
if (fabs(old_value - value) < epsilon)
return value;
old_value = value;
}
return value;
}
float pow(float base, float ex) {
if (base == 0)
return 0;
if (ex == 0){
// power of 0
return 1;
} else if( ex < 0) {
// negative exponent
return 1 / pow(base, -ex);
} else if (ex > 0 && ex < 1) {
// fractional exponent
return nth_root(base, 1/ex);
} else if ((int)ex % 2 == 0) {
// even exponent
float half_pow = pow(base, ex/2);
return half_pow * half_pow;
} else {
// integer exponent
return base * pow(base, ex - 1);
}
}
int main () {
for (int i = 0; i <= 128; i++) {
std::cout << "pow(" << i << ", .5) = " << pow(i, .5) << std::endl;
std::cout << "pow(" << i << ", .3) = " << pow(i, .3) << std::endl;
std::cout << "pow(" << i << ", 2) = " << pow(i, 2) << std::endl;
std::cout << "pow(" << i << ", 3) = " << pow(i, 3) << std::endl;
}
std::cout << "pow(" << 74088 << ", .3) = " << pow(74088, .3) << std::endl;
return 0;
}
This solution of mine runs in O(n) time, so it is only accepted
for exponents below roughly 2^30 or 10^8.
Larger inputs are not accepted:
they give a Time Limit Exceeded error.
It is an easily understandable solution, though.
#include<bits/stdc++.h>
using namespace std;
double recursive(double x,int n)
{
// q carries the running product between the recursive calls,
// so it has to be static
static double q=1;
if(n==0){
// base case: hand back the product and reset q to 1
// so that the next top-level call starts fresh
double ans = q; q=1; return ans;
}
// multiply the running product by x once per recursion level
q *= x;
// cout<<q<<" ";
// pass the finished product back up the call chain
return recursive(x,n-1);
}
class Solution {
public:
double myPow(double x, int n) {
// double q=x;double N=n;
// return pow(q,N);
// when both sides are double this function works
if(n==0)return 1;
x = recursive(x,abs(n));
if(n<0) return double(1/x);
// else
return x;
}
};
For more help, see LeetCode problem 50, Pow(x, n).
Now the second, more optimized version of pow(x, n).
The idea is to solve it in O(log N), so we divide n by 2 at every step.
When the power is even, e.g. n = 4, 4/2 is 2, so we only have to square the half result: (2*2)*(2*2).
When the power is odd, e.g. n = 5, 5/2 is 2 again, so we square the half result
and also multiply by the number itself: (2*2)*(2*2)*2 gives 2^5 = 32.
The optimized code is below; the code above was O(n) only.
#include<bits/stdc++.h>
using namespace std;
double recursive(double x,int n)
{
// recursive calls will return the whole value of the program at every calls
if(n==0){return 1;}
// 1 is multiplied when the last value we get as we don't have to multiply further
double store;
store = recursive(x,n/2);
// call the function after the base condtion you have given to it here
if(n%2==0)return store*store;
else
{
return store*store*x;
// odd power we have the perfect square multiply the value;
}
}
// main function or the function for indirect call to recursive function
double myPow(double x, int n) {
if(n==0)return 1;
x = recursive(x,abs(n));
// for negatives powers
if(n<0) return double(1/x);
// else for positves
return x;
}
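The same halving idea can also be written without recursion. The sketch below is an assumption-laden illustration (the name my_pow_iterative is made up): it widens the exponent to long long before negating it, because abs(n) is undefined for n == INT_MIN.

```cpp
#include <climits>

// Iterative exponentiation by squaring, O(log n) multiplications.
double my_pow_iterative(double x, int n) {
    long long e = n;              // widen first so that -INT_MIN is representable
    if (e < 0) {                  // x^(-e) == (1/x)^e
        x = 1.0 / x;
        e = -e;
    }
    double result = 1.0;
    while (e > 0) {
        if (e & 1) result *= x;   // current binary digit of the exponent is 1
        x *= x;                   // square the base for the next digit
        e >>= 1;
    }
    return result;
}
```

Each loop iteration consumes one binary digit of the exponent, so a 32-bit exponent needs at most 32 iterations.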