Logarithm of a very large number - C++

I have to find the logarithm of a very large number.
I am doing this in C++.
I have already written functions for multiplication, addition, subtraction and division, but I am having problems with the logarithm. I do not need code, just a simple idea of how to do it using these functions.
Thanks.
P.S.
Sorry, I forgot to mention: I only need the binary logarithm of that number.
P.S.-2
I found in Wikipedia:
int floorLog2(unsigned int n) {
    if (n == 0)
        return -1;

    int pos = 0;
    if (n >= (1 << 16)) { n >>= 16; pos += 16; }
    if (n >= (1 <<  8)) { n >>=  8; pos +=  8; }
    if (n >= (1 <<  4)) { n >>=  4; pos +=  4; }
    if (n >= (1 <<  2)) { n >>=  2; pos +=  2; }
    if (n >= (1 <<  1)) {           pos +=  1; }
    return pos;
}
If I adapt it for big numbers, will it work correctly?

I assume you're writing a bignum class of your own. If you only care about an integral result of log2, it's quite easy: take the log2 of the most significant byte that's not zero, and add 8 for each byte below it. This assumes each byte holds values 0-255. The result is only accurate to within about ±0.5, but it's very fast.
[0][42][53] (10805 in bytes)
log2(42) = 5
+ 8*1 = 8 (because of the one byte lower than MSB)
= 13 (Actual: 13.39941145)
If your values hold base 10 digits, that works out to log2(MSB)+3.32192809*num_digits_less_than_MSB.
[0][5][7][6][2] (5762)
log2(5) = 2.321928095
+ 3.32192809*3 = 9.96578427 (because 3 digits lower than MSB)
= 12.28771 (Actual: 12.49235395)
(only accurate for numbers with less than ~10 million digits)
If you use the algorithm you found on Wikipedia, it will be immensely slow (but accurate, if you need decimals).
It's been pointed out that my method is inaccurate when the MSB is small (still within ±0.5, but no farther), but this is easily fixed by shifting the top two bytes (or digits) into a single number, taking the log2 of that, and doing the addition or multiplication only for the bytes below those two. I believe this is accurate to within half a percent, and still significantly faster than a normal logarithm.
[1][42][53] (76341 in bytes)
log2(1*256+42) = ?
log2(298) = 8.21916852046
+ 8*1 = 8 (because of the one byte lower than MSB)
= 16.21916852046 (Actual: 16.2201704643)
For base 10 digits, it's log2([mostSignificantDigit]*10 + [secondMostSignificantDigit]) + 3.32192809*[remainingDigitCount].
If performance is still an issue, you can use lookup tables for the log2 instead of using a full logarithm function.
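To make the refined approach concrete, here is a minimal sketch of my own (not the answerer's code); it assumes the big number is stored as a vector of base-256 digits with the most significant byte first, and the name approxLog2 is invented for the example:
#include <cmath>
#include <cstdint>
#include <vector>

// Approximate log2 of a big number stored as base-256 digits,
// most significant byte first. Returns -1 for zero (log2 undefined).
double approxLog2(const std::vector<uint8_t>& bytes) {
    // Skip leading zero bytes.
    std::size_t msb = 0;
    while (msb < bytes.size() && bytes[msb] == 0)
        ++msb;
    if (msb == bytes.size())
        return -1.0;

    // Fold the top two bytes into a single value for better accuracy.
    unsigned top = bytes[msb];
    std::size_t remaining = bytes.size() - msb - 1;
    if (remaining > 0) {
        top = top * 256u + bytes[msb + 1];
        --remaining;                      // that byte is now part of 'top'
    }

    // Every remaining byte contributes a factor of 256 = 2^8.
    return std::log2(static_cast<double>(top)) + 8.0 * remaining;
}
For the example above, approxLog2({1, 42, 53}) gives log2(298) + 8 ≈ 16.219, matching the worked figures.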

I assume you want to know how to compute the logarithm "by hand", so I'll tell you what I've found on that.
Have a look over here, where it is described how to compute logarithms by hand; you can implement that as an algorithm. There is also an article from "How Euler Did It", and I find this article promising as well.
I suppose there are more sophisticated methods to do this, but they are so involved you probably don't want to implement them.

Related

C++: Binary to Decimal Conversion

I am trying to convert a binary array to decimal in the following way:
uint8_t array[8] = {1,1,1,1,0,1,1,1};
int decimal = 0;

for (int i = 0; i < 8; i++)
    decimal = (decimal << 1) + array[i];
Actually I have to convert a 64-bit binary array to decimal, and I have to do it millions of times.
Can anybody help me: is there any faster way to do the above? Or is the above one fine?
Your method is adequate; to call it nice, I would just not mix bitwise operations with the "mathematical" way of converting to decimal, i.e. use either
decimal = decimal << 1 | array[i];
or
decimal = decimal * 2 + array[i];
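For illustration only (my addition, not part of the answer), here is a self-contained sketch showing that both forms give the same value for the array in the question:
#include <cstdint>
#include <iostream>

int main() {
    uint8_t array[8] = {1, 1, 1, 1, 0, 1, 1, 1};

    int shifted = 0, summed = 0;
    for (int i = 0; i < 8; i++) {
        shifted = shifted << 1 | array[i];   // bitwise form
        summed = summed * 2 + array[i];      // arithmetic form
    }

    std::cout << shifted << " " << summed << "\n";   // both print 247
    return 0;
}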
It is important, before attempting any optimisation, to profile the code. Time it, look at the code being generated, and optimise only when you understand what is going on.
And as already pointed out, the best optimisation is to not do something, but to make a higher level change that removes the need.
However...
Most changes you might want to trivially make here, are likely to be things the compiler has already done (a shift is the same as a multiply to the compiler). Some may actually prevent the compiler from making an optimisation (changing an add to an or will restrict the compiler - there are more ways to add numbers, and only you know that in this case the result will be the same).
Pointer arithmetic may be better, but the compiler is not stupid - it ought to already be producing decent code for dereferencing the array, so you need to check that you have not in fact made matters worse by introducing an additional variable.
In this case the loop count is well defined and limited, so unrolling probably makes sense.
Furthermore, it depends on how dependent you want the result to be on your target architecture. If you want portability, it is hard(er) to optimise.
For example, the following produces better code here:
unsigned int x0 = *(unsigned int *)array;
unsigned int x1 = *(unsigned int *)(array+4);
int decimal = ((x0 * 0x8040201) >> 20) + ((x1 * 0x8040201) >> 24);
I could probably also roll a 64-bit version that did 8 bits at a time instead of 4.
But it is very definitely not portable code. I might use that locally if I knew what I was running on and I just wanted to crunch numbers quickly. But I probably wouldn't put it in production code. Certainly not without documenting what it did, and without the accompanying unit test that checks that it actually works.
The binary 'compression' can be generalized as a problem of weighted sum -- and for that there are some interesting techniques.
X mod 255 essentially means summing all of the independent 8-bit numbers.
X mod 254 means summing each digit with a doubling weight, since 1 mod 254 = 1, 256 mod 254 = 2, 256*256 mod 254 = 2*2 = 4, etc.
If the encoding was big endian, then *(unsigned long long *)array % 254 would produce a weighted sum (with a truncated range of 0..253). Then removing the value with weight 2 and adding it manually would produce the correct result:
uint64_t a = *(uint64_t *)array;
return (a & ~256) % 254 + ((a >> 7) & 2);
Another mechanism to get the weights is to premultiply each binary digit by 255 and mask the correct bit:
uint64_t a = (*(uint64_t *)array * 255) & 0x0102040810204080ULL; // little endian
uint64_t a = (*(uint64_t *)array * 255) & 0x8040201008040201ULL; // big endian
In both cases one can then take the remainder of 255 (and correct now with weight 1):
return (a & 0x00ffffffffffffff) % 255 + (a>>56); // little endian, or
return (a & ~1) % 255 + (a&1);
For the sceptical mind: I actually did profile the modulus version to be (slightly) faster than iteration on x64.
To continue from the answer of JasonD, parallel bit selection can be iteratively utilized.
But first expressing the equation in full form would help the compiler to remove the artificial dependency created by the iterative approach using accumulation:
ret = ((a[0]<<7) | (a[1]<<6) | (a[2]<<5) | (a[3]<<4) |
(a[4]<<3) | (a[5]<<2) | (a[6]<<1) | (a[7]<<0));
vs.
uint32_t HI = *(uint32_t *)array, LO = *(uint32_t *)&array[4];
LO |= (HI<<4); // The HI dword has a weight 16 relative to Lo bytes
LO |= (LO>>14); // High word has 4x weight compared to low word
LO |= (LO>>9); // high byte has 2x weight compared to lower byte
return LO & 255;
One more interesting technique would be to utilize crc32 as a compression function; then it just happens that the result would be LookUpTable[crc32(array) & 255]; as there is no collision with this given small subset of 256 distinct arrays. However to apply that, one has already chosen the road of even less portability and could as well end up using SSE intrinsics.
You could use std::accumulate with a doubling-and-adding binary operation (note that it needs an initial value of 0):
int doubleSumAndAdd(const int& sum, const int& next) {
    return (sum * 2) + next;
}

int decimal = std::accumulate(array, array + ARRAY_SIZE, 0, doubleSumAndAdd);
Note that std::accumulate applies the operation from left to right, so like the OP's code this treats array[0] as the most significant bit.
Try this; I have used it to convert binary numbers of up to 1020 bits:
#include <sstream>
#include <string>
#include <math.h>
#include <iostream>

using namespace std;

long binary_decimal(string num) /* Function to convert binary to decimal */
{
    long dec = 0, n = 1, exp = 0;
    string bin = num;
    if (bin.length() > 1020) {
        cout << "Binary digit too large" << endl;
    }
    else {
        for (int i = bin.length() - 1; i > -1; i--)
        {
            n = pow(2, exp++);
            if (bin.at(i) == '1')
                dec += n;
        }
    }
    return dec;
}
In theory this method works for a binary number of any length, although in practice the result still has to fit in a long.

How can I represent the number 2^1000 in C++? [duplicate]

So, I was trying to do problem # 16 on Project Euler, from http://projecteuler.net if you haven't seen it. It is as follows:
2^15 = 32768 and the sum of its digits is 3 + 2 + 7 + 6 + 8 = 26.
What is the sum of the digits of the number 2^1000?
I am having trouble figuring out how to represent the number 2^1000 in C++. I am guessing there is a trick to this, but I am really stuck. I don't want the answer to the problem itself; I just want to know how to represent that number as a variable, or, if there is a trick, perhaps someone could let me know?
Represent it as a string. That means you need to write two pieces of code:
You need to write a piece of code to double a number, given that number as a string.
You need to write a piece of code to sum the digits of a number represented as a string.
With those two pieces, it's easy.
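Here is a minimal sketch of that idea (my illustration, not the answerer's code); the helpers doubleDecimalString and digitSum are invented names, and the number is kept as a decimal string with the most significant digit first:
#include <algorithm>
#include <iostream>
#include <string>

// Double a non-negative decimal number held in a string, e.g. "512" -> "1024".
std::string doubleDecimalString(const std::string& number) {
    std::string result;
    int carry = 0;
    // Walk from the least significant digit (the end of the string).
    for (auto it = number.rbegin(); it != number.rend(); ++it) {
        int d = (*it - '0') * 2 + carry;
        carry = d / 10;
        result.push_back(static_cast<char>('0' + d % 10));
    }
    if (carry != 0)
        result.push_back(static_cast<char>('0' + carry));
    std::reverse(result.begin(), result.end());
    return result;
}

// Sum the decimal digits held in the string.
int digitSum(const std::string& number) {
    int sum = 0;
    for (char c : number)
        sum += c - '0';
    return sum;
}

int main() {
    std::string n = "1";                 // 2 ** 0
    for (int i = 0; i < 1000; ++i)       // double it 1000 times -> 2 ** 1000
        n = doubleDecimalString(n);
    std::cout << digitSum(n) << "\n";
    return 0;
}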
One good algorithm worth knowing for this problem:
2^1 = 2
2^2 = 2 x 2 = 2 + 2
2^3 = 2 x (2 x 2) = (2 + 2) + (2 + 2)
2^4 = 2 x [2 x ( 2 x 2)] = [(2 + 2) + (2 + 2)] + [(2 + 2) + (2 + 2)]
Thus we have a recursive definition for calculating a power of two in terms of the addition operation: just add together two of the previous power of two.
This link deals with this problem very well.
Here is a complete program. The digits are held in a vector.
#include <iostream>
#include <numeric>
#include <ostream>
#include <vector>

int main()
{
    std::vector<unsigned int> digits;
    digits.push_back(1); // 2 ** 0 = 1
    const int limit = 1000;
    for (int i = 0; i != limit; ++i)
    {
        // Invariant: digits holds the individual digits of the number 2 ** i
        unsigned int carry = 0;
        for (auto iter = digits.begin(); iter != digits.end(); ++iter)
        {
            unsigned int d = *iter;
            d = 2 * d + carry;
            carry = d / 10;
            d = d % 10;
            *iter = d;
        }
        if (carry != 0)
        {
            digits.push_back(carry);
        }
    }
    unsigned int sum = std::accumulate(digits.cbegin(), digits.cend(), 0U);
    std::cout << sum << std::endl;
    return 0;
}
The whole point of this problem is to come up with a way of doing this without actually calculating 2^1000.
However, if you do want to calculate 2^1000—which may be a good idea, because it's a great way to test whether your other algorithm is correct—you're going to want some kind of "bignum" library, such as gmp:
mpz_t two_to_1000;
mpz_init(two_to_1000);
mpz_ui_pow_ui(two_to_1000, 2, 1000);
Or you can use the C++ interface to gmp. It doesn't do exponentiation, so the first part gets slightly more complicated instead of less, but it makes the digit-summing simpler:
mpz_class two_to_1000;
mpz_ui_pow_ui(two_to_1000.get_mpz_t(), 2, 1000);
mpz_class digitsum(0);
while (two_to_1000 != 0) {
    digitsum += two_to_1000 % 10;
    two_to_1000 /= 10;
}
(There's actually no reason to make digitsum an mpz there, so you may want to figure out how to prove that the result will fit into 32 bits, add that as a comment, and just use a long for digitsum.)
All that being said, I probably wouldn't have written this gmp code to test it, when the whole thing is a one-liner in Python:
print(sum(map(int, str(2**1000))))
And, even though converting the bignum to a string to convert each digit to an int to sum them up is possibly the least efficient way to solve it, it still takes under 200us on the slowest machine I have here. And there's really no reason the double-check needs to be in the same language as the actual solution.
You'd need a 1000-bit machine integer to represent 2^1000; I've never heard of a machine with such. But there are a lot of big-integer packages around, which do the arithmetic over as many machine words as are needed. The simplest solution might be to use one of these. (Although given the particular operations you need, doing the arithmetic on a string, as David Schwartz suggested, might be appropriate. In the general case it's not a very good idea, but since all you're doing is multiplying by two and then taking the decimal digits, it might work out well.)
Since 2^10 is about 10^3, 2^1000 = (2^10)^100 ≈ (10^3)^100 = 10^300, roughly.
So allocate an array like
char digits[ 300 ]; // may be too few (2^1000 actually has 302 decimal digits)
and store a value between 0 .. 9 in each char.

determine power of number

I know that if a number is a power of two, then it must satisfy (x & (x-1)) == 0. For example, take x = 16, which is 10000 in binary: x-1 = 10000 - 1 = 01111, and (x & (x-1)) = 0. For a non-power such as 7: 7 = 0111, 7-1 = 0110, and 7 & (7-1) = 0110, which is not equal to 0. My question is: how can I determine whether a number is a power of another number k? For example, 625 is 5^4. And how can I find the power to which k must be raised to equal n? I am interested in using bitwise operators; sure, I can find it by brute force with a standard algorithm. Thanks a lot.
I doubt you're going to find a bitwise algorithm for determining that a number is a power of 5.
In general, given y = n^x, to find x, you need to use logarithms, i.e. x = log_n(y). Most languages don't offer a log_n function, but you can achieve it with the following identity:
log_n(y) = log(y) / log(n)
If y is an integer power of n, then x will be an integer. Of course, due to the limitations of finite-precision computer arithmetic, you won't necessarily get the exact answer with the method above.
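A rough sketch of how one might apply that (my illustration, not code from the answer): compute a candidate exponent with logarithms, then verify it with exact integer arithmetic to guard against the precision issue. The name isPowerOf is invented:
#include <cmath>
#include <iostream>

// Returns true and sets exponent if y == base^exponent for some integer exponent >= 0.
bool isPowerOf(unsigned long long y, unsigned long long base, unsigned int& exponent) {
    if (base < 2 || y == 0)
        return false;

    // Candidate exponent from logarithms; may be off by one due to rounding.
    double guess = std::log(static_cast<double>(y)) / std::log(static_cast<double>(base));
    unsigned int e = static_cast<unsigned int>(std::llround(guess));

    // Verify with exact integer arithmetic, trying e-1, e and e+1.
    for (unsigned int candidate = (e > 0 ? e - 1 : 0); candidate <= e + 1; ++candidate) {
        unsigned long long p = 1;
        bool tooBig = false;
        for (unsigned int i = 0; i < candidate; ++i) {
            if (p > y / base) { tooBig = true; break; }   // p * base would exceed y
            p *= base;
        }
        if (!tooBig && p == y) {
            exponent = candidate;
            return true;
        }
    }
    return false;
}

int main() {
    unsigned int e;
    std::cout << isPowerOf(625, 5, e) << " " << e << "\n";   // prints "1 4"
    return 0;
}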
I'm afraid you can't do that with just simple bit magic. Bits are typically good for powers of 2. For powers of, say, 5 you'd probably need to operate in a base-5 system, where 1 (base 5) = 1, 10 (base 5) = 5, 100 (base 5) = 25, 1000 (base 5) = 125, 10000 (base 5) = 625, etc. In a base-5 system you can recognize powers of 5 just as easily as you recognize powers of 2 in binary. But you'd first need to convert your numbers to that base.
For arbitrary k there is only the generic solution:
bool is_pow(unsigned long x, unsigned int base) {
    assert(base >= 2);
    if (x == 0) {
        return false;
    }
    unsigned long t = x;
    while (t % base == 0) {
        t /= base;
    }
    return t == 1;
}
When k is a power of two, you can speed things up by checking whether x is a power of two and whether the number of trailing zero bits of x is divisible by log2(k).
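For example, a small sketch of that special case (my illustration; it assumes GCC/Clang's __builtin_ctz and __builtin_ctzl intrinsics for counting trailing zero bits):
#include <cassert>

// Is x a power of k, where k itself is a power of two (e.g. k = 2, 4, 8, 16...)?
bool is_pow_of_power_of_two(unsigned long x, unsigned int k) {
    assert(k >= 2 && (k & (k - 1)) == 0);      // k must be a power of two
    if (x == 0 || (x & (x - 1)) != 0)
        return false;                          // x itself must be a power of two
    int log2k = __builtin_ctz(k);              // log2(k) for a power of two
    return __builtin_ctzl(x) % log2k == 0;     // trailing zeros of x divisible by log2(k)
}
For instance, 64 has 6 trailing zero bits and log2(4) = 2, so 64 is reported as a power of 4, while 32 (5 trailing zeros) is not.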
And if computational speed is important and your k is fixed, you can always use the trivial implementation:
bool is_pow5(unsigned long x) {
    if (x == 5 || x == 25 || x == 125 || x == 625)
        return true;
    if (x < 3125)
        return false;
    // you got the idea
    ...
}

Calculate how many ones in bits and bits inverting [duplicate]

Possible Duplicate:
How many 1s in an n-bit integer?
Hello
How do I calculate how many one bits a number has?
1100110 -> 4
101 -> 2
And second question:
How do I invert the bits?
1100110 -> 0011001
101 -> 010
Thanks
If you can get your bits into a std::bitset, you can use the flip method to invert, and the count method to count the bits.
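For instance, a small sketch of that approach (my illustration, not the answerer's):
#include <bitset>
#include <iostream>

int main() {
    std::bitset<7> bits("1100110");

    std::cout << bits.count() << "\n";   // 4 set bits
    std::cout << bits.flip() << "\n";    // 0011001 (all bits inverted)
    return 0;
}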
The book Hacker's Delight by Henry S Warren Jr. contains lots of useful little gems on computing this sort of thing - and lots else besides. Everyone who does low level bit twiddling should have a copy :)
The counting-1s section is 8 pages long!
One of them is:
int pop(unsigned x)
{
    x = x - ((x >> 1) & 0x55555555);
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    x = (x + (x >> 4)) & 0x0F0F0F0F;
    x = x + (x >> 8);
    x = x + (x >> 16);
    return x & 0x0000003F;
}
A potentially critical advantage compared to the looping options already presented is that the runtime is not variable. If it's inside a hard-real-time interrupt service routine this is much more important than "fastest-average-computation" time.
There's also a long thread on bit counting here:
How to count the number of set bits in a 32-bit integer?
You can loop while the number is non-zero, and increment a counter when the last bit is set. Or if you are working on Intel architecture, you can use the popcnt instruction in inline assembly.
int count_bit_set(unsigned int x) {
    int count = 0;
    while (x != 0) {
        count += (x & 1);
        x = x >> 1;
    }
    return count;
}
You use the ~ operator.
Counting bits: http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetNaive
Inverting bits: x = ~x;
For the first question, Fast Bit Counting has a few ways of doing it, the simplest being:
int bitcount(unsigned int n) {
    int count = 0;
    while (n) {
        count += n & 0x1u;
        n >>= 1;
    }
    return count;
}
For the second question, use the ~ (bitwise negation) operator.
To count the number of set bits in a number you can use the HAKMEM parallel counting trick, which is the fastest approach that does not use precomputed tables:
http://tekpool.wordpress.com/2006/09/25/bit-count-parallel-counting-mit-hakmem/
while inverting bits is really easy:
i = ~i;
A somewhat tricky (but faster) solution would be:
int setbitcount(unsigned int x)
{
    int result;
    for (result = 0; x; x &= x - 1, ++result)
        ;
    return result;
}
Compared to Sylvain's solution, this function loops only as many times as there are set bits. That is, for the number 1100110 it will do only 4 iterations (compared to 32 in Sylvain's algorithm).
The key is the expression x&=x-1, which will clear the least significant set bit. i.e.:
1) 1100110 & 1100101 = 1100100
2) 1100100 & 1100011 = 1100000
3) 1100000 & 1011111 = 1000000
4) 1000000 & 0111111 = 0
You can also invert bits by XOR'ing them with some number. For example, inverting a byte:
INVERTED_BYTE = BYTE ^ 0xFF
How do I calculate how many one bits a number has?
Hamming weight.
How do I invert the bits?
i = ~i;
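As a concrete illustration (my addition, not part of the answer): GCC and Clang provide a __builtin_popcount intrinsic, and C++20 added std::popcount in <bit>, either of which computes the Hamming weight directly, while ~ handles the inversion:
#include <bit>        // std::popcount (C++20)
#include <bitset>
#include <cstdint>
#include <iostream>

int main() {
    std::uint32_t x = 0b1100110;

    std::cout << std::popcount(x) << "\n";         // 4 set bits
    std::cout << __builtin_popcount(x) << "\n";    // 4 (GCC/Clang intrinsic)
    std::cout << std::bitset<7>(~x) << "\n";       // low 7 bits inverted: 0011001
    return 0;
}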

algorithm to figure out how many bytes are required to hold an int

Sorry for the stupid question, but how would I go about figuring out, mathematically or using C++, how many bytes it would take to store an integer?
If you mean from an information theory point of view, then the easy answer is:
log(number) / log(2)
(It doesn't matter if those are natural, binary, or common logarithms, because of the division by log(2), which calculates the logarithm with base 2.)
Round that down and add 1 to get the number of bits necessary to store your number.
If you're interested in how much memory is required for the efficient or usual encoding of your number in a specific language or environment, you'll need to do some research. :)
The typical C and C++ sizes for the integer types are:
char    1 byte
short   2 bytes
int     4 bytes
long    4 or 8 bytes, depending on the platform
If you're interested in arbitrary-sized integers, special libraries are available, and every library will have its own internal storage mechanism, but they'll typically store numbers via 4- or 8-byte chunks up to the size of the number.
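To make the first point concrete, here is a small sketch (my illustration, not part of the answer) that computes the minimum number of bits and bytes for an unsigned value:
#include <cstdint>
#include <iostream>

// Minimum number of bits needed to represent v (0 is treated as needing 1 bit).
unsigned bitsNeeded(std::uint64_t v) {
    unsigned bits = 0;
    do {
        ++bits;
        v >>= 1;
    } while (v != 0);
    return bits;
}

int main() {
    std::uint64_t v = 1000;
    unsigned bits = bitsNeeded(v);
    unsigned bytes = (bits + 7) / 8;   // round up to whole bytes
    std::cout << v << " needs " << bits << " bits, " << bytes << " bytes\n";
    // prints: 1000 needs 10 bits, 2 bytes
    return 0;
}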
You could find the first power of 2 that's larger than your number and divide its exponent by 8, then round the result up to the nearest integer. So for 1000, the next power of 2 is 1024, or 2^10; divide 10 by 8 to get 1.25, and round up to 2. You need two bytes to hold 1000!
If you mean "how large is an int" then sizeof(int) is the answer.
If you mean "how small a type can I use to store values of this magnitude" then that's a bit more complex. If you already have the value in integer form, then presumably it fits in 4, 3, 2, or 1 bytes. For unsigned values, if it's 16777216 or over you need 4 bytes, 65536-16777216 requires 3 bytes, 256-65535 needs 2, and 0-255 fits in 1 byte. The formula for this comes from the fact that each byte can hold 8 bits, and each bit holds 2 digits, so 1 byte holds 2^8 values, ie. 256 (but starting at 0, so 0-255). 2 bytes therefore holds 2^16 values, ie. 65536, and so on.
You can generalise that beyond the normal 4 bytes used for a typical int if you like. If you need to accommodate signed integers as well as unsigned, bear in mind that 1 bit is effectively used to store whether it is positive or negative, so the magnitude is 1 power of 2 less.
You can calculate the number of bits you need iteratively from an integer by dividing it by two and discarding the remainder. Each division you can make and still have a non-zero value means you have one more bit of data in use - and every 8 bits you're using means 1 byte.
A quick way of calculating this is to use the shift right function and compare the result against zero.
int value = 23534; // or whatever
int bits = 0;
while (value)
{
    value >>= 1;
    ++bits;
}
std::cout << "Bits used = " << bits << std::endl;
std::cout << "Bytes used = " << (bits + 7) / 8 << std::endl;
This is basically the same question as "how many binary digits would it take to store a number x?" All you need is the logarithm.
An n-bit integer can store numbers up to 2^n - 1. So, given a number x, ceil(log2(x + 1)) gets you the number of binary digits you need.
It's exactly the same as figuring out how many decimal digits you need to write a number by hand. For example, log10(123456 + 1) = 5.09151..., so ceil(log10(123456 + 1)) = 6, six digits.
Since nobody has put up the simplest code that works yet, I might as well do it:
unsigned int get_number_of_bytes_needed(unsigned int N) {
    unsigned int bytes = 0;
    while (N) {
        N >>= 8;
        ++bytes;
    }
    return bytes;
}
Assuming sizeof(long int) == 4:
int nbytes(long int x)
{
    unsigned long int n = (unsigned long int) x;
    if (n <= 0xFFFF)
    {
        if (n <= 0xFF) return 1;
        else return 2;
    }
    else
    {
        if (n <= 0xFFFFFF) return 3;
        else return 4;
    }
}
The shortest code way to do this is as follows:
int bytes = (int)Math.Log(num, 256) + 1;
The code is small enough to be inlined, which helps offset the "slow" FP code. Also, there are no branches, which can be expensive.
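That snippet appears to use .NET's Math.Log(value, base) rather than C++. A rough C++ equivalent (my sketch, with the usual floating-point caveats near exact powers of 256) would be:
#include <cmath>

// Approximate byte count via logarithms; assumes num > 0.
int bytesNeeded(unsigned int num) {
    return static_cast<int>(std::log(static_cast<double>(num)) / std::log(256.0)) + 1;
}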
Try this code:
// works for num >= 0
int numberOfBytesForNumber(int num) {
    if (num < 0)
        return 0;
    else if (num == 0)
        return 1;
    else {
        int n = 0;
        while (num != 0) {
            num >>= 8;
            n++;
        }
        return n;
    }
}
/**
 * assumes i is non-negative.
 * note that this returns 0 for 0, when perhaps it should be special cased?
 */
int numberOfBytesForNumber(int i) {
    int bytes = 0;
    long long div = 1;   // long long so the divisor doesn't overflow for large i
    while (i / div) {
        bytes++;
        div *= 256;
    }
    return bytes;
}
This code runs at 447 million tests / sec on my laptop where i = 1 to 1E9. i is a signed int:
n = (i > 0xffffff || i < 0) ? 4 : (i <= 0xffff) ? (i <= 0xff) ? 1 : 2 : 3;
Python example: no logs or exponents, just bit shift.
Note: 0 counts as 0 bits and only positive ints are valid.
from itertools import count


def bits(num):
    """Return the number of bits required to hold an int value."""
    if not isinstance(num, int):
        raise TypeError("Argument must be of type int.")
    if num < 0:
        raise ValueError("Argument cannot be less than 0.")
    for i in count(start=0):
        if num == 0:
            return i
        num = num >> 1