How can I represent the number 2^1000 in C++? [duplicate]

So, I was trying to do problem # 16 on Project Euler, from http://projecteuler.net if you haven't seen it. It is as follows:
2^15 = 32768 and the sum of its digits is 3 + 2 + 7 + 6 + 8 = 26.
What is the sum of the digits of the number 2^1000?
I am having trouble figuring out how to represent the number 2^1000 in C++. I am guessing there is a trick to this, but I am really stuck. I don't really want the answer to the problem, I just want to know how to represent that number as a variable, or if perhaps there is a trick, maybe someone could let me know?

Represent it as a string. That means you need to write two pieces of code:
You need to write a piece of code to double a number, given that number as a string.
You need to write a piece of code to sum the digits of a number represented as a string.
With those two pieces, it's easy.
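As a rough sketch of those two pieces (the function names below are illustrative, not from the answer), assuming the number is stored as ASCII digits with the most significant digit first:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Double a non-negative decimal number held as a string of ASCII digits
// (most significant digit first, as you would write it on paper).
std::string double_decimal(const std::string& number) {
    std::string result(number.size(), '0');
    int carry = 0;
    // Work from the least significant digit (the end of the string) backwards.
    for (std::size_t i = number.size(); i-- > 0;) {
        assert(number[i] >= '0' && number[i] <= '9');
        int d = (number[i] - '0') * 2 + carry;
        result[i] = static_cast<char>('0' + d % 10);
        carry = d / 10;
    }
    if (carry != 0)
        result.insert(result.begin(), static_cast<char>('0' + carry));
    return result;
}

// Sum the decimal digits of a number held as a string.
unsigned digit_sum(const std::string& number) {
    unsigned sum = 0;
    for (std::size_t i = 0; i < number.size(); ++i)
        sum += static_cast<unsigned>(number[i] - '0');
    return sum;
}

// Build 2^n by starting from "1" and doubling n times.
std::string power_of_two(unsigned n) {
    std::string value = "1";
    for (unsigned i = 0; i < n; ++i)
        value = double_decimal(value);
    return value;
}
```

With these, digit_sum(power_of_two(15)) gives 26, matching the example in the problem statement; calling it with 1000 answers the question.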

One good algorithm worth knowing for this problem:
2^1 = 2
2^2 = 2 x 2 = 2 + 2
2^3 = 2 x (2 x 2) = (2 + 2) + (2 + 2)
2^4 = 2 x [2 x ( 2 x 2)] = [(2 + 2) + (2 + 2)] + [(2 + 2) + (2 + 2)]
Thus we have a recursive definition for calculating a power of two in terms of the addition operation: just add together two of the previous power of two.
This link deals with this problem very well.

Here is a complete program. The digits are held in a vector.
#include <iostream>
#include <numeric>
#include <ostream>
#include <vector>
int main()
{
    std::vector<unsigned int> digits;
    digits.push_back(1); // 2 ** 0 = 1
    const int limit = 1000;
    for (int i = 0; i != limit; ++i)
    {
        // Invariant: digits holds the individual digits of the number 2 ** i
        unsigned int carry = 0;
        for (auto iter = digits.begin(); iter != digits.end(); ++iter)
        {
            unsigned int d = *iter;
            d = 2 * d + carry;
            carry = d / 10;
            d = d % 10;
            *iter = d;
        }
        if (carry != 0)
        {
            digits.push_back(carry);
        }
    }
    unsigned int sum = std::accumulate(digits.cbegin(), digits.cend(), 0U);
    std::cout << sum << std::endl;
    return 0;
}

The whole point of this problem is to come up with a way of doing this without actually calculating 2^1000.
However, if you do want to calculate 2^1000—which may be a good idea, because it's a great way to test whether your other algorithm is correct—you're going to want some kind of "bignum" library, such as gmp:
mpz_t two_to_1000;
mpz_init(two_to_1000); // an mpz_t must be initialized before it is used
mpz_ui_pow_ui(two_to_1000, 2, 1000);
Or you can use the C++ interface to gmp. It doesn't do exponentiation, so the first part gets slightly more complicated instead of less, but it makes the digit-summing simpler:
mpz_class two_to_1000;
mpz_ui_pow_ui(two_to_1000.get_mpz_t(), 2, 1000);
mpz_class digitsum(0);
while (two_to_1000 != 0) {
    digitsum += two_to_1000 % 10;
    two_to_1000 /= 10;
}
(There's actually no reason to make digitsum an mpz there, so you may want to figure out how to prove that the result will fit into 32 bits, add that as a comment, and just use a long for digitsum.)
All that being said, I probably wouldn't have written this gmp code to test it, when the whole thing is a one-liner in Python:
print(sum(map(int, str(2**1000))))
And, even though converting the bignum to a string to convert each digit to an int to sum them up is possibly the least efficient way to solve it, it still takes under 200us on the slowest machine I have here. And there's really no reason the double-check needs to be in the same language as the actual solution.

You'd need a 1001-bit machine integer to represent 2^1000; I've never heard of a machine with such a type. But there are a lot of big integer packages around, which do the arithmetic over as many machine words as are needed. The simplest solution might be to use one of these. (Although given the particular operations you need, doing the arithmetic on a string, as David Schwartz suggested, might be appropriate. In the general case, it's not a very good idea, but since all you're doing is multiplying by two, and then taking the decimal digits, it might work out well.)

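2^10 is about 10^3, so 2^1000 = (2^10)^100 is about (10^3)^100 = 10^300.
Because 2^10 is slightly more than 10^3, the true value is slightly larger: 2^1000 has 302 decimal digits. So allocate an array like
char digits[ 302 ]; // 300 would be too few
and store a value between 0 .. 9 in each char.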
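The size estimate can be computed with a logarithm instead of guessed. This is a small sketch (the function name is made up for illustration); it relies on double arithmetic, so the assertion restricts it to modest exponents where rounding is not a concern:

```cpp
#include <cassert>
#include <cmath>

// Number of decimal digits in 2^n, i.e. floor(n * log10(2)) + 1.
unsigned decimal_digits_of_power_of_two(unsigned n) {
    // Assumption: keep n small enough that the double computation is
    // comfortably precise for the purpose of sizing a buffer.
    assert(n <= 1000000u);
    return static_cast<unsigned>(std::floor(n * std::log10(2.0))) + 1;
}
```

For n = 1000 this gives 302, so a 300-element array would overflow by two digits.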

Related

How do I sum up big integers that don't fit into unsigned long long? C++

I've created a program that sums up all possible individual sub-strings of given data. For example:
1 1 2 2 should return 30 because,
1
1 + 1
1 + 1 + 2
1 + 1 + 2 + 2
1
1 + 2
1 + 2 + 2
2
2 + 2
2
These sum to 30. The problem isn't writing such a program; the issue is handling big numbers (up to 10^15) when there can be as many as 10^5 of them. So my question is: how do I deal with such numbers? I can only use the standard library, so no GMP for me unfortunately, and I'm also forced to run on GCC 4.4.4, which makes it even worse.
I think it would be pretty easy to do a bare-bones (only what you need) implementation of a UINT_128. (your maximum answer would actually fit in a uint_96)
Assuming I know the way you are going to implement the program, all you need is a constructor that takes a uint64_t, an addition operator, a multiplication operator, and possibly a way to display the result.
I would store it internally as 4 uint32_t (words) and the operations could be implemented by just operating on the words the same way addition or long multiplication is done for decimals by hand. Multiplication could be simplified because I believe your factors would never exceed a uint64_t so you could take advantage of that.
If you need to display something other than binary or hex, you probably need to implement division also (or if you are really lazy you could get away with a digit by digit binary search style guess and check using only multiplication to get the decimal representation).
For anyone interested in the answer: I figured it out by counting the overflows in a separate unsigned long long int.
for(int i = 1; i <= n; i++){
    // Note: this only works while each product stays below 2^64 - 10^19 (about 8.4 * 10^18)
    SUM += addends[i - 1] * (i * (n - i + 1));
    if(SUM > 10000000000000000000ULL){ // If close to overflow of unsigned long long int
        SUM -= 10000000000000000000ULL; // Remove 10^19
        counter++; // And boost counter, to make recreating true value possible
    }
}
Probably really inefficient, but it's good enough for my purposes. Thank you all for your help!
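For comparison, here is a minimal sketch of the word-based approach suggested in the earlier answer, written to stay within what GCC 4.4.4 accepts. The U128 struct and the helper names are hypothetical, and it uses two 64-bit words, not four 32-bit ones, purely to keep the example short. Decimal output is left out.

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical minimal 128-bit unsigned value: hi * 2^64 + lo.
struct U128 {
    uint64_t hi;
    uint64_t lo;
};

// Full 64 x 64 -> 128 bit product, computed on 32-bit halves the same way
// long multiplication is done by hand.
U128 mul64(uint64_t a, uint64_t b) {
    const uint64_t MASK = 0xFFFFFFFFu;
    uint64_t a_lo = a & MASK, a_hi = a >> 32;
    uint64_t b_lo = b & MASK, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo; // contributes at bit 0
    uint64_t p1 = a_lo * b_hi; // contributes at bit 32
    uint64_t p2 = a_hi * b_lo; // contributes at bit 32
    uint64_t p3 = a_hi * b_hi; // contributes at bit 64

    // Middle column: three terms below 2^32 each, so this cannot overflow.
    uint64_t mid = (p0 >> 32) + (p1 & MASK) + (p2 & MASK);

    U128 r;
    r.lo = (p0 & MASK) | (mid << 32);
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}

// acc += x, propagating the carry out of the low word.
void add_to(U128& acc, U128 x) {
    acc.lo += x.lo;
    acc.hi += x.hi + (acc.lo < x.lo ? 1 : 0); // low word wrapped => carry
}

// Sum of addends[i-1] * i * (n - i + 1) for i = 1..n, as in the loop above.
U128 weighted_sum(const uint64_t* addends, uint64_t n) {
    assert(n <= 0xFFFFFFFFu); // so that i * (n - i + 1) fits in 64 bits
    U128 sum = {0, 0};
    for (uint64_t i = 1; i <= n; ++i)
        add_to(sum, mul64(addends[i - 1], i * (n - i + 1)));
    return sum;
}
```

For the input 1 1 2 2 this returns hi = 0 and lo = 30, matching the hand-computed total in the question.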

C++: Binary to Decimal Conversion

I am trying to convert a binary array to decimal in following way:
uint8_t array[8] = {1,1,1,1,0,1,1,1} ;
int decimal = 0 ;
for(int i = 0 ; i < 8 ; i++)
decimal = (decimal << 1) + array[i] ;
Actually, I have to convert a 64-bit binary array to decimal, and I have to do it a million times.
Can anybody help me? Is there a faster way to do the above, or is the above approach fine?
Your method is adequate. To call it nice, I would just avoid mixing bitwise operations with the "mathematical" way of converting to decimal, i.e. use either
decimal = decimal << 1 | array[i];
or
decimal = decimal * 2 + array[i];
It is important, before attempting any optimisation, to profile the code. Time it, look at the code being generated, and optimise only when you understand what is going on.
And as already pointed out, the best optimisation is to not do something, but to make a higher level change that removes the need.
However...
Most changes you might want to trivially make here, are likely to be things the compiler has already done (a shift is the same as a multiply to the compiler). Some may actually prevent the compiler from making an optimisation (changing an add to an or will restrict the compiler - there are more ways to add numbers, and only you know that in this case the result will be the same).
Pointer arithmetic may be better, but the compiler is not stupid - it ought to already be producing decent code for dereferencing the array, so you need to check that you have not in fact made matters worse by introducing an additional variable.
In this case the loop count is well defined and limited, so unrolling probably makes sense.
Furthermore, it depends on how dependent you want the result to be on your target architecture. If you want portability, it is hard(er) to optimise.
For example, the following produces better code here:
unsigned int x0 = *(unsigned int *)array;
unsigned int x1 = *(unsigned int *)(array+4);
int decimal = ((x0 * 0x8040201) >> 20) + ((x1 * 0x8040201) >> 24);
I could probably also roll a 64-bit version that did 8 bits at a time instead of 4.
But it is very definitely not portable code. I might use that locally if I knew what I was running on and I just wanted to crunch numbers quickly. But I probably wouldn't put it in production code. Certainly not without documenting what it did, and without the accompanying unit test that checks that it actually works.
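To see why the multiply works, here is a sketch that assembles the 32-bit word from the bytes explicitly, so it does not depend on endianness or on pointer casts. That probably gives up much of the speed advantage, so treat it as an illustration of the trick and not as the answer's code; pack4, pack8 and pack8_loop are names invented here.

```cpp
#include <cassert>
#include <stdint.h>

// Pack four 0/1 bytes into a nibble, treating bits[0] as the most significant bit.
uint32_t pack4(const uint8_t* bits) {
    // Byte i sits at bit 8*i, exactly as a little-endian 32-bit load would place it.
    uint32_t x = static_cast<uint32_t>(bits[0])
               | static_cast<uint32_t>(bits[1]) << 8
               | static_cast<uint32_t>(bits[2]) << 16
               | static_cast<uint32_t>(bits[3]) << 24;
    // 0x08040201 has bits 0, 9, 18 and 27 set. Multiplying copies the word to
    // those offsets, which lines bits[0..3] up in bit positions 27..24
    // without any two copies colliding.
    return static_cast<uint32_t>(x * 0x08040201u) >> 24;
}

// Pack eight 0/1 bytes into one byte value.
uint32_t pack8(const uint8_t* bits) {
    return (pack4(bits) << 4) | pack4(bits + 4);
}

// Reference implementation: the straightforward loop from the question.
uint32_t pack8_loop(const uint8_t* bits) {
    uint32_t value = 0;
    for (int i = 0; i < 8; ++i) {
        assert(bits[i] <= 1);
        value = (value << 1) | bits[i];
    }
    return value;
}
```

For the array {1,1,1,1,0,1,1,1} from the question, both pack8 and pack8_loop return 247.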
The binary 'compression' can be generalized as a problem of weighted sum -- and for that there are some interesting techniques.
X mod (255) means essentially summing of all independent 8-bit numbers.
X mod 254 means summing each digit with a doubling weight, since 1 mod 254 = 1, 256 mod 254 = 2, 256*256 mod 254 = 2*2 = 4, etc.
If the encoding was big endian, then *(unsigned long long *)array % 254 would produce a weighted sum (with truncated range of 0..253). Then removing the value with weight 2 and adding it manually would produce the correct result:
uint64_t a = *(uint64_t *)array;
return (a & ~256) % 254 + ((a>>9) & 2);
Other mechanism to get the weight is to premultiply each binary digit by 255 and masking the correct bit:
uint64_t a = (*(uint64_t *)array * 255) & 0x0102040810204080ULL; // little endian
uint64_t a = (*(uint64_t *)array * 255) & 0x8040201008040201ULL; // big endian
In both cases one can then take the remainder of 255 (and correct now with weight 1):
return (a & 0x00ffffffffffffff) % 255 + (a>>56); // little endian, or
return (a & ~1) % 255 + (a&1);
For the sceptical mind: I actually did profile the modulus version to be (slightly) faster than iteration on x64.
To continue from the answer of JasonD, parallel bit selection can be iteratively utilized.
But first expressing the equation in full form would help the compiler to remove the artificial dependency created by the iterative approach using accumulation:
ret = ((a[0]<<7) | (a[1]<<6) | (a[2]<<5) | (a[3]<<4) |
(a[4]<<3) | (a[5]<<2) | (a[6]<<1) | (a[7]<<0));
vs.
uint32_t HI = *(uint32_t *)array, LO = *(uint32_t *)&array[4];
LO |= (HI<<4); // The HI dword has a weight 16 relative to Lo bytes
LO |= (LO>>14); // High word has 4x weight compared to low word
LO |= (LO>>9); // high byte has 2x weight compared to lower byte
return LO & 255;
One more interesting technique would be to utilize crc32 as a compression function; then it just happens that the result would be LookUpTable[crc32(array) & 255]; as there is no collision with this given small subset of 256 distinct arrays. However to apply that, one has already chosen the road of even less portability and could as well end up using SSE intrinsics.
You could use accumulate, with a doubling and adding binary operation:
int doubleSumAndAdd(const int& sum, const int& next) {
return (sum * 2) + next;
}
int decimal = std::accumulate(array, array + ARRAY_SIZE,
    0, doubleSumAndAdd); // declared in <numeric>; the initial value 0 is required
This treats array[0] as the most significant bit, which is the same bit order the OP's loop uses.
Try this; it converts a string of binary digits to a long:
#include <cstddef>
#include <iostream>
#include <limits>
#include <string>
/* Convert a string of '0'/'1' characters (most significant bit first) to a long */
long binary_decimal(const std::string& bin)
{
    // A long has only numeric_limits<long>::digits value bits (63 on a typical 64-bit Linux system)
    if (bin.length() > static_cast<std::size_t>(std::numeric_limits<long>::digits)) {
        std::cout << "Binary number too large" << std::endl;
        return 0;
    }
    long dec = 0;
    for (std::size_t i = 0; i < bin.length(); ++i)
        dec = dec * 2 + (bin[i] == '1'); // shift in the next bit
    return dec;
}
The result has to fit in a long, so this does not work for binary numbers of arbitrary length; for more bits you need a big-integer type.

How to represent a number in base 2³²?

If I have some base 10 or base 16 number, how do I change it into base 2^32?
The reason I'm trying to do this is to implement BigInt, as suggested by other members here: "Why to use higher base for implementing BigInt?"
Will it be the same as an integer (base 10) up to 2^32? What will happen after that?
You are trying to find something of the form
a0 + a1 * (2^32) + a2 * (2^32)^2 + a3 * (2^32)^3 + ...
which is exactly the definition of a base-2^32 system, so ignore all the people that told you that your question doesn't make sense!
Anyway, what you are describing is known as base conversion. There are quick ways and there are easy ways to solve this. The quick ways are very complicated (there are entire chapters of books dedicated to the subject), and I'm not going to attempt to address them here (not least because I've never attempted to use them).
One easy way is to first implement two functions in your number system, multiplication and addition. (i.e. implement BigInt add(BigInt a, BigInt b) and BigInt mul(BigInt a, BigInt b)). Once you've solved that, you will notice that a base-10 number can be expressed as:
b0 + b1 * 10 + b2 * 10^2 + b3 * 10^3 + ...
which can also be written as:
b0 + 10 * (b1 + 10 * (b2 + 10 * (b3 + ...
so if you move left-to-right in your input string, you can peel off one base-10 digit at a time, and use your add and mul functions to accumulate into your BigInt:
BigInt a = 0;
for each digit b {
a = add(mul(a, 10), b);
}
Disclaimer: This method is not computationally efficient, but it will at least get you started.
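A concrete version of that loop might look like the following sketch. It assumes a hypothetical representation in which the BigInt is a vector of 32-bit "digits" with the least significant one first, and it fuses mul and add into one helper because the multiplier (10) and the addend (one decimal digit) are both small:

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>
#include <string>
#include <vector>

// Hypothetical representation: limbs[0] is the least significant base-2^32
// digit; an empty vector represents zero.
typedef std::vector<uint32_t> Limbs;

// In place: value = value * factor + addend.
void mul_add_small(Limbs& value, uint32_t factor, uint32_t addend) {
    uint64_t carry = addend;
    for (std::size_t i = 0; i < value.size(); ++i) {
        // 32 x 32 bit product plus a 32-bit carry always fits in 64 bits.
        uint64_t t = static_cast<uint64_t>(value[i]) * factor + carry;
        value[i] = static_cast<uint32_t>(t); // low 32 bits stay in this limb
        carry = t >> 32;                     // high 32 bits move to the next
    }
    if (carry != 0)
        value.push_back(static_cast<uint32_t>(carry));
}

// Parse a string of decimal digits, most significant digit first.
Limbs parse_decimal(const std::string& text) {
    Limbs value;
    for (std::size_t i = 0; i < text.size(); ++i) {
        assert(text[i] >= '0' && text[i] <= '9');
        mul_add_small(value, 10, static_cast<uint32_t>(text[i] - '0'));
    }
    return value;
}
```

For example, parse_decimal("4294967296") (that is, 2^32) yields the two limbs {0, 1}.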
Note: Converting from base-16 is much simpler, because 2^32 is an exact power of 16 (2^32 = 16^8). So the conversion basically comes down to concatenating bits.
Let's suppose that we are talking about a base-10 number:
a[0]*10^0 + a[1]*10^1 + a[2]*10^2 + a[3]*10^3 + ... + a[N]*10^N
where each a[i] is a digit in the range 0 to 9 inclusive.
I'm going to assume that you can parse the string that is your input value and find the array a[]. Once you can do that, and assuming that you have already implemented your BigInt class with the + and * operators, then you are home. You can simply evaluate the expression above with an instance of your BigInt class.
You can evaluate this expression relatively efficiently using Horner's method.
I've just written this down off the top of my head, and I will bet that there are much more efficient base conversion schemes.
If I have some base 10 or base 16 number, how do I change it into base 2^32?
Just like you convert it to any other base. You want to write the number n as
n = a_0 + a_1 * 2^32 + a_2 * 2^64 + a_3 * 2^96 + ... + a_k * 2^(32 * k).
So, find the largest power of 2^32 that is not greater than n, subtract off the multiple of that power from n, and repeat with the difference.
However, are you sure that you asked the right question?
I suspect that you mean to be asking a different question. I suspect that you mean to ask: how do I parse a base-10 number into an instance of my BigInteger? That's easy. Code up your implementation, and make sure that you've implemented + and *. I'm completely agnostic to how you actually internally represent integers, but if you want to use base 2^32, fine, do it. Then:
BigInteger Parse(string s) {
BigInteger b = new BigInteger(0);
foreach(char c in s) { b = b * 10 + (int)c - (int)'0'; }
return b;
}
I'll leave it to you to translate this to C.
Base 16 is easy, since 2^32 is 16^8, an exact power. So, starting from the least significant digit, read 8 base-16 digits at a time, convert those digits into a 32-bit value, and that is the next base-2^32 "digit".
Base 10 is more difficult. As you say, if it's less than 2^32, then you just take the value as a single base-2^32 "digit". Otherwise, the simplest method I can think of is to use the Long Division algorithm to repeatedly divide the base-10 value by 2^32; at each stage, the remainder is the next base-2^32 "digit". Perhaps someone who knows more number theory than me could provide a better solution.
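The base-16 case can be sketched as follows, under the assumption that the result is stored as a vector of 32-bit values with the least significant one first (Limbs, hex_value and parse_hex are names chosen for this example):

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>
#include <string>
#include <vector>

// Hypothetical representation: limbs[0] is the least significant base-2^32 digit.
typedef std::vector<uint32_t> Limbs;

// Value of one hexadecimal character.
uint32_t hex_value(char c) {
    if (c >= '0' && c <= '9') return static_cast<uint32_t>(c - '0');
    if (c >= 'a' && c <= 'f') return static_cast<uint32_t>(c - 'a' + 10);
    if (c >= 'A' && c <= 'F') return static_cast<uint32_t>(c - 'A' + 10);
    assert(false && "not a hexadecimal digit");
    return 0;
}

// Parse a hexadecimal string (most significant digit first) by taking
// groups of 8 hex digits, starting from the least significant end.
Limbs parse_hex(const std::string& text) {
    Limbs limbs;
    std::size_t end = text.size();
    while (end > 0) {
        std::size_t begin = end >= 8 ? end - 8 : 0;
        uint32_t limb = 0;
        for (std::size_t i = begin; i < end; ++i)
            limb = (limb << 4) | hex_value(text[i]); // 4 bits per hex digit
        limbs.push_back(limb);
        end = begin;
    }
    return limbs;
}
```

Here parse_hex("100000000") (nine digits, the value 2^32) produces {0, 1}: the low eight digits form the first limb and the leading 1 forms the second.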
I think this is a totally reasonable thing to do.
What you are doing is representing a very large number (like an encryption key) in an array of 32 bit integers.
A base 16 representation is base 2^4, or a series of 4 bits at a time. If you are receiving a stream of base 16 "digits", fill in the low 4 bits of the first integer in your array, then the next lowest, until you read 8 "digits". Then go to the next element in the array.
long getBase16()
{
char cCurr;
switch (cCurr = getchar())
{
case 'A':
case 'a':
return 10;
case 'B':
case 'b':
return 11;
...
default:
return cCurr - '0';
}
}
void read_input(long * plBuffer)
{
    long * plDst = plBuffer;
    int iPos = 32;
    *(++plDst) = 0x00;
    long lDigit;
    while (lDigit = getBase16())
    {
        if (!iPos)
        {
            *(++plDst) = 0x00;
            iPos = 32;
        }
        *plDst >>= 4; // make room for the next 4 bits
        iPos -= 4;
        *plDst |= (lDigit & 0x0F) << 28;
    }
}
There is some fix up to do, like ending by shifting *plDst by iPos, and keeping track of the number of integers in your array.
There is also some work to convert from base 10.
But this is enough to get you started.

Logarithm of the very-very large number

I have to find the logarithm of a very large number.
I am doing this in C++.
I have already written functions for multiplication, addition, subtraction and division, but I have run into problems with the logarithm. I do not need code; I need a simple idea of how to do it using these functions.
Thanks.
P.S.
Sorry, I forgot to tell you: I only have to find the binary logarithm of that number.
P.S.-2
I found in Wikipedia:
int floorLog2(unsigned int n) {
    if (n == 0)
        return -1;
    int pos = 0;
    if (n >= (1 << 16)) { n >>= 16; pos += 16; }
    if (n >= (1 << 8)) { n >>= 8; pos += 8; }
    if (n >= (1 << 4)) { n >>= 4; pos += 4; }
    if (n >= (1 << 2)) { n >>= 2; pos += 2; }
    if (n >= (1 << 1)) { pos += 1; }
    return pos;
}
If I adapt it for big numbers, will it work correctly?
I assume you're writing a bignum class of your own. If you only care about an integral result of log2, it's quite easy. Take the log of the most significant digit that's not zero, and add 8 for each byte after that one. This is assuming that each byte holds values 0-255. These are only accurate within ±.5, but very fast.
[0][42][53] (10805 in bytes)
log2(42) = 5
+ 8*1 = 8 (because of the one byte lower than MSB)
= 13 (Actual: 13.39941145)
If your values hold base 10 digits, that works out to log2(MSB)+3.32192809*num_digits_less_than_MSB.
[0][5][7][6][2] (5762)
log2(5) = 2.321928095
+ 3.32192809*3 = 9.96578427 (because 3 digits lower than MSB)
= 12.28771 (Actual: 12.49235395)
(only accurate for numbers with less than ~10 million digits)
If you used the algorithm you found on wikipedia, it will be IMMENSELY slow. (but accurate if you need decimals)
It's been pointed out that my method is inaccurate when the MSB is small (still within ±.5, but no farther), but this is easily fixed by simply shifting the top two bytes into a single number, taking the log of that, and doing the multiplication for the bytes less than that number. I believe this will be accurate within half a percent, and still significantly faster than a normal logarithm.
[1][42][53] (76341 in bytes)
log2(1*256+42) = ?
log2(298) = 8.21916852046
+ 8*1 = 8 (because of the one byte lower than MSB)
= 16.21916852046 (Actual: 16.2201704643)
For base 10 digits, it's log2( [mostSignificantDigit]*10+[secondMostSignifcantDigit] ) + 3.32192809*[remainingDigitCount].
If performance is still an issue, you can use lookup tables for the log2 instead of using a full logarithm function.
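The integer version of the byte-based idea can be sketched like this. It assumes the bignum is a vector of bytes stored least significant byte first (the reverse of how the examples above are written out), and the function names are invented for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>
#include <vector>

// floor(log2(b)) for a single byte, or -1 when b is zero.
int floor_log2_byte(uint8_t b) {
    int pos = -1;
    while (b != 0) {
        b >>= 1;
        ++pos;
    }
    assert(pos <= 7);
    return pos;
}

// floor(log2(n)) for a number stored as bytes, least significant byte first.
// Returns -1 when the number is zero.
long floor_log2(const std::vector<uint8_t>& bytes) {
    // Scan down from the most significant end to the first non-zero byte.
    for (std::size_t i = bytes.size(); i-- > 0;) {
        if (bytes[i] != 0)
            return 8 * static_cast<long>(i) + floor_log2_byte(bytes[i]);
    }
    return -1;
}
```

For the bytes {53, 42, 0} (the number 10805) this returns 13, the same integer part as in the first worked example.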
I assume you want to know how to compute the logarithm "by hand". So I tell you what I've found for this.
Have a look over here, where it is described how to logarithmize by hand. You can implement this as an algorithm. Here's an article by "How Euler did it". I also find this article promising.
I suppose there are more sophisticated methods to do this, but they are so involved you probably don't want to implement them.

Double precision in C++ (or pow(2, 1000))

I'm working on Project Euler to brush up on my C++ coding skills in preparation for the programming challenge(s) we'll be having this next semester (since they don't let us use Python, boo!).
I'm on #16, and I'm trying to find a way to keep real precision for 2^1000.
For instance:
#include <cmath>
#include <cstdio>
int main(){
    double num = pow(2, 1000);
    printf("%.0f", num);
    return 0;
}
prints
10715086071862673209484250490600018105614050000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Which is missing most of the numbers (from python):
>>> 2**1000
10715086071862673209484250490600018105614048117055336074437503883703510511249361224931983788156958581275946729175531468251871452856923140435984577574698574803934567774824230985421074605062371141877954182153046474983581941267398767559165543946077062914571196477686542167660429831652624386837205668069376L
Granted, I can write the program with a Python 1 liner
sum(int(_) for _ in str(2**1000))
that gives me the result immediately, but I'm trying to find a way to do it in C++. Any pointers? (haha...)
Edit:
Something outside the standard libs is worthless to me - only dead-tree code is allowed in those contests, and I'm probably not going to print out 10,000 lines of external code...
If you just keep track of each digit in a char array, this is easy. Doubling a digit is trivial, and if the result is 10 or greater you just subtract 10 and add a carry to the next digit. Start with a value of 1, loop over the doubling function 1000 times, and you're done. You can predict the number of digits you'll need with ceil(1000*log(2)/log(10)), or just add them dynamically.
Spoiler alert: it appears I have to show the code before anyone will believe me. This is a simple implementation of a bignum with two functions, Double and Display. I didn't make it a class in the interest of simplicity. The digits are stored in a little-endian format, with the least significant digit first.
#include <iostream>
#include <vector>
typedef std::vector<char> bignum;
void Double(bignum & num)
{
    int carry = 0;
    for (bignum::iterator p = num.begin(); p != num.end(); ++p)
    {
        *p *= 2;
        *p += carry;
        carry = (*p >= 10);
        *p -= carry * 10;
    }
    if (carry != 0)
        num.push_back(carry);
}
void Display(bignum & num)
{
    for (bignum::reverse_iterator p = num.rbegin(); p != num.rend(); ++p)
        std::cout << static_cast<int>(*p);
}
int main(int argc, char* argv[])
{
    bignum num;
    num.push_back(1);
    for (int i = 0; i < 1000; ++i)
        Double(num);
    Display(num);
    std::cout << std::endl;
    return 0;
}
You need a bignum library, such as this one.
You probably need a pointer here (pun intended)
In C++ you would need to create your own bigint lib in order to do the same as in python.
C/C++ operates on fundamental data types. You are using a double, which has only 64 bits to store a 1000-bit number. A double uses 52 bits for the significand (plus one implicit leading bit), 11 bits for the exponent, and 1 bit for the sign.
The only solution for you is to either use a library like bignum mentioned elsewhere or to roll out your own.
UPDATE: I just browsed to the Euler Problem site and found that Problem 13 is about summing large integers. The iterated method can become very tricky after a short while, so I'd suggest to use the code from Problem #13 you should have already to solve this, because 2**N => 2**(N-1) + 2**(N-1)
Using bignums is cheating and not a solution. Also, you don't need to compute 2**1000 or anything like that to get to the result. I'll give you a hint:
Take the first few values of 2**N:
0 1 2 4 8 16 32 64 128 256 ...
Now write down for each number the sum of its digits:
1 2 4 8 7 5 10 11 13 ...
You should notice that (x~=y means x and y have the same sum of digits)
1+1=2, 1+(1+2)=4, 1+(1+2+4)=8, 1+(1+2+4+8)=16~=7, 1+(1+2+4+8+7)=23~=5
Now write a loop.
Project Euler = Think before Compute!
If you want to do this sort of thing on a practical basis, you're looking for an arbitrary precision arithmetic package. There are a number around, including NTL, lip, GMP, and MIRACL.
If you're just after something for Project Euler, you can write your own code for raising to a power. The basic idea is to store your large number in quite a few small pieces, and implement your own carries, borrows, etc., between the pieces.
Isn't pow(2, 1000) just 1 left-shifted 1000 times, essentially? It should have an exact binary representation in a double float. It shouldn't require a bignum library.
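That claim can be checked with std::frexp, which splits a double into its mantissa and binary exponent. The helper names in this sketch are invented for the example:

```cpp
#include <cassert>
#include <cmath>

// True when x is exactly 2^k for some integer k. x must be positive.
bool is_exact_power_of_two(double x) {
    assert(x > 0.0);
    int exponent = 0;
    // frexp returns a mantissa with 0.5 <= mantissa < 1 such that
    // x == mantissa * 2^exponent. A power of two has mantissa exactly 0.5.
    double mantissa = std::frexp(x, &exponent);
    return mantissa == 0.5;
}

// For x == 2^k, returns k.
int power_of_two_exponent(double x) {
    assert(is_exact_power_of_two(x));
    int exponent = 0;
    std::frexp(x, &exponent);
    return exponent - 1; // 0.5 * 2^exponent == 2^(exponent - 1)
}
```

Applied to std::ldexp(1.0, 1000), the first function returns true and the second returns 1000, so the double does hold 2^1000 exactly. Whether printf("%.0f") then prints every digit depends on the C library; the run of zeros in the question's output likely comes from the formatting routine, not from the stored value.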