Is there a way to find sum of digits of 100!? - c++

I know there is a way of finding the sum of digits of 100! (or any other big number's factorial) using Python. But I find it really tough when it comes to C++, as even the size of long long is not enough.
I just want to know if there is some other way.
I get that it is not directly possible, since our processors are generally 32-bit. What I am asking about is some other kind of trick or algorithm that can accomplish the same thing using the same resources.

Use a digit array with the standard, on-paper method of multiplication. For example, in C:
#include <stdio.h>
#define DIGIT_COUNT 256
/* Multiply the little-endian digit array in place by a small factor,
   propagating carries just as in pencil-and-paper multiplication. */
void multiply(int* digits, int factor) {
    int carry = 0;
    for (int i = 0; i < DIGIT_COUNT; i++) {
        int digit = digits[i];
        digit *= factor;
        digit += carry;
        digits[i] = digit % 10;
        carry = digit / 10;
    }
}
int main(int argc, char** argv) {
    int n = 100;
    int digits[DIGIT_COUNT];
    digits[0] = 1;
    for (int i = 1; i < DIGIT_COUNT; i++) { digits[i] = 0; }
    for (int i = 2; i <= n; i++) { multiply(digits, i); } /* i must run up to and including n to get n! */
    int digitSum = 0;
    for (int i = 0; i < DIGIT_COUNT; i++) { digitSum += digits[i]; }
    printf("Sum of digits in %d! is %d.\n", n, digitSum);
    return 0;
}

How are you going to find the sum of the digits of 100!? If you calculate 100! first and then find the sum, what is the point? You will have to use some intelligent logic to find it without actually calculating 100!. Remove all the factors of five, since paired with factors of two they only add trailing zeros. Think in this direction rather than about the big number itself. Also, I am sure the final answer, i.e. the sum of the digits, will fit within a long long.
There are C++ big int libraries, but I think the emphasis here is on algorithm rather than library.

long long was not part of standard C++ before C++11; g++ provided it as an extension.
Arbitrary-precision arithmetic is what you are looking for. Check out the pseudocode given on the wiki page.
Furthermore, long long cannot store such large values anyway, so you can either create your own BigInteger class or use a third-party library like GMP or C++ BigInteger.

If you're referring to the Project Euler problem, my reading of that is that it wants you to write your own arbitrary-precision integer library or class that can multiply numbers.
My suggestion is to store the base-10 digits of a number, in reverse order to the way you'd normally write them, because you'll need to convert the number to base 10 in the end, anyway. Storing the digits in reverse order makes writing the addition and multiplication routines slightly easier, in my opinion. Then write addition and multiplication routines that emulate how you would add or multiply numbers manually.
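As a rough illustration of that layout (a sketch, not code from the question): 1234 would be stored as {4, 3, 2, 1}, least significant digit first, and addition then walks both vectors upward from index 0.
#include <cstdio>
#include <vector>
using Digits = std::vector<int>;   // base-10 digits, least significant first
Digits add(const Digits& a, const Digits& b) {
    Digits sum;
    int carry = 0;
    for (std::size_t i = 0; i < a.size() || i < b.size() || carry; ++i) {
        int cur = carry;
        if (i < a.size()) cur += a[i];
        if (i < b.size()) cur += b[i];
        sum.push_back(cur % 10);   // keep one digit
        carry = cur / 10;          // pass the rest along
    }
    return sum;
}
int main() {
    Digits x = {4, 3, 2, 1};       // 1234
    Digits y = {8, 7, 6};          // 678
    Digits z = add(x, y);          // 1912, stored as {2, 1, 9, 1}
    for (std::size_t i = z.size(); i-- > 0; ) std::printf("%d", z[i]);
    std::printf("\n");
    return 0;
}
Multiplication follows the same pattern: multiply by one digit at a time, shift by inserting zeros at the low end of the row, and add the rows with a routine like the one above.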

Observe that multiplying any number by 10 or 100 does not change the sum of the digits.
Once you recognize that, see that multiplying by 2 and 5, or by 20 and 50, also does not change the sum, since 2x5 = 10 and 20x50 = 1000.
Then notice that anytime your current computation ends in a 0, you can simply divide by 10, and keep calculating your factorial.
Make a few more observations about shortcuts to eliminate numbers from 1 to 100, and I think you might be able to fit the answer into standard ints.

There are a number of BigInteger libraries available in C++. Just Google "C++ BigInteger". But if this is a programming contest problem, you are better off implementing your own BigInteger library.

Nothing in Project Euler requires more than __int64.
I would suggest trying to do it using base 10000.

You could take the easy road and use Perl/Python/Lisp/Scheme/Erlang/etc. to calculate 100! using one of their built-in bignum libraries, or the fact that some languages use exact integer arithmetic. Then take that number, store it in a string, and find the sum of the characters (accounting for '0' = 48, etc.).
Or, you could consider that in 100!, you will get a really large number with many many zeros. If you calculate 100! iteratively, consider dividing by 10 every time the current factorial is divisible by 10. I believe this will yield a result within the range of long long or something.
Or, probably a better exercise is to write your own big int library. You will need it for some later problems if you do not determine the clever tricks.

Related

Is there a good way to optimize the multiplication of two BigNums?

I have a class BigNum:
struct BigNum{
    vector <int> digits;
    BigNum(vector <int> data){
        for(int item : data){ digits.push_back(item); }
    }
    int get_digit(size_t index){
        return (index >= digits.size() ? 0 : digits[index]);
    }
};
and I'm trying to write code to multiply two BigNums. Currently, I've been using the traditional method of multiplication, which is multiplying the first number by each digit of the other and adding it to a running total. Here's my code:
BigNum add(BigNum a, BigNum b){ // traditional adding: goes digit by digit and keeps a "carry" variable
    vector <int> ret;
    int carry = 0;
    for(size_t i = 0; i < max(a.digits.size(), b.digits.size()); ++i){
        int curr = a.get_digit(i) + b.get_digit(i) + carry;
        ret.push_back(curr%10);
        carry = curr/10;
    }
    // leftover from carrying values
    while(carry != 0){
        ret.push_back(carry%10);
        carry /= 10;
    }
    return BigNum(ret);
}
BigNum mult(BigNum a, BigNum b){
    BigNum ret({0});
    for(size_t i = 0; i < a.digits.size(); ++i){
        vector <int> row(i, 0); // account for the zeroes at the end of each row
        int carry = 0;
        for(size_t j = 0; j < b.digits.size(); ++j){
            int curr = a.digits[i] * b.digits[j] + carry;
            row.push_back(curr%10);
            carry = curr/10;
        }
        while(carry != 0){ // leftover from carrying
            row.push_back(carry%10);
            carry /= 10;
        }
        ret = add(ret, BigNum(row)); // add the current row to our running sum
    }
    return ret;
}
This code still works pretty slowly; it takes around a minute to calculate the factorial of 1000. Is there a better way to multiply two BigNums? If not, is there a better way to represent large numbers that will speed up this code?
If you use a different base, say 2^16 instead of 10, the multiplication will be much faster.
But converting the result to decimal for printing will take longer.
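To make the trade-off concrete, here is a rough sketch (assumed names and layout) of what decimal output costs with base-2^16 limbs: every decimal digit produced takes a full division pass over the number.
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>
// limbs are little-endian, base 2^16
std::string to_decimal(std::vector<std::uint16_t> limbs) {
    std::string out;
    while (!limbs.empty()) {
        std::uint32_t rem = 0;
        // divide the whole number by 10, most significant limb first
        for (auto i = limbs.size(); i-- > 0; ) {
            std::uint32_t cur = (rem << 16) | limbs[i];
            limbs[i] = static_cast<std::uint16_t>(cur / 10);
            rem = cur % 10;
        }
        out.push_back(static_cast<char>('0' + rem));   // one decimal digit per pass
        while (!limbs.empty() && limbs.back() == 0)
            limbs.pop_back();                          // drop limbs that are now zero at the top
    }
    if (out.empty()) out = "0";
    std::reverse(out.begin(), out.end());
    return out;
}
int main() {
    std::cout << to_decimal({0x2345, 0x0001}) << "\n";   // 0x12345 = 74565
    return 0;
}
With a power-of-ten base (10 or 10,000) this conversion is trivial, which is part of the appeal of the base-10,000 suggestion below.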
Get a ready made bignum library. Those tend to be optimized to death, all the way down to specific CPU models, with assembly where necessary.
GMP and MPIR are two popular ones. The latter is more Windows friendly.
One way is to use a larger base than ten. It's a huge waste, in both time and space, to take an int, able to hold values up to about four billion (unsigned variant) and use it to store single digits.
What you can do is use unsigned int/long values for a start, then choose a base such that the square of that base will fit into the value. So, for example, the square root of the largest 32-bit unsigned int is a touch over 65,000 so you choose 10,000 as the base.
So a "bigdigit" (I'll use that term for a digit in the base-10,000 scheme, is effectively equal to four decimal digits (just digits from here on), and this has several effects:
much less space taken up (about 1/1,000th of the space);
still no chance of overflow when you multiply four-digit groups.
faster multiplications, doing four digits at a time rather than one; and
still easy printing since it's in a base-ten-to-the-power-of-something format.
Those last two points warrant some explanation.
On the second last one, it should be something like sixteen times faster since, to multiply 1234 and 5678, each digit in the first has to be multiplied with every digit in the second. For a normal digit, that's sixteen multiplications, while it's only one for a bigdigit.
Since the bigdigits are exactly four digits, the output is still relatively easy, something like:
printf("%d", node[0]);
for (int i = 1; i < node_count; ++i) {
printf("%04d", node[0]);
}
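To make the bigdigit idea concrete, here is a minimal sketch (not the OP's code) of the mult routine reworked for base 10,000, keeping the same little-endian layout:
#include <cstdio>
#include <vector>
using Big = std::vector<unsigned int>;   // base-10,000 bigdigits, least significant first
const unsigned int BASE = 10000;
Big mult(const Big& a, const Big& b) {
    Big ret(a.size() + b.size(), 0);
    for (std::size_t i = 0; i < a.size(); ++i) {
        unsigned int carry = 0;
        for (std::size_t j = 0; j < b.size(); ++j) {
            // at most 9999*9999 + 9999 + 9999, comfortably within 32 bits
            unsigned int cur = ret[i + j] + a[i] * b[j] + carry;
            ret[i + j] = cur % BASE;
            carry = cur / BASE;
        }
        ret[i + b.size()] += carry;
    }
    while (ret.size() > 1 && ret.back() == 0) ret.pop_back();   // trim leading zero bigdigits
    return ret;
}
int main() {
    Big x = {5678, 1234};   // 12,345,678
    Big y = {4321, 8765};   // 87,654,321
    Big z = mult(x, y);
    std::printf("%u", z.back());
    for (auto i = z.size() - 1; i-- > 0; ) std::printf("%04u", z[i]);   // inner bigdigits padded to four digits
    std::printf("\n");   // prints 1082152022374638
    return 0;
}
Accumulating straight into ret also avoids building and re-adding a separate row vector for every digit of a, which removes a fair amount of allocation from the inner loop.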
Beyond that, and the normal C++ optimisations like passing const references rather than copying all objects, you can examine the same tricks used by MPIR and GMP. I tend to avoid them myself since they have (or did have at some point) a rather nasty habit of just violently exiting programs when they ran out of memory, something I find inexcusable in a general purpose library. In any case, I have routines built up over time that do, while nowhere near as much as GMP, certainly more than I need (and that use the same algorithms in many cases).
One of the tricks for multiplication is the Karatsuba algorithm (to be honest, I'm not sure if GMP/MPIR use this but, unless they've got something much better, I suspect they would).
It basically involves splitting each number into a high part and a low part, so that a is made up of a1 followed by a0, and b of b1 followed by b0. In other words:
a = a1 x B^p + a0
b = b1 x B^p + b0
The B^p is just some integral power of the actual base you're using, and can generally be chosen close to the square root of the larger number (so each part has about half as many digits).
You then work out:
c2 = a1 x b1
c0 = a0 x b0
c1 = (a1 + a0) x (b1 + b0) - c2 - c0
That last point is tricky but it has been proven mathematically. If you want to go into that level of depth, I'm not the best person for the job. At some point, even I, the consummate "don't believe anything you can't prove yourself" type, have to take the expert opinions as fact :-)
Then you work some add/shift magic (multiplication looks to be involved but, since it's multiplication by a power of the base, it's really just a matter of shifting values left).
c = c2 x B^(2p) + c1 x B^p + c0
Now you may be wondering why three multiplications is a better approach than one, but you need to take into account that these multiplications are using far fewer digits than the original. If you remember back to the comment I made above about doing one multiplication rather than sixteen when switching from base-10 to base-10,000, you'll realise the number of digit multiplications is proportional to the square of the number of digits.
That means it can be better to perform three smaller multiplications even with some extra shifting and adding. And the beauty of this solution is that you can recursively apply it to the smaller numbers until you get down to the point where you're just multiplying two unsigned int values.
I probably haven't done the concept justice, and you do need to watch for and adjust the case where c1 becomes negative but, if you want raw speed, this is the sort of thing you'll have to look into.
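As a sanity check of the identity, here is a sketch of one Karatsuba step done with machine words (B^p = 2^16) rather than digit arrays, just to keep it short:
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
int main() {
    uint32_t a = 0x12345678u, b = 0x9ABCDEF0u;
    uint32_t a1 = a >> 16, a0 = a & 0xFFFFu;   // a = a1 x 2^16 + a0
    uint32_t b1 = b >> 16, b0 = b & 0xFFFFu;   // b = b1 x 2^16 + b0
    uint64_t c2 = (uint64_t)a1 * b1;           // high product
    uint64_t c0 = (uint64_t)a0 * b0;           // low product
    uint64_t c1 = (uint64_t)(a1 + a0) * (b1 + b0) - c2 - c0;   // three multiplications in total
    // recombination is only shifts and adds: c = c2 x 2^32 + c1 x 2^16 + c0
    uint64_t c = (c2 << 32) + (c1 << 16) + c0;
    assert(c == (uint64_t)a * b);
    printf("%llu\n", (unsigned long long)c);
    return 0;
}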
And, as my more advanced math buddies will tell me (quite often), if you're not willing to have your entire head explode, you probably shouldn't be doing math :-)

Multiplication of big big integers found in string without using external libraries, C++

So I've been reading several topics about how GNU MP and other libraries are needed for multiplication of very large numbers, but I am working on something where I can't use large external libraries. Therefore, I went with the approach of reading the big numbers in as strings, and I was able to do what I wanted with them, but now I have come to the hard part: multiplication of two integers of up to 50 digits each. Is there any kind of code, around 200 lines or fewer, that I can drop into my own and that will allow me to do this? Or is there an easy way to implement this multiplication by splitting these numbers? Some thoughts or help would be greatly appreciated.
A way I have done this in the past is to create an array of integers for the arbitrarily sized numbers. 50 decimal digits require around 167 bits, or a little over 20 bytes. Round up to 24 bytes so it fits nicely in an array of 32-bit integers. Then convert your decimal strings into 24-byte (six-word) integers a and b. From there, you can effectively just do the multiplication the same way you were taught to do it by hand. Assuming you already have a function that will add these numbers, you can multiply them like so:
// unsigned words avoid sign issues when shifting and masking
uint32_t a[6];
uint32_t b[6];
// multiplication results require twice as much space as the operands
uint32_t result[12];
memset(result, 0, sizeof(result));
// one row of partial products per word of a; each row is 7 words wide
// so the final carry has somewhere to go
uint32_t temp_result[6][7];
for (int i = 0; i < 6; i++)
{
    uint32_t carry = 0;
    for (int j = 0; j < 6; j++)
    {
        // widen before multiplying so the 64-bit product is not truncated
        uint64_t product = (uint64_t)a[i] * b[j] + carry;
        temp_result[i][j] = (uint32_t)product;   // low 32 bits
        carry = (uint32_t)(product >> 32);       // high 32 bits carry into the next word
    }
    temp_result[i][6] = carry;
}
for (int i = 0; i < 6; i++)
{
    // source a, source b, result; row i is shifted left by i whole words
    add(temp_result[i], &result[i], &result[i]);
}
If you don't already have an add function, the process is very similar.
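For reference, one possible shape for such an add routine (a sketch; it takes an explicit word count, so the signature differs slightly from the three-argument call above):
#include <stdint.h>
// add two little-endian arrays of 32-bit words, with carry propagation
void add(const uint32_t *src_a, const uint32_t *src_b, uint32_t *dst, int words) {
    uint64_t carry = 0;
    for (int i = 0; i < words; i++) {
        uint64_t sum = (uint64_t)src_a[i] + src_b[i] + carry;
        dst[i] = (uint32_t)sum;   // low 32 bits
        carry = sum >> 32;        // at most 1
    }
    // a carry out of the top word has to be absorbed by the caller,
    // for example by sizing dst one word larger than the inputs
}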
mini-gmp (small single .c and .h file, part of the full gmp package).
Boost Multiprecision (not so small, but header-only).

Can't store too lengthy int type

Consider the problem:
It can be shown that for some powers of two in decimal format like:
2^9 = 512
2^89 = 618,970,019,642,690,137,449,562,112
The results end in a string consisting of 1s and 2s. In fact, it can be proven that for every integer R there exists a power of 2, say 2^K with K > 0, whose last R digits consist only of 1s and 2s.
It can be shown clearly in the table below:
R Smallest K 2^K
1 1 2
2 9 512
3 89 ...112
4 89 ...2112
Using this technique, what then is the sum of all the smallest K values for 1 <= R <= 10?
Proposed sol:
Now this problem isn't that difficult to solve. You can simply do
int temp = power(2, K)
and then, if you can get the length of temp, multiply it with
(100^len)-i or (10^len)-i
// where i would determine how many of the last digits you want.
But this temp = power(2, K) gets so much larger with increasing K that you can't store it in the int type or even in a long int....
So what can be done? And is there any other solution based on bit strings? I guess that might make this problem easier.
Thanks in advance.
No, I doubt there are any solutions based on "strings of bits"; that would be quite inefficient. But there are bignum libraries like GMP, which provide integer types that are either fixed-size but much bigger than the built-in int types, or of arbitrary size limited only by memory, together with matching sets of math operations, working much like software FPU emulation.
Quoting from the reference documentation, with a minor paraphrase:
#include <gmpxx.h>
#include <iostream>
using namespace std;
int main (void)
{
    mpz_class a, b, c;
    a = 1234;
    b = "-5676739826856836954375492356569366529629568926519085610160816539856926459237598";
    c = a + b;
    cout << "sum is " << c << "\n";
    cout << "absolute value is " << abs(c) << "\n";
    return 0;
}
Thanks to C++ operator overloading, it is much easier to use than the ANSI C interface.
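For comparison, a rough sketch (error handling omitted) of the same computation through GMP's plain C interface, which is the bookkeeping the mpz_class version hides:
#include <gmp.h>
#include <stdio.h>
int main (void)
{
    mpz_t a, b, c;
    mpz_init(a);
    mpz_init(b);
    mpz_init(c);
    mpz_set_si(a, 1234);
    mpz_set_str(b, "-5676739826856836954375492356569366529629568926519085610160816539856926459237598", 10);
    mpz_add(c, a, b);
    gmp_printf("sum is %Zd\n", c);
    mpz_abs(c, c);
    gmp_printf("absolute value is %Zd\n", c);
    mpz_clear(a);
    mpz_clear(b);
    mpz_clear(c);
    return 0;
}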
Since you are only interested in the n least significant digits of your result, you could try to devise an algorithm that only calculates those. Based on the standard algorithm for written multiplication you can see that the n least significant digits of the product are entirely determined by the n least significant digits of the multiplicands. Based on this it should be possible to create an algorithm that calculates as many digits of 2^K as fit into a long int.
The only problem you might run into is that there may be numbers that end in a matching sequence longer than a long int can hold. In that case you can still resort to calculating additional digits using your own algorithm or a library.
Note that this is basically the same thing that big-number libraries do, only your approach might be more efficient, because you avoid calculating digits you are unlikely to need.
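A minimal sketch of that idea, under one reading of the problem: keep only the last R digits by working modulo 10^R, and search for the smallest K whose last R digits are all 1s and 2s.
#include <stdint.h>
#include <stdio.h>
int main() {
    const int R = 4;
    uint64_t mod = 1;
    for (int i = 0; i < R; ++i) mod *= 10;   // mod = 10^R; fits in 64 bits for R <= 10
    uint64_t tail = 1;                        // holds 2^K mod 10^R
    for (uint64_t K = 1; ; ++K) {
        tail = (tail * 2) % mod;
        // check whether every one of the last R digits is a 1 or a 2
        uint64_t t = tail;
        int ok = 1;
        for (int i = 0; i < R; ++i) {
            int d = (int)(t % 10);
            if (d != 1 && d != 2) { ok = 0; break; }
            t /= 10;
        }
        if (ok) {
            printf("R = %d: smallest K = %llu\n", R, (unsigned long long)K);
            break;
        }
    }
    return 0;
}
For R = 4 this should report K = 89, matching the table in the question; running it for R = 1 through 10 and summing the Ks answers the original problem.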
Try GMP, http://gmplib.org/
It can store a number of any size, as long as it fits in memory.
Although you might be better off with a less brute-force approach.
You can store binary strings in std::bitset or in std::vector
www.cplusplus.com/reference/bitset/bitset/
I think bitset is your choice. Using full big-number arithmetic for operations on powers of 2 is not, though.

BigInt implementation - converting a string to binary representation stored as unsigned int

I'm doing a BigInt implementation in C++ and I'm having a hard time figuring out how to create a converter from (and to) string (C string would suffice for now).
I implement the number as an array of unsigned int (so basically putting blocks of bits next to each other). I just can't figure out how to convert a string to this representation.
For example, if unsigned int were 32 bits and I got a string of "4294967296", or "5000000000", or basically anything larger than what a 32-bit int can hold, how would I properly convert it to the appropriate binary representation?
I know I'm missing something obvious, and I'm only asking for a push in the right direction. Thanks for the help and sorry for asking such a silly question!
Well one way (not necessarily the most efficient) is to implement the usual arithmetic operators and then just do the following:
// (pseudo-code)
// String to BigInt
String s = ...;
BigInt x = 0;
while (!s.empty())
{
    x *= 10;             // shift the accumulated value up by one decimal place
    x += s[0] - '0';     // bring in the next character as a digit
    s.pop_front();
}
Output(x);

// (pseudo-code)
// BigInt to String
BigInt x = ...;
String s;
while (x > 0)            // (handle x == 0 as a special case)
{
    s += '0' + x % 10;   // peel off the least significant decimal digit
    x /= 10;
}
Reverse(s);
Output(s);
If you wanted to do something trickier, then you could try the following:
If input I is < 100 use above method.
Estimate D number of digits of I by bit length * 3 / 10.
Mod and Divide by factor F = 10 ^ (D/2), to get I = X*F + Y;
Execute recursively with I=X and I=Y
Implement and test the string-to-number algorithm using a builtin type such as int.
Implement a bignum class with operator+, operator*, and whatever else the above algorithm uses.
Now the algorithm should work unchanged with the bignum class.
Use the string conversion algo to debug the class, not the other way around.
Also, I'd encourage you to try and write at a high level, and not fall back on C constructs. C may be simpler, but usually does not make things easier.
Take a look at, for instance, mp_toradix and mp_read_radix in Michael Fromberger's MPI.
Note that repeated division by 10 (used in the above) performs very poorly, which shows up when you have very big integers. It's not the "be all and end all", but it's more than good enough for homework.
A divide and conquer approach is possible. Here is the gist. For instance, given the number 123456789, we can break it into pieces: 1234 56789, by dividing it by a power of 10. (You can think of these pieces as two large digits in base 100,000.) Performing the repeated division by 10 is now cheaper on the two pieces: dividing 1234 by 10 three times and 56789 by 10 four times is cheaper than dividing 123456789 by 10 eight times.
Of course, a really large number can be recursively broken into more than two pieces.
Bruno Haible's CLN (used in CLISP) does something like that and it is blazingly fast compared to MPI, in converting numbers with thousands of digits to numeric text.

Double precision in C++ (or pow(2, 1000))

I'm working on Project Euler to brush up on my C++ coding skills in preparation for the programming challenge(s) we'll be having this next semester (since they don't let us use Python, boo!).
I'm on #16, and I'm trying to find a way to keep real precision for 2^1000.
For instance:
#include <cstdio>
#include <cmath>
int main(){
    double num = pow(2, 1000);
    printf("%.0f", num);
    return 0;
}
prints
10715086071862673209484250490600018105614050000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Which is missing most of the numbers (from python):
>>> 2**1000
10715086071862673209484250490600018105614048117055336074437503883703510511249361224931983788156958581275946729175531468251871452856923140435984577574698574803934567774824230985421074605062371141877954182153046474983581941267398767559165543946077062914571196477686542167660429831652624386837205668069376L
Granted, I can write the program with a Python one-liner
sum(int(_) for _ in str(2**1000))
that gives me the result immediately, but I'm trying to find a way to do it in C++. Any pointers? (haha...)
Edit:
Something outside the standard libs is worthless to me - only dead-tree code is allowed in those contests, and I'm probably not going to print out 10,000 lines of external code...
If you just keep track of each digit in a char array, this is easy. Doubling a digit is trivial, and if the result is 10 or more you just subtract 10 and add a carry to the next digit. Start with a value of 1, loop over the doubling function 1000 times, and you're done. You can predict the number of digits you'll need with ceil(1000*log(2)/log(10)), or just add them dynamically.
Spoiler alert: it appears I have to show the code before anyone will believe me. This is a simple implementation of a bignum with two functions, Double and Display. I didn't make it a class in the interest of simplicity. The digits are stored in a little-endian format, with the least significant digit first.
#include <iostream>
#include <vector>

typedef std::vector<char> bignum;

void Double(bignum & num)
{
    int carry = 0;
    for (bignum::iterator p = num.begin(); p != num.end(); ++p)
    {
        *p *= 2;
        *p += carry;
        carry = (*p >= 10);
        *p -= carry * 10;
    }
    if (carry != 0)
        num.push_back(carry);
}

void Display(bignum & num)
{
    for (bignum::reverse_iterator p = num.rbegin(); p != num.rend(); ++p)
        std::cout << static_cast<int>(*p);
}

int main(int argc, char* argv[])
{
    bignum num;
    num.push_back(1);
    for (int i = 0; i < 1000; ++i)
        Double(num);
    Display(num);
    std::cout << std::endl;
    return 0;
}
You need a bignum library, such as this one.
You probably need a pointer here (pun intended)
In C++ you would need to create your own bigint lib in order to do the same as in python.
C/C++ operates on fundamental data types. You are using a double, which has only 64 bits, to store a 1000-bit number. A double uses 52 bits for the significand (53 counting the implicit leading bit) and 11 bits for the exponent.
The only solution for you is either to use a library like the bignum ones mentioned elsewhere or to roll your own.
UPDATE: I just browsed to the Project Euler site and found that Problem 13 is about summing large integers. The iterated method can become very tricky after a short while, so I'd suggest using the code you should already have from Problem 13 to solve this, because 2**N = 2**(N-1) + 2**(N-1).
Using bignums is cheating and not a solution. Also, you don't need to compute 2**1000 or anything like that to get to the result. I'll give you a hint:
Take the first few values of 2**N:
1 2 4 8 16 32 64 128 256 ...
Now write down for each number the sum of its digits:
1 2 4 8 7 5 10 11 13 ...
You should notice that (x~=y means x and y have the same sum of digits)
1+1=2, 1+(1+2)=4, 1+(1+2+4)=8, 1+(1+2+4+8)=16~=7, 1+(1+2+4+8+7)=23~=5
Now write a loop.
Project Euler = Think before Compute!
If you want to do this sort of thing on a practical basis, you're looking for an arbitrary precision arithmetic package. There are a number around, including NTL, lip, GMP, and MIRACL.
If you're just after something for Project Euler, you can write your own code for raising to a power. The basic idea is to store your large number in quite a few small pieces, and implement your own carries, borrows, etc., between the pieces.
Isn't pow(2, 1000) just 1 left-shifted 1000 times, essentially? It should have an exact binary representation in a double float. It shouldn't require a bignum library.