I am up to about 8E10000, so how is it calculating such large numbers? There is no variable that can hold numbers that large.
Built-in types in C can usually only store up to 64 bits. Instead of a single variable, you can use an array of characters to store the digits of your number and write a function for each operation (addition, subtraction and so on) in your program.
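As a rough illustration of the digit-array idea, here is a minimal C++ sketch that adds two non-negative numbers stored as decimal strings. The name add_decimal_strings is made up for this example, and the sketch assumes both inputs contain only the characters '0' to '9'.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Adds two non-negative decimal numbers stored as digit strings.
// add_decimal_strings is a hypothetical helper name, not a library function.
std::string add_decimal_strings(const std::string& a, const std::string& b) {
    assert(!a.empty() && !b.empty());
    std::string result;
    int carry = 0;
    auto i = a.rbegin();
    auto j = b.rbegin();
    // Walk both strings from the last (least significant) digit to the first.
    while (i != a.rend() || j != b.rend() || carry != 0) {
        int sum = carry;
        if (i != a.rend()) sum += *i++ - '0';
        if (j != b.rend()) sum += *j++ - '0';
        result.push_back(static_cast<char>('0' + sum % 10));
        carry = sum / 10;
    }
    // Digits were produced least significant first, so flip them.
    std::reverse(result.begin(), result.end());
    return result;
}
```

Calling add_decimal_strings("999", "1") yields "1000". Subtraction and multiplication can follow the same pattern, one function per operation.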
You may look at this: GNU Multiple Precision Arithmetic Library
In a nutshell, they aren't using one variable to hold the operands but data structures that can hold arbitrarily long numbers (like an array), and they evaluate operations by treating the number as being written in a large radix.
When you actually do a math operation, the operands aren't variables but arrays (or any other suitable data structure), and you carry out the operation (where available) component-wise.
When you want to add two arrays, you choose a radix and then loop over the arrays, adding op1[i] to op2[i] together with the carry from the previous position. If the result is not smaller than the radix, you keep the remainder as the digit and carry the quotient into the next addition:
digit = (op1[i] + op2[i] + carry) % radix
carry = (op1[i] + op2[i] + carry) / radix
You need to be careful in choosing the radix and the underlying data type so that an addition doesn't cause an overflow.
This is also how you add numbers in base 10 by hand, just without thinking explicitly about the radix.
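The loop described above can be sketched in C++ as follows. The radix 10^9, the least-significant-first layout, and the name add_limbs are choices made for this example only; the radix is picked so that two elements plus a carry still fit in a 32-bit unsigned integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Radix 10^9: each element holds nine decimal digits, least significant first.
// Two elements plus a carry stay below 2^31, so the sum fits in std::uint32_t.
constexpr std::uint32_t kRadix = 1000000000u;

// add_limbs is a hypothetical name used for this sketch.
std::vector<std::uint32_t> add_limbs(const std::vector<std::uint32_t>& op1,
                                     const std::vector<std::uint32_t>& op2) {
    std::vector<std::uint32_t> result;
    std::uint32_t carry = 0;
    const std::size_t n = op1.size() > op2.size() ? op1.size() : op2.size();
    for (std::size_t i = 0; i < n; ++i) {
        std::uint32_t sum = carry;
        if (i < op1.size()) { assert(op1[i] < kRadix); sum += op1[i]; }
        if (i < op2.size()) { assert(op2[i] < kRadix); sum += op2[i]; }
        result.push_back(sum % kRadix);  // digit kept in this position
        carry = sum / kRadix;            // 0 or 1, carried to the next position
    }
    if (carry != 0) result.push_back(carry);
    return result;
}
```

For example, adding {999999999, 999999999} (the number 10^18 - 1) and {1} gives {0, 0, 1}, which is 10^18.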
You can also look over this for a bigint package.
Related
I have a geometric algorithm which takes as input a polygon. However, the files I am supposed to use as input files store the coordinates of the polygons in a rather peculiar way. Each file consists of one line, a counterclockwise sequence of the vertices. Each vertex is represented by its x and y coordinates each of which is written as the quotient of two integers int/int. However, these integers are incredibly large. I wrote a program that parses them from a string into long long using the function std::stoll. However, it appears that some of the numbers in the input file are larger than 2^64.
The output coordinates are usually quite small, in the range 0-1000. How do I go about parsing these numbers and then dividing them, obtaining doubles? Is there any standard library way of doing this, or should I use something like the boost library?
If you are after the ratio of two large numbers given as strings, you can shorten the strings:
"194725681173571753193674" divided by "635482929374729202" is the same as
"1947256811735717" divided by "6354829293" to at least 9 digits (I just removed the same number of digits from the end of both). Depending on the needed precision, this might be the simplest solution. Just remove digits before converting to long long.
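A minimal C++ sketch of this truncation trick might look like the following. The function name truncated_ratio and the 18-digit limit are assumptions made for the example; it expects plain non-negative digit strings whose lengths differ by fewer than 18.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: drops the same number of trailing digits from both
// strings until the longer one has at most 18 digits, then divides.
// Assumes plain non-negative digit strings and that the shorter string
// still has digits left afterwards.
double truncated_ratio(std::string num, std::string den) {
    const std::size_t kMaxDigits = 18;  // 18 decimal digits always fit in long long
    const std::size_t longest = std::max(num.size(), den.size());
    const std::size_t drop = longest > kMaxDigits ? longest - kMaxDigits : 0;
    assert(drop < num.size() && drop < den.size());
    num.erase(num.size() - drop);  // remove the last `drop` digits
    den.erase(den.size() - drop);
    return static_cast<double>(std::stoll(num)) /
           static_cast<double>(std::stoll(den));
}
```

With the two strings from the example above, the result is roughly 306421.58. The accuracy is limited by how many digits remain in the shorter string.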
You can parse the inputs directly into a long double I believe. However, that approach will introduce precision errors. If precision is important, then avoid this.
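For comparison, a sketch of the long double approach using std::stold is shown below. The name approximate_ratio is made up for this example. The precision and range of long double depend on the platform, and std::stold throws std::out_of_range if a value is too large to represent.

```cpp
#include <cassert>
#include <string>

// Parses both digit strings straight into long double and divides.
// Each parse rounds to the nearest representable value, so the result is
// approximate. approximate_ratio is a hypothetical helper name.
double approximate_ratio(const std::string& num, const std::string& den) {
    const long double n = std::stold(num);
    const long double d = std::stold(den);
    assert(d != 0.0L);
    return static_cast<double>(n / d);
}
```

This is usually good enough when the final coordinates are small and a relative error around 1e-15 is acceptable.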
A general solution for precise results is to represent the large integer with an array of integers, where one integer holds the low-order bits, the next integer holds the next higher-order bits, and so on. This is generally called arbitrary-precision arithmetic.
Is there any standard library way of doing this
No, other than basic building blocks such as vector for storing the array.
or should I use something like the boost library?
That's often a good place to start. Boost happens to have a library for this.
In all the knapsack problems that I've seen so far on the internet, the items have the form (cost, value) and there is a given capacity for the cost variable. All of the problems seem to have an integer cost, which makes it quite convenient to build 2D Value and Keep arrays. But what if the cost variable isn't an integer but a double? There's no way to index a Value and Keep array by a double. How can I approach this situation?
Ex:
budget: $3458
item_name(laptop) cost(1177.44) value (131)
item_name(desktop) cost(1054.44) value(35)
item_name(GPU) cost(1252.66) value(105)
item_name(CPU) cost(946.021) value(136)
You can scan your input for the smallest exponent (using frexp()), and add in the mantissa precision (53 bits?) to find a scaling factor that will convert all your numbers to exactly proportionate integers.
You will need a bigint library to handle the resulting integers, though.
Switch to a dynamic program that finds the least costly solution for each value, with 2D arrays for Cost and Keep instead of Value and Keep. (The difference between the programs is minor.)
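A sketch of that switched dynamic program in C++ follows. It is a simplified variant: it uses a single rolling cost array instead of the 2D Cost and Keep arrays, so it only reports the best total value and not which items were chosen. The names Item and best_value are assumptions for this example, and values must be non-negative integers.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

struct Item {
    double cost;  // may be fractional
    int value;    // must be a non-negative integer: it indexes the table
};

// Returns the largest total value whose cheapest selection fits the budget.
// best_value is a hypothetical name used for this sketch.
int best_value(const std::vector<Item>& items, double budget) {
    int total_value = 0;
    for (const Item& item : items) {
        assert(item.value >= 0);
        total_value += item.value;
    }
    const double inf = std::numeric_limits<double>::infinity();
    // min_cost[v] = least total cost of a subset whose values sum to exactly v.
    std::vector<double> min_cost(static_cast<std::size_t>(total_value) + 1, inf);
    min_cost[0] = 0.0;
    for (const Item& item : items) {
        // Go downwards so each item is used at most once (0/1 knapsack).
        for (int v = total_value; v >= item.value; --v) {
            const double candidate = min_cost[v - item.value] + item.cost;
            if (candidate < min_cost[v]) min_cost[v] = candidate;
        }
    }
    for (int v = total_value; v > 0; --v) {
        if (min_cost[v] <= budget) return v;
    }
    return 0;
}
```

For the four items in the example and a budget of 3458, this returns 372 (laptop, GPU and CPU, costing 3376.121 in total). The table size is the sum of all values, so this works well when values are small integers.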
I'm trying to minimize overhead as much as possible when adding numbers in an arithmetic series. I'm talking about a very large set, such as from 1 to 2^128. Is there any fast way of doing this? If so, what would it be without actually using the arithmetic sequence sum formula? Just as a reference, the sum from 1 to 2^128 - 1 is:
57896044618658097711785492504343953926464851149359812787997104700240680714240
The only fast way is to use the formula:
n * (n+1) / 2
Any other method (adding naively) will take way too long! (Even if you had a million years on a supercomputer, you wouldn't finish the calculation).
For such a large integer, though, you cannot use the built-in integer types. You will need a big integer object, so get a big integer library, e.g. https://mattmccutchen.net/bigint/ (a web search will turn up others).
Note: a 256-bit integer may be able to hold results up to around that scale, but it is quite platform and compiler-dependent, as to whether 256-bit integers are readily available, and how they are used.
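To make the formula concrete, here is a small C++ sketch for the case where the result still fits in a 64-bit integer. The name sum_to and the bound in the assertion are choices made for this example (the bound is deliberately conservative). For n near 2^128 the same formula applies, but with a big integer type in place of std::uint64_t.

```cpp
#include <cassert>
#include <cstdint>

// Sum 1 + 2 + ... + n via n * (n + 1) / 2, for results that fit in 64 bits.
// Halving whichever factor is even first keeps the intermediate product
// from overflowing before the division. sum_to is a hypothetical name.
std::uint64_t sum_to(std::uint64_t n) {
    assert(n <= 6000000000ULL);  // conservative bound so the result fits
    return (n % 2 == 0) ? (n / 2) * (n + 1) : n * ((n + 1) / 2);
}
```

For example, sum_to(100) is 5050, and sum_to(4294967296) (that is, n = 2^32) is 2^63 + 2^31 = 9223372039002259456, computed in constant time.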
I am implementing a BigInt class that must support arbitrary-precision operations on integers.
Quote from "The Algorithm Design Manual" by S.Skiena:
What base should I do [editor's note: arbitrary-precision] arithmetic in? - It is perhaps simplest to implement your own high-precision arithmetic package in decimal, and thus represent each integer as a string of base-10 digits. However, it is far more efficient to use a higher base, ideally equal to the square root of the largest integer supported fully by hardware arithmetic.
How do I find the largest integer supported fully by hardware arithmetic? If I understand correctly, since my machine is an x64-based PC, the largest integer supported should be 2^64 - 1 (http://en.wikipedia.org/wiki/X86-64 - Architectural features: 64-bit integer capability), so I should use base 2^32. But is there a way in C++ to get this size programmatically, so I can typedef my base_type to it?
You might be searching for std::uintmax_t and std::intmax_t.
static_cast<unsigned>(-1) is the maximum unsigned int, i.e. all bits set to 1. Is that what you are looking for?
You can also use std::numeric_limits<unsigned>::max() or UINT_MAX; all of these yield the same result. What these values tell you is the maximum capacity of the unsigned type, i.e. the largest value that can be stored in an unsigned.
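The equivalence of these three spellings can be checked at compile time. The sketch below only uses standard facilities; unsigned_bits is a helper name invented for the example.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// The three spellings from the text all name the same value.
static_assert(static_cast<unsigned>(-1) == std::numeric_limits<unsigned>::max(),
              "conversion of -1 wraps to the largest unsigned value");
static_assert(std::numeric_limits<unsigned>::max() == UINT_MAX,
              "numeric_limits agrees with the C macro");

// Number of value bits in unsigned, e.g. 32 on common desktop platforms.
// unsigned_bits is a hypothetical helper name.
constexpr int unsigned_bits() { return std::numeric_limits<unsigned>::digits; }
```

Note that this reports the capacity of the type, not what the hardware handles natively.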
int (and, by extension, unsigned int) is the "natural" size for the architecture. So a type that has half the bits of an int should work reasonably well. Beyond that, you really need to configure for the particular hardware; the type of the storage unit and the type of the calculation unit should be typedefs in a header and their type selected to match the particular processor. Typically you'd make this selection after running some speed tests.
INT_MAX doesn't help here; it tells you the largest value that can be stored in an int, which may or may not be the largest value that the hardware can support directly. Similarly, INTMAX_MAX is no help, either; it tells you the largest value that can be stored as an integral type, but doesn't tell you whether operations on such a value can be done in hardware or require software emulation.
Back in the olden days, the rule of thumb was that operations on ints were done directly in hardware, and operations on longs were done as multiple integer operations, so operations on longs were much slower than operations on ints. That's no longer a good rule of thumb.
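A configuration header along those lines might look like the sketch below. The names limb_type, wide_type and mul_add are illustrative, and the 32/64-bit pairing is only one plausible choice that you would confirm with speed tests on the target processor.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical configuration header for a BigInt class. The names limb_type
// and wide_type are illustrative; swap the types after benchmarking.
typedef std::uint32_t limb_type;  // storage unit: one "digit" of the big number
typedef std::uint64_t wide_type;  // calculation unit: holds limb * limb + carry

static_assert(std::numeric_limits<wide_type>::digits >=
                  2 * std::numeric_limits<limb_type>::digits,
              "wide_type must hold the full product of two limbs");

// Computes a * b + carry_in; returns the low limb and stores the high limb
// in carry_out. The sum cannot overflow wide_type.
inline limb_type mul_add(limb_type a, limb_type b, limb_type carry_in,
                         limb_type& carry_out) {
    const wide_type t = static_cast<wide_type>(a) * b + carry_in;
    carry_out = static_cast<limb_type>(t >> std::numeric_limits<limb_type>::digits);
    return static_cast<limb_type>(t);
}
```

Keeping both typedefs in one header means the rest of the class never names a concrete integer type, so retargeting is a two-line change.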
Things are not so black and white. There are many issues here, and you may have other things worth considering. I've now written two variable precision tools (in MATLAB, VPI and HPF) and I've chosen different approaches in each. It also matters whether you are writing an integer form or a high precision floating point form.
The difference is, integers can grow without bound in the number of digits. But if you are doing a floating point implementation with a user specified number of digits, you always know the number of digits in the mantissa. This is fixed.
First of all, it is simplest to use a single integer for each decimal digit. This makes many things work nicely, so I/O is easy. It is a bit inefficient in terms of storage, but adds and subtracts are easy. And if you use integers for each digit, then even multiplies are easy. In MATLAB for example, conv is pretty fast, though it is still O(n^2). I think GMP uses an FFT multiply, so it is faster yet.
But assuming you use a basic conv multiply, then you need to worry about overflows for numbers with a huge number of digits. For example, suppose I store decimal digits as 8 bit signed integers. Using conv, followed by carries, I can do a multiply. For example, suppose I have the number 9999.
N = repmat(9,1,4)
N =
9 9 9 9
conv(N,N)
ans =
81 162 243 324 243 162 81
Thus even to form the product 9999*9999, I'd need to be careful, as the digits will overflow an 8 bit signed integer. If I'm using 16 bit integers to accumulate the convolution products, then a multiply between a pair of 1000-digit integers can cause an overflow.
N = repmat(9,1,1000);
max(conv(N,N))
ans =
81000
So if you are worried about the possibility of millions of digits, you need to watch out.
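The same convolution-then-carry multiply can be sketched in C++. This is an illustration rather than the code of the MATLAB tools mentioned above; multiply_digits is a made-up name, and the accumulator is deliberately 64 bits wide to sidestep the overflow just described.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Digits are stored most significant first, one decimal digit per element,
// mirroring the MATLAB vectors above. multiply_digits is a hypothetical name.
std::vector<int> multiply_digits(const std::vector<int>& a,
                                 const std::vector<int>& b) {
    assert(!a.empty() && !b.empty());
    // Convolution step: column sums can reach 81 * min(len a, len b), so the
    // accumulator is 64 bits wide rather than 8 or 16.
    std::vector<std::int64_t> conv(a.size() + b.size() - 1, 0);
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < b.size(); ++j)
            conv[i + j] += static_cast<std::int64_t>(a[i]) * b[j];
    // Carry step: walk from the least significant column to the most.
    std::vector<int> result(conv.size() + 1, 0);
    std::int64_t carry = 0;
    for (std::size_t k = conv.size(); k-- > 0;) {
        const std::int64_t t = conv[k] + carry;
        result[k + 1] = static_cast<int>(t % 10);
        carry = t / 10;
    }
    result[0] = static_cast<int>(carry);
    return result;  // may have a leading zero
}
```

For {9, 9, 9, 9} times itself, the convolution gives 81 162 243 324 243 162 81 as above, and the carry step turns that into the digits 9 9 9 8 0 0 0 1, i.e. 99980001.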
One alternative is to use what I call migits, essentially working in a higher base than 10. Thus by using base 1000000 and doubles to store the elements, I can store 6 decimal digits per element. A convolution will still cause overflows for larger numbers though.
N = repmat(999999,1,10000);
log2(max(conv(N,N)))
ans =
53.151
Thus a convolution between two sets of base 1000000 migits that are 10000 migits in length (60000 decimal digits) will overflow the point where a double cannot represent an integer exactly.
So again, if you will use numbers with millions of digits, beware. A nice thing about using a higher base of migits with a convolution based multiply is that, since the conv operation is O(n^2), going from base 10 to base 100 gives you a 4-1 speedup. Going to base 1000 yields a 9-1 speedup in the convolutions.
Finally, the use of a base other than 10 as migits makes it logical to implement guard digits (for floats.) In floating point arithmetic, you should never trust the least significant bits of a computation, so it makes sense to keep a few digits hidden in the shadows. So when I wrote my HPF tool, I gave the user control of how many digits would be carried along. This is not an issue for integers of course.
There are many other issues. I discuss them in the docs carried with those tools.
I would like to implement a BigInt class which will be able to handle really big numbers. I want only to add and multiply numbers, however the class should also handle negative numbers.
I wanted to represent the number as a string, but there is a big overhead in converting string to int and back for adding. I want to implement addition the way it is taught in high school: add the digits of corresponding order and, if the result is 10 or more, add the carry to the next order.
Then I thought that it would be better to handle it as an array of unsigned long long int and keep the sign separately in a bool. With this I'm worried about the size of the int, as the C++ standard, as far as I know, guarantees only that int < float < double. Correct me if I'm wrong. So when I reach some number I should move forward in the array and start adding to the next array position.
Is there any data structure that is appropriate or better for this?
So, you want a dynamic array of integers of a well known size?
Sounds like vector<uint32_t> should work for you.
As you already found out, you will need to use specific types in your platform (or the language if you have C++11) that have a fixed size. A common implementation of a big number would use 32-bit integers and ensure that only the lower 16 bits are set. This enables you to operate on the digits (where a digit would be in [0..2^16) ) and then normalize the result by applying the carry-overs.
On a modern, 64-bit x86 platform, the best approach is probably to store your bigint as a dynamically-allocated array of unsigned 32-bit integers, so your arithmetic can fit in 64 bits. You can handle your sign separately, as a member variable of the class, or you can use 2's-complement arithmetic (which is how signed ints are typically represented).
The standard C <stdint.h> include file defines uint32_t and uint64_t, so you can avoid platform-dependent integer types. Or, (if your platform doesn't provide these), you can improvise and define this kind of thing yourself -- preferably in a separate "platform_dependent.h" file...
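Putting these suggestions together, a sign-and-magnitude layout with 32-bit limbs and 64-bit intermediates could be sketched as follows. The struct layout and the name multiply are assumptions for this example, and only multiplication is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sign-and-magnitude layout; limbs are base 2^32,
// least significant first. An empty limb vector represents zero.
struct BigInt {
    bool negative;
    std::vector<std::uint32_t> limbs;
};

// Schoolbook multiplication; multiply is a name chosen for this sketch.
BigInt multiply(const BigInt& a, const BigInt& b) {
    BigInt r;
    r.negative = a.negative != b.negative;
    r.limbs.assign(a.limbs.size() + b.limbs.size(), 0);
    for (std::size_t i = 0; i < a.limbs.size(); ++i) {
        std::uint64_t carry = 0;
        for (std::size_t j = 0; j < b.limbs.size(); ++j) {
            // limb * limb + limb + carry is at most 2^64 - 1, so it fits.
            const std::uint64_t t =
                static_cast<std::uint64_t>(a.limbs[i]) * b.limbs[j] +
                r.limbs[i + j] + carry;
            r.limbs[i + j] = static_cast<std::uint32_t>(t);  // low 32 bits
            carry = t >> 32;                                 // high 32 bits
        }
        r.limbs[i + b.limbs.size()] = static_cast<std::uint32_t>(carry);
    }
    // Normalize: strip leading zero limbs; zero has no sign.
    while (!r.limbs.empty() && r.limbs.back() == 0) r.limbs.pop_back();
    if (r.limbs.empty()) r.negative = false;
    return r;
}
```

For instance, multiplying -(2^32 - 1) by 2^32 - 1 gives a negative result with limbs {1, 0xFFFFFFFE}. Addition with mixed signs needs a magnitude comparison and subtraction, which is omitted here.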