how is 128 bit integer formed in abseil library?

how is 128 bit integer formed in abseil library? - c++

In Abseil library absl::uint128 big = absl::MakeUint128(1, 0);
this represents 2^64 , but i don't understand what does '1' and '0' mean here.
Can someone explain me how the number is actually formed ?

absl::MakeUint128(x, y); constructs a number equal to 2^64 * x + y
And see https://abseil.io/docs/cpp/guides/numeric

How? In any possible way. But there is a very simple way to make it.
You may already know how to do arithmetic with one digit numbers in base ten, right? Then you also know to use this arithmetic to get arithmetic of two digit numbers in base 10, right? Be aware that this then gives you an arithmetic of one digit numbers in base 100 (just consider '34' or '66' as a single symbols).
Your computer knows how to make arithmetic of one number digit in base 2^64, so it makes the same extension that you use to get in base 10 to get arithmetic of two digit numbers in base 2^64. This then leads to an arithmetic in base 2^128, or an arithmetic of 128 digits numbers in base 2.

Related

How to calculate number of digits on huge number? C++

so the problem I have is that there is two integers (a, b) that is in [1, 10^16] interval and I need to do find out how many digits will number a^b have? Those numbers are too big for saving them on single variables, and if I write them on Array it would take a lot of time.
Is there a way to count the number a^b number of digits with some kind of formula or any simpler way then Arrays?

after fixing the one-off error suggested in the comments
number of digits of a^b = floor( b * log(a) ) + 1

karakfa has it right.
The base-k logarithm of a number n, rounded up to the nearest whole number, will give you the number of digits required to represent n in base k.
EDIT: as pointed out in comments, it should not be rounded up, but rounded down and then incremented by one. This accounts for round powers of 10 having an extra digit.
If your number is a^b then take the base-10 logarithm, log a^b and use the laws of logarithms to simplify as b log a. Note that this simplification happens inside the ceiling function so the simplification is valid. Computing log a should not be an issue (it will be between 0 and 16) and b is known. Just make sure to round after multiplying, not before.
Note that limited precision of floating-point numbers may introduce some errors into this method. If the true value of b x log a is different from the nearest floating-point representation of b x log a in such a way that they fall on different sides of an integer, the method fails. You can possibly detect when you are close to this condition and remediate it somehow.

You could use a library that supports arbitrarily large numbers, like GMP .
The core C++ language itself offers no types to work with such large numbers. So either you use a pre-existing library or write one yourself (I suggest the former - don't re-invent the wheel).

Conversion Big Integer <-> double in C++

I am writing my own long arithmetic library in C++ for fun and it is already pretty finished, I even implemented several Cryptogrphic algorithms with that library, but one important thing is still missing: I want to convert doubles (and floats/long doubles) into my number and vice versa. My numbers are represented as a variable sized array of unsigned long ints plus a sign bit.
I tried to find the answer with google, but the problem is that people rarely ever implement such things themselves, so I only find things about how to use Java BigInteger etc.
Conceptually, it is rather easy: I take the mantissa, shift it by the number of bits dictated by the exponent and set the sign. In the other direction I truncate it so that it fits into the mantissa and set the exponent depending on my log2 function.
But I am having a hard time to figure out the details, I could either play around with some bit patterns and cast it to a double, but I didn't find an elegant way to achieve that or I could "calculate" it by starting with 2, exponentiate, multiply etc, but that doesn't seem very efficient.
I would appreciate a solution that doesn't use any library calls because I am trying to avoid libraries for my project, otherwise I could just have used gmp, furthermore, I often have two solutions on several other occasions, one using inline assembler which is efficient and one that is more platform independent, so either answer is useful for me.
edit: I use uint64_t for my parts, but I would like to be able to change it depending on the machine, but I am willing to do some different implementations with some #ifdefs to achieve that.

I'm going to make non-portable assumption here: namely, that unsigned long long has more accurate digits than double. (This is true on all modern desktop systems that I know of.)
First, convert the most significant integer(s) into an unsigned long long. Then convert that to a double S. Let M be the number of integers less than those used in that first step. multiply S by(1ull << (sizeof(unsigned)*CHAR_BIT*M). (If shifting more than 63 bits, you will have to split those into seperate shifts and do some alrithmetic) Finally, if the original number was negative you multiply this result by -1.
This rounds a lot, but even with this rounding, due to the above assumption, no digits are lost that wouldn't be lost anyway with the conversion to a double. I think this is a similar process to what Mark Ransom said, but I'm not certain.
For converting from a double to a biginteger, first seperate the mantissa into a double M and the exponent into an int E, using frexp. Multiply M by UNSIGNED_MAX, and store that result in an unsigned R. If std::numeric_limits<double>::radix() is 2 (I don't know if it is or not for x86/x64), you can easily shift R left by E-(sizeof(unsigned)*CHAR_BIT) bits and you're done. Otherwise the result will instead beR*(E**(sizeof(unsigned)*CHAR_BIT)) (where ** means to the power of)
If performance is a concern, you can add an overload to your bignum class for multiplying by std::constant_integer<unsigned, 10>, which simply returns (LHS<<4)+(LHS<<2). You can similarly optimize other constants if you wish.

This blog post might help you Clarifying and optimizing Integer>>asFloat
Otherwise, you can yet have an idea of algorithm with this SO question Converting from unsigned long long to float with round to nearest even

You don't say explicitly, but I assume your library is integer only and the unsigned longs are 32 bit and binary (not decimal). The conversion to double is simple, so I'll tackle that first.
Start with a multiplier for the current piece; if the number is positive it will be 1.0, if negative it will be -1.0. For each of the unsigned long ints in your bignum, multiply by the current multiplier and add it to the result, then multiply your multiplier by pow(2.0, 32) (4294967296.0) for 32 bits or pow(2.0, 64) (18446744073709551616.0) for 64 bits.
You can optimize this process by working with only the 2 most significant values. You need to use 2 even if the number of bits in your integer type is larger than the precision of a double, since the number of used bits in the most significant value might only be 1. You can generate the multiplier by taking a power of 2 to the number of skipped bits, e.g. pow(2.0, most_significant_count*sizeof(bit_array[0])*8). You can't use a bit shift as given in another answer because it will overflow after the first value.
To convert from double, you can get the exponent and mantissa separated from each other with the frexp function. The mantissa will come as a floating point value between 0.5 and 1.0 so you'll want to multiply it by pow(2.0, 32) or pow(2.0, 64) to convert it to an integer, then adjust the exponent by -32 or -64 to compensate.

To go from a big integer to a double, just do it the same way you parse numbers. For example, you parse the number "531" as "1 + (3 * 10) + (5 * 100)". Compute each portion using doubles, starting with the least significant portion.
To go from a double to a big integer, do it the same way but in reverse starting with the most significant portion. So, to convert 531, you first see that it's more than 100 but less than 1000. You find the first digit by dividing by 100. Then you subtract to get the remainder of 31. Then find the next digit by dividing by 10. And so on.
Of course, you won't be using tens (unless you store your big integers as digits). Exactly how you break it apart depends on how your big integer class is constructed. For example, if it's uses 64-bit units, then you'll use powers of 2^64 instead of powers of 10.

How do I find the largest integer fully supported by hardware arithmetics?

I am implementing a BigInt class that must support arbitrary-precision operations on integers.
Quote from "The Algorithm Design Manual" by S.Skiena:
What base should I do [editor's note: arbitrary-precision] arithmetic in? - It is perhaps simplest to implement your own high-precision arithmetic package in decimal, and thus represent each integer as a string of base-10 digits. However, it is far more efficient to use a higher base, ideally equal to the square root of the largest integer supported fully by hardware arithmetic.
How do I find the largest integer supported fully by hardware arithmetic? If I understand correctly, being my machine an x64 based PC, the largest integer supported should be 2^64 (http://en.wikipedia.org/wiki/X86-64 - Architectural features: 64-bit integer capability), so I should use base 2^32, but is there a way in c++ to get this size programmatically so I can typedef my base_type to it?

You might be searching for std::uintmax_t and std::intmax_t.

static_cast<unsigned>(-1) is the max int. e.g. all bits set to 1 Is that what you are looking for ?
You can also use std::numeric_limits<unsigned>::max() or UINT_MAX, and all of these will yield the same result. and what these values tell is the maximum capacity of unsigned type. e.g. the maximum value that can be stored into unsigned type.

int (and, by extension, unsigned int) is the "natural" size for the architecture. So a type that has half the bits of an int should work reasonably well. Beyond that, you really need to configure for the particular hardware; the type of the storage unit and the type of the calculation unit should be typedefs in a header and their type selected to match the particular processor. Typically you'd make this selection after running some speed tests.
INT_MAX doesn't help here; it tells you the largest value that can be stored in an int, which may or may not be the largest value that the hardware can support directly. Similarly, INTMAX_MAX is no help, either; it tells you the largest value that can be stored as an integral type, but doesn't tell you whether operations on such a value can be done in hardware or require software emulation.
Back in the olden days, the rule of thumb was that operations on ints were done directly in hardware, and operations on longs were done as multiple integer operations, so operations on longs were much slower than operations on ints. That's no longer a good rule of thumb.

Things are not so black and white. There are MAY issues here, and you may have other things worth considering. I've now written two variable precision tools (in MATLAB, VPI and HPF) and I've chosen different approaches in each. It also matters whether you are writing an integer form or a high precision floating point form.
The difference is, integers can grow without bound in the number of digits. But if you are doing a floating point implementation with a user specified number of digits, you always know the number of digits in the mantissa. This is fixed.
First of all, it is simplest to use a single integer for each decimal digit. This makes many things work nicely, so I/O is easy. It is a bit inefficient in terms of storage though. Adds and subtracts are easy though. And if you use integers for each digit, then multiplies are even easy. In MATLAB for example, conv is pretty fast, though it is still O(n^2). I think gmp uses an fft multiply, so faster yet.
But assuming you use a basic conv multiply, then you need to worry about overflows for numbers with a huge number of digits. For example, suppose I store decimal digits as 8 bit signed integers. Using conv, followed by carries, I can do a multiply. For example, suppose I have the number 9999.
N = repmat(9,1,4)
N =
9 9 9 9
conv(N,N)
ans =
81 162 243 324 243 162 81
Thus even to form the product 9999*9999, I'd need to be careful as the digits will overflow an 8 bit signed integer. If I'm using 16 bit integers to accumulate the convolution products, then a multiply between a pair of 1000 digits integers can cause an overflow.
N = repmat(9,1,1000);
max(conv(N,N))
ans =
81000
So if you are worried about the possibility of millions of digits, you need to watch out.
One alternative is to use what I call migits, essentially working in a higher base than 10. Thus by using base 1000000 and doubles to store the elements, I can store 6 decimal digits per element. A convolution will still cause overflows for larger numbers though.
N = repmat(999999,1,10000);
log2(max(conv(N,N)))
ans =
53.151
Thus a convolution between two sets of base 1000000 migits that are 10000 migits in length (60000 decimal digits) will overflow the point where a double cannot represent an integer exactly.
So again, if you will use numbers with millions of digits, beware. A nice thing about the use of a higher base of migits with a convolution based multiply is since the conv operation is O(n^2), then going from base 10 to base 100 gives you a 4-1 speedup. Going to base 1000 yields a 9-1 speedup in the convolutions.
Finally, the use of a base other than 10 as migits makes it logical to implement guard digits (for floats.) In floating point arithmetic, you should never trust the least significant bits of a computation, so it makes sense to keep a few digits hidden in the shadows. So when I wrote my HPF tool, I gave the user control of how many digits would be carried along. This is not an issue for integers of course.
There are many other issues. I discuss them in the docs carried with those tools.

Number Base conversion and digits access?

The language is c++
I have to read in some data from 0 - n, n could theoretically be infinity. Based on the value of n, I have to change the numbers over from decimal to that base, even if its base 10000. So if I read in 5 numbers, n=5, I have to convert them over to base5.
That said, I am not sure how to do the conversion, but I'm sure I could get it reading over some article. But what really concerns me is when I convert over to whatever the n base might be what type would my result be to store in an array? Long?
Once I get the converted numbers in some array, how would I access each individual digit in each number for manipulation later?
Thanks.

Basically, most manipulations you're going to perform on a number are base-invariant. This means that you can add/sub/mul/div (And even perform power/root/log operations) two numbers without even knowing their base.
Think about it this way, the computer does nothing special when it adds two unsigned ints even thou all it's really working with is a 32 digits base-2 number.
You can probably make due with using ints (or whatever data type you need) and convert the base during display.

Conversion from decimal to base is done by division / modulo. x is the decimal number, b is the target base.
r = x % b
y = (x-r) : b
replace x by y and repeat from 1 until y becomes 0
the result are the r's, bottom up
Beneath of that you'll have to create a std::map with replacement patterns for the numbers in r, i. e. for base 16 some entries would be 10 -> A, 11 -> B. This implies, that you'll have to think about a representation form for very large n.
BTW: Consider a book about programming 101, conversion of decimal to bin / oct / hex is always explainend and easily adaptable for other bases.

Arbitrary precision arithmetic with GMP

I'm using the GMP library to make a Pi program, that will calculate about 7 trillion digits of Pi. Problem is, I can't figure out how many bits are needed to hold that many decimal places.

7 trillion digits can represent any of 10^(7 trillion) distinct numbers.
x bits can represent 2^x distinct numbers.
So you want to solve:
2^x = 10^7000000000000
Take the log-base-2 of both sides:
x = log2(10^7000000000000)
Recall that log(a^b) = b * log(a):
x = 7000000000000 * log2(10)
I get 23253496664212 bits. I would add one or two more just to be safe. Good luck finding the petabytes to hold them, though.
I suspect you are going to need a more interesting algorithm.

I wanna just correct one thing about what was written in the response answer:
Recall that log(a^b) = a * log(b)
well it is the opposite :
log(a^b) = b * log(a)

2^10 = 1024, so ten bits will represent slightly more than three digits. Since you're talking about 7 trillion digits, that would be something like 23 trillion bits, or about 3 terabytes, which is more than I could get on one drive from Costco last I visited.
You may be getting overambitious. I'd wonder about the I/O time to read and write entire disks for each operation.
(The mathematical way to solve it is to use logarithms, since a number that takes 7 trillion digits to represent has a log base 10 of about 7 trillion. Find the log of the number in the existing base, convert the base, and you've got your answer. For shorthand between base 2 and base 10, use ten bits==three digits, because that's not very far wrong. It says that the log base 10 of 2 is .3, when it's actually more like .301.)

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js