Number base conversion and digit access? - C++

The language is c++
I have to read in some data from 0 to n, where n could theoretically be arbitrarily large. Based on the value of n, I have to convert the numbers from decimal to that base, even if it's base 10000. So if I read in 5 numbers, then n = 5 and I have to convert them to base 5.
That said, I am not sure how to do the conversion, but I'm sure I could pick it up from some article. What really concerns me is: when I convert to whatever the base n might be, what type should the result be so I can store it in an array? Long?
Once I get the converted numbers in some array, how would I access each individual digit in each number for manipulation later?
Thanks.

Basically, most manipulations you're going to perform on a number are base-invariant. This means that you can add/sub/mul/div (and even perform power/root/log operations on) two numbers without even knowing their base.
Think about it this way: the computer does nothing special when it adds two unsigned ints, even though all it's really working with is a 32-digit base-2 number.
You can probably make do with ints (or whatever data type you need) and convert the base only for display.

Conversion from decimal to another base is done by division and modulo. Let x be the decimal number and b the target base:

1. r = x % b
2. y = (x - r) / b
3. Replace x by y and repeat from step 1 until y becomes 0.

The result is the sequence of r's, read bottom-up.
Beyond that, you'll have to create a std::map with replacement patterns for the values of r; e.g. for base 16, some entries would be 10 -> A, 11 -> B. This also implies that you'll have to think about how to represent digits when n is very large.
BTW: Consider a book on programming 101; conversion from decimal to bin/oct/hex is always explained there and is easily adapted to other bases.
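A minimal sketch of the division/modulo steps above (the helper name is my own; digits above 9 map to letters, which only covers bases up to 36 — for larger bases you'd emit each r as a separate number instead):

```cpp
#include <algorithm>
#include <string>

// Convert a non-negative decimal value to its representation in `base`
// (2..36), using the repeated division/modulo steps described above.
std::string toBase(unsigned long long x, unsigned base) {
    const char* digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    if (x == 0) return "0";
    std::string out;
    while (x > 0) {
        out.push_back(digits[x % base]);  // r = x % b
        x /= base;                        // y = (x - r) / b
    }
    std::reverse(out.begin(), out.end()); // the r's, bottom up
    return out;
}
```

For example, toBase(19, 2) gives "10011" and toBase(255, 16) gives "FF".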

Related

how is 128 bit integer formed in abseil library?

In the Abseil library: absl::uint128 big = absl::MakeUint128(1, 0);
This represents 2^64, but I don't understand what '1' and '0' mean here.
Can someone explain to me how the number is actually formed?
absl::MakeUint128(x, y) constructs a number equal to 2^64 * x + y.
And see https://abseil.io/docs/cpp/guides/numeric
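As a sketch of what that formula means, here is a hypothetical makeUint128 built on the GCC/Clang __int128 extension (not Abseil's actual implementation):

```cpp
#include <cstdint>

// Sketch of what MakeUint128(high, low) computes, using the GCC/Clang
// built-in __int128 as a stand-in for absl::uint128.
unsigned __int128 makeUint128(std::uint64_t high, std::uint64_t low) {
    return ((unsigned __int128)high << 64) | low;  // 2^64 * high + low
}
```

So makeUint128(1, 0) is exactly 2^64: the first argument is the high 64-bit "digit" and the second is the low one.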
How? In any possible way, but there is a very simple way to see it.
You already know how to do arithmetic with one-digit numbers in base ten, right? Then you also know how to use this arithmetic to get arithmetic of two-digit numbers in base 10, right? Be aware that this also gives you an arithmetic of one-digit numbers in base 100 (just consider '34' or '66' as single symbols).
Your computer knows how to do arithmetic with one-digit numbers in base 2^64, so you apply the same extension you used in base 10 to get arithmetic of two-digit numbers in base 2^64. This gives you an arithmetic in base 2^128, or equivalently an arithmetic of 128-digit numbers in base 2.
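The "two digits in base 2^64" idea can be sketched directly. Here is a hypothetical pair-of-uint64_t type with schoolbook addition, including the carry between the two digits (not how any particular library implements it):

```cpp
#include <cstdint>

struct U128 { std::uint64_t hi, lo; };  // two "digits" in base 2^64

// Schoolbook addition, one digit at a time, with carry.
U128 add(U128 a, U128 b) {
    U128 r;
    r.lo = a.lo + b.lo;                           // low digit (wraps mod 2^64)
    std::uint64_t carry = (r.lo < a.lo) ? 1 : 0;  // did the low digit wrap?
    r.hi = a.hi + b.hi + carry;                   // high digit plus carry
    return r;
}
```

Adding 1 to {0, 2^64 - 1} carries into the high digit, exactly as adding 1 to 99 carries in base 10.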

How to calculate the number of digits of a huge number? C++

So the problem I have is that there are two integers (a, b) in the interval [1, 10^16], and I need to find out how many digits the number a^b will have. Those numbers are too big to store in single variables, and writing them into an array would take a lot of time.
Is there a way to count the number of digits of a^b with some kind of formula, or any way simpler than arrays?
After fixing the off-by-one error suggested in the comments:
number of digits of a^b = floor( b * log10(a) ) + 1
karakfa has it right.
The base-k logarithm of a number n, rounded up to the nearest whole number, will give you the number of digits required to represent n in base k.
EDIT: as pointed out in comments, it should not be rounded up, but rounded down and then incremented by one. This accounts for round powers of 10 having an extra digit.
If your number is a^b, take the base-10 logarithm, log a^b, and use the laws of logarithms to simplify it as b * log a. Note that this simplification happens inside the floor function, so the simplification is valid. Computing log a should not be an issue (it will be between 0 and 16) and b is known. Just make sure to round after multiplying, not before.
Note that the limited precision of floating-point numbers may introduce some errors into this method. If the true value of b * log a differs from its nearest floating-point representation in such a way that they fall on different sides of an integer, the method fails. You can possibly detect when you are close to this condition and remediate it somehow.
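A sketch of the formula, using long double for a little extra precision (the function name is mine, and the floating-point caveat just mentioned still applies near exact powers of 10):

```cpp
#include <cmath>
#include <cstdint>

// Number of decimal digits of a^b, for a, b >= 1, via
// digits = floor(b * log10(a)) + 1.
long long digitsOfPow(std::uint64_t a, std::uint64_t b) {
    if (a == 1) return 1;  // 1^b is always "1", one digit
    long double d = (long double)b * std::log10((long double)a);
    return (long long)std::floor(d) + 1;
}
```

For example, digitsOfPow(2, 10) returns 4, since 2^10 = 1024 has four digits.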
You could use a library that supports arbitrarily large numbers, like GMP.
The core C++ language itself offers no types to work with such large numbers. So either you use a pre-existing library or write one yourself (I suggest the former - don't re-invent the wheel).

Exotic base conversion without an intermediate value?

I was just implementing an arbitrary base converter, and I had a moment of curiosity: is there a generalizable base conversion approach that works with any possible range of base inputs (excluding base 0), AND does not use an intermediate value in a more convenient base?
When writing a base conversion function, particularly one that goes from an arbitrary base to another arbitrary base, it is easily implemented by first converting the number to decimal and then re-converting it to the target base. You can also fairly easily bypass the intermediate-value step arithmetically, provided you stay within bases in the range [1,10].
However, it seems to become trickier when you expand the range. Consider the following examples (primes seem like they might narrow the possible approaches a bit?):
base 7 to base 33
base -3 to base 13
base -16 to base 16
I didn't see much conversation about this question here, or elsewhere on Google; most either had a narrower range of bases or used an intermediate value. Any ideas?
The generalized algorithm to perform a base conversion of a value V with n digits in base B to an equivalent value V' in base B' is as follows:
Let d(0),d(1),...,d(n-1) be the digits of V in base B. Using a translation table, we convert these digits to a sequence of digits d'(0),d'(1),...,d'(n-1), where each d'(i) is the original digit d(i), but expressed in the new base B'.
Then, V' is defined by:
V' = d'(0)*B^0 + d'(1)*B^1 + d'(2)*B^2+.....+d'(n-1)*B^(n-1)
Now here's the catch (and the reason you need an intermediate value): not only do all the values in the above formula have to be expressed in base B', but all the operations (addition and multiplication) have to be performed using arithmetic in base B'.
For example: how to convert the number V = 201 expressed in base 3 to base 2 without converting it first to base 10.
The digits of V expressed in base 3 are d(0)=1, d(1)=0, d(2)=2
Converted to base 2, d'(0)=1, d'(1)=0, d'(2)=10
The original base is 3, but the general conversion formula needs it to be expressed in the target base (2), so we'll use the value B = 11.
Then:
V' = d'(0)*B^0 + d'(1)*B^1 + d'(2)*B^2+.....+d'(n-1)*B^(n-1)
Putting values and operating in base 2
V' = 1 + 0 + 10*(11^2) = 1 + 10*11*11 = 1 + 10010 = 10011
So, 201 (base 3) = 10011 (base 2)
Proof:
201 (base 3) = 2*3^2 + 0 + 1 = 19
10011 (base 2) = 16 + 2 + 1 = 19
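The whole procedure can be sketched in code: store each number as a vector of digits and evaluate the polynomial with Horner's rule, where the multiply-and-add is carried out digit by digit in the target base. The helper names are my own; the per-digit products use machine integers, but no value is ever held in an intermediate base:

```cpp
#include <vector>

// Digits stored least-significant first, each in [0, base).
using Digits = std::vector<int>;

// result = result * m + add, with the arithmetic carried out
// digit-by-digit in `base` (m and add are single small values).
void mulAdd(Digits& result, int m, int add, int base) {
    int carry = add;
    for (int& d : result) {
        int t = d * m + carry;
        d = t % base;
        carry = t / base;
    }
    while (carry > 0) {            // extend with any leftover carry
        result.push_back(carry % base);
        carry /= base;
    }
}

// Horner's rule over the source digits, most significant first:
// V' = ((d(n-1) * B + d(n-2)) * B + ...) * B + d(0),
// evaluated entirely in the target base.
Digits convert(const Digits& src, int srcBase, int dstBase) {
    Digits out;  // empty = 0
    for (auto it = src.rbegin(); it != src.rend(); ++it)
        mulAdd(out, srcBase, *it, dstBase);
    if (out.empty()) out.push_back(0);
    return out;
}
```

Running convert({1,0,2}, 3, 2) — that is, 201 in base 3, least-significant digit first — reproduces the worked example, yielding {1,1,0,0,1}, i.e. 10011 in base 2.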
A brief word about why base-10 is the intermediate base: because it's familiar.
School taught us base-10 whether we knew it or not, from kindergarten to where you are now. Our language centers around base-10, whether spoken, written, or compiled. Computer languages by design use base-10 for their high-level human interface, and for common parameters to run-time functions, regardless of the internal representation.
Because base-10 is part of our everyday lives, we don't need to think as much about the sequence of base-10 artifacts (0 through 9), leaving our minds free to plan the other logic for the task at hand. When I designed this algorithm in 1982, I chose the 0-9A-Z artifact-list method for simplicity, as a type of exponent map, making it easier to leverage the language's math operations and string interface, which all center around base-10. (Hmmm, that base-10 thing again.)
So if you really want to make your head hurt by re-engineering a whole new user interface and over-complicating the whole process, go for it. You'll still find yourself using base-10 parameters. (Hmmm.)
My first effort involved logarithms (the interface to the math function is base-10. (Hmmm.)):
largest_exponent_of_new_base = log(your base-10 number) / log(the base you are converting to)

How to fix the position of the binary point in an unsigned N-bit integer?

I am working on developing a fixed-point algorithm in C++. I know that, for an N-bit integer, a fixed-point binary integer is represented as U(a,b). For example, for an 8-bit integer (i.e. 256 values), if we represent it in the form U(6,2), it means that the binary point is to the left of the 2nd bit starting from the right, of the form:
b5 b4 b3 b2 b1 b0 . b(-1) b(-2)
Thus, it has 6 integer bits and 2 fractional bits. In C++, I know there are bit-shift operators I can use, but they basically shift the bits of the input stream; my question is how to define a binary fixed-point integer of the form fix<6,2> or U(6,2). All the major processing operations will be carried out on the fractional part, and I am just looking for a way to do this in C++. Any help regarding this would be appreciated. Thanks!
Example : Suppose I have an input discrete signal with 1024 sample points on x-axis (For now just think this input signal is coming from some sensor). Each of this sample point has a particular amplitude. Say the sample at time 2(x-axis) has an amplitude of 3.67(y-axis). Now I have a variable "int *input;" that takes the sample 2, which in binary is 0000 0100. So basically I want to make this as 00000.100 by performing the U(5,3) on the sample 2 in C++. So that I can perform the interpolation operations on fractions of the input sampling period or time.
PS - I don't want to create a separate class or use external libraries for this. I just want to take each 8 bits from my input signal, perform the U(a,b) fix on it followed by rest of the operations are done on the fractional part.
Short answer: left shift.
Long answer:
Fixed point numbers are stored as integers, usually int, which is the fastest integer type for a particular platform.
Normal integers without fractional bits are usually called Q0, Q.0 or QX.0, where X is the total number of bits of the underlying storage type (usually int).
To convert between different Q.X formats, left or right shift. For example, to convert 5 in Q0 to 5 in Q4, left shift it 4 bits, or multiply it by 16.
Usually it's useful to find or write a small fixed point library that does basic calculations, like a*b>>q and (a<<q)/b. Because you will do Q.X=Q.Y*Q.Z and Q.X=Q.Y/Q.Z a lot and you need to convert formats when doing calculations. As you may have observed, using normal * operator will give you Q.(X+Y)=Q.X*Q.Y, so in order to fit the result into Q.Z format, you need to right shift the result by (X+Y-Z) bits.
Division is similar: you get Q.(X-Y) = Q.X / Q.Y from the standard / operator, and to get the result in Q.Z format you shift the dividend before the division. What's different is that division is an expensive operation, and it's not trivial to write a fast one from scratch.
Be aware of the double-word support of your platform; it will make your life a lot easier. With double-word arithmetic, the result of a*b can be twice the size of a or b, so you don't lose range by doing a*b>>c. Without double words, you have to limit the input range of a and b so that a*b doesn't overflow. This is not obvious when you first start, but soon you will find you need more fractional bits or range to get the job done, and you will finally need to dig into the reference manual of your processor's ISA.
example:
float a = 0.1f;                          // 0.1
int aQ16 = (int)(a * 65536);             // 0.1 in Q16 format
int bQ16 = 4 << 16;                      // 4 in Q16 format
int cQ16 = (int)(((long long)aQ16 * bQ16) >> 16);
// result = 0.399963378906250 in Q16 = 26212,
// not 0.4 in Q16 = 26214, because of truncation error
If this is your question:
Q. Should I define my fixed-binary-point integer as a template, U<int a, int b>(int number), or not, U(int a, int b)
I think your answer to that is: "Do you want to define operators that take two fixed-binary-point integers? If so make them a template."
The template is just a little extra complexity if you're not defining operators. So I'd leave it out.
But if you are defining operators, you don't want to be able to add U<4, 4> and U<6, 2>. What would you define the result as? The templates will give you a compile-time error should you try to do that.
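A minimal sketch of that template idea (the names are my own). Because operator+ is only defined for identical formats, mixing U<4,4> and U<6,2> simply has no matching overload and fails to compile:

```cpp
#include <cstdint>

// U<I, F>: unsigned fixed point with I integer bits and F fractional bits,
// stored in the low I+F bits of a uint32_t.
template <int I, int F>
struct U {
    std::uint32_t raw;  // stores value * 2^F

    static U fromInt(std::uint32_t v) { return U{v << F}; }
    double toDouble() const { return raw / double(1u << F); }
};

// Addition is only defined for two operands of the same format.
template <int I, int F>
U<I, F> operator+(U<I, F> a, U<I, F> b) { return U<I, F>{a.raw + b.raw}; }
```

For example, U<6,2>::fromInt(3) has raw value 12, and adding U<6,2>{2} (i.e. 0.5) to it gives raw value 14, which toDouble() reads back as 3.5.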

Arbitrary precision arithmetic with GMP

I'm using the GMP library to make a Pi program that will calculate about 7 trillion digits of Pi. The problem is, I can't figure out how many bits are needed to hold that many decimal places.
7 trillion digits can represent any of 10^(7 trillion) distinct numbers.
x bits can represent 2^x distinct numbers.
So you want to solve:
2^x = 10^7000000000000
Take the log-base-2 of both sides:
x = log2(10^7000000000000)
Recall that log(a^b) = b * log(a):
x = 7000000000000 * log2(10)
I get 23253496664212 bits. I would add one or two more just to be safe. Good luck finding the petabytes to hold them, though.
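The arithmetic above as a quick sketch (the function name is mine; rounding up with ceil matches "add one or two more just to be safe"):

```cpp
#include <cmath>

// Bits needed to distinguish 10^digits values: ceil(digits * log2(10)).
long long bitsForDecimalDigits(long long digits) {
    return (long long)std::ceil((long double)digits * std::log2(10.0L));
}
```

bitsForDecimalDigits(7000000000000) reproduces the 23253496664212 figure; bitsForDecimalDigits(3) gives 10, matching the "ten bits per three digits" rule of thumb mentioned below.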
I suspect you are going to need a more interesting algorithm.
I just want to correct one thing that was written in the answer above:
Recall that log(a^b) = a * log(b)
well it is the opposite :
log(a^b) = b * log(a)
2^10 = 1024, so ten bits will represent slightly more than three digits. Since you're talking about 7 trillion digits, that would be something like 23 trillion bits, or about 3 terabytes, which is more than I could get on one drive from Costco last I visited.
You may be getting overambitious. I'd wonder about the I/O time to read and write entire disks for each operation.
(The mathematical way to solve it is to use logarithms, since a number that takes 7 trillion digits to represent has a log base 10 of about 7 trillion. Find the log of the number in the existing base, convert the base, and you've got your answer. For shorthand between base 2 and base 10, use ten bits==three digits, because that's not very far wrong. It says that the log base 10 of 2 is .3, when it's actually more like .301.)