How to represent a number in base 2³²? - c++

If I have some base 10 or base 16 number, how do I change it into base 2^32?
The reason I'm trying to do this is for implementing BigInt, as suggested by other members here: Why to use higher base for implementing BigInt?
Will it be the same as an integer (base 10) up to 2^32? What will happen after that?

You are trying to find something of the form
a0 + a1 * (2^32) + a2 * (2^32)^2 + a3 * (2^32)^3 + ...
which is exactly the definition of a base-2^32 system, so ignore all the people that told you that your question doesn't make sense!
Anyway, what you are describing is known as base conversion. There are quick ways and there are easy ways to solve this. The quick ways are very complicated (there are entire chapters of books dedicated to the subject), and I'm not going to attempt to address them here (not least because I've never attempted to use them).
One easy way is to first implement two functions in your number system, multiplication and addition. (i.e. implement BigInt add(BigInt a, BigInt b) and BigInt mul(BigInt a, BigInt b)). Once you've solved that, you will notice that a base-10 number can be expressed as:
b0 + b1 * 10 + b2 * 10^2 + b3 * 10^3 + ...
which can also be written as:
b0 + 10 * (b1 + 10 * (b2 + 10 * (b3 + ...
so if you move left-to-right in your input string, you can peel off one base-10 digit at a time, and use your add and mul functions to accumulate into your BigInt:
BigInt a = 0;
for each digit b {
    a = add(mul(a, 10), b);
}
Disclaimer: This method is not computationally efficient, but it will at least get you started.
Note: Converting from base-16 is much simpler, because 2^32 is an exact power of 16 (16^8). So the conversion basically comes down to concatenating bits.
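For instance, here is a minimal sketch of that shortcut (my own helper, not from any answer here): it peels off 8 hex digits at a time from the least significant end and stores each group as one 32-bit "digit".
#include <cstdint>
#include <string>
#include <vector>
// Convert a hex string to base-2^32 "digits" (limbs), least significant first.
std::vector<uint32_t> hex_to_limbs(std::string hex) {
    std::vector<uint32_t> limbs;
    while (!hex.empty()) {
        size_t take = hex.size() >= 8 ? 8 : hex.size();
        limbs.push_back(std::stoul(hex.substr(hex.size() - take), nullptr, 16));
        hex.resize(hex.size() - take);
    }
    return limbs;
}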

Let's suppose that we are talking about a base-10 number:
a[0]*10^0 + a[1]*10^1 + a[2]*10^2 + a[3]*10^3 + ... + a[N]*10^N
where each a[i] is a digit in the range 0 to 9 inclusive.
I'm going to assume that you can parse the string that is your input value and find the array a[]. Once you can do that, and assuming that you have already implemented your BigInt class with the + and * operators, then you are home. You can simply evaluate the expression above with an instance of your BigInt class.
You can evaluate this expression relatively efficiently using Horner's method.
I've just written this down off the top of my head, and I will bet that there are much more efficient base conversion schemes.

If I have some base 10 or base 16 number, how do I change it into base 2^32?
Just like you convert it to any other base. You want to write the number n as
n = a_0 + a_1 * 2^32 + a_2 * 2^64 + a_3 * 2^96 + ... + a_k * 2^(32 * k).
So, find the largest power of 2^32 that divides into n, subtract off the multiple of that power from n, and repeat with the difference.
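For a concrete (made-up) example: with n = 10^10, the largest useful power is 2^32 (since 2^64 > n), and floor(10^10 / 2^32) = 2, so a_1 = 2; the difference 10^10 - 2 * 2^32 = 1,410,065,408 is less than 2^32, so a_0 = 1,410,065,408 and n = 2 * 2^32 + 1,410,065,408.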
However, are you sure that you asked the right question?
I suspect that you mean to be asking a different question. I suspect that you mean to ask: how do I parse a base-10 number into an instance of my BigInteger? That's easy. Code up your implementation, and make sure that you've implemented + and *. I'm completely agnostic to how you actually internally represent integers, but if you want to use base 2^32, fine, do it. Then:
BigInteger Parse(string s) {
    BigInteger b = new BigInteger(0);
    foreach(char c in s) { b = b * 10 + (int)c - (int)'0'; }
    return b;
}
I'll leave it to you to translate this to C.
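A rough sketch of that translation (BigInteger, big_from_uint, big_mul_small and big_add_small are assumed helpers from your own implementation, not real API names):
/* Sketch only: the type and helpers below are whatever your
   implementation provides; the names here are made up. */
BigInteger parse(const char *s) {
    BigInteger b = big_from_uint(0);
    for (; *s; ++s) {
        b = big_add_small(big_mul_small(b, 10), *s - '0');
    }
    return b;
}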

Base 16 is easy, since 2^32 is 16^8, an exact power. So, starting from the least significant digit, read 8 base-16 digits at a time, convert those digits into a 32-bit value, and that is the next base-2^32 "digit".
Base 10 is more difficult. As you say, if it's less than 2^32, then you just take the value as a single base-2^32 "digit". Otherwise, the simplest method I can think of is to use the Long Division algorithm to repeatedly divide the base-10 value by 2^32; at each stage, the remainder is the next base-2^32 "digit". Perhaps someone who knows more number theory than me could provide a better solution.
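A minimal sketch of that long-division loop (my own code, assuming the input is a plain decimal string): each pass divides the decimal digits by 2^32, and the remainder becomes the next "digit".
#include <cstdint>
#include <string>
#include <vector>
// Repeatedly long-divide a decimal string by 2^32; each remainder is the
// next base-2^32 "digit" (least significant first).
std::vector<uint32_t> decimal_to_limbs(std::string dec) {
    std::vector<uint32_t> limbs;
    while (!dec.empty()) {
        std::string quotient;
        uint64_t rem = 0;
        for (char c : dec) {
            rem = rem * 10 + (c - '0'); // always < 10 * 2^32, so it fits in 64 bits
            quotient += char('0' + rem / (1ULL << 32));
            rem %= (1ULL << 32);
        }
        limbs.push_back(uint32_t(rem));
        size_t nz = quotient.find_first_not_of('0'); // strip leading zeros
        dec = (nz == std::string::npos) ? "" : quotient.substr(nz);
    }
    return limbs;
}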

I think this is a totally reasonable thing to do.
What you are doing is representing a very large number (like an encryption key) in an array of 32 bit integers.
A base 16 representation is base 2^4, or a series of 4 bits at a time. If you are receiving a stream of base 16 "digits", fill in the low 4 bits of the first integer in your array, then the next lowest, until you read 8 "digits". Then go to the next element in the array.
long getBase16()
{
    char cCurr;
    switch (cCurr = getchar())
    {
    case 'A':
    case 'a':
        return 10;
    case 'B':
    case 'b':
        return 11;
    /* ... cases 'C' through 'F' follow the same pattern ... */
    default:
        return cCurr - '0';
    }
}
void read_input(long * plBuffer)
{
    long * plDst = plBuffer;
    int iPos = 32;
    *plDst = 0x00; // start filling at the first element, not the second
    long lDigit;
    while (lDigit = getBase16()) // note: stops at '0'; real code needs an end-of-input sentinel
    {
        if (!iPos)
        {
            *(++plDst) = 0x00;
            iPos = 32;
        }
        *plDst >>= 4; // shift, and assign the result back
        iPos -= 4;
        *plDst |= (lDigit & 0x0F) << 28;
    }
}
There is some fix up to do, like ending by shifting *plDst by iPos, and keeping track of the number of integers in your array.
There is also some work to convert from base 10.
But this is enough to get you started.

Related

Is there a good way to optimize the multiplication of two BigNums?

I have a class BigNum:
struct BigNum{
    vector <int> digits;
    BigNum(vector <int> data){
        for(int item : data){ digits.push_back(item); }
    }
    int get_digit(size_t index){
        return (index >= digits.size() ? 0 : digits[index]);
    }
};
and I'm trying to write code to multiply two BigNums. Currently, I've been using the traditional method of multiplication, which is multiplying the first number by each digit of the other and adding it to a running total. Here's my code:
BigNum add(BigNum a, BigNum b){ // traditional adding: goes digit by digit and keeps a "carry" variable
    vector <int> ret;
    int carry = 0;
    for(size_t i = 0; i < max(a.digits.size(), b.digits.size()); ++i){
        int curr = a.get_digit(i) + b.get_digit(i) + carry;
        ret.push_back(curr%10);
        carry = curr/10;
    }
    // leftover from carrying values
    while(carry != 0){
        ret.push_back(carry%10);
        carry /= 10;
    }
    return BigNum(ret);
}
BigNum mult(BigNum a, BigNum b){
    BigNum ret({0});
    for(size_t i = 0; i < a.digits.size(); ++i){
        vector <int> row(i, 0); // account for the zeroes at the end of each row
        int carry = 0;
        for(size_t j = 0; j < b.digits.size(); ++j){
            int curr = a.digits[i] * b.digits[j] + carry;
            row.push_back(curr%10);
            carry = curr/10;
        }
        while(carry != 0){ // leftover from carrying
            row.push_back(carry%10);
            carry /= 10;
        }
        ret = add(ret, BigNum(row)); // add the current row to our running sum
    }
    return ret;
}
This code still works pretty slowly; it takes around a minute to calculate the factorial of 1000. Is there a better way to multiply two BigNums? If not, is there a better way to represent large numbers that will speed up this code?
If you use a different base, say 2^16 instead of 10, the multiplication will be much faster.
But getting to print in decimal will be longer.
Get a ready made bignum library. Those tend to be optimized to death, all the way down to specific CPU models, with assembly where necessary.
GMP and MPIR are two popular ones. The latter is more Windows friendly.
One way is to use a larger base than ten. It's a huge waste, in both time and space, to take an int, able to hold values up to about four billion (unsigned variant) and use it to store single digits.
What you can do is use unsigned int/long values for a start, then choose a base such that the square of that base will fit into the value. So, for example, the square root of the largest 32-bit unsigned int is a touch over 65,000 so you choose 10,000 as the base.
So a "bigdigit" (I'll use that term for a digit in the base-10,000 scheme, is effectively equal to four decimal digits (just digits from here on), and this has several effects:
much less space taken up (four digits per int instead of one);
still no chance of overflow when you multiply four-digit groups;
faster multiplications, doing four digits at a time rather than one; and
still easy printing since it's in a base-ten-to-the-power-of-something format.
Those last two points warrant some explanation.
On the second last one, it should be something like sixteen times faster since, to multiply 1234 and 5678, each digit in the first has to be multiplied with every digit in the second. For a normal digit, that's sixteen multiplications, while it's only one for a bigdigit.
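As a sketch of what that buys you in code (my own minimal base-10,000 schoolbook multiply, not the poster's class):
#include <cstdint>
#include <vector>
using Limbs = std::vector<uint32_t>; // each element holds 0..9999, least significant first
Limbs mult_base10000(const Limbs& a, const Limbs& b) {
    Limbs ret(a.size() + b.size(), 0);
    for (size_t i = 0; i < a.size(); ++i) {
        uint32_t carry = 0;
        for (size_t j = 0; j < b.size(); ++j) {
            // 9999 * 9999 plus the carries is well under 2^32, so no overflow
            uint32_t cur = ret[i + j] + a[i] * b[j] + carry;
            ret[i + j] = cur % 10000;
            carry = cur / 10000;
        }
        ret[i + b.size()] += carry;
    }
    while (ret.size() > 1 && ret.back() == 0) ret.pop_back(); // trim leading zeros
    return ret;
}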
Since the bigdigits are exactly four digits, the output is still relatively easy, something like:
printf("%d", node[0]);
for (int i = 1; i < node_count; ++i) {
printf("%04d", node[0]);
}
Beyond that, and the normal C++ optimisations like passing const references rather than copying all objects, you can examine the same tricks used by MPIR and GMP. I tend to avoid them myself since they have (or did have at some point) a rather nasty habit of just violently exiting programs when they ran out of memory, something I find inexcusable in a general purpose library. In any case, I have routines built up over time that do, while nowhere near as much as GMP, certainly more than I need (and that use the same algorithms in many cases).
One of the tricks for multiplication is the Karatsuba algorithm (to be honest, I'm not sure if GMP/MPIR use this but, unless they've got something much better, I suspect they would).
It basically involves splitting each number into two parts, so that a = a1 a0 (upper and lower halves) and b = b1 b0. In other words:
a = a1 x B^p + a0
b = b1 x B^p + b0
The B^p is just some integral power of the actual base you're using, and can generally be the closest value to the square root of the larger number (about half as many digits).
You then work out:
c2 = a1 x b1
c0 = a0 x b0
c1 = (a1 + a0) x (b1 + b0) - c2 - c0
That last point is tricky but it has been proven mathematically. If you want to go into that level of depth, I'm not the best person for the job. At some point, even I, the consummate "don't believe anything you can't prove yourself" type, have to take the expert opinions as fact :-)
Then you work some add/shift magic (multiplication looks to be involved but, since it's multiplication by a power of the base, it's really just a matter of shifting values left).
c = c2 x B^(2p) + c1 x B^p + c0
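To make that concrete with a small made-up example in base 10 with B^p = 100: a = 1234 gives a1 = 12, a0 = 34, and b = 5678 gives b1 = 56, b0 = 78. Then c2 = 12 x 56 = 672, c0 = 34 x 78 = 2652, and c1 = (12 + 34) x (56 + 78) - 672 - 2652 = 6164 - 3324 = 2840, so c = 672 x 10000 + 2840 x 100 + 2652 = 7,006,652, which is indeed 1234 x 5678.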
Now you may be wondering why three multiplications is a better approach than one, but you need to take into account that these multiplications are using far fewer digits than the original. If you remember back to the comment I made above about doing one multiplication rather than sixteen when switching from base-10 to base-10,000, you'll realise the number of digit multiplications is proportional to the square of the number of digits.
That means it can be better to perform three smaller multiplications even with some extra shifting and adding. And the beauty of this solution is that you can recursively apply it to the smaller numbers until you get down to the point where you're just multiplying two unsigned int values.
I probably haven't done the concept justice, and you do need to watch for and adjust the case where c1 becomes negative but, if you want raw speed, this is the sort of thing you'll have to look into.
And, as my more advanced math buddies will tell me (quite often), if you're not willing to have your entire head explode, you probably shouldn't be doing math :-)

C++: Binary to Decimal Conversion

I am trying to convert a binary array to decimal in following way:
uint8_t array[8] = {1,1,1,1,0,1,1,1};
int decimal = 0;
for(int i = 0 ; i < 8 ; i++)
    decimal = (decimal << 1) + array[i];
Actually I have to convert a 64-bit binary array to decimal, and I have to do it a million times.
Can anybody help me: is there any faster way to do the above? Or is the above one fine?
Your method is adequate; to call it nice, I would just not mix bitwise operations and the "mathematical" way of converting, i.e. use either
decimal = decimal << 1 | array[i];
or
decimal = decimal * 2 + array[i];
It is important, before attempting any optimisation, to profile the code. Time it, look at the code being generated, and optimise only when you understand what is going on.
And as already pointed out, the best optimisation is to not do something, but to make a higher level change that removes the need.
However...
Most changes you might want to trivially make here, are likely to be things the compiler has already done (a shift is the same as a multiply to the compiler). Some may actually prevent the compiler from making an optimisation (changing an add to an or will restrict the compiler - there are more ways to add numbers, and only you know that in this case the result will be the same).
Pointer arithmetic may be better, but the compiler is not stupid - it ought to already be producing decent code for dereferencing the array, so you need to check that you have not in fact made matters worse by introducing an additional variable.
In this case the loop count is well defined and limited, so unrolling probably makes sense.
Furthermore, it depends on how dependent you want the result to be on your target architecture. If you want portability, it is hard(er) to optimise.
For example, the following produces better code here:
unsigned int x0 = *(unsigned int *)array;
unsigned int x1 = *(unsigned int *)(array+4);
int decimal = ((x0 * 0x8040201) >> 20) + ((x1 * 0x8040201) >> 24);
I could probably also roll a 64-bit version that did 8 bits at a time instead of 4.
But it is very definitely not portable code. I might use that locally if I knew what I was running on and I just wanted to crunch numbers quickly. But I probably wouldn't put it in production code. Certainly not without documenting what it did, and without the accompanying unit test that checks that it actually works.
The binary 'compression' can be generalized as a problem of weighted sum -- and for that there are some interesting techniques.
X mod 255 essentially means summing all of the independent 8-bit numbers.
X mod 254 means summing each digit with a doubling weight, since 1 mod 254 = 1, 256 mod 254 = 2, 256*256 mod 254 = 2*2 = 4, etc.
If the encoding was big endian, then *(unsigned long long)array % 254 would produce a weighted sum (with truncated range of 0..253). Then removing the value with weight 2 and adding it manually would produce the correct result:
uint64_t a = *(uint64_t *)array;
return (a & ~256) % 254 + ((a>>9) & 2);
Other mechanism to get the weight is to premultiply each binary digit by 255 and masking the correct bit:
uint64_t a = (*(uint64_t *)array * 255) & 0x0102040810204080ULL; // little endian
uint64_t a = (*(uint64_t *)array * 255) & 0x8040201008040201ULL; // big endian
In both cases one can then take the remainder of 255 (and correct now with weight 1):
return (a & 0x00ffffffffffffff) % 255 + (a>>56); // little endian, or
return (a & ~1) % 255 + (a&1);
For the sceptical mind: I actually did profile the modulus version to be (slightly) faster than iteration on x64.
To continue from the answer of JasonD, parallel bit selection can be iteratively utilized.
But first expressing the equation in full form would help the compiler to remove the artificial dependency created by the iterative approach using accumulation:
ret = ((a[0]<<7) | (a[1]<<6) | (a[2]<<5) | (a[3]<<4) |
       (a[4]<<3) | (a[5]<<2) | (a[6]<<1) | (a[7]<<0));
vs.
uint32_t HI = *(uint32_t *)array, LO = *(uint32_t *)&array[4];
LO |= (HI<<4); // The HI dword has a weight 16 relative to Lo bytes
LO |= (LO>>14); // High word has 4x weight compared to low word
LO |= (LO>>9); // high byte has 2x weight compared to lower byte
return LO & 255;
One more interesting technique would be to utilize crc32 as a compression function; then it just happens that the result would be LookUpTable[crc32(array) & 255]; as there is no collision with this given small subset of 256 distinct arrays. However to apply that, one has already chosen the road of even less portability and could as well end up using SSE intrinsics.
You could use accumulate, with a doubling and adding binary operation:
int doubleSumAndAdd(const int& sum, const int& next) {
    return (sum * 2) + next;
}
int decimal = accumulate(array, array + ARRAY_SIZE,
                         0, doubleSumAndAdd); // std::accumulate (from <numeric>) needs an initial value
This produces big-endian integers, whereas OP code produces little-endian.
Try this; it converts a binary string held in a std::string to a long:
#include <iostream>
#include <string>
using namespace std;
long binary_decimal(string num) /* Function to convert binary to decimal */
{
    long dec = 0, n = 1;
    string bin = num;
    if(bin.length() > 62){ // a 64-bit long overflows beyond about 62 bits
        cout << "Binary number too large" << endl;
    }
    else {
        for(int i = bin.length() - 1; i > -1; i--)
        {
            if(bin.at(i) == '1')
                dec += n;
            n *= 2; // exact doubling; floating-point pow() would lose precision here
        }
    }
    return dec;
}
Theoretically this method would work for a binary number of any length if dec were a big-integer type; with a built-in long it overflows past about 62 bits.

How can I represent the number 2^1000 in C++? [duplicate]

So, I was trying to do problem # 16 on Project Euler, from http://projecteuler.net if you haven't seen it. It is as follows:
2^15 = 32768 and the sum of its digits is 3 + 2 + 7 + 6 + 8 = 26.
What is the sum of the digits of the number 2^1000?
I am having trouble figuring out how to represent the number 2^1000 in C++. I am guessing there is a trick to this, but I am really stuck. I don't really want the answer to the problem, I just want to know how to represent that number as a variable, or if perhaps there is a trick, maybe someone could let me know?
Represent it as a string. That means you need to write two pieces of code:
You need to write a piece of code to double a number, given that number as a string.
You need to write a piece of code to sum the digits of a number represented as a string.
With those two pieces, it's easy.
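As a minimal sketch of those two pieces (my own helper names):
#include <algorithm>
#include <string>
// Double a non-negative decimal number held as a string of digits.
std::string double_decimal(const std::string& n) {
    std::string out;
    int carry = 0;
    for (auto it = n.rbegin(); it != n.rend(); ++it) {
        int d = (*it - '0') * 2 + carry;
        out.push_back('0' + d % 10);
        carry = d / 10;
    }
    if (carry) out.push_back('0' + carry);
    std::reverse(out.begin(), out.end());
    return out;
}
// Sum the digits of a number held as a string.
int digit_sum(const std::string& n) {
    int sum = 0;
    for (char c : n) sum += c - '0';
    return sum;
}
Start from "1", call double_decimal a thousand times, and feed the result to digit_sum.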
One good algorithm worth knowing for this problem:
2^1 = 2
2^2 = 2 x 2 = 2 + 2
2^3 = 2 x (2 x 2) = (2 + 2) + (2 + 2)
2^4 = 2 x [2 x ( 2 x 2)] = [(2 + 2) + (2 + 2)] + [(2 + 2) + (2 + 2)]
Thus we have a recursive definition for calculating a power of two in terms of the addition operation: just add together two of the previous power of two.
This link deals with this problem very well.
Here is a complete program. The digits are held in a vector.
#include <iostream>
#include <numeric>
#include <ostream>
#include <vector>

int main()
{
    std::vector<unsigned int> digits;
    digits.push_back(1); // 2 ** 0 = 1

    const int limit = 1000;
    for (int i = 0; i != limit; ++i)
    {
        // Invariant: digits holds the individual digits of the number 2 ** i
        unsigned int carry = 0;
        for (auto iter = digits.begin(); iter != digits.end(); ++iter)
        {
            unsigned int d = *iter;
            d = 2 * d + carry;
            carry = d / 10;
            d = d % 10;
            *iter = d;
        }
        if (carry != 0)
        {
            digits.push_back(carry);
        }
    }

    unsigned int sum = std::accumulate(digits.cbegin(), digits.cend(), 0U);
    std::cout << sum << std::endl;
    return 0;
}
The whole point of this problem is to come up with a way of doing this without actually calculating 2^1000.
However, if you do want to calculate 2^1000—which may be a good idea, because it's a great way to test whether your other algorithm is correct—you're going to want some kind of "bignum" library, such as gmp:
mpz_t two_to_1000;
mpz_init(two_to_1000); // GMP variables must be initialised before use
mpz_ui_pow_ui(two_to_1000, 2, 1000);
Or you can use the C++ interface to gmp. It doesn't do exponentiation, so the first part gets slightly more complicated instead of less, but it makes the digit-summing simpler:
mpz_class two_to_1000;
mpz_ui_pow_ui(two_to_1000.get_mpz_t(), 2, 1000);
mpz_class digitsum(0);
while (two_to_1000 != 0) {
digitsum += two_to_1000 % 10;
two_to_1000 /= 10;
}
(There's actually no reason to make digitsum an mpz there, so you may want to figure out how to prove that the result will fit into 32 bits, add that as a comment, and just use a long for digitsum.)
All that being said, I probably wouldn't have written this gmp code to test it, when the whole thing is a one-liner in Python:
print(sum(map(int, str(2**1000))))
And, even though converting the bignum to a string to convert each digit to an int to sum them up is possibly the least efficient way to solve it, it still takes under 200us on the slowest machine I have here. And there's really no reason the double-check needs to be in the same language as the actual solution.
You'd need a 1000-bit machine integer to represent 2^1000; I've never heard of a machine with such. But there are a lot of big integer packages around, which do the arithmetic over as many machine words as are needed. The simplest solution might be to use one of these. (Although given the particular operations you need, doing the arithmetic on a string, as David Schwartz suggested, might be appropriate. In the general case, it's not a very good idea, but since all you're doing is multiplying by two, and then taking the decimal digits, it might work out well.)
Since 2^10 is about 10^3, 2^1000 = (2^10)^100 ≈ (10^3)^100 = 10^300.
So allocate an array like
char digits[ 300 ]; // may be too few
and store a value between 0 .. 9 in each char.

BigInt implementation - converting a string to binary representation stored as unsigned int

I'm doing a BigInt implementation in C++ and I'm having a hard time figuring out how to create a converter from (and to) string (C string would suffice for now).
I implement the number as an array of unsigned int (so basically putting blocks of bits next to each other). I just can't figure out how to convert a string to this representation.
For example if unsigned int is 32b and I get a string of "4294967296", or "5000000000", or basically anything larger than what a 32b int can hold, how would I properly convert it to the appropriate binary representation?
I know I'm missing something obvious, and I'm only asking for a push to the right direction. Thanks for help and sorry for asking such a silly question!
Well one way (not necessarily the most efficient) is to implement the usual arithmetic operators and then just do the following:
// (pseudo-code)
// String to BigInt
String s = ...;
BigInt x = 0;
while (!s.empty())
{
    x *= 10;
    x += s[0] - '0';
    s.pop_front();
}
Output(x);
// (pseudo-code)
// BigInt to String
BigInt x = ...;
String s;
while (x > 0)
{
    s += '0' + x % 10;
    x /= 10;
}
Reverse(s);
Output(s);
If you wanted to do something trickier than you could try the following:
If input I is < 100 use above method.
Estimate D, the number of digits of I, as bit length * 3 / 10 (since log10(2) ≈ 0.3).
Mod and Divide by factor F = 10 ^ (D/2), to get I = X*F + Y;
Execute recursively with I=X and I=Y
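A sketch of that recursion (everything here — BigInt, bit_length, pow10, to_string_small — is assumed to come from your own implementation):
// Hypothetical sketch: assumes a BigInt with comparison, division and modulo,
// plus helpers bit_length(), pow10() and to_string_small() that you would write.
std::string to_string_rec(const BigInt& I) {
    if (I < 100) return to_string_small(I);  // small inputs: the simple method
    size_t D = bit_length(I) * 3 / 10;       // estimated decimal digit count
    BigInt F = pow10(D / 2);                 // F = 10^(D/2)
    BigInt X = I / F, Y = I % F;             // I = X*F + Y
    std::string y = to_string_rec(Y);
    // left-pad Y to exactly D/2 digits so the place values line up
    return to_string_rec(X) + std::string(D / 2 - y.size(), '0') + y;
}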
Implement and test the string-to-number algorithm using a builtin type such as int.
Implement a bignum class with operator+, operator*, and whatever else the above algorithm uses.
Now the algorithm should work unchanged with the bignum class.
Use the string conversion algo to debug the class, not the other way around.
Also, I'd encourage you to try and write at a high level, and not fall back on C constructs. C may be simpler, but usually does not make things easier.
Take a look at, for instance, mp_toradix and mp_read_radix in Michael Bromberger's MPI.
Note that repeated division by 10 (used in the above) performs very poorly, which shows up when you have very big integers. It's not the "be all and end all", but it's more than good enough for homework.
A divide and conquer approach is possible. Here is the gist. For instance, given the number 123456789, we can break it into pieces: 1234 56789, by dividing it by a power of 10. (You can think of these pieces as two large digits in base 100,000.) Now performing the repeated division by 10 is cheaper on the two pieces! Dividing 1234 by 10 three times and 56789 by 10 four times is cheaper than dividing 123456789 by 10 eight times.
Of course, a really large number can be recursively broken into more than two pieces.
Bruno Haibl's CLN (used in CLISP) does something like that and it is blazingly fast compared to MPI, in converting numbers with thousands of digits to numeric text.

how to determine base of a number?

Given an integer number and its representation in some arbitrary number system, the purpose is to find the base of the number system. For example, the number is 10 and the representation is 000010, then the base should be 10. Another example: the number is 21 and the representation is 0010101, then the base is 2. One more example: the number is 6 and the representation is 10100, then the base is sqrt(2). Does anyone have any idea how to solve such a problem?
number = Σ ( digit[i] * base^i )
You know number, you know all digit[i], you just have to find out base.
Whether solving this equation is simple or complex is left as an exercise.
I do not think that an answer can be given for every case. And I actually have a reason to think so! =)
Given a number x, with representation a_6 a_5 a_4 a_3 a_2 a_1 in base b, finding the base means solving
a_6 b^5 + a_5 b^4 + a_4 b^3 + a_3 b^2 + a_2 b^1 + a_1 = x.
This cannot be done generally, as shown by Abel and Ruffini. You might be luckier with shorter numbers, but if more than four digits are involved, the formulas are increasingly ugly.
There are quite a lot good approximation algorithms, though. See here.
For integers only, it's not that difficult (we can enumerate).
Let's look at 21 and its representation 10101.
1 * base^4 <= 21 < (1+1) * base^4
Let's generate the numbers for some bases:
base low high
2 16 32
3 81 162
More generally, we have N represented as ∑ a[i] * base^i. Letting I be the maximum power for which a[I] is non-null, we have:
a[I] * base^I <= N < (a[I] + 1) * base^I # does not matter if not representable
# Isolate base term
N / (a[I] + 1) < base^I <= N / a[I]
# Ith root
Ithroot( N / (a[I] + 1) ) < base <= Ithroot( N / a[I] )
# Or as a range
base in ] Ithroot(N / (a[I] + 1)), Ithroot( N / a[I] ) ]
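For the 21 / 10101 example: I = 4 and a[I] = 1, so base lies in ( (21/2)^(1/4), 21^(1/4) ] ≈ ( 1.80, 2.14 ], and 2 is the only integer candidate.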
In the case of an integer base, or if you have a list of known possible bases, I doubt they'll be many possibilities, so we can just try them out.
Note that it may be faster to actually take the Ithroot of N / (a[I] + 1) and iterate from here instead of computing the second one (which should be close enough)... but I'd need math review on that gut feeling.
If you really don't have any idea (trying to find a floating base)... well it's a bit more difficult I guess, but you can always refine the inequality (including one or two more terms) following the same property.
An algorithm like this should find the base if it is an integer, and should at least narrow down the choices for a non-integer base:
Let N be your integer and R be its representation in the mystery base.
Find the largest digit in R and call it r.
You know that your base is at least r + 1.
For base == (r+1, r+2, ...), let I represent R interpreted in base base
If I equals N, then base is your mystery base.
If I is less than N, try the next base.
If I is greater than N, then your base is somewhere between base - 1 and base.
It's a brute-force method, but it should work. You may also be able to speed it up a bit by incrementing base by more than one if I is significantly smaller than N.
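A quick sketch of that brute-force search (my code, integer bases only, 64-bit arithmetic, assuming there is a non-zero digit above the ones place so the value grows with the base and the loop terminates):
#include <algorithm>
#include <cstdint>
#include <vector>
// digits: most significant first; returns 0 if no integer base matches N.
uint64_t find_base(const std::vector<int>& digits, uint64_t N) {
    int largest = *std::max_element(digits.begin(), digits.end());
    for (uint64_t base = largest + 1; ; ++base) {
        uint64_t value = 0;
        for (int d : digits) value = value * base + d; // Horner evaluation
        if (value == N) return base;
        if (value > N) return 0; // overshot: the base must be non-integer
    }
}
For 21 and {1,0,1,0,1} this starts at base 2 and returns 2 immediately.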
Something else that might help speed things up, particularly in the case of a non-integer base: Remember that as several people have mentioned, a number in an arbitrary base can be expanded as a polynomial like
x = a[n]*base^n + a[n-1]*base^(n-1) + ... + a[2]*base^2 + a[1]*base + a[0]
When evaluating potential bases, you don't need to convert the entire number. Start by converting only the largest term, a[n]*base^n. If this is larger than x, then you already know your base is too big. Otherwise, add one term at a time (moving from most-significant to least-significant). That way, you don't waste time computing terms after you know your base is wrong.
Also, there is another quick way to eliminate a potential base. Notice that you can re-arrange the above polynomial expression and get
(x - a[0]) = a[n]*base^n + a[n-1]*base^(n-1) + ... + a[2]*base^2 + a[1]*base
or
(x - a[0]) = (a[n]*base^(n-1) + a[n-1]*base^(n-2) + ... + a[2]*base + a[1])*base
You know the values of x and a[0] (the "ones" digit, you can interpret it regardless of base). What this gives you the extra condition that (x - a[0]) must be evenly divisible by base (since all your a[] values are integers). If you calculate (x - a[0]) % base and get a non-zero result, then base cannot be the correct base.
I'm not sure if this is efficiently solvable. I would just try to pick a random base and see if, given the base, the result is smaller, larger or equal to the number. In case it's smaller, pick a larger base; in case it's larger, pick a smaller base; otherwise you have the correct base.
This should give you a starting point:
Create an equation from the number and representation; number 42 and representation "0010203" becomes:
1 * base ^ 4 + 2 * base ^ 2 + 3 = 42
Now you solve the equation to get the value of base.
I'm thinking you will need try and check different bases. To be efficient, your starting base could be max(digit) + 1 as you know it won't be less than that. If that's too small double until you exceed, and then use binary search to narrow it down. This way your algorithm should run in O(log n) for normal situations.
Several of the other posts suggest that the solution might be found by finding the roots of the polynomial the number represents. These will, of course, generally work, though they will have a tendency to produce negative and complex bases as well as positive integers.
Another approach would be to cast this as an integer programming problem and solve using branch-and-bound.
But I suspect that the suggestion of guessing-and-testing will be quicker than any of the cleverer proposals.