Convert big hexadecimal to decimal numbers

Convert big hexadecimal to decimal numbers - c++

I have a big hexadecimal number, for example CD4A0619FB0907BC00000 (25!) or any other number like this. Now, using standard C/C++ code only (no libraries like Boost), I want to convert this number to the decimal number 15511210043330985984000000. Unfortunately, it's too big for a 64 bit integer (like long long) and I don't want to use any floating point data types either. If this is possible at all, how can you do this?

Assuming you don't want to use any resource that fits your description of "libraries like Boost", the simple answer is to write your own subset of one, with just the operations you need.
If 32 hex digits is enough, then simplest would be to create your own 128 bit unsigned int and code a divide by 10 function (producing quotient and remainder) for that 128-bit int. You really don't need any other functions and divide by 10 is pretty easy. Converting up to 32 hex digits to 128 bit int is trivial and generating decimal output from a series of divide by ten is trivial.
If you want essentially unlimited size, then it is likely simpler to represent a decimal number as a string of digits and write a routine to multiply that by 16 and add in another digit. That would never be the efficient solution, just likely easier to code for your purpose and unlimited size.
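The "string of decimal digits" idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: `hexValue` and `hexToDecimal` are made-up names, the digits are kept in a `std::vector<int>` (least significant first) rather than a literal string, and no attempt is made at efficiency.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Convert one hex character to its value; throws on anything else.
// (hexValue is a hypothetical helper name.)
static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    throw std::invalid_argument("not a hex digit");
}

// hexToDecimal is a hypothetical name, not taken from the answer above.
std::string hexToDecimal(const std::string& hex) {
    // Decimal digits, least significant first; starts as the number 0.
    std::vector<int> digits(1, 0);
    for (char c : hex) {
        int carry = hexValue(c);      // the hex digit to add in
        for (int& d : digits) {       // digits = digits * 16 + carry
            int v = d * 16 + carry;
            d = v % 10;
            carry = v / 10;
        }
        while (carry > 0) {           // grow the number as needed
            digits.push_back(carry % 10);
            carry /= 10;
        }
    }
    // Emit the most significant digit first.
    std::string out;
    for (auto it = digits.rbegin(); it != digits.rend(); ++it)
        out.push_back(static_cast<char>('0' + *it));
    return out;
}
```

With the question's input, `hexToDecimal("CD4A0619FB0907BC00000")` returns "15511210043330985984000000". The cost grows quadratically with the number of digits, which matches the remark that this is the easy solution rather than the efficient one.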

vector<unsigned int> bin2dec(vector<unsigned int> binary)
{
vector<unsigned int> decimal;
bool all_zero = false;
// binary[i]: stores 8 bits of the number.
// Ex. 258 = 0x102 => binary[0] = 0x2, binary[1] = 0x1.
while (!all_zero) {
all_zero = true;
for (int i = binary.size() - 1; i >= 0; i--) {
int q = binary[i] / 10;
int r = binary[i] % 10;
binary[i] = q;
if (i > 0) {
binary[i-1] += (r << 8);
} else {
decimal.insert(decimal.begin(), r);
}
if (q != 0) {
all_zero = false;
}
}
}
// each element stands for one digit of the decimal number.
// Ex. 258 => decimal[0] = 2, decimal[1] = 5, decimal[2] = 8.
return decimal;
}

If you don't want to use external libraries then you will have to implement an arbitrary-precision integer type yourself. See this question for ideas on how to do this. You will also need a function/constructor for converting hexadecimal strings to your new type. See this question for ideas on how to do this.

Related

Is there a way to convert a base 2^64 number to its base10 value in string form or display it in standard out in C or C++ without using big num libs?

Let's say I have a very large number represented as an array of unsigned long (int64), and I want to see its base10 form, either stored in a string or displayed on standard out directly. How would I do that in C or C++ without using libraries like gmp or boost? What algorithm or method should I know?
below is an example base2^64 number, with its base10 value in the comment
// base2^64
unsigned long big_num[3] = {77478, 656713, 872};
// base10 = 26364397224300470284329554475476558257587048
I don't exactly know if this is the correct way to convert another number base to base 10, but this is what I did:
To get the base10 value 26364397224300470284329554475476558257587048, I multiplied each digit of the base2^64 number by the base raised to the power of that digit's index, then summed the results.
base10 = ((77478 * ((2^64)^2)) + ((656713 * ((2^64)^1))) + ((872 * ((2^64)^0))))
= 26364397224300470284329554475476558257587048
the only problem with this is that there is no primitive data type that can hold this super large sum...
I was just wondering whether libraries like boost cpp_int and gmp represent their numbers like this and, if so, how they convert them to a base10 string or display the base10 value on standard out.
Or do they just use half of the bits of the data types that they use like for example in unsigned long and maybe use something like base 10000?

Repeatedly "mod 10" the array to find the next least significant decimal digit, then "divide by 10". Repeat as needed.
Avoid unsigned long to encode 64-bit values as it may be only 32-bit.
If code can encode the number without using the widest type, using uint32_t instead, then doing the repeated "mod 10" of the array is not so hard.
Below illustrative code still needs to reverse the string - something left for OP. Potential other warts too - hence the advantage of using big number libraries for this sort of thing.
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
// Form reverse decimal string
void convert(char dec[], size_t n, uint32_t b32[]) {
// TBD code to handle 0
while (n > 0 && b32[0] == 0) {
b32++;
n--;
}
while (n > 0) {
unsigned char rem = 0;
// Divide by 10.
for (size_t i = 0; i < n; i++) {
uint64_t sum = rem * (1ULL << 32) + b32[i];
b32[i] = (uint32_t) (sum / 10u);
rem = (unsigned char) (sum % 10u);
}
*dec++ = (char) (rem + '0');
if (b32[0] == 0) {
b32++;
n--;
}
}
*dec = 0;
}
Sample
int main() {
// unsigned long big_num[3] = [77478, 656713, 872];
uint32_t big_num[6] = {0, 77478, 0, 656713, 0, 872};
size_t n = sizeof big_num / sizeof big_num[0];
char s[sizeof big_num * 10 + 1];
convert(s, n, big_num);
printf("<%s>\n", s);
// <84078575285567457445592348207400342279346362>
// 26364397224300470284329554475476558257587048
}
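The two loose ends mentioned above (reversing the string and handling zero) could be tied up along these lines. This is a sketch rather than the answerer's code: `toDecimal` is a made-up name, the limbs are assumed to be base-2^32 values stored most significant first (as in the sample), and the vector is taken by value so the caller's data is not destroyed.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// toDecimal is a hypothetical name; limbs are base-2^32, most significant first.
std::string toDecimal(std::vector<std::uint32_t> limbs) {
    std::string out;
    std::size_t first = 0;  // index of the first nonzero limb
    while (first < limbs.size() && limbs[first] == 0) ++first;

    while (first < limbs.size()) {
        // Short division of the whole number by 10, limb by limb.
        std::uint64_t rem = 0;
        for (std::size_t i = first; i < limbs.size(); ++i) {
            std::uint64_t cur = (rem << 32) | limbs[i];
            limbs[i] = static_cast<std::uint32_t>(cur / 10);
            rem = cur % 10;
        }
        // The remainder is the next least significant decimal digit.
        out.push_back(static_cast<char>('0' + rem));
        while (first < limbs.size() && limbs[first] == 0) ++first;
    }

    if (out.empty()) out = "0";            // the value was zero
    std::reverse(out.begin(), out.end());  // digits were produced backwards
    return out;
}
```

The question's sample can be passed as `toDecimal({0, 77478, 0, 656713, 0, 872})`. Note that only a divide-by-small-constant routine is needed here, not general long division.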

To get the decimal representation of this number, you need to repeatedly divide the number by 10 and take the remainder to get the decimal digits. This means you need to implement long division for big numbers, which also requires implementing long addition, subtraction, and multiplication.
That's a lot of code that big number libraries give you, so just use one.

algorithm to figure out how many bytes are required to hold an int

sorry for the stupid question, but how would I go about figuring out, mathematically or using c++, how many bytes it would take to store an integer.

If you mean from an information theory point of view, then the easy answer is:
log(number) / log(2)
(It doesn't matter if those are natural, binary, or common logarithms, because of the division by log(2), which calculates the logarithm with base 2.)
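Rounded up, this reports the number of bits necessary to store your number (strictly, floor(log2(number)) + 1, because an exact power of 2 needs one bit more than its logarithm).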
If you're interested in how much memory is required for the efficient or usual encoding of your number in a specific language or environment, you'll need to do some research. :)
The typical C and C++ sizes for integers are (these are implementation-defined; long, for example, is 4 bytes on 64-bit Windows):
char 1 byte
short 2 bytes
int 4 bytes
long 8 bytes
If you're interested in arbitrary-sized integers, special libraries are available, and every library will have its own internal storage mechanism, but they'll typically store numbers via 4- or 8- byte chunks up to the size of the number.

You could find the first power of 2 that's larger than your number, divide the exponent of that power by 8, then round the result up to the nearest integer. So for 1000, the power of 2 is 1024 or 2^10; divide 10 by 8 to get 1.25, and round up to 2. You need two bytes to hold 1000!

If you mean "how large is an int" then sizeof(int) is the answer.
If you mean "how small a type can I use to store values of this magnitude" then that's a bit more complex. If you already have the value in integer form, then presumably it fits in 4, 3, 2, or 1 bytes. For unsigned values, if it's 16777216 or over you need 4 bytes, 65536-16777215 requires 3 bytes, 256-65535 needs 2, and 0-255 fits in 1 byte. The formula for this comes from the fact that each byte holds 8 bits, and each bit holds one of 2 values, so 1 byte holds 2^8 values, i.e. 256 (but starting at 0, so 0-255). 2 bytes therefore hold 2^16 values, i.e. 65536, and so on.
You can generalise that beyond the normal 4 bytes used for a typical int if you like. If you need to accommodate signed integers as well as unsigned, bear in mind that 1 bit is effectively used to store whether it is positive or negative, so the magnitude is 1 power of 2 less.
You can calculate the number of bits you need iteratively from an integer by dividing it by two and discarding the remainder. Each division you can make and still have a non-zero value means you have one more bit of data in use - and every 8 bits you're using means 1 byte.
A quick way of calculating this is to use the shift right function and compare the result against zero.
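unsigned int value = 23534; // or whatever
int bits = 0;
while (value)
{
    value >>= 1; // shift and store the result; a bare "value >> 1" discards it
    ++bits;
}
std::cout << "Bits used = " << bits << std::endl;
std::cout << "Bytes used = " << (bits + 7) / 8 << std::endl; // round up to whole bytes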

This is basically the same question as "how many binary digits would it take to store a number x?" All you need is the logarithm.
An n-bit integer can store numbers up to 2^n - 1. So, given a number x, ceil(log2(x + 1)) gets you the number of digits you need (plain ceil(log2 x) comes up one short when x is an exact power of 2).
It's exactly the same thing as figuring out how many decimal digits you need to write a number by hand. For example, log10 123456 = 5.09151220... , so ceil( log10(123456) ) = 6, six digits.
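The same count can be had without floating point, which avoids rounding trouble near exact powers of 2. The sketch below is an illustration only: `bitsNeeded` and `bytesNeeded` are made-up names, and treating zero as needing one byte is an assumption, not something the answer states.

```cpp
#include <cstdint>

// bitsNeeded is a hypothetical name.
// Equals floor(log2(x)) + 1 for x > 0, and 0 for x == 0.
unsigned bitsNeeded(std::uint64_t x) {
    unsigned bits = 0;
    while (x != 0) {
        x >>= 1;   // drop the lowest bit
        ++bits;    // one more bit was in use
    }
    return bits;
}

// bytesNeeded is a hypothetical name.
// Rounds the bit count up to whole bytes; assumption: zero still takes one byte.
unsigned bytesNeeded(std::uint64_t x) {
    unsigned bits = bitsNeeded(x);
    return bits == 0 ? 1 : (bits + 7) / 8;
}
```

For example, `bitsNeeded(1000)` is 10 and `bytesNeeded(1000)` is 2, while `bitsNeeded(8)` is 4 even though log2(8) is exactly 3.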

Since nobody has put up the simplest code that works yet, I might as well do it:
unsigned int get_number_of_bytes_needed(unsigned int N) {
    unsigned int bytes = 0;
    while (N) {
        N >>= 8;
        ++bytes;
    }
    return bytes;
}

assuming sizeof(long int) = 4.
int nbytes( long int x )
{
unsigned long int n = (unsigned long int) x;
if (n <= 0xFFFF)
{
if (n <= 0xFF) return 1;
else return 2;
}
else
{
if (n <= 0xFFFFFF) return 3;
else return 4;
}
}

The shortest way to code this is as follows (C#):
int bytes = (int)Math.Log(num, 256) + 1;
The code is small enough to be inlined, which helps offset the "slow" FP code. Also, there are no branches, which can be expensive.

Try this code:
// works for num >= 0
int numberOfBytesForNumber(int num) {
if (num < 0)
return 0;
else if (num == 0)
return 1;
else { // num > 0; a plain else guarantees every path returns a value
int n = 0;
while (num != 0) {
num >>= 8;
n++;
}
return n;
}
}

/**
 * assumes i is non-negative.
 * note that this returns 0 for 0, when perhaps it should be special cased?
 */
int numberOfBytesForNumber(int i) {
    int bytes = 0;
    // Strip one byte per iteration; dividing i itself avoids overflowing a growing divisor.
    while (i) {
        bytes++;
        i /= 256;
    }
    return bytes;
}

This code runs at 447 million tests / sec on my laptop where i = 1 to 1E9. i is a signed int:
n = (i > 0xffffff || i < 0) ? 4 : (i <= 0xffff) ? (i <= 0xff) ? 1 : 2 : 3;

Python example: no logs or exponents, just bit shift.
Note: 0 counts as 0 bits and only positive ints are valid.
from itertools import count

def bits(num):
    """Return the number of bits required to hold an int value."""
    if not isinstance(num, int):
        raise TypeError("Argument must be of type int.")
    if num < 0:
        raise ValueError("Argument cannot be less than 0.")
    for i in count(start=0):
        if num == 0:
            return i
        num = num >> 1

Double precision in C++ (or pow(2, 1000))

I'm working on Project Euler to brush up on my C++ coding skills in preparation for the programming challenge(s) we'll be having this next semester (since they don't let us use Python, boo!).
I'm on #16, and I'm trying to find a way to keep real precision for 2^1000.
For instance:
#include <cmath>
#include <cstdio>
int main(){
    double num = pow(2, 1000);
    printf("%.0f", num);
    return 0;
}
prints
10715086071862673209484250490600018105614050000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Which is missing most of the numbers (from python):
>>> 2**1000
10715086071862673209484250490600018105614048117055336074437503883703510511249361224931983788156958581275946729175531468251871452856923140435984577574698574803934567774824230985421074605062371141877954182153046474983581941267398767559165543946077062914571196477686542167660429831652624386837205668069376L
Granted, I can write the program with a Python 1 liner
sum(int(_) for _ in str(2**1000))
that gives me the result immediately, but I'm trying to find a way to do it in C++. Any pointers? (haha...)
Edit:
Something outside the standard libs is worthless to me - only dead-tree code is allowed in those contests, and I'm probably not going to print out 10,000 lines of external code...

If you just keep track of each digit in a char array, this is easy. Doubling a digit is trivial, and if the result is 10 or greater you just subtract 10 and add a carry to the next digit. Start with a value of 1, loop over the doubling function 1000 times, and you're done. You can predict the number of digits you'll need with ceil(1000*log(2)/log(10)), or just add them dynamically.
Spoiler alert: it appears I have to show the code before anyone will believe me. This is a simple implementation of a bignum with two functions, Double and Display. I didn't make it a class in the interest of simplicity. The digits are stored in a little-endian format, with the least significant digit first.
#include <iostream>
#include <vector>
typedef std::vector<char> bignum;
void Double(bignum & num)
{
int carry = 0;
for (bignum::iterator p = num.begin(); p != num.end(); ++p)
{
*p *= 2;
*p += carry;
carry = (*p >= 10);
*p -= carry * 10;
}
if (carry != 0)
num.push_back(carry);
}
void Display(bignum & num)
{
for (bignum::reverse_iterator p = num.rbegin(); p != num.rend(); ++p)
std::cout << static_cast<int>(*p);
}
int main(int argc, char* argv[])
{
bignum num;
num.push_back(1);
for (int i = 0; i < 1000; ++i)
Double(num);
Display(num);
std::cout << std::endl;
return 0;
}
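The program above prints the digits; the Project Euler problem asks for their sum. A compact variation on the same doubling idea, which sums the digits instead of displaying them, might look like this (`digitSumOfPowerOfTwo` is a made-up name, and `int` digits are used here instead of `char` purely for readability):

```cpp
#include <vector>

// digitSumOfPowerOfTwo is a hypothetical helper, not part of the answer above.
int digitSumOfPowerOfTwo(int exponent) {
    // The number 1, stored least significant digit first.
    std::vector<int> digits(1, 1);
    for (int i = 0; i < exponent; ++i) {
        int carry = 0;
        for (int& d : digits) {      // double every digit, propagating the carry
            int v = d * 2 + carry;
            d = v % 10;
            carry = v / 10;
        }
        if (carry) digits.push_back(carry);  // the number grew by one digit
    }
    int sum = 0;
    for (int d : digits) sum += d;
    return sum;
}
```

As a sanity check, `digitSumOfPowerOfTwo(15)` gives 26, since 2^15 = 32768 and 3 + 2 + 7 + 6 + 8 = 26.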

You need a bignum library, such as this one.

You probably need a pointer here (pun intended)
In C++ you would need to create your own bigint lib in order to do the same as in python.

C/C++ operates on fundamental data types. You are using a double, which has only 64 bits, to store a 1000-bit number. A double uses 52 bits for the significand (53 counting the implicit leading bit), 11 bits for the exponent, and 1 bit for the sign.
The only solution for you is to either use a library like bignum mentioned elsewhere or to roll out your own.
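The bit budget described above can be inspected from code. The sketch below assumes an IEEE 754 binary64 `double`, which is what mainstream platforms provide; the names `kSignificandBits`, `roundTripsThroughDouble`, and `isExactPowerOfTwo` are invented for illustration.

```cpp
#include <cmath>
#include <limits>

// Significand bits in a double, counting the implicit leading 1.
constexpr int kSignificandBits = std::numeric_limits<double>::digits;

// True when n survives a conversion to double and back unchanged.
// Assumption: n is below 2^63, so the conversion back is well defined.
bool roundTripsThroughDouble(unsigned long long n) {
    return static_cast<unsigned long long>(static_cast<double>(n)) == n;
}

// True when x is a finite positive power of two, i.e. frexp reports
// a fraction of exactly 0.5.
bool isExactPowerOfTwo(double x) {
    int exponent = 0;
    return x > 0 && std::isfinite(x) && std::frexp(x, &exponent) == 0.5;
}
```

On such a platform `kSignificandBits` is 53, `roundTripsThroughDouble((1ULL << 53) + 1)` is false, and `isExactPowerOfTwo(std::ldexp(1.0, 1000))` is true. In other words a power of two such as 2^1000 is stored exactly, but an arbitrary 1000-bit integer is not, so any general big-number arithmetic still needs its own representation.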

UPDATE: I just browsed to the Euler Problem site and found that Problem 13 is about summing large integers. The iterated method can become very tricky after a short while, so I'd suggest to use the code from Problem #13 you should have already to solve this, because 2**N => 2**(N-1) + 2**(N-1)
Using bignums is cheating and not a solution. Also, you don't need to compute 2**1000 or anything like that to get to the result. I'll give you a hint:
Take the first few values of 2**N:
1 2 4 8 16 32 64 128 256 ...
Now write down for each number the sum of its digits:
1 2 4 8 7 5 10 11 13 ...
You should notice that (x~=y means x and y have the same sum of digits)
1+1=2, 1+(1+2)=4, 1+(1+2+4)=8, 1+(1+2+4+8)=16~=7 1+(1+2+4+8+7)=23~=5
Now write a loop.
Project Euler = Think before Compute!

If you want to do this sort of thing on a practical basis, you're looking for an arbitrary precision arithmetic package. There are a number around, including NTL, lip, GMP, and MIRACL.
If you're just after something for Project Euler, you can write your own code for raising to a power. The basic idea is to store your large number in quite a few small pieces, and implement your own carries, borrows, etc., between the pieces.

Isn't pow(2, 1000) just 1 left-shifted 1000 times, essentially? It should have an exact binary representation in a double float. It shouldn't require a bignum library.

C/C++ counting the number of decimals?

Let's say that input from the user is a decimal number, e.g. 5.2155 (having 4 decimal digits). It can be stored freely (int, double, etc.).
Is there any clever (or very simple) way to find out how many decimals the number has? (kinda like the question how do you find that a number is even or odd by masking last bit).

Two ways I know of, neither very clever unfortunately but this is more a limitation of the environment rather than me :-)
The first is to sprintf the number to a big buffer with a "%.50f" format string, strip off the trailing zeros then count the characters after the decimal point. This will be limited by the printf family itself. Or you could use the string as input by the user (rather than sprintfing a floating point value), so as to avoid floating point problems altogether.
The second is to subtract the integer portion then iteratively multiply by 10 and again subtract the integer portion until you get zero. This is limited by the limits of computer representation of floating point numbers - at each stage you may get the problem of a number that cannot be represented exactly (so .2155 may actually be .215499999998). Something like the following (untested, except in my head, which is about on par with a COMX-35):
count = 0
num = abs(num)
num = num - int(num)
while num != 0:
    num = num * 10
    count = count + 1
    num = num - int(num)
If you know the sort of numbers you'll get (e.g., they'll all be 0 to 4 digits after the decimal point), you can use standard floating point "tricks" to do it properly. For example, instead of:
while num != 0:
use
while abs(num) >= 0.0000001:

Once the number is converted from the user representation (string, OCR-ed gif file, whatever) into a floating point number, you are not dealing with the same number necessarily. So the strict, not very useful answer is "No".
If (case A) you can avoid converting the number from the string representation, the problem becomes much easier, you only need to count the digits after the decimal point and subtract the number of trailing zeros.
If you cannot do it (case B), then you need to make an assumption about the maximum number of decimals, convert the number back into string representation and round it to this maximum number using the round-to-even method. For example, if the user supplies 1.1 which gets represented as 1.09999999999999 (hypothetically), converting it back to string yields, guess what, "1.09999999999999". Rounding this number to, say, four decimal points gives you "1.1000". Now it's back to case A.
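Case A can be sketched in a few lines of C++. The function name `countDecimals` is made up, and the input is assumed to be plain decimal text such as "5.2155", with no exponent, sign handling, or locale-specific separators.

```cpp
#include <cstddef>
#include <string>

// countDecimals is a hypothetical name.
// Counts the digits after the decimal point, ignoring trailing zeros.
std::size_t countDecimals(const std::string& text) {
    std::size_t dot = text.find('.');
    if (dot == std::string::npos) return 0;  // no fractional part at all

    // Position of the last character that is not a trailing zero.
    // Because '.' itself is not '0', this is never before the dot.
    std::size_t last = text.find_last_not_of('0');
    return last - dot;
}
```

For example, `countDecimals("5.2155")` is 4 and `countDecimals("1.1000")` is 1, which is the answer case B is trying to get back to after rounding.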

Off the top of my head:
start with the fractional portion: .2155
repeatedly multiply by 10 and throw away the integer portion of the number until you get zero. The number of steps will be the number of decimals. e.g:
.2155 * 10 = 2.155
.155 * 10 = 1.55
.55 * 10 = 5.5
.5 * 10 = 5.0
4 steps = 4 decimal digits

Something like this might work as well:
float i = 5.2154;
std::string s;
std::string t;
std::stringstream out;
out << i;
s = out.str();
t = s.substr(s.find(".")+1);
cout<<"number of decimal places: " << t.length();

What do you mean "stored freely (int"? Once stored in an int, it has zero decimals left, clearly. A double is stored in a binary form, so no obvious or simple relation to "decimals" either. Why don't you keep the input as a string, just long enough to count those decimals, before sending it on to its final numeric-variable destination?

using the Scientific Notation format (to avoid rounding errors):
#include <stdio.h>
#include <string.h>
/* Counting the number of decimals
*
* 1. Use Scientific Notation format
* 2. Convert it to a string
* 3. Tokenize it on the exp sign, discard the base part
* 4. convert the second token back to number
*/
int main(){
int counts;
char *sign;
char str[15];
char *base;
char *exp10;
float real = 0.00001;
sprintf (str, "%E", real);
sign= ( strpbrk ( str, "+"))? "+" : "-";
base = strtok (str, sign);
exp10 = strtok (NULL, sign);
counts=atoi(exp10);
printf("[%d]\n", counts);
return 0;
}
[5]

If the decimal part of your number is stored in a separate int, you can just count its decimal digits.
This is an improvement on Andrei Alexandrescu's improvement. His version was already faster than the naive way (dividing by 10 at every digit). The version below is constant time and faster at least on x86-64 and ARM for all sizes, but occupies twice as much binary code, so it is not as cache-friendly.
Benchmarks for this version vs. Alexandrescu's version are on my PR on Facebook folly.
Works on unsigned, not signed.
inline uint32_t digits10(uint64_t v) {
return 1
+ (std::uint32_t)(v>=10)
+ (std::uint32_t)(v>=100)
+ (std::uint32_t)(v>=1000)
+ (std::uint32_t)(v>=10000)
+ (std::uint32_t)(v>=100000)
+ (std::uint32_t)(v>=1000000)
+ (std::uint32_t)(v>=10000000)
+ (std::uint32_t)(v>=100000000)
+ (std::uint32_t)(v>=1000000000)
+ (std::uint32_t)(v>=10000000000ull)
+ (std::uint32_t)(v>=100000000000ull)
+ (std::uint32_t)(v>=1000000000000ull)
+ (std::uint32_t)(v>=10000000000000ull)
+ (std::uint32_t)(v>=100000000000000ull)
+ (std::uint32_t)(v>=1000000000000000ull)
+ (std::uint32_t)(v>=10000000000000000ull)
+ (std::uint32_t)(v>=100000000000000000ull)
+ (std::uint32_t)(v>=1000000000000000000ull)
+ (std::uint32_t)(v>=10000000000000000000ull);
}

Years after the fight, but here is my own solution in three lines:
string number = "543.014";
size_t dotFound;
stoi(number, &dotFound); // dotFound is now the index of the '.'
number.substr(dotFound + 1).size() // skip the '.' itself: 3
Of course you have to check beforehand that it really is a float
(with stof(number) != stoi(number), for example)

int main()
{
char s[100];
fgets(s,100,stdin);
unsigned i=0,sw=0,k=0,l=0,ok=0;
unsigned length=strlen(s);
for(i=0;i<length;i++)
{
if(isprint(s[i]))
{
if(sw==1)
{
k++;
if(s[i]=='0')
{
ok=0;
}
if(ok==0)
{
if(s[i]=='0')
l++;
else
{
ok=1;
l=0;
}
}
}
if(s[i]=='.')
{
sw=1;
}
}
}
printf("%d",k-l);
return 0;
}

This is a robust C++14 implementation (std::enable_if_t is not available in C++11) suitable for float and double types:
template <typename T>
std::enable_if_t<(std::is_floating_point<T>::value), std::size_t>
decimal_places(T v)
{
std::size_t count = 0;
v = std::abs(v);
auto c = v - std::floor(v);
T factor = 10;
T eps = std::numeric_limits<T>::epsilon() * c;
while ((c > eps && c < (1 - eps)) && count < std::numeric_limits<T>::max_digits10)
{
c = v * factor;
c = c - std::floor(c);
factor *= 10;
eps = std::numeric_limits<T>::epsilon() * v * factor;
count++;
}
return count;
}
It throws the value away each iteration and instead keeps track of a power of 10 multiplier to avoid rounding issues building up. It uses machine epsilon to correctly handle decimal numbers that cannot be represented exactly in binary such as the value of 5.2155 as stipulated in the question.

Based on what others wrote, this has worked well for me. This solution does handle the case where a number can't be represented exactly in binary.
As suggested by others, the condition for the while loop indicates the precise behavior. My update uses the machine epsilon value to test whether the remainder on any loop is representable by the numeric format. The test should not compare to 0 or a hardcoded value like 0.000001.
template<class T, std::enable_if_t<std::is_floating_point_v<T>, T>* = nullptr>
unsigned int NumDecimalPlaces(T val)
{
unsigned int decimalPlaces = 0;
val = std::abs(val);
val = val - std::round(val);
while (
val - std::numeric_limits<T>::epsilon() > std::numeric_limits<T>::epsilon() &&
decimalPlaces <= std::numeric_limits<T>::digits10)
{
std::cout << val << ", ";
val = val * 10;
++decimalPlaces;
val = val - std::round(val);
}
return decimalPlaces;
}
As an example, if the input value is 2.1, the correct solution is 1. However, some other answers posted here would output 16 if using double precision because 2.1 can't be precisely represented in double precision.

I would suggest reading the value as a string, searching for the decimal point, and parsing the text before and after it as integers. No floating point or rounding errors.

char* fractpart(double f)
{
int intary[10]={1,2,3,4,5,6,7,8,9,0};
char charary[10]={'1','2','3','4','5','6','7','8','9','0'};
int count=0,x,y;
f=f-(int)f;
while(f<=1)
{
f=f*10;
for(y=0;y<10;y++)
{
if((int)f==intary[y])
{
chrstr[count]=charary[y];
break;
}
}
f=f-(int)f;
if(f<=0.01 || count==4)
break;
if(f<0)
f=-f;
count++;
}
return(chrstr);
}

Here is the complete program
#include <iostream.h>
#include <conio.h>
#include <string.h>
#include <math.h>
char charary[10]={'1','2','3','4','5','6','7','8','9','0'};
int intary[10]={1,2,3,4,5,6,7,8,9,0};
char* intpart(double);
char* fractpart(double);
int main()
{
clrscr();
int count = 0;
double d = 0;
char intstr[10], fractstr[10];
cout<<"Enter a number";
cin>>d;
strcpy(intstr,intpart(d));
strcpy(fractstr,fractpart(d));
cout<<intstr<<'.'<<fractstr;
getche();
return(0);
}
char* intpart(double f)
{
char retstr[10];
int x,y,z,count1=0;
x=(int)f;
while(x>=1)
{
z=x%10;
for(y=0;y<10;y++)
{
if(z==intary[y])
{
chrstr[count1]=charary[y];
break;
}
}
x=x/10;
count1++;
}
for(x=0,y=strlen(chrstr)-1;y>=0;y--,x++)
retstr[x]=chrstr[y];
retstr[x]='\0';
return(retstr);
}
char* fractpart(double f)
{
int count=0,x,y;
f=f-(int)f;
while(f<=1)
{
f=f*10;
for(y=0;y<10;y++)
{
if((int)f==intary[y])
{
chrstr[count]=charary[y];
break;
}
}
f=f-(int)f;
if(f<=0.01 || count==4)
break;
if(f<0)
f=-f;
count++;
}
return(chrstr);
}

One way would be to read the number in as a string. Find the length of the substring after the decimal point and that's how many decimals the person entered. Then convert this string into a float by using
atof(string.c_str());
On a different note: it's always a good idea when dealing with floating point operations to store them in a special object which has finite precision. For example, you could store the values in a special type of object called "Decimal" where the whole number part and the decimal part of the number are both ints. This way you have a finite precision. The downside to this is that you have to write out methods for arithmetic operations (+, -, *, /, etc.), but you can easily overload operators in C++. I know this deviates from your original question, but it's always better to store your decimals in a finite form. In this way you can also answer your question of how many decimals the number has.

How best to implement BCD as an exercise?

I'm a beginner (self-learning) programmer learning C++, and recently I decided to implement a binary-coded decimal (BCD) class as an exercise, and so I could handle very large numbers on Project Euler. I'd like to do it as basically as possible, starting properly from scratch.
I started off using an array of ints, where every digit of the input number was saved as a separate int. I know that each BCD digit can be encoded with only 4 bits, so I thought using a whole int for this was a bit overkill. I'm now using an array of bitset<4>'s.
Is using a library class like this overkill as well?
Would you consider it cheating?
Is there a better way to do this?
EDIT: The primary reason for this is as an exercise - I wouldn't want to use a library like GMP because the whole point is making the class myself. Is there a way of making sure that I only use 4 bits for each decimal digit?

Just one note, using an array of bitset<4>'s is going to require the same amount of space as an array of long's. bitset is usually implemented by having an array of word sized integers be the backing store for the bits, so that bitwise operations can use bitwise word operations, not byte ones, so more gets done at a time.
Also, I question your motivation. BCD is usually used as a packed representation of a string of digits when sending them between systems. There isn't really anything to do with arithmetic usually. What you really want is an arbitrary sized integer arithmetic library like GMP.

Is using a library class like this overkill as well?
I would benchmark it against an array of ints to see which one performs better. If an array of bitset<4> is faster, then no it's not overkill. Every little bit helps on some of the PE problems
Would you consider it cheating?
No, not at all.
Is there a better way to do this?
Like Greg Rogers suggested, an arbitrary precision library is probably a better choice, unless you just want to learn from rolling your own. There's something to learn from both methods (using a library vs. writing a library). I'm lazy, so I usually use Python.

Like Greg Rogers said, using a bitset probably won't save any space over ints, and doesn't really provide any other benefits. I would probably use a vector instead. It's twice as big as it needs to be, but you get simpler and faster indexing for each digit.
If you want to use packed BCD, you could write a custom indexing function and store two digits in each byte.
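A custom indexing scheme for packed BCD might look roughly like the sketch below. Everything here is illustrative: `PackedBcd` and its member names are invented, digit 0 is taken to be the least significant, and no range checking is done.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// PackedBcd is a hypothetical class name: two decimal digits per byte.
class PackedBcd {
public:
    explicit PackedBcd(std::size_t digitCount)
        : count_(digitCount), bytes_((digitCount + 1) / 2, 0) {}

    // Read digit i (0 = least significant); even indices use the low nibble.
    int get(std::size_t i) const {
        std::uint8_t b = bytes_[i / 2];
        return (i % 2 == 0) ? (b & 0x0F) : (b >> 4);
    }

    // Write digit i; value is assumed to be in 0..9 and i < size().
    void set(std::size_t i, int value) {
        std::uint8_t& b = bytes_[i / 2];
        if (i % 2 == 0)
            b = static_cast<std::uint8_t>((b & 0xF0) | value);
        else
            b = static_cast<std::uint8_t>((b & 0x0F) | (value << 4));
    }

    std::size_t size() const { return count_; }
    std::size_t byteCount() const { return bytes_.size(); }

private:
    std::size_t count_;
    std::vector<std::uint8_t> bytes_;
};
```

A five-digit number then occupies three bytes instead of five. The price is the masking and shifting on every access, which is the indexing overhead the unpacked layout avoids.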

Is using a library class like this overkill as well?
Would you consider it cheating?
Is there a better way to do this?
1&2: not really
3: each byte's got 8-bits, you could store 2 BCD in each unsigned char.

In general, bit operations are applied in the context of an integer, so from the performance aspect there is no real reason to go to bits.
If you want to go to bit approach to gain experience, then this may be of help
#include <stdio.h>
int main
(
void
)
{
typedef struct
{
unsigned int value:4;
} Nibble;
Nibble nibble;
for (nibble.value = 0; nibble.value < 20; nibble.value++)
{
printf("nibble.value is %d\n", nibble.value);
}
return 0;
}
The gist of the matter is that inside that struct, you are creating a short integer, one that is 4 bits wide. Under the hood, it is still really an integer, but for your intended use, it looks and acts like a 4 bit integer.
This is shown clearly by the for loop, which is actually an infinite loop. When the nibble value hits 16, the value is really zero, as there are only 4 bits to work with.
As a result nibble.value < 20 never becomes false.
If you look in the K&R White book, one of the notes there is the fact that bit operations like this are not portable, so if you want to port your program to another platform, it may or may not work.
Have fun.

You are trying to get a base-10 representation (i.e. a decimal digit in each cell of the array). This way either space (one int per digit) or time (4 bits per digit, but with the overhead of packing/unpacking) is wasted.
Why not try with base-256, for example, and use an array of bytes? Or even base-2^32 with array of ints? The operations are implemented the same way as in base-10. The only thing that will be different is converting the number to a human-readable string.
It may work like this:
Assuming base-256, each "digit" has 256 possible values, so the numbers 0-255 are all single-digit values. Then 256 is written as 1:0 (I'll use a colon to separate the "digits", since we cannot use letters as in base-16); the analogue in base-10 is how 10 comes after 9.
Likewise 1030 (base-10) = 4 * 256 + 6 = 4:6 (base-256).
Also 1020 (base-10) = 3 * 256 + 252 = 3:252 (base-256) is two-digit number in base-256.
Now let's assume we put the digits in array of bytes with the least significant digit first:
unsigned short digits1[] = { 212, 121 }; // 121 * 256 + 212 = 31188
int len1 = 2;
unsigned short digits2[] = { 202, 20 }; // 20 * 256 + 202 = 5322
int len2 = 2;
Then adding will go like this (warning: notepad code ahead, may be broken):
unsigned short resultdigits[enough length] = { 0 };
int len = len1 > len2 ? len1 : len2; // max of the lengths
int carry = 0;
int i;
for (i = 0; i < len; i++) {
int leftdigit = i < len1 ? digits1[i] : 0;
int rightdigit = i < len2 ? digits2[i] : 0;
int sum = leftdigit + rightdigit + carry;
if (sum > 255) {
carry = 1;
sum -= 256;
} else {
carry = 0;
}
resultdigits[i] = sum;
}
if (carry > 0) {
resultdigits[i] = carry;
}
On the first iteration it should go like this:
sum = 212 + 202 + 0 = 414
414 > 256, so carry = 1 and sum = 414 - 256 = 158
resultdigits[0] = 158
On the second iteration:
sum = 121 + 20 + 1 = 142
142 < 256, so carry = 0
resultdigits[1] = 142
So at the end resultdigits[] = { 158, 142 }, that is 142:158 (base-256) = 142 * 256 + 158 = 36510 (base-10), which is exactly 31188 + 5322
Note that converting this number to/from a human-readable form is by no means a trivial task - it requires multiplication and division by 10 or 256 and I cannot present code as a sample without proper research. The advantage is that the operations 'add', 'subtract' and 'multiply' can be made really efficient and the heavy conversion to/from base-10 is done only once in the beginning and once after the end of the calculation.
Having said all that, personally, I'd use base 10 in array of bytes and not care about the memory loss. This will require adjusting the constants 255 and 256 above to 9 and 10 respectively.
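The base-10 variant mentioned last could be sketched as below. It follows the same loop as the base-256 code with 9 and 10 substituted for 255 and 256; the names `Digits` and `addDecimal` are invented, and a `std::vector` replaces the fixed-size result array so that "enough length" takes care of itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One decimal digit per byte, least significant digit first.
using Digits = std::vector<std::uint8_t>;

// addDecimal is a hypothetical name for the base-10 version of the loop above.
Digits addDecimal(const Digits& a, const Digits& b) {
    Digits result;
    std::size_t len = a.size() > b.size() ? a.size() : b.size();
    int carry = 0;
    for (std::size_t i = 0; i < len; ++i) {
        int left = i < a.size() ? a[i] : 0;   // missing digits count as 0
        int right = i < b.size() ? b[i] : 0;
        int sum = left + right + carry;
        carry = sum > 9 ? 1 : 0;
        result.push_back(static_cast<std::uint8_t>(sum - carry * 10));
    }
    if (carry) result.push_back(1);           // final carry adds a new digit
    return result;
}
```

Adding the digits of 31188 and 5322 this way, i.e. `addDecimal({8, 8, 1, 1, 3}, {2, 2, 3, 5})`, yields `{0, 1, 5, 6, 3}`, which reads as 36510 from the most significant end. Printing is then just a reverse walk over the vector, with no base conversion needed.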
