How do I multiply an array of ints to result in a single number? - c++

So I have a single int broken up into an array of smaller ints. For example, int num = 136928 becomes int arr[3] = {13,69,28}. I need to multiply the array by a certain number. The normal operation would be 136928 * 2 == 273856, but I need [13,69,28] * 2 to give the same answer as 136928 * 2 would, in the form of an array again - the result should be
for (int& i : arr) {
    i *= 2;
    //Should multiply everything in the array
    //so that arr now equals {27,38,56}
}
Any help would be appreciated on how to do this (it also needs to work with multiplying by floating-point numbers), e.g. arr * 0.5 should halve everything in the array.
For those wondering, the number has to be split up into an array because it is too large to store in any standard type (64 bytes). Specifically, I am trying to perform a mathematical operation on the result of a SHA-256 hash, which I get as a uint8_t[64].

Consider using Boost.Multiprecision instead. Specifically, the cpp_int type, which is a representation of an arbitrary-sized integer value.
//In your includes...
#include <boost/multiprecision/cpp_int.hpp>
#include <cstdint>
#include <iterator>

//In your relevant code:
bool msv_first = /*...*/; //true if values[0] is the most significant byte - might need to flip this
uint8_t values[64];
boost::multiprecision::cpp_int value;
boost::multiprecision::import_bits(
    value,
    std::begin(values),
    std::end(values),
    8, //bits per element
    msv_first
);
//easy arithmetic to perform
value *= 2;
boost::multiprecision::export_bits(
    value,
    std::begin(values),
    8, //bits per element
    msv_first
);
//values now contains the properly multiplied result
//(note that export_bits writes only as many bytes as the value needs)
Theoretically this should work with the properly sized type uint512_t, found in the same namespace as cpp_int, but I don't have a C++ compiler to test with right now, so I can't verify. If it does work, you should prefer uint512_t, since it'll probably be faster than an arbitrarily-sized integer.
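If you do try uint512_t, the usage would presumably look the same (an untested sketch; everything besides the type is as above):
boost::multiprecision::uint512_t fixed_value;
boost::multiprecision::import_bits(
    fixed_value, std::begin(values), std::end(values), 8, msv_first);
fixed_value *= 2; //fixed-width and unchecked, so overflow past 512 bits wraps silently
boost::multiprecision::export_bits(
    fixed_value, std::begin(values), 8, msv_first);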

If you just need to multiply by or divide by two (2), then you can simply shift the bits in each byte that makes up the value.
So for multiplication you start at the right, with the least significant byte (I'm assuming big endian here). You take the most significant bit of the byte and store it in a temp var (a possible carry bit). Then you shift the other bits to the left and put the carry bit saved from the previous byte into the now-empty least significant bit. Repeat this until you have processed all bytes. You may be left with a single carry bit, which you can toss away if you're performing operations modulo 2^512 (64 bytes).
Division is similar, but you start at the left, with the most significant byte, and you carry the least significant bit of each byte into the most significant bit of the byte to its right. If you drop the final carried-out bit then you calculate the "floor" of the calculation (i.e. three divided by two will be one, not one-and-a-half or two).
This is useful if:
you don't want to copy the bytes, or
you just need bit operations otherwise and you don't want to include a multi-precision / big integer library.
Using a big integer library would be recommended for maintainability.
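For illustration, here is a minimal sketch of the multiply-by-two pass described above; the function name and the modulo-2^(8*len) behaviour are my own choices, assuming a big-endian uint8_t array:

#include <cstddef>
#include <cstdint>

//Shift a big-endian byte array one bit to the left (i.e. multiply by 2).
//Returns the final carry bit; toss it away to work modulo 2^(8*len).
uint8_t double_in_place(uint8_t bytes[], std::size_t len) {
    uint8_t carry = 0;
    for (std::size_t i = len; i-- > 0; ) { //least significant byte first
        uint8_t next_carry = bytes[i] >> 7; //save the top bit
        bytes[i] = static_cast<uint8_t>((bytes[i] << 1) | carry);
        carry = next_carry;
    }
    return carry;
}

Division by two is the mirror image: iterate from index 0 upward and shift each saved bit into the top of the following byte.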

Related

How to design INT of 16, 32, 64 bytes or even bigger in C++

As a beginner, I know we can use an array to store larger numbers if required, but I want to have a 16-byte int data type in C++ on which I can perform all the arithmetic operations available for basic data types like int or float.
So can we, in effect, increase the size of the default data types as desired - say an int of 64 bytes or a double of 120 bytes - not by changing the basic data type directly, but with something that behaves as if its capacity had been increased?
Is this even possible? If yes, then how, and if not, then what are the alternative ways to achieve the same?
Yes, it's possible, but no, it's not trivial.
First, I feel obliged to point out that this is one area where C and C++ really don't provide as much access to the hardware at the lowest level as you'd like. In assembly language, you normally get a couple of features that make multiple-precision arithmetic quite a bit easier to implement. One is a carry flag. This tracks whether a previous addition generated a carry (or a previous subtraction a borrow). So to add two 128-bit numbers on a machine with 64-bit registers you'd typically write code on this general order:
; r0 contains the bottom 64-bits of the first operand
; r1 contains the upper 64 bits of the first operand
; r2 contains the lower 64 bits of the second operand
; r3 contains the upper 64 bits of the second operand
add r0, r2
adc r1, r3
Likewise, when you multiply two numbers, most processors generate the full answer in two separate registers, so when (for example) you multiply two 64-bit numbers, you get a 128-bit result.
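For what it's worth, GCC and Clang expose that double-width product through the non-standard unsigned __int128 extension; a short sketch (not portable, which is why the chunked approach below still matters):

#include <cstdint>

//Full 64x64 -> 128-bit product via the GCC/Clang __int128 extension.
void mul_full(std::uint64_t a, std::uint64_t b,
              std::uint64_t& hi, std::uint64_t& lo) {
    unsigned __int128 p = (unsigned __int128)a * b;
    lo = (std::uint64_t)p;         //bottom 64 bits
    hi = (std::uint64_t)(p >> 64); //top 64 bits
}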
In C and C++, however, we don't get that. One easy way to get around it is to work in smaller chunks. For example, if we want a 128-bit type on an implementation that provides 64-bit long long as its largest integer type, we can work in 32-bit chunks. When we're going to do an operation, we widen those to a long long, and do the operation on the long long. This way, when we add or multiply two 32-bit chunks, if the result is larger than 32 bits, we can still store it all in our 64-bit long long.
So, for addition life is pretty easy. We add the two lowest order words. We use a bitmask to get the bottom 32 bits and store them into the bottom 32 bits of the result. Then we take the upper 32 bits, and use them as a "carry" when we add the next 32 bits of the operands. Continue until we've added all 128 (or whatever) bits of operands and gotten our overall result.
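As a sketch of that addition (my own illustration: 128-bit values stored as four 32-bit chunks, least significant chunk first):

#include <cstdint>

//Add two 128-bit numbers held as four 32-bit chunks each, using a
//64-bit intermediate so the carry out of each chunk is preserved.
void add128(const std::uint32_t a[4], const std::uint32_t b[4],
            std::uint32_t out[4]) {
    std::uint64_t carry = 0;
    for (int i = 0; i < 4; ++i) {
        std::uint64_t sum = (std::uint64_t)a[i] + b[i] + carry;
        out[i] = (std::uint32_t)(sum & 0xFFFFFFFFu); //bottom 32 bits
        carry = sum >> 32;                           //the "carry"
    }
}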
Subtraction is pretty similar. In fact, we can do 2's complement on the second operand, then add to get our result.
Multiplication gets a little trickier. It's not always immediately obvious how we can carry out multiplication in smaller pieces. The usual approach is based on the distributive property. That is, we can take some large numbers A and B, and break them up into A = a1 * 2^32 + a0 and B = b1 * 2^32 + b0, where each an and bn is a 32-bit chunk of the operand. Then we use the distributive property to turn the product into:
a1 * b1 * 2^64 + (a0 * b1 + a1 * b0) * 2^32 + a0 * b0
This can be extended to an arbitrary number of "chunks", though if you're dealing with really large numbers there are much better ways (e.g., Karatsuba).
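A sketch of that schoolbook scheme (again my own illustration, with little-endian 32-bit chunks and 64-bit intermediates):

#include <cstddef>
#include <cstdint>

//Schoolbook multiplication of two n-chunk numbers into a 2n-chunk
//result; each partial product is computed in 64 bits, its low half
//accumulated in place and its high half carried up a position.
void mul_chunks(const std::uint32_t* a, const std::uint32_t* b,
                std::uint32_t* out, std::size_t n) {
    for (std::size_t i = 0; i < 2 * n; ++i) out[i] = 0;
    for (std::size_t i = 0; i < n; ++i) {
        std::uint64_t carry = 0;
        for (std::size_t j = 0; j < n; ++j) {
            std::uint64_t cur = (std::uint64_t)a[i] * b[j]
                              + out[i + j] + carry;
            out[i + j] = (std::uint32_t)cur; //low 32 bits stay here
            carry = cur >> 32;               //high 32 bits move up
        }
        out[i + n] = (std::uint32_t)carry;   //final carry of this row
    }
}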
If you want to define non-atomic big integers, you can use plain structs.
#include <array>
#include <cstddef>
#include <cstdint>

template <std::size_t size>
struct big_int {
    std::array<std::uint8_t, size> bytes;
};

using int128_t = big_int<16>;
using int256_t = big_int<32>;
using int512_t = big_int<64>;

int main() {
    int128_t i128 = {};
}

Is there a computationally efficient way to store values smaller than a byte in multidimensional arrays in C/C++?

My current project involves working with arrays of 5+ dimensions, but the individual elements of the array do not need to have 256 possible values. I was wondering if I could save on memory space by using a custom data type with, for example, only 4 or 6 bits to represent the value of an element, and if these memory savings would come at some significant performance cost.
Multidimensional arrays in C are really basically arrays of arrays.
(It can't be any other way as RAM is inherently linear).
You can emulate them on linear arrays in terms of pointer arithmetic:
#undef NDEBUG
#include <assert.h>
#include <stdint.h>

int main()
{
    typedef uint32_t TYPE;
    enum { A = 3, B = 4, C = 5 };
    TYPE a[A][B][C];
    assert((char*)&a[1][2][3] == (char*)&a
           + 3*sizeof(TYPE) + 2*C*sizeof(TYPE) + 1*B*C*sizeof(TYPE));
}
Computers don't let you address sub-char types but it's not difficult to imagine a sub-char type.
The above char offset calculation for addressing a[1][2][3] could be rewritten like
char_ix = (3*sizeof(TYPE)*CHAR_BIT + 2 *C*sizeof(TYPE)*CHAR_BIT + 1 *B*C*sizeof(TYPE)*CHAR_BIT)/CHAR_BIT;
and if instead of chars (8 bits) you wanted to address e.g. 4-bit groups, each element now occupies 4 bits rather than sizeof(TYPE)*CHAR_BIT, so you'd change it to
flat_ix = 3 + 2*C + 1*B*C; //flat element index
char_ix_of_4_bit = flat_ix * 4 / CHAR_BIT; //2 4-bit groups per octet
char_ix_of_4_bit_remainder = flat_ix * 4 % CHAR_BIT; //bit offset within that octet
The 4-bit value at the destination would then be
(((unsigned char*)&a)[char_ix_of_4_bit] >> char_ix_of_4_bit_remainder) & 0xF
Similar for other bit groups.
In short, you can think of multidimensional bit arrays, reimagine them as linear bit arrays, and then use regular indexing and bit shifting to address the appropriate bit group or individual bits. (IIRC, C++'s std::bitset / std::vector<bool> hide the last part behind an overloaded [] operator, but it's not hard to do it manually, which is what you'll need to do in pure C anyway, as pure C doesn't have operator overloading.)
Bit ops are said to be slower and to generate larger code than operations on whole types, but this might well be offset by the better cache locality that sub-char bit arrays can buy you, depending on your data (you'd better have lots of data if you're attempting this).
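As a concrete illustration (my own sketch, not from the answer above), get/set helpers for a flat packed 4-bit array could look like this:

#include <stddef.h>

/* Packed array of 4-bit values, two per octet (assumes CHAR_BIT == 8). */
unsigned get4(const unsigned char *buf, size_t i)
{
    unsigned shift = (i % 2) * 4;
    return (buf[i / 2] >> shift) & 0xF;
}

void set4(unsigned char *buf, size_t i, unsigned v)
{
    unsigned shift = (i % 2) * 4;
    buf[i / 2] = (unsigned char)((buf[i / 2] & ~(0xFu << shift))
                                 | ((v & 0xFu) << shift));
}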

Determining number size in c++

I am given the hex number 0x04030201, stored in the middle of an array filled with zeroes. My code has to determine the size of this number in bits.
I am required to implement this in 2 different ways. My first way was to use the sizeof() function like this:
#include <iostream>
#include <cstdio>
using namespace std;

int main()
{
    int num[] = {0, 0, 0x04030201, 0, 0};
    cout << "The size of the integer is " << sizeof(num[2])*4 << " bits." << endl;
    return 0;
}
My expected output is 28 bits (4 bits times the 7 significant characters). But my output gives:
The size of the integer is 16 bits.
What's my mistake?
What's my mistake?
Using the wrong tool.
If you look up sizeof in your favourite documentation, you'll see that it tells you how many bytes an object or type takes up. This has nothing to do with counting the number of significant bits in an integer value, and I don't know whence you got the notion that it would.
If you want to count significant bits in an integer, you will need either of:
A numerical algorithm to calculate this using bitwise arithmetic, or
A compiler intrinsic to do this for you, e.g. __builtin_clz in GCC (subtract its result from the total number of bits in the type; now you can use sizeof!)
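For instance, a sketch with the GCC/Clang intrinsic (my own helper; note __builtin_clz is undefined for a zero argument, hence the check):

#include <climits>

//Number of significant bits in an unsigned int, via count-leading-zeros.
unsigned bit_length(unsigned n) {
    if (n == 0) return 0; //__builtin_clz(0) is undefined
    return sizeof(n) * CHAR_BIT - __builtin_clz(n);
}
//bit_length(0x04030201) == 27: the leading hex digit 4 needs only 3 bits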
My code has to determine the size of this number in bits
What does that mean?
Is it:
the size of the smallest type native to your (unspecified) architecture which can represent your number?
in this case, your code
sizeof(num[2])*4
is wrong because it (for some reason) assumes 4-bit chars. If you want to know the number of bits in a char, it's called CHAR_BIT. The sizeof expression correctly gives you the size in chars, so the code should be
sizeof(num[2])*CHAR_BIT
Note however that sizeof(num[2])==sizeof(num[0])==sizeof(int), because every int is the same size: it doesn't depend on what value it holds.
Or did you want the smallest number of bits that can represent your number, ignoring what those bits are stored in?
In this case log2(n) gives you the power of 2 your number represents. Taking the floor of that (truncating the floating-point value to an integer) gives you the position of the highest bit. Add one to get the number of bits (positions start from zero, since you need one bit to represent 1 = 2^0).
If you want to do this without using logarithms, start by writing out your number in base 2 (start with a smaller number for testing, to save time). Hint: examine what happens when you divide it by 2.
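A sketch of that approach without logarithms (each division by 2 strips one bit, so the loop runs once per significant bit):

//Count significant bits by repeated halving.
unsigned count_bits(unsigned n) {
    unsigned bits = 0;
    while (n != 0) {
        n /= 2; //same as n >>= 1
        ++bits;
    }
    return bits;
}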

Represent Integers with 2000 or more digits [duplicate]

I would like to write a program which can compute with integers having more than 2000 or 20000 digits (for Pi's decimals). I would like to do it in C++, without any libraries! (No big integer, Boost, ...). Can anyone suggest a way of doing it? Here are my thoughts:
using const char*, for holding the integer's digits;
representing the number like
( (1 * 10 + x) * 10 + x )...
The obvious answer works along these lines:
#include <cstdint>
#include <vector>

class integer {
    bool negative;
    std::vector<std::uint64_t> data;
};
Where the number is represented as a sign bit and a (unsigned) base 2**64 value.
This means the absolute value of your number is:
data[0] + (data[1] << 64) + (data[2] << 128) + ....
Or, in other terms you represent your number as a little-endian bitstring with words as large as your target machine can reasonably work with. I chose 64 bit integers, as you can minimize the number of individual word operations this way (on a x64 machine).
To implement Addition, you use a concept you have learned in elementary school:
a b
+ x y
------------------
(a+x+carry) (b+y reduced to one digit length)
The reduction (modulo 2**64) happens automatically, and the carry can only ever be either zero or one. All that remains is to detect a carry, which is simple:
bool next_carry = false;
x += y;                       //add first...
if (x < y) next_carry = true; //...then test for wrap-around
if (prev_carry && !++x) next_carry = true;
Subtraction can be implemented similarly using a borrow instead.
Note that getting anywhere close to the performance of e.g. libgmp is... unlikely.
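As a sketch of the above for non-negative values (my own illustration; data is little-endian, data[0] being the least significant 64-bit word):

#include <algorithm>
#include <cstdint>
#include <vector>

std::vector<std::uint64_t> add(const std::vector<std::uint64_t>& a,
                               const std::vector<std::uint64_t>& b) {
    std::vector<std::uint64_t> result;
    bool carry = false;
    std::size_t n = std::max(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        std::uint64_t x = i < a.size() ? a[i] : 0;
        std::uint64_t y = i < b.size() ? b[i] : 0;
        std::uint64_t sum = x + y;  //reduction modulo 2**64 is automatic
        bool next_carry = sum < y;  //wrap-around means a carry
        if (carry && ++sum == 0) next_carry = true;
        result.push_back(sum);
        carry = next_carry;
    }
    if (carry) result.push_back(1); //the number grew by one word
    return result;
}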
A long integer is usually represented by a sequence of digits (see positional notation). For convenience, use the little-endian convention: A[0] is the lowest digit, A[n-1] is the highest one. In the general case your number is equal to sum(A[i] * base^i) for some value of base.
The simplest value for base is ten, but it is not efficient. If you want to print your answer to the user often, you'd better use a power of ten as base. For instance, you can use base = 10^9 and store each digit in an int32 type. If you want maximal speed, then power-of-two bases are better. For instance, base = 2^32 is the best possible base for a 32-bit compiler (however, you'll need assembly to make it work optimally).
There are two ways to represent negative integers. The first one is to store the integer as sign + digit sequence. In this case you'll have to handle all the cases with different signs yourself. The other option is to use a complement form. It can be used for both power-of-two and power-of-ten bases.
Since the length of the sequence may vary, you'd better store the digit sequence in a std::vector. Do not forget to remove leading zeroes in that case. An alternative solution would be to always store a fixed number of digits (a fixed-size array).
The operations are implemented in a pretty straightforward way: just as you did them in school =)
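To make the printing convenience of a power-of-ten base concrete, here is a sketch (my own helper, assuming base 10^9, little-endian digits, at least one digit and no leading zero digits):

#include <cstdint>
#include <string>
#include <vector>

//value = sum(digits[i] * (10^9)^i)
std::string to_decimal(const std::vector<std::uint32_t>& digits) {
    std::string s = std::to_string(digits.back()); //top digit, no padding
    for (std::size_t i = digits.size() - 1; i-- > 0; ) {
        std::string part = std::to_string(digits[i]);
        s += std::string(9 - part.size(), '0'); //pad to exactly 9 places
        s += part;
    }
    return s;
}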
P.S. Alternatively, each integer (of bounded length) can be represented by its remainders modulo a set of different prime moduli, thanks to the CRT. Such a representation supports only a limited set of operations, and requires a nontrivial conversion if you want to print it.

big integer addition without carry flag

In assembly languages, there is usually an instruction that adds two operands and a carry. If you want to implement big integer additions, you simply add the lowest integers without a carry and the next integers with a carry. How would I do that efficiently in C or C++ where I don't have access to the carry flag? It should work on several compilers and architectures, so I cannot simply use inline assembly or such.
You can use "nails" (a term from GMP): rather than using all 64 bits of a uint64_t when representing a number, use only 63 of them, with the top bit zero. That way you can detect overflow with a simple bit-shift. You may even want less than 63.
Or, you can do half-word arithmetic. If you can do 64-bit arithmetic, represent your number as an array of uint32_ts (or equivalently, split 64-bit words into upper and lower 32-bit halves). Then, when doing arithmetic operations on these 32-bit integers, you can first promote to 64 bits, do the arithmetic there, and then convert back. This lets you detect carry, and it's also good for multiplication if you don't have a "multiply hi" instruction.
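A sketch of the half-word trick (again with names of my own choosing):

#include <cstdint>

//Add two 32-bit limbs plus carry using a 64-bit intermediate.
std::uint32_t add_half(std::uint32_t a, std::uint32_t b, std::uint32_t& carry) {
    std::uint64_t wide = (std::uint64_t)a + b + carry; //cannot overflow 64 bits
    carry = (std::uint32_t)(wide >> 32);               //0 or 1
    return (std::uint32_t)wide;                        //low 32 bits
}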
As the other answer indicates, you can detect overflow in an unsigned addition by:
uint64_t sum = a + b;
uint64_t carry = sum < a;
As an aside, while in practice this will also work in signed arithmetic, you have two issues:
It's more complex
Technically, overflowing a signed integer is undefined behavior
so you're usually better off sticking to unsigned numbers.
You can figure out the carry by virtue of the fact that, if you overflow by adding two numbers, the result will always be less than either of those other two values.
In other words, if a + b is less than a, it overflowed. That's for positive values of a and b of course but that's what you'd almost certainly be using for a bignum library.
Unfortunately, a carry introduces an extra complication in that adding the largest possible value plus a carry of one will give you the same value you started with. Hence, you have to handle that as a special case.
Something like:
//Big-endian arrays of eight 64-bit words: index 7 is least significant.
uint64_t carry = 0;
for (int i = 7; i >= 0; --i) {
    uint64_t small = std::min(a[i], b[i]);
    uint64_t large = std::max(a[i], b[i]);
    if (carry == 1 && large == UINT64_MAX) {
        c[i] = small; //max + 1 wraps all the way back to small
        carry = 1;
    } else {
        c[i] = large + small + carry;
        carry = (c[i] < large) ? 1 : 0; //wrap-around means a carry out
    }
}
In reality, you may also want to consider not using all the bits in your array elements.
I've implemented libraries in the past where the maximum "digit" is less than or equal to the square root of the highest value the element can hold. So for 8-bit (octet) digits, you store values from 0 through 15 - that way, multiplying two digits and adding the maximum carry will always fit within an octet, making overflow detection moot, though at the cost of some storage.
Similarly, 16-bit digits would have the range 0 through 255, so that a digit product plus carry never exceeds 65535.
In fact, I've sometimes limited it even further, ensuring the artificial wrap value is a power of ten (so an octet would hold 0 through 9, 16-bit digits 0 through 99, 32-bit digits 0 through 9999, and so on).
That's a bit more wasteful on space but makes conversion to and from text (such as printing your numbers) incredibly easy.
You can check for carry with unsigned types by checking whether the result is less than an operand (either operand will do).
Just start the whole thing with a carry of 0.
If I understand you correctly, you want to write your own addition for your own big integer type.
You can do this with a simple function. No need to worry about the hardware carry flag at all. Just go from right to left, adding digit by digit together with a carry (kept internally in that function) that starts at 0: set the result digit to (a + b + carry) % 10 and the new carry to (a + b + carry) / 10.
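For example, a sketch of that digit-by-digit decimal addition (my own illustration; the inputs are non-negative decimal strings, most significant digit first):

#include <algorithm>
#include <string>

std::string add_decimal(const std::string& a, const std::string& b) {
    std::string result;
    int carry = 0;
    int i = (int)a.size() - 1, j = (int)b.size() - 1;
    while (i >= 0 || j >= 0 || carry) { //right to left
        int da = i >= 0 ? a[i--] - '0' : 0;
        int db = j >= 0 ? b[j--] - '0' : 0;
        int sum = da + db + carry;
        result.push_back(char('0' + sum % 10)); //result digit
        carry = sum / 10;                       //0 or 1
    }
    std::reverse(result.begin(), result.end());
    return result;
}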
This SO question could be relevant:
How to implement big int in C