Translation of number systems - C++

I have three numbers in decimal form that together make up one larger decimal number: <120, 111, 200> - (120 * 256 + 111) * 256 + 200 = 7892936 in decimal. I store the number this way because I have a variable number of bytes in which to write it.
Q: How can I carry out the reverse operation? That is, how do I convert
7892936 back to <120, 111, 200>?

You may use a bitmask and right-shift. The following may help:
#include <array>
#include <cstdint>

std::array<std::uint8_t, 4> convert(std::uint32_t u)
{
    // casts needed: narrowing in a braced initializer list is an error
    return {
        static_cast<std::uint8_t>((u >> 24) & 0xFF),
        static_cast<std::uint8_t>((u >> 16) & 0xFF),
        static_cast<std::uint8_t>((u >> 8) & 0xFF),
        static_cast<std::uint8_t>(u & 0xFF)
    };
}
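For the number from the question, a quick check (assuming the convert function above is in scope):

#include <iostream>

int main()
{
    // 7892936 == (120 * 256 + 111) * 256 + 200, so the top byte is 0
    for (unsigned b : convert(7892936))
        std::cout << b << ' ';   // prints: 0 120 111 200
}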

You just need to apply modulo and division in a loop:
int val = 7892936;
while (val > 0) {
    int mod = val % 256;
    std::cout << mod << '\n';
    val /= 256;
}
So the result will be:
200
111
120

Use repeated modulo and division operations in a loop for generic integer radix conversion. Optimizations for specific bases are possible, but shouldn't really concern you yet.
Also, you probably don't have numbers in decimal form. You have numbers. Unless you store them as strings, it's up to the computer to store them, and it will store them as binary.
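A minimal sketch of that generic approach (the function name and most-significant-first ordering are my choice, not from the answer):

#include <cstdint>
#include <vector>

// Splits value into base-256 digits, most significant first.
std::vector<uint8_t> to_bytes(uint64_t value)
{
    std::vector<uint8_t> digits;
    do {
        digits.insert(digits.begin(), value % 256);
        value /= 256;
    } while (value > 0);
    return digits;
}

// to_bytes(7892936) yields {120, 111, 200}.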

int bigNumber = 7892936;
int a = bigNumber & 0xFF;
int b = (bigNumber & 0xFF00) >> 8;
int c = (bigNumber & 0xFF0000) >> 16;

That is, AND with 0xFF, 0xFF00, and 0xFF0000 respectively, then shift the result down.

Reverse nibbles of a hexadecimal number in C++

What would be the fastest way possible to reverse the nibbles (i.e. the hex digits) of a hexadecimal number in C++?
Here's an example of what I mean : 0x12345 -> 0x54321
Here's what I already have:
unsigned int rotation(unsigned int hex) {
    unsigned int result = 0;
    while (hex) {
        result = (result << 4) | (hex & 0xF);
        hex >>= 4;
    }
    return result;
}
This problem can be split into two parts:
Reverse the nibbles of the whole integer: reverse the bytes, then swap the two nibbles within each byte.
Shift the reversed result right by some amount to adjust for the "variable length". There are std::countl_zero(x) & -4 leading zero bits (the count of leading zeroes, rounded down to a multiple of 4) that correspond to leading zeroes in the hexadecimal representation; shifting right by that amount keeps them from participating in the reversal.
For example, using some of the new functions from <bit>:
#include <stdint.h>
#include <bit>
uint32_t reverse_nibbles(uint32_t x) {
    // reverse bytes
    uint32_t r = std::byteswap(x);
    // swap adjacent nibbles
    r = ((r & 0x0F0F0F0F) << 4) | ((r >> 4) & 0x0F0F0F0F);
    // adjust for the variable length of the input
    int len_of_zero_prefix = std::countl_zero(x) & -4;
    return r >> len_of_zero_prefix;
}
That requires C++23 for std::byteswap, which may be a bit optimistic; you can substitute it with some other byteswap.
Easily adaptable to uint64_t too.
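If std::byteswap isn't available, a hand-rolled substitute is easy to write (a sketch; note that std::countl_zero above still needs C++20). Compilers typically recognize this pattern and emit a single byte-swap instruction:

#include <stdint.h>

uint32_t byteswap32(uint32_t x) {
    // move each byte to its mirrored position
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}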
I would do it without loops, based on the assumption that the input is 32 bits:
result = (hex & 0x0000000f) << 28
| (hex & 0x000000f0) << 20
| (hex & 0x00000f00) << 12
....
I don't know if it's faster, but I find it more readable.

8-digit BCD check

I have an 8-digit BCD number and need to check whether it is a valid BCD number. How can I do this programmatically (C/C++)?
Ex: 0x12345678 is valid, but 0x00f00abc isn't.
Thanks in advance!
You need to check each 4-bit quantity to make sure it's less than 10. For efficiency you want to work on as many bits as you can at a time.
Here I split the digits apart to leave a zero nibble above each one, then add 6 to each and check for overflow.
uint32_t highs = (value & 0xf0f0f0f0) >> 4;
uint32_t lows = value & 0x0f0f0f0f;
bool invalid = (((highs + 0x06060606) | (lows + 0x06060606)) & 0xf0f0f0f0) != 0;
Edit: actually we can do slightly better. It doesn't take 4 bits to detect overflow, only 1. If we divide all the digits by 2, it frees a bit and we can check all the digits at once.
uint32_t halfdigits = (value >> 1) & 0x77777777;
bool invalid = ((halfdigits + 0x33333333) & 0x88888888) != 0;
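A quick sanity check of the half-digit version against the examples from the question (the wrapper function is my own):

#include <cassert>
#include <cstdint>

bool is_valid_bcd(uint32_t value) {
    // halve each digit, add 3 (= 6/2), and look for a carry into bit 3
    uint32_t halfdigits = (value >> 1) & 0x77777777;
    return ((halfdigits + 0x33333333) & 0x88888888) == 0;
}

int main() {
    assert(is_valid_bcd(0x12345678));   // all digits 0-9: valid
    assert(!is_valid_bcd(0x00f00abc));  // contains digits above 9: invalid
}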
The obvious way to do this is:
/* returns 1 if x is valid BCD */
int
isvalidbcd (uint32_t x)
{
    for (; x; x = x >> 4)
    {
        if ((x & 0xf) >= 0xa)
            return 0;
    }
    return 1;
}
This link tells you all about BCD, and recommends something like this as a more optimised solution (reworked to check all the digits, hence using a 64-bit data type, and untested):
/* returns 1 if x is valid BCD */
int
isvalidbcd (uint32_t x)
{
    /* adding 6 to any digit of 10-15 carries into the next nibble;
       the XOR exposes exactly those carry bits, selected by the mask */
    return !((((uint64_t)x + 0x66666666ULL) ^ (uint64_t)x) & 0x111111110ULL);
}
For a digit to be invalid, it needs to be 10-15. That in turn means the 8 bit set together with either the 4 bit or the 2 bit - the low bit doesn't matter at all.
So:
long mask8 = value & 0x88888888;
long mask4 = value & 0x44444444;
long mask2 = value & 0x22222222;
return ((mask8 >> 2) & ((mask4 >> 1) | mask2)) == 0;
Slightly less obvious:
long mask8 = value >> 2;
long mask42 = value | (value >> 1);
return (mask8 & mask42 & 0x22222222) == 0;
By shifting before masking, we don't need 3 different masks.
Inspired by @Mark Ransom:
bool invalid = (0x88888888 & (((value & 0xEEEEEEEE) >> 1) + (0x66666666 >> 1))) != 0;
// or
bool valid = !((((value & 0xEEEEEEEEu) >> 1) + 0x33333333) & 0x88888888);
Mask off each BCD digit's 1's place, shift right, then add 6 and check for BCD digit overflow.
How this works:
By adding 6 to each digit, we look for an overflow (*) of the 4-bit sum.
abcd
+ 110
-----
*efgd
But the bit value of d does not contribute to the overflow, so we first mask off that bit and shift right. Now the overflow bit lands in the 8's place. This is all done in parallel across the eight digits; we mask the carry bits with 0x88888888 and test whether any are set.
0abc
+ 11
-----
*efg
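Working one digit through by hand (my own example): for the digit 0xB, (0xB & 0xE) >> 1 = 5 and 5 + 3 = 8, so bit 3 is set and the digit is flagged invalid; for the digit 9, (9 & 0xE) >> 1 = 4 and 4 + 3 = 7, so bit 3 stays clear and the digit passes.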

Hashing Function for Three Signed Integers

I'm trying to use an unordered_map with three signed integers as a key (this is because I wish to use tbb's concurrent_unordered_map).
I put together this little (3x16-bit => 64-bit) function:
// to hash
int64_t result = int16_t(x);
result = int64_t(result << 16) + int16_t(y);
result = int64_t(result << 16) + int16_t(z);
// from hash
int16_t x_ = int16_t(result >> 32);
int16_t y_ = int16_t(result >> 16);
int16_t z_ = int16_t(result & 0xFFFF);
This isn't working, what mistake have I made here?
My distribution of numbers is such that numbers closer to zero, negative or positive, are more likely (typically less than +/- 2^8), but I would like to extend this to work with a range up to 2^32, rather than my 2^16 example here. Ideally, I'm looking for very few collisions within the typical range and preferably a simple algorithm. Any suggestions?
Your problem is that you are performing bit manipulation and addition on signed numbers. If a number is negative, the addition turns into a subtraction. It will be difficult to tease out the correct original values after that happens.
Consider:
int16_t x = -1, y = 2, z = -3;
int64_t result = x; // result: FFFFFFFFFFFFFFFF
result = (result << 16) + y; // result: FFFFFFFFFFFF0000 + 0002
result = (result << 16) + z; // result: FFFFFFFF00020000 - 0003
return result; // result: FFFFFFFF0001FFFD
Thus, while -1 and -3 have been preserved, the subtraction has reduced 2 to 1.
Instead, you should limit your operations to unsigned values. With unsigned values, + and | are equivalent in your code, since you are adding into the part of the number that is being zero-filled.
int64_t hash() {
    uint64_t result = uint16_t(x_);
    result = (result << 16) + uint16_t(y_);
    result = (result << 16) + uint16_t(z_);
    return result;
}
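For completeness, the reverse direction would look like this (my sketch, not part of the original answer): extract each 16-bit field and convert back through uint16_t, which restores the sign on two's-complement targets (guaranteed since C++20):

#include <cstdint>

void unhash(uint64_t result, int16_t &x, int16_t &y, int16_t &z) {
    // each field occupies 16 bits; the uint16_t cast isolates it,
    // and the int16_t conversion reinterprets it as signed
    x = int16_t(uint16_t(result >> 32));
    y = int16_t(uint16_t(result >> 16));
    z = int16_t(uint16_t(result));
}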

Grabbing n bits from a byte

I'm having a little trouble grabbing n bits from a byte.
I have an unsigned integer. Let's say our number in hex is 0x2A, which is 42 in decimal. In binary it looks like this: 0010 1010. How would I grab the first 5 bits which are 00101 and the next 3 bits which are 010, and place them into separate integers?
If anyone could help me that would be great! I know how to extract a single byte, which is simply
int x = (number >> (8*n)) & 0xff; // n being the byte number
which I saw on another post on stack overflow, but I wasn't sure on how to get separate bits out of the byte. If anyone could help me out, that'd be great! Thanks!
Integers are represented inside a machine as a sequence of bits; fortunately for us humans, programming languages provide a mechanism to show us these numbers in decimal (or hexadecimal), but that does not alter their internal representation.
You should review the bitwise operators &, |, ^ and ~ as well as the shift operators << and >>, which will help you understand how to solve problems like this.
The last 3 bits of the integer are:
x & 0x7
The five bits above them are:
(x >> 3) & 0x1F // shift away the last three bits, then keep the lowest five
"grabbing" parts of an integer type in C works like this:
You shift the bits you want to the lowest position.
You use & to mask the bits you want - a one means "copy this bit", a zero means "ignore it".
So, in your example, let's say we have a number int x = 42;
first 5 bits:
(x >> 3) & ((1 << 5)-1);
or
(x >> 3) & 31;
To fetch the lower three bits:
(x >> 0) & ((1 << 3)-1)
or:
x & 7;
Say you want hi bits from the top, and lo bits from the bottom. (5 and 3 in your example)
top = (n >> lo) & ((1 << hi) - 1)
bottom = n & ((1 << lo) - 1)
Explanation:
For the top, first get rid of the lower bits (shift right), then mask the remaining bits with an "all ones" mask (if you have a binary number like 0010000, subtracting one yields 0001111 - the same number of 1s as you had 0s in the original number).
For the bottom it's the same, except you don't need the initial shift.
top = (42 >> 3) & ((1 << 5) - 1) = 5 & (32 - 1) = 5 = 00101b
bottom = 42 & ((1 << 3) - 1) = 42 & (8 - 1) = 2 = 010b
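As reusable functions (the names are mine), the same two formulas:

#include <cassert>

unsigned top_bits(unsigned n, unsigned lo, unsigned hi) {
    // skip the lo low bits, then keep hi bits
    return (n >> lo) & ((1u << hi) - 1);
}

unsigned bottom_bits(unsigned n, unsigned lo) {
    // keep the lo low bits
    return n & ((1u << lo) - 1);
}

int main() {
    assert(top_bits(42, 3, 5) == 5);  // 00101b
    assert(bottom_bits(42, 3) == 2);  // 010b
}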
You could use bitfields for this. Bitfields are special struct members whose width is specified in bits.
typedef struct {
    unsigned char a : 5;
    unsigned char b : 3;
} my_bit_t;

unsigned char c = 0x2A; // 42, the example value from the question
my_bit_t *n = (my_bit_t *)&c;
int first = n->a;
int sec = n->b;
Bit fields are described in more detail at http://www.cs.cf.ac.uk/Dave/C/node13.html#SECTION001320000000000000000
The charm of bit fields is that you do not have to deal with shift operators etc. The notation is quite easy. As always when manipulating bits, there is a portability issue: the allocation order of bit fields within a byte is implementation-defined.
int x = (number >> 3) & 0x1f;
will give you an integer whose last 5 bits are bits 3-7 of number, with zeros in the other bits.
Similarly,
int y = number & 0x7;
will give you an integer whose last 3 bits are the last 3 bits of number, with zeros in the rest.
just get rid of the 8* in your code.
int input = 42;
int high3 = input >> 5;
int low5 = input & (32 - 1); // 32 = 2^5
bool isBit3On = input & 4; // 4 = 2^(3-1)

How to convert 8 17-bit integers into 17 8-bit integers efficiently

Okay, I have the following problem: I have a set of 8 (unsigned) numbers that are all 17-bit (a.k.a. none of them is bigger than 131071). Since 17-bit numbers are annoying to work with (keeping them in a 32-bit int is a waste of space), I would like to turn these into 17 8-bit numbers, like so:
If I have these 8 17-bit integers:
[25409, 23885, 24721, 23159, 25409, 23885, 24721, 23159]
I would turn them into a base-2 representation:
["00110001101000001", "00101110101001101", "00110000010010001", "00101101001110111", "00110001101000001", "00101110101001101", "00110000010010001", "00101101001110111"]
Then join that into one big string:
"0011000110100000100101110101001101001100000100100010010110100111011100110001101000001001011101010011010011000001001000100101101001110111"
Then split that into 17 strings, each with 8 chars:
["00110001", "10100000", "10010111", "01010011", "01001100", "00010010", "00100101", "10100111", "01110011", "00011010", "00001001", "01110101", "00110100", "11000001", "00100010", "01011010", "01110111"]
And, finally, convert the binary representations back into integers
[49, 160, 151, 83, 76, 18, 37, 167, 115, 26, 9, 117, 52, 193, 34, 90, 119]
This method works, but it's not very efficient. I am looking for something more efficient, preferably coded in C++, since that's the language I am working with. I just can't think of any way to do this more efficiently, and 17-bit numbers aren't exactly easy to work with (16-bit numbers would be much nicer).
Thanks in advance, xfbs
Store the lowest 16 bits of each number as-is (i.e. in two bytes). This leaves the most significant bit of each number. Since there are eight such numbers, simply combine the eight bits into one extra byte.
This will require exactly the same amount of memory as your method, but will involve a lot less bit twiddling.
P.S. Regardless of the storage method, you should be using bit-manipulation operators (<<, >>, &, | and so on) to do the job; there should not be any intermediate string-based representations involved.
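A sketch of that layout (function names and exact byte order are my choice): two bytes per number for the low 16 bits, plus one trailing byte collecting the eight top bits.

#include <cstdint>

// pack 8 17-bit values into 17 bytes
void pack(const uint32_t src[8], uint8_t dst[17]) {
    uint8_t msbs = 0;
    for (int i = 0; i < 8; i++) {
        dst[2 * i]     = uint8_t(src[i] >> 8);       // bits 8-15
        dst[2 * i + 1] = uint8_t(src[i]);            // bits 0-7
        msbs |= uint8_t(((src[i] >> 16) & 1u) << i); // bit 16
    }
    dst[16] = msbs;
}

void unpack(const uint8_t src[17], uint32_t dst[8]) {
    for (int i = 0; i < 8; i++) {
        dst[i] = (uint32_t((src[16] >> i) & 1u) << 16)
               | (uint32_t(src[2 * i]) << 8)
               |  src[2 * i + 1];
    }
}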
Have a look at std::bitset<N>. Maybe you can stuff them into that?
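For instance (a sketch; the bit-indexing scheme is my choice), 8 values of 17 bits fit exactly into a std::bitset<136>:

#include <bitset>
#include <cstdint>

std::bitset<136> pack_bits(const uint32_t (&src)[8]) {
    std::bitset<136> bits;
    for (int i = 0; i < 8; i++)
        for (int b = 0; b < 17; b++)
            bits[17 * i + b] = (src[i] >> b) & 1u; // value i occupies bits [17i, 17i+16]
    return bits;
}

Note that std::bitset typically rounds its storage up to whole machine words, so this helps more with indexing than with saving memory.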
Efficiently? Then don't use string conversions, bitfields, etc. Do the shifts yourself. (Note that the arrays must be unsigned so that we don't run into problems when shifting.)
uint32_t A[8]; // your input, unsigned
uint8_t B[17]; // output, unsigned byte
B[0] = (uint8_t)A[0];
B[1] = (uint8_t)(A[0] >> 8);
B[2] = (uint8_t)A[1];
B[3] = (uint8_t)(A[1] >> 8);
...
And for the last one, we do what ajx said: we take the most significant bit of each number (shifting 16 bits to the right leaves the 17th bit) and fill the bits of our output byte by shifting each of those bits left by 0 to 7:
B[16] = (A[0] >> 16) | ((A[1] >> 16) << 1) | ((A[2] >> 16) << 2) | ((A[3] >> 16) << 3) | ... | ((A[7] >> 16) << 7);
Well, "efficient" was this. Other easier methods exist, too.
Though you say they are 17-bit numbers, they must be stored in an array of 32-bit integers, where only the least significant 17 bits are used. From the first number you can extract two bytes directly (dst[0] = src[0] >> 9 is the first, dst[1] = (src[0] >> 1) & 0xff the second); then you "push" the leftover bit in as the 18th bit of the second number, so that
dst[2] = (src[0] & 1) << 7 | src[1] >> 10;
dst[3] = (src[1] >> 2) & 0xff;
If you generalize it, you will see that this "formula" may be applied:
dst[2*i] = src[i] >> (9+i) | (src[i-1] & BITS(i)) << (8-i);
dst[2*i + 1] = (src[i] >> (i+1)) & 0xff;
and for the last one: dst[16] = src[7] & 0xff;.
The whole code could look like this:
dst[0] = src[0] >> 9;
dst[1] = (src[0] >> 1) & 0xff;
for (i = 1; i < 8; i++)
{
    dst[2*i] = src[i] >> (9+i) | (src[i-1] & BITS(i)) << (8-i);
    dst[2*i + 1] = (src[i] >> (i+1)) & 0xff;
}
dst[16] = src[7] & 0xff;
By analysing the loop more closely, optimizations can be made so that the boundary cases don't need special treatment. The BITS macro creates a mask of the I least significant bits set to 1. Something like (to be checked for a better way, if any):
#define BITS(I) (~((~0u) << (I)))
Added: here I assumed src is e.g. int32_t and dst int8_t or similar.
This is in C; in C++ you could use a vector instead.
#define srcLength 8
#define destLength 17

int src[srcLength] = { 25409, 23885, 24721, 23159, 25409, 23885, 24721, 23159 };
unsigned char dest[destLength] = { 0 };

/* note: srcLength doubles as the output chunk width (8 bits) and
   destLength as the input width (17 bits) */
unsigned int srcElement = 0; /* unsigned, so the left shifts are well-defined */
int bits = 0;
int i = 0;
int j = 0;
do {
    while (bits >= srcLength) {
        dest[i++] = srcElement >> (bits - srcLength);  /* emit top 8 buffered bits */
        srcElement &= (1u << (bits - srcLength)) - 1;  /* drop the emitted bits */
        bits -= srcLength;
    }
    if (j < srcLength) {
        srcElement <<= destLength; /* make room for 17 more bits */
        bits += destLength;
        srcElement |= src[j++];
    }
} while (bits > 0);
Disclaimer: if you literally have seventeen integers (and not 100000 groups of 17), you should forget these optimizations, as long as your program doesn't run veeery slowly.
I'd probably go about it this way. I don't want to deal with weird types when I'm doing my processing. Maybe I need to store them in some funky format due to legacy problems, though. The hard-coded values should probably be derived from the 17-bit width; I just didn't bother.
#include <cstdint>
#include <vector>

struct int_block {
    static const uint32_t w = 17;     // width in bits
    static const uint32_t m = 131071; // low-17-bit mask

    int_block() : data(151, 0) {} // w * 8 + (sizeof(uint32_t) * 8 - w)

    uint32_t get(size_t i) const {
        uint32_t retval = *reinterpret_cast<const uint32_t *>(&data[i * w]);
        retval &= m;
        return retval;
    }

    void set(size_t i, uint32_t val) {
        uint32_t prev = *reinterpret_cast<const uint32_t *>(&data[i * w]);
        prev &= ~m;
        val |= prev;
        *reinterpret_cast<uint32_t *>(&data[i * w]) = val;
    }

    std::vector<char> data;
};
TEST(int_block_test) {
    int_block ib;
    for (uint32_t i = 0; i < 8; i++)
        ib.set(i, i + 25);
    for (uint32_t i = 0; i < 8; i++)
        CHECK_EQUAL(i + 25, ib.get(i));
}
You'd be able to break this by giving it bad values, but I'll leave that as an exercise for the reader. :))
Quite honestly, I think you'd be better off representing them as 32-bit integers and just writing conversion functions. But I suspect you don't have control over that.