Note - This is NOT a duplicate of this question - Count the consecutive zero bits (trailing) on the right in parallel: an explanation? The linked question has a different context; it only asks about the purpose of signed() being used. DO NOT mark this question as a duplicate.
I've been looking for a way to get the number of trailing zeros in a number. I found a Stanford University write-up on bit twiddling that gives the following explanation.
unsigned int v; // 32-bit word input to count zero bits on right
unsigned int c = 32; // c will be the number of zero bits on the right
v &= -signed(v);
if (v) c--;
if (v & 0x0000FFFF) c -= 16;
if (v & 0x00FF00FF) c -= 8;
if (v & 0x0F0F0F0F) c -= 4;
if (v & 0x33333333) c -= 2;
if (v & 0x55555555) c -= 1;
Why does this end up working? I understand how hex numbers are represented in binary and how bitwise operators work, but I am unable to figure out the intuition behind this. What is the working mechanism?
The code is broken (undefined behavior is present). Here is a fixed version which is also slightly easier to understand (and probably faster):
uint32_t v; // 32-bit word input to count zero bits on right
unsigned c; // c will be the number of zero bits on the right
if (v) {
v &= -v; // keep rightmost set bit (the one that determines the answer) clear all others
c = 0;
if (v & 0xAAAAAAAAu) c |= 1; // binary 10..1010
if (v & 0xCCCCCCCCu) c |= 2; // binary 1100..11001100
if (v & 0xF0F0F0F0u) c |= 4;
if (v & 0xFF00FF00u) c |= 8;
if (v & 0xFFFF0000u) c |= 16;
}
else c = 32;
Once we know only one bit is set, we determine one bit of the result at a time, by simultaneously testing all bits where the result is odd, then all bits where the result has the 2's-place set, etc.
The original code worked in reverse, starting with all bits of the result set (after the if (v) c--;) and then determining which needed to be zero and clearing them.
Since we are learning one bit of the output at a time, I think it's more clear to build the output using bit operations not arithmetic.
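To make the bit-by-bit construction easier to try out, here is a minimal sketch of the same idea wrapped in a function. The name count_trailing_zeros32 is my own, and I write the negation as ~v + 1u so it stays in unsigned arithmetic; treat it as an illustration rather than the answer's exact code.

```cpp
#include <cstdint>

// Hypothetical wrapper name; not part of the original answer.
// Returns the number of trailing zero bits of v, or 32 when v == 0.
unsigned count_trailing_zeros32(std::uint32_t v) {
    if (v == 0) return 32;
    v &= (~v + 1u);               // unsigned negate: keeps only the lowest set bit
    unsigned c = 0;
    if (v & 0xAAAAAAAAu) c |= 1;  // the bit sits at an odd position
    if (v & 0xCCCCCCCCu) c |= 2;  // the position has its 2's place set
    if (v & 0xF0F0F0F0u) c |= 4;  // the position has its 4's place set
    if (v & 0xFF00FF00u) c |= 8;  // the position has its 8's place set
    if (v & 0xFFFF0000u) c |= 16; // the position has its 16's place set
    return c;
}
```

For example, 0x50 is binary 01010000; only bit 4 survives the isolation step, and of the five masks only 0xF0F0F0F0 matches it, so the result is 4.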
This code (from the net) is mostly C, although v &= -signed(v); isn't correct C. The intent is for it to behave as v &= ~v + 1;
First, if v is zero, then it remains zero after the & operation, and all of the if statements are skipped, so you get 32.
Otherwise, the & operation (when corrected) clears all bits to the left of the rightmost 1, so at that point v contains a single 1 bit. Then c is decremented to 31, i.e. all 1 bits within the possible result range.
The if statements then determine its numeric position one bit at a time (one bit of the position number, not of v), clearing the bits that should be 0.
The code first transforms v in such a way that it is entirely zero, except for the rightmost one, which remains. Then it determines the position of this one.
First, let's see how we suppress all ones but the rightmost one.
Assume that k is the position of the rightmost one in v: v=(vn-1,vn-2,..vk+1,1,0,..0).
-v is the number that, added to v, gives 0 (actually it gives 2^n, but bit 2^n is ignored if we only keep the n least significant bits).
What must the bits of -v be so that v+(-v)=0?
Obviously, bits k-1..0 of -v must be 0 so that, added to the trailing zeros of v, they give zero.
Bit k must be 1. Added to the one in vk, it gives a zero and a carry of one into position k+1.
Bit k+1 of -v will be added to vk+1 and to the carry generated at step k. It must be the logical complement of vk+1. So whatever the value of vk+1, we will have 1+0+1 if vk+1=0 (or 1+1+0 if vk+1=1), and the result will be 0 at position k+1 with a carry generated into position k+2.
The same holds for bits n-1..k+2: they must all be the logical complement of the corresponding bit in v.
Hence, we get the well-known result that to get -v, one must
leave unchanged all trailing zeros of v
leave unchanged the rightmost one of v
complement all the other bits.
If we compute v&-v, we have
v vn-1 vn-2 ... vk+1 1 0 0 ... 0
-v & ~vn-1 ~vn-2 ... ~vk+1 1 0 0 ... 0
v&-v 0 0 ... 0 1 0 0 ... 0
So v&-v only keeps the rightmost one in v.
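The isolation step on its own can be checked with a few values. This is only a sketch: isolate_lowest_set_bit is a name I made up, and I spell the negation as ~v + 1u so that everything stays in unsigned arithmetic.

```cpp
#include <cstdint>

// Hypothetical helper name. Keeps only the lowest (least significant) set bit of v.
std::uint32_t isolate_lowest_set_bit(std::uint32_t v) {
    // ~v + 1u is the two's-complement negation of v, computed without signed types.
    return v & (~v + 1u);
}
```

For instance, 12 is binary 1100, its negation ends in ...0100, and the AND leaves 0100, i.e. 4. An input of 0 stays 0.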
To find the location of first one, look at the code:
if (v) c--; // no 1 in result? -> 32 trailing zeros.
// Otherwise it will be in range c..0=31..0
if (v & 0x0000FFFF) c -= 16; // If there is a one in the lower (rightmost) half of v, the range
// of possible values for the location of this one
// will be 15..0.
// Otherwise, the range must be 31..16.
// The remaining range is c..c-15.
if (v & 0x00FF00FF) c -= 8; // If there is a one in either byte 0 (c=15) or byte 2 (c=31),
// the one is in the lower part of the range,
// so we must subtract 8 from the boundaries of the range.
// Otherwise, the one is in the upper part.
// The possible range of positions is now c..c-7.
if (v & 0x0F0F0F0F) c -= 4; // do the same for the other bits.
if (v & 0x33333333) c -= 2;
if (v & 0x55555555) c -= 1;
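For anyone who wants to step through the narrowing of the range in a debugger, here is a sketch of the subtractive version as a complete function. The function name is hypothetical, and the isolation step uses ~v + 1u instead of -signed(v), which is my substitution.

```cpp
#include <cstdint>

// Hypothetical name. Same narrowing scheme as the commented code above:
// c starts at 32 and every test that hits lowers the upper bound of the range.
unsigned trailing_zeros_subtractive(std::uint32_t v) {
    unsigned c = 32;
    v &= (~v + 1u);                   // keep only the lowest set bit
    if (v) c--;                       // a one exists: position is in 31..0
    if (v & 0x0000FFFFu) c -= 16;     // one is in the lower half
    if (v & 0x00FF00FFu) c -= 8;      // one is in byte 0 or byte 2
    if (v & 0x0F0F0F0Fu) c -= 4;      // one is in a low nibble
    if (v & 0x33333333u) c -= 2;      // one is in the low pair of its nibble
    if (v & 0x55555555u) c -= 1;      // one is at an even position
    return c;
}
```

With v = 0x100 (bit 8) the values of c are 31, 15, 15, 11, 9, 8, so the function returns 8.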
I need to test whether the positions (from 0 to 31 for a 32bit integer) with bit value 1 form a contiguous region. For example:
00111111000000000000000000000000 is contiguous
00111111000000000000000011000000 is not contiguous
I want this test, i.e. some function has_contiguous_one_bits(int), to be portable.
One obvious way is to loop over positions to find the first set bit, then the first non-set bit and check for any more set bits.
I wonder whether there exists a faster way? If there are fast methods to find the highest and lowest set bits (but from this question it appears there aren't any portable ones), then a possible implementation is
bool has_contiguous_one_bits(int val)
{
auto h = highest_set_bit(val);
auto l = lowest_set_bit(val);
return val == (((1 << (h-l+1))-1)<<l);
}
Just for fun, here are the first 100 integers with contiguous bits:
0 1 2 3 4 6 7 8 12 14 15 16 24 28 30 31 32 48 56 60 62 63 64 96 112 120 124 126 127 128 192 224 240 248 252 254 255 256 384 448 480 496 504 508 510 511 512 768 896 960 992 1008 1016 1020 1022 1023 1024 1536 1792 1920 1984 2016 2032 2040 2044 2046 2047 2048 3072 3584 3840 3968 4032 4064 4080 4088 4092 4094 4095 4096 6144 7168 7680 7936 8064 8128 8160 8176 8184 8188 8190 8191 8192 12288 14336 15360 15872 16128 16256 16320
they are (of course) of the form (1<<m)*((1<<n)-1) with non-negative m and n.
Solution:
static _Bool IsCompact(unsigned x)
{
return (x & x + (x & -x)) == 0;
}
Briefly:
x & -x gives the lowest bit set in x (or zero if x is zero).
x + (x & -x) converts the lowest string of consecutive 1s to a single 1 higher up (or wraps to zero).
x & x + (x & -x) clears that lowest string of consecutive 1s.
(x & x + (x & -x)) == 0 tests whether any other 1 bits remain.
Longer:
-x equals ~x+1 (for the int in the question, we assume two’s complement, but unsigned is preferable). After the bits are flipped in ~x, adding 1 carries so that it flips back the low 1 bits in ~x and the first 0 bit but then stops. Thus, the low bits of -x up to and including its first 1 are the same as the low bits of x, but all higher bits are flipped. (Example: ~10011100 gives 01100011, and adding 1 gives 01100100, so the low 100 are the same, but the high 10011 are flipped to 01100.) Then x & -x gives us the only bit that is 1 in both, which is that lowest 1 bit (00000100). (If x is zero, x & -x is zero.)
Adding this to x causes a carry through all the consecutive 1s, changing them to 0s. It will leave a 1 at the next higher 0 bit (or carry through the high end, leaving a wrapped total of zero). (In the example, 10011100 + 00000100 gives 10100000.)
When this is ANDed with x, there are 0s in the places where the 1s were changed to 0s (and also where the carry changed a 0 to a 1). So the result is not zero only if there is another 1 bit higher up.
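The three steps can be spelled out with named intermediates, which makes the example above easy to reproduce. This is a sketch under my own naming (is_compact, lowest, carried), using 0u - x to keep the negation unsigned.

```cpp
// Hypothetical spelled-out variant of the one-liner; names are illustrative only.
bool is_compact(unsigned x) {
    const unsigned lowest  = x & (0u - x);  // lowest set bit of x (0 if x == 0)
    const unsigned carried = x + lowest;    // lowest run of 1s collapses into one higher 1
    return (x & carried) == 0;              // any surviving 1 means a second run exists
}
```

With x = 10011100 (0x9C): lowest is 00000100, carried is 10100000, and x & carried is 10000000, so the value is rejected. With x = 00011100 (0x1C), carried is 00100000 and the AND is zero.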
There is actually no need to use any intrinsics.
First, fill in all the 0s below the lowest 1. Then test whether the new value is a Mersenne number (of the form 2^n - 1). In this algorithm, zero is mapped to true.
bool has_compact_bits( unsigned const x )
{
// fill up the low order zeroes
unsigned const y = x | ( x - 1 );
// test if the 1's is one solid block
return not ( y & ( y + 1 ) );
}
Of course, if you want to use intrinsics, here is the popcount method:
bool has_compact_bits( unsigned const x )
{
size_t const num_bits = CHAR_BIT * sizeof(unsigned);
size_t const sum = __builtin_ctz(x) + __builtin_popcount(x) + __builtin_clz(x);
return sum == num_bits;
}
Actually you don't need to count leading zeros. As suggested by pmg in the comments, exploiting the fact that the numbers you are looking for are those of sequence OEIS A023758, i.e. Numbers of the form 2^i - 2^j with i >= j, you may just count trailing zeros (i.e. j - 1), toggle those bits in the original value (equivalent to add 2^j - 1), and then check if that value is of the form 2^i - 1. With GCC/clang intrinsics,
bool has_compact_bits(int val) {
if (val == 0) return true; // __builtin_ctz undefined if argument is zero
int j = __builtin_ctz(val) + 1;
val |= (1 << j) - 1; // add 2^j - 1
val &= (val + 1); // val set to zero if of the form (2^i - 1)
return val == 0;
}
This version is slightly faster than yours and the one proposed by KamilCuk and the one by Yuri Feldman with popcount only.
If you are using C++20, you may get a portable function by replacing __builtin_ctz with std::countr_zero:
#include <bit>
bool has_compact_bits(int val) {
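if (val == 0) return true; // otherwise j would exceed the width of int and the shift below would be undefined
int j = std::countr_zero(static_cast<unsigned>(val)) + 1; // ugly cast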
val |= (1 << j) - 1; // add 2^j - 1
val &= (val + 1); // val set to zero if of the form (2^i - 1)
return val == 0;
}
The cast is ugly, but it is warning you that it is better to work with unsigned types when manipulating bits. A pre-C++20 alternative is boost::multiprecision::lsb.
Edit:
The benchmark on the strikethrough link was limited by the fact that no popcount instruction had been emitted for Yuri Feldman version. Trying to compile them on my PC with -march=westmere, I've measured the following time for 1 billion iterations with identical sequences from std::mt19937:
your version: 5.7 s
KamilCuk's second version: 4.7 s
my version: 4.7 s
Eric Postpischil's first version: 4.3 s
Yuri Feldman's version (using explicitly __builtin_popcount): 4.1 s
So, at least on my architecture, the fastest seems to be the one with popcount.
Edit 2:
I've updated my benchmark with the new Eric Postpischil's version. As requested in the comments, code of my test can be found here. I've added a no-op loop to estimate the time needed by the PRNG. I've also added the two versions by KevinZ. Code has been compiled on clang with -O3 -msse4 -mbmi to get popcnt and blsi instruction (thanks to Peter Cordes).
Results: At least on my architecture, Eric Postpischil's version is exactly as fast as Yuri Feldman's, and at least twice as fast as any other version proposed so far.
Not sure about fast, but can do a one-liner by verifying that val^(val>>1) has at most 2 bits on.
This only works with unsigned types: shifting in a 0 at the top (logical shift) is necessary, not an arithmetic right shift that shifts in a copy of the sign bit.
#include <bitset>
bool has_compact_bits(unsigned val)
{
return std::bitset<8*sizeof(val)>((val ^ (val>>1))).count() <= 2;
}
To reject 0 (i.e. only accept inputs that have exactly 1 contiguous bit-group), logical-AND with val being non-zero. Other answers on this question accept 0 as compact.
bool has_compact_bits(unsigned val)
{
return std::bitset<8*sizeof(val)>((val ^ (val>>1))).count() <= 2 and val;
}
C++ portably exposes popcount via std::bitset::count(), or in C++20 via std::popcount. C still doesn't have a portable way that reliably compiles to a popcnt or similar instruction on targets where one is available.
CPUs have dedicated instructions for that, very fast. On PC they are BSR/BSF (introduced in 80386 in 1985), on ARM they are CLZ/CTZ
Use one to find the index of least significant set bit, shift integer right by that amount. Use another one to find an index of the most significant set bit, compare your integer with (1u<<(bsr+1))-1.
Unfortunately, 35 years wasn't enough to update the C++ language to match the hardware. To use these instructions from C++ you'll need intrinsics, these aren't portable, and return results in slightly different formats. Use preprocessor, #ifdef etc, to detect the compiler and then use appropriate intrinsics. In MSVC they are _BitScanForward, _BitScanForward64, _BitScanReverse, _BitScanReverse64. In GCC and clang they are __builtin_clz and __builtin_ctz.
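As a rough sketch of the shift-and-compare recipe, the following uses two plain loops as stand-ins for the bit-scan instructions, so it compiles anywhere. The helper names are hypothetical; in real code you would swap them for the intrinsics of your compiler.

```cpp
#include <climits>

// Portable stand-in for BSF / ctz. Precondition: v != 0.
static unsigned lowest_set_index(unsigned v) {
    unsigned i = 0;
    while (!(v & 1u)) { v >>= 1; ++i; }
    return i;
}

// Portable stand-in for BSR. Precondition: v != 0.
static unsigned highest_set_index(unsigned v) {
    unsigned i = 0;
    while (v >>= 1) ++i;
    return i;
}

bool has_contiguous_one_bits(unsigned val) {
    if (val == 0) return true;  // assumption: zero counts as contiguous
    const unsigned shifted = val >> lowest_set_index(val);
    const unsigned h = highest_set_index(shifted);
    // Build a mask of h+1 low ones without ever shifting by the full width.
    const unsigned width = sizeof(unsigned) * CHAR_BIT;
    const unsigned mask = (h + 1 == width) ? ~0u : ((1u << (h + 1)) - 1u);
    return shifted == mask;
}
```

The special case for h + 1 == width matters for an input of all ones, where 1u << (h + 1) would shift by the full width of the type.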
Comparison with zeros instead of ones will save some operations:
bool has_compact_bits2(int val) {
if (val == 0) return true;
int h = __builtin_clz(val);
// Clear bits to the left
val = (unsigned)val << h;
int l = __builtin_ctz(val);
// Invert
// >>l - Clear bits to the right
return (~(unsigned)val)>>l == 0;
}
The following results in one instruction fewer than the above on gcc10 -O3 on x86_64 and relies on sign extension:
bool has_compact_bits3(int val) {
if (val == 0) return true;
int h = __builtin_clz(val);
val <<= h;
int l = __builtin_ctz(val);
return ~(val>>l) == 0;
}
Tested on godbolt.
You can rephrase the requirement:
let N be the number of bits that differ from the previous bit (found by iterating through the bits)
if N=2 and the first or last bit is 0, then the answer is yes
if N=1, then the answer is yes (because all the 1s are on one side)
if N=0, then all bits are equal: either all are 1 (the answer is yes) or all are 0 (you have no 1s; it is up to you whether you consider the answer to be yes or no)
anything else: the answer is no
Going through all bits could look like this:
unsigned int count_bit_changes (uint32_t value) {
unsigned int bit;
unsigned int changes = 0;
uint32_t last_bit = value & 1;
for (bit = 1; bit < 32; bit++) {
value = value >> 1;
if ((value & 1) != last_bit) {
changes++;
last_bit = value & 1;
}
}
return changes;
}
But this can surely be optimized (e.g. by aborting the for loop when value reached 0 which means no more significant bits with value 1 are present).
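Putting the counting loop and the rules on N together could look like the sketch below. The decision function and its name are my own addition, and treating an all-zero input as contiguous is an arbitrary choice on my part.

```cpp
#include <cstdint>

// Counts how often adjacent bits differ, scanning from bit 0 to bit 31.
unsigned count_bit_changes(std::uint32_t value) {
    unsigned changes = 0;
    std::uint32_t last_bit = value & 1u;
    for (unsigned bit = 1; bit < 32; bit++) {
        value >>= 1;
        if ((value & 1u) != last_bit) {
            changes++;
            last_bit = value & 1u;
        }
    }
    return changes;
}

// Hypothetical decision function built from the rules on N.
bool has_contiguous_one_bits(std::uint32_t value) {
    const unsigned n = count_bit_changes(value);
    if (n == 0) return true;                 // all ones, or all zeros (assumed acceptable)
    if (n == 1) return true;                 // all the 1s are on one side
    if (n == 2) return (value & 1u) == 0;    // 0..01..10..0, not 1..10..01..1
    return false;
}
```

With exactly two changes the first and last bits are always equal, so testing bit 0 alone is enough for the N=2 case.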
You can do this sequence of calculations (assuming val as an input):
uint32_t x = val;
x |= x >> 1;
x |= x >> 2;
x |= x >> 4;
x |= x >> 8;
x |= x >> 16;
to obtain a number with all zeros below the most significant 1 filled with ones.
You can also calculate y = val & -val to strip all except the least significant 1 bit in val (for example, 7 & -7 == 1 and 12 & -12 == 4).
Warning: this will fail for val == INT_MIN, so you'll have to handle this case separately, but this is immediate.
Then right-shift y by one position, to get a bit below the actual LSB of val, and do the same routine as for x:
uint32_t y = (val & -val) >> 1;
y |= y >> 1;
y |= y >> 2;
y |= y >> 4;
y |= y >> 8;
y |= y >> 16;
Then x - y or x & ~y or x ^ y produces the 'compact' bit mask spanning the whole length of val. Just compare it to val to see if val is 'compact'.
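Assembled into one function, the smearing approach might look like this sketch. The names smear_right and is_compact are mine, and I take the input as uint32_t, which also sidesteps the INT_MIN concern because the negation is done in unsigned arithmetic.

```cpp
#include <cstdint>

// Hypothetical helper: sets every bit below the most significant set bit.
static std::uint32_t smear_right(std::uint32_t x) {
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    return x;
}

bool is_compact(std::uint32_t val) {
    const std::uint32_t x = smear_right(val);                     // ones from the MSB of val down to bit 0
    const std::uint32_t y = smear_right((val & (~val + 1u)) >> 1); // ones strictly below the LSB of val
    return (x ^ y) == val;                                        // the span from MSB to LSB, all ones
}
```

For val = 12 (1100), x is 1111 and y is 0011, so x ^ y is 1100 and the test passes. For val = 5 (101), x is 111 and y is 0, so the comparison fails.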
We can make use of the gcc builtin instructions to check if:
The count of set bits
int __builtin_popcount (unsigned int x)
Returns the number of 1-bits in x.
is equal to (a - b):
a: One past the index of the highest set bit (32 - CLZ) (32 because there are 32 bits in an unsigned integer).
int __builtin_clz (unsigned int x)
Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined.
b: Index of the lowest set bit (CTZ):
int __builtin_ctz (unsigned int x)
Returns the number of trailing 0-bits in x, starting at the least significant bit position. If x is 0, the result is undefined.
For example if n = 0b0001100110; we will obtain 4 with popcount but the index difference (a - b) will return 6.
bool has_contiguous_one_bits(unsigned n) {
return (32 - __builtin_clz(n) - __builtin_ctz(n)) == __builtin_popcount(n);
}
which can also be written as:
bool has_contiguous_one_bits(unsigned n) {
return (__builtin_popcount(n) + __builtin_clz(n) + __builtin_ctz(n)) == 32;
}
I don't think it is more elegant or efficient than the current most upvoted answer:
return (x & x + (x & -x)) == 0;
with following assembly:
mov eax, edi
neg eax
and eax, edi
add eax, edi
test eax, edi
sete al
but it is probably easier to understand.
Okay, here is a version that loops over bits
template<typename Integer>
inline constexpr bool has_compact_bits(Integer val) noexcept
{
Integer test = 1;
while(!(test & val) && test) test<<=1; // skip unset bits to find first set bit
while( (test & val) && test) test<<=1; // skip set bits to find next unset bit
while(!(test & val) && test) test<<=1; // skip unset bits to find an offending set bit
return !test;
}
The first two loops found the first compact region. The final loop checks whether there is any other set bit beyond that region.
This question is asked in Programming Pearls, Question 2, and I am having trouble understanding its solution.
Here is the solution written in the book.
#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000
int a[1 + N/BITSPERWORD];
void set(int i) { a[i>>SHIFT] |= (1<<(i & MASK)); }
void clr(int i) { a[i>>SHIFT]&=~(1<<(i & MASK)); }
int test(int i) { return a[i>>SHIFT]&(1<<(i & MASK)); }
I have run this in my compiler and I have looked at another question that talks about this problem, but I still don't understand how this solution works.
Why does it do a[i>>SHIFT]? Why can't it just be a[i]=1;? Why does i need to be shifted right 5 times?
32 is 2^5, so a right-shift of 5 bits is equivalent to dividing by 32. So by doing a[i>>5], you are dividing i by 32 to figure out which element of the array contains bit i -- there are 32 bits per element.
Meanwhile & MASK is equivalent to mod 32, so 1<<(i & MASK) builds a 1-bit mask for the particular bit within the word.
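A quick way to convince yourself of the two equivalences is to compare them with ordinary division and remainder for non-negative i. The helper names below are made up for this sketch.

```cpp
// Hypothetical helpers mirroring SHIFT and MASK from the book's code.
constexpr int kShift = 5;     // log2(32)
constexpr int kMask  = 0x1F;  // 32 - 1, i.e. the low five bits

// Which array element holds bit i (same as i / 32 for non-negative i).
constexpr int word_index(int i) { return i >> kShift; }

// Which bit inside that element (same as i % 32 for non-negative i).
constexpr int bit_offset(int i) { return i & kMask; }
```

For i = 37 this gives element 1 and bit 5, because 37 = 1 * 32 + 5.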
Divide the 32 bits of int i (starting from bit 0 to bit 31) into two parts.
First part is the most significant bits 31 to 5. Use this part to find the index in the array of ints (called a[] here) that you are using to implement the bit array. Initially, the entire array of ints is zeroed out.
Since every int in a[] is 32 bits, it can keep track of 32 ints with those 32 bits. We divide every input i with 32 to find the int in a[] that is supposed to keep track of this i.
Every time a number is divided by 2, it is effectively right shifted once. To divide a number by 32, you simply right shift it 5 times. And that is exactly what we get by filtering out the first part.
Second part is the least significant bits 0 to 4. After a number has been binned into the correct index, use this part to set the specific bit of the zero stored in a[] at this index. Obviously, if some bit of the zero at this index has already been set, the value at that index will not be zero anymore.
How to get the first part? Right shifting i by 5 (i.e. i >> SHIFT).
How to get the second part? Do a bitwise AND of i with binary 11111. Binary 11111 = 0x1F, defined as MASK. So, i & MASK will give the integer value represented by the last 5 bits of i.
The last 5 bits tell you how many bits to go inside the number in a[]. For example, if i is 5, you want to set the bit in the index 0 of a[] and you specifically want to set the 5th bit of the int value a[0].
Index to set = 5 / 32 = (0101 >> 5) = 0000 = 0.
Bit to set = 5th bit inside a[0]
= a[0] & (1 << 5)
= a[0] & (1 << (00101 & 11111)).
Setting the bit for given i
Get the int to set by a[i >> 5]
Get the bit to set by pushing a 1 a total of i % 32 times to the left i.e. 1 << (i & 0x1F)
Simply set the bit as a[i >> 5] = a[i >> 5] | (1 << (i & 0x1F));
That can be shortened to a[i >> 5] |= (1 << (i & 0x1F));
Getting/Testing the bit for given i
Get the int where the desired bit lies by a[i >> 5]
Generate a number where all bits except for the i & 0x1F bit are 0. You can do that with 1 << (i & 0x1F); no inversion is needed here.
AND the number generated above with the value stored at this index in a[]. If the value is 0, this particular bit was 0. If the value is non-zero, this bit was 1.
In code, you would simply return (a[i >> 5] & (1 << (i & 0x1F))) != 0;
Clearing the bit for given i: It means setting the bit for that i to 0.
Get the int where the bit lies by a[i >> 5]
Get the bit by 1 << (i & 0x1F)
Invert all the bits of 1 << (i & 0x1F) so that the i's bit is 0.
AND the number at this index and the number generated in step 3. That will clear i's bit, leaving all other bits intact.
In code, this would be: a[i >> 5] &= ~(1 << (i & 0x1F));
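The three operations can be collected into one small sketch. It differs from the book's code in two ways that are my own choices: the words are uint32_t instead of int, so that shifting a 1 into bit 31 stays well defined, and test_bit returns a bool. All names are illustrative.

```cpp
#include <cstdint>

// Constants mirror BITSPERWORD, SHIFT, MASK and N from the book's code.
const int kBitsPerWord = 32;
const int kShift = 5;
const int kMask = 0x1F;
const int kN = 10000000;

// Zero-initialized because it has static storage duration.
static std::uint32_t words[1 + kN / kBitsPerWord];

void set_bit(int i)   { words[i >> kShift] |=  (std::uint32_t{1} << (i & kMask)); }
void clear_bit(int i) { words[i >> kShift] &= ~(std::uint32_t{1} << (i & kMask)); }
bool test_bit(int i)  { return (words[i >> kShift] & (std::uint32_t{1} << (i & kMask))) != 0; }
```

After set_bit(37), element 1 of the array holds the value 32 (bit 5 set), and clear_bit(37) brings it back to 0 without touching any other bit.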
I've got an interesting problem that has me looking for a more efficient way of doing things.
Let's say we have a value (in binary)
(VALUE) 10110001
(MASK) 00110010
----------------
(AND) 00110000
Now, I need to be able to XOR any bits from the (AND) value that are set in the (MASK) value (always lowest to highest bit):
(RESULT) AND1(0) xor AND4(1) xor AND5(1) = 0
Now, on paper, this is certainly quick since I can see which bits are set in the mask. It seems to me that programmatically I would need to keep right shifting the MASK until I found a set bit, XOR it with a separate value, and loop until the entire byte is complete.
Can anyone think of a faster way? I'm looking for the way to do this with the least number of operations and stored values.
If I understood this question correctly, what you want is to get every bit from VALUE that is set in the MASK, and compute the XOR of those bits.
First of all, note that XOR'ing a value with 0 will not change the result. So, to ignore some bits, we can treat them as zeros.
So, XORing the bits set in VALUE that are in MASK is equivalent to XORing the bits in VALUE&MASK.
Now note that the result is 0 if the number of set bits is even, 1 if it is odd.
That means we want to count the number of set bits. Some architectures/compilers have ways to quickly compute this value. For instance, on GCC this can be obtained with __builtin_popcount.
So on GCC, this can be computed with:
int set_bits = __builtin_popcount(value & mask);
return set_bits % 2;
If you want the code to be portable, then this won't do. However, a comment in this answer suggests that some compilers can inline std::bitset::count to efficiently obtain the same result.
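A portable variant along those lines, sketched for the 8-bit values in the question, could use std::bitset::count. The function name masked_parity is hypothetical, and whether the count compiles down to a single instruction depends on the compiler and target.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical helper: XOR of all bits of value that are selected by mask.
unsigned masked_parity(std::uint8_t value, std::uint8_t mask) {
    const std::bitset<8> selected(static_cast<unsigned long long>(value & mask));
    // An odd number of selected 1 bits XORs to 1, an even number to 0.
    return static_cast<unsigned>(selected.count() & 1u);
}
```

With the values from the question, masked_parity(0xB1, 0x32) looks at 00110000, which has two set bits, so the result is 0.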
If I'm understanding you right, you have
result = value & mask
and you want to XOR the 1 bits of mask & result together. The XOR of a series of bits is the same as counting the number of bits and checking if that count is even or odd. If it's odd, the XOR would be 1; if even, XOR would give 0.
count_bits(mask & result) % 2 != 0
mask & result can be simplified to simply result. You don't need to AND it with mask again. The % 2 != 0 can be alternately written as & 1.
count_bits(result) & 1
As far as how to count bits, the Bit Twiddling Hacks web page gives a number of bit counting algorithms.
Counting bits set, Brian Kernighan's way
unsigned int v; // count the number of bits set in v
unsigned int c; // c accumulates the total bits set in v
for (c = 0; v; c++)
{
v &= v - 1; // clear the least significant bit set
}
Brian Kernighan's method goes through as many iterations as there are
set bits. So if we have a 32-bit word with only the high bit set, then
it will only go once through the loop.
If you were to use that implementation, you could optimize it a bit further. If you think about it, you don't need the full count of bits. You only need to track their parity. Instead of counting bits you could just flip c each iteration.
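unsigned bit_parity(unsigned v) {
unsigned c;
for (c = 0; v; c ^= 1) {
v &= v - 1; // clear the least significant set bit
}
return c; // 1 if an odd number of bits was set, 0 otherwise
}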
(Thanks to Slava for the suggestion.)
Using that the XOR with 0 doesn't change anything, it's OK to apply the mask and then unconditionally XOR all bits together, which can be done in a parallel-prefix way. So something like this (not tested):
x = m & v;
x ^= x >> 16;
x ^= x >> 8;
x ^= x >> 4;
x ^= x >> 2;
x ^= x >> 1;
result = x & 1;
You can use more (or fewer) steps as needed, this is for 32 bits.
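Wrapped in a function for 32-bit inputs, the folding steps can be checked against a few values. The name masked_xor_fold is my own; this is a sketch of the same sequence, not a tuned implementation.

```cpp
#include <cstdint>

// Hypothetical wrapper: XOR of all bits of v that are selected by m.
unsigned masked_xor_fold(std::uint32_t v, std::uint32_t m) {
    std::uint32_t x = m & v;
    x ^= x >> 16;  // fold the upper half onto the lower half
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;   // bit 0 now holds the XOR of all 32 original bits
    return x & 1u;
}
```

Each step halves the number of bits that still matter, so five steps cover 32 bits; an 8-bit input would only need the last three.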
One significant issue to be aware of if using v &= v - 1 in the main body of your code is it will change the value of v to 0 in conducting the count. With other methods, the value is changed to the number of 1's. While count logic is generally wrapped as a function, where that is no longer a concern, if you are required to present your counting logic in the main body of your code, you must preserve a copy of v if that value is needed again.
In addition to the other two methods presented, the following is another favorite from bit-twiddling hacks that generally has a bit better performance than the loop method for larger numbers:
/* get the population 1's in the binary representation of a number */
unsigned getn1s (unsigned int v)
{
v = v - ((v >> 1) & 0x55555555);
v = (v & 0x33333333) + ((v >> 2) & 0x33333333);
v = (v + (v >> 4)) & 0x0F0F0F0F;
v = v + (v << 8);
v = v + (v << 16);
return v >> 24;
}
The problem is to tell whether two 8-bit chars are Gray codes (differ in only 1 bit) in C++.
I found an elegant C++ solution:
bool isGray(char a, char b) {
int m = a ^ b;
return m != 0 && (m & (m - 1) & 0xff) == 0;
}
I am confused about what the "& 0xff" does.
& 0xff extracts the 8 lowest bits from the resulting value, ignoring any higher ones.
It's wrong. The mistaken idea is that char is 8 bit.
It's also pointless. The presumed problem is that m can have more bits than char (true) so that "unnecessary" bits are masked off.
But m is sign-extended. That means the sign bit is copied to the higher bits. Now, when we're comparing x==0 we're checking whether all bits are zero, and with x & 0xff we're comparing if the lower 8 bits are zero. If the 8th bit of x is copied to all higher positions (by sign extension), then the two conditions are the same regardless of whether the copied bit was 0 or 1.
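One way to avoid reasoning about sign extension at all is to take the inputs as unsigned char, so that the promoted values are never negative. The following is a sketch under that assumption (and assuming 8-bit bytes), with a function name of my own; it is not the code from the question.

```cpp
// Hypothetical variant: unsigned char promotes to a non-negative int,
// so no sign bits are copied into the upper part of m.
bool is_gray(unsigned char a, unsigned char b) {
    const unsigned m = static_cast<unsigned>(a ^ b);  // bits where a and b differ
    // m & (m - 1) clears the lowest set bit; zero means exactly one bit was set.
    return m != 0 && (m & (m - 1u)) == 0;
}
```

A case worth testing with any variant is a pair that differs only in the top bit, such as 0x80 and 0x00.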
I need to implement a bitwise shift (logical, not arithmetic) on OpenInsight 8.
In the system mostly everything is a string but there are 4 functions that treat numbers as 32-bit integers. The bitwise functions available are AND, OR, NOT and XOR. Any arithmetic operators treat the number as signed.
I'm currently having a problem with implementing left and right shifts which I need to implement SHA-1.
Can anyone suggest an algorithm which can help me accomplish this? Pseudocode is good enough, I just need a general idea.
You can implement shifting with integer multiplication and division:
Shift left = *2
Shift right = /2
Perhaps you need to mask the number first to make the most significant bit zero to prevent integer overflow.
logical shift down by one bit using signed arithmetic and bitwise ops
if v < 0 then
v = v & 0x7fffffff // clear the top bit
v = v / 2 // shift the rest down
v = v + 0x40000000 // set the penultimate bit
else
v = v / 2
fi
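To check the pseudocode against a real logical shift, here is a C++ sketch that restricts itself to signed division, addition and bitwise AND, which is my reading of what the target system offers. The function name is hypothetical.

```cpp
#include <cstdint>

// Hypothetical name. Logical right shift by one bit using only signed
// arithmetic and AND, following the pseudocode above.
std::int32_t logical_shift_right_1(std::int32_t v) {
    if (v < 0) {
        v &= 0x7fffffff;   // clear the top bit so the value is non-negative
        v /= 2;            // shift the remaining 31 bits down
        v += 0x40000000;   // put the old top bit into bit 30
    } else {
        v /= 2;
    }
    return v;
}
```

For v = -1 (all bits set) the steps give 0x7fffffff, then 0x3fffffff, then 0x7fffffff, which matches shifting 0xffffffff right by one as an unsigned value.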
If there's no logical right shift you can easily achieve that by right shifting arithmetically n bits then clear the top n bits
For example: shift right 2 bits:
x >>= 2;
x &= 0x3fffffff;
Shift right n bits
x >>= n;
x &= ~(0xffffffff << (32 - n));
// or
x >>= n;
x &= (1 << (32 - n)) - 1;
For left shifting there's no logical/mathematical differentiation because they are all the same, just shift 0s in.