void Tools::Swap(uint32_t number){
int temp1 = (number >> 31) & 1;
int temp2 = number & 1;
int ans = number & 7ffffffe;
int mask = (temp2 << 31) | temp1;
ans = ans | mask;
cout << ans << endl;
}
I've worked it out on paper and it does seem to swap the first and last bits but I want to be sure it's the best way I can be doing this.
No. (temp2 << 31) causes undefined behaviour if temp2 is 1, and int is 32-bit or narrower.
However, if you replace all of the int by uint32_t and slap an 0x on the front of 7ffffffe, then it seems correct.
It might be better to let the compiler create temps as needed. With shift counts of 31, there's no need to use &.
uint32_t number;
// ...
number = (number<<31) | (number & 0x7ffffffe) | (number>>31);
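As a quick sanity check, the one-line version can be wrapped in a function and verified at compile time. This is only a sketch: the name swap_end_bits is made up for illustration, and it assumes the value really is a 32-bit unsigned type so that the shifts discard everything that falls off either end.

```cpp
#include <cstdint>

// Swap bit 31 and bit 0 of a 32-bit value; bits 1..30 are left untouched.
// (swap_end_bits is a hypothetical name, not part of the original code.)
constexpr std::uint32_t swap_end_bits(std::uint32_t number) {
    return (number << 31)            // old bit 0 moves up to bit 31
         | (number & 0x7ffffffeu)    // middle bits stay in place
         | (number >> 31);           // old bit 31 moves down to bit 0
}

static_assert(swap_end_bits(0x00000001u) == 0x80000000u, "low bit moves to the top");
static_assert(swap_end_bits(0x80000000u) == 0x00000001u, "top bit moves to the bottom");
static_assert(swap_end_bits(0x80000001u) == 0x80000001u, "equal end bits: no change");
static_assert(swap_end_bits(0x12345678u) == 0x12345678u, "both end bits clear: no change");
```

Because everything is unsigned, no shift here touches a sign bit, which is what made the original int version undefined.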
The question is to count the total number of set bits in two given numbers. For example, take 2 and 3 as input: 2 is 10 in binary and 3 is 11 in binary, so the total number of set bits is 3.
My work:
#include <iostream>
using namespace std;
int bit(int n1, int n2){
int count = 0;
while(n1 != 0 && n2 != 0){
if(n1 & 1 || n2 & 1) {
count++;
}
n1 >> 1;
n2 >> 1;
}
return count;
}
int main() {
int a;
cin >> a;
int b;
cin >> b;
cout << bit(a,b);
return 0;
}
Expected Output - 3
If anyone knows what I am doing wrong, please correct it and explain.
Why ask the question for 2 numbers if the intended combined result is just the sum of the separate results?
If you can use C++20, std::popcount gives you the number of set bits in one unsigned variable.
If you can't use C++20, there is std::bitset. Construct one from your number and use its method count.
So your code becomes:
int bit(int n1, int n2) {
#if defined(__cpp_lib_bitops)
return std::popcount(static_cast<unsigned int>(n1)) + std::popcount(static_cast<unsigned int>(n2));
#else
return std::bitset<sizeof(n1)*8>(n1).count() + std::bitset<sizeof(n2)*8>(n2).count();
#endif
}
I'm not quite sure what happens if you give negative integers as input. I would do a check beforehand or just work with unsigned types from the beginning.
What you are doing wrong was shown in a (now deleted) answer by Ankit Kumar:
if(n1&1 || n2&1){
count++;
}
If a bit is set in both n1 and n2, you are including it only once, not twice.
What you should do, other than using C++20's std::popcount, is write an algorithm (1) that calculates the number of bits set in one number and then call it twice.
Also note that you should use an unsigned type, to avoid the implementation-defined behavior of right-shifting a (possibly negative) signed value.
(1) I suppose that is the purpose of OP's exercise; everybody else should just use std::popcount.
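A minimal sketch of that "count one number, call it twice" idea, assuming the exercise allows a loop. The helper name count_set_bits is made up here, and the loop uses the common trick of clearing the lowest set bit on each pass rather than testing every bit position.

```cpp
// Count the set bits of one value by clearing the lowest set bit
// until nothing is left. (count_set_bits is a hypothetical helper name.)
constexpr int count_set_bits(unsigned int n) {
    int count = 0;
    while (n != 0) {
        n &= n - 1;   // clears the lowest set bit of n
        ++count;
    }
    return count;
}

// Total for two inputs: run the single-value routine twice and add.
constexpr int bit(int n1, int n2) {
    return count_set_bits(static_cast<unsigned int>(n1))
         + count_set_bits(static_cast<unsigned int>(n2));
}

static_assert(bit(2, 3) == 3, "10 and 11 have three set bits in total");
static_assert(bit(7, 7) == 6, "bits set in both inputs are counted twice");
static_assert(bit(0, 0) == 0, "no bits set");
```

Converting to unsigned before the loop means negative inputs are counted by their bit pattern instead of relying on how a signed right shift behaves.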
It is possible without a loop or an if, using only one variable.
First OR the values together. Note that this counts a bit set in both inputs only once, so for the total the question asks for (3 for inputs 2 and 3), run the steps below on each number separately and add the two results.
unsigned int value = n1 | n2; // get the combined bits
Then sum the bits in blocks of doubling width
value = (value & 0x55555555) + ((value >> 1) & 0x55555555);
value = (value & 0x33333333) + ((value >> 2) & 0x33333333);
value = (value & 0x0f0f0f0f) + ((value >> 4) & 0x0f0f0f0f);
value = (value & 0x00ff00ff) + ((value >> 8) & 0x00ff00ff);
value = (value & 0x0000ffff) + ((value >> 16) & 0x0000ffff);
This works for 32-bit values only; adjust for 64-bit accordingly.
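For the 64-bit adjustment, one possible sketch is below. The function name popcount64 is an assumption for illustration. Each step adds neighbouring fields, so the shift distance doubles every time (1, 2, 4, 8, 16, 32) and the masks widen to match.

```cpp
#include <cstdint>

// Parallel bit count for a 64-bit value: each line adds adjacent fields,
// doubling the field width from 1 bit up to 32 bits.
// (popcount64 is a hypothetical name.)
constexpr int popcount64(std::uint64_t v) {
    v = (v & 0x5555555555555555ull) + ((v >> 1)  & 0x5555555555555555ull);
    v = (v & 0x3333333333333333ull) + ((v >> 2)  & 0x3333333333333333ull);
    v = (v & 0x0f0f0f0f0f0f0f0full) + ((v >> 4)  & 0x0f0f0f0f0f0f0f0full);
    v = (v & 0x00ff00ff00ff00ffull) + ((v >> 8)  & 0x00ff00ff00ff00ffull);
    v = (v & 0x0000ffff0000ffffull) + ((v >> 16) & 0x0000ffff0000ffffull);
    v = (v & 0x00000000ffffffffull) + ((v >> 32) & 0x00000000ffffffffull);
    return static_cast<int>(v);
}

static_assert(popcount64(0) == 0, "no bits set");
static_assert(popcount64(2) + popcount64(3) == 3, "total for the question's inputs");
static_assert(popcount64(0xffffffffffffffffull) == 64, "all bits set");
```

Summing the counts of the two inputs, rather than counting their OR, keeps bits that are set in both numbers.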
How can I turn off the leftmost non-zero bit of a number in O(1)?
For example:
n = 366 (base 10) = 101101110 (base 2)
After turning the leftmost non-zero bit off, the number becomes 001101110.
n will always be >0
Well, if you insist on O(1) under any circumstances, the Intel intrinsic _bit_scan_reverse(), declared in immintrin.h, does a hardware search for the most significant non-zero bit of an int.
Though the operation is functionally equivalent to a loop, I believe it's constant time, given its fixed latency of 3 (as per the Intel Intrinsics Guide).
The function will return the index to the most-significant non-zero bit thus doing a simple:
n = n & ~(1 << _bit_scan_reverse(n));
should do.
This intrinsic is undefined for n == 0. So you gotta watch out there. I'm following the assumption of your original post where n > 0.
n = 2^x + y, where 0 <= y < 2^x.
x = floor(log2(n))
The highest set bit is bit x.
So in order to reset that bit,
number &= ~(1 << x);
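To make the arithmetic concrete, here is a small sketch with the question's value 366 = 2^8 + 110. The helper names floor_log2 and clear_highest_bit are made up, and the loop only stands in for the base-2 logarithm; a hardware bit scan would replace it where constant time matters.

```cpp
// Integer floor(log2(n)) by repeated halving. Precondition: n > 0.
// (floor_log2 and clear_highest_bit are hypothetical names.)
constexpr unsigned floor_log2(unsigned n) {
    unsigned x = 0;
    while (n >>= 1) {
        ++x;
    }
    return x;
}

// Clear bit x, the highest set bit, leaving y from n = 2^x + y.
constexpr unsigned clear_highest_bit(unsigned n) {
    return n & ~(1u << floor_log2(n));
}

static_assert(floor_log2(366u) == 8, "366 = 2^8 + 110");
static_assert(clear_highest_bit(366u) == 110u, "101101110 becomes 001101110");
static_assert(clear_highest_bit(1u) == 0u, "a lone bit clears to zero");
```

Using 1u instead of 1 keeps the shift unsigned, so x = 31 does not shift into a sign bit.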
Another approach:
int highestOneBit(int i) {
i |= (i >> 1);
i |= (i >> 2);
i |= (i >> 4);
i |= (i >> 8);
i |= (i >> 16);
return i - (i >> 1);
}
int main() {
int n = 32767;
int z = highestOneBit(n); // returns the highest set bit number i.e 2^x.
cout<< (n&(~z)); // Resets the highest set bit.
return 0;
}
Check out this question, for a possibly faster solution, using a processor instruction.
However, an O(log N) solution is:
int cmsb(int x)
{
unsigned int count = 0;
int tmp = x; // shift a copy; shifting x itself would leave it at 0 for the final line
while (tmp >>= 1) {
++count;
}
return x & ~(1 << count);
}
If ANDN is not supported and LZCNT is supported, the fastest O(1) way to do it is not something along the lines of n = n & ~(1 << _bit_scan_reverse(n)); but rather...
int reset_highest_set_bit(int x)
{
const int mask = 0x7FFFFFFF; // 011111111[...]
return x & (mask >> __builtin_clz(x));
}
I am trying to solve the following problem:
/*
* Return 1 if ptr1 and ptr2 are within the *same* 64-byte aligned
* block (or word) of memory. Return zero otherwise.
*
* Operators / and % and loops are NOT allowed.
*/
I have the following code:
int withinSameBlock(int * ptr1, int * ptr2) {
// TODO
int temp = (1 << 31) >> 25;
int a = ptr1;
int b = ptr2;
return (a & temp) == (b & temp);
}
I have been told that this correctly solves the problem, but I am unsure how it works. Specifically, how does the line int temp = (1 << 31) >> 25; help to solve the problem?
The line:
int temp = (1 << 31) >> 25;
is either incorrect or triggers undefined behavior (depending on word size). It just so happens that the undefined behavior on your machine with your compiler gives the correct answer. To avoid undefined behavior and make the code clearer, you should use:
int withinSameBlock(int * ptr1, int * ptr2) {
uintptr_t temp = ~(uintptr_t)63;
uintptr_t a = (uintptr_t)ptr1;
uintptr_t b = (uintptr_t)ptr2;
return (a & temp) == (b & temp);
}
I'm not sure where you got that code (homework?), but it is terrible.
1. Casting a pointer to int and doing arithmetic on it is generally very bad practice. The sizes of those primitive types are not fixed; for instance, the code breaks on every architecture where a pointer or an int is not 32 bits wide.
You should use uintptr_t, which is generally at least as large as a pointer (except on theoretical architectures permitted by an ambiguous spec).
For example:
#include <stdint.h>
#include <stdio.h>
int withinSameBlock(int * ptr1, int * ptr2) {
uintptr_t p1 = reinterpret_cast<uintptr_t>(ptr1);
uintptr_t p2 = reinterpret_cast<uintptr_t>(ptr2);
uintptr_t mask = ~ (uintptr_t)0x3F;
return (p1 & mask) == (p2 & mask);
}
int main() {
int* a = (int*) 0xdeadbeef;
int* b = (int*) 0xdeadbeee;
int* c = (int*) 0xdeadc0de;
printf ("%p, %p: %d\n", (void*) a, (void*) b, withinSameBlock(a, b));
printf ("%p, %p: %d\n", (void*) a, (void*) c, withinSameBlock(a, c));
return 0;
}
First, we need to be clear that the code will only work on systems where a pointer is 32 bits, and int is also 32 bits. On a 64-bit system, the code will fail miserably.
The left shift (1 << 31) sets the most significant bit of the int. In other words, the line
int temp = (1 << 31);
is the same as
int temp = 0x80000000;
Since an int is a signed number, the most significant bit is the sign bit. Shifting a signed number to the right copies the sign bit into the lower-order bits (an arithmetic shift, which is what most compilers do for negative values). So shifting to the right 25 times results in a value that has a 1 in the upper 26 bits. In other words, the line
int temp = (1 << 31) >> 25;
is the same as (and would be much clearer if it was written as)
int temp = 0xffffffc0;
The line
return (a & temp) == (b & temp);
compares the upper 26 bits of a and b, ignoring the lower 6 bits. If the upper bits match, then a and b point to the same block of memory.
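The comparison can be tried on plain integer addresses. The sketch below is not the original function: it takes uintptr_t values instead of pointers so it can be checked at compile time, and the names kBlockMask and same_block are made up for illustration.

```cpp
#include <cstdint>

// All bits set except the low 6, which select a byte inside a 64-byte block.
// (kBlockMask and same_block are hypothetical names.)
constexpr std::uintptr_t kBlockMask = ~static_cast<std::uintptr_t>(63);

// Two addresses share a block when they agree on every bit above the low 6.
constexpr bool same_block(std::uintptr_t a, std::uintptr_t b) {
    return (a & kBlockMask) == (b & kBlockMask);
}

static_assert(static_cast<std::uint32_t>(kBlockMask) == 0xffffffc0u,
              "low 32 bits match the mask from the question");
static_assert(same_block(0x1000, 0x103f), "first and last byte of one block");
static_assert(!same_block(0x103f, 0x1040), "adjacent bytes in different blocks");
static_assert(same_block(0xdeadbeef, 0xdeadbeee), "both fall in block 0xdeadbec0");
static_assert(!same_block(0xdeadbeef, 0xdeadc0de), "different blocks");
```

The 0x103f and 0x1040 case shows that being close together is not enough: the two addresses are one byte apart but sit on opposite sides of a 64-byte boundary.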
Assuming 32 bit pointers, if the two pointers are in the same 64-byte block of memory, then their addresses will vary only in the 6 least significant bits.
(1 << 31) >> 25 will give you a bitmask that looks like this:
11111111111111111111111111000000
a=ptr1 and b=ptr2 will set a and b equal to the values of the pointers, which are memory addresses. The bitwise AND of temp with each of these (i.e., a&temp and b&temp) will mask off the last 6 bits of the addresses held by a and b. If the remaining 26 bits are the same, then the original addresses lie in the same 64-byte aligned block. (Merely being within 64 bytes of each other is not enough, since two nearby addresses can straddle a block boundary.)
Demo code:
#include <stdio.h>
int main(void)
{
int temp = (1 << 31) >> 25;
printf("temp=%x\n",temp);
int p=5, q=6;
int *ptr1=&p, *ptr2=&q;
printf("ptr1=%p, ptr2=%p\n", (void *)ptr1, (void *)ptr2);
int a = ptr1;
int b = ptr2;
printf("a=%x, b=%x\n",a,b);
if ((a & temp) == (b & temp)) printf("true\n");
else printf("false\n");
}
Is there a clever (ie: branchless) way to "compact" a hex number. Basically move all the 0s all to one side?
eg:
0x10302040 -> 0x13240000
or
0x10302040 -> 0x00001324
I looked on Bit Twiddling Hacks but didn't see anything.
It's for an SSE numerical pivoting algorithm. I need to remove any pivots that become 0. I can use _mm_cmpgt_ps to find good pivots, _mm_movemask_ps to convert that into a mask, and then bit hacks to get something like the above. The hex value gets munged into a mask for a _mm_shuffle_ps instruction to perform a permutation on the SSE 128-bit register.
To compute mask for _pext:
mask = arg;
mask |= (mask << 1) & 0xAAAAAAAA | (mask >> 1) & 0x55555555;
mask |= (mask << 2) & 0xCCCCCCCC | (mask >> 2) & 0x33333333;
First do bit-or on pairs of bits, then on quads. Masks prevent shifted values from overflowing to other digits.
After computing mask this way or harold's way (which is probably faster) you don't need the full power of _pext, so if targeted hardware doesn't support it you can replace it with this:
for(int i = 0; i < 7; i++) {
stay_mask = mask & (~mask - 1);
arg = arg & stay_mask | (arg >> 4) & ~stay_mask;
mask = stay_mask | (mask >> 4);
}
Each iteration moves all nibbles one digit to the right if there is some space. stay_mask marks bits that are already in their final positions. This uses somewhat fewer operations than the Hacker's Delight solution, but might still benefit from branching.
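Putting the mask computation and the shifting loop together gives something like the sketch below. The function names nibble_mask and compact_right are made up, and the explicit parentheses only spell out the precedence that the snippets above already rely on.

```cpp
#include <cstdint>

// 0xF in every nibble of arg that is non-zero, 0x0 elsewhere.
// (nibble_mask and compact_right are hypothetical names.)
constexpr std::uint32_t nibble_mask(std::uint32_t arg) {
    std::uint32_t mask = arg;
    mask |= ((mask << 1) & 0xAAAAAAAAu) | ((mask >> 1) & 0x55555555u); // OR within bit pairs
    mask |= ((mask << 2) & 0xCCCCCCCCu) | ((mask >> 2) & 0x33333333u); // OR within nibbles
    return mask;
}

// Move all non-zero hex digits to the low end, keeping their order.
constexpr std::uint32_t compact_right(std::uint32_t arg) {
    std::uint32_t mask = nibble_mask(arg);
    for (int i = 0; i < 7; ++i) {
        // Run of set bits at the bottom of mask: digits already in place.
        const std::uint32_t stay_mask = mask & (~mask - 1);
        arg  = (arg & stay_mask) | ((arg >> 4) & ~stay_mask);
        mask = stay_mask | (mask >> 4);
    }
    return arg;
}

static_assert(nibble_mask(0x10302040u) == 0xF0F0F0F0u, "F for each non-zero digit");
static_assert(compact_right(0x10302040u) == 0x00001324u, "example from the question");
static_assert(compact_right(0x10030002u) == 0x00000132u, "runs of zero digits");
```

Seven iterations are enough because a digit can move at most seven places, as in 0x10000000 becoming 0x00000001.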
Supposing we can use _pext_u32, the issue then is computing a mask that has an F for every nibble that isn't zero. I'm not sure what the best approach is, but you can compute the OR of the 4 bits of the nibble and then "spread" it back out to F's like this:
// calculate horizontal OR of every nibble
x |= x >> 1;
x |= x >> 2;
// clean up junk
x &= 0x11111111;
// spread
x *= 0xF;
Then use that as the mask of _pext_u32.
_pext_u32 can be emulated by this (taken from Hacker's Delight, figure 7.6)
unsigned compress(unsigned x, unsigned m) {
unsigned mk, mp, mv, t;
int i;
x = x & m; // Clear irrelevant bits.
mk = ~m << 1; // We will count 0's to right.
for (i = 0; i < 5; i++) {
mp = mk ^ (mk << 1); // Parallel prefix.
mp = mp ^ (mp << 2);
mp = mp ^ (mp << 4);
mp = mp ^ (mp << 8);
mp = mp ^ (mp << 16);
mv = mp & m; // Bits to move.
m = m ^ mv | (mv >> (1 << i)); // Compress m.
t = x & mv;
x = x ^ t | (t >> (1 << i)); // Compress x.
mk = mk & ~mp;
}
return x;
}
But that's a bit of a disaster. It's probably better to just resort to branching code then.
uint32_t fun(uint32_t val) {
uint32_t retVal(0x00);
uint32_t sa(28);
for (int sb(28); sb >= 0; sb -= 4) {
if (val & (0x0Fu << sb)) {
retVal |= ((val >> sb) & 0x0Fu) << sa; // copy the non-zero digit to the next free slot
sa -= 4;
}
}
return retVal;
}
I think this (or something similar) is what you're looking for: eliminating the 0 nibbles within a number. I've not debugged it, and it only packs toward one side at the moment.
If your processor supports conditional instruction execution, you may get a benefit from this algorithm:
uint32_t compact(uint32_t orig_value)
{
uint32_t mask = 0xF0000000u; // Mask for isolating a hex digit.
uint32_t new_value = 0u;
for (unsigned int i = 0; i < 8; ++i) // 8 hex digits
{
if ((orig_value & mask) == 0u) // parentheses are required: == binds tighter than &
{
orig_value = orig_value << 4; // Shift the original value by 1 digit
}
else
{
new_value |= orig_value & mask; // keep the non-zero digit
mask = mask >> 4; // next digit
}
}
return new_value;
}
This looks like a good candidate for loop unrolling.
The algorithm assumes that when the original value is shifted left, zeros are shifted in, filling in the "empty" bits.
Edit 1:
On a processor that supports conditional execution of instructions, the shifting of the original value would be conditionally executed depending on the result of the ANDing of the original value and the mask. Thus no branching, only ignored instructions.
I came up with the following solution. Please take a look, maybe it will help you.
#include <iostream>
#include <sstream>
#include <algorithm>
using namespace std;
class IsZero
{
public:
bool operator ()(char c)
{
return '0' == c;
}
};
int main()
{
int a = 0x01020334; // INPUT
ostringstream my_sstream;
my_sstream << hex << a;
string str = my_sstream.str();
int base_str_length = str.size();
cout << "Input hex: " << str << endl;
str.erase(remove_if(begin(str), end(str), IsZero()), end(str)); // drop every '0'
str.append(base_str_length - str.size(), '0'); // pad back to the original length
cout << "Processed hex: " << str << endl;
return 0;
}
Output:
Input hex: 1020334
Processed hex: 1233400
Let's say that I have an array of four 32-bit integers that I use to store a 128-bit number.
How can I perform left and right shift on this 128-bit number?
Thanks!
Working with uint128? If you can, use the x86 SSE instructions, which were designed for exactly that. (Then, when you've bitshifted your value, you're ready to do other 128-bit operations...)
SSE2 bit shifts take ~4 instructions on average, with one branch (a case statement). No issues with shifting more than 32 bits, either. The full code for doing this, using gcc intrinsics rather than raw assembler, is in sseutil.c (github: "Unusual uses of SSE2"); it's a bit bigger than makes sense to paste here.
The hurdle for many people in using SSE2 is that shift ops take immediate (constant) shift counts. You can solve that with a bit of C preprocessor twiddling (wordpress: C preprocessor tricks). After that, you have op sequences like:
LeftShift(uint128 x, int n) = _mm_slli_epi64(_mm_slli_si128(x, n/8), n%8)
for n = 65..71, 73..79, … 121..127
... doing the whole shift in two instructions.
void shiftl128 (
unsigned int& a,
unsigned int& b,
unsigned int& c,
unsigned int& d,
size_t k)
{
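assert (k <= 128);
if (k == 0) return; // nothing to shift; also keeps (32-k) below from becoming an undefined shift by 32
if (k >= 32) // shifting a 32-bit integer by more than 31 bits is "undefined"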
{
a=b;
b=c;
c=d;
d=0;
shiftl128(a,b,c,d,k-32);
}
else
{
a = (a << k) | (b >> (32-k));
b = (b << k) | (c >> (32-k));
c = (c << k) | (d >> (32-k));
d = (d << k);
}
}
void shiftr128 (
unsigned int& a,
unsigned int& b,
unsigned int& c,
unsigned int& d,
size_t k)
{
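assert (k <= 128);
if (k == 0) return; // nothing to shift; also keeps (32-k) below from becoming an undefined shift by 32
if (k >= 32) // shifting a 32-bit integer by more than 31 bits is "undefined"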
{
d=c;
c=b;
b=a;
a=0;
shiftr128(a,b,c,d,k-32);
}
else
{
d = (c << (32-k)) | (d >> k);
c = (b << (32-k)) | (c >> k);
b = (a << (32-k)) | (b >> k);
a = (a >> k);
}
}
Instead of using a 128 bit number why not use a bitset? Using a bitset, you can adjust how big you want it to be. Plus you can perform quite a few operations on it.
You can find more information on these here:
http://www.cppreference.com/wiki/utility/bitset/start?do=backlink
First, if you're shifting by n bits and n is greater than or equal to 32, divide by 32 and shift whole integers. This should be trivial. Now you're left with a remaining shift count from 0 to 31. If it's zero, return early, you're done.
For each integer you'll need to shift by the remaining n, then shift the adjacent integer by the same amount and combine the valid bits from each.
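The two steps described above (move whole words first, then combine the leftover bits from neighbouring words) might look like the sketch below. Everything named here is an assumption for illustration: the U128 struct, the function names, and the choice that w[0] is the least significant word.

```cpp
#include <cstdint>

// Four 32-bit words; w[0] is assumed to be the least significant.
// (U128, shift_left and shift_right are hypothetical names.)
struct U128 { std::uint32_t w[4]; };

constexpr U128 shift_left(U128 v, unsigned n) {
    const unsigned words = n / 32;   // whole-word part of the shift
    const unsigned bits  = n % 32;   // remaining 0..31 bits
    U128 r = {{0, 0, 0, 0}};
    for (unsigned i = words; i < 4; ++i) {
        r.w[i] = v.w[i - words] << bits;
        if (bits != 0 && i > words) {            // carry in from the lower neighbour
            r.w[i] |= v.w[i - words - 1] >> (32 - bits);
        }
    }
    return r;
}

constexpr U128 shift_right(U128 v, unsigned n) {
    const unsigned words = n / 32;
    const unsigned bits  = n % 32;
    U128 r = {{0, 0, 0, 0}};
    for (unsigned i = 0; i + words < 4; ++i) {
        r.w[i] = v.w[i + words] >> bits;
        if (bits != 0 && i + words + 1 < 4) {    // carry in from the upper neighbour
            r.w[i] |= v.w[i + words + 1] << (32 - bits);
        }
    }
    return r;
}

static_assert(shift_left(U128{{0x80000001u, 0, 0, 0}}, 1).w[1] == 1u, "top bit carries up");
static_assert(shift_left(U128{{0xdeadbeefu, 0, 0, 0}}, 36).w[1] == 0xeadbeef0u, "word plus bits");
static_assert(shift_right(U128{{0, 0xeadbeef0u, 0xdu, 0}}, 36).w[0] == 0xdeadbeefu, "round trip");
```

The bits != 0 test skips the neighbour term when the remainder is zero, which avoids an undefined shift by 32. A shift count of 128 or more simply leaves the result at zero because neither loop body runs.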
Since you mentioned you're storing your 128-bit value in an array of 4 integers, you could do the following:
void left_shift(unsigned int* array)
{
for (int i=3; i >= 0; i--)
{
array[i] = array[i] << 1;
if (i > 0)
{
unsigned int top_bit = (array[i-1] >> 31) & 0x1;
array[i] = array[i] | top_bit;
}
}
}
void right_shift(unsigned int* array)
{
for (int i=0; i < 4; i++)
{
array[i] = array[i] >> 1;
if (i < 3)
{
unsigned int bottom_bit = (array[i+1] & 0x1) << 31;
array[i] = array[i] | bottom_bit;
}
}
}