What is the actual effect of variable BitMask in the function CeilLog2? - c++

Here is the definition of the function:
inline uint32_t CountLeadingZeros(uint32_t Val)
{
// Use BSR to return the log2 of the integer
unsigned long Log2;
if (_BitScanReverse(&Log2, Val) != 0)
{
return 31 - Log2;
}
return 32;
}
inline uint32_t CeilLog2(uint32_t Val)
{
int BitMask = ((int)(CountLeadingZeros(Val) << 26)) >> 31;
return (32 - CountLeadingZeros(Val - 1)) & (~BitMask);
}
Here is my hypothesis:
The range of the return value of the function CountLeadingZeros is [0, 32]. When the input Val is equal to 0, CountLeadingZeros(Val) << 26 should be 1000,0000,....,0000,0000.
Since the left-hand side of operator >> is a signed number, the result of >> 31 would be 1111,1111,....,1111,1111. When Val is not equal to 0, BitMask would always be 0000,0000,....,0000,0000.
So I guess the purpose of the variable BitMask is to make the function return 0 when the input Val is zero.
But the problem is that when I pass -1 to this function, it is converted to 4294967295, and the output becomes 32.
Is my hypothesis right?
I have seen this implementation many times in the RayTracing renderer on the github.
What is actual effect of BitMask here? Confused :(

Since the left-hand side of operator >> is a signed number, the result of >> 31 would be 1111,1111,....,1111,1111. When Val is not equal to 0, BitMask would always be 0000,0000,....,0000,0000.
Your analysis is correct: BitMask is all ones when Val is zero; otherwise it is all zeros. You can eliminate BitMask with a simple conditional:
return Val ? (32 - CountLeadingZeros(Val - 1)) : 0;
This does not create new branching, because the conditional replaces the if of CountLeadingZeros.
But the problem is that when I pass -1 to this function, it is converted to 4294967295, and the output becomes 32.
The function takes an unsigned number, so -1 is converted to 0xFFFFFFFF on the way in (conversion to an unsigned type is well-defined, modulo 2^32). In this case the return value should be 32, which is the correct log2 ceiling for this value.
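To make the role of BitMask concrete, here is a minimal sketch that compares the masked form with the conditional form. The loop-based count_leading_zeros_portable is a hypothetical stand-in for the _BitScanReverse version, and the mask is built from unsigned values so that it does not depend on how signed right shift behaves. Both are assumptions of this sketch, not the original code.

```cpp
#include <cstdint>

// Hypothetical portable stand-in for CountLeadingZeros; returns 32 for 0.
inline uint32_t count_leading_zeros_portable(uint32_t val) {
    uint32_t n = 0;
    for (uint32_t bit = 0x80000000u; bit != 0 && (val & bit) == 0; bit >>= 1) {
        ++n;
    }
    return n;
}

// Masked form: the count is 32 (bit 5 set) only when val == 0.
// Shifting left by 26 moves bit 5 into bit 31, which then selects the mask.
inline uint32_t ceil_log2_masked(uint32_t val) {
    const uint32_t top = (count_leading_zeros_portable(val) << 26) >> 31; // 1 iff val == 0
    const uint32_t mask = 0u - top;                                       // all ones iff val == 0
    return (32 - count_leading_zeros_portable(val - 1)) & ~mask;
}

// Conditional form suggested in the answer.
inline uint32_t ceil_log2_conditional(uint32_t val) {
    return val ? (32 - count_leading_zeros_portable(val - 1)) : 0;
}
```

Without the mask, an input of 0 would wrap to 0xFFFFFFFF in Val - 1 and produce 32; the mask forces that one case to 0 and leaves every other input untouched.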

Related

The largest n-bit integer

I thought that computing the largest n-bit integer would be trivial by using bit-shifts. Specifically, my idea was to set all of the bits to 1, and then shift them to the right:
template <typename T = uint16_t>
auto largest(uint8_t n){
constexpr auto bits = 8*sizeof(T);
assert(n <= bits);
return static_cast<T>(-1) >> (bits - n);
}
Generally, this idea seems to work. If I print out the result for 0, 1, ..., I get 0, 1, 3,..., 65535 (as expected).
However, this is where things get strange...
If instead of a uint16_t, I use a uint32_t or uint64_t then I find that
largest<uint16_t>(1) = 1
largest<uint32_t>(1) = 1
largest<uint64_t>(1) = 1
which is saying, "The largest 1-bit integer is 1" (as expected). However...
largest<uint16_t>(0) = 0
largest<uint32_t>(0) = 4294967295
largest<uint64_t>(0) = 18446744073709551615
So the value of 0 seems to be an edge case if I use uint32_t or uint64_t to hold the integer type.
To diagnose further, I hard-coded those edge cases so that the compiler can better see it:
static_cast<uint16_t>(-1) >> 16;
static_cast<uint32_t>(-1) >> 32;
static_cast<uint64_t>(-1) >> 64;
and now for the 32 and 64-bit cases, both GCC and Clang (HEAD versions) emit a warning
prog.cc:22:31: warning: right shift count >= width of type [-Wshift-count-overflow]
22 | static_cast<uint64_t>(-1) >> 64;
I couldn't find any documentation about why this isn't allowed, and why this only happens for the 32 and 64 bit case. I understand why it might complain about the count > width, but the count == width case seems valid to me.
Does anybody have some insight as to what is going on?
Also, I would like to hear suggestions for how to compute the largest n-bit integer without having to put in a branch (obviously I could handle the case of n==0 specially).
Here is a code link so that you don't have to retype everything: https://wandbox.org/permlink/3oqxqQR9ypP5q7yw
You can use a lookup table to avoid branching.
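As for why this happens: in C++ a shift by a count greater than or equal to the width of the (promoted) left operand is undefined behavior, so count == width is not a valid case. The uint16_t version only appears to work because the operand is promoted to int before shifting (assuming a 32-bit int), which makes a shift by 16 legal. Besides a lookup table, another branch-free option is to split the shift into two smaller shifts. The following is a sketch under that assumption; largest_split_shift is a hypothetical name, and n <= bits is still a precondition.

```cpp
#include <climits>
#include <cstdint>

template <typename T = uint16_t>
constexpr T largest_split_shift(unsigned n) {
    constexpr unsigned bits = CHAR_BIT * sizeof(T);
    const unsigned s = bits - n;     // total shift, in the range 0..bits
    const unsigned half = s / 2;     // each partial shift is strictly less than bits
    T all = static_cast<T>(~T{0});   // all ones
    all = static_cast<T>(all >> half);
    return static_cast<T>(all >> (s - half));
}
```

With this version, n = 0 yields 0 for uint16_t, uint32_t and uint64_t, because no single shift ever reaches the full width of the type.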

what is the parameter bool in APInt(unsigned numBits, uint64_t val, bool isSigned = false) used for?

I am curious about the third parameter, the boolean isSigned in APInt(unsigned numBits, uint64_t val, bool isSigned = false) from LLVM's llvm/ADT/APInt.h header.
No matter what I set it to, the result of functions like getActiveBits() or getMinSignedBits() does not change at all.
Furthermore, if I want to get a signed/unsigned value, I use getSExtValue() or getZExtValue().
The value of isSigned does not matter to them either.
So when will isSigned matter?
TL;DR: isSigned is only important for numBits > 64.
val itself is only 64 bits wide. Imagine you want to store a signed value in an integer wider than 64 bits.
If the value is negative, all high bits must be set to 1, since negative values are typically stored as two's complement.
If the value is positive, all high bits must be set to 0.
Example
Let's assume you want to store -1 in 128 bits.
The binary representation of uint64_t val holding -1 is
1111111111111111111111111111111111111111111111111111111111111111
These are only 64 ones, so there are 64 bits left to be filled. Without the isSigned value, it would be impossible to know whether these bits should be ones, resulting in -1, or zeros, resulting in 18446744073709551615.
Long explanation
A look at the source code reveals that isSigned is only used under certain circumstances:
APInt(unsigned numBits, uint64_t val, bool isSigned = false)
: BitWidth(numBits) {
assert(BitWidth && "bitwidth too small");
if (isSingleWord()) {
U.VAL = val;
clearUnusedBits();
} else {
initSlowCase(val, isSigned);
}
}
According to its function header, isSingleWord
returns true if the number of bits <= 64, false otherwise.
Therefore the line
if (isSingleWord()) {
checks whether the value fits in a single 64-bit word, i.e. whether the storage is no larger than val itself.
If numBits is larger than 64, APInt::initSlowCase gets called:
void APInt::initSlowCase(uint64_t val, bool isSigned) {
U.pVal = getClearedMemory(getNumWords());
U.pVal[0] = val;
if (isSigned && int64_t(val) < 0)
for (unsigned i = 1; i < getNumWords(); ++i)
U.pVal[i] = WORDTYPE_MAX;
clearUnusedBits();
}
This function copies the value from the val variable into the lowest word and fills the remaining bits up to numBits.
That is necessary because negative values are stored as two's complement. If isSigned is set and val is a negative value, all bits of the high words are set to ones.
According to the docs, it tells whether val is signed or unsigned:
If isSigned is true then val is treated as if it were a signed value (i.e. as an int64_t) and the appropriate sign extension to the bit width will be done. Otherwise, no sign extension occurs (high order bits beyond the range of val are zero filled).
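The effect can be illustrated without LLVM by modelling a 128-bit value as two 64-bit words. This is only a sketch of the idea behind initSlowCase; widen_to_128 is a hypothetical helper, not part of the APInt API.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of a 128-bit value: words[0] is the low word, words[1] the high word.
inline std::array<uint64_t, 2> widen_to_128(uint64_t val, bool isSigned) {
    std::array<uint64_t, 2> words{val, 0};
    // Sign-extend only when the caller says val is signed and its top bit is set.
    if (isSigned && static_cast<int64_t>(val) < 0) {
        words[1] = UINT64_MAX;
    }
    return words;
}
```

Passing the 64-bit pattern of -1 with isSigned = true gives a high word of all ones (the 128-bit value -1), while isSigned = false leaves the high word at zero (the value 18446744073709551615).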

Signed extension from 24 bit to 32 bit in C++

I have 3 unsigned bytes that are coming over the wire separately.
[byte1, byte2, byte3]
I need to convert these to a signed 32-bit value but I am not quite sure how to handle the sign of the negative values.
I thought of copying the bytes to the upper 3 bytes in the int32 and then shifting everything to the right but I read this may have unexpected behavior.
Is there an easier way to handle this?
The representation is using two's complement.
You could use:
uint32_t sign_extend_24_32(uint32_t x) {
const int bits = 24;
uint32_t m = 1u << (bits - 1);
return (x ^ m) - m;
}
This works because:
if the old sign was 1, then the XOR makes it zero and the subtraction will set it and borrow through all higher bits, setting them as well.
if the old sign was 0, the XOR will set it, the subtract resets it again and doesn't borrow so the upper bits stay 0.
Templated version
template<class T>
T sign_extend(T x, const int bits) {
T m = 1;
m <<= bits - 1;
return (x ^ m) - m;
}
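Applied to the original question, the XOR/subtract trick could be combined with assembling the three bytes like this. It is a sketch: from_24bit and the byte order of its parameters (hi, mid, lo) are assumptions, since the question does not say which byte arrives first.

```cpp
#include <cstdint>

// Hypothetical helper: hi is the most significant byte of the 24-bit value.
inline int32_t from_24bit(uint8_t hi, uint8_t mid, uint8_t lo) {
    const uint32_t x = (uint32_t{hi} << 16) | (uint32_t{mid} << 8) | lo;
    const uint32_t m = 1u << 23;   // position of the 24-bit sign bit
    // The unsigned arithmetic wraps modulo 2^32; the final conversion to int32_t
    // is modular since C++20 and behaves the same on two's complement compilers before that.
    return static_cast<int32_t>((x ^ m) - m);
}
```

For example, the bytes FF FF FF produce -1 and 80 00 00 produce -8388608, the smallest 24-bit value.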
Assuming both representations are two's complement, simply
upper_byte = (Signed_byte(incoming_msb) >= 0? 0 : Byte(-1));
where
using Signed_byte = signed char;
using Byte = unsigned char;
and upper_byte is a variable representing the missing fourth byte.
The conversion to Signed_byte is formally implementation-dependent, but a two's complement implementation doesn't have a choice, really.
You could let the compiler handle the sign extension itself. Assuming that the least significant byte is byte1 and the most significant byte is byte3:
int val = (signed char) byte3; // the conversion to int sign-extends
val <<= 16; // shift the byte to its final place
val |= ((int) (unsigned char) byte2) << 8; // place the second byte
val |= (int) (unsigned char) byte1; // and the least significant one
I have used C style cast here when static_cast would have been more C++ish, but as an old dinosaur (and Java programmer) I find C style cast more readable for integer conversions.
This is a pretty old question, but I recently had to do the same (while dealing with 24-bit audio samples), and wrote my own solution for it. It's using a similar principle as this answer, but more generic, and potentially generates better code after compiling.
template <size_t Bits, typename T>
inline constexpr T sign_extend(const T& v) noexcept {
static_assert(std::is_integral<T>::value, "T is not integral");
static_assert((sizeof(T) * 8u) >= Bits, "T is smaller than the specified width");
if constexpr ((sizeof(T) * 8u) == Bits) return v;
else {
using S = struct { signed Val : Bits; };
return reinterpret_cast<const S*>(&v)->Val;
}
}
This has no hard-coded math, it simply lets the compiler do the work and figure out the best way to sign-extend the number. With certain widths, this can even generate a native sign-extension instruction in the assembly, such as MOVSX on x86.
This function assumes you copied your N-bit number into the lower N bits of the type you want to extend it to. So for example:
int16_t a = -42;
int32_t b{};
memcpy(&b, &a, sizeof(a));
b = sign_extend<16>(b);
Of course it works for any number of bits, extending it to the full width of the type that contained the data.
Here's a method that works for any bit count, even if it's not a multiple of 8. This assumes you've already assembled the 3 bytes into an integer value.
const int bits = 24;
int mask = (1 << bits) - 1;
bool is_negative = (value & ~(mask >> 1)) != 0;
value |= -is_negative & ~mask;
You can use a bitfield
template<size_t L>
inline int32_t sign_extend_to_32(const char *x)
{
struct {int32_t i: L;} s;
memcpy(&s, x, 3);
return s.i;
// or
return s.i = (x[2] << 16) | (x[1] << 8) | x[0]; // assume little endian
}
Easy and no undefined behavior invoked
int32_t r = sign_extend_to_32<24>(your_3byte_array);
Of course, copying the bytes to the upper 3 bytes of the int32 and then shifting everything to the right, as you thought, is also a good idea. There's no undefined behavior if you use memcpy like above. An alternative is reinterpret_cast in C++ and union in C, which can avoid the use of memcpy. However, the result is implementation-defined, because right shift of a negative value is not guaranteed to be a sign-extending shift (although almost all modern compilers do that).
Assuming your 24bit value is stored in variable int32_t val, you can easily extend the sign by following:
val = (val << 8) >> 8;

Unions in c++ and bitwise operations

I have seen the following structure in a source code.
template<unsigned bitno, unsigned nbits = 1, typename T = u8>
struct RegBit
{
T data;
enum { mask = (1u << nbits) - 1u };
template<typename T2>
RegBit& operator=(T2 val)
{
data = (data & ~(mask << bitno)) | ((nbits > 1 ? val & mask : !!val) << bitno);
return *this;
}
operator unsigned() const { return (data >> bitno) & mask; }
};
union {
u8 raw;
RegBit<0> r0;
RegBit<1> r1;
RegBit<2> r2;
RegBit<3> r3;
RegBit<6> r6;
RegBit<7> r7;
} P;
After a first reading, I found out that an unsigned cast of an object whose type is RegBit will return the bit number bitno + 1 of data.
However, I don't understand how the = overloaded operator is handled. I mean that I understand the syntax, but not what the bitwise operation is meant to do.
And one last thing: if you run the code and assign a value to P.raw, you'll notice that ∀ i ∈ [0;7], P.ri.data = P.raw.
How is that possible?
Of course, the code then does what it's supposed to do, i.e. ∀ i ∈ [0;7], (unsigned)P.ri is the (i+1)th bit of P.raw.
How does the operator= work ?
When you write P.r2 = 1; , the assignment operator for the r2 member gets invoked. So it would have the effect of P.r2.operator= (1); which returns a reference to P.r2.
Let's analyze the assignment details in the specialized template with bitno=2, nbits=1 and T being u8:
mask = (1u << nbits) - 1u
= (1 shifted by 1 bits, aka binary 10) - 1
= binary 1 (i.e. it's a binary number with the n lowest bits set)
Let's analyse the full expression step by step. First the left part:
mask << bitno ===> binary 100
~(mask << bitno) ===> binary 1..1011 (aka the bit bitno is set to 0, counting from least significant bit)
(data & ~(mask << bitno)) ===> the bit bitno is set to 0 in data (thanks to the bitwise &)
Now the right part of the expression:
(nbits > 1 ? val & mask : !!val) is a conditional operator:
if nbits >1 is true, then it's val&mask, aka the n lowest bits of val
if not, then it's !!val, aka "not not val", which evaluates to 0 if val is 0 and to 1 if val is not 0.
In our case, it's the second alternative so 0 or 1 depending on val.
((nbits > 1 ? val & mask : !!val) << bitno) then shifts the 0 or the 1 by 2 bits.
Now finally combining all this:
data = (data & ~(mask << bitno)) | ((nbits > 1 ? val & mask : !!val) << bitno);
= (data with the bit bitno set to 0) ored with (val expressed on one bit in the bit bitno, counting from the least significant )
Otherwise stated, as a bit value 0 ored with a bit value x gives as results a bit value x, this expression sets the bit bitno to val (val being handled as a bool).
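The same steps can be written as a standalone function to try them out. This is a sketch: set_field is a hypothetical free-function version of RegBit::operator=, with bitno and nbits passed as ordinary arguments instead of template parameters.

```cpp
#include <cstdint>

// Hypothetical equivalent of RegBit<bitno, nbits, u8>::operator=.
constexpr uint8_t set_field(uint8_t data, unsigned bitno, unsigned nbits, unsigned val) {
    const unsigned mask = (1u << nbits) - 1u;                  // nbits low bits set
    const unsigned cleared = data & ~(mask << bitno);          // field zeroed in data
    const unsigned field = (nbits > 1) ? (val & mask)          // multi-bit: keep low bits of val
                                       : (val != 0 ? 1u : 0u); // single bit: same as !!val
    return static_cast<uint8_t>(cleared | (field << bitno));
}
```

With data = 0, bitno = 2 and nbits = 1, any non-zero val produces binary 100 (4), and writing 0 into bit 2 of 0xFF produces 0xFB.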
But what is the union supposed to do ?
The union handles all its members (which are all of the same type u8) in the same memory location.
So what would be according to you the expected output of the following:
P.raw=0;
P.r2=1;
P.r3=0;
P.r4=1;
cout << (int)P.raw <<endl;
The optimist who wrote your code snippet certainly expects a result of 20 (aka binary 10100). That may work like this on many compilers. But in reality this is ABSOLUTELY NOT GUARANTEED according to the standard:
9.5/1: In a union, at most one of the non-static data members can be active at any time, that is, the value of at most one of the
non-static data members can be stored in a union at any time.
Otherwise stated, if you store something in r2 you're not sure that you will find back the same value in r4. The only thing that is sure is that if you store something in r2 and do not store anything else in the other members, you'll find back in r2 what you've stored there.
Alternatives to the union
If you need to ensure portability, you could consider using either std::bitset or standard bitfields.
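As a rough illustration of the std::bitset route, the sequence from the example above could look like this. It is a sketch only; demo_bitset is a hypothetical name and the bit numbers mirror r2, r3 and r4.

```cpp
#include <bitset>

// Sets bits 2 and 4, clears bit 3, and returns the resulting byte value.
inline unsigned long demo_bitset() {
    std::bitset<8> p;      // all bits start at 0, like P.raw = 0
    p.set(2);              // like P.r2 = 1
    p.set(3, false);       // like P.r3 = 0
    p.set(4);              // like P.r4 = 1
    return p.to_ulong();   // binary 10100
}
```

Here all bits live in one object, so the result of 20 does not depend on which union member was written last.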

Checking whether a number is positive or negative using bitwise operators

I can check whether a number is odd/even using bitwise operators. Can I check whether a number is positive/zero/negative without using any conditional statements/operators like if/ternary etc.
Can the same be done using bitwise operators and some trick in C or in C++?
Can I check whether a number is positive/zero/negative without using any conditional statements/operators like if/ternary etc.
Of course:
bool is_positive = number > 0;
bool is_negative = number < 0;
bool is_zero = number == 0;
If the high bit is set on a signed integer (byte, long, etc., but not a floating point number), that number is negative.
int x = -2300; // assuming a 32-bit int
if ((x & 0x80000000) != 0)
{
// number is negative
}
ADDED:
You said that you don't want to use any conditionals. I suppose you could do this:
int isNegative = (x & 0x80000000);
And at some later time you can test it with if (isNegative).
Or, you could use signbit() and the work's done for you.
I'm assuming that under the hood, the math.h implementation is an efficient bitwise check (possibly solving your original goal).
Reference: http://en.cppreference.com/w/cpp/numeric/math/signbit
There is a detailed discussion on the Bit Twiddling Hacks page.
int v; // we want to find the sign of v
int sign; // the result goes here
// CHAR_BIT is the number of bits per byte (normally 8).
sign = -(v < 0); // if v < 0 then -1, else 0.
// or, to avoid branching on CPUs with flag registers (IA32):
sign = -(int)((unsigned int)((int)v) >> (sizeof(int) * CHAR_BIT - 1));
// or, for one less instruction (but not portable):
sign = v >> (sizeof(int) * CHAR_BIT - 1);
// The last expression above evaluates to sign = v >> 31 for 32-bit integers.
// This is one operation faster than the obvious way, sign = -(v < 0). This
// trick works because when signed integers are shifted right, the value of the
// far left bit is copied to the other bits. The far left bit is 1 when the value
// is negative and 0 otherwise; all 1 bits gives -1. Unfortunately, this behavior
// is architecture-specific.
// Alternatively, if you prefer the result be either -1 or +1, then use:
sign = +1 | (v >> (sizeof(int) * CHAR_BIT - 1)); // if v < 0 then -1, else +1
// On the other hand, if you prefer the result be either -1, 0, or +1, then use:
sign = (v != 0) | -(int)((unsigned int)((int)v) >> (sizeof(int) * CHAR_BIT - 1));
// Or, for more speed but less portability:
sign = (v != 0) | (v >> (sizeof(int) * CHAR_BIT - 1)); // -1, 0, or +1
// Or, for portability, brevity, and (perhaps) speed:
sign = (v > 0) - (v < 0); // -1, 0, or +1
// If instead you want to know if something is non-negative, resulting in +1
// or else 0, then use:
sign = 1 ^ ((unsigned int)v >> (sizeof(int) * CHAR_BIT - 1)); // if v < 0 then 0, else 1
// Caveat: On March 7, 2003, Angus Duggan pointed out that the 1989 ANSI C
// specification leaves the result of signed right-shift implementation-defined,
// so on some systems this hack might not work. For greater portability, Toby
// Speight suggested on September 28, 2005 that CHAR_BIT be used here and
// throughout rather than assuming bytes were 8 bits long. Angus recommended
// the more portable versions above, involving casting on March 4, 2006.
// Rohit Garg suggested the version for non-negative integers on September 12, 2009.
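For reference, the two portable variants from the excerpt above can be wrapped into functions as follows. This is a sketch; the function names are hypothetical and not taken from the Bit Twiddling Hacks page.

```cpp
#include <climits>

// -1, 0, or +1, using only comparisons (no bit tricks, no signed shift).
inline int sign_portable(int v) {
    return (v > 0) - (v < 0);
}

// -1 if v is negative, else 0; the shift is done on an unsigned value,
// so it does not rely on the behavior of signed right shift.
inline int sign_mask(int v) {
    return -static_cast<int>(static_cast<unsigned int>(v) >> (sizeof(int) * CHAR_BIT - 1));
}
```

Neither function uses an if or a ternary, although the compiler is free to implement the comparisons however it likes.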
#include <stdio.h>
int main(void)
{
int n = -5; // example value; assuming int to be 32 bits long
// shift it right 31 times so that the MSB comes to the LSB's position
// and then AND it with 0x1
if (((n >> 31) & 0x1) == 1) {
printf("negative number\n");
} else {
printf("positive number\n");
}
return 0;
}
Signed integers and floating-point numbers normally use the most significant bit to store the sign, so if you know the size you can extract that information from the most significant bit.
There is generally little benefit in doing this, since some sort of comparison will still be needed to use the information, and it is just as easy for a processor to test whether something is negative as it is to test whether it is non-zero. In fact, on ARM processors, checking the most significant bit will normally be MORE expensive than checking whether the value is negative up front.
It is quite simple
It can be easily done by
return ((!!x) | (x >> 31));
it returns
1 for a positive number,
-1 for a negative, and
0 for zero
This can not be done in a portable way with bit operations in C. The representations for signed integer types that the standard allows can be much weirder than you might suspect. In particular the value with sign bit on and otherwise zero need not be a permissible value for the signed type nor the unsigned type, but a so-called trap representation for both types.
All computations with bit operators that you can thus do might have a result that leads to undefined behavior.
In any case as some of the other answers suggest, this is not really necessary and comparison with < or > should suffice in any practical context, is more efficient, easier to read... so just do it that way.
// if (x < 0) return -1
// else if (x == 0) return 0
// else return 1
int sign(int x) {
// x_is_not_zero = 0 if x is 0 else x_is_not_zero = 1
int x_is_not_zero = (( x | (~x + 1)) >> 31) & 0x1;
return (x & 0x01 << 31) >> 31 | x_is_not_zero; // for negative x, the last operand doesn't matter
}
Here's exactly what you want!
Here is an update related to C++11 for this old question. It is also worth considering std::signbit.
On Compiler Explorer using gcc 7.3 64bit with -O3 optimization, this code
bool s1(double d)
{
return d < 0.0;
}
generates
s1(double):
pxor xmm1, xmm1
ucomisd xmm1, xmm0
seta al
ret
And this code
bool s2(double d)
{
return std::signbit(d);
}
generates
s2(double):
movmskpd eax, xmm0
and eax, 1
ret
You would need to profile to find out whether there is any speed difference, but the signbit version does use one fewer instruction.
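Note that the two functions are not interchangeable for every input: std::signbit reports the sign bit itself, so it returns true for negative zero, while the comparison d < 0.0 does not. A small sketch to check this (the function names are hypothetical):

```cpp
#include <cmath>

// Comparison-based test: false for -0.0 and for NaN.
inline bool less_than_zero(double d) {
    return d < 0.0;
}

// Sign-bit test: true whenever the sign bit is set, including -0.0.
inline bool has_sign_bit(double d) {
    return std::signbit(d);
}
```

For ordinary non-zero values such as -1.5 or 2.0 both functions agree.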
When you're sure about the size of an integer (assuming 16-bit int):
bool is_negative = (unsigned) signed_int_value >> 15;
When you are unsure of the size of integers:
bool is_negative = (unsigned) signed_int_value >> (sizeof(int)*8)-1; //where 8 is bits
The unsigned keyword is optional.
if( (num>>sizeof(int)*8 - 1) == 0 )
// number is positive
else
// number is negative
If the shifted value is 0, the number is positive; otherwise it is negative.
A simpler way to find out if a number is positive or negative:
Let the number be x
check if [x * (-1)] > x. if true x is negative else positive.
You can differentiate between negative/non-negative by looking at the most significant bit.
In all representations for signed integers, that bit will be set to 1 if the number is negative.
There is no test to differentiate between zero and positive, except for a direct test against 0.
To test for negative, you could use
#define IS_NEGATIVE(x) ((x) & (1U << ((sizeof(x)*CHAR_BIT)-1)))
Suppose your number is a=10 (positive). If you shift a a times it will give zero.
i.e:
10>>10 == 0
So you can check if the number is positive, but in case a=-10 (negative):
-10>>-10 == -1
So you can combine those in an if:
if(!(a>>a))
print number is positive
else
print no. is negative
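#include <stdio.h>
int checksign(int n)
{
// true when n is non-negative, i.e. the sign bit (bit 31 of a 32-bit int) is clear
return (n >= 0 && (n & (1u << (32 - 1))) == 0);
}
int main(void)
{
int num = 11;
if (checksign(num))
{
printf("Non-negative number");
}
else
{
printf("Negative number");
}
return 0;
}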
Without if:
string pole[2] = {"+", "-"};
long long x;
while (true){
cin >> x;
cout << pole[x/-((x*(-1))-1)] << "\n\n";
}
(0 is reported as "+", and the expression fails for -1, where the divisor becomes zero)
if(n & (1<<31))
{
printf("Negative number");
}
else{
printf("positive number");
}
It checks the most significant bit of n (assuming a 32-bit int) with the & operation: if that bit is 1, the condition is true and the number is negative; otherwise the number is positive.