By left shifting, can a number be set to zero? - C++

I have a question about the left shift operator:
int i = 1;
i <<= (sizeof (int) *8);
cout << i;
It prints 1.
i has been initialized to 1.
Shifting left by the full width of the integer should move the 1 out of the value entirely and fill the low bits with 0s, so I was expecting the output to be 0.
How and why is it 1?

Let's say sizeof(int) is 4 on your platform. Then the expression becomes:
i = i << 32;
The standard says:
6.5.7-3
If the value of the right operand is negative or is greater than or
equal to the width of the promoted left operand, the behavior is
undefined.

As cnicutar said, your example exhibits undefined behaviour. That means the compiler is free to do whatever its vendor sees fit, including making demons fly out of your nose or simply leaving the value untouched.
What you can do to convince yourself that left shifting by the full number of bits will produce 0 is to split the shift into two steps:
int i = 1;
i <<= (sizeof (int) *4);
i <<= (sizeof (int) *4);
cout << i;

Expanding on the previous answer...
On the x86 platform your code would get compiled down to something like this:
; 32-bit ints:
mov cl, 32
shl dword ptr i, cl
The CPU shifts the dword in the variable i by the value contained in the cl register, modulo 32. So 32 modulo 32 yields 0, and the shift doesn't actually occur. That's perfectly fine per the C standard; in fact, the wording of 6.5.7-3 exists because this CPU behavior was quite common back in the day and influenced the standard.

As already mentioned by others, according to the C standard the behavior of the shift is undefined.
That said, the program prints 1. A low-level explanation of why it prints 1 is as follows:
When compiling without optimizations, the compiler (GCC, clang) emits the SHL instruction:
...
mov $32,%ecx
shll %cl,0x1c(%esp)
...
The Intel documentation for SHL instruction says:
SAL/SAR/SHL/SHR—Shift
The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W is used). The count range is limited to 0 to 31 (or 63 if 64-bit mode and REX.W is used).
Masking the shift count 32 (binary 00100000) to 5 bits yields 0 (binary 00000000). Therefore the shll %cl,0x1c(%esp) instruction isn't doing any shifting and leaves the value of i unchanged.

Related

How does a C/C++ compiler optimize division by non-powers-of-two? [duplicate]

I've been reading about div and mul assembly operations, and I decided to see them in action by writing a simple program in C:
File division.c
#include <stdlib.h>
#include <stdio.h>
int main()
{
    size_t i = 9;
    size_t j = i / 5;
    printf("%zu\n", j);
    return 0;
}
And then generating assembly language code with:
gcc -S division.c -O0 -masm=intel
But looking at generated division.s file, it doesn't contain any div operations! Instead, it does some kind of black magic with bit shifting and magic numbers. Here's a code snippet that computes i/5:
mov rax, QWORD PTR [rbp-16] ; Move i (=9) to RAX
movabs rdx, -3689348814741910323 ; Move some magic number to RDX (?)
mul rdx ; Multiply 9 by magic number
mov rax, rdx ; Take only the upper 64 bits of the result
shr rax, 2 ; Shift these bits 2 places to the right (?)
mov QWORD PTR [rbp-8], rax ; Magically, RAX contains 9/5=1 now,
; so we can assign it to j
What's going on here? Why doesn't GCC use div at all? How does it generate this magic number and why does everything work?
Integer division is one of the slowest arithmetic operations you can perform on a modern processor, with latency of up to dozens of cycles and poor throughput. (For x86, see Agner Fog's instruction tables and microarch guide.)
If you know the divisor ahead of time, you can avoid the division by replacing it with a set of other operations (multiplications, additions, and shifts) that have the equivalent effect. Even if several operations are needed, it's often still a heck of a lot faster than the integer division itself.
Implementing the C / operator this way instead of with a multi-instruction sequence involving div is just GCC's default way of doing division by constants. It doesn't require optimizing across operations and doesn't change anything even for debugging. (Using -Os for small code size does get GCC to use div, though.) Using a multiplicative inverse instead of division is like using lea instead of mul and add.
As a result, you only tend to see div or idiv in the output if the divisor isn't known at compile-time.
For information on how the compiler generates these sequences, as well as code to let you generate them for yourself (almost certainly unnecessary unless you're working with a braindead compiler), see libdivide.
Dividing by 5 is the same as multiplying by 1/5, which is again the same as multiplying by 4/5 and shifting right 2 bits. The value concerned is CCCCCCCCCCCCCCCD in hex, which is the binary representation of 4/5 if put after a hexadecimal point (i.e. the binary for four fifths is 0.110011001100 recurring - see below for why). I think you can take it from here! You might want to check out fixed point arithmetic (though note it's rounded to an integer at the end).
As to why, multiplication is faster than division, and when the divisor is fixed, this is a faster route.
See Reciprocal Multiplication, a tutorial for a detailed writeup about how it works, explaining in terms of fixed-point. It shows how the algorithm for finding the reciprocal works, and how to handle signed division and modulo.
Let's consider for a minute why 0.CCCCCCCC... (hex) or 0.110011001100... binary is 4/5. Divide the binary representation by 4 (shift right 2 places), and we'll get 0.001100110011..., which by trivial inspection can be added to the original to get 0.111111111111..., which is obviously equal to 1, the same way 0.9999999... in decimal is equal to one. Therefore, we know that x + x/4 = 1, so 5x/4 = 1, x = 4/5. This is then represented as CCCCCCCCCCCCCCCD in hex for rounding (as the binary digit beyond the last one present would be a 1).
In general multiplication is much faster than division. So if we can get away with multiplying by the reciprocal instead we can significantly speed up division by a constant
A wrinkle is that we cannot represent the reciprocal exactly (unless the division was by a power of two but in that case we can usually just convert the division to a bit shift). So to ensure correct answers we have to be careful that the error in our reciprocal does not cause errors in our final result.
-3689348814741910323 is 0xCCCCCCCCCCCCCCCD which is a value of just over 4/5 expressed in 0.64 fixed point.
When we multiply a 64 bit integer by a 0.64 fixed point number we get a 64.64 result. We truncate the value to a 64-bit integer (effectively rounding it towards zero) and then perform a further shift which divides by four and again truncates. By looking at the bit level it is clear that we can treat both truncations as a single truncation.
This clearly gives us at least an approximation of division by 5 but does it give us an exact answer correctly rounded towards zero?
To get an exact answer the error needs to be small enough not to push the answer over a rounding boundary.
The exact answer to a division by 5 will always have a fractional part of 0, 1/5, 2/5, 3/5 or 4/5 . Therefore a positive error of less than 1/5 in the multiplied and shifted result will never push the result over a rounding boundary.
The error in our constant is (1/5) * 2^-64. The value of i is less than 2^64, so the error after multiplying is less than 1/5. After the division by 4 the error is less than (1/5) * 2^-2.
(1/5) * 2^-2 < 1/5, so the answer will always be equal to doing an exact division and rounding towards zero.
Unfortunately this doesn't work for all divisors.
If we try to represent 4/7 as a 0.64 fixed point number with rounding away from zero we end up with an error of (6/7) * 2^-64. After multiplying by an i value of just under 2^64 we end up with an error just under 6/7, and after dividing by four we end up with an error of just under 1.5/7, which is greater than 1/7.
So to implement division by 7 correctly we need to multiply by a 0.65 fixed point number. We can implement that by multiplying by the lower 64 bits of our fixed point number, then adding the original number (this may overflow into the carry bit), then doing a rotate through carry.
Here is a link to a document describing the algorithm that produces the values and code I see with Visual Studio (in most cases), and that I assume is still used in GCC for division of a variable integer by a constant integer.
http://gmplib.org/~tege/divcnst-pldi94.pdf
In the article, a uword has N bits, a udword has 2N bits, n = numerator = dividend, d = denominator = divisor, ℓ is initially set to ceil(log2(d)), shpre is pre-shift (used before multiply) = e = number of trailing zero bits in d, shpost is post-shift (used after multiply), prec is precision = N - e = N - shpre. The goal is to optimize calculation of n/d using a pre-shift, multiply, and post-shift.
Scroll down to figure 6.2, which defines how a udword multiplier (max size N+1 bits) is generated, but doesn't clearly explain the process. I'll explain this below.
Figure 4.2 and figure 6.2 show how the multiplier can be reduced to an N-bit or smaller multiplier for most divisors. Equation 4.5 explains how the formula used to deal with N+1 bit multipliers in figures 4.1 and 4.2 was derived.
In the case of modern X86 and other processors, multiply time is fixed, so pre-shift doesn't help on these processors, but it still helps to reduce the multiplier from N+1 bits to N bits. I don't know if GCC or Visual Studio have eliminated pre-shift for X86 targets.
Going back to Figure 6.2: the numerator (dividend) for mlow and mhigh can be larger than a udword only when the denominator (divisor) > 2^(N-1) (when ℓ == N, mlow = 2^(2N)); in this case the optimized replacement for n/d is a compare (if n >= d, q = 1, else q = 0), so no multiplier is generated. The initial values of mlow and mhigh will be N+1 bits, and two udword/uword divides can be used to produce each N+1 bit value (mlow or mhigh). Using X86 in 64 bit mode as an example:
; upper 8 bytes of dividend = 2^(ℓ) = (upper part of 2^(N+ℓ))
; lower 8 bytes of dividend for mlow = 0
; lower 8 bytes of dividend for mhigh = 2^(N+ℓ-prec) = 2^(ℓ+shpre) = 2^(ℓ+e)
dividend dq 2 dup(?) ;16 byte dividend
divisor dq 1 dup(?) ; 8 byte divisor
; ...
mov rcx,divisor
mov rdx,0
mov rax,dividend+8 ;upper 8 bytes of dividend
div rcx ;after div, rax == 1
mov rax,dividend ;lower 8 bytes of dividend
div rcx
mov rdx,1 ;rdx:rax = N+1 bit value = 65 bit value
You can test this with GCC. You've already seen how j = i/5 is handled. Take a look at how j = i/7 is handled (which should be the N+1 bit multiplier case).
On most current processors, multiply has a fixed timing, so a pre-shift is not needed. For X86, the end result is a two instruction sequence for most divisors, and a five instruction sequence for divisors like 7 (in order to emulate an N+1 bit multiplier as shown in equation 4.5 and figure 4.2 of the pdf file). Example X86-64 code:
; rbx = dividend, rax = 64 bit (or less) multiplier, rcx = post shift count
; two instruction sequence for most divisors:
mul rbx ;rdx = upper 64 bits of product
shr rdx,cl ;rdx = quotient
;
; five instruction sequence for divisors like 7
; to emulate 65 bit multiplier (rbx = lower 64 bits of multiplier)
mul rbx ;rdx = upper 64 bits of product
sub rbx,rdx ;rbx -= rdx
shr rbx,1 ;rbx >>= 1
add rdx,rbx ;rdx = upper 64 bits of corrected product
shr rdx,cl ;rdx = quotient
; ...
To explain the 5 instruction sequence: a simple 3 instruction sequence could overflow. Let u64() mean the upper 64 bits (all that is needed for the quotient):
mul rbx ;rdx = u64(dvnd*mplr)
add rdx,rbx ;rdx = u64(dvnd*(2^64 + mplr)), could overflow
shr rdx,cl
To handle this case, cl = post_shift-1. rax = multiplier - 2^64, rbx = dividend. u64() is upper 64 bits. Note that rax = rax<<1 - rax. Quotient is:
u64( ( rbx * (2^64 + rax) ) >> (cl+1) )
= u64( ( rbx * (2^64 + rax<<1 - rax) ) >> (cl+1) )
= u64( ( rbx*2^64 + (rbx*rax)<<1 - rbx*rax ) >> (cl+1) )
= u64( ( rbx*2^64 - rbx*rax + (rbx*rax)<<1 ) >> (cl+1) )
= u64( ( ((rbx*2^64 - rbx*rax) >> 1) + rbx*rax ) >> cl )
mul rbx ; (rbx*rax)
sub rbx,rdx ; (rbx*2^64)-(rbx*rax)
shr rbx,1 ;( (rbx*2^64)-(rbx*rax))>>1
add rdx,rbx ;( ((rbx*2^64)-(rbx*rax))>>1)+(rbx*rax)
shr rdx,cl ;((((rbx*2^64)-(rbx*rax))>>1)+(rbx*rax))>>cl
I will answer from a slightly different angle: because the compiler is allowed to do it.
C and C++ are defined against an abstract machine. The compiler transforms a program written in terms of the abstract machine into a program for the concrete machine, following the as-if rule.
The compiler is allowed to make ANY changes as long as it doesn't change the observable behaviour as specified by the abstract machine. There is no reasonable expectation that the compiler will transform your code in the most straightforward way possible (even though a lot of C programmers assume that). Usually it does this because the compiler wants to optimize performance compared to the straightforward approach (as discussed in the other answers at length).
If under any circumstances the compiler "optimizes" a correct program to something that has a different observable behaviour, that is a compiler bug.
If our code contains any undefined behaviour (signed integer overflow is a classic example), this contract is void.

Assembler division x86 strange numbers [duplicate]


Weird result in left shift (1ull << s == 1 if s == 64) [duplicate]

This question already has answers here:
Why doesn't left bit-shift, "<<", for 32-bit integers work as expected when used more than 32 times?
Why is the result of
uint32_t s = 64;
uint64_t val = 1ull << s;
and
uint64_t s = 64;
uint64_t val = 1ull << s;
1?
But
uint64_t val = 1ull << 0x40;
gets optimized to 0?
I really don't understand why it equals 1. It does not matter whether I use my VC++ or g++ compiler.
And how can I ensure that 1ull << s equals 0 when s equals 64, which in my opinion is the correct result? I also need that result in my program.
This is because on x64, the instruction SHL (when operating on a 64-bit source/destination operand) only uses the bottom 6 bits of the shift amount. In effect, you are shifting by 0 bits.
From the "Intel 64 and IA-32 Architecture Software Developer's Manual" (which can be downloaded from Intel in PDF form and is hard to link to directly), under the entry for "SAL/SAR/SHL/SHR - Shift" instructions:
The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W is used). The count range is limited to 0 to 31 (or 63 if 64-bit mode and REX.W is used).
As commented below, it is also undefined behaviour in the C++ language to shift an integer by more bits than its size. (Thanks to @sgarizvi for the reference.) The C++ standard, under Section 8.5.7 (Shift Operators), states that:
The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand...
That's why the compiler is producing code that gives different results under different conditions (constant or variable shift count, optimized or not, etc.)
About how to "fix" it, I have no clever tricks. You can do something like this:
template <typename IntT>
IntT ShiftLeft(IntT x, unsigned count) {
    // Optional check; depends on how much you want to embrace C++!
    static_assert(std::is_integral_v<IntT>, "This shift only works for integral types.");
    if (count < sizeof(IntT) * 8)
        return x << count;
    else
        return 0;
}
This code would work for signed and unsigned integer types (left-shift is the same for signed and unsigned values.)

Bit fiddling: Negate an int, only if it's negative

I'm working through a codebase that uses the following encoding to indicate sampling with replacement: We maintain an array of ints as indicators of position and presence in the sample, where positive ints indicate a position in another array and negative ints indicate that we should not use the data point in this iteration.
Example:
data_points: [...] // Objects vector of length 5 == position.size()
std::vector<int> position: [3, 4, -3, 1, -2]
would mean that the first element in data_points should go to bucket 3, the second to bucket 4, and the fourth to bucket 1.
The negative values indicate that for this iteration we won't be using those data points, namely the 3rd and 5th data points are marked as excluded because their values in position are negative and have been set with position[i] = ~position[i].
The trick is that we might perform this multiple times, but the positions of the data points in the index should not change. So in the next iteration, if we want to exclude data point 1, and include data point 5 we could do something like:
position[0] = ~position[0] // Bit-wise complement flips the sign on ints and subtracts 1
position[4] = ~position[4]
This will change the position vector into
std::vector<int> position: [-4, 4, -3, 1, 1]
Now on to the question: At the end of each round I want to reset all signs to positive, i.e. position should become [3, 4, 3, 1, 2].
Is there a bit-fiddling trick that will allow me to do this without having an if condition for the sign of the value?
Also, because I'm new to bit fiddling like this, why/how does taking the bit complement of a signed, positive int give us its mathematical complement? (i.e. the same value with the sign flipped)
Edit: The above is wrong; the complement of an int a gives -(a + 1), and depends on the representation of ints, as noted in the answers. So simply taking the positive value of an existing one does not work; we actually need to perform another bitwise complement to get the original value back.
Example:
position[0] = 1
position[0] = ~position[0] // Now position[0] is -2
// So if we did
position[0] = std::abs(position[0]) // This is wrong, position[0] is now 2!
// If we want to reset it to 1, we need to do another complement
position[0] = ~position[0] // Now position[0] is again 1
I recommend not attempting bit fiddling. Partly because you're dealing with signed numbers, and you would throw away portability if you fiddle. Partly because bit fiddling is not as readable as reusable functions.
A simple solution:
std::for_each(position.begin(), position.end(), [](int& v) {
    v = std::abs(v);
});
why/how does taking the bit complement of a signed, positive int give us its mathematical complement? (i.e. the same value with the sign flipped)
It doesn't. Not in general, anyway. It does so only on systems that use 1's complement representation for negative numbers, and the reason for that is simply that this is how the representation is specified: a negative number is represented by the binary complement of the positive value.
The most commonly used representation these days is 2's complement, which doesn't work that way.
Probably the first go-to source for bit twiddling hacks: The eponymous site
int v;           // we want to find the absolute value of v
unsigned int r;  // the result goes here
int const mask = v >> (sizeof(int) * CHAR_BIT - 1);
r = (v + mask) ^ mask;
However, I would question the assumption that position[i] = std::abs(position[i]) has worse performance. You should definitely have profiling results that demonstrate the bit hack being superior before you check in that kind of code.
Feel free to play around with a quick benchmark (with disassembly) of both - I don't think there is a difference in speed:
gcc 8.2
clang 6.0
Also take a look at the assembly that is actually generated:
https://godbolt.org/z/Ghcw_c
Evidently, clang sees your bithack and is not impressed - it generates a conditional move in all cases (which is branch-free). gcc does as you tell it, but has two more implementations of abs in store, some making use of register semantics of the target architecture.
And if you get into (auto-)vectorization, things get even more muddy. You'll have to profile regardless.
Conclusion: Just write std::abs - your compiler will do all the bit-twiddling for you.
To answer the extended question: Again, first write the obvious and intuitive code and then check that your compiler does the right thing: Look ma, no branches!
If you let it have its fun with auto-vectorization then you probably won't understand (or be good at judging) the assembly, so you have to profile anyway. Concrete example:
https://godbolt.org/z/oaaOwJ. clang likes to also unroll the auto-vectorized loop while gcc is more conservative. In any case, it's still branch-free.
Chances are that your compiler understands the nitty-gritty details of instruction scheduling on your target platform better than you. If you don't obscure your intent with bit-magic, it'll do a good job by itself. If it's still a hot spot in your code, you can then go and see if you can hand-craft a better version (but that will probably have to be in assembly).
Use functions to signify intent. Allow the compiler's optimiser to do a better job than you ever will.
#include <cmath>
void include_index(int& val)
{
val = std::abs(val);
}
void exclude_index(int& val)
{
val = -std::abs(val);
}
bool is_included(int const& val)
{
return val > 0;
}
Sample output from godbolt's gcc8 x86_64 compiler (note that it's all bit-twiddling and there are no conditional jumps - the bane of high performance computing):
include_index(int&):
mov eax, DWORD PTR [rdi]
sar eax, 31
xor DWORD PTR [rdi], eax
sub DWORD PTR [rdi], eax
ret
exclude_index(int&):
mov eax, DWORD PTR [rdi]
mov edx, DWORD PTR [rdi]
sar eax, 31
xor edx, eax
sub eax, edx
mov DWORD PTR [rdi], eax
ret
is_included(int const&):
mov eax, DWORD PTR [rdi]
test eax, eax
setg al
ret
https://godbolt.org/z/ni6DOk
Also, because I'm new to bit fiddling like this, why/how does taking the bit complement of a signed, positive int give us its mathematical negation? (i.e. the same value with the sign flipped)
This question deserves an answer all by itself since everyone will tell you that this is how you do it, but nobody ever tells you why.
Notice that 1 - 0 = 1 and 1 - 1 = 0. What this means is that if we do 1 - b, where b is a single bit, the result is the opposite of b, or not b (~b). Also notice how this subtraction will never produce a borrow, this is very important, since b can only be at most 1.
Also notice that subtracting a number with n bits simply means performing n 1-bit subtractions while taking care of borrows. But our special case will never produce a borrow.
Basically we have created a mathematical definition for the bitwise not operation. To flip a bit b, do 1 - b. If we want to flip an n-bit number, do this for every bit. But doing n subtractions in sequence is the same as subtracting two n-bit numbers. So if we want to calculate the bitwise not of an 8-bit number, a, we simply do 11111111 - a, and the same for any n-bit number. Once again this works because subtracting a bit from 1 will never produce a borrow.
But what is the sequence of n "1" bits? It's the value 2^n - 1. So taking the bitwise not of a number, a, is the same as calculating 2^n - 1 - a.
Now, numbers inside a computer are stored as numbers modulo 2^n. This is because we only have a limited number of bits available. You may know that if you work with 8 bits and you do 255 + 1 you'll get 0. This is because an 8-bit number is a number modulo 2^8 = 256, and 255 + 1 = 256. 256 is obviously equal to 0 modulo 256.
But why not do the same backwards? By this logic, 0 - 1 = 255, right? This is indeed correct. Mathematically, -1 and 255 are "congruent" modulo 256. Congruent essentially means equal to, but it's used to differentiate between regular equality and equality in modular arithmetic.
Notice in fact how 0 is also congruent to 256 modulo 256. So 0 - 1 = 256 - 1 = 255. 256 is our modulus, 2^8. But if bitwise not is defined as 2^n - 1 - a, then we have ~a = 2^8 - 1 - a. You'll notice how we have that - 1 in the middle. We can remove that by adding 1.
So we now have ~a + 1 = 2^n - 1 - a + 1 = 2^n - a. But 2^n - a is the negative of a modulo 2^n. And so here we have our negative number. This is called two's complement, and it's used in pretty much every modern processor, because it's the mathematical definition of a negative number in modular arithmetic modulo 2^n, and because numbers inside a processor behave as if they were modulo 2^n the math just works out by itself. You can add and subtract without any extra steps. Multiplication and division do require "sign extension", but that's just a quirk of how those operations are defined; the meaning of the number doesn't change when extending the sign.
Of course with this method you lose a bit, because you now have half of the numbers being positive and the other half negative, but you can't just magically add a bit to your processor, so the new range of values you can represent is from -2^(n-1) to 2^(n-1) - 1 inclusive.
Alternatively, you can keep the number as it is and not add 1 at the end. This is known as the one's complement. Of course, this is not quite the same as the mathematical negative, so adding, subtracting, multiplying and dividing don't just work out of the box, you need extra steps to adjust the result. This is why two's complement is the de facto standard for signed arithmetic. There's also the problem that, in one's complement, both 0 and 2^n - 1 represent the same quantity, zero, while in two's complement, negative 0 is still correctly 0 (because ~0 + 1 = 2^n - 1 + 1 = 2^n = 0). I think one's complement is used in the Internet Protocol as a checksum, but other than that it has very limited purpose.
But be aware, "de facto" standard means that this is just what everyone does, but there is no rule that says it MUST be done this way, so always check the documentation of your target architecture to make sure you're doing the right thing. Even though, let's be honest, the chances of finding a one's complement processor out there nowadays are pretty much nil unless you're working on some extremely specific architecture; but still, better safe than sorry.
"Is there a bit-fiddling trick that will allow me to do this without having an if condition for the sign of the value?"
Do you need to keep the numeric value of the number being changed from a negative value?
If not, you can use std::max to set the negative values to zero
iValue = std::max(iValue, 0); // returns zero if iValue is less than zero
If you need to keep the numeric value, but just change from negative to positive, then
iValue = std::abs(iValue); // always returns a positive value of iValue

Weird behavior of right shift operator (1 >> 32)

I recently faced a strange behavior using the right-shift operator.
The following program:
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <stdint.h>
int foo(int a, int b)
{
return a >> b;
}
int bar(uint64_t a, int b)
{
return a >> b;
}
int main(int argc, char** argv)
{
std::cout << "foo(1, 32): " << foo(1, 32) << std::endl;
std::cout << "bar(1, 32): " << bar(1, 32) << std::endl;
std::cout << "1 >> 32: " << (1 >> 32) << std::endl; //warning here
std::cout << "(int)1 >> (int)32: " << ((int)1 >> (int)32) << std::endl; //warning here
return EXIT_SUCCESS;
}
Outputs:
foo(1, 32): 1 // Should be 0 (but I guess I'm missing something)
bar(1, 32): 0
1 >> 32: 0
(int)1 >> (int)32: 0
What happens in the foo() function? I understand that the only difference between it and the last two lines is that the last two lines are evaluated at compile time. And why does it "work" if I use a 64-bit integer?
Any light on this will be greatly appreciated!
Surely related, here is what g++ gives:
> g++ -o test test.cpp
test.cpp: In function 'int main(int, char**)':
test.cpp:20:36: warning: right shift count >= width of type
test.cpp:21:56: warning: right shift count >= width of type
It's likely the CPU is actually computing
a >> (b % 32)
in foo; meanwhile, 1 >> 32 is a constant expression, so the compiler folds it at compile time, and its constant evaluator happens to produce 0.
Since the standard (C++98 §5.8/1) states that
The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand.
there is no contradiction having foo(1,32) and 1>>32 giving different results.
On the other hand, in bar you provided a 64-bit unsigned value; since 64 > 32, it is guaranteed the result must be 1 / 2^32 = 0. Nevertheless, if you write
bar(1, 64);
you may still get 1.
Edit: The logical right shift (SHR) behaves like a >> (b % 32) (or b % 64 for 64-bit operands) on x86/x86-64 (Intel #253667, Page 4-404):
The destination operand can be a register or a memory location. The count operand can be an immediate value or the CL register. The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W is used). The count range is limited to 0 to 31 (or 63 if 64-bit mode and REX.W is used). A special opcode encoding is provided for a count of 1.
However, on ARM (armv6&7, at least), the logical right-shift (LSR) is implemented as (ARMISA Page A2-6)
(bits(N), bit) LSR_C(bits(N) x, integer shift)
assert shift > 0;
extended_x = ZeroExtend(x, shift+N);
result = extended_x<shift+N-1:shift>;
carry_out = extended_x<shift-1>;
return (result, carry_out);
where (ARMISA Page AppxB-13)
ZeroExtend(x,i) = Replicate('0', i-Len(x)) : x
This guarantees a right shift of ≥32 will produce zero. For example, when this code is run on the iPhone, foo(1,32) will give 0.
This shows that shifting a 32-bit integer by ≥ 32 is non-portable.
OK. So it's in §5.8/1:
The operands shall be of integral or enumeration type and integral promotions are performed. The type of the result is
that of the promoted left operand. The behavior is undefined if the right operand is negative, or greater than or equal to
the length in bits of the promoted left operand.
So you have an Undefined Behaviour(tm).
What happens in foo is that the shift width is greater than or equal to the size of the data being shifted. In the C99 standard that results in undefined behaviour. It's probably the same in whatever C++ standard MS VC++ is built to.
The reason for this is to allow compiler designers to take advantage of any CPU hardware support for shifts. For example, the i386 architecture has an instruction to shift a 32 bit word by a number of bits, but the number of bits is defined in a field in the instruction that is 5 bits wide. Most likely, your compiler is generating the instruction by taking your bit shift amount and masking it with 0x1F to get the bit shift in the instruction. This means that shifting by 32 is the same as shifting by 0.
I compiled it on 32-bit Windows using the VC9 compiler, and it warned about the shift. Since sizeof(int) is 4 bytes on my system, the compiler is indicating that right-shifting by 32 bits results in undefined behavior. Since it is undefined, you cannot predict the result. Just for checking, I right-shifted by 31 bits; all the warnings disappeared and the result was as expected (i.e. 0).
The reason is that the standard requires the shift count to be strictly less than the width of the promoted left operand, which is 32 bits for int on my system, so 31 is the largest shift amount with defined behavior.
The warning says it all!
But in fairness I got bitten by the same error once.
int a = 1;
cout << ( a >> 32);
is completely undefined. In fact the compiler generally gives a different result than the runtime in my experience. What I mean is that if the compiler can evaluate the shift expression at compile time, it may give you a different result from the same expression evaluated at runtime.
foo(1,32) ends up shifting by nothing: on x86 the hardware masks the shift count to 5 bits, so a count of 32 becomes 32 % 32 = 0 and the single 1 bit stays exactly where it was.
bar(1,32) operates on a 64-bit value, where the count is masked to 6 bits, so the shift by 32 really happens and 1 >> 32 yields 0.
1 >> 32 is folded by the compiler, whose constant evaluator simply computes the mathematical result, 0, instead of mimicking the CPU's count masking.
Same thing for ((int)1 >> (int)32).