Unexpected bit shifting result - c++

I have:
(gdb) display/t raw_data[4]<<8
24: /t raw_data[4]<<8 = 1111100000000
(gdb) display/t raw_data[5]
25: /t raw_data[5] = 11100111
(gdb) display/t (raw_data[4]<<8)|raw_data[5]
26: /t (raw_data[4]<<8)|raw_data[5] = 11111111111111111111111111100111
Why is the result on line 26 not 0001111111100111? Thanks.
edit: More specifically:
(gdb) display/t raw_data[5]
27: /t raw_data[5] = 11100111
(gdb) display/t 0|raw_data[5]
28: /t 0|raw_data[5] = 11111111111111111111111111100111
Why is the result on line 28 not 11100111?

Your data type is a char, which on your platform appears to be signed. The entry raw_data[5] holds the negative number -25.
The print format t prints the data as an unsigned integer in binary. When you print raw_data[5], it is converted to the unsigned char 231, which has only 8 bits. When you do integer arithmetic on the data, the chars are promoted to 32-bit integers.
Promoting the negative char value -25 to a signed int will, of course, yield -25, but its representation as an unsigned int is now 2^32 + x, whereas as an unsigned char it was 2^8 + x. That's where all the ones at the beginning of the 32-bit binary number come from.
It's probably better to work with unsigned raw data.
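If the buffer has to stay a plain char array, one option is to cast before shifting; here is a minimal sketch of that, assuming 8-bit bytes (the function name is illustrative):
#include <stdint.h>

/* Combine two bytes into a 16-bit value without sign extension.
   Casting to unsigned char first keeps the promoted int non-negative. */
uint16_t combine(const char raw_data[])
{
    return (uint16_t)(((unsigned char)raw_data[4] << 8)
                      | (unsigned char)raw_data[5]);
}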

Let's just ignore the first block, since the second block is a minimal reproduction.
Also note that 0 | x preserves the value of x, but causes the usual integral promotions.
Then the second block is not so unexpected.
(gdb) display/t raw_data[5]
27: /t raw_data[5] = 11100111
Ok, raw_data[5] is int8_t(-25)
(gdb) display/t 0|raw_data[5]
28: /t 0|raw_data[5] = 11111111111111111111111111100111
and 0|raw_data[5] is int(-25). Indeed, the value was preserved.
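A small sketch of that promotion outside of gdb (assuming char is signed and int is 32 bits, as on the asker's platform):
#include <stdio.h>

int main(void)
{
    signed char c = -25;               /* bit pattern 11100111 */
    unsigned int promoted = 0 | c;     /* c is promoted to int, then converted */
    printf("%08X\n", (unsigned int)(unsigned char)c);  /* 000000E7 */
    printf("%08X\n", promoted);                        /* FFFFFFE7: sign extension from the promotion */
    return 0;
}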

The constant 8 caused a promotion to a signed integer, so you're seeing sign extension as part of the promotion. Change it to UINT8_C(8). You'll need to include stdint.h for the macro.

Related

Does big integer in AWK only have 53 bits?

Very strangely, I found that in awk, big integers look like they have only 53 bits. Here is my example:
function bits2str(bits,    data, mask)
{
    if (bits == 0)
        return "0"
    mask = 1
    for (; bits != 0; bits = rshift(bits, 1))
        data = (and(bits, mask) ? "1" : "0") data
    while ((length(data) % 8) != 0)
        data = "0" data
    return data
}
BEGIN{
    print 32, "\tlshift 48:\t", lshift(32,48), "\t", bits2str(lshift(32,48))
    print 429, "\tlshift 48:\t", lshift(429,48), "\t", bits2str(lshift(429,48))
}
and the output is:
32 lshift 48: 0 0
429 lshift 48: 3659174697238528 00001101000000000000000000000000000000000000000000000000
but in C++, the output is:
32 lshift 48: 9007199254740992
429 lshift 48: 120752765008871424
After comparing the two outputs, I found that awk's result has only 53 bits,
and then I researched the source code of gawk (starting from line 3021 in the file named builtin.c, gawk 4.1.1, http://ftp.gnu.org/gnu/gawk/), but I found no special handling of int.
So, what causes this? Why is it like this?
In AWK, all numbers are stored in floating point.
From Bitwise function:
For all of these functions, first the double precision floating-point value is converted to the widest C unsigned integer type, then the bitwise operation is performed. If the result cannot be represented exactly as a C double, leading nonzero bits are removed one by one until it can be represented exactly. The result is then converted back into a C double.
Assuming IEEE 754 is used, doubles can only represent integers exactly up to 2^53.
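A quick C sketch of that 53-bit ceiling (assuming IEEE 754 doubles, as stated above):
#include <stdio.h>

int main(void)
{
    double limit = 9007199254740992.0;  /* 2^53 */
    printf("%.0f\n", limit);            /* 9007199254740992 */
    printf("%.0f\n", limit + 1.0);      /* still 9007199254740992: 2^53 + 1 is not representable */
    printf("%.0f\n", limit + 2.0);      /* 9007199254740994 */
    return 0;
}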
If you use gawk, you can add the -M option for arbitrary-precision (big number) arithmetic.
kent$ awk 'BEGIN{print lshift(32,48)}'
0
kent$ awk -M 'BEGIN{print lshift(32,48)}'
9007199254740992

How to display a variable with specified type on windbg

I am using windbg to debug my application, but I can't find a command to dump a variable's value with a specified type.
For example, there is a variable, say A, and its type is int.
Now I'd like to dump variable A as a uint.
How do I do it?
Thanks in advance.
dt is your friend
0:000> dt i
Local var # 0x18f2cc Type int
0n-2
0:000> dt (uint) 0x18f2cc
CrashTestD!UINT
0xfffffffe
If you want decimal output, Set Number Base 10
0:000> n 10
base is 10
0:000> dt (uint) 0x18f2cc
CrashTestD!UINT
0n4294967294
If you're still wondering, use:
0:000> .formats 0xfffffffe
Evaluate expression:
Hex: fffffffe
Decimal: -2
Octal: 37777777776
Binary: 11111111 11111111 11111111 11111110
Chars: ....
Time: unavailable
Float: low -1.#QNAN high 0
Double: 2.122e-314
Much more in the windbg documentation.
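For comparison, the same reinterpretation can be sketched in C, assuming a 32-bit int as in the session above:
#include <stdio.h>

int main(void)
{
    int i = -2;
    unsigned int u = (unsigned int)i;   /* same bit pattern, 0xFFFFFFFE */
    printf("%d\n", i);    /* -2 */
    printf("%u\n", u);    /* 4294967294 */
    printf("%X\n", u);    /* FFFFFFFE */
    return 0;
}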

Is Shifting more than 32 bits of a uint64_t integer on an x86 machine Undefined Behavior?

Learning the hard way, I tried to left shift a long long and a uint64_t by more than 32 bits on an x86 machine, and it resulted in 0. I vaguely remember having read somewhere that on a 32-bit machine the shift operators only work on the first 32 bits, but I cannot recollect the source.
What I would like to know is whether shifting more than 32 bits of a uint64_t integer on an x86 machine is undefined behavior.
The standard says (6.5.7 in n1570):
3 The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
4 The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
5 The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.
Shifting a uint64_t a distance of less than 64 bits is completely defined by the standard.
Since long long must be at least 64 bits, shifting long long values less than 64 bits is defined by the standard for nonnegative values, if the result doesn't overflow.
Note, however, that if you write a literal that fits into 32 bits, e.g. uint64_t s = 1 << 32 as surmised by @drhirsch, you don't actually shift a 64-bit value but a 32-bit one. That is undefined behaviour.
The most common results are a shift by shift_distance % 32 or 0, depending on what the hardware does (and assuming the compiler's compile-time evaluation emulates the hardware semantics, instead of nasal demons).
Use 1ULL << 63 to make the shifted operand unsigned long long before the shift.
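A short sketch of the difference (variable names are illustrative):
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void)
{
    /* uint64_t bad = 1 << 32;      undefined: 1 is a 32-bit int, shifted by 32 */
    uint64_t good = 1ULL << 32;   /* well-defined: the left operand is 64 bits wide */
    printf("%" PRIu64 "\n", good);   /* 4294967296 */
    return 0;
}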
The C standard requires the shift to work correctly. A particular buggy compiler might have the defect you describe, but that is buggy behaviour.
This is a test program:
#include <stdio.h>
#include <inttypes.h>
int main(void)
{
    uint64_t x = 1;
    for (int i = 0; i < 64; i++)
        printf("%2d: 0x%.16" PRIX64 "\n", i, (x << i));
    return 0;
}
This is the output on an i686 machine running RHEL 5 with GCC 4.1.2, on an x86/64 machine (also running RHEL 5 and GCC 4.1.2), and on an x86/64 Mac (running Mac OS X 10.7.3 with GCC 4.7.0). Since that's the expected result, I conclude that there is no inherent problem on the 32-bit machine, and that GCC at least has not exhibited any such bug since GCC 4.1.2 (and probably never has).
0: 0x0000000000000001
1: 0x0000000000000002
2: 0x0000000000000004
3: 0x0000000000000008
4: 0x0000000000000010
5: 0x0000000000000020
6: 0x0000000000000040
7: 0x0000000000000080
8: 0x0000000000000100
9: 0x0000000000000200
10: 0x0000000000000400
11: 0x0000000000000800
12: 0x0000000000001000
13: 0x0000000000002000
14: 0x0000000000004000
15: 0x0000000000008000
16: 0x0000000000010000
17: 0x0000000000020000
18: 0x0000000000040000
19: 0x0000000000080000
20: 0x0000000000100000
21: 0x0000000000200000
22: 0x0000000000400000
23: 0x0000000000800000
24: 0x0000000001000000
25: 0x0000000002000000
26: 0x0000000004000000
27: 0x0000000008000000
28: 0x0000000010000000
29: 0x0000000020000000
30: 0x0000000040000000
31: 0x0000000080000000
32: 0x0000000100000000
33: 0x0000000200000000
34: 0x0000000400000000
35: 0x0000000800000000
36: 0x0000001000000000
37: 0x0000002000000000
38: 0x0000004000000000
39: 0x0000008000000000
40: 0x0000010000000000
41: 0x0000020000000000
42: 0x0000040000000000
43: 0x0000080000000000
44: 0x0000100000000000
45: 0x0000200000000000
46: 0x0000400000000000
47: 0x0000800000000000
48: 0x0001000000000000
49: 0x0002000000000000
50: 0x0004000000000000
51: 0x0008000000000000
52: 0x0010000000000000
53: 0x0020000000000000
54: 0x0040000000000000
55: 0x0080000000000000
56: 0x0100000000000000
57: 0x0200000000000000
58: 0x0400000000000000
59: 0x0800000000000000
60: 0x1000000000000000
61: 0x2000000000000000
62: 0x4000000000000000
63: 0x8000000000000000
Daniel Fischer's answer answers the question about the C language specification. As for what actually happens on an x86 machine when you issue a shift by a variable amount, refer to the Intel Software Developer Manual Volume 2B, p. 4-506:
The count is masked to 5 bits (or 6 bits
if in 64-bit mode and REX.W is used). The count range is limited to 0 to 31 (or 63 if
64-bit mode and REX.W is used).
So if you attempt to shift by an amount larger than 31 or 63 (for 32- and 64-bit values respectively), the hardware will only use the bottom 5 or 6 bits of the shift amount. So this code:
uint32_t RightShift(uint32_t value, uint32_t count)
{
    return value >> count;
}
Will result in RightShift(2, 33) == 1 on x86 and x86-64. It's still undefined behavior according to the C standard, but on x86, if the compiler compiles it down to a shr instruction, it will have defined behavior on that architecture. But you should still avoid writing this sort of code that depends on architecture-specific quirks.
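A hypothetical demonstration of that masking; this is still undefined behavior per the C standard, and an optimizing compiler that evaluates the shift at compile time may well produce something else:
#include <stdio.h>
#include <stdint.h>

static uint32_t RightShift(uint32_t value, uint32_t count)
{
    return value >> count;   /* undefined if count >= 32 (C11 6.5.7) */
}

int main(void)
{
    /* On x86 the shift count is masked to 5 bits, so 33 & 31 == 1 and 2 >> 1 == 1 */
    printf("%u\n", RightShift(2, 33));
    return 0;
}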
Shifting by an amount between 0 and one less than the width of the type does not cause undefined behavior, but left-shifting a negative number does. Would you be doing that?
On the other hand, right-shifting a negative number is implementation-defined, and most compilers, when right-shifting signed types, propagate the sign bit.
No, it is OK.
ISO 9899:2011 6.5.7 Bitwise shift operators
If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
That isn't the case here, so it is all fine and well-defined.

Analyze float value in gdb

Sample code:
#include <stdio.h>

int main()
{
    float x = 456.876;
    printf("\nx = %f\n", x);
    return 0;
}
In gdb, I executed this code like this:
Breakpoint 1, main () at sample_float.c:5
5 float x = 456.876;
(gdb) n
7 printf ("\nx = %f\n", x);
(gdb) p &x
$1 = (float *) 0x7fffffffd9dc
(gdb) x/4fb &x
0x7fffffffd9dc: 33 112 -28 67
Is it possible to see the value at the address of x, using the x/fb command, as 456.876?
Thanks.
Perhaps I am misreading your question but you can simply do
p/f x
Or
x/f &x
Is that what you were looking for?
I agree with the above answer, but to understand why you got the results that you did:
(gdb) x/4fb &x
0x7fffffffd9dc: 33 112 -28 67
From the gdb manual:
`x/3uh 0x54320' is a request to display three halfwords (h) of memory, formatted as unsigned decimal integers (u), starting at address 0x54320.
Thus, x/4fb &x formats a byte as a float 4 times, not 4 bytes as one float (see the C sketch after the format summary below).
Here is a link to examining memory using gdb
You can use the command x (for "examine") to examine memory in any of several formats,
independently of your program's data types.
x/nfu addr
x addr
x
n, f, and u are all optional parameters that specify how much memory to display and how to format it; addr is an expression giving the address where you want to start displaying memory. If you use defaults for nfu, you need not type the slash `/'. Several commands set convenient defaults for addr.
n, the repeat count
The repeat count is a decimal integer; the default is 1. It specifies how much memory (counting by units u) to display.
f, the display format
The display format is one of the formats used by print, `s' (null-terminated string), or `i' (machine instruction). The default is `x' (hexadecimal) initially.
The default changes each time you use either x or print.
u, the unit size The unit size is any of
b: Bytes.
h: Halfwords (two bytes).
w: Words (four bytes). This is the initial default.
g: Giant words (eight bytes).
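Here is a small C sketch of the difference between the two reads, assuming a little-endian machine like the one in the question; the single-byte reads should reproduce the 33 112 -28 67 output shown above:
#include <stdio.h>
#include <string.h>

int main(void)
{
    float x = 456.876f;
    unsigned char bytes[sizeof x];
    memcpy(bytes, &x, sizeof x);

    /* like x/4fb &x: each byte interpreted on its own */
    printf("%d %d %d %d\n",
           (signed char)bytes[0], (signed char)bytes[1],
           (signed char)bytes[2], (signed char)bytes[3]);   /* 33 112 -28 67 */

    /* like x/fw &x or p/f x: all four bytes as one float */
    float y;
    memcpy(&y, bytes, sizeof y);
    printf("%f\n", y);   /* approximately 456.876007 */
    return 0;
}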

strange output in comparison of float with float literal

float f = 0.7;
if( f == 0.7 )
    printf("equal");
else
    printf("not equal");
Why is the output not equal?
Why does this happen?
This happens because in your statement
if(f == 0.7)
the 0.7 is treated as a double. Try 0.7f to ensure the value is treated as a float:
if(f == 0.7f)
But as Michael suggested in the comments, you should never test for exact equality of floating-point values.
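A minimal sketch of the difference (assuming IEEE 754 floats and doubles, as discussed throughout this thread):
#include <stdio.h>

int main(void)
{
    float f = 0.7f;
    /* f is promoted to double; the double 0.7 has extra fraction bits */
    printf("f == 0.7  : %s\n", (f == 0.7)  ? "true" : "false");   /* false */
    /* both operands are floats, so the identical bit patterns compare equal */
    printf("f == 0.7f : %s\n", (f == 0.7f) ? "true" : "false");   /* true */
    return 0;
}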
This answer to complement the existing ones: note that 0.7 is not representable exactly either as a float (or as a double). If it was represented exactly, then there would be no loss of information when converting to float and then back to double, and you wouldn't have this problem.
It could even be argued that there should be a compiler warning for literal floating-point constants that cannot be represented exactly, especially when the standard is so fuzzy regarding whether the rounding will be made at run-time in the mode that has been set as that time or at compile-time in another rounding mode.
All non-integer numbers that can be represented exactly have 5 as their last decimal digit. Unfortunately, the converse is not true: some numbers have 5 as their last decimal digit and cannot be represented exactly. Small integers can all be represented exactly, and division by a power of 2 transforms a number that can be represented into another that can be represented, as long as you do not enter the realm of denormalized numbers.
First of all, let's look inside a float number. I take 0.1f; it is 4 bytes long (binary32), and in hex it is
3D CC CC CD.
By the IEEE 754 standard, to convert it to decimal we do the following.
In binary, 3D CC CC CD is
0 01111011 10011001100110011001101
Here the first digit is the sign bit. 0 means (-1)^0, so our number is positive.
The next 8 bits are the exponent. In binary it is 01111011, which is 123 in decimal. But the real exponent is 123 - 127 (the bias is always 127) = -4, meaning we need to multiply the number we get below by 2^(-4).
The last 23 bits are the significand. The first bit is weighted by 1/(2^1) (0.5), the second by 1/(2^2) (0.25), and so on. We add up the weights of all the set bits and add 1 to the sum (there is always an implicit leading 1, by the standard). The result is
1.60000002384185791015625
Now let's multiply this number by 2^(-4), which comes from the exponent. We just divide the number above by 2 four times:
0.100000001490116119384765625
I used MS Calculator.
Now the second part: converting from decimal to binary.
I take the number 0.1.
It is easy because there is no integer part. The sign bit is 0.
Now I will work out the exponent and significand. The logic is to repeatedly multiply the number by 2 (0.1*2 = 0.2) and, whenever the result reaches 1, subtract 1 and continue.
The number comes out as .00011001100110011001100110011..., and the standard says we must shift left until we get 1.(something). As you can see, we need 4 shifts, and from this we calculate the exponent (127 - 4 = 123). The significand is now 10011001100110011001100 (the remaining bits are lost).
Now the whole number: the sign bit is 0, the exponent is 123 (01111011), and the significand is 10011001100110011001100, so altogether it is
00111101110011001100110011001100
Let's compare it with what we got in the previous part:
00111101110011001100110011001101
As you can see, the last bits are not equal. That is because I truncated the number. The CPU and compiler know that there is more beyond what the significand can hold, and simply round the last bit up to 1.
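A small sketch to check that bit pattern programmatically (assuming a 32-bit IEEE 754 float):
#include <stdio.h>
#include <string.h>
#include <stdint.h>

int main(void)
{
    float f = 0.1f;
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* copy the float's raw bytes */
    printf("0x%08X\n", bits);         /* 0x3DCCCCCD */
    printf("%.27f\n", f);             /* 0.100000001490116119384765625 */
    return 0;
}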
Another nearly identical question was linked to this one, hence the years-late answer. I don't think the above answers are complete.
int fun1 ( void )
{
    float x=0.7;
    if(x==0.7) return(1);
    else return(0);
}
int fun2 ( void )
{
    float x=1.1;
    if(x==1.1) return(1);
    else return(0);
}
int fun3 ( void )
{
    float x=1.0;
    if(x==1.0) return(1);
    else return(0);
}
int fun4 ( void )
{
    float x=0.0;
    if(x==0.0) return(1);
    else return(0);
}
int fun5 ( void )
{
    float x=0.7;
    if(x==0.7f) return(1);
    else return(0);
}
float fun10 ( void )
{
    return(0.7);
}
double fun11 ( void )
{
    return(0.7);
}
float fun12 ( void )
{
    return(1.0);
}
double fun13 ( void )
{
    return(1.0);
}
Disassembly of section .text:
00000000 <fun1>:
0: e3a00000 mov r0, #0
4: e12fff1e bx lr
00000008 <fun2>:
8: e3a00000 mov r0, #0
c: e12fff1e bx lr
00000010 <fun3>:
10: e3a00001 mov r0, #1
14: e12fff1e bx lr
00000018 <fun4>:
18: e3a00001 mov r0, #1
1c: e12fff1e bx lr
00000020 <fun5>:
20: e3a00001 mov r0, #1
24: e12fff1e bx lr
00000028 <fun10>:
28: e59f0000 ldr r0, [pc] ; 30 <fun10+0x8>
2c: e12fff1e bx lr
30: 3f333333 svccc 0x00333333
00000034 <fun11>:
34: e28f1004 add r1, pc, #4
38: e8910003 ldm r1, {r0, r1}
3c: e12fff1e bx lr
40: 66666666 strbtvs r6, [r6], -r6, ror #12
44: 3fe66666 svccc 0x00e66666
00000048 <fun12>:
48: e3a005fe mov r0, #1065353216 ; 0x3f800000
4c: e12fff1e bx lr
00000050 <fun13>:
50: e3a00000 mov r0, #0
54: e59f1000 ldr r1, [pc] ; 5c <fun13+0xc>
58: e12fff1e bx lr
5c: 3ff00000 svccc 0x00f00000 ; IMB
Why did fun3 and fun4 return one and not the others? Why does fun5 work?
It is about the language. The language says that 0.7 is a double unless you use the syntax 0.7f, in which case it is a single. So
float x=0.7;
the double 0.7 is converted to a single and stored in x.
if(x==0.7) return(1);
The language says we have to promote to the higher precision so the single in x is converted to a double and compared with the double 0.7.
00000028 <fun10>:
28: e59f0000 ldr r0, [pc] ; 30 <fun10+0x8>
2c: e12fff1e bx lr
30: 3f333333 svccc 0x00333333
00000034 <fun11>:
34: e28f1004 add r1, pc, #4
38: e8910003 ldm r1, {r0, r1}
3c: e12fff1e bx lr
40: 66666666 strbtvs r6, [r6], -r6, ror #12
44: 3fe66666 svccc 0x00e66666
single 3f333333
double 3fe6666666666666
As Alexandr pointed out, if that answer remains, per IEEE 754 a single is
seeeeeeeefffffffffffffffffffffff
and a double is
seeeeeeeeeeeffffffffffffffffffffffffffffffffffffffffffffffffffff
with 52 bits of fraction rather than the 23 that single has.
00111111001100110011... single
001111111110011001100110... double
0 01111110 01100110011... single
0 01111111110 01100110011... double
Just like 1/3 in base 10 is 0.3333333... forever, we have a repeating pattern here: 0110.
01100110011001100110011 single, 23 bits
01100110011001100110011001100110.... double 52 bits.
And here is the answer.
if(x==0.7) return(1);
x contains 01100110011001100110011 as its fraction; when that gets converted back
to double, the fraction is
01100110011001100110011000000000....
which is not equal to
01100110011001100110011001100110...
but here
if(x==0.7f) return(1);
That promotion doesn't happen; the same bit patterns are compared with each other.
Why does 1.0 work?
00000048 <fun12>:
48: e3a005fe mov r0, #1065353216 ; 0x3f800000
4c: e12fff1e bx lr
00000050 <fun13>:
50: e3a00000 mov r0, #0
54: e59f1000 ldr r1, [pc] ; 5c <fun13+0xc>
58: e12fff1e bx lr
5c: 3ff00000 svccc 0x00f00000 ; IMB
0011111110000000...
0011111111110000000...
0 01111111 0000000...
0 01111111111 0000000...
In both cases the fraction is all zeros. So converting from double to single to double there is no loss of precision. It converts from single to double exactly and the bit comparison of the two values works.
The highest-voted and accepted answer by halfdan is the correct answer: this is a case of mixed precision, and you should never do an exact equality comparison.
The why wasn't shown in that answer: 0.7 fails, 1.0 works, but why 0.7 fails wasn't shown. In a duplicate question, 1.1 fails as well.
Edit
The equals can be taken out of the problem here; it is a different question that has already been answered, but it is the same problem and also has the "what the ..." initial shock.
int fun1 ( void )
{
    float x=0.7;
    if(x<0.7) return(1);
    else return(0);
}
int fun2 ( void )
{
    float x=0.6;
    if(x<0.6) return(1);
    else return(0);
}
Disassembly of section .text:
00000000 <fun1>:
0: e3a00001 mov r0, #1
4: e12fff1e bx lr
00000008 <fun2>:
8: e3a00000 mov r0, #0
c: e12fff1e bx lr
Why does one show as less than and the other not, when they should be equal?
From above we know the 0.7 story.
01100110011001100110011 single, 23 bits
01100110011001100110011001100110.... double 52 bits.
01100110011001100110011000000000....
is less than.
01100110011001100110011001100110...
0.6 is a different repeating pattern, 0011 rather than 0110, but when converted from a double to a single, or in general when represented as an IEEE 754 single:
00110011001100110011001100110011.... double 52 bits.
00110011001100110011001 is NOT the fraction for single
00110011001100110011010 IS the fraction for single
IEEE 754 uses rounding modes: round to nearest (the usual default), round up, round down, and round toward zero. If you remember rounding in grade school: take 12345678; if I wanted to round to the 3rd digit from the top it would be 12300000, but rounded to the next digit it is 12350000, because the digit after is 5 or greater, so we round up. 5 is half of 10, the base (decimal); in binary, 1 is half of the base, so if the digit after the position we want to round to is a 1, then round up, otherwise don't. So for 0.7 we didn't round up, and for 0.6 we do round up.
And now it is easy to see that
00110011001100110011010
converted to a double because of (x<0.7)
00110011001100110011010000000000....
is greater than
00110011001100110011001100110011....
So without even having to talk about using equals, the issue still presents itself: 0.7 is a double, 0.7f is a single, and the operation is promoted to the higher precision if they differ.
The problem you're facing is, as other commenters have noted, that it's generally unsafe to test for exact equivalency between floats, as initialization errors, or rounding errors in calculations can introduce minor differences that will cause the == operator to return false.
A better practice is to do something like
float f = 0.7;
if( fabs(f - 0.7) < FLT_EPSILON )
printf("equal");
else
printf("not equal");
Assuming that FLT_EPSILON has been defined as an appropriately small float value for your platform.
Since the rounding or initialization errors will be unlikely to exceed the value of FLT_EPSILON, this will give you the reliable equivalency test you're looking for.
A lot of the answers around the web make the mistake of looking at the absolute difference between floating-point numbers; this is only valid for special cases. The robust way is to look at the relative difference, as below:
// Floating point comparison:
#include <cmath>
#include <algorithm>

bool CheckFP32Equal(float referenceValue, float value)
{
    const float fp32_epsilon = float(1E-7);
    float abs_diff = std::abs(referenceValue - value);

    // Both identical zero is a special case
    if (referenceValue == 0.0f && value == 0.0f)
        return true;

    float rel_diff = abs_diff / std::max(std::abs(referenceValue), std::abs(value));
    if (rel_diff < fp32_epsilon)
        return true;
    else
        return false;
}
Consider this:
#include <stdio.h>

int main()
{
    float a = 0.7;
    if (0.7 > a)
        printf("Hi\n");
    else
        printf("Hello\n");
    return 0;
}
In if (0.7 > a), a is a float variable and 0.7 is a double constant. The double constant 0.7 is greater than the float variable a, hence the if condition is satisfied and it prints 'Hi'.
Example:
#include <stdio.h>

int main()
{
    float a = 0.7;
    printf("%.10f %.10f\n", 0.7, a);
    return 0;
}
Output:
0.7000000000 0.6999999881
The floating-point value stored in the variable and the one in the constant do not have the same data type; it's the difference in precision between the data types.
If you change the data type of the f variable to double, it will print equal. This is because floating-point constants are treated as double by default (and unsuffixed integer constants as int), and double's precision is higher than float's. It will be completely clear if you look at how floating-point numbers are converted to binary.
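A minimal sketch of that change, using the same comparison as the original question:
#include <stdio.h>

int main(void)
{
    double f = 0.7;              /* the variable is now a double, like the constant */
    if (f == 0.7)
        printf("equal\n");       /* prints "equal": both sides have the same precision */
    else
        printf("not equal\n");
    return 0;
}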