Why IEEE 754 single precision has exponent range from -127 to 128 but not from -128 to 127 - ieee-754

In IEEE 754 single precision, why does the 8-bit exponent range from -127 to 128 rather than from -128 to 127?
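For concreteness, here is a minimal sketch (assuming an IEEE 754 binary32 float and a 32-bit unsigned type from <cstdint>, using memcpy so the byte inspection is well defined) that reads back the stored 8-bit exponent field and shows the bias of 127:

#include <cstdint>
#include <cstring>
#include <iostream>

int main() {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");
    float f = 1.0f;                          // 1.0 = 1.0 * 2^0
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);     // copy the object representation
    unsigned stored = (bits >> 23) & 0xff;   // the 8-bit exponent field
    std::cout << "stored: " << stored                  // 127
              << ", unbiased: " << int(stored) - 127   // 0
              << '\n';
    return 0;
}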

Related

What is the correct way to get the binary representation of long double? [duplicate]

This question already has an answer here:
Why does a long double take up only 10 bytes in a string? [duplicate]
Here's my attempt:
#include <cstdint>   // int_fast16_t
#include <iostream>

union newType {
    long double firstPart;
    unsigned char secondPart[sizeof(firstPart)];
} lDouble;

int main() {
    lDouble.firstPart = -16.5;
    for (int_fast16_t i { sizeof(lDouble) - 1 }; i >= 0; --i)
        std::cout << (int)lDouble.secondPart[i] << " ";
    return 0;
}
Output: 0 0 0 0 0 0 192 3 132 0 0 0 0 0 0 0
Hex: 0 0 0 0 0 0 c0 3 84 0 0 0 0 0 0 0
And I almost agree with the part "c0 3 84", which is "1100 0000 0000 0011 1000 0100".
-16.5 = -1.03125 * 2^4 = (-1 + (-0.5) * 2^-4) * 2^4
Thus, the 117th bit of my fraction part must be 1 and after 5th division I'll get only "0".
sign(-): 1
exponent(2^4): 4 + 16383 = 16387 = 100 0000 0000 0011
fraction: 0000 1000 and 104 '0'
Result: 1| 100 0000 0000 0011| 0000 1000 and 104 '0'
Hex: c 0 0 3 0 8 and 26 '0'
Or: c0 3 8 0 0 0 0 0 0 0 0 0 0 0 0 0
I don't get two things:
"c0 3 84" - where did I lose the 4 in my calculations? My guess is that it somehow stores the leading 1 (the 113th bit) and it shouldn't be stored. Then there's 1000 0100 instead of 0000 1000 (after "c0 3"), and that's exactly "84". But we always store 112 bits and the 1 is always implicit.
Why doesn't my output start from 192? Why does it start from 0? I thought that the first bit is the sign bit, then the exponent (15 bits) and the fraction (112 bits).
I've managed to represent other data types (double, float, unsigned char, etc.). With double I went with a similar approach and got the expected result (e.g. double -16.5 outputs 192 48 128 0 0 0 0 0, or c0 30 80 0 0 0 0 0).
Of course I've tested the solution from How to print binary representation of a long double as in computer memory?
Values for my -16.5 are: 0 0 0 0 0 0 0 0x84 0x3 0xc0 0xe2 0x71 0xf 0x56 0 0
If I reverse this I get: 0 0 56 f 71 e2 c0 3 84 0 0 0 0 0 0 0
And I don't understand why (again) the sequence doesn't start with the sign bit. What are those "56 f 71 e2 c0"? Where do they come from? And why (again) is there a "4" after the "8"?
What is the correct way to get the binary representation of long double?
Same as the way of getting the binary representation of any trivial type. Reinterpreting it as an array of unsigned char and iterating over each byte is a typical and well-defined solution.
std::bitset helps with the binary representation:
#include <bitset>
#include <climits>
#include <cstddef>
#include <iostream>

int main() {
    long double ld = -16.5;
    unsigned char* it = reinterpret_cast<unsigned char*>(&ld);
    for (std::size_t i = 0; i < sizeof(ld); i++) {
        std::cout
            << "byte "
            << i
            << '\t'
            << std::bitset<CHAR_BIT>(it[i])
            << '\t'
            << std::hex << int(it[i])
            << '\t'
            << std::dec << int(it[i])
            << '\n';
    }
    return 0;
}
Example output on some system:
byte 0 00000000 0 0
byte 1 00000000 0 0
byte 2 00000000 0 0
byte 3 00000000 0 0
byte 4 00000000 0 0
byte 5 00000000 0 0
byte 6 00000000 0 0
byte 7 10000100 84 132
byte 8 00000011 3 3
byte 9 11000000 c0 192
byte 10 01000000 40 64
byte 11 00000000 0 0
byte 12 00000000 0 0
byte 13 00000000 0 0
byte 14 00000000 0 0
byte 15 00000000 0 0
Note that your example has undefined behaviour in C++ due to reading an inactive member of a union.
Why doesn't my output start from 192?
Probably because those bytes at the end happen to be padding.
Why does it start from 0?
Because the padding contains garbage.
I thought that first bit is sign bit, then exponent (15 bits) and fraction (112 bits).
Not so much the "first" bit, but rather the "most significant" bit, excluding the padding. And evidently, you've assumed the number of bits wrongly as some of it is used for padding.
Note that C++ doesn't guarantee that the floating point representation is IEEE-754 and in fact, long double is often not the 128 bit "quadruple" precision float, but rather 80 bit "extended" precision float. This is the case for example in the x86 CPU architecture family.
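As a quick way to check what your long double actually is, here is a minimal sketch using std::numeric_limits (the printed numbers are implementation-specific; on x86 the 80-bit extended format typically reports 64 mantissa bits while sizeof(long double) is 12 or 16 because of padding):

#include <iostream>
#include <limits>

int main() {
    std::cout << "sizeof(long double): " << sizeof(long double) << '\n'
              << "mantissa bits:       " << std::numeric_limits<long double>::digits << '\n'
              << "exponent range:      " << std::numeric_limits<long double>::min_exponent
              << " to " << std::numeric_limits<long double>::max_exponent << '\n';
    return 0;
}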

How does bitwise not operation give negative value [duplicate]

This question already has answers here:
Bitwise NOT operator returning unexpected and negative value? [duplicate]
I want to see how bitwise NOT works through a simple example:
#include <iostream>
using namespace std;

int main() {
    int x = 4;
    int y;
    int z;
    y = ~(x << 1);
    z = ~(0x01 << 1);
    cout << "y = " << y << endl;
    cout << "z = " << z << endl;
    return 0;
}
This results in y = -9 and z = -3. I don't see how this happens. Can anyone educate me a bit?
(x<<1) will shift the bits one, so
00000000 00000000 00000000 00000100
will become:
00000000 00000000 00000000 00001000
Which is the representation of 8. Then ~ will invert all the bits such that it becomes:
11111111 11111111 11111111 11110111
Which is the representation of -9.
0x01 is
00000000 00000000 00000000 00000001
in binary, so when shifted once becomes:
00000000 00000000 00000000 00000010
And then when ~ is applied we get:
11111111 11111111 11111111 11111101
Which is -3 in binary
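A minimal sketch that reproduces the bit patterns above (assuming a 32-bit int and two's complement, which is what the printed values rely on):

#include <bitset>
#include <iostream>

int main() {
    int y = ~(4 << 1);      // ~8
    int z = ~(0x01 << 1);   // ~2
    std::cout << std::bitset<32>(y) << " = " << y << '\n';  // ...11110111 = -9
    std::cout << std::bitset<32>(z) << " = " << z << '\n';  // ...11111101 = -3
    return 0;
}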
Well, there is a long story behind this.
To make it easier to understand, let's use binary numbers.
x = 4, or x = 0b 0000 0000 0000 0000 0000 0000 0000 0100, because sizeof(int) = 4 bytes.
After x<<1 the value is 0b 0000 0000 0000 0000 0000 0000 0000 1000, and after
~(x<<1) it is 0b 1111 1111 1111 1111 1111 1111 1111 0111.
And here the complication begins. Since int is a signed type, the most significant bit is the sign bit and the representation is two's complement.
So 0b 1111 1111 1111 1111 1111 1111 1111 0111 is -9, and, for example,
0b 1111 1111 1111 1111 1111 1111 1111 1111 is -1
and 0b 0000 0000 0000 0000 0000 0000 0000 0010 is 2.
Learn more about two's complement.
Whether an integer is positive or negative (the sign of the integer) is stored in a dedicated bit, the sign bit. The bitwise NOT affects this bit, too, so any positive number becomes a negative number and vice versa.
Note that "dedicated bit" is a bit of an oversimplification, as most contemporary computers do not use "sign and magnitude" representation (where the sign bit would just switch the sign), but "two's complement" representation, where the sign bit also affects the magnitude.
For example, the 8-bit signed integer 00000000 would be 0, but 10000000 (sign bit flipped) would be -128.
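A minimal sketch of that 8-bit example (assuming two's complement and the fixed-width types from <cstdint>):

#include <bitset>
#include <cstdint>
#include <iostream>

int main() {
    std::int8_t a = 0;     // bit pattern 00000000
    std::int8_t b = -128;  // bit pattern 10000000 in two's complement
    std::cout << std::bitset<8>(static_cast<std::uint8_t>(a)) << " = " << int(a) << '\n';
    std::cout << std::bitset<8>(static_cast<std::uint8_t>(b)) << " = " << int(b) << '\n';
    return 0;
}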

Extracting middle 16 bits of a 32 bit long

I am reading TCPPPL by Stroustrup. It gives an example of a function that extracts the middle 16 bits of a 32 bit long like this:
unsigned short middle(long a) { return (a >> 8) & 0xffff; }
My question is: isn't it extracting the last 16 bits? Tell me how I am wrong.
It does indeed extract the middle 16 bits:
// a := 0b xxxx xxxx 1111 1111 1111 1111 xxxx xxxx
a>>8; // 0b 0000 0000 xxxx xxxx 1111 1111 1111 1111
&0xffff // 0b 0000 0000 0000 0000 1111 1111 1111 1111
a >> 8 will right-shift the value in a by 8 bits. The low 8 bits are forgotten, and bits previously numbered 31–8 now get moved (renumbered) to 23–0. Finally, masking out the higher 16 bits leaves you with bits 15–0, which were originally (before the shift) at positions 23–8. Voila.
a is going to be right-shifted by 8 bits (a>>8) before the bitwise AND operation.
Have you noticed the >>8 part? It shifts the argument right by eight bits, first.
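A quick way to convince yourself, using a hypothetical test value whose nibbles make the extracted part easy to spot (assuming long is at least 32 bits):

#include <iostream>

unsigned short middle(long a) { return (a >> 8) & 0xffff; }

int main() {
    long a = 0x12345678;                         // bits 31-24: 12, bits 23-8: 3456, bits 7-0: 78
    std::cout << std::hex << middle(a) << '\n';  // prints 3456 - the middle 16 bits
    return 0;
}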

For cycle bugged?

I made a for loop to calculate the population growth of an alien species. This is the loop:
#include <iostream>
using namespace std;

int main() {
    int mind = 96;
    int aliens = 1;
    for (int i = 0; i <= mind; i++)
    {
        aliens = aliens * 2;
    }
    cout << aliens;
    return 0;
}
Oddly, the cout prints 0, which makes no sense; it should print a very large value. Is the loop badly coded?
The issue is simple: you have an int (most likely a 32-bit signed integer). The operation you're doing (doubling each iteration) is equivalent to an arithmetic shift left.
Beware the powers of 2! Doubling past 2^30 overflows a 32-bit signed integer (which is undefined behaviour; in practice it typically wraps to -2147483648 and, after one more doubling, to 0).
Let's see how your loop goes.
0 2
1 4
2 8
3 16
4 32
5 64
6 128
7 256
8 512
9 1024
10 2048
11 4096
12 8192
13 16384
14 32768
15 65536
16 131072
17 262144
18 524288
19 1048576
20 2097152
21 4194304
22 8388608
23 16777216
24 33554432
25 67108864
26 134217728
27 268435456
28 536870912
29 1073741824
30 -2147483648 // A.K.A. overflow
31 0
At this point I don't think I need to tell you 0 x 2 = 0
The point being: use a double, or an integer type wide enough to hold 2^(mind+1).
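A minimal sketch of that fix using double (2^97 is a power of two, so double represents it exactly even with only 53 mantissa bits):

#include <iostream>

int main() {
    int mind = 96;
    double aliens = 1;            // double has enough range for 2^97
    for (int i = 0; i <= mind; i++)
        aliens = aliens * 2;
    std::cout << aliens << '\n';  // about 1.58456e+29, i.e. 2^97
    return 0;
}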

GDB is not displaying hexadecimal values for stack

I'm trying to have GDB display the hexadecimal values for the stack, so I used the command x /48b $esp, a command I saw on the Internet that should show the hexadecimal values of 48 bytes on the stack starting at the location pointed to by the stack pointer. However, when I run this command I get integer values (some negative) instead. An example is shown below:
(gdb) x /48b $esp
0xbffff200: 40 -14 -1 -65 24 -114 4 8
0xbffff208: 123 0 0 0 0 0 0 0
0xbffff210: 16 0 0 0 -3 -112 17 0
0xbffff218: -18 64 27 0 -1 -1 -1 -1
0xbffff220: 88 40 19 0 45 -9 17 0
0xbffff228: 38 38 -64 -14 -1 -65 -64 -14
I've had this command work before (as far as I know it was the exact same command), however all of a sudden it seems not to be working. Any ideas?
You're probably mistyping your command:
Format letters are o(octal), x(hex), d(decimal), u(unsigned decimal),
t(binary), f(float), a(address), i(instruction), c(char) and
s(string).
You should use this command for hex output: x /48x $esp
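If you also want to keep the explicit byte-sized units, the format and size letters can be combined, e.g.:
x /48xb $esp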