Signed type representation in C++

In the book I am reading it says that:
The standard does not define how signed types are represented, but does specify that range should be evenly divided between positive and negative values. Hence, an 8-bit signed char is guaranteed to be able to hold values from -127 through 127; most modern machines use representations that allow values from -128 through 127.
I presume that the [-128;127] range arises from a method called "two's complement", in which a negative number is !A + 1 (e.g. 0111 is 7, and 1001 is then -7). But I cannot wrap my head around why on some older(?) machines the range of values is [-127;127]. Can anyone clarify this?

Both one's complement and signed magnitude are representations that provide the range [-127,127] with an 8-bit number. Both have a different representation for +0 and -0. Both were used by (mostly early) computer systems.
The signed magnitude representation is perhaps the simplest for humans to imagine and was probably used for the same reason as why people first created decimal computers, rather than binary.
I would imagine that the only reason why one's complement was ever used, was because two's complement hadn't yet been considered by the creators of early computers. Then later on, because of backwards compatibility. Although, this is just my conjecture, so take it with a grain of salt.
Further information: https://en.wikipedia.org/wiki/Signed_number_representations
As a slightly related factoid: in the IEEE floating-point representation, the exponent uses excess-K (biased) representation and the significand is represented in signed magnitude.

It's not actually -127 to 127, but -127 to -0 and +0 to 127.
Earlier processors used two methods:
Signed magnitude: a negative number is formed by setting the most significant bit to 1. So 10000000 and 00000000 both represent 0.
One's complement: just apply NOT to the positive number. This causes two representations of zero: 11111111 and 00000000.
Also, two's complement is nearly as old as the other two. https://www.linuxvoice.com/edsac-dennis-wheeler-and-the-cambridge-connection/

Related

Why do signed negative integers start at the lowest value?

I can't really explain this question in words alone (probably why I can't find an answer), so I'll try to give as much detail as I can. This isn't really a practical question, I'm just curious.
So let's say we have a signed 8-bit int.
sign | bits    | sign 0 | sign 1
  ?  | 0000000 | (+)0   | (-)128
  ?  | 1111111 | (+)127 | (-)1
I don't understand why this works this way, can someone explain? In my head, it makes more sense for the value to be the same and for the sign to just put a plus or minus in front, so to me it looks backwards.
There are a couple of systems for signed integers.
One of them, sign-magnitude, is exactly what you expect: a part that says how big the number is, and a bit that either leaves the number positive or negates it. That makes the sign bit really special, substantially different than the other bits. For example:
sign-magnitude representation
0_0000000 = 0
0_0000001 = 1
1_0000001 = -1
1_0000000 = -0
This has some uncomfortable side effects, mainly that it no longer corresponds to unsigned arithmetic in a useful way (if you add two sign-magnitude integers as if they were unsigned, weird things happen, e.g. -0 + 1 = -1). That has far-reaching consequences: addition, subtraction, equality, and multiplication all need special signed versions; multiplication and division by powers of two in no way correspond to bit shifts (except accidentally); and since it has no clear correlation to Z/2^k Z, it's not immediately clear how it behaves algebraically. Also, -0 exists as a separate thing from 0, which is weird and causes different kinds of trouble depending on your semantics for it, but never none.
The most common system by far is two's complement, where the sign bit does not mean "times 1 or times -1" but "add 0 or add -2^k". As with one's complement, the sign bit is largely a completely normal bit (except with respect to division and right shift). For example:
two's complement representation (8bit)
00000000 = 0 (no surprises there)
10000000 = -128
01111111 = 127
11111111 = -1 (= -128 + 127)
etc
Now note that 11111111 + 00000001 = 0 in unsigned 8bit arithmetic anyway, and -1+1=0 is clearly desirable (in fact it is the definition of -1). So what it comes down to, at least for addition/subtraction/multiplication/left shift, is plain old unsigned arithmetic - you just print the numbers differently. Of course some operators still need special signed versions. Since it corresponds to unsigned arithmetic so closely, you can reason about additions and multiplications as if you are in Z/2^k Z with total confidence. It does have a slight oddity comparable with the existence of negative zero, namely the existence of a negative number with no positive absolute value.
The idea to make the value the same and to just put a plus or minus in front is a known idea, called a signed magnitude representation or a similar expression. A discussion here says the two major problems with signed magnitude representation are that there are two zeros (plus and minus), and that integer arithmetic becomes more complicated in the computer algorithm.
A popular alternative for computers is a two's complement representation, which is what you are asking about. This representation makes arithmetic algorithms simpler, but looks weird when you imagine plotting the binary values along a number line, as you are doing. Two's complement also has a single zero, which takes care of the first major problem.
The signed number representations article in Wikipedia has comparison tables illustrating signed magnitude, two's complement, and three other representation systems of values in a decimal number line from -11 to +16 and in a binary value chart from 0000 to 1111.

Is it possible to differentiate between 0 and -0?

I know that the integer values 0 and -0 are essentially the same.
But, I am wondering if it is possible to differentiate between them.
For example, how do I know if a variable was assigned -0?
bool IsNegative(int num)
{
    // How ?
}
int num = -0;
int addition = 5;
num += (IsNegative(num)) ? -addition : addition;
Is the value -0 saved in the memory the exact same way as 0?
It depends on the machine you're targeting.
On a machine that uses a 2's complement representation for integers there's no difference at bit-level between 0 and -0 (they have the same representation)
If your machine used one's complement, you definitely could
0000 0000 -> signed  0 
1111 1111 -> signed −0
Obviously we're talking about using native support, x86 series processors have native support for the two's complement representation of signed numbers. Using other representations is definitely possible but would probably be less efficient and require more instructions.
(As JerryCoffin also noted: even if one's complement has been considered mostly for historical reasons, signed magnitude representations are still fairly common and do have a separate representation for negative and positive zero)
For an int (in the almost-universal "2's complement" representation) the representations of 0 and -0 are the same. (They can be different for other number representations, eg. IEEE 754 floating point.)
Let's begin with representing 0 in 2's complement (of course there exist many other systems and representations, here I'm referring this specific one), assuming 8-bit, zero is:
0000 0000
Now let's flip all the bits and add 1 to get the 2's complement:
1111 1111 (flip)
0000 0001 (add one)
---------
0000 0000
we got 0000 0000, and that's the representation of -0 as well.
But note that in 1's complement, signed 0 is 0000 0000, but -0 is 1111 1111.
I've decided to leave this answer up since C and C++ implementations are usually closely related, but in fact it doesn't defer to the C standard as I thought it did. The point remains that the C++ standard does not specify what happens for cases like these. It's also relevant that non-twos-complement representations are exceedingly rare in the real world, and that even where they do exist they often hide the difference in many cases rather than exposing it as something someone could easily expect to discover.
The behavior of negative zeros in the integer representations in which they exist is not as rigorously defined in the C++ standard as it is in the C standard. It does, however, cite the C standard (ISO/IEC 9899:1999) as a normative reference at the top level [1.2].
In the C standard [6.2.6.2], a negative zero can only be the result of bitwise operations, or operations where a negative zero is already present (for example, multiplying or dividing negative zero by a value, or adding a negative zero to zero) - applying the unary minus operator to a value of a normal zero, as in your example, is therefore guaranteed to result in a normal zero.
Even in the cases that can generate a negative zero, there is no guarantee that they will, even on a system that does support negative zero:
It is unspecified whether these cases actually generate a negative zero or a normal zero, and whether a negative zero becomes a normal zero when stored in an object.
Therefore, we can conclude: no, there is no reliable way to detect this case. Even if not for the fact that non-twos-complement representations are very uncommon in modern computer systems.
The C++ standard, for its part, makes no mention of the term "negative zero", and has very little discussion of the details of signed magnitude and one's complement representations, except to note [3.9.1 para 7] that they are allowed.
If your machine has distinct representations for -0 and +0, then memcmp will be able to distinguish them.
If padding bits are present, there might actually be multiple representations for values other than zero as well.
In the C++ language specification, there is no such int as negative zero.
The only meaning those two words have is the unary operator - applied to 0, just as three plus five is just the binary operator + applied to 3 and 5.
If there were a distinct negative zero, two's complement (the most common representation of integer types) would be an insufficient representation for C++ implementations, as there is no way to represent two forms of zero.
In contrast, floating points (following IEEE) have separate positive and negative zeroes. They can be distinguished, for example, when dividing 1 by them. Positive zero produces positive infinity; negative zero produces negative infinity.
However, if there happen to be different memory representations of the int 0 (or any int, or any other value of any other type), you can use memcmp to discover that:
#include <cstring>
int main() {
    int a = ...
    int b = ...
    if (std::memcmp(&a, &b, sizeof(int))) {
        // a and b have different representations in memory
    }
}
Of course, if this did happen, outside of direct memory operations, the two values would still work in exactly the same way.
To simplify, I found it easier to visualize.
The type int (assuming 32 bits) is stored with 32 bits. 32 bits means 2^32 = 4,294,967,296 unique values. Thus:
unsigned int data range is 0 to 4,294,967,295
For negative values it depends on how they are stored:
Two's complement: -2,147,483,648 to 2,147,483,647
One's complement: -2,147,483,647 to 2,147,483,647
In one's complement, the value -0 also exists.

Where's the 24th fraction bit on a single precision float? IEEE 754

I found myself today doing some bit manipulation and I decided to refresh my floating-point knowledge a little!
Things were going great until I saw this:
... 23 fraction bits of the significand appear in the memory format but the total precision is 24 bits
I read it again and again but I still can't figure out where the 24th bit is, I noticed something about a binary point so I assumed that it's a point in the middle between the mantissa and the exponent.
I'm not really sure, but I believe the author was talking about this bit:
Binary point?
|
s------e-----|-------------m----------
0 - 01111100 - 01000000000000000000000
^ this
The 24th bit is implicit due to normalization.
The significand is shifted left (and one subtracted from the exponent for each bit shift) until the leading bit of the significand is a 1.
Then, since the leading bit is a 1, only the other 23 bits are actually stored.
There is also the possibility of a denormal number. The exponent is stored in a "bias" format, meaning that it's an unsigned number where the middle of the range is defined to mean 0. So, with 8 bits, it's stored as a number from 0..255, but 0 is interpreted to mean -128, 128 is interpreted to mean 0, and 255 is interpreted as 127 (I may have a fencepost error there, but you get the idea).
If, in the process of normalization, this is decremented to 0 (meaning an actual exponent value of -128), then normalization stops, and the significand is stored as-is. In this case, the implicit bit from normalization is taken to be a 0 instead of a 1.
Most floating point hardware is designed to basically assume numbers will be normalized, so they assume that implicit bit is a 1. During the computation, they check for the possibility of a denormal number, and in that case they do roughly the equivalent of throwing an exception, and re-start the calculation with that taken into account. This is why computation with denormals often gets drastically slower than otherwise.
In case you wonder why it uses this strange format: IEEE floating point (like many others) is designed to ensure that if you treat its bit pattern as an integer of the same size, you can compare values as sign-magnitude integers and they'll still sort into the correct order as floating point numbers (for two non-negative floats, a plain integer compare already works). Since the sign of the number is in the most significant bit (where it is for a 2's complement integer) that's treated as the sign bit. The bits of the exponent are stored as the next most significant bits -- but if we used 2's complement for them, an exponent less than 0 would set the second most significant bit of the number, which would result in what looked like a big number as an integer. By using bias format, a smaller exponent leaves that bit clear and a larger exponent sets it, so the order as an integer reflects the order as a floating point number.
Normally (pardon the pun), the leading bit of a floating point number is always 1; thus, it doesn't need to be stored anywhere. The reason is that, if it weren't 1, that would mean you had chosen the wrong exponent to represent it; you could get more precision by shifting the mantissa bits left and using a smaller exponent.
The one exception is denormal/subnormal numbers, which are represented by all zero bits in the exponent field (the lowest possible exponent). In this case, there is no implicit leading 1 in the mantissa, and you have diminishing precision as the value approaches zero.
For normal floating point numbers, the number stored in the floating point variable is (ignoring sign) 1.mantissa * 2^(exponent - offset). The leading 1 is not stored in the variable.

Exponent in IEEE 754

Why is the exponent in a float displaced (biased) by 127?
Well, the real question is : What is the advantage of such notation in comparison to 2's complement notation?
Since the exponent as stored is unsigned, it is possible to use integer instructions to compare floating point values: the entire floating point value can be treated as a sign-magnitude integer value for purposes of comparison (not two's complement).
Just to correct some misinformation: it is 2^n * 1.mantissa; the 1 in front of the fraction is implicit and not stored.
Note that there is a slight difference in the representable range for the exponent, between biased and 2's complement. The IEEE standard supports exponents in the range of (-127 to +128), while if it was 2's complement, it would be (-128 to +127). I don't really know the reason why the standard chooses the bias form, but maybe the committee members thought it would be more useful to allow extremely large numbers, rather than extremely small numbers.
@Stephen Canon, in response to ysap's answer (sorry, this should have been a follow-up comment to my answer, but the original answer was entered as an unregistered user, so I cannot really comment on it yet).
Stephen, obviously you are right, the exponent range I mentioned is incorrect, but the spirit of the answer still applies. Assuming that if it was 2's complement instead of biased value, and assuming that the 0x00 and 0xFF values would still be special values, then the biased exponents allow for (2x) bigger numbers than the 2's complement exponents.
The exponent in a 32-bit float consists of 8 bits, but without a sign bit. So the range is effectively [0;255]. In order to represent numbers < 2^0, that range is shifted by 127, becoming [-127;128].
That way, very small numbers can be represented very precisely. With a [0;255] range, small numbers would have to be represented as 2^0 * 0.mantissa with lots of zeroes in the mantissa. But with a [-127;128] range, small numbers are more precise because they can be represented as 2^-126 * 0.mantissa (with fewer unnecessary zeroes in the mantissa). Hope you get the point.

Two's complement binary form

In a TC++ compiler, the binary representation of 5 is (0000000000000101).
I know that negative numbers are stored as 2's complement, thus -5 in binary is (1111111111111011). The most significant bit (sign bit) is 1, which tells us that it is a negative number.
So how does the compiler know that it is -5? If we interpret the binary value given above (1111111111111011) as an unsigned number, it will turn out completely different?
Also, why is the 1's complement of 5 -6 (1111111111111010)?
The compiler doesn't know. If you cast -5 to unsigned int you'll get 65531 (with a 16-bit int: 2^16 - 5).
The compiler knows because this is the convention the CPU uses natively. Your computer has a CPU that stores negative numbers in two's complement notation, so the compiler follows suit. If your CPU used one's complement notation, the compiler would use that. (IEEE floats, incidentally, use sign magnitude for the significand rather than either complement scheme.)
The Wikipedia article on the topic explains how two's complement notation works.
The processor implements signed and unsigned instructions, which will operate on the binary number representation differently. The compiler knows which of these instructions to emit based on the type of the operands involved (i.e. int vs. unsigned int).
The compiler doesn't need to know if a number is negative or not, it simply emits the correct machine or intermediate language instructions for the types involved. The processor or runtime's implementation of these instructions usually doesn't much care if the number is negative or not either, as the formulation of two's complement arithmetic is such that it is the same for positive or negative numbers (in fact, this is the chief advantage of two's complement arithmetic). What would need to know if a number is negative would be something like printf(), and as Andrew Jaffe pointed out, the MSBit being set is indicative of a negative number in two's complement.
The first bit is set only for negative numbers (it's called the sign bit)
Detailed information is available here
The kewl part of two's complement is that the machine language Add, and Subtract instructions can ignore all that, and just do binary arithmetic and it just works...
i.e., -3 + 4
in Binary 2's complement, is
1111 1111 1111 1101 (-3)
+ 0000 0000 0000 0100 ( 4)
-------------------
0000 0000 0000 0001 ( 1)
let us give an example:
we have two one-byte numbers in binary:
A = 10010111
B = 00100110
(note that the machine does not know the concept of signed or unsigned at this level)
now when you say "add" these two, what does the machine do? it simply adds:
R = 10111101 (no carry out, since 151 + 38 = 189 fits in one byte)
now, we as the compiler need to interpret the operation. we have two options: the numbers can be signed or unsigned.
1 - unsigned case: in C, the numbers are of type "unsigned char", the values are 151 and 38, and the result is 189. this is trivial.
2 - signed case: we, the compiler, interpret the numbers according to their MSB; the first number is -105 and the second is still 38. so -105 + 38 = -67. and -67 is 10111101. but this is what we already have in the result (R)! the result is the same, the only difference is how the compiler interprets it.
The conclusion is that, no matter how we consider the numbers, the machine does the same operation on them. But the compiler then interprets the result in its turn.
Note that it is not the machine that knows the concept of 2's complement. it just adds two numbers without caring about their content. The compiler then looks at the sign bit and decides.
When it comes to subtraction, the operation is again unique: take the 2's complement of the second number and add the two.
If the number is declared as a signed data type (and not cast to an unsigned type), then the compiler will know that, when the sign bit is 1, it's a negative number. As for why 2's complement is used instead of 1's complement: you don't want to be able to have a value of -0, which 1's complement would allow, and 2's complement fixes that.
It's exactly that most significant bit -- if you know a number is signed, then if the MSB is 1 the compiler (and the runtime!) knows to interpret it as negative. This is why C-like languages have both signed integers (positive and negative) and unsigned integers -- in the latter case you interpret them all as positive. Hence a signed byte goes from -128 to 127, but an unsigned byte from 0 to 255.