2's(N) = 1's(N-1) [closed] - bit-manipulation

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 10 years ago.
Improve this question
There is an interesting fact:
The 2's complement of a number N is equivalent to 1's complement of the number N minus 1.
i.e.
2's(N) = 1's(N-1)
The below result is obvious.
2's(N) = 1's(N) + 1
How the first result can be proved with the help of second one?

Both the 2's complement and the 1's complement map one region of negative numbers onto a region of positive numbers in a manner that is easy for a CPU to deal with.
In the case of 8-bit numbers the 1's complement maps -127..-0 to 128..255, on the other hand the 2's complement maps -128..-1 to 128..255.
You can perform the 1's complement and the 2's complement again, so repeated application of 1's complement and 2's complement just gets you back to the same place. (So, we do not need to worry about concerning ourselves with positive numbers.)
If you look at the ranges provided each number from 128 to 255 is replaced by a single number, and this number has a difference of 1 between the 2's complement and the 1's complement.
Stated mathematically (for 8-bit numbers):
2's(N) = 1's(N) + 1
2's(N) = N ^ 0xFF + 1
2's(N) = N ^ 0xFF + 0xFE ^ 0xFF
2's(N) = (N + 0xFE) ^ 0xFF
2's(N) = (N - 1) ^ 0xFF
2's(N) = 1's(N-1)
Justification for each step:
Step 1: Given
Step 2: Definition of 1's complement
Step 3: Identity
Step 4: Distribution
Step 5: Identity (within 1's complement system)
Step 6: Definition of 1's complement

Related

Web compiler outputting weird results [duplicate]

This question already has answers here:
What is “two's complement”?
(24 answers)
Closed 5 years ago.
Take a look at this compiler:
https://ideone.com/Y09Z0N
The code is very simple:
cout << ~5;
And this outputs -6
Now I'm no C++ guru, but somehow I remember that the ~ operator should flip a numbers bits, and since 5 is 101, I would expect to get 010, which is 2, or more precisely 5 is 0000......101 and I should get 1111...010 which should be a really big negative number and not 6 (110). The question is: am I wrong about the operator or am I missing something?
Negative integers are typically represented in two's complement.
This allows basic operations such as addition and subtraction to work with negative numbers in the exactly the same way as positive ones. In that form, negatives are represented as:
00000000 = 0
11111111 = -1
11111110 = -2
11111101 = -3
11111100 = -4
11111011 = -5
11111010 = -6
So indeed -6 = ~5
This is actually a subtle thing about two's complement.
I will write your numbers in base 16, at the size of the variable (I assume 32 bits, but you can alter the answer for other sizes).
5 is 0x00000005. ~5 is its negation, 0xFFFFFFFA. Add 5 to that, and you see ~5 + 5 is 0xFFFFFFFF. Add one more and you get 0x00000000, or 0, due to overflow.
Two's complement representation has made it so that incrementing past 0x7FFFFFFF (or 2147483647) is the actual overflow (and 0x80000000 is -2147483648), and simply the increment from -1 to 0 has been made not to cause an overflow.
You may read more about two's complement representation and also ask for more info from me if you wish.

Write a function to copy 0-15 bits into 16-31 [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 7 years ago.
Improve this question
How to write a function to copy 0-15 bits into 16-31?
unsigned int n = 10; // 1010
copyFromTo(n);
assert(n == 655370);
n = 5;
copyFromTo(n);
assert(n == 327685);
n = 134;
copyFromTo(n);
assert(n == 8781958);
You want to copy the bits in 0-15 to 16-31. You should understand that multiplying by 2 is equivalent to shifting the bits of the number once to the left (moving to higher bits).
If your number is n, n << 16 would be shifting your number 16 bits to the left. This is equivalent to multiplying n with the 16th power of 2, which happens to be 65536.
To copy the bits, and keep the original bits in 0-15, the command n = n + (n << 16); should work. However, the issue with this is (as pointed out in the comments), that the upper 16-31 bits are still set in n + term. We also need to clear these bits. Note that 65535 corresponds to 2^16 - 1, and would have the first 0-15 bits as 1, and others as 0. So the correct command would be n = (n && 65535) + (n << 16);
This will do it:
void copyFromTo(unsigned int& n)
{
n = (n & 0xffff) * 0x00010001;
}
n << 16 to shift bits would do it, to move lower bits to upper? (edited) And after that, copying just the lower 16 bits into it

How numbers are stored? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 9 years ago.
Improve this question
What kind of method does the compiler use to store numbers? One char is 0-255. Two chars side by side are 0-255255. In c++ one short is 2 bytes. The size is 0-65535. Now, how does the compiler convert 255255 to 65535 and what happens with the - in the unsigned numbers?
The maximum value you can store in n bits (when the lowest value is 0 and the values represented are a continuous range), is 2ⁿ − 1. For 8 bits, this gives 255. For 16 bits, this gives 65535.
Your mistake is thinking that you can just concatenate 255 with 255 to get the maximum value in two chars - this is definitely wrong. Instead, to get from the range of 8 bits, which is 256, to the range of 16 bits, you would do 256 × 256 = 65536. Since our first value is 0, the maximum value is 65535, once again.
Note that a char is only guaranteed to have at least 8 bits and a short at least 16 bits (and must be at least as large as a char).
You have got the math totally wrong. Here's how it really is
since each bit can only take on either of two states only(1 and 0) , n bits as a whole can represents 2^n different quantities not numbers. When dealing with integers a standard short integer size of 2 bytes can represent 2^n - 1 (n=16 so 65535)which are mapped to decimal numbers in real life for compuational simplicity.
When dealing with 2 character they are two seperate entities (string is an array of characters). There are many ways to read the two characters on a whole, if you read is at as a string then it would be same as two seperate characters side by side. let me give you an example :
remember i will be using hexadecimal notation for simplicity!
if you have doubt mapping ASCII characters to hex take a look at this ascii to hex
for simplicity let us assume the characters stored in two adjacent positions are both A.
Now hex code for A is 0x41 so the memory would look like
1 byte ....... 2nd byte
01000100 01000001
if you were to read this from the memory as a string and print it out then the output would be
AA
if you were to read the whole 2 bytes as an integer then this would represent
0 * 2^15 + 1 * 2^14 + 0 * 2^13 + 0 * 2^12 + 0 * 2^11 + 1 * 2^10 + 0 * 2^9 + 0 * 2^8 + 0 * 2^7 + 1 * 2^6 + 0 * 2^5 + 0 * 2^4 + 0 * 2^3 + 0 * 2^2 + 0 * 2^1 + 1 * 2^0
= 17537
if unsigned integers were used then the 2 bytes of data would me mapped to integers between
0 and 65535 but if the same represented a signed value then then , though the range remains the same the biggest positive number that can be represented would be 32767. the values would lie between -32768 and 32767 this is because all of the 2 bytes cannot be used and the highest order bit is left to determine the sign. 1 represents negative and 2 represents positive.
You must also note that type conversion (two characters read as single integer) might not always give you the desired results , especially when you narrow the range. (example a doble precision float is converted to an integer.)
For more on that see this answer double to int
hope this helps.
When using decimal system it's true that range of one digit is 0-9 and range of two digits is 0-99. When using hexadecimal system the same thing applies, but you have to do the math in hexadecimal system. Range of one hexadecimal digit is 0-Fh, and range of two hexadecimal digits (one byte) is 0-FFh. Range of two bytes is 0-FFFFh, and this translates to 0-65535 in decimal system.
Decimal is a base-10 number system. This means that each successive digit from right-to-left represents an incremental power of 10. For example, 123 is 3 + (2*10) + (1*100). You may not think of it in these terms in day-to-day life, but that's how it works.
Now you take the same concept from decimal (base-10) to binary (base-2) and now each successive digit is a power of 2, rather than 10. So 1100 is 0 + (0*2) + (1*4) + (1*8).
Now let's take an 8-bit number (char); there are 8 digits in this number so the maximum value is 255 (2**8 - 1), or another way, 11111111 == 1 + (1*2) + (1*4) + (1*8) + (1*16) + (1*32) + (1*64) + (1*128).
When there are another 8 bits available to make a 16-bit value, you just continue counting powers of 2; you don't just "stick" the two 255s together to make 255255. So the maximum value is 65535, or another way, 1111111111111111 == 1 + (1*2) + (1*4) + (1*8) + (1*16) + (1*32) + (1*64) + (1*128) + (1*256) + (1*512) + (1*1024) + (1*2048) + (1*4096) + (1*8192) + (1*16384) + (1*32768).
It depends on the type: integral types must be stored as binary
(or at least, appear so to a C++ program), so you have one bit
per binary digit. With very few exceptions, all of the bits are
significant (although this is not required, and there is at
least one machine where there are extra bits in an int). On
a typical machine, char will be 8 bits, and if it isn't
signed, can store values in the range [0,2^8); in other words,
between 0 and 255 inclusive. unsigned short will be 16 bits
(range [0,2^16)), unsigned int 32 bits (range [0,2^32))
and unsigned long either 32 or 64 bits.
For the signed values, you'll have to use at least one of the
bits for the sign, reducing the maximum positive value. The
exact representation of negative values can vary, but in most
machines, it will be 2's complement, so the ranges will be
signed char:[-2^7,2^7-1)`, etc.
If you're not familiar with base two, I'd suggest you learn it
very quickly; it's fundamental to how all modern machines store
numbers. You should find out about 2's complement as well: the
usual human representation is called sign plus magnitude, and is
very rare in computers.

subtracting two values of unknown bitsize

I'm trying to subtract two values from each other using twos compliment. I have a problem with the overflowing bit. Since my container hold an unlimited bit sized integer, I don't know if the top bit of the result is really from the result or just the overflow. How would I get rid of the overflow without using - (I can't just do 1 << bits - 1 since that would involve using the container, which has no working operator- yet)
0b1111011111 - 0b111010000 -> 0b1111011111 + 0b000110000 -> 1000001111
vs (normally)
0b00000101 - 0b000000001 -> 0b00000101 + 0b11111111 -> 0b100000100 -> 0b00000100
If you calculate a - b you must somehow "arrange" the word - as you have to make for the 2 compliment a negation with the bitwidth of m=max(bitwidth(a), bitwidth(b)).
To get rid of the of overflow you just do mask = negate(1 << m), and apply the mask with bitwise and.
(Or you could just check that bit and treat it accordingly).
Your problem is that you are subtracting the 9-bit 111010000 from the 10-bit 1111011111. The two's complement of 111010000 is ...11111000110000, where the dots are trying to show that you have to pad to the left with as many 1 bits as you need. Here, you need 10 bits, so the two's complement of 111010000 is not 000110000 but 1000110000.
So you want to calculate 1111011111 + 1000110000 = 11000001111, which you just truncate to 10 bits to get the correct answer 1000001111.

Getting "carry" in x + y [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Best way to detect integer overflow in C/C++
If I have an expression x + y (in C or C++) where x and y are both of type uint64_t which causes an integer overflow, how do I detect how much it overflowed by (the carry), place than in another variable, then compute the remainder?
The remainder will already be stored in the sum of x + y, assuming you are using unsigned integers. Unsigned integer overflow causes a wrap around ( signed integer overflow is undefined ). See standards reference from Pascal in the comments.
The overflow can only be 1 bit. If you add 2 64 bit numbers, there cannot be more than 1 carry bit, so you just have to detect the overflow condition.
For how to detect overflow, there was a previous question on that topic: best way to detect integer overflow.
For z = x + y, z stores the remainder. The overflow can only be 1 bit and it's easy to detect. If you were dealing with signed integers then there's an overflow if x and y have the same sign but z has the opposite. You cannot overflow if x and y have different signs. For unsigned integers you just check the most significant bit in the same manner.
The approach in C and C++ can be quite different, because in C++ you can have operator overloading work for you, and wrap the integer you want to protect in some kind of class (for which you would overload the necessary operators. In C, you would have to wrap the integer you want to protect in a structure (to carry the remainder as well as the result) and call some function to do the heavy lifting.
Other than that, the approach in the two languages is the same: depending on the operation you want to perform (adding, in your example) you have to figure out the worst that could happen and handle it.
In the case of adding, it's quite simple: if the sum of the two is going to be greater than some maximum value (which will be the case if the difference of that maximum value M and one of the operands is greater than the other operand) you can calculate the remainder - the part that's too big: if ((M - O1) > O2) R = O2 - (M - O1) (e.g. if M is 100, O1 is 80 and O2 is 30, 30 - (100 - 80) = 10, which is the remainder).
The case of subtraction is equally simple: if your first operand is smaller than the second, the remainder is the second minus the first (if (O1 < O2) { Rem = O2 - O1; Result = 0; } else { Rem = 0; Result = O1 - O2; }).
It's multiplication that's a bit more difficult: your safest bet is to do a binary multiplication of the values and check that your resulting value doesn't exceed the number of bits you have. Binary multiplication is a long multiplication, just like you would do if you were doing a decimal multiplication by hand on paper, so, for example, 12 * 5 is:
0110
0100
====
0110
0
0110
0
++++++
011110 = 40
if you'd have a four-bit integer, you'd have an overflow of one bit here (i.e. bit 4 is 1, bit 5 is 0. so only bit 4 counts as an overflow).
For division you only really need to care about division by 0, most of the time - the rest will be handled be your CPU.
HTH
rlc