C++ Primer Exercise 4.25 converting binary number

I have a question regarding the exercise 4.25 in C++ Primer:
Exercise 4.25: What is the value of ~'q' << 6 on a machine with 32-bit
ints and 8 bit chars, that uses Latin-1 character set in which 'q' has the
bit pattern 01110001?
I have the solution in binary, but I don't understand how this converts to int:
#include <bitset>
#include <iostream>
using namespace std;

int main()
{
    cout << (std::bitset<8 * sizeof(~'q' << 6)>(~'q' << 6)) << endl;
    cout << (~'q' << 6) << endl;
    return 0;
}
After executing, the following 2 lines are printed:
11111111111111111110001110000000
-7296
The first line is what I expected, but I don't understand how it is converted to -7296.
I would expect a lot larger number. Also online converters give a different result from this.
Thanks in advance for the help.

To answer the question, we need to determine the types of the subexpressions and the precedence of the operators involved.
For this we can refer to the documentation on character constants and operator precedence.
In C, 'q' is already an int, as described in the first link:
single-byte integer character constant, e.g. 'a' or '\n' or '\13'.
Such constant has type int ...
In C++, 'q' has type char, but the operand of ~ undergoes integral promotion, so the result is the same: the int value of its Latin-1 code (binary 01110001), widened to a 32-bit integer: 00000000 00000000 00000000 01110001.
The operator ~ has higher precedence than <<, so the bitwise negation is performed first. The result is 11111111 11111111 11111111 10001110.
Then the value is shifted left by 6 bits (dropping the 6 leftmost bits and padding with 0-s on the right): 11111111 11111111 11100011 10000000.
Now, regarding the second half of your question: cout << (~'q' << 6) << endl; interprets this value as a (signed) int. The C++ reference states:
However, all C++ compilers use two's complement representation, and as of C++20, it is the only representation allowed by the standard, with the guaranteed range from −2^(N−1) to +2^(N−1)−1
(e.g. -128 to 127 for a signed 8-bit type).
Read as a 32-bit two's complement number, 11111111 11111111 11100011 10000000 is the decimal value -7296.
The number is not large as you would expect, because when you start from -1 decimal (11111111 11111111 11111111 11111111 binary) and count down, the binary representations all have a lot of leading 1-s. The leftmost bit is 1 for a negative number and 0 for a positive number. When you expand the negative value to more bits (e.g. from 32 to 64), you would add more 1-s to the left until you reach 64 bits. More information can be found here. And here is an online converter.
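The steps above can be checked with a short sketch. It assumes a 32-bit int and an ASCII-compatible execution character set (so 'q' is 113); the helper names as_signed, not_q_shifted_bits and not_q_shifted are made up for illustration. The shift is done on an unsigned value because left-shifting a negative int is undefined before C++20.

```cpp
#include <cstdint>

// Reinterpret a 32-bit pattern as a two's complement signed value without
// relying on an implementation-defined unsigned-to-signed conversion.
constexpr std::int32_t as_signed(std::uint32_t bits) {
    return bits <= 0x7FFFFFFFu
               ? static_cast<std::int32_t>(bits)
               : -static_cast<std::int32_t>(~bits) - 1;
}

// Hypothetical helper: the bit pattern of ~'q' << 6.
// 'q' is promoted to int, inverted, converted to unsigned, then shifted.
constexpr std::uint32_t not_q_shifted_bits() {
    return static_cast<std::uint32_t>(~'q') << 6;
}

// The same pattern read as a signed 32-bit number.
constexpr std::int32_t not_q_shifted() {
    return as_signed(not_q_shifted_bits());
}
```

With these assumptions, not_q_shifted_bits() is 0xFFFFE380 (the pattern printed in the question) and not_q_shifted() is -7296.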

I don't understand how it is converted to -7296.
It (the second value) is the decimal value of the bit pattern read as signed 2's complement.

~'q' << 6
= (~'q') << 6
= (~113) << 6
= (~(0 0111 0001)) << 6
= 1 1000 1110 << 6
= -7296
You may have forgotten to add some 0's in front of 113.

Related

Output of hexadecimal in C++

This is an example in "A Complete Guide to Programming in C++" (Ulla Kirch-Prinz & Peter Prinz)
Example:
cout << dec << -1 << " " << hex << -1;
This statement causes the following output on a 32-bit system:
-1 ffffffff
Could anyone please explain why the second output is ffffffff?
I have trouble with the explanation in the book that says:
When octal or hexadecimal numbers are output, the bits of the number
to be output are always interpreted as unsigned! In other words, the
output shows the bit pattern of a number in octal or hexadecimal
format.

That's because most modern machines use two's complement signed integer representation.
In two's complement, the highest bit is used as a sign bit. If it is set, the number is considered negative, and to get its absolute (positive) value you need to subtract it from 2^N, i.e. take its two's complement.
If you had an 8-bit number, 00000001, its two's complement would be 100000000-00000001 = 11111111 (or 0xFF hex). So -1 is represented as all 1's in binary form.
It's a very convenient system because you can perform arithmetic as if the numbers were unsigned (letting them overflow), then simply interpret the result as signed, and it will be correct.
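The subtraction from 2^N can be written out as a small sketch. The function name negative_pattern is an assumption made for this example, and it only handles widths from 1 to 32 bits.

```cpp
#include <cstdint>

// Hypothetical helper: the `bits`-wide two's complement pattern that
// represents -magnitude, computed as (2^bits - magnitude) mod 2^bits.
// Assumes 1 <= bits <= 32.
constexpr std::uint32_t negative_pattern(std::uint32_t magnitude, unsigned bits) {
    return static_cast<std::uint32_t>(
        ((std::uint64_t{1} << bits) - magnitude) & ((std::uint64_t{1} << bits) - 1));
}
```

For example, negative_pattern(1, 8) is 0xFF and negative_pattern(1, 32) is 0xFFFFFFFF, which is the ffffffff seen in the hexadecimal output.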

Compilers implement negative numbers for signed variables in the following way: if the highest bit is set, the number is negative and its magnitude is (VALUE_RANGE - variable).
Here is an example with 8-bit numbers; I hope you can extend it.
char   0 1 2 ... 10 ... 126 127 -128 -127 -126 ... -10 ...  -2  -1
uchar  0 1 2 ... 10 ... 126 127  128  129  130 ... 246 ... 254 255
hex    0 1 2 ...  A ...  7E  7F   80   81   82 ...  F6 ...  FE  FF

The text you've highlighted is saying that the output is equivalent to
cout << dec << -1 << " " << hex << (unsigned)-1;
In a 2's complement system (which any desktop PC is these days), the bit pattern for -1 has all bits set to 1.
For a 32 bit int therefore, the output will be ffffffff.
Finally, note that if int (and therefore unsigned) are 32 bits, the type of the literal 0xffffffff is unsigned.
References:
http://en.cppreference.com/w/cpp/language/integer_literal
https://en.wikipedia.org/wiki/Two%27s_complement
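A minimal sketch of the behaviour described in this answer, assuming a 32-bit int. The helper name hex_text is hypothetical; it formats into a string stream instead of cout so the result can be inspected.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: format an int the way `cout << hex << value` would.
std::string hex_text(int value) {
    std::ostringstream out;
    out << std::hex << value;
    assert(!out.fail());  // writing an int to a string stream is not expected to fail
    return out.str();
}
```

With a 32-bit int, hex_text(-1) yields "ffffffff", while a positive value such as 255 yields "ff".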

What is the purpose of "int mask = ~0;"?

I saw the following line of code here in C.
int mask = ~0;
I have printed the value of mask in C and C++. It always prints -1.
So I do have some questions:
Why assigning value ~0 to the mask variable?
What is the purpose of ~0?
Can we use -1 instead of ~0?

It's a portable way to set all the binary bits in an integer to 1 bits without having to know how many bits are in the integer on the current architecture.

Before C++20 (and C23), C and C++ allowed 3 different signed integer formats: sign-magnitude, one's complement and two's complement.
~0 produces all-one bits regardless of the sign format the system uses, so it is more portable than -1.
You can add the U suffix (i.e. -1U) to generate an all-one bit pattern portably [1]. However, ~0 states the intention more clearly: invert all the bits in the value 0, whereas -1 suggests that a value of minus one is needed, not its binary representation.
[1] Because unsigned operations are always reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
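As a sketch of how an all-ones value is typically used as a mask, here is an unsigned variant. The names all_ones and low_bits_mask are invented for this example; unsigned is used instead of int so that the shifts are well defined.

```cpp
#include <limits>

// All bits set, whatever the width of unsigned int is on this platform.
constexpr unsigned all_ones() { return ~0u; }

// Hypothetical helper: a mask with only the lowest n bits set.
// Shifting by the full width is undefined, so that case is handled separately.
constexpr unsigned low_bits_mask(unsigned n) {
    return n >= static_cast<unsigned>(std::numeric_limits<unsigned>::digits)
               ? all_ones()
               : ~(all_ones() << n);
}
```

ANDing any value with all_ones() leaves it unchanged, and low_bits_mask(4) is 0xF.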

That on a 2's complement platform (that is assumed) gives you -1, but writing -1 directly is forbidden by the rules (only integers 0..255, unary !, ~ and binary &, ^, |, +, << and >> are allowed).

You are studying a coding challenge with a number of restrictions on operators and language constructions to perform given tasks.
The first problem is return the value -1 without the use of the - operator.
On machines that represent negative numbers with two's complement, the value -1 is represented with all bits set to 1, so ~0 evaluates to -1:
/*
* minusOne - return a value of -1
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 2
* Rating: 1
*/
int minusOne(void) {
// ~0 = 111...111 = -1
return ~0;
}
Other problems in the file are not always implemented correctly. The second problem, returning a boolean value indicating whether an int value would fit in a 16-bit signed short, has a flaw:
/*
* fitsShort - return 1 if x can be represented as a
* 16-bit, two's complement integer.
* Examples: fitsShort(33000) = 0, fitsShort(-32768) = 1
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 8
* Rating: 1
*/
int fitsShort(int x) {
/*
* after left shift 16 and right shift 16, the left 16 of x is 00000..00 or 111...1111
* so after shift, if x remains the same, then it means that x can be represent as 16-bit
*/
return !(((x << 16) >> 16) ^ x);
}
Left shifting a negative value or a number whose shifted value is beyond the range of int has undefined behavior, right shifting a negative value is implementation defined, so the above solution is incorrect (although it is probably the expected solution).
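One way to sidestep the problematic shifts is to do the arithmetic on unsigned values. The following is a sketch, not the challenge's expected answer: it assumes a 32-bit int, and it uses a cast, which the puzzle's rules may not permit. The name fits_short is chosen here for illustration.

```cpp
#include <climits>

static_assert(sizeof(int) * CHAR_BIT == 32, "this sketch assumes a 32-bit int");

// x fits in 16 bits exactly when x + 32768 lies in [0, 65535].
// Doing the addition on unsigned values wraps around instead of overflowing,
// and shifting an unsigned value right is always well defined.
constexpr int fits_short(int x) {
    return !((static_cast<unsigned>(x) + 0x8000u) >> 16);
}
```

This reproduces the examples from the problem statement: fits_short(33000) is 0 and fits_short(-32768) is 1.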

Loooong ago this was how you saved memory on extremely limited equipment such as the 1K ZX 80 or ZX 81 computer. In BASIC, you would
LET X = NOT PI
rather than
LET X = 0
Since numbers were stored as 4 byte floating points, the latter takes 2 bytes more than the first NOT PI alternative, where each of NOT and PI takes up a single byte.

There are multiple ways of encoding numbers across computer architectures. When using 2's complement this will always be true: ~0 == -1. On the other hand, some computers use 1's complement for encoding negative numbers, for which the above is untrue, because ~0 == -0. Yup, 1's complement has a negative zero, and that is why it is not very intuitive.
So to your questions
the ~0 is assigned to mask so all the bits in mask are equal 1 -> making mask & sth == sth
the ~0 is used to make all bits equal to 1 regardless of the platform used
you can use -1 instead of ~0 if you are sure that your computer platform uses 2's complement number encoding
My personal thought: make your code as platform-independent as you can. The cost is relatively small and the code becomes fail-proof.

Why is the binary equivalent calculation getting incorrect?

I wrote the following program to output the binary equivalent of an integer, assuming it is 4 bytes (I checked that int on my system is 4 bytes). But the output is not right. The code is:
#include <iostream>
#include <iomanip>
using namespace std;

void printBinary(int k){
    for(int i = 0; i <= 31; i++){
        if(k & ((1 << 31) >> i))
            cout << "1";
        else
            cout << "0";
    }
}

int main(){
    printBinary(12);
}
Where am I getting it wrong?

The problem is in 1<<31. Because 2^31 cannot be represented with a 32-bit signed integer (range −2^31 to 2^31 − 1), the result is undefined [1].
The fix is easy: 1U<<31.
[1]: The behavior is implementation-defined since C++14.
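Applied to the program from the question, the fix might look like the sketch below. It returns the digits as a string rather than printing them, the name to_binary is made up for this example, and it assumes unsigned int is 32 bits wide.

```cpp
#include <cassert>
#include <string>

// Sketch of a corrected printBinary that returns the digits instead of
// printing them. Assumes unsigned int is 32 bits wide, as in the question.
std::string to_binary(unsigned k) {
    std::string digits;
    for (int i = 0; i < 32; ++i) {
        // 1U << 31 is an unsigned value, so shifting it right fills with zeros.
        digits += (k & ((1U << 31) >> i)) ? '1' : '0';
    }
    assert(digits.size() == 32);
    return digits;
}
```

Writing std::cout << to_binary(12) then prints 28 zeros followed by 1100.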

This expression is incorrect:
if(k & ((1<<31)>>i))
int is a signed type, so when you shift 1 31 times, it becomes the sign bit on your system. After that, shifting the result right i times sign-extends the number, meaning that the top bits remain 1s. You end up with a sequence that looks like this:
80000000 // 10000...00
C0000000 // 11000...00
E0000000 // 11100...00
F0000000 // 11110...00
F8000000
FC000000
...
FFFFFFF8
FFFFFFFC
FFFFFFFE // 11111..10
FFFFFFFF // 11111..11
To fix this, replace the expression with 1 & (k>>(31-i)). This way you would avoid undefined behavior* resulting from shifting 1 to the sign bit position.
* C++14 changed the definition so that shifting 1 31 times to the left in a 32-bit int is no longer undefined (Thanks, Matt McNabb, for pointing this out).

In a typical internal memory representation of a signed integer value, the most significant bit (first from the left) is the sign bit; in signed numbers (like int) it indicates whether the number is negative.
When you shift such a value to the right, sign extension is performed to preserve the number's sign. This is done by adding digits on the most significant side of the number (following a procedure that depends on the particular signed number representation used).
In unsigned numbers the first bit from the left is just the MSB of the represented number, so when you shift no sign extension is performed.
Note: bit numbering starts from 0, so 1 << 31 sets your sign bit, and after that every shift to the right (>>) results in sign extension (as pointed out by @dasblinkenlight).
So, the simple solution to your problem is to make the number unsigned (this is what U does in 1U << 31) before you start the bit manipulation (as pointed out by @Yu Hao).
For further reading see signed number representations and two's complement (as it's the most common).

NOT operation on integer value

I knew that ~ operator does NOT operation. But I could not make out the output of the following program (which is -65536). What exactly is happening?
#include <stdio.h>
int main(void) {
int b = 0xFFFF;
printf("%d",~b);
return 0;
}

Assuming 32-bit integers
int b = 0xFFFF; => b = 0x0000FFFF
~b = 0xFFFF0000
The top bit is now set. Assuming 2s complement, this means we have a negative number. Inverting the bits then adding one gives 0x00010000 or 65536, so the value is -65536.

When you assign the 16-bit value 0xffff to the 32-bit integer b, the variable b actually becomes 0x0000ffff. This means when you do the bitwise complement it becomes 0xffff0000 which is the same as decimal -65536.

The ~ operator in C++ is the bitwise NOT operator, also called the bitwise complement. It flips every bit of your signed integer.
For instance, if you had
int b = 8;
// b in binary  = 00000000 00000000 00000000 00001000
// ~b in binary = 11111111 11111111 11111111 11110111 (-9 for a 32-bit two's complement int)
This flips all the bits that represent the initial integer value provided.

It is doing a bitwise complement, this output may help you understand what is going on better:
std::cout << std::hex << " b: " << std::setfill('0') << std::setw(8) << b
<< " ~b: " << (~b) << " -65536: " << -65536 << std::endl ;
the result that I receive is as follows:
b: 0000ffff ~b: ffff0000 -65536: ffff0000
So we are setting the lower 16 bits to 1 which gives us 0000ffff and then we do a complement which will set the lower 16 bits to 0 and the upper 16 bits to 1 which gives us ffff0000 which is equal to -65536 in decimal.
In this case since we are working with bitwise operations, examining the data in hex gives us some insight into what is going on.

The result depends on how signed integers are represented on your platform. The most common representation is a 32-bit value using "2s complement" arithmetic to represent negative values. That is, a negative value -x is represented by the same bit pattern as the unsigned value 2^32 - x.
In this case, the original bit pattern has the lower 16 bits set:
0x0000ffff
The bitwise negation clears those bits and sets the upper 16 bits:
0xffff0000
Interpreting this as a negative number gives the value -65536.
Usually, you'll want to use unsigned types when you're messing around with bitwise arithmetic, to avoid this kind of confusion.

Your comment:
If it is NOT of 'b' .. then output should be 0 but why -65536
Suggests that you are expecting the result of:
uint32_t x = 0xFFFF;
uint32_t y = ~x;
to be 0.
That would be true for a logical not operation, such as:
uint32_t x = 0xFFFF;
uint32_t y = !x;
...but operator~ is not a logical NOT, but a bitwise not. There is a big difference.
A logical NOT returns 0 for non-0 values (or false for true values), and 1 for 0 values.
But a bitwise NOT inverts each bit in a given value. So the bitwise NOT of the 8-bit value 0x0F:
0x0F: 00001111
~0x0F: 11110000
is not zero, but 0xF0.
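The difference can be seen side by side in a small sketch. The helper names are illustrative only; the fixed-width types come from <cstdint>.

```cpp
#include <cstdint>

// Logical NOT: 1 if the value is zero, 0 otherwise.
constexpr std::uint32_t logical_not(std::uint32_t x) { return !x; }

// Bitwise NOT: every bit of the 32-bit value is inverted.
constexpr std::uint32_t bitwise_not(std::uint32_t x) { return ~x; }

// Bitwise NOT of an 8-bit value. The operand is promoted to int before ~
// is applied, so the result is cast back to keep only the low 8 bits.
constexpr std::uint8_t bitwise_not8(std::uint8_t x) {
    return static_cast<std::uint8_t>(~x);
}
```

Here logical_not(0xFFFF) is 0, bitwise_not(0xFFFF) is 0xFFFF0000, and bitwise_not8(0x0F) is 0xF0.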

For every bit in the integer, a bitwise NOT operation turns all 1s into 0s, and all 0s are turned to 1s.
So hexadecimal 0xFFFF is binary 1111 1111 1111 1111 (Each hexadecimal character is 4 bits, and F, being 15, is full 1s in all four bits)
You set a 32 bit integer to that, which means it's now:
0000 0000 0000 0000 1111 1111 1111 1111
You then NOT it, which means it's:
1111 1111 1111 1111 0000 0000 0000 0000
The topmost bit is the sign bit (whether it's positive or negative), so it gives a negative number.

Right shift with zeros at the beginning

I'm trying to do a kind of right shift that would add zeros at the beginning instead of ones. For example, if I left shift 0xff, I get this:
0xff << 3 = 11111000
However, if I right shift it, I get this:
0xff >> 3 = 11111111
Is there any operation I could use to get the equivalent of a left shift? i.e. I would like to get this:
00011111
Any suggestion?
Edit
To answer the comments, here is the code I'm using:
int number = ~0;
number = number << 4;
std::cout << std::hex << number << std::endl;
number = ~0;
number = number >> 4;
std::cout << std::hex << number << std::endl;
output:
fffffff0
ffffffff
Since it seems that in general it should work, I'm interested as to why this specific code doesn't. Any idea?

This is how C and binary arithmetic both work:
If you left shift 0xff << 3, you get binary: 00000000 11111111 << 3 = 00000111 11111000
If you right shift 0xff >> 3, you get binary: 00000000 11111111 >> 3 = 00000000 00011111
0xff is a (signed) int with the positive value 255. Since it is positive, the outcome of shifting it is well-defined behavior in both C and C++. It will not do any arithmetic shifts nor any kind of poorly-defined behavior.
#include <stdio.h>
int main()
{
printf("%.4X %d\n", 0xff << 3, 0xff << 3);
printf("%.4X %d\n", 0xff >> 3, 0xff >> 3);
}
Output:
07F8 2040
001F 31
So you are doing something strange in your program because it doesn't work as expected. Perhaps you are using char variables or C++ character literals.
Source: ISO 9899:2011 6.5.7.
EDIT after question update
int number = ~0; gives you a negative number equivalent to -1, assuming two's complement.
number = number << 4; invokes undefined behavior (in C, and in C++ before C++20), since you left shift a negative number. The program implements undefined behavior correctly, since it either does something or nothing at all. It may print fffffff0 or it may print a pink elephant, or it may format the hard drive.
number = number >> 4; invokes implementation-defined behavior. In your case, your compiler preserves the sign bit. This is known as arithmetic shift, and arithmetic right shift works in such a way that the MSB is filled with whatever bit value it had before the shift. So if you have a negative number, you will experience that the program is "shifting in ones".
In 99% of all real world cases, it doesn't make sense to use bitwise operators on signed numbers. Therefore, always ensure that you are using unsigned numbers, and that none of the dangerous implicit conversion rules in C/C++ transforms them into signed numbers (for more info about dangerous conversions, see "the integer promotion rules" and "the usual arithmetic conversions", plenty of good info about those on SO).
EDIT 2, some info from the C99 standard's rationale document V5.10:
6.5.7 Bitwise shift operators
The description of shift operators in K&R suggests that shifting by a
long count should force the left operand to be widened to long before
being shifted. A more intuitive practice, endorsed by the C89
Committee, is that the type of the shift count has no bearing on the
type of the result.
QUIET CHANGE IN C89
Shifting by a long count no longer coerces the shifted operand to
long. The C89 Committee affirmed the freedom in implementation granted
by K&R in not requiring the signed right shift operation to sign
extend, since such a requirement might slow down fast code and since
the usefulness of sign extended shifts is marginal. (Shifting a
negative two’s complement integer arithmetically right one place is
not the same as dividing by two!)

If you explicitly shift 0xff it works as you expected
cout << (0xff >> 3) << endl; // 31
The behaviour you describe should be possible only if 0xff is held in a signed type of width 8 (char and signed char on popular platforms).
So, in the common case:
You need to use unsigned ints
(unsigned type)0xff
Right shift works as division by 2 (with rounding down, if I understand correctly).
So when the first bit is 1, you have a negative value, and after the division it is negative again.

The two kinds of right shift you're talking about are called Logical Shift and Arithmetic Shift. C and C++ use a logical shift for unsigned integers, and most compilers use an arithmetic shift for signed integers, but this is not guaranteed by the standard: the value of right shifting a negative signed int is implementation-defined.
Since you want a logical shift you need to switch to using an unsigned integer. You can do this by replacing your constant with 0xffU.
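For a value that is already stored in an int, the same idea can be sketched as a conversion before the shift. The function name logical_shift_right is hypothetical, and the sketch assumes the shift count is smaller than the width of unsigned int.

```cpp
// Logical right shift of an int's bit pattern: convert to unsigned first so
// that zeros, not copies of the sign bit, are shifted in.
// Assumes count is smaller than the width of unsigned int.
constexpr unsigned logical_shift_right(int value, unsigned count) {
    return static_cast<unsigned>(value) >> count;
}
```

With a 32-bit int, logical_shift_right(~0, 4) is 0x0FFFFFFF, and logical_shift_right(0xff, 3) is 0x1F (binary 00011111).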

To explain your real code you just need the C++ versions of the quotes from the C standard that Lundin gave in comments:
int number = ~0;
number = number << 4;
Undefined behavior. [expr.shift] says
The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated
bits are zero-filled. If E1 has an unsigned type, the value of the
result is E1 × 2^E2, reduced modulo one more than the maximum value
representable in the result type. Otherwise, if E1 has a signed type
and non-negative value, and E1 × 2^E2 is representable in the result
type, then that is the resulting value; otherwise, the behavior is
undefined.
number = ~0;
number = number >> 4;
Implementation-defined result, in this case your implementation gave you an arithmetic shift:
The value of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has
an unsigned type or if E1 has a signed type and a non-negative value,
the value of the result is the integral part of the quotient of
E1/2^E2. If E1 has a signed type and a negative value, the resulting
value is implementation-defined
You should use an unsigned type:
unsigned int number = -1;
number = number >> 4;
std::cout << std::hex << number << std::endl;
Output:
fffffff

To add my 5 cents worth here...
I'm facing exactly the same problem as this.lau! I've done some perfunctory research on this and these are my results:
typedef unsigned int Uint;
#define U31 0x7FFFFFFF
#define U32 0xFFFFFFFF
printf ("U31 right shifted: 0x%08x\n", (U31 >> 30));
printf ("U32 right shifted: 0x%08x\n", (U32 >> 30));
Output:
U31 right shifted: 0x00000001 (expected)
U32 right shifted: 0xffffffff (not expected)
It would appear (in the absence of anyone with detailed knowledge) that the C compiler in XCode for Mac OS X v5.0.1 reserves the MSB as a carry bit that gets pulled along with each shift.
Somewhat annoyingly, the converse is NOT true:-
#define ST00 0x00000001
#define ST01 0x00000002
printf ("ST00 left shifted: 0x%08x\n", (ST00 << 30));
printf ("ST01 left shifted: 0x%08x\n", (ST01 << 30));
Output:
ST00 left shifted: 0x40000000
ST01 left shifted: 0x80000000
I concur completely with the people above that assert that the sign of the operand has no bearing on the behaviour of the shift operator.
Can anyone shed any light on the specification for the Posix4 implementation of C? I feel a definitive answer may rest there.
In the meantime, it appears that the only workaround is a construct along the following lines;-
#define CARD2UNIVERSE(c) (((c) == 32) ? 0xFFFFFFFF : (U31 >> (31 - (c))))
This works - exasperating but necessary.

Just in case you want the first bit of a negative number to be 0 after a right shift: XOR the negative number with INT_MIN first, which clears its MSB. I understand that this is not a proper arithmetic shift, but it will get the work done.
