#include <iostream>
using namespace std;

int main() {
    char c = 0371;
    cout << hex << (int) c;
    return 0;
}
I converted c to binary (011 111 001) and then to hexadecimal (f9). Why does it give the result fffffff9 and not f9?
If the char type on your system is signed, the value 0xf9 is a negative number (specifically, it’s -7). As a result, when you convert it to an integer, it gives the integer the numeric value -7, which has hexadecimal representation 0xFFFFFFF9 (if you’re using signed 32-bit integer representations).
If you explicitly convert your character to unsigned char first, it has the positive value 249, which converts to an int with the hexadecimal representation 0x000000F9.
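As a rough sketch of both behaviours (assuming a signed 8-bit char and a 32-bit int, as on most mainstream platforms):

#include <iostream>
using namespace std;

int main() {
    char c = 0371;                          // octal 371 = 0xF9; as a signed char this is -7
    cout << hex;
    cout << (int)c << "\n";                 // fffffff9: -7 sign-extended into a 32-bit int
    cout << (int)(unsigned char)c << "\n";  // f9: 249 converted to int, no sign extension
    return 0;
}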
So, I was trying some stuff and I noticed that if you read one address that contains 11111111 as an unsigned int, it shows that its value is 4294967295 and not 255. Why is that?
Example of the code:
unsigned int a = 255;
unsigned int* p = &a;
char *p0 = (char*)p;
cout << (void*)p0 << " " << (unsigned int)*p0 << endl;
char *p0 = (char*)p; Here p0 points at the lowest-addressed byte of a. What value happens to be stored there depends on endianness: it could be 0x00 (Big Endian) or 0xFF (Little Endian).
What is CPU endianness?
With *p0 you access the contents of that byte as a char. Since char has implementation-defined signedness, it can hold either -128 to 127 (assuming 2's complement) or 0 to 255. Is char signed or unsigned by default?
Your particular system seems to use signed char, 2's complement, Little Endian. Meaning that the raw binary value 0xFF gets interpreted as -1 decimal.
(unsigned int)*p0 forces a conversion from -1 to the unsigned int type. This is done following this well-defined rule (C 6.3.1.4, the same rule applies in C++):
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type.
-1 + UINT_MAX + one more =
-1 + UINT_MAX + 1 =
UINT_MAX =
4294967295 on your 32 bit system.
Lesson learnt: never use char for anything but text, never use it to store values. If you need an 8 bit integer type or something to represent raw binary, use unsigned char or uint8_t.
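A minimal sketch of that advice, assuming a little-endian system with a 32-bit unsigned int (the names are illustrative only):

#include <cstddef>
#include <cstdio>

int main() {
    unsigned int a = 255;
    // Read the object representation through unsigned char: every byte is a
    // value in the range 0..255, so no sign extension can ever happen.
    const unsigned char *bytes = (const unsigned char *)&a;
    for (std::size_t i = 0; i < sizeof a; i++)
        std::printf("%u ", (unsigned)bytes[i]);  // little endian: 255 0 0 0
    std::printf("\n");
    return 0;
}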
The last line (the cout statement) is reading the representation of the object a as char, which is just about the only time it's legal (not undefined) to access an object through an lvalue not matching the actual type of the object. C allows plain char to be a signed or unsigned type (implementation-defined). On the implementation you're using, char is a signed type, and unsigned int is represented in little endian (low-order byte first), so you read the bit pattern with 8 one bits, and this represents the char value -1.
Subsequently converting (via cast or implicitly) the value -1 to unsigned int reduces it modulo UINT_MAX+1, producing the value UINT_MAX.
Given the following simple C++ code:
#include <stdio.h>

int main() {
    char c1 = 130;
    unsigned char c2 = 130;
    printf("1: %+u\n", c1);
    printf("2: %+u\n", c2);
    printf("3: %+d\n", c1);
    printf("4: %+d\n", c2);
    ...
    return 0;
}
the output is as follows:
1: 4294967170
2: 130
3: -126
4: +130
Can someone please explain the results on lines 1 and 3?
I'm using the Linux gcc compiler with all default settings.
(This answer assumes that, on your machine, char ranges from -128 to 127, that unsigned char ranges from 0 to 255, and that unsigned int ranges from 0 to 4294967295, which happens to be the case.)
char c1 = 130;
Here, 130 is outside the range of numbers representable by char. The value of c1 is implementation-defined. In your case, the number happens to "wrap around," initializing c1 to static_cast<char>(-126).
In
printf("1: %+u\n", c1);
c1 is promoted to int, resulting in -126. Then, it is interpreted by the %u specifier as unsigned int. This is undefined behavior. This time the resulting number happens to be the unique number representable by unsigned int that is congruent to -126 modulo 4294967296, which is 4294967170.
In
printf("3: %+d\n", c1);
the int value -126 is interpreted by the %d specifier as an int directly, and outputs -126 as expected (?).
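A hedged way to avoid the mismatch is to make each argument's promoted type agree with its conversion specifier, for example (a sketch, assuming the same platform as in the question):

#include <cstdio>

int main() {
    char c1 = 130;            // implementation-defined; typically wraps to -126
    unsigned char c2 = 130;

    // Cast so the argument actually passed matches the specifier exactly.
    std::printf("signed view:   %d\n", (int)c1);                      // -126
    std::printf("unsigned view: %u\n", (unsigned)(unsigned char)c1);  // 130
    std::printf("c2:            %u\n", (unsigned)c2);                 // 130
    return 0;
}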
In cases 1 and 2 the format specifier doesn't match the type of the argument, so the behaviour of the program is undefined (on most systems). On most systems char and unsigned char are smaller than int, so they promote to int when passed as variadic arguments; int doesn't match the format specifier %u, which requires unsigned int.
On exotic systems (which your target is not) where unsigned char is as large as int, it will be promoted to unsigned int instead, in which case 4 would have UB since it requires an int.
The explanation for 3 depends a lot on implementation-specific details. The result depends on whether char is signed or not, and it depends on the representable range.
If 130 was a representable value of char, such as when it is an unsigned type, then 130 would be the correct output. That appears to not be the case, so we can assume that char is a signed type on the target system.
Initialising a signed integer with an unrepresentable value (such as char with 130 in this case) results in an implementation defined value.
On systems with 2's complement representation for signed numbers - which is the ubiquitous representation these days - the implementation-defined value is typically the representable value that is congruent with the unrepresentable value modulo the number of representable values. -126 is congruent with 130 modulo 256 and is a representable value of char.
A char is 8 bits. This means it can represent 2^8 = 256 unique values. An unsigned char represents 0 to 255, and a signed char represents -128 to 127 (it could represent absolutely anything, but this is the typical platform implementation). Thus, assigning 130 to a char is out of range by 2, and the value wraps to -126 when it is interpreted as a signed char.
The compiler sees 130 as an integer and makes an implicit conversion from int to char. On most platforms an int is 32 bits and the sign bit is the MSB. The value 130 easily fits into the first 8 bits, but then the compiler wants to chop off 24 bits to squeeze it into a char. When this happens, and you've told the compiler you want a signed char, the MSB of the remaining 8 bits actually represents -128. Uh oh! You now have 1000 0010 in memory, which when interpreted as a signed char is -128 + 2 = -126. My linter on my platform screams about this.
I make that important point about interpretation because in memory, both values are identical. You can confirm this by casting the value in the printf statements, i.e., printf("3: %+d\n", (unsigned char)c1);, and you'll see 130 again.
The reason you see the large value in your first printf statement is that a signed char, which has already wrapped to -126, ends up being read as an unsigned int. The machine interprets the char as -126 first, and unsigned int cannot represent that negative value, so the value wraps around modulo 2^32:
2^32 - 126 = 4294967170 ... bingo.
In printf statement 2, all the machine has to do is add 24 zero bits to reach 32 bits, and then interpret the value as an int. In statement 1, you've told it that you have a signed value, so it first turns that into a 32-bit -126 value, and then interprets that negative integer as an unsigned integer. Again, it flips how it interprets the most significant bit. There are 2 steps (sketched in code below):
The signed char is promoted to signed int, because you want to work with ints. The char (is probably copied and) has 24 bits added. Because we're looking at a negative signed value, sign extension takes place and the added bits are filled with copies of the sign bit, so the memory here looks quite different.
The new signed int memory is interpreted as unsigned, so the machine looks at the MSB and weights it as +2^31 instead of -2^31 as happened in the promotion.
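Both steps can be made explicit in code (a sketch, assuming the same signed-char, 32-bit-int platform):

#include <cstdio>

int main() {
    char c1 = 130;                                       // likely -126 where char is signed
    int promoted = c1;                                   // step 1: sign extension, still -126
    unsigned int reinterpreted = (unsigned int)promoted; // step 2: wraps modulo 2^32
    std::printf("%d %u\n", promoted, reinterpreted);     // -126 4294967170
    return 0;
}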
An interesting bit of trivia is that you can suppress the clang-tidy linter warning if you do char c1 = 130u;, but you still get the same garbage based on the above logic (i.e. the implicit conversion throws away the top 24 bits, and the sign bit was zero anyhow). I have submitted an LLVM clang-tidy missing-functionality report based on exploring this question (issue 42137 if you really want to follow it) 😉.
Why are two chars, one a signed char and one an unsigned char, holding the same value not equal?
char a = 0xfb;
unsigned char b = 0xfb;
bool f;
f = (a == b);
cout << f;
In the above code, the value of f is 0.
Why is that, when both a and b have the same value?
There are no arithmetic operators that accept integers smaller than int. Hence, both char values get promoted to int first; see integral promotion for full details.
char is signed on your platform, so 0xfb gets promoted to int(-5), whereas unsigned char gets promoted to int(0x000000fb). These two integers do not compare equal.
On the other hand, the standard in [basic.fundamental] requires that all char types occupy the same amount of storage and have the same alignment requirements; that is, they have the same object representation and all bits of the object representation participate in the value representation. Hence, memcmp(&a, &b, 1) == 0 is true.
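A brief sketch contrasting the value comparison with the representation comparison (assuming char is signed on the platform in question):

#include <cstring>
#include <iostream>
using namespace std;

int main() {
    char a = 0xfb;            // likely -5 where char is signed
    unsigned char b = 0xfb;   // 251

    cout << (a == b) << "\n";                  // 0: compares -5 with 251 after promotion to int
    cout << (memcmp(&a, &b, 1) == 0) << "\n";  // 1: the single-byte object representations match
    return 0;
}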
The value of f and, in fact, the behaviour of the program, is implementation-defined.
In C++14 onwards¹, for a signed char, and assuming that CHAR_MAX is 127, a will probably be -5. Formally speaking, if char is signed and the number does not fit into a char, then the conversion is implementation-defined or an implementation-defined signal is raised.
b is 251.
For the comparison a == b (and retaining the assumption that char is a narrower type than an int) both arguments are converted to int, with -5 and 251 therefore retained.
And that's false as the numbers are not equal.
Finally, note that on a platform where char, short, and int are all the same size, the result of your code would be true (and the == would be in unsigned types)! The moral of the story: don't mix your types.
¹ C++14 dropped 1's complement and signed magnitude signed char.
Value range for (signed) char is [-128, 127]. (C++14 drops -127 as the lower range).
Value range for unsigned char is [0, 255]
What you're trying to assign to both of the variables is 251 in decimal. Since char cannot hold that value you will suffer a value overflow, as the following warning tells you.
warning: overflow in conversion from 'int' to 'char' changes value from '251' to ''\37777777773'' [-Woverflow]
As a result a will probably hold value -5 while b will be 251 and they are indeed not equal.
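If the goal is to compare the raw byte values, one hedged fix is to compare through a single unsigned type, for example (a sketch):

#include <iostream>
using namespace std;

int main() {
    char a = 0xfb;
    unsigned char b = 0xfb;

    // Convert a to unsigned char first, so both operands promote to int 251.
    bool f = (static_cast<unsigned char>(a) == b);
    cout << f << "\n";    // 1
    return 0;
}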
In a C++ program I have a char buf[256]. The problem is here:
if (buf[pbyte] >= 0xFF)
    buf[++pbyte] = 0x00;
This always evaluates to false, even when buf[pbyte] is equal to 255 (0xFF) as seen in the immediate window and the watch window, so the assignment never executes. However, when I change it to the following:
if (buf[pbyte] >= char(0xFF))
    buf[++pbyte] = 0x00;
The program works; how come?
The literal 0xFF is treated as an int with the value 255.
When you compare a char to an int, the char is promoted to an int before the comparison.
On some platforms char is a signed value with a range like -128 to +127. On other platforms char is an unsigned value with a range like 0 to 255.
If your platform's char is signed, and its bit pattern is 0xFF, then it's probably -1. Since -1 is a valid int, the promotion stops there.
You end up comparing -1 to 255.
The solution is to eliminate the implicit conversions. You can write the comparison as:
if (buf[pbyte] == '\xFF') ...
Now both sides are chars, so they'll be promoted in the same manner and are directly comparable.
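As a runnable sketch of that suggestion (the buffer contents and index are made up for illustration):

#include <cstdio>

int main() {
    char buf[256] = {};
    int pbyte = 0;
    buf[pbyte] = '\xFF';

    // Both operands are char, so both promote to int the same way
    // (-1 == -1 on a platform where char is signed).
    if (buf[pbyte] == '\xFF')
        buf[++pbyte] = 0x00;

    std::printf("pbyte is now %d\n", pbyte);   // 1
    return 0;
}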
The problem is that char is signed on your system.
In the common 2s complement representation, a signed char with the "byte-value" 0xFF represents the integer -1, while 0xFF is an int with value 255. Thus, you are effectively comparing int(-1) >= int(255), which yields false. Keep in mind that they are compared as int because of arithmetic conversion rules, that is both operands are promoted ("cast implicitly") to int before comparing.
If you write char(0xFF) however, you do end up with the comparison -1 >= -1, which yields true as expected.
If you want to store numbers in the range [0,255], you should use unsigned char or std::uint8_t instead of char.
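And a sketch of that second suggestion, storing the bytes in an unsigned type so that 0xFF compares as 255:

#include <cstdint>
#include <cstdio>

int main() {
    std::uint8_t buf[256] = {};
    int pbyte = 0;
    buf[pbyte] = 0xFF;

    // buf[pbyte] promotes to int 255, so comparing against the int literal 0xFF works as intended.
    if (buf[pbyte] >= 0xFF)
        buf[++pbyte] = 0x00;

    std::printf("pbyte is now %d\n", pbyte);   // 1
    return 0;
}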
Due to the integer promotions in this condition
if (buf[pbyte] >= 0xFF)
the two operands are converted to the type int (more precisely, only the left operand is converted to an object of the type int, because the right operand already has the type int). Since on your system the type char behaves like signed char, the value '\xFF' is a negative value equal to -1. When this value is converted to an object of the type int, you get 0xFFFFFFFF (assuming that the type int occupies 4 bytes).
On the other hand, the integer constant 0xFF is a positive value that has the internal representation 0x000000FF.
Thus the condition in the if statement
if ( 0xFFFFFFFF >= 0x000000FF )
yields false.
When you use the cast (char)0xFF, both operands have the same type and the same value.
Integer literals by default have the type int, assuming the value fits into the int type; otherwise they get a larger type such as long.
So in your code, the literal 0xFF has type int, i.e. 0x000000FF.
#include <stdio.h>

int main() {
    int i, n;
    int a = 123456789;
    void *v = &a;
    unsigned char *c = (unsigned char*)v;
    for (i = 0; i < sizeof a; i++) {
        printf("%u ", *(c + i));
    }
    char *cc = (char*)v;
    printf("\n %d", *(cc + 1));
    char *ccc = (char*)v;
    printf("\n %u \n", *(ccc + 1));
}
This program generates the following output on my 32 bit Ubuntu machine.
21 205 91 7
-51
4294967245
The first two lines of output I can understand:
1st line: the sequence in which the bytes are stored in memory.
2nd line: the signed value of the second byte (2's complement).
3rd line: why such a large value?
Please explain the last line of output: why are three bytes of 1's added, given that (11111111 11111111 11111111 11001101) = 4294967245?
Apparently your compiler uses signed characters and it is a little endian, two's complement system.
123456789d = 075BCD15h
Little endian: 15 CD 5B 07
Thus the byte at v+1 has the value 0xCD. When this is read as a signed char, you get -51 in signed decimal.
When passed to printf, the character *(ccc+1) containing the value -51 first gets implicitly promoted to int, because variadic functions like printf have a rule stating that all small integer arguments get promoted to int (the default argument promotions). During this promotion, the sign is preserved. You still have the value -51, but for a 32-bit signed integer, this gives the representation 0xFFFFFFCD.
And finally the %u specifier tells printf to treat this as an unsigned integer, so you end up with that 4.29-billion-something value.
The important part to understand here is that %u has nothing to do with the actual type promotion, it just tells printf how to interpret the data after the promotion.
-51 stored as an 8-bit value is 0xCD in hex (assuming a 2's complement binary system).
When you pass it to a variadic function like printf, default argument promotion takes place and the char is promoted to int with the representation 0xFFFFFFCD (for a 4-byte int).
0xFFFFFFCD interpreted as int is -51 and interpreted as unsigned int is 4294967245.
Further reading: Default argument promotions in C function calls
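One way to see both interpretations of that byte side by side (a sketch, assuming the same little-endian, signed-char platform described above):

#include <cstdio>

int main() {
    int a = 123456789;                 // 0x075BCD15
    char *cc = (char *)&a;
    char byte = *(cc + 1);             // 0xCD, i.e. -51 where char is signed

    std::printf("%d\n", (int)byte);                      // -51: sign-extended during promotion
    std::printf("%u\n", (unsigned)(unsigned char)byte);  // 205: read as an unsigned byte first
    return 0;
}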
please explain the last line of output. WHY three bytes of 1's are added
This is called sign extension. When a smaller signed number is assigned (converted) to a larger type, its sign bit gets replicated to ensure it represents the same number (for example, in 1's and 2's complement).
Bad printf format specifier
You are attempting to print a char with the specifier "%u", which expects an unsigned int. Arguments that do not match the conversion specifier in printf cause undefined behavior, per 7.19.6.1 paragraph 9:
If a conversion specification is invalid, the behavior is undefined. If
any argument is not the correct type for the corresponding conversion
specification, the behavior is undefined.
Use of char to store signed value
Also, to ensure the char contains a signed value, explicitly use signed char, as plain char may behave as either signed char or unsigned char. (In the latter case, the output of your snippet may be 205 205.) In gcc you can force char to behave as unsigned char with the -funsigned-char option.
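A short sketch of that last point, spelling out the signedness instead of relying on plain char (same example value as above):

#include <cstdio>

int main() {
    int a = 123456789;
    signed char   *sc = (signed char *)&a;    // explicitly signed, regardless of how plain char behaves
    unsigned char *uc = (unsigned char *)&a;  // explicitly unsigned

    std::printf("%d %u\n", (int)sc[1], (unsigned)uc[1]);  // -51 205 on a little-endian machine
    return 0;
}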