Why does C++ print an unsigned char value as negative?

I'm trying to understand the implicit conversion rules in C++. As I understand it, when an operation involves two fundamental types, the "lower" type is converted to the "higher" type, so for example:
int a = 5;
float b = 0.5;
std::cout << a + b << "\n";
should print 5.5 because 'a' gets converted to float. I also understood that unsigned types are "higher" than their signed counterparts, so:
int c = 5;
unsigned int d = 10;
std::cout << c - d << "\n";
prints 4294967291 because 'c' gets converted to an unsigned int, and since unsigned types wrap around when a result falls below zero, we get that big number.
However for the following case I don't understand why I am getting -105 instead of a positive number.
#include <iostream>
int main(void) {
unsigned char a = 150;
std::cout << static_cast<int>(a - static_cast<unsigned char>(255)) << "\n";
return 0;
}
I guess that this code:
a - static_cast<unsigned char>(255)
should result in a positive number, so the final cast (to int) shouldn't affect the final result, right?

You're missing the (implicit) conversion from unsigned char to int that happens in order to perform the - (subtraction) operation. This integral promotion happens any time you apply an arithmetic operator to a value of an integral type smaller than int.

Quoting from C++14, chapter § 5.7
The additive operators + and - group left-to-right. The usual arithmetic conversions are performed for
operands of arithmetic or enumeration type.
and for usual arithmetic conversions, (specific for this case)
....
Otherwise, the integral promotions (4.5) shall be performed on both operands
and, finally, for integral promotions, chapter § 4.5
A prvalue of an integer type other than bool, char16_t, char32_t, or wchar_t whose integer conversion
rank (4.13) is less than the rank of int can be converted to a prvalue of type int if int can represent all
the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned
int.
Hence, the unsigned char operands are promoted to int, and only then is the result calculated: 150 - 255 evaluated in int is -105.
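A minimal sketch of that promotion, assuming 8-bit chars and an int that can represent every unsigned char value (true on mainstream platforms). The helper names promoted_difference and wrapped_difference are hypothetical, introduced only for illustration:

```cpp
#include <type_traits>

// Both unsigned char operands are promoted to int before the subtraction
// (assuming int can represent every unsigned char value), so the
// expression itself has type int.
static_assert(
    std::is_same<decltype(static_cast<unsigned char>(150) -
                          static_cast<unsigned char>(255)),
                 int>::value,
    "unsigned char - unsigned char has type int");

// Hypothetical helper: the difference exactly as the expression computes it.
constexpr int promoted_difference(unsigned char lhs, unsigned char rhs) {
    return lhs - rhs;  // evaluated in int, so it can be negative
}

// Hypothetical helper: converts the int result back to unsigned char, which
// reduces it modulo 2^CHAR_BIT (256 for 8-bit chars).
constexpr unsigned char wrapped_difference(unsigned char lhs,
                                           unsigned char rhs) {
    return static_cast<unsigned char>(lhs - rhs);
}
```

With the values from the question, promoted_difference(150, 255) is -105, while wrapped_difference(150, 255) is 151, the "positive number" the question expected. The cast back to unsigned char has to wrap the whole subtraction, not just one operand.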

Other answers here already show what is happening, so I won't repeat them. Instead, I am going to give you a simple tool to help you.
Here is a trick you can do to quickly find the type of an expression:
template <class> struct Name; // purposely no definition given
Name<decltype(your_expression)> n;
This will generate a compiler error for undefined template 'Name', but what we are really interested in is the type of the template argument which will appear in the error message.
E.g. if you want to see what type you get when you do arithmetic between two unsigned char:
#include <utility>
template <class> struct Name;
auto test()
{
Name<decltype(std::declval<unsigned char>() - std::declval<unsigned char>())> n;
// or
unsigned char a{};
Name<decltype(a - a)> n2;
}
will get you
error: implicit instantiation of undefined template 'Name<int>'
which shows you that the type of the expression is int.
Of course this won't tell you the rules involved, but it is a quick starting point to see the type of the expression or to verify your assumption of the type of the expression.
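Once you have a guess at the type, a complementary approach is to pin it down with static_assert and std::is_same, which compiles silently when the guess is right and fails with your message when it is wrong. This is only a sketch; the alias difference_t is a hypothetical name, not something from the standard library:

```cpp
#include <type_traits>
#include <utility>

// Hypothetical alias: the type of (L - R) after the usual arithmetic
// conversions have been applied to both operands.
template <class L, class R>
using difference_t = decltype(std::declval<L>() - std::declval<R>());

// Each line documents an assumption and is verified by the compiler.
static_assert(
    std::is_same<difference_t<unsigned char, unsigned char>, int>::value,
    "unsigned char - unsigned char is int");
static_assert(
    std::is_same<difference_t<int, unsigned int>, unsigned int>::value,
    "int - unsigned int is unsigned int");
```

Unlike the undefined-template trick, this does not tell you an unknown type, but it keeps a verified assumption in the code base without breaking the build.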

Related

Type of char multiplied by another char

What is the type of the result of a multiplication of two chars in C/C++?
unsigned char a = 70;
unsigned char b = 58;
cout << a*b << endl; // prints 4060, means no overflow
cout << (unsigned int)(unsigned char)(a*b) << endl; // prints 220, means overflow
I expected the result of multiplying two numbers of type T (e.g., char, short, int) to be of type T. It seems to be int for char, because sizeof(a*b) is 4.
I wrote a simple function to check the size of the result of the multiplication:
template<class T>
void print_sizeof_mult(){
T a;
T b;
cout << sizeof(a*b) << endl;
}
print_sizeof_mult<char>(), print_sizeof_mult<short>(), and print_sizeof_mult<int>() print 4, and print_sizeof_mult<long>() prints 8.
Are these results specific to my compiler and machine architecture? Or is it documented somewhere what the result type of basic operations is in C/C++?

According to the C++ Standard (4.5 Integral promotions)
1 A prvalue of an integer type other than bool, char16_t, char32_t, or
wchar_t whose integer conversion rank (4.13) is less than the rank of
int can be converted to a prvalue of type int if int can represent all
the values of the source type; otherwise, the source prvalue can be
converted to a prvalue of type unsigned int.
and (5 Expressions)
10 Many binary operators that expect operands of arithmetic or
enumeration type cause conversions and yield result types in a similar
way. The purpose is to yield a common type, which is also the type of
the result. This pattern is called the usual arithmetic conversions,
which are defined as follows:
....
Otherwise, the integral promotions (4.5) shall be performed on both
operands. Then the following rules shall be applied to the promoted
operands:
and at last (5.6 Multiplicative operators)
2 The operands of * and / shall have arithmetic or unscoped
enumeration type; the operands of % shall have integral or unscoped
enumeration type. The usual arithmetic conversions are performed on
the operands and determine the type of the result.
Types char and short have conversion ranks that are less than the rank of the type int.
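As a rough illustration of those rules applied to multiplication, here is a sketch that assumes 8-bit chars and an int wider than short (typical, but not guaranteed by the standard). The helpers product and truncated_product are hypothetical names:

```cpp
#include <type_traits>

// char and short rank below int, so both operands are promoted and the
// product has type int; long already ranks above int and stays long.
static_assert(std::is_same<decltype(char{} * char{}), int>::value,
              "char * char is int");
static_assert(std::is_same<decltype(short{} * short{}), int>::value,
              "short * short is int");
static_assert(std::is_same<decltype(long{} * long{}), long>::value,
              "long * long is long");

// Hypothetical helper: the product as the expression computes it, in int.
constexpr int product(unsigned char a, unsigned char b) {
    return a * b;
}

// Hypothetical helper: the product after an explicit cast back to
// unsigned char, which keeps only the low CHAR_BIT bits.
constexpr unsigned int truncated_product(unsigned char a, unsigned char b) {
    return static_cast<unsigned char>(a * b);
}
```

For the values in the question, product(70, 58) is 4060 and truncated_product(70, 58) is 220, matching the two lines of output shown there.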

Internal temporary variable in arithmetic calculation in C++

The following example is used to illustrate my question:
#include <iostream>
#include <string>
int main()
{
signed char p;
signed char temp=100;
signed char t=4;
p = (temp+temp+temp+temp)/t;
std::cout << "Hello, " << int(p)<< "!\n";
}
In the above code, variable p is defined as the average of four signed char variables. However, the sum of the signed char variables (temp+temp+temp+temp) will be larger than the range of signed char. So my question is how C++ handles this situation.

However, the sum of the signed char variable (temp+temp+temp+temp) will be larger than the range of signed char.
That does not matter, because each signed char operand is promoted to int by integral promotion. The operations are therefore performed in int, and you get the expected result.

Nothing happens, because of integral promotion
Prvalues of small integral types (such as char) may be converted to
prvalues of larger integral types (such as int). In particular,
arithmetic operators do not accept types smaller than int as
arguments, and integral promotions are automatically applied after
lvalue-to-rvalue conversion, if applicable. This conversion always
preserves the value.
(temp+temp+temp+temp) has type int and the value 400.
(temp+temp+temp+temp)/t is 100, which is back inside the signed char range,
so p == temp.
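The two steps can be sketched as follows, assuming an 8-bit signed char (range -128 to 127). The function names sum_of_four and average_of_four are hypothetical and only mirror the expression from the question:

```cpp
// Hypothetical helper: the intermediate sum. Each operand is promoted to
// int, so the sum is computed in int and does not overflow signed char.
constexpr int sum_of_four(signed char value) {
    return value + value + value + value;
}

// Hypothetical helper: the whole expression from the question. The division
// is also done in int; only the final result is narrowed to signed char.
constexpr signed char average_of_four(signed char value, signed char count) {
    return static_cast<signed char>(
        (value + value + value + value) / count);
}
```

With value = 100 and count = 4, the intermediate sum is 400 and the narrowed result is 100. The narrowing is only safe here because the quotient fits in signed char; a quotient outside that range would not be preserved.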

Why does adding a constant to an int8_t promote it to a larger type?

In gcc, adding or subtracting a constant to an integral type smaller than int results in an int.
#include <cstdint>
#include <cstdio>
int main()
{
int8_t wat = 5;
printf("%zu\n", sizeof(wat + 1)); // sizeof yields size_t, so use %zu
return 0;
}
gives 4. I noticed this when using a simple += statement with -Wconversion. With that warning flag set,
#include <cstdint>
int main()
{
int8_t wat = 5;
wat += 5;
return 0;
}
gives
wat.cpp:7:6: warning: conversion to ‘int8_t {aka signed char}’ from ‘int’ may alter its value [-Wconversion]
Is there any way to suppress this warning? Why is this occurring? Casting doesn't seem to do the trick.

According to the C++ Standard
10 Many binary operators that expect operands of arithmetic or
enumeration type cause conversions and yield result types in a similar
way. The purpose is to yield a common type, which is also the type of
the result. This pattern is called the usual arithmetic conversions,
The usual arithmetic conversions include the integral promotions:
1 A prvalue of an integer type other than bool, char16_t, char32_t, or
wchar_t whose integer conversion rank (4.13) is less than the rank of
int can be converted to a prvalue of type int if int can represent all
the values of the source type; otherwise, the source prvalue can be
converted to a prvalue of type unsigned int.
So in this expression
wat += 5;
that is equivalent to
wat = wat + 5;
wat on the right-hand side of the assignment is converted to type int, and the type of the expression wat + 5 is int. Since the range of values of type int is greater than that of type int8_t, the compiler issues the warning when the result is converted back to int8_t.
The warning message also names the flag that controls it, [-Wconversion]: dropping that flag, or passing -Wno-conversion, suppresses the warning.
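If you want to keep -Wconversion enabled, one option is to make the narrowing explicit by casting the whole int result rather than one operand. To my knowledge GCC does not emit -Wconversion for an explicit cast, but treat that as an assumption to check against your compiler version. The helper add_to_int8 is a hypothetical name:

```cpp
#include <cstdint>

// Hypothetical helper: value + delta is computed in int (value is promoted),
// and the explicit static_cast states that narrowing back to int8_t is
// intentional. The caller must ensure the sum fits in int8_t; before C++20
// an out-of-range result is implementation-defined.
constexpr std::int8_t add_to_int8(std::int8_t value, int delta) {
    return static_cast<std::int8_t>(value + delta);
}
```

Written inline, the same idea is wat = static_cast<int8_t>(wat + 5);. Casting only the constant, as in wat += static_cast<int8_t>(5);, does not help, because both operands are promoted to int again before the addition.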

unsigned to signed conversion

Consider the following:
#include <iostream>
int main() {
unsigned int x = 3;
unsigned int y = 5;
std::cout << "a: " << x - y << std::endl;
std::cout << "b: " << ((int)x) - y << std::endl;
std::cout << "c: " << x - ((int)y) << std::endl;
std::cout << "d: " << ((int)x) - ((int)y) << std::endl;
}
$ g++ -Wconversion -Wall uint_stackoverflow.cc -o uint_stackoverflow && ./uint_stackoverflow
a: 4294967294
b: 4294967294
c: 4294967294
d: -2
I understand why "a" doesn't give the expected result. But why "b" and "c" fail puzzles me. For "b" I thought after casting "x" to "int" the result will be "int" again.
Could you please enlighten me?
edit: Shouldn't the compiler warn? g++ (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5
Thanks,
Somebody

In arithmetic operations, if one operand is unsigned int and the other is int, the int operand is converted to unsigned int, and the result of the operation is unsigned int as well.
Also, casting unsigned to signed before the operation doesn't change the bit representation of the operand. On a two's complement architecture (i.e. almost every modern architecture), (int)x has the same bit representation as x; only the interpretation of those bits as a value changes. The important point is that the int operand is converted straight back to unsigned for the subtraction, and since neither conversion changes the bit representation, the bit representation of the result does NOT change either.
C++03 Standard says in §5/9:
Many binary operators that expect
operands of arithmetic or enumeration
type cause conversions and yield
result types in a similar way. The
purpose is to yield a common type,
which is also the type of the result.
This pattern is called the usual
arithmetic conversions, which are
defined as follows:
[...]
Otherwise, if either operand is
unsigned, the other shall be converted
to unsigned.

Quoting the standard as usual....
For C++98, §[expr]/9:
Many binary operators that expect
operands of arithmetic or enumeration
type cause conversions and yield
result types in a similar way. The
purpose is to yield a common type,
which is also the type of the result.
This pattern is called the usual
arithmetic conversions, which are
defined as follows:
If either operand is of type long double, the other shall be converted
to long double.
Otherwise, if either operand is double, the other shall be converted
to double.
Otherwise, if either operand is float, the other shall be converted
to float.
Otherwise, the integral promotions (4.5) shall be performed on both
operands.
Then, if either operand is unsigned long the other shall be converted to
unsigned long.
Otherwise, if one operand is a long int and the other unsigned int,
then if a long int can represent all
the values of an unsigned int, the
unsigned int shall be converted to a
long int; otherwise both operands
shall be converted to unsigned long
int.
Otherwise, if either operand is long, the other shall be converted
to long.
Otherwise, if either operand is unsigned, the other shall be
converted to unsigned.
[Note: otherwise, the only remaining
case is that both operands are int ]
Basically, it can be summarized as
long double > double > float > unsigned long > long > unsigned > int
(Types smaller than int will be converted to int)
The text is changed for C++0x (§[expr]/10) after the 5th item, but the effect on OP's code is the same: the int will be converted to an unsigned.
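Applied to the question's cases, the summary above can be checked with a short sketch. The helpers mixed_difference and signed_difference are hypothetical names; the concrete value 4294967294 from the question additionally assumes a 32-bit unsigned int:

```cpp
#include <type_traits>

// Mixing int and unsigned int converts the int operand to unsigned int.
static_assert(std::is_same<decltype(3 - 5u), unsigned int>::value,
              "int - unsigned int is unsigned int");
// Only when both operands are int does the result stay int.
static_assert(std::is_same<decltype(3 - 5), int>::value,
              "int - int is int");

// Hypothetical helper mirroring case "b": x is converted to unsigned int
// before the subtraction, so the result wraps around instead of being -2.
constexpr unsigned int mixed_difference(int x, unsigned int y) {
    return x - y;
}

// Hypothetical helper mirroring case "d": both operands are cast, so the
// subtraction is done in int. Assumes both values fit in int.
constexpr int signed_difference(unsigned int x, unsigned int y) {
    return static_cast<int>(x) - static_cast<int>(y);
}
```

Here mixed_difference(3, 5u) equals 0u - 2u, the largest unsigned int value minus one, while signed_difference(3u, 5u) is -2.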

It's because there is a hierarchy of data types when performing implicit conversions: unsigned int ranks above int, so in b and c the int operand is converted back to unsigned int, which is why you're seeing the results you are.
If you are unsure of the types but know the type of the result you want then you should cast both x and y as you did in d.
This has a really good explanation of type conversion:
http://www.learncpp.com/cpp-tutorial/44-type-conversion-and-casting/

Addition of two chars produces int

I've made a simple program and compiled it with GCC 4.4/4.5 as follows:
int main ()
{
char u = 10;
char x = 'x';
char i = u + x;
return 0;
}
g++ -c -Wconversion a.cpp
And I've got the following:
a.cpp: In function ‘int main()’:
a.cpp:5:16: warning: conversion to ‘char’ from ‘int’ may alter its value
The same warning I've got for the following code:
unsigned short u = 10;
unsigned short x = 0;
unsigned short i = u + x;
a.cpp: In function ‘int main()’:
a.cpp:5:16: warning: conversion to ‘short unsigned int’ from ‘int’ may alter its value
Could anyone please explain to me why the addition of two chars (or two unsigned shorts) produces int?
Is it a compiler bug or is it standard compliant?
Thanks.

What you're seeing is the result of the so-called "usual arithmetic conversions" that occur during arithmetic expressions, particularly those that are binary in nature (take two arguments).
This is described in §5/9:
Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result. This pattern is called the usual arithmetic conversions, which are defined as follows:
- If either operand is of type long double, the other shall be converted to long double.
- Otherwise, if either operand is double, the other shall be converted to double.
- Otherwise, if either operand is float, the other shall be converted to float.
- Otherwise, the integral promotions (4.5) shall be performed on both operands.
- Then, if either operand is unsigned long the other shall be converted to unsigned long.
- Otherwise, if one operand is a long int and the other unsigned int, then if a long int can represent all the values of an unsigned int, the unsigned int shall be converted to a long int; otherwise both operands shall be converted to unsigned long int.
- Otherwise, if either operand is long, the other shall be converted to long.
- Otherwise, if either operand is unsigned, the other shall be converted to unsigned.
[Note: otherwise, the only remaining case is that both operands are int]
The promotions alluded to in §4.5 are:
1 An rvalue of type char, signed char, unsigned char, short int, or unsigned short int can be converted to an rvalue of type int if int can represent all the values of the source type; otherwise, the source rvalue can be converted to an rvalue of type unsigned int.
2 An rvalue of type wchar_t (3.9.1) or an enumeration type (7.2) can be converted to an rvalue of the first of the following types that can represent all the values of its underlying type: int, unsigned int, long, or unsigned long.
3 An rvalue for an integral bit-field (9.6) can be converted to an rvalue of type int if int can represent all the values of the bit-field; otherwise, it can be converted to unsigned int if unsigned int can represent all the values of the bit-field. If the bit-field is larger yet, no integral promotion applies to it. If the bit-field has an enumerated type, it is treated as any other value of that type for promotion purposes.
4 An rvalue of type bool can be converted to an rvalue of type int, with false becoming zero and true becoming one.
5 These conversions are called integral promotions.
From here, sections such as "Multiplicative operators" or "Additive operators" all have the phrase: "The usual arithmetic conversions are performed..." to specify the type of the expression.
In other words, when you do integral arithmetic, the result type is determined by the rules above. In your case, the promotion is covered by §4.5/1 and the type of each expression is int.
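The "int if int can represent all the values, otherwise unsigned int" rule of §4.5/1 can be observed directly, since unary plus performs integral promotion on its operand. The following is only a sketch; promoted_t and int_holds_unsigned_short are hypothetical names introduced for illustration:

```cpp
#include <limits>
#include <type_traits>
#include <utility>

// Hypothetical alias: the type of T after integral promotion, obtained
// through unary plus.
template <class T>
using promoted_t = decltype(+std::declval<T>());

// True when int can represent every unsigned short value (the usual case
// with 16-bit short and 32-bit int).
constexpr bool int_holds_unsigned_short =
    static_cast<long long>(std::numeric_limits<unsigned short>::max()) <=
    static_cast<long long>(std::numeric_limits<int>::max());

// signed char always fits in int, so it always promotes to int.
static_assert(std::is_same<promoted_t<signed char>, int>::value,
              "signed char promotes to int");

// unsigned short promotes to int when int can hold all of its values,
// and to unsigned int otherwise.
static_assert(
    std::is_same<promoted_t<unsigned short>,
                 std::conditional<int_holds_unsigned_short, int,
                                  unsigned int>::type>::value,
    "unsigned short follows the 4.5/1 rule");

// The sum of two chars has exactly the promoted type of char.
static_assert(
    std::is_same<decltype(std::declval<char>() + std::declval<char>()),
                 promoted_t<char>>::value,
    "char + char has the promoted type of char");
```

On a platform where short and int have the same width, the second assertion would select unsigned int, which is why the warning text in the question is specific to platforms where int is the wider type.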

When you perform any arithmetic operation on a char, the result has type int.
See this:
char c = 'A';
cout << sizeof(c) << endl;
cout << sizeof(+c) << endl;
cout << sizeof(-c) << endl;
cout << sizeof(c-c) << endl;
cout << sizeof(c+c) << endl;
Output:
1
4
4
4
4
Demonstration at ideone : http://www.ideone.com/jNTMm

When you add these two characters to each other, they are first promoted to int.
The result of an addition is an rvalue which is implicitly promoted to
type int if necessary, and if an int can contain the resulting value.
This is true on any platform where sizeof(int) > sizeof(char).
But beware of the fact that char might be treated as signed char by
your compiler.
These links can be of further help - wiki and securecoding
