Strange type deduction - c++

Today I saw a really strange type deduction. Here is the code:
unsigned int y = 15;
int k = 5;
auto t = k - y / 2;
Since k is int, I assumed that the type of t would be int too. To my surprise, its type is unsigned int. I cannot find out why the type is deduced as unsigned int. Any idea why?

This is due to the usual arithmetic conversions: if two operands have the same conversion rank and one of them has an unsigned integer type, then the expression has that unsigned integer type.
From the C++ 14 Standard (5 Expressions, p.#10); the C++ 17 wording is the same
— Otherwise, if the operand that has unsigned integer type has rank
greater than or equal to the rank of the type of the other operand,
the operand with signed integer type shall be converted to the type of
the operand with unsigned integer type.
Note that the conversion rank of the type unsigned int is equal to the rank of the type int (signed int). From the C++ 14 Standard (4.13 Integer conversion rank, p.#1)
— The rank of any unsigned integer type shall equal the rank of the
corresponding signed integer type
A more interesting example is the following. Let's assume that there are two declarations
unsigned int x = 0;
long y = 0;
and that both types have the same width, for example 4 bytes. As is known, the rank of the type long is greater than the rank of the type unsigned int. The question arises: what is the type of the expression
x + y
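The type of the expression is unsigned long. :) Because long cannot represent all values of unsigned int here, both operands are converted to the unsigned type that corresponds to long.
Here is a demonstration program that uses the types long long and unsigned long instead of long and unsigned int.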
#include <iostream>
#include <iomanip>
#include <type_traits>
int main()
{
unsigned long int x = 0;
long long int y = 0;
std::cout << "sizeof( unsigned long ) = "
<< sizeof( unsigned long )
<< '\n';
std::cout << "sizeof( long long ) = "
<< sizeof( long long )
<< '\n';
std::cout << std::boolalpha
<< std::is_same<unsigned long long, decltype( x + y )>::value
<< '\n';
return 0;
}
The program output is
sizeof( unsigned long ) = 8
sizeof( long long ) = 8
true
That is, the type of the expression x + y is unsigned long long, though neither operand of the expression has this type.

Related

Comparing unsigned integer with negative literals

I have this simple C program.
#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>
bool foo (unsigned int a) {
return (a > -2L);
}
bool bar (unsigned long a) {
return (a > -2L);
}
int main() {
printf("foo returned = %d\n", foo(99));
printf("bar returned = %d\n", bar(99));
return 0;
}
Output when I run this -
foo returned = 1
bar returned = 0
Recreated in godbolt here
My question is why does foo(99) return true but bar(99) return false.
To me it makes sense that bar would return false. For simplicity let's say longs are 8 bits, then (using two's complement for signed values):
99 == 0110 0011
-2 == unsigned 254 == 1111 1110
So clearly the CMP instruction will see that 1111 1110 is bigger and return false.
But I don't understand what is going on behind the scenes in the foo function. The assembly for foo seems to be hardcoded to always return 1 (mov eax,0x1). I would have expected foo to do something similar to bar. What is going on here?
This is covered in C classes and is specified in the documentation. Here is how you use documents to figure this out.
In the 2018 C standard, you can look up > or “relational expressions” in the index to see they are discussed on pages 68-69. On page 68, you will find clause 6.5.8, which covers relational operators, including >. Reading it, paragraph 3 says:
If both of the operands have arithmetic type, the usual arithmetic conversions are performed.
“Usual arithmetic conversions” is listed in the index as defined on page 39. Page 39 has clause 6.3.1.8, “Usual arithmetic conversions.” This clause explains that operands of arithmetic types are converted to a common type, and it gives rules determining the common type. For two integer types of different signedness, such as the unsigned long and the long int in bar (a and -2L), it says that, if the unsigned type has rank greater than or equal to the rank of the other type, the signed type is converted to the unsigned type.
“Rank” is not in the index, but you can search the document to find it is discussed in clause 6.3.1.1, where it tells you the rank of long int is greater than the rank of int, and that any unsigned type has the same rank as the corresponding signed type.
Now you can consider a > -2L in bar, where a is unsigned long. Here we have an unsigned long compared with a long. They have the same rank, so -2L is converted to unsigned long. Conversion of a signed integer to unsigned is discussed in clause 6.3.1.3. It says the value is converted by wrapping it modulo ULONG_MAX+1, so converting the signed long −2 produces ULONG_MAX+1−2 = ULONG_MAX−1, which is a large integer. Comparing a, which has the value 99, to that large integer with > yields false, so zero is returned.
For foo, we continue with the rules for the usual arithmetic conversions. When the unsigned type does not have rank greater than or equal to the rank of the signed type, but the signed type can represent all the values of the type of the operand with unsigned type, the operand with the unsigned type is converted to the type of the operand with the signed type. In foo, a is unsigned int and -2L is long int. Presumably in your C implementation, long int is 64 bits, so it can represent all the values of a 32-bit unsigned int. So this rule applies, and a is converted to long int. This does not change the value. So the original value of a, 99, is compared to −2 with >, and this yields true, so one is returned.
In the first function
bool foo (unsigned int a) {
return (a > -2L);
}
both operands of the expression a > -2L have the type long (the first operand is converted to long by the usual arithmetic conversions, because the rank of long is greater than the rank of unsigned int and, on the system used, long can represent all values of unsigned int). And it is evident that the positive value 99L is greater than the negative value -2L.
The first function could produce the result 0 if sizeof( long ) were equal to sizeof( unsigned int ). In that case the type long is unable to represent all values of the type unsigned int, so the usual arithmetic conversions convert both operands to the type unsigned long.
For example, running the function foo using MS VS 2019, where sizeof( long ) is equal to 4, the same as sizeof( unsigned int ), you will get the result 0.
Here is a demonstration program written in C++ that shows why a call of the function foo using MS VS 2019 can return 0.
#include <iostream>
#include <iomanip>
#include <type_traits>
int main()
{
unsigned int x = 0;
long y = 0;
std::cout << "sizeof( unsigned int ) = " << sizeof( unsigned int ) << '\n';
std::cout << "sizeof( long ) = " << sizeof(long) << '\n';
std::cout << "std::is_same_v<decltype( x + y ), unsigned long> is "
<< std::boolalpha
<< std::is_same_v<decltype( x + y ), unsigned long>
<< '\n';
}
The program output is
sizeof( unsigned int ) = 4
sizeof( long ) = 4
std::is_same_v<decltype( x + y ), unsigned long> is true
That is, in general the result of the first function is implementation-defined.
In the second function
bool bar (unsigned long a) {
return (a > -2L);
}
both operands have the type unsigned long (again due to the usual arithmetic conversions: the ranks of unsigned long and signed long are equal, so the operand of type signed long is converted to unsigned long), and -2L interpreted as unsigned long is greater than 99.
The reason for this has to do with the rules of integer conversions.
In the first case, you compare an unsigned int with a long using the > operator, and in the second case you compare an unsigned long with a long.
These operands must first be converted to a common type using the usual arithmetic conversions. These are spelled out in section 6.3.1.8p1 of the C standard, with the following excerpt focusing on integer conversions:
If both operands have the same type, then no further conversion is
needed.
Otherwise, if both operands have signed integer types or both have
unsigned integer types, the operand with the type of lesser integer
conversion rank is converted to the type of the operand with greater
rank.
Otherwise, if the operand that has unsigned integer type has rank
greater or equal to the rank of the type of the other operand, then
the operand with signed integer type is converted to the type of the
operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can
represent all of the values of the type of the operand with unsigned
integer type, then the operand with unsigned integer type is converted
to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type
corresponding to the type of the operand with signed integer type.
In the case of comparing an unsigned int with a long, the fourth rule quoted above applies (the signed type can represent all values of the unsigned type). long has higher rank and (assuming long is 64 bit and int is 32 bit) can hold all values that an unsigned int can, so the unsigned int operand a is converted to a long. Since the value in question is in the range of long, section 6.3.1.3p1 dictates how the conversion happens:
When a value with integer type is converted to another integer type
other than _Bool, if the value can be represented by the new type, it
is unchanged
So the value is preserved and we're left with 99 > -2 which is true.
In the case of comparing an unsigned long with a long, the third rule quoted above applies (the unsigned type has rank greater than or equal to the rank of the other type). Both types are of the same rank with different signs, so the long constant -2L is converted to unsigned long. -2 is outside the range of an unsigned long so a value conversion must happen. This conversion is specified in section 6.3.1.3p2:
Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
So the long value -2 will be converted to the unsigned long value 2^64-2, assuming unsigned long is 64 bit. So we're left with 99 > 2^64-2, which is false.
I think what is happening here is an implicit conversion by the compiler. When you compare two different arithmetic types, the compiler first converts both operands to a common type. As far as I understand, the choice depends on the conversion rank and signedness of the two types rather than simply on which type has the larger maximum value.
So in foo() your argument is implicitly converted to signed long (assuming long is wider than unsigned int) and the comparison works as expected.
In bar() your argument is an unsigned long, which has the same rank as signed long. Here the compiler converts -2L to unsigned long, which turns it into a very large number.

Why is the output of fixed width unsigned integer negative while unsigned integer output wraps around as expected?

#include <cstdint>
#include <iostream>
#define TRY_INT
void testRun()
{
#ifdef TRY_INT //test with unsigned
unsigned int value1{1}; //define some unsigned variables
unsigned int value2{1};
unsigned int value3{2};
#else //test with fixed width
uint16_t value1{1}; //define fixed width unsigned variables
uint16_t value2{1};
uint16_t value3{2};
#endif
if ( value1 > value2 - value3 )
{
std::cout << value1 << " is bigger than: " << value2 - value3 << "\n";
}
else
{
std::cout << value1 << " is smaller than: " << value2 - value3 << "\n";
}
}
int main()
{
testRun();
return 0;
}
with unsigned integers I get:
1 is smaller than: 4294967295
with fixed width unsigned int, output is:
1 is smaller than: -1
My expectation was that it would wrap around as well. Does this have something to do with std::cout?
I guess it is caused by integral promotion. Citing from cppreference:
...arithmetic operators do not accept types smaller than int as arguments, and integral promotions are automatically applied after lvalue-to-rvalue conversion, if applicable.
unsigned char, char8_t (since C++20) or unsigned short can be converted to int if it can hold its entire value range...
Consequently, if uint16_t is just an alias for unsigned short on your implementation, value2 - value3 is calculated with int type and the result is also int, that's why -1 is shown.
With unsigned int, no promotion is applied and the whole calculation is performed in this type.
In the latest online C++ Draft, see [conv.prom/1]:
A prvalue of an integer type other than bool, char16_t, char32_t, or wchar_t whose integer conversion rank is less than the rank of int can be converted to a prvalue of type int if int can represent all the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int.
On typical implementations, unsigned int has the same width as uint32_t, and uint16_t is an alias for unsigned short int.
Therefore, if you use unsigned short int instead of unsigned int, you will get the same behavior as for uint16_t.
Why do you get -1?
Integral promotion will try to convert unsigned short int to int if int can hold all possible values of unsigned short int. On the other hand, if that is not the case, integral promotion to unsigned int will be performed.
Therefore the subtraction is most likely done in the type int, not uint16_t.

C++ Standard: Strange signed/unsigned arithmetic division behavior for 32 and 64 bits

I encountered wrong behavior in my code. Investigating it led me to a short example that shows the problem:
//g++ 5.4.0
#include <iostream>
#include <vector>
int main()
{
std::vector<short> v(20);
auto D = &v[5] - &v[10];
auto C = D / sizeof(short);
std::cout << "C = " << C;
}
The example is quite common. What will it print?
C = 9223372036854775805
Tested here: https://rextester.com/l/cpp_online_compiler_gcc
Tested also for Clang C++, VS C++ and C. Result the same.
Discussing with colleagues I was pointed to the document https://en.cppreference.com/w/cpp/language/operator_arithmetic#Conversions .
It tells:
If both operands are signed or both are unsigned, the operand with lesser conversion rank is converted to the operand with the greater integer conversion rank
Otherwise, if the unsigned operand's conversion rank is greater or equal to the conversion rank of the signed operand, the signed operand is converted to the unsigned operand's type.
Otherwise, if the signed operand's type can represent all values of the unsigned operand, the unsigned operand is converted to the signed operand's type
It seems the second rule is working here. But it is not true.
To confirm the second rule, I have tested such example:
//g++ 5.4.0
#include <cstdint>
#include <iostream>
int main()
{
typedef uint32_t u_t; // uint64_t, uint32_t, uint16_t uint8_t
typedef int32_t i_t; // int64_t, int32_t, int16_t int8_t
const u_t B = 2;
const i_t X = -1;
const i_t A1 = X * B;
std::cout << "A1 = X * B = " << A1 << "\n";
const i_t C = A1 / B; // signed / unsigned division
std::cout << "A1 / B = " << C << "\n";
}
with different rank combinations of u_t and i_t and found that it works correctly for any combination, EXCEPT for 32 and 64 bits (int64_t/uint64_t and int32_t/uint32_t). So the second rule DOES NOT work for 16 and 8 bits.
Note: the multiplication operation is working correct for all cases. So it is only division problem.
Also the SECOND rule sounds like it is wrong:
the signed operand is converted to the unsigned operand's type
The signed cannot be converted to unsigned - it is an !! error !! for NEGATIVE values!!
But opposite conversion is correct - the unsigned operand is converted to the signed operand's type
Looking at this I can note that here is a possible mistake in the C++ Standard Arithmetic operations.
Instead of:
Otherwise, if the unsigned operand's conversion rank is greater or equal to the conversion rank of the signed operand, the signed operand is converted to the unsigned operand's type.
it SHALL be:
Otherwise, if the signed operand's conversion rank is greater or equal to the conversion rank of the unsigned operand, the unsigned operand is converted to the signed operand's type.
In my opinion, when a signed and an unsigned operand meet in a multiplication or division, the unsigned operand is converted to signed and after that it is cast to the correct rank. At least x86 assembler follows this.
Please explain where the error is. I want the first test in this post to work correctly for any type used in place of auto, but now that is not possible, and the C++ Standard says that this is correct behavior.
Sorry for the strange question, but I am stuck on this problem. I have been coding in C/C++ for 30 years, but this is the first problem I cannot explain clearly: is it a bug or expected behavior?
There's a lot to chew on here... I'll address only one point as you forgot to actually ask a question.
In your second code snippet:
const u_t B = 2;
const i_t X = -1;
const i_t A1 = X * B;
you see that A1 is -2 and conclude that in the expression X * B both operands are converted to signed integers. This is not true.
In X * B (with the 32-bit types shown), both operands are converted to unsigned integers, as per the Standard, but the result is then converted back to a signed integer by the initialization const i_t A1 = ....
You can easily check that:
const u_t B = 2;
const i_t X = -1;
const auto A1 = X * B; // unsigned
You can also play with decltype(expression) and std::is_signed:
#include <iostream>
#include <iomanip>
#include <type_traits>
int main()
{
signed s = 1;
unsigned u = 1;
std::cout << std::boolalpha
<< " signed * signed is signed? " << std::is_signed_v<decltype(s * s)> << "\n"
<< " signed * unsigned is signed? " << std::is_signed_v<decltype(s * u)> << "\n"
<< "unsigned * signed is signed? " << std::is_signed_v<decltype(u * s)> << "\n"
<< "unsigned * unsigned is signed? " << std::is_signed_v<decltype(u * u)> << "\n";
}
/*
signed * signed is signed? true
signed * unsigned is signed? false
unsigned * signed is signed? false
unsigned * unsigned is signed? false
*/

C++ Implicit Conversion (Signed + Unsigned)

I understand that, regarding implicit conversions, if we have an unsigned type operand and a signed type operand, and the type of the unsigned operand is the same as (or larger than) the type of the signed operand, the signed operand will be converted to unsigned.
So:
unsigned int u = 10;
signed int s = -8;
std::cout << s + u << std::endl;
//prints 2 because it will convert `s` to `unsigned int`, now `s` has the value
//4294967288, then it will add `u` to it, which is an out-of-range value, so,
//in my machine, `4294967298 % 4294967296 = 2`
What I don't understand - I read that if the signed operand has a larger type than the unsigned operand:
if all values in the unsigned type fit in the larger type then the unsigned operand is converted to the signed type
if the values in the unsigned type don't fit in the larger type, then the signed operand will be converted to the unsigned type
so in the following code:
signed long long s = -8;
unsigned int u = 10;
std::cout << s + u << std::endl;
u will be converted to signed long long because int values can fit in signed long long??
If that's the case, in what scenario the smaller type values won't fit in the larger one?
Relevant quote from the Standard:
5 Expressions [expr]
10 Many binary operators that expect operands of arithmetic or
enumeration type cause conversions and yield result types in a similar
way. The purpose is to yield a common type, which is also the type of
the result. This pattern is called the usual arithmetic conversions,
which are defined as follows:
[2 clauses about equal types or types of equal sign omitted]
— Otherwise, if the operand that has unsigned integer type has rank
greater than or equal to the rank of the type of the other operand,
the operand with signed integer type shall be converted to the type of
the operand with unsigned integer type.
— Otherwise, if the type of
the operand with signed integer type can represent all of the values
of the type of the operand with unsigned integer type, the operand
with unsigned integer type shall be converted to the type of the
operand with signed integer type.
— Otherwise, both operands shall be
converted to the unsigned integer type corresponding to the type of
the operand with signed integer type.
Let's consider the following 3 example cases for each of the 3 above clauses on a system where sizeof(int) < sizeof(long) == sizeof(long long) (easily adaptable to other cases)
#include <iostream>
signed int s1 = -4;
unsigned int u1 = 2;
signed long int s2 = -4;
unsigned int u2 = 2;
signed long long int s3 = -4;
unsigned long int u3 = 2;
int main()
{
std::cout << (s1 + u1) << "\n"; // 4294967294
std::cout << (s2 + u2) << "\n"; // -2
std::cout << (s3 + u3) << "\n"; // 18446744073709551614
}
First clause: types of equal rank, so the signed int operand is converted to unsigned int. This entails a value-transformation which (using two's complement) gives the printed value.
Second clause: signed type has higher rank, and (on this platform!) can represent all values of the unsigned type, so unsigned operand is converted to signed type, and you get -2
Third clause: signed type again has higher rank, but (on this platform!) cannot represent all values of the unsigned type, so both operands are converted to unsigned long long, and after the value-transformation on the signed operand, you get the printed value.
Note that if the unsigned operand were large enough (e.g. 6 in these examples), the end result would be 2 for all 3 examples: in the first and third the unsigned sum wraps around, and in the second the signed sum is simply 2.
(Added) Note that you get even more unexpected results when you do comparisons on these types. Let's consider the above example 1 with <:
#include <iostream>
signed int s1 = -4;
unsigned int u1 = 2;
int main()
{
std::cout << (s1 < u1 ? "s1 < u1" : "s1 !< u1") << "\n"; // "s1 !< u1"
std::cout << (-4 < 2u ? "-4 < 2u" : "-4 !< 2u") << "\n"; // "-4 !< 2u"
}
Since 2u is made unsigned explicitly by the u suffix, the same rules apply. The result is probably not what you expect from -4 < 2 when you write -4 < 2u in C++.
signed int does not fit into unsigned long long. So you will have this conversion:
signed int -> unsigned long long.
Note that the C++11 standard doesn't talk about the larger or smaller types here, it talks about types with lower or higher rank.
Consider the case of long int and unsigned int where both are 32-bit. The long int has a larger rank than the unsigned int, but since long int and unsigned int are both 32-bit, long int can't represent all the values of unsigned int.
Therefore we fall into the last case (C++11: 5p9):
Otherwise, both operands shall be converted to the unsigned integer type corresponding to the
type of the operand with signed integer type.
This means that both the long int and the unsigned int will be converted to unsigned long int.

C++: Signed/unsigned mismatch when only using unsigned types

When I try to compile the following C++ program using the Visual Studio 2010 C++ compiler (X86) with warning level /W4 enabled, I get a signed/unsigned mismatch warning at the marked line.
#include <cstdio>
#include <cstdint>
#include <cstddef>
int main(int argc, char **argv)
{
size_t idx = 42;
uint8_t bytesCount = 20;
// warning C4389: '==' : signed/unsigned mismatch
if (bytesCount + 1 == idx)
{
printf("Hello World\n");
}
// no warning
if (bytesCount == idx)
{
printf("Hello World\n");
}
}
This confuses me, since I'm only using unsigned types. Since the comparison
bytesCount == idx
causes no such warning, it probably has to do with some strange implicit conversion that happens here.
Thus: what is the reason why I get this warning and by what rules does this conversion happen (if this is the reason)?
1 is a signed literal. Try bytesCount + 1U.
The compiler is probably creating a temporary value of the signed type due to the addition of signed and unsigned values ( bytesCount + 1 )
1 is an int. The type of an integral arithmetic expression depends on the types involved. In this case, you have an unsigned type and a signed type where the unsigned type is smaller than the signed type. This falls under the C++ standard on expressions (section 5.10 [expr]):
Otherwise, if the type of the operand with signed integer type can
represent all of the values of the type of the operand with unsigned
integer type, the operand with unsigned integer type shall be
converted to the type of the operand with signed integer type.
I.e., the type of the expression bytesCount + 1 is int which is signed by default.
Since 1 is of type int the expression bytesCount + 1 is int (signed).
In fact, when a type smaller than int is used in an arithmetic expression it is promoted to int, so even + bytesCount and bytesCount + bytesCount are considered int and not uint8_t (while bytesCount + 1U is an unsigned int, since int and unsigned int have the same rank and the signed operand is then converted to the unsigned type).
The following program outputs true three times.
#include <iostream>
#include <typeinfo>
int main()
{
    unsigned short s = 1;
    // Compare the type_info objects themselves; their addresses need not be unique.
    std::cout << std::boolalpha;
    std::cout << (typeid( s + 1U ) == typeid(1U)) << std::endl;
    std::cout << (typeid( + s ) == typeid(1)) << std::endl;
    std::cout << (typeid( s + s ) == typeid(1)) << std::endl;
}
The other answers already tell you that bytesCount + 1 is interpreted as signed int. However, I'd like to add that in bytesCount == idx, bytesCount is also interpreted as signed int. Conceptually, it is first converted to signed int, and only after that is it converted to unsigned int. Your compiler does not warn about this, because it has enough information to know that there is not really a problem. The conversion to signed int cannot possibly make bytesCount negative. Comparing bytesCount + 1 is equally valid and equally safe, but it is just that slight bit more complex, enough that the compiler no longer recognises it as safe.