U suffix in the variable declaration - c++

I know that if the number is followed with U suffix it is treated as unsigned. But why following program prints correct value of variable i even though it is initialized with negative value. (Compiled with gcc 4.9.2, 4.8.2, & 4.7.1)
Program1.cpp
#include <iostream>
int main()
{
int i=-5U;
std::cout<<i; // prints -5 on gcc 4.9.2, 4.8.2, & 4.7.1
}
Program2.cpp
#include <iostream>
int main()
{
auto i=-5U;
std::cout<<i; // prints large positive number as output
}
But If I use auto keyword (The type deducer new C++0x feature) it gives me large positive number as expected.
Please correct me If I am understanding incorrect something.

-5U is not -5 U. It is -(5U). The minus sign is a negation operator operating on 5U, not the first character of an integer literal.
When you negate an unsigned number, it is equivalent to subtracting the current value from 2^n, where n is the number of bits in the integer type. So that explains the second program. As for the first, when you cast an unsigned integer to a signed integer (as you are doing by assigning it to an int), if the value is out of range the result is undefined behavior but in general* will result in the value being reinterpreted as an unsigned integer-- and since unsigned negation just so happens to have the same behavior as two's complement signed negation, the result is the same as if the negation happened in a signed context.
.* Note: This is NOT one of those "undefined behavior" situations that is only of academic concern to language wonks. Compilers can and do assume that casting an unsigned number to signed will not result in overflow (particularly when the resulting integer is then used in a loop), and there are known instances of this assumption turning carelessly written code into buggy programs.

Related

Why, in some C++ compilers, does `int x = 2147483647+1;` give only a warning but stores a negative value, while some compilers give a runtime error?

I want to check if the reverse of an signed int value x lies inside INT_MAX and INT_MIN. For this, I have reversed x twice and checked if it is equal to the original x, if so then it lies inside INT_MAX and INT_MIN, else it does not.
But online compilers are giving a runtime error, but my g++ compiler is working fine and giving the correct output. Can anybody tell me the reason?
int reverse(int x) {
int tx=x,rx=0,ans;
while(tx!=0){
rx = rx+rx+rx+rx+rx+rx+rx+rx+rx+rx+tx%10;
tx/=10;
}
ans = tx = rx;
rx=0;
while(tx!=0){
rx = rx*10 + tx%10;
tx/=10;
}
while(x%10==0&&x!=0)x/=10;
//triming trailing zeros
if(rx!=x){
return 0;
}else{
return ans;
}
}
ERROR:
Line 6: Char 23: runtime error: signed integer overflow: 1929264870 + 964632435 cannot be represented in type 'int' (solution.cpp)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior prog_joined.cpp:15:23
According to the cppreference:
Overflows
...
When signed integer arithmetic operation overflows (the result does not fit in the result type), the behavior is undefined: it may wrap around according to the rules of the representation (typically 2's complement), it may trap on some platforms or due to compiler options (e.g. -ftrapv in GCC and Clang), or may be completely optimized out by the compiler.
So the online compiler may comes with more strict checking, while your local GCC choosed to "wrap" on overflow.
Instead, if you want to wrap on overflow for sure, you may promote your operands to 64 bit width, perform the addition and then convert the result to 32 bit width again. According to cppreference this seems to be well defined after C++20.
Numeric conversions
Unlike the promotions, numeric conversions may change the values, with potential loss of precision.
Integral conversions
...
if the destination type is signed, the value does not change if the source integer can be represented in the destination type. Otherwise the result is { implementation-defined (until C++20) } / { the unique value of the destination type equal to the source value modulo 2n
where n is the number of bits used to represent the destination type. (since C++20) }
Example code on godbolt
I'm not sure what's wrong with your algorithm, but let me suggest an alternative that avoids doing math and the possibility of integer overflows.
If you want to find of if say, 2147483641 is a valid integer when reversed (e.g. 1463847412), you can do it entirely as string comparisons after converting the initial value to a string and reversing that string.
Basic algorithm for a non-negative value it this:
convert integer to string (let's call this string s)
convert INT_MAX to a string (let's call this string max_s)
reverse s. Handle leading zero's and negative chars appropriately. That is, 120 reversed is "21" and -456 reversed is "-654". The result of this conversion is a string called rev.
If rev.length() < s_max.length() Then rev is valid as an integer.
If rev.length() > s_max.length(), then rev is not valid as a integer.
If rev.size() == s_max.length(), then the reversed string is the same length as INT_MAX as a string. A lexical comparison will suffice to see if it's valid. That is isValid = (rev <= s_max) or isValid = strcmp(rev, s_max) < 1
The algorithm for a negative value is identical except replace INT_MAX with INT_MIN.

Why is long l = 0x80000000 a positive number?

In C++, why is long l = 0x80000000; positive?
C++:
long l = 0x80000000; // l is positive. Why??
int i = 0x80000000;
long l = i; // l is negative
According to this site: https://en.cppreference.com/w/cpp/language/integer_literal, 0x80000000 should be a signed int but it doesn't appear to be case because when it gets assigned to l sign extension doesn't occur.
Java:
long l = 0x80000000; // l is negative
int i = 0x80000000;
long l = i; // l is negative
On the other hand, Java has a more consistent behavior.
C++ Test code:
#include <stdio.h>
#include <string.h>
void print_sign(long l) {
if (l < 0) {
printf("Negative\n");
} else if (l > 0) {
printf("Positive\n");
} else {
printf("Zero\n");
}
}
int main() {
long l = -0x80000000;
print_sign(l); // Positive
long l2 = 0x80000000;
print_sign(l2); // Positive
int i = 0x80000000;
long l3 = i;
print_sign(l3); // Negative
int i2 = -0x80000000;
long l4 = i2;
print_sign(l4); // Negative
}
From your link: "The type of the integer literal is the first type in which the value can fit, from the list of types which depends on which numeric base and which integer-suffix was used." and for hexadecimal values lists int, unsigned int...
Your compiler uses 32 bit ints, so the largest (signed) int is 0x7FFFFFFF. The reason a signed int cannot represent 0x8000000...0xFFFFFFF is that it needs some of the 2^32 possible values of its 32 bits to represent negative numbers. However, 0x80000000 fits in an 32 bit unsigned int. Your compiler uses 64 bit longs, which can hold up to 0x7FFF FFFF FFFF FFFF, so 0x80000000 also fits in a signed long, and so the long l is the positive value 0x80000000.
On the other hand int i is a signed int and simply doesn't fit 0x80000000, so undefined behaviour occurs. What often happens when a signed number is too big to fit in C++ is that two-complement arithmetic is used and the number wraps round to a large negative number. (Do not rely on this behaviour; optimisations have been known to break this). In any case it appears the two's complement behaviour has indeed happened in this case, resulting in i being negative.
In your example code you use both 0x80000000 and -0x80000000 and in each case they have the same result. In fact, the are the same. Recall that 0x8000000 is an unsigned int. The 2003 C++ standard says in 5.3.1c7: "The negative of an unsigned quantity is computed by subtracting its value from 2^n, where n is the number of bits in the promoted operand." 0x80000000 is precisely 2^31, and so -0x80000000 is 2^32-2^31=2^31. To get the expected behaviours we would have to use -(long)0x80000000 instead.
With the help of the awesome people on SO, I think I can answer my own question now:
Just to correct the notion that 0x80000000 can't fit in an int:
It is possible to store, without loss or undefined behavior, the value 0x80000000 to an int (assuming sizeof(int) == 4). The following code can demonstrate this behavior:
#include <limits.h>
#include <stdio.h>
int main() {
int i = INT_MIN;
printf("%X\n", i);
return 0;
}
Assigning the literal 0x80000000 to a variable is little more nuanced, though.
What the other others failed to mention (except #Daniel Langr) is the fact that C++ doesn't have a concept of negative literals.
There are no negative integer literals. Expressions such as -1 apply the unary minus operator to the value represented by the literal, which may involve implicit type conversions.
With this in mind, the literal 0x80000000 is always treated as a positive number. Negations come after the size and sign have been determined. This is important: negations don't affect the unsigned/signedness of the literal, only the base and the value do. 0x80000000 is too big to fit in a signed integer, so C++ tries to use the next applicable type: unsigned int, which then succeeds. The order of types C++ tries depends on the base of the literal plus any suffixes it may or may not have.
The table is listed here: https://en.cppreference.com/w/cpp/language/integer_literal
So with this rule in mind let's work out some examples:
-2147483648: Treated as a long int because it can't fit in an int.
2147483648: Treated as a long int because C++ doesn't consider unsigned int as a candidate for decimal literals.
0x80000000: Treated as an unsigned int because C++ considers unsigned int as a candidate for non-decimal literals.
(-2147483647 - 1): Treated as an int. This is typically how INT_MIN is defined to preserve the type of the literal as an int. This is the type safe way of saying -2147483648 as an int.
-0x80000000: Treated as an unsigned int even though there's a negation. Negating any unsigned is undefined behavior, though.
-0x80000000l: Treated as a long int and the sign is properly negated.

Why does the most negative int value cause an error about ambiguous function overloads?

I'm learning about function overloading in C++ and came across this:
void display(int a)
{
cout << "int" << endl;
}
void display(unsigned a)
{
cout << "unsigned" << endl;
}
int main()
{
int i = -2147483648;
cout << i << endl; //will display -2147483648
display(-2147483648);
}
From what I understood, any value given in the int range (in my case int is 4 byte) will call display(int) and any value outside this range will be ambiguous (since the compiler cannot decide which function to call). It is valid for the complete range of int values except its min value i.e. -2147483648 where compilation fails with the error
call of overloaded display(long int) is ambiguous
But taking the same value to an int and printing the value gives 2147483648. I'm literally confused with this behavior.
Why is this behavior observed only when the most negative number is passed? (The behavior is the same if a short is used with -32768 - in fact, in any case where the negative number and positive number have the same binary representation)
Compiler used: g++ (GCC) 4.8.5
This is a very subtle error. What you are seeing is a consequence of there being no negative integer literals in C++. If we look at [lex.icon] we get that a integer-literal,
integer-literal
decimal-literal integer-suffixopt
[...]
can be a decimal-literal,
decimal-literal:
nonzero-digit
decimal-literal ’ opt digit
where digit is [0-9] and nonzero-digit is [1-9] and the suffix par can be one of u, U, l, L, ll, or LL. Nowhere in here does it include - as being part of the decimal literal.
In §2.13.2, we also have:
An integer literal is a sequence of digits that has no period or exponent part, with optional separating single quotes that are ignored when determining its value. An integer literal may have a prefix that specifies its base and a suffix that specifies its type. The lexically first digit of the sequence of digits is the most significant. A decimal integer literal (base ten) begins with a digit other than 0 and consists of a sequence of decimal digits.
(emphasis mine)
Which means the - in -2147483648 is the unary operator -. That means -2147483648 is actually treated as -1 * (2147483648). Since 2147483648 is one too many for your int it is promoted to a long int and the ambiguity comes from that not matching.
If you want to get the minimum or maximum value for a type in a portable manner you can use:
std::numeric_limits<type>::min(); // or max()
The expression -2147483648 is actually applying the - operator to the constant 2147483648. On your platform, int can't store 2147483648, it must be represented by a larger type. Therefore, the expression -2147483648 is not deduced to be signed int but a larger signed type, signed long int.
Since you do not provide an overload for long the compiler is forced to choose between two overloads that are both equally valid. Your compiler should issue a compiler error about ambiguous overloads.
Expanding on others' answers
To clarify why the OP is confused, first: consider the signed int binary representation of 2147483647, below.
Next, add one to this number: giving another signed int of -2147483648 (which the OP wishes to use)
Finally: we can see why the OP is confused when -2147483648 compiles to a long int instead of a signed int, since it clearly fits in 32 bits.
But, as the current answers mention, the unary operator (-) is applied after resolving 2147483648 which is a long int and does NOT fit in 32 bits.

Smallest values for int8_t and int64_t

With regard to to those definitions found in stdint.h, I wish to test a function for converting vectors of int8_t or vectors of int64_t to vectors of std::string.
Here are my tests:
TEST(TestAlgorithms, toStringForInt8)
{
std::vector<int8_t> input = boost::assign::list_of(-128)(0)(127);
Container container(input);
EXPECT_TRUE(boost::apply_visitor(ToString(),container) == boost::assign::list_of("-128")("0")("127"));
}
TEST(TestAlgorithms, toStringForInt64)
{
std::vector<int64_t> input = boost::assign::list_of(-9223372036854775808)(0)(9223372036854775807);
Container container(input);
EXPECT_TRUE(boost::apply_visitor(ToString(),container) == boost::assign::list_of("-9223372036854775808")("0")("9223372036854775807"));
}
However, I am getting a warning in visual studio for the line:
std::vector<int64_t> input = boost::assign::list_of(-9223372036854775808)(0)(9223372036854775807);
as follows:
warning C4146: unary minus operator applied to unsigned type, result still unsigned
If I change -9223372036854775808 to -9223372036854775807, the warning disappears.
What is the issue here? With regard to my original code, the test is passing.
It's like the compiler says, -9223372036854775808 is not a valid number because the - and the digits are treated separately.
You could try -9223372036854775807 - 1 or use std::numeric_limits<int64_t>::min() instead.
The issue is that integer literals are not negative; so -42 is not a literal with a negative value, but rather the - operator applied to the literal 42.
In this case, 9223372036854775808 is out of the range of int64_t, so it will be given an unsigned type. Due to the magic of modular arithmetic, you can still negate it, assign it to int64_t, and end up with the result you expect; but the compiler will warn you about the unsigned negation (if you tell it to) since that can often be the result of an error.
You could avoid the warning (and make the code more obviously correct) by using std::numeric_limits<int64_t>::min() instead.
Change -9223372036854775808 to -9223372036854775807-1.
The issue is that -9223372036854775808 isn't -9223372036854775808 but rather -(9223372036854775808) and 9223372036854775808 cannot fit into a signed 64-bit type (decimal integer constants by default are a signed type), so it instead becomes unsigned. Applying negation with - to an unsigned type is suspicious, hence the warning.

Curious arithmetic error- 255x256x256x256=18446744073692774400

I encountered a strange thing when I was programming under c++. It's about a simple multiplication.
Code:
unsigned __int64 a1 = 255*256*256*256;
unsigned __int64 a2= 255 << 24; // same as the above
cerr()<<"a1 is:"<<a1;
cerr()<<"a2 is:"<<a2;
interestingly the result is:
a1 is: 18446744073692774400
a2 is: 18446744073692774400
whereas it should be:(using calculator confirms)
4278190080
Can anybody tell me how could it be possible?
255*256*256*256
all operands are int you are overflowing int. The overflow of a signed integer is undefined behavior in C and C++.
EDIT:
note that the expression 255 << 24 in your second declaration also invokes undefined behavior if your int type is 32-bit. 255 x (2^24) is 4278190080 which cannot be represented in a 32-bit int (the maximum value is usually 2147483647 on a 32-bit int in two's complement representation).
C and C++ both say for E1 << E2 that if E1 is of a signed type and positive and that E1 x (2^E2) cannot be represented in the type of E1, the program invokes undefined behavior. Here ^ is the mathematical power operator.
Your literals are int. This means that all the operations are actually performed on int, and promptly overflow. This overflowed value, when converted to an unsigned 64bit int, is the value you observe.
It is perhaps worth explaining what happened to produce the number 18446744073692774400. Technically speaking, the expressions you wrote trigger "undefined behavior" and so the compiler could have produced anything as the result; however, assuming int is a 32-bit type, which it almost always is nowadays, you'll get the same "wrong" answer if you write
uint64_t x = (int) (255u*256u*256u*256u);
and that expression does not trigger undefined behavior. (The conversion from unsigned int to int involves implementation-defined behavior, but as nobody has produced a ones-complement or sign-and-magnitude CPU in many years, all implementations you are likely to encounter define it exactly the same way.) I have written the cast in C style because everything I'm saying here applies equally to C and C++.
First off, let's look at the multiplication. I'm writing the right hand side in hex because it's easier to see what's going on that way.
255u * 256u = 0x0000FF00u
255u * 256u * 256u = 0x00FF0000u
255u * 256u * 256u * 256u = 0xFF000000u (= 4278190080)
That last result, 0xFF000000u, has the highest bit of a 32-bit number set. Casting that value to a signed 32-bit type therefore causes it to become negative as-if 232 had been subtracted from it (that's the implementation-defined operation I mentioned above).
(int) (255u*256u*256u*256u) = 0xFF000000 = -16777216
I write the hexadecimal number there, sans u suffix, to emphasize that the bit pattern of the value does not change when you convert it to a signed type; it is only reinterpreted.
Now, when you assign -16777216 to a uint64_t variable, it is back-converted to unsigned as-if by adding 264. (Unlike the unsigned-to-signed conversion, this semantic is prescribed by the standard.) This does change the bit pattern, setting all of the high 32 bits of the number to 1 instead of 0 as you had expected:
(uint64_t) (int) (255u*256u*256u*256u) = 0xFFFFFFFFFF000000u
And if you write 0xFFFFFFFFFF000000 in decimal, you get 18446744073692774400.
As a closing piece of advice, whenever you get an "impossible" integer from C or C++, try printing it out in hexadecimal; it's much easier to see oddities of twos-complement fixed-width arithmetic that way.
The answer is simple -- overflowed.
Here Overflow occurred on int and when you are assigning it to unsigned int64 its converted in to 18446744073692774400 instead of 4278190080