Can't Assign 64-Bit Number to 64-Bit Min in C++

I'm new to C++, so sorry if this sounds like a dumb question. I'm attempting to assign the 64-bit minimum (-9223372036854775808) to an int64_t (from cstdint), but I'm getting the following error message:
main.cpp:5:27: warning: integer literal is too large to be represented in a signed integer type, interpreting as unsigned [-Wimplicitly-unsigned-literal]
int64_t int_64_min = -9223372036854775808;
The code is as follows:
#include <iostream>
#include <cstdint>
int main() {
    int64_t int_64_min = -9223372036854775808;
    std::cout << int_64_min << std::endl;
    std::cout << INT64_MIN << std::endl;
    return 0;
}
Am I missing something obvious?

This is exactly why the INT_MIN family of macros is often defined as -INT_MAX - 1 on a two's complement platform (virtually ubiquitous, and compulsory from C++20).
There is no such thing as a negative literal in C++: there is only unary negation applied to a positive literal, which produces a compile-time-evaluable constant expression.
9223372036854775808 is too big to fit into a 64-bit signed integral type, so negating it and assigning the result to a signed 64-bit integral type is implementation-defined behavior.
Writing -9223372036854775807 - 1 is probably what your C++ standard library implementation does. If I were you, I'd use INT64_MIN directly, or std::numeric_limits<std::int64_t>::min().
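For reference, here is a minimal sketch of both suggestions (nothing in it is specific to your setup):
// Minimal sketch: two portable ways to get the 64-bit minimum
// without writing an out-of-range literal.
#include <cstdint>
#include <iostream>
#include <limits>

int main() {
    std::int64_t a = INT64_MIN;                                // macro from <cstdint>
    std::int64_t b = std::numeric_limits<std::int64_t>::min(); // same value, from <limits>
    std::cout << a << '\n' << b << '\n';  // both print -9223372036854775808
}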

The problem is that -9223372036854775808 is not a literal. 9223372036854775808 is a literal, and -9223372036854775808 is an expression consisting of that literal with a unary - applied to it.
All unsuffixed decimal literals have type int, long int, or long long int. Since 9223372036854775808 is 2^63, it's bigger than the maximum value of long long int (assuming long long int is 64 bits, which it almost certainly is).
The simplest solution is to use the INT64_MIN macro defined in <cstdint>.
Aside from that, the problem is that there are no literals for negative values. Another solution is to replace -9223372036854775808 with -9223372036854775807-1 -- or with (-9223372036854775807-1) to avoid ambiguity. (The <cstdint> header very likely does something similar.)
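A quick compile-time check confirms that the expression form denotes the same value as the macro (a minimal sketch; requires C++11 for static_assert):
// Sketch: the "max minus one" expression is a valid constant
// expression equal to INT64_MIN; no out-of-range literal is formed.
#include <cstdint>

static_assert(-9223372036854775807LL - 1 == INT64_MIN,
              "expression form equals the macro");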

Related

decimal literal is more than LONG_LONG_MAX

In the "C++ Primer" book it is said that decimal literals have the smallest type of int, long or long long in which the literal's value fits, and it is an error to use a literal that is too large to fit in the largest related type. So, i guess, that the max decimal literal is the max value long long can hold.
But I tried it: printed the max long long value, then a decimal literal higher. I thought it had to cause an error, but it was okay. I then tried to print a decimal literal higher than the unsigned long long max value - it wrapped around.
Here is my code:
#include <iostream>
#include <limits>
using namespace std;
int main() {
    cout << LONG_LONG_MAX << endl;
    cout << 9223372036854775808 << endl;
    cout << ULONG_LONG_MAX << endl;
    cout << 18446744073709551616 << endl;
}
Here is the output:
9223372036854775807
9223372036854775808
18446744073709551615
0
Please explain why a decimal literal can be higher than the long long max value.
UPD:
I noticed that there actually were warnings. I work in Code::Blocks, and on rebuilding and rerunning a project it doesn't print the warnings again if the code hasn't changed, so I just didn't notice them. There were 3 warnings:
integer constant is so large that it is unsigned (8)
this decimal constant is unsigned only in ISO C90 (8)
integer constant is too large for its type (10)
But the output was the same. Then I noticed that in the debugger options I had forgotten to fill in the 'executable path' line, so I pasted in my gdb path. But I don't think it mattered.
Afterwards I noticed that I hadn't chosen the 'have g++ follow c++11' option in the compiler settings. Once I did, the compiler printed both warnings and some errors. There was no this decimal constant is unsigned only in ISO C90 warning, but there was an ..\main.cpp|8|error: ambiguous overload for 'operator<<' (operand types are 'std::ostream {aka std::basic_ostream<char>}' and '<unnamed-signed:128>')| error. Also it didn't recognise limits with LONG_LONG_MAX and ULONG_LONG_MAX, but it did recognise climits with LLONG_MAX and ULLONG_MAX.
With C++98 there were warnings like in the first case (without any option chosen), and it didn't recognise limits, only climits. When I changed the header name, the output was like in the beginning.
The compiler is set to GNU GCC Compiler.
Maybe someone could explain why the compiler accepts limits when I don't specify a standard; then, I think, the question is closed.
According to the language specification, a decimal literal with no suffix has a signed type (the smallest type of int, long int, or long long int that can hold the value). Binary, octal, and hexadecimal constants can have unsigned types.
An implementation is allowed to support extended integer types. If your compiler has an integer type larger than a long long, then your decimal literal would be of that type. If it doesn't, then it is either a compiler bug or compiler extension since an unsuffixed decimal literal cannot be of type unsigned long long.
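As an aside, if a literal with such a value is really wanted, an explicit suffix sidesteps the whole question of what type the compiler picks (a minimal sketch, independent of the extended-type discussion above):
// Sketch: an explicit ull suffix gives the literal type
// unsigned long long, so no warning or extension is involved.
#include <iostream>

int main() {
    std::cout << 9223372036854775808ull << '\n';  // 2^63, in range for unsigned long long
    std::cout << 18446744073709551615ull << '\n'; // 2^64 - 1, i.e. ULLONG_MAX
}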
Please use the header and the constants defined by the standard: http://en.cppreference.com/w/cpp/types/climits.
The upper limit for a 64-bit int (long long on most POSIX systems) is 9223372036854775807 (2^63 - 1).
The maximum value for an object of type unsigned long long int is 18446744073709551615 (2^64 - 1).
<limits> may exist, but it does not necessarily include <climits>.
As your literal doesn't fit into a signed value, the compiler must give you a diagnostic (if you didn't disable it), something like:
main.c:8:13: warning: integer constant is so large that it is unsigned
cout << 9223372036854775808 << endl;
Apparently gcc/clang are liberal enough to consider that diagnostic sufficient: they compile the program assuming the value is represented by an unsigned type, and truncate the second literal. This behavior is not guaranteed, so you might instead get negative values in the output, e.g.:
9223372036854775807
-9223372036854775808
18446744073709551615
0
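For comparison, here is a sketch of the same program written against the standard header and constants (output in comments assumes a 64-bit long long):
// Sketch: the portable version using <climits> names.
#include <climits>
#include <iostream>
using namespace std;

int main() {
    cout << LLONG_MAX << endl;   // 9223372036854775807
    cout << ULLONG_MAX << endl;  // 18446744073709551615
}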

Is `-1` correct for use as the maximum value of an unsigned integer?

Is there any C++ standard paragraph which says that using -1 for this is a portable and correct way, or is using predefined values the only way of doing this correctly?
I have had a conversation with my colleague about what is better: using -1 for the maximum unsigned integer number, or using a value from limits.h or std::numeric_limits?
I told my colleague that using the predefined maximum values from limits.h or std::numeric_limits is the portable and clean way of doing this; however, my colleague objected that -1 is just as portable as the numeric limits, and moreover that it has one more advantage:
unsigned short i = -1; // unsigned short max
can easily be changed to any other type, like
unsigned long i = -1; // unsigned long max
whereas using the predefined value from the limits.h header file or std::numeric_limits also requires rewriting it along with the type on the left.
Regarding conversions of integers, C 2011 [draft N1570] 6.3.1.3 2 says
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
Thus, converting -1 to an unsigned integer type necessarily produces the maximum value of that type.
There may be issues with using -1 in various contexts where it is not immediately converted to the desired type. If it is immediately converted to the desired unsigned integer type, as by assignment or explicit conversion, then the result is clear. However, if it is a part of an expression, its type is int, and it behaves like an int until converted. In contrast, UINT_MAX has the type unsigned int, so it behaves like an unsigned int.
As chux points out in a comment, USHRT_MAX effectively has a type of int, so even the named limits are not fully safe from type issues.
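A small sketch of that distinction: the conversion is immediate in an assignment, but inside a larger expression the -1 is still an int until the usual arithmetic conversions kick in:
// Sketch: -1 is fine when converted directly to the unsigned type,
// but it remains an int inside mixed expressions.
#include <climits>
#include <iostream>

int main() {
    unsigned short a = -1;                  // a == USHRT_MAX, guaranteed
    std::cout << (a == USHRT_MAX) << '\n';  // 1

    unsigned int b = -1;                    // b == UINT_MAX, guaranteed
    std::cout << (b == UINT_MAX) << '\n';   // 1

    // Here -1 is an int; the comparison converts it to unsigned int,
    // so this is true as well (and many compilers warn about it):
    std::cout << (b == -1) << '\n';         // 1
}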
Not using the standard way, or not clearly showing the intent, is often a bad idea that we pay for later.
I would suggest:
auto i = std::numeric_limits<unsigned int>::max();
or, as @jamesdin suggested, a certainly better one, closer to C habits:
unsigned int i = std::numeric_limits<decltype(i)>::max();
Your colleague's argument is not admissible. Changing int -> long int, as below:
auto i = std::numeric_limits<unsigned long int>::max();
does not require extra work compared to the -1 solution (thanks to the use of auto).
The '-1' solution does not directly reflect our intent, and hence possibly has harmful consequences. Consider this code snippet:
using index_t = unsigned int;
... now in another file (or far away from the previous line) ...
const index_t max_index = -1;
First, we do not understand why max_index is -1.
Worse, if someone wants to improve the code and defines
using index_t = ptrdiff_t;
then the statement max_index = -1 no longer yields the max and you get buggy code. Again, this cannot happen with something like:
const index_t max_index = std::numeric_limits<index_t>::max();
CAVEAT: nevertheless, there is a caveat when using std::numeric_limits. It has nothing to do with integers, but is related to floating-point numbers:
std::cout << "\ndouble lowest: "
<< std::numeric_limits<double>::lowest()
<< "\ndouble min : "
<< std::numeric_limits<double>::min() << '\n';
prints:
double lowest: -1.79769e+308
double min : 2.22507e-308 <-- maybe you expected -1.79769e+308 here!
min returns the smallest finite value of the given type
lowest returns the lowest finite value of the given type
Always interesting to remember, as it can be a source of bugs if we do not pay attention (using min instead of lowest).
Is -1 correct for use as the maximum value of an unsigned integer?
Yes, it is functionally correct when used as a direct assignment/initialization. Yet it often looks questionable @Ron.
Constants from limits.h or std::numeric_limits convey more code understanding, yet need maintenance should the type of i change.
[Note] The OP later dropped the C tag.
To add an alternative way of assigning a maximum value (available in C11) that helps reduce code maintenance:
Use the loved/hated _Generic
#include <float.h>  /* FLT_MAX, DBL_MAX, LDBL_MAX */
#include <limits.h> /* the integer _MAX macros */

#define info_max(X) _Generic((X), \
    long double: LDBL_MAX, \
    double: DBL_MAX, \
    float: FLT_MAX, \
    unsigned long long: ULLONG_MAX, \
    long long: LLONG_MAX, \
    unsigned long: ULONG_MAX, \
    long: LONG_MAX, \
    unsigned: UINT_MAX, \
    int: INT_MAX, \
    unsigned short: USHRT_MAX, \
    short: SHRT_MAX, \
    unsigned char: UCHAR_MAX, \
    signed char: SCHAR_MAX, \
    char: CHAR_MAX, \
    _Bool: 1, \
    default: 1/0 /* unhandled type: force a compile-time error */ \
)

int main() {
    ...
    some_basic_type i = info_max(i);
    ...
}
The above macro info_max() has limitations concerning types like size_t, intmax_t, etc. that may not be enumerated in the above list. There are more complex macros that can cope with that; the idea here is illustrative.
The technical side has been covered by other answers; and while you focus on technical correctness in your question, pointing out the cleanness aspect again is important, because imo that’s the much more important point.
The major reason why it is a bad idea to use that particular trickery is: the code is ambiguous. It is unclear whether someone used the unsigned trickery intentionally or made a mistake and actually wanted to initialize a signed variable to -1. Should your colleague suggest adding a comment after you present this argument, tell him to stop being silly. :)
I'm actually slightly baffled that someone would even consider this trick in earnest. There's an unambiguous, intuitive and idiomatic way to set a value to its max in C: the _MAX macros. And there's an additional, equally unambiguous, intuitive and idiomatic way in C++ that provides some more type safety: numeric_limits. That -1 trick is a classic case of being too clever.
The C++ standard says this about signed to unsigned conversions ([conv.integral]/2):
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n, where n is the number of bits used to represent the unsigned type). [ Note: In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). — end note ]
So yes, converting -1 to an n-bit unsigned integer will always give you 2^n - 1, regardless of which signed integer type the -1 started as.
Whether or not unsigned x = -1; is more or less readable than unsigned x = UINT_MAX; though is another discussion (there's definitely the chance that it'll raise some eyebrows, maybe even your own when you look at your own code later;).
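That guarantee can even be checked at compile time (a minimal sketch; requires C++11):
// Sketch: -1 converts to 2^n - 1 for every unsigned width.
#include <cstdint>

static_assert(static_cast<std::uint8_t>(-1) == 0xFFu, "");
static_assert(static_cast<std::uint16_t>(-1) == 0xFFFFu, "");
static_assert(static_cast<std::uint32_t>(-1) == 0xFFFFFFFFu, "");
static_assert(static_cast<std::uint64_t>(-1) == 0xFFFFFFFFFFFFFFFFull, "");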

Is Sign Extension in C++ a compiler option, or compiler dependent or target dependent?

The following code has been compiled on 3 different compilers and 3 different processors and gave 2 different results:
typedef unsigned long int u32;
typedef signed long long s64;

int main ()
{
    u32 Operand1, Operand2;
    s64 Result;
    Operand1 = 95;
    Operand2 = 100;
    Result = (s64)(Operand1 - Operand2);
}
Result comes out as one of two values: either -5 or 4294967291.
I do understand that the operation (Operand1 - Operand2) is done as a 32-bit unsigned calculation; when cast to s64, sign extension appears to have been done correctly in the first case but not in the second.
My question is whether sign extension can be controlled via compiler options, or whether it is compiler-dependent or perhaps target-dependent.
Your issue is that you assume unsigned long int to be 32 bits wide and signed long long to be 64 bits wide. This assumption is wrong.
We can visualize what's going on by using types that have a guaranteed (by the standard) bit width:
#include <cstdint>
#include <iostream>

int main() {
    {
        uint32_t large = 100, small = 95;
        int64_t result = (small - large);
        std::cout << "32 and 64 bits: " << result << std::endl;
    } // 4294967291
    {
        uint32_t large = 100, small = 95;
        int32_t result = (small - large);
        std::cout << "32 and 32 bits: " << result << std::endl;
    } // -5
    {
        uint64_t large = 100, small = 95;
        int64_t result = (small - large);
        std::cout << "64 and 64 bits: " << result << std::endl;
    } // -5
    return 0;
}
In each of these three cases, the expression small - large yields a result of unsigned integer type (of the corresponding width). This result is calculated using modular arithmetic.
In the first case, that unsigned result can be represented in the wider signed integer, so the conversion leaves the value unchanged.
In the other cases the result cannot be represented in the signed integer, so an implementation-defined conversion is performed, which usually means interpreting the bit pattern of the unsigned value as a signed value. Because the result is "large", the highest bits are set, which when treated as a signed value (under two's complement) is equivalent to a "small" negative value.
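If a negative difference is what's wanted, the usual fix (a sketch, assuming 64-bit signed arithmetic is the goal) is to widen the operands before subtracting, so the subtraction itself is signed:
// Sketch: convert each operand to the signed 64-bit type first,
// so the subtraction happens in int64_t and can go negative.
#include <cstdint>
#include <iostream>

int main() {
    uint32_t large = 100, small = 95;
    int64_t result = static_cast<int64_t>(small) - static_cast<int64_t>(large);
    std::cout << result << std::endl;  // -5 on every conforming platform
}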
To highlight the comment from Lưu Vĩnh Phúc:
Operand1-Operand2 is unsigned therefore when casting to s64 it's always zero extension. [..]
A widening conversion happens only in the first case, and because the source type is unsigned it is indeed zero extension, not sign extension.
Quotes from the standard, emphasis mine. Regarding small - large:
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n, where n is the number of bits used to represent the unsigned type). [..]
§ 4.7/2
Regarding the conversion from unsigned to signed:
If the destination type [of the integral conversion] is signed, the value is unchanged if it can be represented in the destination type; otherwise, the value is implementation-defined.
§ 4.7/3
Sign extension is platform dependent, where platform is a combination of a compiler, target hardware architecture and operating system.
Moreover, as Paul R mentioned, the width of built-in types (like unsigned long) is platform-dependent too. Use types from <cstdint> to get fixed-width types. Nevertheless, they are just platform-dependent typedefs, so their sign extension behavior still depends on the platform.
Here is a good almost-duplicate question about type sizes. And here is a good table about type size relations.
Type promotions, and the corresponding sign-extensions are specified by the C++ language.
What's not specified, but is platform-dependent, is the range of integer types provided. It's even Standard-compliant for char, short int, int, long int and long long int all to have the same range, provided that range satisfies the C++ Standard requirements for long long int. On such a platform, no widening or narrowing would ever happen, but signed<->unsigned conversion could still alter values.

Cast from unsigned long long to double and vice versa changes the value

When writing some C++ code I suddenly realised that my numbers are incorrectly cast from double to unsigned long long.
To be specific, I use the following code:
#define _CRT_SECURE_NO_WARNINGS
#include <iostream>
#include <limits>
using namespace std;
int main()
{
    unsigned long long ull = numeric_limits<unsigned long long>::max();
    double d = static_cast<double>(ull);
    unsigned long long ull2 = static_cast<unsigned long long>(d);
    cout << ull << endl << d << endl << ull2 << endl;
    return 0;
}
Ideone live example.
When this code is executed on my computer, I have the following output:
18446744073709551615
1.84467e+019
9223372036854775808
Press any key to continue . . .
I expected the first and third numbers to be exactly the same (just like on Ideone) because I was sure that a long double takes 10 bytes and stores the mantissa in 8 of them. I would understand if the third number were truncated compared to the first one - just in case I'm wrong about the floating-point format. But here the values differ by a factor of two!
So, the main question is: why? And how can I predict such situations?
Some details: I use Visual Studio 2013 on Windows 7, compile for x86, and sizeof(long double) == 8 for my system.
18446744073709551615 is not exactly representable in double (in IEEE 754). This is not unexpected, as a 64-bit floating point type obviously cannot represent all integers that are representable in 64 bits.
According to the C++ Standard, it is implementation-defined whether the next-highest or next-lowest double value is used. Apparently on your system, it selects the next highest value, which seems to be 1.8446744073709552e19. You could confirm this by outputting the double with more digits of precision.
Note that this is larger than the original number.
When you convert this double to integer, the behaviour is covered by [conv.fpint]/1:
A prvalue of a floating point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type.
So this code potentially causes undefined behaviour. When undefined behaviour has occurred, anything can happen, including (but not limited to) bogus output.
The question was originally posted with long double, rather than double. On my gcc, the long double case behaves correctly, but on OP's MSVC it gave the same error. This could be explained by gcc using 80-bit long double, but MSVC using 64-bit long double.
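As suggested above, printing the converted double with more digits makes the rounding direction visible (a minimal sketch; the printed value assumes IEEE 754 binary64):
// Sketch: show that the double rounded up past ULLONG_MAX.
#include <iomanip>
#include <iostream>
#include <limits>

int main() {
    unsigned long long ull = std::numeric_limits<unsigned long long>::max();
    double d = static_cast<double>(ull);
    std::cout << std::setprecision(20) << d << '\n';  // 18446744073709551616, i.e. 2^64
}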
It's due to the double approximation of the unsigned long long value. Its precision means an error of up to about 2000 units near 10^19 (one ulp of a double is 2^11 = 2048 there); as you try to convert values around the upper limit of the unsigned long long range, it overflows. Try converting a value 10000 lower instead :)
BTW, on Cygwin, the third printed value is zero.
The problem is surprisingly simple. This is what is happening in your case:
18446744073709551615, when converted to a double, is rounded up to the nearest number that the floating point type can represent. (The closest representable number is larger.)
When that's converted back to an unsigned long long, it's larger than max(). Formally, the behaviour of converting this back to an unsigned long long is undefined, but what appears to be happening in your case is a wrap-around.
The observed significantly smaller number is the result of this.
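A defensive version of the round trip checks the range before converting back, so the truncated value is always representable (a sketch; the helper name to_ull is made up for illustration):
// Sketch: guard against the UB in the double -> unsigned long long conversion.
// Returns false instead of invoking UB when d is out of range or NaN.
bool to_ull(double d, unsigned long long& out) {
    // 18446744073709551616.0 is 2^64, which is exactly representable as a
    // double; any value >= 2^64, negative, or NaN has no valid truncation.
    if (!(d >= 0.0) || d >= 18446744073709551616.0)
        return false;
    out = static_cast<unsigned long long>(d);
    return true;
}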

Smallest values for int8_t and int64_t

With regard to those definitions found in stdint.h, I wish to test a function for converting vectors of int8_t or vectors of int64_t to vectors of std::string.
Here are my tests:
TEST(TestAlgorithms, toStringForInt8)
{
    std::vector<int8_t> input = boost::assign::list_of(-128)(0)(127);
    Container container(input);
    EXPECT_TRUE(boost::apply_visitor(ToString(), container) == boost::assign::list_of("-128")("0")("127"));
}

TEST(TestAlgorithms, toStringForInt64)
{
    std::vector<int64_t> input = boost::assign::list_of(-9223372036854775808)(0)(9223372036854775807);
    Container container(input);
    EXPECT_TRUE(boost::apply_visitor(ToString(), container) == boost::assign::list_of("-9223372036854775808")("0")("9223372036854775807"));
}
However, I am getting a warning in visual studio for the line:
std::vector<int64_t> input = boost::assign::list_of(-9223372036854775808)(0)(9223372036854775807);
as follows:
warning C4146: unary minus operator applied to unsigned type, result still unsigned
If I change -9223372036854775808 to -9223372036854775807, the warning disappears.
What is the issue here? With regard to my original code, the test is passing.
It's just as the compiler says: -9223372036854775808 is not a valid number, because the - and the digits are treated separately.
You could try -9223372036854775807 - 1 or use std::numeric_limits<int64_t>::min() instead.
The issue is that integer literals are not negative; so -42 is not a literal with a negative value, but rather the - operator applied to the literal 42.
In this case, 9223372036854775808 is out of the range of int64_t, so it is given an unsigned type. Due to the magic of modular arithmetic, you can still negate it, assign it to int64_t, and end up with the result you expect; but the compiler will warn you about the unsigned negation (if you tell it to), since that can often be the result of an error.
You could avoid the warning (and make the code more obviously correct) by using std::numeric_limits<int64_t>::min() instead.
Change -9223372036854775808 to -9223372036854775807-1.
The issue is that -9223372036854775808 isn't -9223372036854775808 but rather -(9223372036854775808) and 9223372036854775808 cannot fit into a signed 64-bit type (decimal integer constants by default are a signed type), so it instead becomes unsigned. Applying negation with - to an unsigned type is suspicious, hence the warning.
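For completeness, a sketch of the test input built from <limits> instead of a literal, which avoids warning C4146 entirely (the surrounding test scaffolding from the question is assumed):
// Sketch: build the endpoints from numeric_limits, so no
// out-of-range literal ever appears in the source.
#include <cstdint>
#include <limits>
#include <vector>
#include <boost/assign/list_of.hpp>

std::vector<int64_t> make_input() {
    return boost::assign::list_of
        (std::numeric_limits<int64_t>::min())
        (int64_t(0))
        (std::numeric_limits<int64_t>::max());
}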