Ambiguous call to overloaded integer constructor - c++

I provide these two constructors:
BigUnsigned(int value)
    : BigUnsigned(static_cast<unsigned long long>(value)) {
}
BigUnsigned(unsigned long long value);
The problem is that the call is ambiguous for some argument values. According to this answer, which refers to the C++11 standard,
the type of an integer literal is the first of the corresponding list
in Table 6 in which its value can be represented.
Table 6 is here
int
long int
long long int
Therefore, in my case, the type of the constructor argument (an integer literal) is determined by the range it belongs to:
<0, numeric_limits<int>::max()> is int
---> calling BigUnsigned(int)
(numeric_limits<int>::max(), numeric_limits<long int>::max()> is long int
---> ambiguous
(numeric_limits<long int>::max(), too big literal) is long long int
---> ambiguous
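For concreteness, here is a minimal compilable sketch of the situation (the v member is just filler so the snippet builds):
#include <iostream>

struct BigUnsigned {
    unsigned long long v;   // filler member, only so the sketch compiles
    BigUnsigned(unsigned long long value) : v(value) {}
    BigUnsigned(int value) : BigUnsigned(static_cast<unsigned long long>(value)) {}
};

int main() {
    BigUnsigned a(100);           // literal is int: calls BigUnsigned(int), fine
    // BigUnsigned b(5000000000); // literal is long (or long long): both constructors need an
    //                            // integral conversion, so the call is ambiguous and won't compile
    std::cout << a.v << '\n';
}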
How can I solve the ambiguity without declaring any more constructors or explicitly typecasting the argument?
This question on integer promotion and integer conversion might be useful. I still don't know which of them applies in my case, though.

One fundamental problem here is that a decimal literal will never be deduced as an unsigned type. Therefore, a decimal literal that's too large to fit in an int ends up requiring a signed->unsigned conversion in one case, and a long->int conversion in the other. Both of these are classed as "integral conversions", so neither is considered a "better" conversion, and the overload is ambiguous.
As to possible ways to deal with this without explicitly casting the argument or adding more constructors, I can see a couple.
At least for a literal, you could add a suffix that specifies that the type of the literal is unsigned:
BigUnsigned a(5000000000U); // unambiguous
Another (that also applies only to literals) would be to use a hexadecimal or octal literal, which (according to the part of table 6 that isn't quoted in the question) can be deduced as either signed or unsigned. This is only a partial fix, though: it only works for values that will deduce as unsigned. For a typical system with 32-bit int, 32-bit long, and 64-bit long long, I believe it'll come out like this:
0x0 through 0x7FFFFFFF is int
0x80000000 through 0xFFFFFFFF is unsigned int
0x100000000 through 0x7FFFFFFFFFFFFFFF is long long
0x8000000000000000 through 0xFFFFFFFFFFFFFFFF is unsigned long long
So for a parameter large enough that it won't fit in a signed long long, this gives an unambiguous call where the decimal constant would still have been ambiguous.
For people who've worked with smaller types, it might initially seem like the conversion from unsigned long to unsigned long long would qualify as a promotion instead of a conversion, which would make it preferable. And indeed, if (for example) the types involved were unsigned short and unsigned int, that would be exactly true--but that special preference is only given for types with conversion ranks less than int (which basically translates to: types that are smaller than int).
So that fixes the problem for one range of numbers, but only if they're literals, and only if they fall into one specific (albeit, quite large) range.
For the more general case, the only real cure is to change the interface. Either remove the overload for int, or add a few more ctor overloads, specifically for unsigned and for long long. These can be delegating constructors just like the existing one for int, if you decide you need them at all (but it's probably better to just have the one for unsigned long long and be done with it).
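For illustration, a minimal sketch (not the asker's real class; the v member is invented filler) of the "just one unsigned long long constructor" option:
#include <iostream>

struct BigUnsigned {
    unsigned long long v;   // stand-in for the real representation
    BigUnsigned(unsigned long long value) : v(value) {}
    // no BigUnsigned(int): with a single integer constructor there is nothing to be ambiguous about
};

int main() {
    BigUnsigned a(42);            // int converts to unsigned long long
    BigUnsigned b(5000000000);    // long / long long converts too; no ambiguity left
    std::cout << a.v << ' ' << b.v << '\n';
}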

I had the same problem, also with some big-integer classes. However, the literal-only solution did not work for me since I need to convert totally different kinds of (big) ints between each other. I tried to avoid implementing an excessive number of constructors. I solved the problem like this:
template <std::same_as<MyOtherBigInt> T>
explicit constexpr MyBigInt(T pValue) noexcept;
This stops the special MyOtherBigInt constructor from being called with any other integer type.
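For illustration, here is a self-contained C++20 sketch of that technique; MyBigInt's v member and its second constructor are invented filler, not the real classes:
#include <concepts>

struct MyOtherBigInt { unsigned long long v; };   // hypothetical stand-in

struct MyBigInt {
    unsigned long long v{};

    // Only MyOtherBigInt itself satisfies std::same_as<MyOtherBigInt>, so this
    // constructor never competes for plain integer arguments.
    template <std::same_as<MyOtherBigInt> T>
    explicit constexpr MyBigInt(T pValue) noexcept : v(pValue.v) {}

    constexpr MyBigInt(unsigned long long value) noexcept : v(value) {}
};

static_assert(MyBigInt(MyOtherBigInt{7}).v == 7);   // uses the constrained constructor
static_assert(MyBigInt(7ull).v == 7);               // falls through to the plain one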

Related

Why can't `std::min` be used between `int` and `long long int`

So I just want to know: why can't I pass values with different data types to the min and max functions?
int a=7;
long long int b=5;
long long int c=min(a,b);
cout<<c;
My doubt arises because I know for sure that the compiler can implicitly convert from a smaller data type (int) to a larger data type (long long int), so why couldn't the compiler do that here?
This comes down to how std::min and std::max were designed.
They could've had two different types for the two parameters, and use something like std::common_type to figure out a good result type. But instead they have the same type for both parameters, so the two arguments you pass must also have the same type.
compiler can implicitly type cast from smaller datatype(int) to larger datatype(long long int)
Yeah, but it also can implicitly convert back from long long to int. ("Cast" isn't the right word, it means an explicit conversion.)
The rule is that when a function parameter participates in template argument deduction, no implicit conversions are allowed for the argument passed to that parameter. Deduction rules are already complicated; allowing some implicit conversions (e.g. to wider arithmetic types) would make them even more complex.
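For completeness, two common ways around it, as a sketch: either supply the template argument yourself or convert one argument explicitly.
#include <algorithm>
#include <iostream>

int main() {
    int a = 7;
    long long b = 5;

    // Pick the parameter type yourself, so deduction never sees two different types:
    long long c = std::min<long long>(a, b);             // a converts to long long at the call site
    // ...or convert one argument explicitly:
    long long d = std::min(static_cast<long long>(a), b);

    std::cout << c << ' ' << d << '\n';   // prints: 5 5
}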
Because a loss-of-precision error can occur when you do this: both arguments must have the same type, so you cannot mix int with long long or with something like float.
You have to cast one of them explicitly.

The printf `h` length qualifier and variadic arguments [duplicate]

Aside from %hn and %hhn (where the h or hh specifies the size of the pointed-to object), what is the point of the h and hh modifiers for printf format specifiers?
Due to default promotions which are required by the standard to be applied for variadic functions, it is impossible to pass arguments of type char or short (or any signed/unsigned variants thereof) to printf.
According to 7.19.6.1(7), the h modifier:
Specifies that a following d, i, o, u, x, or X conversion specifier applies to a
short int or unsigned short int argument (the argument will
have been promoted according to the integer promotions, but its value shall
be converted to short int or unsigned short int before printing);
or that a following n conversion specifier applies to a pointer to a short
int argument.
If the argument was actually of type short or unsigned short, then promotion to int followed by a conversion back to short or unsigned short will yield the same value as promotion to int without any conversion back. Thus, for arguments of type short or unsigned short, %d, %u, etc. should give identical results to %hd, %hu, etc. (and likewise for char types and hh).
As far as I can tell, the only situation where the h or hh modifier could possibly be useful is when the argument passed is an int outside the range of short or unsigned short, e.g.
printf("%hu", 0x10000);
but my understanding is that passing the wrong type like this results in undefined behavior anyway, so that you could not expect it to print 0.
One real world case I've seen is code like this:
char c = 0xf0;
printf("%hhx", c);
where the author expects it to print f0 despite the implementation having a plain char type that's signed (in which case, printf("%x", c) would print fffffff0 or similar). But is this expectation warranted?
(Note: What's going on is that the original type was char, which gets promoted to int and converted back to unsigned char instead of char, thus changing the value that gets printed. But does the standard specify this behavior, or is it an implementation detail that broken software might be relying on?)
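For reference, a minimal test of that scenario; treat the commented outputs as what common implementations print, since whether the plain %x line is strictly well-defined is exactly what's in question here:
#include <cstdio>

int main() {
    char c = 0xf0;             // on the usual signed-char, two's-complement setup this stores -16
    std::printf("%x\n", c);    // c promotes to int; common implementations print fffffff0
    std::printf("%hhx\n", c);  // value converted back to unsigned char before printing: f0
}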
One possible reason: for symmetry with the use of those modifiers in the formatted input functions? I know it wouldn't be strictly necessary, but maybe there was value seen for that?
Although they don't mention the importance of symmetry for the "h" and "hh" modifiers in the C99 Rationale document, the committee does mention it as a consideration for why the "%p" conversion specifier is supported for fscanf() (even though that wasn't new for C99 - "%p" support is in C90):
Input pointer conversion with %p was added to C89, although it is obviously risky, for symmetry with fprintf.
In the section on fprintf(), the C99 rationale document does discuss that "hh" was added, but merely refers the reader to the fscanf() section:
The %hh and %ll length modifiers were added in C99 (see §7.19.6.2).
I know it's a tenuous thread, but I'm speculating anyway, so I figured I'd give whatever argument there might be.
Also, for completeness, the "h" modifier was in the original C89 standard - presumably it would be there even if it wasn't strictly necessary because of widespread existing use, even if there might not have been a technical requirement to use the modifier.
In %...x mode, all values are interpreted as unsigned. Negative numbers are therefore printed as their unsigned conversions. In 2's complement arithmetic, which most processors use, there is no difference in bit patterns between a signed negative number and its positive unsigned equivalent, which is defined by modulus arithmetic (adding the maximum value for the field plus one to the negative number, according to the C99 standard). Lots of software- especially the debugging code most likely to use %x- makes the silent assumption that the bit representation of a signed negative value and its unsigned cast is the same, which is only true on a 2's complement machine.
The mechanics of this cast are such that hexadecimal representations of a value always imply, possibly inaccurately, that the number has been rendered in 2's complement, as long as it didn't hit an edge condition where the different integer representations have different ranges. This even holds true for arithmetic representations where the value 0 is not represented with the binary pattern of all 0s.
A negative short displayed as an unsigned long in hexadecimal will therefore, on any machine, be padded with f, due to implicit sign extension in the promotion, which printf will print. The value is the same, but it is truly visually misleading as to the size of the field, implying a significant amount of range that simply isn't present.
%hx truncates the displayed representation to avoid this padding, exactly as you concluded from your real-world use case.
The behavior of printf is undefined when passed an int outside the range of short that should be printed as a short, but the easiest implementation by far simply discards the high bit by a raw downcast, so while the spec doesn't require any specific behavior, pretty much any sane implementation is going to just perform the truncation. There're generally better ways to do that, though.
If printf isn't padding values or displaying unsigned representations of signed values, %h isn't very useful.
The only use I can think of is for passing an unsigned short or unsigned char and using the %x conversion specifier. You cannot simply use a bare %x - the value may be promoted to int rather than unsigned int, and then you have undefined behaviour.
Your alternatives are either to explicitly cast the argument to unsigned; or to use %hx / %hhx with a bare argument.
The variadic arguments to printf() et al are automatically promoted using the default conversions, so any short or char values are promoted to int when passed to the function.
In the absence of the h or hh modifiers, you would have to mask the values passed to get the correct behaviour reliably. With the modifiers, you no longer have to mask the values; the printf() implementation does the job properly.
Specifically, for the format %hx, the code inside printf() can do something like:
va_list args;
va_start(args, format);
...
int i = va_arg(args, int);
unsigned short s = (unsigned short)i;
...print s correctly, as 4 hex digits maximum
...even on a machine with 64-bit `int`!
I'm blithely assuming that short is a 16-bit quantity; the standard does not actually guarantee that, of course.
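For completeness, a caller-side comparison of masking by hand versus letting printf do the conversion (a sketch, assuming 16-bit short and 32-bit int):
#include <cstdio>

int main() {
    short s = -1;
    std::printf("%x\n", s & 0xFFFF);   // mask by hand: prints ffff
    std::printf("%hx\n", s);           // let printf convert back to unsigned short: also ffff
}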
I found it useful to avoid casting when formatting unsigned chars to hex:
sprintf_s(tmpBuf, 3, "%2.2hhx", *(CEKey + i));
It's a minor coding convenience, and looks cleaner than multiple casts (IMO).
Another place it's handy is the snprintf size check. GCC 7 added a size check when using snprintf, so this will fail:
char arr[4];
char x = 'r';
snprintf(arr, sizeof(arr), "%d", x);
So it forces you to use a bigger buffer when using %d to format a char. Here is a commit that shows those fixes: instead of increasing the char array size, they changed %d to %h. This also gives a more accurate description:
https://github.com/Mellanox/libvma/commit/b5cb1e34a04b40427d195b14763e462a0a705d23#diff-6258d0a11a435aa372068037fe161d24
I agree with you that it is not strictly necessary, and for that reason alone it is no good in a C library function :)
It might be "nice" for the symmetry of the different flags, but it is mostly counter-productive because it hides the "conversion to int" rule.

Is it safe to implicitly convert a `uint8_t` (read from a socket) to a `char`?

I am confused by the C++ conversion rules regarding unsigned-to-signed and vice versa.
I'm reading data from a socket and saving it in a std::vector<uint8_t>. I then need to read a part of it
(assuming it is ASCII data) and save it in a std::string. This is what I'm doing:
for (std::vector<uint8_t>::const_iterator it = payload.begin() + start; it < payload.begin() + end; ++it) {
    store_name.push_back(*it);
}
So as you can see, *it returns a uint8_t and passes it into the push_back member function of std::string, which takes a char - thus an implicit conversion occurs. char may in fact be either signed or unsigned. I'm not sure what happens if it is signed.
I cannot wrap (no pun intended) my head around what is happening here, and whether or not it is safe.
Does store_name.push_back(*it) change the bit-pattern of *it before storing it in the std::string?
What rules exactly govern this?
I've gone through many places online explaining type-conversion rules, but it still doesn't really stick with me. Explanations will be appreciated.
EDIT: As a different way to put it - in general, what happens when we cast unsigned to signed and vice versa?
unsigned char a = 50; // Inside the range of signed char
signed char b = (signed char) a;
Is the bit pattern in b required to be the same as the bit pattern in a? Or may the bit pattern change?
Also, what about the opposite direction:
a = (unsigned char) b;
Again - does a change to the bit pattern occur? Or is it guaranteed that the underlying bit pattern stays the same, no matter how many signed-unsigned conversion we do, as long as the value is in the correct range?
And does it matter if it's an explicit cast using (cstyle cast) or static_cast<>, or if it's an implicit cast by assignment?
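For concreteness, here is a self-contained sketch of what I'm doing, with a check that the byte values survive the round trip (the sample bytes are made up):
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

int main() {
    std::vector<uint8_t> payload = {0x48, 0x69, 0xF0};   // "Hi" plus one byte above 127
    std::string store_name;

    for (std::vector<uint8_t>::const_iterator it = payload.begin(); it != payload.end(); ++it) {
        store_name.push_back(*it);   // uint8_t -> char conversion happens here
    }

    // Converting each char back to uint8_t recovers the original byte values
    // (modulo 2^8; on real two's-complement machines the bit pattern never changed).
    for (std::size_t i = 0; i < payload.size(); ++i) {
        assert(static_cast<uint8_t>(store_name[i]) == payload[i]);
    }
}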
From implicit conversions - Numeric Conversion/Integral conversions:
To unsigned
If the destination type is unsigned, the resulting value is the
smallest unsigned value equal to the source value modulo 2^n, where n
is the number of bits used to represent the destination type. That is,
depending on whether the destination type is wider or narrower, signed
integers are sign-extended or truncated and unsigned
integers are zero-extended or truncated respectively.
To signed
If the destination type is signed, the value does not change if the
source integer can be represented in the destination type. Otherwise
the result is implementation-defined (until C++20) / the unique value of
the destination type equal to the source value modulo 2^n, where n is
the number of bits used to represent the destination type (since
C++20). (Note that this is different from signed integer arithmetic
overflow, which is undefined.)
So for values in range, nothing changes. Otherwise, I interpret it as follows: if your machine represents values as two's complement, there are no changes in the bits for conversion to unsigned (and from C++20 also to signed), while until C++20 the to-signed case is implementation-defined. (I am not sure why, but I assume most compilers do not change the value, even though they are allowed to.)
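To illustrate the quoted rules, a small sketch (the concrete values assume 8-bit chars, and the printed -16 assumes two's complement, which every real implementation uses):
#include <iostream>

int main() {
    unsigned char a = 0xF0;                            // 240, outside signed char's range
    signed char   b = static_cast<signed char>(a);     // implementation-defined until C++20;
                                                       // two's-complement machines give -16, same bits
    unsigned char c = static_cast<unsigned char>(b);   // to-unsigned is modulo 2^8: back to 240
    std::cout << int(b) << ' ' << int(c) << '\n';      // typically prints: -16 240
}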
Regarding cstyle-cast vs static-cast: cstyle-cast performs (link)
When the C-style cast expression is encountered, the compiler
attempts to interpret it as the following cast expressions, in this
order:
a) const_cast<new_type>(expression);
b) static_cast<new_type>(expression), with extensions: pointer or
reference to a derived class is additionally allowed to be cast to
pointer or reference to unambiguous base class (and vice versa) even
if the base class is inaccessible (that is, this cast ignores the
private inheritance specifier). Same applies to casting pointer to
member to pointer to member of unambiguous non-virtual base;
c) static_cast (with extensions) followed by const_cast;
d) reinterpret_cast<new_type>(expression);
e) reinterpret_cast followed by const_cast.
The first choice that satisfies the requirements of the respective cast operator is selected, even if it cannot be compiled.
So for signed<->unsigned conversions, cstyle-cast should be the same as static_cast.
For implicit conversion (implicit conversions - Order of the conversions)
Implicit conversion sequence consists of the following, in this order:
zero or one standard conversion sequence;
zero or one user-defined conversion;
zero or one standard conversion sequence.
, where
A standard conversion sequence consists of the following, in this
order:
zero or one conversion from the following set: lvalue-to-rvalue
conversion, array-to-pointer conversion, and function-to-pointer
conversion;
zero or one numeric promotion or numeric conversion;
zero or one function pointer conversion (since C++17);
zero or one qualification adjustment.
and numeric conversion is yet again the conversion quoted at the top.
static_cast itself converts between types using a combination of implicit and user-defined conversions (link). So there should not be any difference between the implicit and the explicit conversion.

size_t divided by int type conversion rules

When I am doing arithmetic operations with the size_t type (or unsigned long), how careful should I be about decorating integer constants with type suffixes? For example,
size_t a = 1111111;
if (a/2 > 0) ...;
What happens when compiler does the division? Does it treat 2 as integer or as unsigned integer? If the former, then what is the resulting type for (unsigned int)/(int)?
Should I always carefully write 'u' literals
if (a/2u > 0) ...;
for (a=amax; a >= 0u; a -= 3u) ...;
or will the compiler correctly guess that I want to use operations with unsigned integers?
2 is indeed treated as an int, which is then implicitly converted to size_t. In a mixed operation size_t / int, the unsigned type "wins" and signed type gets converted to unsigned one, assuming the unsigned type is at least as wide as the signed one. The result is unsigned, i.e. size_t in your case. (See Usual arithmetic conversions for details).
It is a better idea to just write it as a / 2. No suffixes, no type casts. Keep the code as type-independent as possible. Type names (and suffixes) belong in declarations, not in statements.
The C standard guarantees that size_t is an unsigned integer.
The literal 2 is always of type int.
The "usual artihmetic converstions guarantee that whenever an unsigned and a signed integer of the same size ("rank") are used as operands in a binary operation, the signed operand gets converted to unsigned type.
So the compiler actually interprets the expression like this:
a/(size_t)2 > (size_t)0
(The result of the > operator or any relational operator is always of type int though, as a special case for that group of operators.)
Should I always carefully write 'u' literals
Some coding standards, most notably MISRA-C, would have you do this, to make sure that no implicit type promotions exist in the code. Implicit promotions or conversions are very dangerous and they are a flaw in the C language.
For your specific case, there is no real danger with implicit promotions. But there are cases when small integer types are used and you might end up with unintentional change of signedness because of implicit type promotions.
There is never any harm with being explicit, although writing an u suffix to every literal in your code may arguably reduce readability.
Now what you really must do as a C programmer to deal with type promotion dangers, is to learn how the integer promotions and usual arithmetic conversions work (here's some example on the topic). Sadly, there are lots of C programmers who don't, veterans included. The result is subtle, but sometimes critical bugs. Particularly when using the bit-wise operators such as shifts, where change of signedness could invoke undefined behavior.
These rules can be somewhat tricky to learn as they aren't really behaving rationally or consistently. But until you know these rules in detail you have to be explicit with types.
EDIT: To be picky, the size of size_t is actually not specified; all the standard says is that it must be large enough to hold at least the value 65535 (2 bytes). So in theory, size_t could be equal to unsigned short, in which case the promotions would turn out quite different. But in practice I doubt that scenario is of any interest, since I don't believe there exists any implementation where size_t is smaller than unsigned int.
Both C++ and C promote signed types to unsigned types when evaluating an operator like division which takes two arguments, and one of the arguments is an unsigned type.
So the literal 2 will be converted to an unsigned type.
Personally, I believe it's better to leave promotion to the compiler rather than be explicit: if your code was ever refactored and a became a signed type then a / 2u would cause a to be promoted to an unsigned type, with potentially disastrous consequences.
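For illustration, a minimal sketch of that refactoring hazard (assuming 32-bit int):
#include <iostream>

int main() {
    int a = -7;                     // imagine the refactor that made `a` signed
    std::cout << a / 2  << '\n';    // -3: plain signed division
    std::cout << a / 2u << '\n';    // a converts to unsigned int first;
                                    // with 32-bit int this prints 2147483644
}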
size_t sz = 11;
sz / 2 = 5
sz / (-2) = 0
WHY?
sz is treated as unsigned int because size can not be negative. When doing an arithmetic operation with an unsigned int and an int, the int is turned into unsigned int.
From "CS dummies"

Postfix (or suffix) U, UL, ULL for negative constants

Is it good practice to use unsigned suffix for negative constants in C++?
For example, is it safe to use,
foo(-1ull);
instead of
foo(unsigned long long(-1));
It is not really used "for negative constant". Constants (literals) in C++ are always non-negative. So, what you have here is literal 1ull with unary - operator in front of it. This means that the exact semantics of your first variant can be directly expressed as
-(unsigned long long) 1
but not as
(unsigned long long) -1
(although both produce the same result in the end).
BTW, your second variant, as written, is syntactically incorrect. "Multiword" type names cannot be used in functional-style casts. unsigned(-1) is legal, but unsigned int(-1) is not. unsigned long long(-1) is also illegal.
But let's assume I understand what you tried to say by your second variant. Under that assumption, both variants do the same thing, so you can use either, depending on the personal preference or coding standard. The first one is obviously shorter though.
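For completeness, the equivalences can be checked at compile time (a sketch; the ull alias is mine, added only to make a functional-style cast legal):
#include <climits>

static_assert(-1ull == ULLONG_MAX, "unary minus on 1ull wraps around to the maximum value");
static_assert(static_cast<unsigned long long>(-1) == ULLONG_MAX, "the cast spelling gives the same value");

// A functional-style cast needs a single-word type name, so an alias makes the
// question's second variant legal:
using ull = unsigned long long;
static_assert(ull(-1) == ULLONG_MAX, "same value again");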