Order of int32_t to uint64_t casting - c++

Does the C++ standard guarantee whether integer conversion that both widens and casts away the sign will sign-extend or zero-extend?
The quick test:
int32_t s = -1;
uint64_t u = s;
produces 0xFFFFFFFFFFFFFFFF under Xcode, but is that defined behavior in the first place?

When you do
uint64_t u = s;
[dcl.init]/17.9 applies which states:
the initial value of the object being initialized is the (possibly converted) value of the initializer expression. A standard conversion sequence ([conv]) will be used, if necessary, to convert the initializer expression to the cv-unqualified version of the destination type; no user-defined conversions are considered.
and if we look in [conv], under integral conversions, we have
Otherwise, the result is the unique value of the destination type that is congruent to the source integer modulo 2^N, where N is the width of the destination type.
So you are guaranteed that -1 becomes the largest representable value, -2 becomes one less than that, -3 one less again, and so on: the value "wraps around".
In fact,
unsigned_type some_name = -1;
Is the canonical way to create a variable with the maximum value for that unsigned integer type.
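A minimal sketch of that guarantee, assuming a C++11 or later compiler. The helper names widen and check_quick_test are made up for illustration. The static_assert lines are evaluated at compile time, so the file only compiles if the conversion really is congruent modulo 2^64:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: performs the same conversion as `uint64_t u = s;`.
constexpr std::uint64_t widen(std::int32_t s) {
    return static_cast<std::uint64_t>(s);
}

// -1 maps to the maximum value, -2 to one less, and so on (modulo 2^64).
static_assert(widen(-1) == std::numeric_limits<std::uint64_t>::max(), "-1 wraps to max");
static_assert(widen(-2) == std::numeric_limits<std::uint64_t>::max() - 1, "-2 wraps to max - 1");
static_assert(widen(7) == 7u, "non-negative values are unchanged");

// Run-time version of the quick test from the question.
inline void check_quick_test() {
    std::int32_t s = -1;
    std::uint64_t u = s;
    assert(u == 0xFFFFFFFFFFFFFFFFull);
}
```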

You can find the standard verbiage in other answers.
But to help you form an intuitive mental model of widening conversions, it is helpful to think of these as a 2-step process:
1. Sign- or zero-extension of the value. If the value is of signed type, then sign extension is used. Here int32_t is sign-extended to int64_t. On x86, the signedness of the type determines whether the MOVSX or MOVZX instruction is used.
2. Converting the extended value to the destination type (change of signedness). Here int64_t is converted to uint64_t. It involves 0 assembly instructions, as registers are untyped: the compiler just treats that register, which contains the result of sign extension of int32_t, as uint64_t.
Note that the standard doesn't specify these steps, it just specifies the required result.
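The two-step model can be sketched as follows. The function names are hypothetical, and the code only checks that the mental model agrees with the single conversion. It says nothing about what instructions a compiler emits. The third function shows the opposite order (signedness first, widening second), which zero-extends instead:

```cpp
#include <cassert>
#include <cstdint>

// Step 1 then step 2: sign-extend to int64_t, then change signedness.
inline std::uint64_t two_step(std::int32_t s) {
    std::int64_t extended = s;                     // value-preserving widening
    return static_cast<std::uint64_t>(extended);   // reduction modulo 2^64
}

// The single conversion the question asks about.
inline std::uint64_t direct(std::int32_t s) {
    return static_cast<std::uint64_t>(s);
}

// For contrast: changing signedness first, then widening, zero-extends.
inline std::uint64_t unsigned_first(std::int32_t s) {
    return static_cast<std::uint64_t>(static_cast<std::uint32_t>(s));
}

inline void check_models() {
    assert(two_step(-1) == direct(-1));
    assert(direct(-1) == 0xFFFFFFFFFFFFFFFFull);
    assert(unsigned_first(-1) == 0x00000000FFFFFFFFull);
}
```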

From the section on Integral conversions:
[conv.integral/3]: Otherwise, the result is the unique value of the destination type that is congruent to the source integer modulo 2^N, where N is the width of the destination type.
In other words, the wrap-around "happens last".


Questions about C++20 two's-complement proposal R4

I am reading revision 4 of the two's-complement proposal (adopted by C++20), and I have some questions.
In the introduction, it says:
Status-quo Signed integer arithmetic remains non-commutative in general (though some implementations may guarantee that it is).
Does it really mean "non-commutative", as in a + b versus b + a? Or should that read "non-associative"?
It also says:
Change Conversion from signed to unsigned is always well-defined: the result is the unique value of the destination type that is congruent to the source integer modulo 2^N.
Hasn't signed-to-unsigned conversion been well-defined in precisely this way since the beginning of time? Should that read "conversion from unsigned to signed"?
Is there anything else in the list of changes that is missing or mis-stated?
Note that it wasn't P0907 that was adopted - it was P1236.
Or should that read "non-associative"?
Yes.
Should that read "conversion from unsigned to signed"?
Yes. If you look at P1236R1, you can see that the rule changed from:
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type).
If the destination type is signed, the value is unchanged if it can be represented in the destination type; otherwise, the value is implementation-defined.
to:
Otherwise, the result is the unique value of the destination type that is congruent to the source integer modulo 2^N, where N is the range exponent of the destination type.
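As an illustration of what the new wording pins down for unsigned-to-signed conversion, here is a small sketch. The helper wrap_to_i32 is hypothetical: it computes the congruent value by hand, without ever converting an out-of-range value, so it can be compared against the built-in conversion. Before C++20 the built-in result was implementation-defined, although mainstream compilers already behaved this way:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the value congruent to u modulo 2^32 that lies in
// the range of int32_t, computed without any out-of-range conversion.
inline std::int32_t wrap_to_i32(std::uint32_t u) {
    if (u <= 0x7FFFFFFFu) {
        return static_cast<std::int32_t>(u);   // representable: unchanged
    }
    const std::int64_t two_to_32 = 4294967296LL;
    return static_cast<std::int32_t>(static_cast<std::int64_t>(u) - two_to_32);
}

inline void check_unsigned_to_signed() {
    // Guaranteed by the C++20 wording; implementation-defined before that.
    assert(static_cast<std::int32_t>(0xFFFFFFFFu) == wrap_to_i32(0xFFFFFFFFu));
    assert(wrap_to_i32(0xFFFFFFFFu) == -1);
    assert(wrap_to_i32(0x80000000u) == INT32_MIN);
}
```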

Question on C++ undefined behavior. Casting between uint8 and int8 [duplicate]

Suppose I assign an eleven digits number to an int, what will happen? I played around with it a little bit and I know it's giving me some other numbers within the int range. How is this new number created?
It is implementation-defined behaviour. This means that your compiler must provide documentation saying what happens in this scenario.
So, consult that documentation to get your answer.
A common way that implementations define it is to truncate the input integer to the number of bits of int (after reinterpreting unsigned as signed if necessary).
C++14 Standard references: [expr.ass]/3, [conv.integral]/3
In C++20, this behavior will still be implementation-defined[1], but the requirements are much stronger than before. This is a side-consequence of the requirement to use two's complement representation for signed integer types coming in C++20.
This kind of conversion is specified by [conv.integral]:
A prvalue of an integer type can be converted to a prvalue of another integer type. [...]
[...]
Otherwise, the result is the unique value of the destination type that is congruent to the source integer modulo 2^N, where N is the width of the destination type.
[...]
This behaviour is the same as truncating the representation of the number to the width of the integer type you are assigning to, e.g.:
int32_t u = 0x6881736df7939752;
...will keep the 32 right-most bits, i.e., 0xf7939752, and "copy" these bits to u. In two's complement, this corresponds to -141322414.
[1] This will still be implementation-defined because the size of int is implementation-defined. If you assign to, e.g., int32_t, then the behaviour is fully defined.
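The truncation example above can be checked with a short sketch. The function names are made up for illustration. The signed result is guaranteed from C++20 on and is what common compilers already produced earlier:

```cpp
#include <cassert>
#include <cstdint>

// Narrowing to int32_t keeps the value modulo 2^32 (guaranteed since C++20).
inline std::int32_t narrow_to_i32(std::int64_t v) {
    return static_cast<std::int32_t>(v);
}

// The same 32 low-order bits, viewed as an unsigned value.
inline std::uint32_t low_32_bits(std::int64_t v) {
    return static_cast<std::uint32_t>(v);
}

inline void check_truncation() {
    const std::int64_t big = 0x6881736df7939752LL;
    assert(low_32_bits(big) == 0xf7939752u);
    assert(narrow_to_i32(big) == -141322414);
}
```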
Prior to C++20, there were two different rules for unsigned and signed types:
A prvalue of an integer type can be converted to a prvalue of another integer type. [...]
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [...]
If the destination type is signed, the value is unchanged if it can be represented in the destination type; otherwise, the value is implementation-defined.
INT_MAX on a 32 bit system is 2,147,483,647 (2^31 − 1), UINT_MAX is 4,294,967,295 (2^32 − 1).
int thing = 21474836470;
What happens is implementation-defined; it's up to the compiler. Mine appears to truncate the higher bits: 21474836470 is 0x4fffffff6, and keeping the low 32 bits leaves 0xfffffff6, which is -10 as a 32-bit int:
warning: implicit conversion from 'long' to 'int' changes
value from 21474836470 to -10 [-Wconstant-conversion]
int thing = 21474836470;
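The arithmetic behind that -10 can be reproduced without relying on any narrowing conversion. The sketch below uses a hypothetical helper wrap_signed that reduces a value modulo 2^bits and then maps it into the signed range of that width, which is the result a truncating two's complement implementation gives:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the unique value congruent to `value` modulo 2^bits
// that lies in [-2^(bits-1), 2^(bits-1) - 1].
inline std::int64_t wrap_signed(std::int64_t value, unsigned bits) {
    assert(bits >= 1 && bits <= 62);             // keeps 2^bits inside int64_t
    const std::int64_t modulus = std::int64_t{1} << bits;
    std::int64_t r = value % modulus;            // in (-modulus, modulus)
    if (r < 0) r += modulus;                     // now in [0, modulus)
    if (r >= modulus / 2) r -= modulus;          // now in the signed range
    return r;
}
```

For example, wrap_signed(21474836470LL, 32) first reduces the value to 4294967286 (0xfffffff6) and then subtracts 2^32, giving -10.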

What type is used in C++ to define an array size?

Compiling some test code in avr-gcc for an 8-bit micro-controller, the line
const uint32_t N = 65537;
uint8_t values[N];
I got the following compilation warning (by default should be an error, really)
warning: conversion from 'long unsigned int' to 'unsigned int' changes value from '65537' to '1' [-Woverflow]
uint8_t values[N];
Note that when compiling for this target, sizeof(int) is 2.
So it seems that an array size cannot exceed the size of an unsigned int.
Am I correct? Is this GCC-specific or is it part of some C or C++ standard?
Before somebody remarks that an 8-bit microcontroller generally does not have enough memory for an array so large, let me just anticipate saying that this is beside the point.
size_t is considered as the type to use, despite not being formally ratified by either the C or C++ standards.
The rationale for this is that sizeof(values) will be of that type (that is mandated by the C and C++ standards), and the number of elements will necessarily be no greater than this, since sizeof for an object is at least 1.
So it seems that an array size cannot exceed the size of an
unsigned int.
That seems to be the case in your particular C[++] implementation.
Am I correct? Is this gcc-specific or is it part of some C or C++
standard?
It is not a characteristic of GCC in general, nor is it specified by either the C or C++ standard. It is a characteristic of your particular implementation: a version of GCC for your specific computing platform.
The C standard requires the expression designating the number of elements of an array to have an integer type, but it does not specify a particular one. I do think it's strange that your GCC seems to claim it's giving you an array with a different number of elements than you specified. I don't think that conforms to the standard, and I don't think it makes much sense as an extension. I would prefer to see it reject the code instead.
I'll dissect the issue with the rules in the "incorrekt and incomplet" ISO CPP standard draft n4659. Emphasis is added by me.
11.3.4 defines array declarations. Paragraph one contains
If the constant-expression [between the square brackets] (8.20) is present, it shall be a converted constant expression of type std::size_t [...].
std::size_t is from <cstddef> and is defined as
[...] an implementation-defined unsigned integer type that is large enough to contain the size in bytes of any object.
Since it is imported via the C standard library headers the C standard is relevant for the properties of size_t. The ISO C draft N2176 prescribes in 7.20.3 the "minimal maximums", if you want, of integer types. For size_t that maximum is 65535. In other words, a 16 bit size_t is entirely conformant.
A "converted constant expression" is defined in 8.20/4:
A converted constant expression of type T is an expression, implicitly converted to type T, where the converted expression is a constant expression and the implicit conversion sequence contains only [any of 10 distinct conversions, one of which concerns integers (par. 4.7):]
— integral conversions (7.8) other than narrowing conversions (11.6.4)
An integral conversion (as opposed to a promotion which changes the type to equivalent or larger types) is defined as follows (7.8/3):
A prvalue of an integer type can be converted to a prvalue of another integer type.
7.8/5 then excludes the integral promotions from the integral conversions. This means that the conversions are usually narrowing type changes.
Narrowing conversions (which, as you'll remember, are excluded from the list of allowed conversions in converted constant expressions used for array sizes) are defined in the context of list-initialization, 11.6.4, par. 7
A narrowing conversion is an implicit conversion
[...]
(7.3)[1] — from an integer type [...] to an integer type that cannot represent all the values of the original type, except where the source is a constant expression whose value after integral promotions will fit into the target type.
This is effectively saying that the effective array size must be the constant value at display, which is an entirely reasonable requirement for avoiding surprises.
Now let's cobble it all together. The working hypothesis is that std::size_t is a 16 bit unsigned integer type with a value range of 0..65535. The integer literal 65537 is not representable in the system's 16 bit int and thus has type long. Therefore it will undergo an integer conversion. This will be a narrowing conversion because the value is not representable in the 16 bit size_t[2], so that the exception condition in 11.6.4/7.3, "value fits anyway", does not apply.
So what does this mean?
11.6.4/3.11 is the catch-all rule for the failure to produce an initializer value from an item in an initializer list. Because the initializer-list rules are used for array sizes, we can assume that the catch-all for conversion failure applies to the array size constant:
(3.11) — Otherwise, the program is ill-formed.
A conformant compiler is required to produce a diagnostic, which it does. Case closed.
[1] Yes, they sub-divide paragraphs.
[2] Converting an integer value of 65537 (in whatever type can hold the number, here probably a long) to a 16 bit unsigned integer is a defined operation. 7.8/2 details:
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source
integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two’s
complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is
no truncation). —end note ]
The binary representation of 65537 is 1_0000_0000_0000_0001, i.e. only the least significant bit of the lower 16 bits is set. The conversion to a 16 bit unsigned value (which circumstantial evidence indicates size_t is) computes the [expression value] modulo 2^16, i.e. simply takes the lower 16 bits. This results in the value of 1 mentioned in the compiler diagnostics.
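A rough sketch of the two facts used above, with std::uint16_t standing in for the 16 bit size_t of the working hypothesis (an assumption, since on a desktop compiler size_t is wider). The helper fits_in is hypothetical and only mirrors the "value fits anyway" test; it is not how a compiler implements the narrowing rule:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical check: does `value` fit in the unsigned type Target unchanged?
template <typename Target>
constexpr bool fits_in(unsigned long long value) {
    return value <= static_cast<unsigned long long>(std::numeric_limits<Target>::max());
}

// uint16_t stands in for the 16 bit size_t assumed above.
static_assert(!fits_in<std::uint16_t>(65537ULL), "65537 needs more than 16 bits");
static_assert(fits_in<std::uint16_t>(65535ULL), "65535 is the largest 16 bit value");

inline void check_modulo_result() {
    const std::uint32_t n = 65537;
    assert(static_cast<std::uint16_t>(n) == 1);   // 65537 mod 2^16
}
```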
In your implementation size_t is defined as unsigned int and uint32_t is defined as a long unsigned int. When you create a C array the argument for the array size gets implicitly converted to size_t by the compiler.
This is why you're getting a warning. You're specifying the array size argument with an uint32_t that gets converted to size_t and these types don't match.
This is probably not what you want. Use size_t instead.
The value returned by sizeof will be of type size_t.
It is generally used as the number of elements in an array, because it will be of sufficient size. size_t is always unsigned but it is implementation-defined which type this is. Lastly, it is implementation-defined whether the implementation can support objects of even SIZE_MAX bytes... or even close to it.
[This answer was written when the question was tagged with C and C++. I have not yet re-examined it in light of OP’s revelation they are using C++ rather than C.]
size_t is the type the C standard designates for working with object sizes. However, it is not a cure-all for getting sizes correct.
size_t should be defined in the <stddef.h> header (and also in other headers).
The C standard does not require that expressions for array sizes, when specified in declarations, have the type size_t, nor does it require that they fit in a size_t. It is not specified what a C implementation ought to do when it cannot satisfy a request for an array size, especially for variable length arrays.
In your code:
const uint32_t N = 65537;
uint8_t values[N];
values is declared as a variable length array. (Although we can see the value of N could easily be known at compile time, it does not fit C’s definition of a constant expression, so uint8_t values[N]; qualifies as a declaration of a variable length array.) As you observed, GCC warns you that the 32-bit unsigned integer N is narrowed to a 16-bit unsigned integer. This warning is not required by the C standard; it is a courtesy provided by the compiler. More than that, the conversion is not required at all—since the C standard does not specify the type for an array dimension, the compiler could accept any integer expression here. So the fact that it has inserted an implicit conversion to the type it needs for array dimensions and warned you about it is a feature of the compiler, not of the C standard.
Consider what would happen if you wrote:
size_t N = 65537;
uint8_t values[N];
Now there would be no warning in uint8_t values[N];, as a 16-bit integer (the width of size_t in your C implementation) is being used where a 16-bit integer is needed. However, in this case, your compiler likely warns in size_t N = 65537;, since 65537 will have a 32-bit integer type, and a narrowing conversion is performed during the initialization of N.
However, the fact that you are using a variable length array suggests you may be computing array sizes at run-time, and this is only a simplified example. Possibly your actual code does not use constant sizes like this; it may calculate sizes during execution. For example, you might use:
size_t N = NumberOfGroups * ElementsPerGroup + Header;
In this case, there is a possibility that the wrong result will be calculated. If the variables all have type size_t, the result may easily wrap (effectively overflow the limits of the size_t type). In this case, the compiler will not give you any warning, because the values are all the same width; there is no narrowing conversion, just overflow.
Therefore, using size_t is insufficient to guard against errors in array dimensions.
An alternative is to use a type you expect to be wide enough for your calculations, perhaps uint32_t. Given NumberOfGroups and such as uint32_t types, then:
const uint32_t N = NumberOfGroups * ElementsPerGroup + Header;
will produce a correct value for N. Then you can test it at run-time to guard against errors:
if ((size_t) N != N)
Report error…
uint8_t values[(size_t) N];
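The same guard can be sketched in C++ as a helper function. Everything here is an assumption made for illustration: the name checked_array_size, the use of 64 bit intermediate arithmetic (the text above uses uint32_t), and the extra limit parameter, which exists only so that a 16 bit size_t can be simulated on a machine with a wider one:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: computes groups * per_group + header in 64 bits and
// stores it in `out` only if it does not exceed `limit` or SIZE_MAX.
inline bool checked_array_size(std::uint32_t groups, std::uint32_t per_group,
                               std::uint32_t header, std::size_t& out,
                               std::uint64_t limit = std::numeric_limits<std::size_t>::max()) {
    // (2^32 - 1)^2 + (2^32 - 1) < 2^64, so this cannot wrap in uint64_t.
    const std::uint64_t total =
        static_cast<std::uint64_t>(groups) * per_group + header;
    if (total > limit || total > std::numeric_limits<std::size_t>::max()) {
        return false;   // would not fit: report instead of silently wrapping
    }
    out = static_cast<std::size_t>(total);
    return true;
}

inline void check_size_guard() {
    std::size_t n = 0;
    assert(checked_array_size(10, 10, 1, n) && n == 101);
    assert(!checked_array_size(256, 256, 1, n, 65535));   // 65537 > 65535
}
```

On failure the output parameter is left untouched, so the caller can report the error before any array is created.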

Aliasing of otherwise equivalent signed and unsigned types

The C and C++ standards both allow signed and unsigned variants of the same integer type to alias each other. For example, unsigned int* and int* may alias. But that's not the whole story because they clearly have a different range of representable values. I have the following assumptions:
If an unsigned int is read through an int*, the value must be within the range of int or an integer overflow occurs and the behaviour is undefined. Is this correct?
If an int is read through an unsigned int*, negative values wrap around as if they were casted to unsigned int. Is this correct?
If the value is within the range of both int and unsigned int, accessing it through a pointer of either type is fully defined and gives the same value. Is this correct?
Additionally, what about compatible but not equivalent integer types?
On systems where int and long have the same range, alignment, etc., can int* and long* alias? (I assume not.)
Can char16_t* and uint_least16_t* alias? I suspect this differs between C and C++. In C, char16_t is a typedef for uint_least16_t (correct?). In C++, char16_t is its own primitive type, which is compatible with uint_least16_t. Unlike C, C++ seems to have no exception allowing compatible but distinct types to alias.
If an unsigned int is read through an int*, the value must be
within the range of int or an integer overflow occurs and the
behaviour is undefined. Is this correct?
Why would it be undefined? There is no integer overflow, since no conversion or computation is done. We take the object representation of an unsigned int object and view it through an int. How the value of the unsigned int object maps to the value of an int is completely implementation-defined.
If an int is read through an unsigned int*, negative values wrap
around as if they were casted to unsigned int. Is this correct?
Depends on the representation. With two's complement and equivalent padding, yes. Not with signed magnitude though - a cast from int to unsigned is always defined through a congruence:
If the destination type is unsigned, the resulting value is the
least unsigned integer congruent to the source integer (modulo
2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two’s complement representation, this
conversion is conceptual and there is no change in the bit pattern (if
there is no truncation). — end note ]
And now consider
10000000 00000001 // -1 in signed magnitude for 16-bit int
This would certainly be 2^15+1 if interpreted as an unsigned. A cast would yield 2^16-1 though.
If the value is within the range of both int and unsigned int,
accessing it through a pointer of either type is fully defined and
gives the same value. Is this correct?
Again, with two's complement and equivalent padding, yes. With signed magnitude we might have -0.
On systems where int and long have the same range, alignment,
etc., can int* and long* alias? (I assume not.)
No. They are independent types.
Can char16_t* and uint_least16_t* alias?
Technically not, but that seems to be an unnecessary restriction of the standard.
Types char16_t and char32_t denote distinct types with the same
size, signedness, and alignment as uint_least16_t and
uint_least32_t, respectively, in <cstdint>, called the underlying
types.
So it should be practically possible without any risks (since there shouldn't be any padding).
If an int is read through an unsigned int*, negative values wrap around as if they were casted to unsigned int. Is this correct?
For a system using two's complement, type-punning and signed-to-unsigned conversion are equivalent, for example:
int n = ...;
unsigned u1 = (unsigned)n;
unsigned u2 = *(unsigned *)&n;
Here, both u1 and u2 have the same value. This is by far the most common setup (e.g. GCC documents this behaviour for all its targets). However, the C standard also addresses machines using ones' complement or sign-magnitude to represent signed integers. In such an implementation (assuming no padding bits and no trap representations), converting an integer value and type-punning it can yield different results. As an example, assume sign-magnitude and n being initialized to -1:
int n = -1; /* 10000000 00000001 assuming 16-bit integers*/
unsigned u1 = (unsigned)n; /* 11111111 11111111
effectively 2's complement, UINT_MAX */
unsigned u2 = *(unsigned *)&n; /* 10000000 00000001
only reinterpreted, the value is now INT_MAX + 2u */
Conversion to an unsigned type means adding/subtracting one more than the maximum value of that type until the value is in range. Dereferencing a converted pointer simply reinterprets the bit pattern. In other words, the conversion in the initialization of u1 is a no-op on 2's complement machines, but requires some calculations on other machines.
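Since sign-magnitude hardware is hard to come by, the difference can be simulated. The sketch below is an illustration only: decode_sign_magnitude is a made-up function that interprets a 16-bit pattern the way such a machine would, while convert_value performs the real value conversion defined by the standard:

```cpp
#include <cassert>
#include <cstdint>

// Simulated 16-bit sign-and-magnitude decoding:
// the top bit is the sign, the low 15 bits are the magnitude.
inline int decode_sign_magnitude(std::uint16_t bits) {
    const int magnitude = bits & 0x7FFF;
    return (bits & 0x8000) ? -magnitude : magnitude;
}

// Value conversion: defined as reduction modulo 2^16.
inline std::uint16_t convert_value(int value) {
    return static_cast<std::uint16_t>(value);
}

inline void check_conversion_vs_punning() {
    const std::uint16_t pattern = 0x8001;           // 10000000 00000001
    assert(decode_sign_magnitude(pattern) == -1);   // the stored int value
    assert(convert_value(-1) == 0xFFFF);            // like (unsigned)n: the maximum
    assert(pattern == 32767 + 2);                   // raw bits: INT_MAX + 2 for 16 bits
}
```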
If an unsigned int is read through an int*, the value must be within the range of int or an integer overflow occurs and the behaviour is undefined. Is this correct?
Not exactly. The bit pattern must represent a valid value in the new type, it doesn't matter if the old value is representable. From C11 (n1570) [omitted footnotes]:
6.2.6.2 Integer types
For unsigned integer types other than unsigned char, the bits of the object representation shall be divided into two groups: value bits and padding bits (there need not be any of the latter). If there are N value bits, each bit shall represent a different power of 2 between 1 and 2^(N-1), so that objects of that type shall be capable of representing values from 0 to 2^N - 1 using a pure binary representation; this shall be known as the value representation. The values of any padding bits are unspecified.
For signed integer types, the bits of the object representation shall be divided into three groups: value bits, padding bits, and the sign bit. There need not be any padding bits; signed char shall not have any padding bits. There shall be exactly one sign bit. Each bit that is a value bit shall have the same value as the same bit in the object representation of the corresponding unsigned type (if there are M value bits in the signed type and N in the unsigned type, then M≤N). If the sign bit is zero, it shall not affect the resulting value. If the sign bit is one, the value shall be modified in one of the following ways:
the corresponding value with sign bit 0 is negated (sign and magnitude);
the sign bit has the value -(2^M) (two's complement);
the sign bit has the value -(2^M - 1) (ones' complement).
Which of these applies is implementation-defined, as is whether the value with sign bit 1 and all value bits zero (for the first two), or with sign bit and all value bits 1 (for ones' complement), is a trap representation or a normal value. In the case of sign and magnitude and ones' complement, if this representation is a normal value it is called a negative zero.
E.g., an unsigned int could have value bits, where the corresponding signed type (int) has a padding bit, something like unsigned u = ...; int n = *(int *)&u; may result in a trap representation on such a system (reading of which is undefined behaviour), but not the other way round.
If the value is within the range of both int and unsigned int, accessing it through a pointer of either type is fully defined and gives the same value. Is this correct?
I think, the standard would allow for one of the types to have a padding bit, which is always ignored (thus, two different bit patterns can represent the same value and that bit may be set on initialization) but be an always-trap-if-set bit for the other type. This leeway, however, is limited at least by ibid. p5:
The values of any padding bits are unspecified. A valid (non-trap) object representation of a signed integer type where the sign bit is zero is a valid object representation of the corresponding unsigned type, and shall represent the same value. For any integer type, the object representation where all the bits are zero shall be a representation of the value zero in that type.
On systems where int and long have the same range, alignment, etc., can int* and long* alias? (I assume not.)
Sure they can, if you don't use them ;) But no, the following is invalid on such platforms:
int n = 42;
long l = *(long *)&n; // UB
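When the goal is only to look at the same bytes as another type of equal size, copying the object representation avoids the aliasing question altogether. A minimal sketch, assuming C++11. The helper name copy_bits is hypothetical, and the static_assert rejects pairs such as int and long on platforms where their sizes differ:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: copies the object representation instead of aliasing it.
template <typename To, typename From>
To copy_bits(const From& from) {
    static_assert(sizeof(To) == sizeof(From), "types must have the same size");
    static_assert(std::is_trivially_copyable<To>::value &&
                  std::is_trivially_copyable<From>::value,
                  "types must be trivially copyable");
    To to;
    std::memcpy(&to, &from, sizeof(To));
    return to;
}

inline void check_copy_bits() {
    const std::int32_t n = 42;
    assert(copy_bits<std::uint32_t>(n) == 42u);
    const std::int32_t m = -1;   // int32_t is two's complement by definition
    assert(copy_bits<std::uint32_t>(m) == 0xFFFFFFFFu);
}
```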
Can char16_t* and uint_least16_t* alias? I suspect this differs between C and C++. In C, char16_t is a typedef for uint_least16_t (correct?). In C++, char16_t is its own primitive type, which is compatible with uint_least16_t. Unlike C, C++ seems to have no exception allowing compatible but distinct types to alias.
I'm not sure about C++, but at least for C, char16_t is a typedef, but not necessarily for uint_least16_t, it could very well be a typedef of some implementation-specific __char16_t, some type incompatible with uint_least16_t (or any other type).
What happens is not defined, since the C standard does not specify exactly how signed integers are stored, so you cannot rely on the internal representation. No overflow occurs either: if you just cast a pointer, nothing happens other than the binary data being interpreted differently in the following calculations.
Edit
Oh, I misread the phrase "but not equivalent integer types", but I'll keep the paragraph for your interest:
Your second question has much more trouble in it. Many machines can only read from correctly aligned addresses, where the data has to lie on multiples of the type's width. If you read an int32 from an address that is not divisible by 4 (because you cast a 2-byte int pointer), your CPU may crash.
You should not rely on the sizes of types. If you chose another compiler or platform your long and int may not match anymore.
Conclusion:
Do not do this. You wrote highly platform dependent (compiler, target machine, architecture) code that hides its errors behind casts that suppress any warnings.
Concerning your questions regarding unsigned int* and int*: if the
value in the actual type doesn't fit in the type you're reading, the
behavior is undefined, simply because the standard neglects to define
any behavior in this case, and any time the standard fails to define
behavior, the behavior is undefined. In practice, you'll almost always
obtain a value (no signals or anything), but the value will vary
depending on the machine: a machine with signed magnitude or 1's
complement, for example, will result in different values (both ways)
from the usual 2's complement.
For the rest, int and long are different types, regardless of their
representations, and int* and long* cannot alias. Similarly, as you
say, char16_t is a distinct type in C++, but a typedef in
C (so the rules concerning aliasing are different).

converting -1 to unsigned types

Consider the following code to set all bits of x
unsigned int x = -1;
Is this portable? It seems to work on at least Visual Studio 2005-2010.
The citation-heavy answer:
I know there are plenty of correct answers in here, but I'd like to add a few citations to the mix. I'll cite two standards: C99 n1256 draft (freely available) and C++ n1905 draft (also freely available). There's nothing special about these particular standards, they're just both freely available and whatever happened to be easiest to find at the moment.
The C++ version:
§5.3.2 ¶9: According to this paragraph, the value ~(type)0 is guaranteed to have all bits set, if (type) is an unsigned type.
The operand of ~ shall have integral or enumeration type; the result is the one’s complement of its operand. Integral promotions are performed. The type of the result is the type of the promoted operand.
§3.9.1 ¶4: This explains how overflow works with unsigned numbers.
Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer.
§3.9.1 ¶7, plus footnote 49: This explains that numbers must be binary. From this, we can infer that ~(type)0 must be the largest number representable in type (since it has all bits turned on, and each bit is additive).
The representations of integral types shall define values by use of a pure
binary numeration system[49].
49) A positional representation for integers that uses the binary digits 0 and 1, in which the values represented by successive bits are additive, begin
with 1, and are multiplied by successive integral power of 2, except perhaps for the bit with the highest position. (Adapted from the American National
Dictionary for Information Processing Systems.)
Since arithmetic is done modulo 2^n, it is guaranteed that (type)-1 is the largest value representable in that type. It is also guaranteed that ~(type)0 is the largest value representable in that type. They must therefore be equal.
The C99 version:
The C99 version spells it out in a much more compact, explicit way.
§6.5.3 ¶3:
The result of the ~ operator is the bitwise complement of its (promoted) operand (that is,
each bit in the result is set if and only if the corresponding bit in the converted operand is
not set). The integer promotions are performed on the operand, and the result has the
promoted type. If the promoted type is an unsigned type, the expression ~E is equivalent
to the maximum value representable in that type minus E.
As in C++, unsigned arithmetic is guaranteed to be modular (I think I've done enough digging through standards for now), so the C99 standard definitely guarantees that ~(type)0 == (type)-1, and we know from §6.5.3 ¶3 that ~(type)0 must have all bits set.
The summary:
Yes, it is portable. unsigned type x = -1; is guaranteed to have all bits set according to the standard.
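The equality ~(type)0 == (type)-1 can be checked at compile time for several unsigned types. A minimal sketch, assuming C++11. The function names are made up, and the extra cast in by_complement undoes the integral promotion that ~ applies to types narrower than int:

```cpp
#include <cassert>
#include <climits>

// (type)-1: conversion of -1 to the unsigned type U.
template <typename U>
constexpr U by_minus_one() { return static_cast<U>(-1); }

// ~(type)0: bitwise complement of zero, cast back after integral promotion.
template <typename U>
constexpr U by_complement() { return static_cast<U>(~static_cast<U>(0)); }

static_assert(by_minus_one<unsigned char>() == UCHAR_MAX, "unsigned char");
static_assert(by_minus_one<unsigned int>() == UINT_MAX, "unsigned int");
static_assert(by_minus_one<unsigned long long>() == ULLONG_MAX, "unsigned long long");
static_assert(by_complement<unsigned char>() == by_minus_one<unsigned char>(), "same value");
static_assert(by_complement<unsigned int>() == by_minus_one<unsigned int>(), "same value");

inline void check_all_ones() {
    unsigned int x = -1;
    assert(x == UINT_MAX);
    assert(x == ~0U);
}
```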
Footnote: Yes, we are talking about value bits and not padding bits. I doubt that you need to set padding bits to one, however. You can see from a recent Stack Overflow question that GCC was ported to the PDP-10 where the long long type has a single padding bit. On such a system, unsigned long long x = -1; may not set that padding bit to 1. However, you would only be able to discover this if you used pointer casts, which isn't usually portable anyway.
Apparently it is:
(4.7) If the destination type is unsigned, the resulting value is the least
unsigned integer congruent to the source integer (modulo 2^n where n is
the number of bits used to represent the unsigned type). [Note: In a
two’s complement representation, this conversion is conceptual and
there is no change in the bit pattern (if there is no truncation).
It is guaranteed to be the largest amount possible for that type due to the properties of modulo.
C99 also allows it:
Otherwise, if the newtype is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that
can be represented in the newtype until the value is in the range of
the newtype. 49)
Which would also be the largest value possible.
Largest amount possible may not be all bits set. Use ~static_cast<unsigned int>(0) for that.
I was sloppy in reading the question, and made several comments that might be misleading because of that. I'll try to clear up the confusion in this answer.
The declaration
unsigned int x = -1;
is guaranteed to set x to UINT_MAX, the maximum value of type unsigned int. The expression -1 is of type int, and it's implicitly converted to unsigned int. The conversion (which is defined in terms of values, not representations) results in the maximum value of the target unsigned type.
(It happens that the semantics of the conversion are optimized for two's-complement systems; for other schemes, the conversion might involve something more than just copying the bits.)
But the question referred to setting all bits of x. So, is UINT_MAX represented as all-bits-one?
There are several possible representations for signed integers (two's-complement is most common, but ones'-complement and sign-and-magnitude are also possible). But we're dealing with an unsigned integer type, so the way that signed integers are represented is irrelevant.
Unsigned integers are required to be represented in a pure binary format. Assuming that all the bits of the representation contribute to the value of an unsigned int object, then yes, UINT_MAX must be represented as all-bits-one.
On the other hand, integer types are allowed to have padding bits, bits that don't contribute to the representation. For example, it's legal for unsigned int to be 32 bits, but for only 24 of those bits to be value bits, so UINT_MAX would be 2**24-1 rather than 2**32-1. So in the most general case, all you can say is that
unsigned int x = -1;
sets all the value bits of x to 1.
In practice, very very few systems have padding bits in integer types. So on the vast majority of systems, unsigned int has a size of N bits, and a maximum value of 2**N-1, and the above declaration will set all the bits of x to 1.
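Whether a given platform has padding bits can be probed by comparing the number of value bits with the size of the object representation. The sketch below uses hypothetical helper names. On mainstream platforms has_padding_bits<unsigned int>() is expected to be false, but that is an assumption about the platform, not something the language guarantees:

```cpp
#include <cassert>
#include <climits>
#include <limits>

// True if U has fewer value bits than bits in its object representation.
template <typename U>
constexpr bool has_padding_bits() {
    return std::numeric_limits<U>::digits != static_cast<int>(sizeof(U) * CHAR_BIT);
}

// Counts the value bits of x that are set to 1.
inline int count_set_bits(unsigned int x) {
    int count = 0;
    while (x != 0) {
        count += static_cast<int>(x & 1u);
        x >>= 1;
    }
    return count;
}

inline void check_value_bits() {
    unsigned int x = -1;
    assert(x == UINT_MAX);
    // Every value bit is set, whatever the number of padding bits.
    assert(count_set_bits(x) == std::numeric_limits<unsigned int>::digits);
}
```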
This:
unsigned int x = ~0U;
will also set x to UINT_MAX, since bitwise complement for unsigned types is defined in terms of subtraction.
Beware!
This is implementation-defined, as how a negative integer shall be represented, whether two's complement or what, is not defined by the C++ Standard. It is up to the compiler which makes the decision, and has to document it properly.
In short, it is not portable. It may not set all bits of x.