Is signed to unsigned conversion, and back, defined behaviour for integers? - c++

#include <cstdint>
#include <iostream>
int main() {
    uint32_t i = -64;
    int32_t j = i;
    std::cout << j;
    return 0;
}
Most compilers I've tried will create programs that output -64, but is this defined behaviour?
Is the assignment of a signed integer to an unsigned integer uint32_t i = -64; defined behaviour?
Is the signed integer assignment int32_t j = i;, when i equals 4294967232, defined behaviour?

For unsigned integer out-of-range conversion, the result is defined; for signed integers, it's implementation-defined.
C++11(ISO/IEC 14882:2011) §4.7 Integral conversions [conv.integral/2]
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). —end note ]
If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.
This text remains the same for C++14.
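For a side-by-side illustration of the two quoted rules, here is a small sketch (only the unsigned result is guaranteed; the signed result shown is the usual two's-complement outcome, which only becomes mandatory in C++20):
#include <cstdint>
#include <iostream>
int main() {
    uint32_t u = -64;         // well-defined: -64 mod 2^32 == 4294967232
    int32_t  s = u;           // implementation-defined before C++20; usually -64
    std::cout << u << ' ' << s << '\n';  // typically prints "4294967232 -64"
    return 0;
}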

The Standard requires that implementations document, somehow, how they will determine what value to use when an integer is converted to a signed type which is too small to accommodate it. It does not specify the form such documentation will take. A conforming implementation's documentation could specify in readable print that values will be truncated and two's-complement sign extended, and then in impossibly small print specify "...except when a program is compiled on the fifth Tuesday of a month, in which case out-of-range conversions will yield the value 24601". Such documentation would, of course, be less than helpful, but the Standard does not concern itself with "quality of implementation" issues.
In practice, implementations that define the behavior in any fashion other than 100% consistent truncation and two's-complement sign extension are extremely rare; I would not be particularly surprised if in fact 100% of conforming C99 and C11 implementations that are intended for production code default to working in that fashion. Unfortunately, neither <limits.h> nor any other standard header defines any means via which implementations can indicate that they follow the essentially-universal convention.
To be sure, it's unlikely that code which expects the common behavior will be tripped up by the behavior of any conforming compiler. It's plausible, however, that compilers might offer a non-conforming mode, since that could make certain kinds of code more efficient. For example, given:
int32_t x,i;
int16_t *p;
...
x = ++p[i];
If int is larger than 16 bits, behavior would be defined in case p[i] was 32767 before the code executed. The increment would yield -32768, the value would be converted to int16_t in Implementation-Defined fashion (which is guaranteed to yield -32768 unless an implementation documents something else), and that value would then be stored to both x and p[i].
On processors like the ARM which always do arithmetic using 32 bits, truncating the value stored to p[i] would cost nothing, but truncating the value written to x would require an instruction (or, for some older ARM models, two instructions). Allowing x to receive +32768 in that case would improve efficiency on such processors. Such an option would not affect the behavior of most programs, but it would be helpful if the Standard defined a means via which code that relied upon such truncating behavior could say, e.g.
#ifdef __STDC_UNUSUAL_INT_TRUNCATION
#error This code relies upon truncating integer type conversions
#endif
so that those programs that would be affected could guard against accidental compilation in such modes. As yet the Standard doesn't define any such test macro.
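For completeness, here is a runnable version of the ++p[i] snippet above, showing the conforming result on a typical implementation with 32-bit int (the stored and assigned values are only "typically" -32768, since the conversion back to int16_t is implementation-defined before C++20):
#include <cstdint>
#include <iostream>
int main() {
    int16_t buf[1] = {32767};
    int16_t *p = buf;
    int32_t i = 0;
    int32_t x = ++p[i];   // arithmetic happens in int; 32768 is then converted
                          // back to int16_t (implementation-defined pre-C++20)
    std::cout << x << ' ' << p[i] << '\n';  // typically prints "-32768 -32768"
    return 0;
}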

Related

Does std::memcpy make its destination determinate?

Here is the code:
unsigned int a; // a is indeterminate
unsigned long long b = 1; // b is initialized to 1
std::memcpy(&a, &b, sizeof(unsigned int));
unsigned int c = a; // Is this not undefined behavior? (Implementation-defined behavior?)
Is a guaranteed by the standard to be a determinate value where we access it to initialize c? Cppreference says:
void* memcpy( void* dest, const void* src, std::size_t count );
Copies count bytes from the object pointed to by src to the object pointed to by dest. Both objects are reinterpreted as arrays of unsigned char.
But I don't see anywhere in cppreference that says if an indeterminate value is "copied to" like this, it becomes determinate.
From the standard, it seems it's analogous to this:
unsigned int a; // a is indeterminate
unsigned long long b = 1; // b is initialized to 1
auto* a_ptr = reinterpret_cast<unsigned char*>(&a);
auto* b_ptr = reinterpret_cast<unsigned char*>(&b);
a_ptr[0] = b_ptr[0];
a_ptr[1] = b_ptr[1];
a_ptr[2] = b_ptr[2];
a_ptr[3] = b_ptr[3];
unsigned int c = a; // Is this undefined behavior? (Implementation defined behavior?)
It seems like the standard leaves room for this to be allowed, because the type aliasing rules allow for the object a to be accessed as an unsigned char this way. But I can't find something that says this makes a no longer indeterminate.
Is this not undefined behavior
It's UB, because you're copying into the wrong type. [basic.types]2 and 3 permit byte copying, but only between objects of the same type. You copied from a long long into an int. That has nothing to do with the value being indeterminate. Even though you're only copying sizeof(int) bytes, the fact that you're not copying from an actual int means that you don't get the protection of those rules.
If you were copying into an object of the same type, then [basic.types]3 says that it's equivalent to simply assigning them. That is, a "shall subsequently hold the same value as" b.
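For contrast, here is a minimal sketch of the same-type copy that [basic.types]3 does bless; after the memcpy, a is required to hold the same value as b:
#include <cstring>
#include <iostream>
int main() {
    unsigned int b = 1;   // source object of the same type as the destination
    unsigned int a;       // indeterminate until the copy completes
    std::memcpy(&a, &b, sizeof(unsigned int));
    unsigned int c = a;   // OK: a now "holds the same value as" b
    std::cout << c << '\n';  // prints 1
    return 0;
}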
TL;DR: It's implementation-defined whether there will be undefined behavior or not. Proof-style, with lines of code numbered:
unsigned int a;
The variable a is assumed to have automatic storage duration. Its lifetime begins (6.6.3/1). Since it is not a class, its lifetime begins with default initialization, in which no other initialization is performed (9.3/7.3).
unsigned long long b = 1ull;
The variable b is assumed to have automatic storage duration. Its lifetime begins (6.6.3/1). Since it is not a class, its lifetime begins with copy-initialization (9.3/15).
std::memcpy(&a, &b, sizeof(unsigned int));
Per 16.2/2, std::memcpy should have the same semantics and preconditions as the C standard library's memcpy. In the C standard 7.21.2.1, assuming sizeof(unsigned int) == 4, 4 characters are copied from the object pointed to by &b into the object pointed to by &a. (These two points are what is missing from other answers.)
At this point, the sizes of unsigned int, unsigned long long, their representations (e.g. endianness), and the size of a character are all implementation defined (to my understanding, see 6.7.1/4 and its note saying that ISO C 5.2.4.2.1 applies). I will assume that the implementation is little-endian, unsigned int is 32 bits, unsigned long long is 64 bits, and a character is 8 bits.
Now that I have said what the implementation is, I know that a has a value-representation for an unsigned int of 1u. Nothing, so far, has been undefined behavior.
unsigned int c = a;
Now we access a. Then, 6.7/4 says that
For trivially copyable types, the value representation is a set of bits in the object representation that determines a value, which is one discrete element of an implementation-defined set of values.
I know now that the value of a is determined by the implementation-defined value bits in a, which I know hold the value-representation for 1u. The value of a is then 1u.
Then like (2), the variable c is copy-initialized to 1u.
We made use of implementation-defined values to find what happens. It is possible that the implementation-defined value of 1ull is not one of the implementation-defined set of values for unsigned int. In that case, accessing a will be undefined behavior, because the standard doesn't say what happens when you access a variable with a value-representation that is invalid.
AFAIK, we can take advantage of the fact that most implementations define an unsigned int where any possible bit pattern is a valid value-representation. Therefore, there will be no undefined behavior.
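If it helps, the implementation assumptions used in this walkthrough can be pinned down as compile-time checks. These widths are assumptions of this answer, not requirements of the standard, and endianness cannot be tested statically before C++20:
#include <climits>
static_assert(CHAR_BIT == 8, "8-bit bytes assumed");
static_assert(sizeof(unsigned int) == 4, "32-bit unsigned int assumed");
static_assert(sizeof(unsigned long long) == 8, "64-bit unsigned long long assumed");
int main() { return 0; }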
Note: I updated this answer since exploring the issue further in some of the comments has revealed cases where it would be implementation-defined or even undefined in a case I did not consider originally (specifically in C++17 as well).
I believe that this is implementation-defined behavior in some cases and undefined in others (as another answer came to conclude for similar reasons). In a sense it's implementation-defined whether it's undefined behavior or implementation-defined, so I am not sure whether being undefined in general takes precedence in such a classification.
std::memcpy works entirely on the object representation of the types in question (it aliases the given pointers to unsigned char, as specified by 6.10/8.8 [basic.lval]). If the bits within the bytes in question of the unsigned long long are guaranteed to be something specific, then you can manipulate them however you wish or write them into the object representation of any other type. The destination type will then use the bits to form its value based on its value representation (whatever that may be), as is defined in 6.9/4 [basic.types]:
The object representation of an object of type T is the sequence of N
unsigned char objects taken up by the object of type T, where N equals
sizeof(T). The value representation of an object is the set of bits
that hold the value of type T. For trivially copyable types, the value
representation is a set of bits in the object representation that
determines a value, which is one discrete element of an
implementation-defined set of values.
And that:
The intent is that the memory model of C++ is compatible with that of
ISO/IEC 9899 Programming Language C.
Knowing this, all that matters now is what the object representations of the integer types in question are. According to 6.9.1/7 [basic.fundamental]:
Types bool, char, char16_t, char32_t, wchar_t, and the signed and
unsigned integer types are collectively called integral types. A
synonym for integral type is integer type. The representations of
integral types shall define values by use of a pure binary numeration
system. [Example: This International Standard permits two’s
complement, ones’ complement and signed magnitude representations for
integral types. — end example ]
A footnote does clarify the definition of "binary numeration system" however:
A positional representation for integers that uses the binary digits 0
and 1, in which the values represented by successive bits are
additive, begin with 1, and are multiplied by successive integral
power of 2, except perhaps for the bit with the highest position.
(Adapted from the American National Dictionary for Information
Processing Systems.)
We also know that unsigned integers have the same value representation as signed integers, just under a modulus, according to 6.9.1/4 [basic.fundamental]:
Unsigned integers shall obey the laws of arithmetic modulo 2^n where n
is the number of bits in the value representation of that particular
size of integer.
While this does not say exactly what the value representation must be, based on the quoted definition of a binary numeration system, successive bits are additive powers of two as expected (rather than the bits being in some arbitrary order), with the possible exception of a sign bit. Additionally, since signed and unsigned integer types share the value representation of their common values, an unsigned integer will be stored as an increasing binary sequence up to 2^(n-1) (past that, things depend on how signed numbers are handled and are implementation-defined).
There are still some other considerations, however, such as endianness and how many padding bits may be present, since sizeof(T) only measures the size of the object representation rather than the value representation (as stated before). Since in C++17 there is no standard way (I think) to check for endianness, this is the main factor that leaves the outcome implementation-defined. As for padding bits, while they may be present (and their position is not specified, other than the implication that they will not interrupt the contiguous sequence of bits forming the value representation of an integer), writing to them can prove problematic. Since the intent is for the C++ memory model to be compatible with the C99 standard's memory model, a footnote from C99 6.2.6.2 (which the C++20 standard references in a note as a reminder of that heritage) can be cited, which says the following:
Some combinations of padding bits might generate trap representations,
for example, if one padding bit is a parity bit. Regardless, no
arithmetic operation on valid values can generate a trap
representation other than as part of an exceptional condition such as
an overflow, and this cannot occur with unsigned types. All other
combinations of padding bits are alternative object representations of
the value specified by the value bits.
This implies that writing directly to padding bits incorrectly could potentially generate a trap representation from what I can tell.
This shows that in some cases depending on if padding bits are present and endianness, the result can be influenced in an implementation-defined manner. If some combination of padding bits is also a trap representation, this may become undefined behavior.
While not possible in C++17, in C++20 one can use std::endian in conjunction with std::has_unique_object_representations<T> (which was already present in C++17), or some arithmetic with CHAR_BIT, UINT_MAX/ULLONG_MAX and the sizeof of those types, to ensure the expected endianness and the absence of padding bits, allowing this to produce the expected result in a defined manner given what was established above about how integers are stored. Of course, C++20 also further refines this and specifies that integers are stored in two's complement alone, eliminating further implementation-specific issues.
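A sketch of those C++20-era guards (a hedged illustration: with these checks passing, the memcpy in the question has a single, well-defined result):
#include <bit>            // std::endian, C++20
#include <type_traits>    // std::has_unique_object_representations, C++17
static_assert(std::endian::native == std::endian::little,
              "little-endian target assumed");
static_assert(std::has_unique_object_representations_v<unsigned int>,
              "no padding bits in unsigned int");
static_assert(std::has_unique_object_representations_v<unsigned long long>,
              "no padding bits in unsigned long long");
int main() { return 0; }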

What's the difference between casting a long to int versus using a bitwise AND in order to get the 4 least significant bytes?

I know that in order to get the 4 least significant bytes of a number of type long I can cast it to int/unsigned int or use a bitwise AND (& 0xFFFFFFFF).
This code produces the following output:
#include <stdio.h>
int main()
{
    long n = 0x8899AABBCCDDEEFF;
    printf("0x%016lX\n", n);
    printf("0x%016X\n", (int)n);
    printf("0x%016X\n", (unsigned int)n);
    printf("0x%016lX\n", n & 0xFFFFFFFF);
}
Output:
0x8899AABBCCDDEEFF
0x00000000CCDDEEFF
0x00000000CCDDEEFF
0x00000000CCDDEEFF
Does that mean that the two methods used are equivalent? If so, do they always produce the same output regardless of the platform/compiler?
Also, is there any catch or pitfall while casting to unsigned int rather than int for the purpose of this question?
Finally, why is the output the same if you change the number n to be an unsigned long instead?
The methods are definitely different.
According to integral conversion rules (cf, for example, this online c++11 standard), a conversion (e.g. through an explicit cast) from one integral type to another depends on whether the destination type is signed or unsigned. If the destination type is unsigned, one can rely on a "modulo 2^n" truncation, whereas with signed destination types one could tap into implementation-defined behaviour:
4.7 Integral conversions [conv.integral]
2 If the destination type is unsigned, the resulting value is the
least unsigned integer congruent to the source integer (modulo 2^n
where n is the number of bits used to represent the unsigned type). [
Note: In a two's complement representation, this conversion is
conceptual and there is no change in the bit pattern (if there is no
truncation). — end note ]
3 If the destination type is signed, the value is unchanged if it can
be represented in the destination type (and bit-field width);
otherwise, the value is implementation-defined.
For your first question, as others have pointed out, the sizes of int and long are platform-dependent, so the methods are not equivalent. See C data types: the standard only guarantees that each type is "at least XX bits in size".
For the second question, it comes down to this: long and int are signed, meaning that one bit is reserved for the sign (take a look also at two's complement). If you were the compiler, what could you do with negative values (especially the long ones)? As Stephan Lechner mentioned, this is implementation-defined (that is, it is up to the compiler).
Finally, in the spirit of "your code must do what it says it does", the best thing to do if you need masks is to use masks (and, if you use masks, use unsigned types). Don't try to be clever. Believe me, clever tricks always bite you in the rear. I've dealt with enough legacy code to know that by heart.
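A small sketch of that advice, using fixed-width unsigned types and an explicit mask so no signed conversion is involved:
#include <cinttypes>
#include <cstdint>
#include <cstdio>
int main() {
    uint64_t n = 0x8899AABBCCDDEEFFu;
    uint32_t low = static_cast<uint32_t>(n & 0xFFFFFFFFu);  // well-defined, unsigned throughout
    std::printf("0x%08" PRIX32 "\n", low);                  // prints 0xCCDDEEFF
    return 0;
}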
What's the difference between casting a long to int versus using a bitwise AND in order to get the 4 least significant bytes?
Type. Casting makes the value an int. And'ing does not change the type.
Range. Depending on the ranges of int and long, a cast may not change the value at all.
IDB and UB. Implementation-defined behavior and undefined behavior are both in play when mixing signedness.
To "get" the 4 LSBytes, use & 0xFFFFFFFFu or cast to uint32_t.
OP's question is unnecessarily convoluted.
long n = 0x8899AABBCCDDEEFF; --> Converting a value outside the range of a signed integer type is implementation-defined.
Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
C11 §6.3.1.3 3
printf("0x%016lX\n", n); --> Printing a long with a "%lX" outside the the common range of long/unsigned long is undefined behavior.
Let's go forward with unsigned long:
unsigned long n = 0x8899AABBCCDDEEFF; // no problem,
printf("0x%016lX\n", n); // no problem,
printf("0x%016X\n", (int)n); // problem, C11 6.3.1.3 3
printf("0x%016X\n", (unsigned int)n); // no problem,
printf("0x%016lX\n", n & 0xFFFFFFFF); // no problem,
The "no problem" are OK even is unsigned long is 32-bit or 64-bit. The output will differ, yet is OK.
Recall that int,long are not always 32,64 bit. (16,32), (32,32), (32,64) are common.
int is at least 16 bit.
long is at least that of int and at least 32 bit.

Implicit conversion in C++ between main and function

I have the below simple program:
#include <iostream>
#include <stdio.h>
void SomeFunction(int a)
{
    std::cout << "Value in function: a = " << a << std::endl;
}
int main() {
    size_t a(0);
    std::cout << "Value in main: " << a - 1 << std::endl;
    SomeFunction(a - 1);
    return 0;
}
Upon executing this I get:
Value in main: 18446744073709551615
Value in function: a = -1
I think I roughly understand why the function gets the 'correct' value of -1: there is an implicit conversion from the unsigned type to the signed one i.e. 18446744073709551615(unsigned) = -1(signed).
Is there any situation where the function will not get the 'correct' value?
Since size_t type is unsigned, subtracting 1 is well defined:
A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
However, the resultant value of 2^64 - 1 is out of int's range, so you get implementation-defined behavior:
[when] the new type is signed and the value cannot be represented in it, either the result is implementation-defined or an implementation-defined signal is raised.
Therefore, the answer to your question is "yes": there are platforms where the value of a would be different; there are also platforms where instead of calling SomeFunction the program will raise a signal.
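For illustration, one way to make the size_t-to-int step explicit and fully specified on every platform (the helper name to_int_wrapped is mine, not from the question, and it assumes the usual UINT_MAX == 2 * INT_MAX + 1):
#include <climits>
#include <cstddef>
#include <iostream>
// Reduce modulo 2^N via unsigned int (always defined), then map the upper half
// of the unsigned range onto the negative ints, mimicking two's-complement wrap.
int to_int_wrapped(std::size_t v) {
    unsigned int u = static_cast<unsigned int>(v);   // defined: value mod 2^N
    if (u <= static_cast<unsigned int>(INT_MAX))
        return static_cast<int>(u);                  // representable as-is
    return static_cast<int>(u - static_cast<unsigned int>(INT_MAX) - 1u) + INT_MIN;
}
int main() {
    std::size_t a = 0;
    std::cout << to_int_wrapped(a - 1) << '\n';      // prints -1
    return 0;
}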
Not on your computer... but technically yes, there is a situation where things can go wrong.
All modern PCs use the "two's complement" system for signed integer arithmetic (read Wikipedia for details). Two's complement has many advantages, but one of the biggest is this: unsaturated addition and subtraction of signed integers is identical to that of unsigned integers. As long as overflow/underflow causes the result to "wrap around" (i.e., 0-1 = UINT_MAX), the computer can add and subtract without even knowing whether you're interpreting the numbers as signed or unsigned.
BUT! C/C++ do not technically require two's complement for signed integers. There are two other permissible systems, known as "sign-magnitude" and "one's complement". These are unusual systems, never found outside antique architectures and embedded processors (and rarely even there). But in those systems, signed and unsigned arithmetic do not match up, and (signed)(a+b) will not necessarily equal (signed)a + (signed) b.
There's another, more mundane caveat when you're also narrowing types, as is the case between size_t and int on x64, because C/C++ don't require compilers to follow a particular rule when narrowing out-of-range values to signed types. This is likewise more a matter of language lawyering than actual unsafeness, though: VC++, GCC, Clang, and all other compilers I'm aware of narrow through truncation, leading to the expected behavior.
It's easier to compare and contrast signed and unsigned variants of the same basic type, say signed int and unsigned int.
On a system that uses 32 bits for int, the range of unsigned int is [0, 4294967295] and the range of signed int is [-2147483648, 2147483647] (with the usual two's-complement representation).
Say you have a variable of type unsigned int and its value is greater than 2147483647. If you pass such a variable to SomeFunction, you will see an incorrect value in the function.
Conversely, say you have a variable of type signed int and its value is less than zero. If you pass such a variable to a function that expects an unsigned int, you will see an incorrect value in the function.
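A small demonstration of both cases described above, assuming 32-bit int as this answer does (the value printed in the first case is only the typical implementation-defined result):
#include <iostream>
void TakesInt(int a)           { std::cout << a << '\n'; }
void TakesUnsigned(unsigned a) { std::cout << a << '\n'; }
int main() {
    unsigned int big = 3000000000u;  // greater than 2147483647
    TakesInt(big);                   // implementation-defined: typically -1294967296
    int negative = -5;
    TakesUnsigned(negative);         // well-defined: 2^32 - 5 = 4294967291
    return 0;
}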

How does casting to "signed int" and back to "signed short" work for values larger than 32,767?

Code:
typedef signed short SIGNED_SHORT; //16 bit
typedef signed int SIGNED_INT; //32 bit
SIGNED_SHORT x;
x = (SIGNED_SHORT)(SIGNED_INT) 45512; //or any value over 32,767
Here is what I know:
Signed 16 bits:
Signed: From −32,768 to 32,767
Unsigned: From 0 to 65,535
Don't expect 45512 to fit into x as x is declared a 16 bit signed integer.
How and what does the double casting above do?
Thank You!
typedef signed short SIGNED_SHORT; //16 bit
typedef signed int SIGNED_INT; //32 bit
These typedefs are not particularly useful. A typedef does nothing more than provide a new name for an existing type. Type signed short already has a perfectly good name: "signed short"; calling it SIGNED_SHORT as well doesn't buy you anything. (It would make sense if it abstracted away some information about the type, or if the type were likely to change -- but using the name SIGNED_SHORT for a type other than signed short would be extremely confusing.)
Note also that short and int are both guaranteed to be at least 16 bits wide, and int is at least as wide as short, but different sizes are possible. For example, a compiler could make both short and int 16 bits -- or 64 bits for that matter. But I'll assume the sizes for your compiler are as you state.
In addition, signed short and short are names for the same type, as are signed int and int.
SIGNED_SHORT x;
x = (SIGNED_SHORT)(SIGNED_INT) 45512; //or any value over 32,767
A cast specifies a conversion to a specified type. Two casts specify two such conversions. The value 45512 is converted to signed int, and then to signed short.
The constant 45512 is already of type int (another name for signed int), so the innermost cast is fairly pointless. (Note that if int is only 16 bits, then 45512 will be of type long.)
When you assign a value of one numeric type to an object of another numeric type, the value is implicitly converted to the object's type, so the outermost cast is also redundant.
So the above code snippet is exactly equivalent to:
short x = 45512;
Given the ranges of int and short on your system, the mathematical value 45512 cannot be represented in type short. The language rules state that the result of such a conversion is implementation-defined, which means that it's up to each implementation to determine what the result is, and it must document that choice, but different implementations can do it differently. (Actually that's not quite the whole story; the 1999 ISO C standard added permission for such a conversion to raise an implementation-defined signal. I don't know of any compiler that does this.)
The most common semantics for this kind of conversion is that the result gets the low-order bits of the source value. This will probably result in the value -20024 being assigned to x. But you shouldn't depend on that if you want your program to be maximally portable.
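For the record, the common outcome can be worked out directly: the low 16 bits of 45512, read back as a two's-complement short, give 45512 - 65536 = -20024. A tiny sketch (assuming 16-bit short; the result is implementation-defined before C++20, but this is what essentially every implementation does):
#include <iostream>
int main() {
    short x = static_cast<short>(45512);  // implementation-defined pre-C++20
    std::cout << x << '\n';               // typically prints -20024
    return 0;
}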
When you cast twice, the casts are applied in sequence.
int a = 45512;
int b = (int) a;
short x = (short) b;
Since 45512 does not fit in a short on most (but not all!) platforms, the conversion cannot preserve the value on those platforms. This will either raise an implementation-defined signal or result in an implementation-defined value.
In practice, many platforms define the result as the truncated value, which is -20024 in this case. However, there are platforms which raise a signal, which will probably terminate your program if uncaught.
Citation: n1525 §6.3.1.3
Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
The double casting is equivalent to:
short x = static_cast<short>(static_cast<int>(45512));
which is equivalent to:
short x = 45512;
which will likely wrap around so x equals -20024, but technically it's implementation defined behavior if a short has a maximum value less than 45512 on your platform. The literal 45512 is of type int.
You can assume it does two type conversions (although signed int and int are only separated once in the C standard, IIRC).
If SIGNED_SHORT is too small to handle 45512, the result is either implementation-defined or an implementation-defined signal is raised. (In C++ only the former applies.)

converting -1 to unsigned types

Consider the following code to set all bits of x
unsigned int x = -1;
Is this portable? It seems to work on at least Visual Studio 2005-2010.
The citation-heavy answer:
I know there are plenty of correct answers in here, but I'd like to add a few citations to the mix. I'll cite two standards: C99 n1256 draft (freely available) and C++ n1905 draft (also freely available). There's nothing special about these particular standards, they're just both freely available and whatever happened to be easiest to find at the moment.
The C++ version:
§5.3.2 ¶9: According to this paragraph, the value ~(type)0 is guaranteed to have all bits set, if (type) is an unsigned type.
The operand of ~ shall have integral or enumeration type; the result is the one’s complement of its operand. Integral promotions are performed. The type of the result is the type of the promoted operand.
§3.9.1 ¶4: This explains how overflow works with unsigned numbers.
Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer.
§3.9.1 ¶7, plus footnote 49: This explains that numbers must be binary. From this, we can infer that ~(type)0 must be the largest number representable in type (since it has all bits turned on, and each bit is additive).
The representations of integral types shall define values by use of a pure
binary numeration system49.
49) A positional representation for integers that uses the binary digits 0 and 1, in which the values represented by successive bits are additive, begin
with 1, and are multiplied by successive integral power of 2, except perhaps for the bit with the highest position. (Adapted from the American National
Dictionary for Information Processing Systems.)
Since arithmetic is done modulo 2^n, it is guaranteed that (type)-1 is the largest value representable in that type. It is also guaranteed that ~(type)0 is the largest value representable in that type. They must therefore be equal.
The C99 version:
The C99 version spells it out in a much more compact, explicit way.
§6.5.3 ¶3:
The result of the ~ operator is the bitwise complement of its (promoted) operand (that is,
each bit in the result is set if and only if the corresponding bit in the converted operand is
not set). The integer promotions are performed on the operand, and the result has the
promoted type. If the promoted type is an unsigned type, the expression ~E is equivalent
to the maximum value representable in that type minus E.
As in C++, unsigned arithmetic is guaranteed to be modular (I think I've done enough digging through standards for now), so the C99 standard definitely guarantees that ~(type)0 == (type)-1, and we know from §6.5.3 ¶3 that ~(type)0 must have all bits set.
The summary:
Yes, it is portable. unsigned type x = -1; is guaranteed to have all bits set according to the standard.
Footnote: Yes, we are talking about value bits and not padding bits. I doubt that you need to set padding bits to one, however. You can see from a recent Stack Overflow question (link) that GCC was ported to the PDP-10 where the long long type has a single padding bit. On such a system, unsigned long long x = -1; may not set that padding bit to 1. However, you would only be able to discover this if you used pointer casts, which isn't usually portable anyway.
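A couple of compile-time checks that line up with this summary (both hold on any conforming implementation, since they compare values rather than bit patterns):
#include <climits>
static_assert(static_cast<unsigned int>(-1) == UINT_MAX,
              "conversion of -1 to unsigned yields UINT_MAX");
static_assert(~0u == UINT_MAX, "~0u is also UINT_MAX");
int main() {
    unsigned int x = -1;      // guaranteed to hold UINT_MAX
    return x == UINT_MAX ? 0 : 1;
}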
Apparently it is:
(4.7) If the destination type is unsigned, the resulting value is the least
unsigned integer congruent to the source integer (modulo 2^n where n is
the number of bits used to represent the unsigned type). [Note: In a
two’s complement representation, this conversion is conceptual and
there is no change in the bit pattern (if there is no truncation).
It is guaranteed to be the largest amount possible for that type due to the properties of modulo.
C99 also allows it:
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type. 49)
Which would also be the largest amount possible.
Largest amount possible may not be all bits set. Use ~static_cast<unsigned int>(0) for that.
I was sloppy in reading the question, and made several comments that might be misleading because of that. I'll try to clear up the confusion in this answer.
The declaration
unsigned int x = -1;
is guaranteed to set x to UINT_MAX, the maximum value of type unsigned int. The expression -1 is of type int, and it's implicitly converted to unsigned int. The conversion (which is defined in terms of values, not representations) results in the maximum value of the target unsigned type.
(It happens that the semantics of the conversion are optimized for two's-complement systems; for other schemes, the conversion might involve something more than just copying the bits.)
But the question referred to setting all bits of x. So, is UINT_MAX represented as all-bits-one?
There are several possible representations for signed integers (two's-complement is most common, but ones'-complement and sign-and-magnitude are also possible). But we're dealing with an unsigned integer type, so the way that signed integers are represented is irrelevant.
Unsigned integers are required to be represented in a pure binary format. Assuming that all the bits of the representation contribute to the value of an unsigned int object, then yes, UINT_MAX must be represented as all-bits-one.
On the other hand, integer types are allowed to have padding bits, bits that don't contribute to the representation. For example, it's legal for unsigned int to be 32 bits, but for only 24 of those bits to be value bits, so UINT_MAX would be 2^24-1 rather than 2^32-1. So in the most general case, all you can say is that
unsigned int x = -1;
sets all the value bits of x to 1.
In practice, very very few systems have padding bits in integer types. So on the vast majority of systems, unsigned int has a size of N bits, and a maximum value of 2**N-1, and the above declaration will set all the bits of x to 1.
This:
unsigned int x = ~0U;
will also set x to UINT_MAX, since bitwise complement for unsigned types is defined in terms of subtraction.
Beware!
The value of x itself is guaranteed: the conversion is defined in terms of values, so x becomes UINT_MAX regardless of how the implementation represents negative integers. However, the standard also allows integer types to contain padding bits, and nothing requires those to end up as 1.
In short, strictly speaking it may not set all bits of x, although on real-world implementations it does.