Given a class whose only member is a char[10], that has no inheritance or virtual members, and that has a constructor that does not mention the array in any way (so that the array gets default-initialization, i.e. no initialization), like so:
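class in_place_string {
public: // members must be accessible for the code below to compile
    char data[10];
    static struct pre_initialized_type {} pre_initialized;
    in_place_string(pre_initialized_type) {} // This is the constructor in question
    in_place_string() : data() {} // this is so you don't yell at me; not relevant
};
in_place_string::pre_initialized_type in_place_string::pre_initialized; // definition of the static member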
Is it defined behavior to placement-new this class into a buffer that already has data, and then read from the array member?
#include <iostream>
#include <new>
int main() {
    char buffer[sizeof(in_place_string)] = "HI!";
    in_place_string* str = new(buffer) in_place_string(in_place_string::pre_initialized);
    std::cout << str->data; // undefined behavior?
}
I'm pretty sure it's not well defined, so I'm asking if this is implementation defined or undefined behavior.
You're not performing a reinterpret_cast (which wouldn't be safe, since the class has non-trivial initialization); you're creating a new object whose member is uninitialized.
Performing lvalue->rvalue conversion on an uninitialized object gives an indeterminate value and undefined behavior. So is the object uninitialized?
According to 5.3.4 all objects created by new-expression have dynamic storage duration. There's no exception for placement new.
Entities created by a new-expression have dynamic storage duration
And then 8.5 says
If no initializer is specified for an object, the object is default-initialized. When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced (5.17). [ Note: Objects with static or thread storage duration are zero-initialized, see 3.6.2. —end note ] If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases:
and the following cases permit only unsigned char, and even then the value is not useful.
In your case the new object has dynamic storage duration (!) and its members for which no initialization is performed have indeterminate value. Reading them gives undefined behavior.
I think the relevant clause is 8.5 [dcl.init] paragraph 12:
If no initializer is specified for an object, the object is default-initialized. When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced (5.17). [ Note: Objects with static or thread storage duration are zero-initialized, see 3.6.2. —end note ] If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases:
If an indeterminate value of unsigned narrow character type (3.9.1) is produced by the evaluation of:
the second or third operand of a conditional expression (5.16),
the right operand of a comma expression (5.18),
the operand of a cast or conversion to an unsigned narrow character type (4.7, 5.2.3, 5.2.9, 5.4), or
a discarded-value expression (Clause 5), then the result of the operation is an indeterminate value.
If an indeterminate value of unsigned narrow character type is produced by the evaluation of the right operand of a simple assignment operator (5.17) whose first operand is an lvalue of unsigned narrow character type, an indeterminate value replaces the value of the object referred to by the left operand.
If an indeterminate value of unsigned narrow character type is produced by the evaluation of the initialization expression when initializing an object of unsigned narrow character type, that object is initialized to an indeterminate value.
I don't think any of the exceptions applies. Since the value is read after the object is constructed but before the array has been initialized, I think the code results in undefined behavior.
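If the goal is to keep the bytes that were already in the buffer, one well-defined route is to copy them out before the new object is created and pass the copy to a constructor that does initialize the array. The following is only a sketch under that assumption; in_place_string_sketch, its const char* constructor, and construct_from_saved_copy are hypothetical names and are not part of the question's code.

```cpp
#include <cassert>
#include <cstring>
#include <new>

// Hypothetical variant of the question's class: the constructor
// initializes data explicitly instead of leaving it indeterminate.
struct in_place_string_sketch {
    char data[10];
    explicit in_place_string_sketch(const char* src) {
        assert(std::strlen(src) < sizeof data);  // must fit, including '\0'
        std::strcpy(data, src);
    }
};

// Hypothetical helper: saves the buffer's bytes, constructs the object in
// the buffer, and reports whether the text is readable afterwards.
inline bool construct_from_saved_copy() {
    alignas(in_place_string_sketch) char buffer[sizeof(in_place_string_sketch)] = "HI!";
    char saved[sizeof buffer];
    std::memcpy(saved, buffer, sizeof buffer);  // copy out before placement new
    in_place_string_sketch* str = new (buffer) in_place_string_sketch(saved);
    return std::strcmp(str->data, "HI!") == 0;  // data was set by the constructor
}
```

Here the read of str->data is fine because the constructor wrote every character that is read, so nothing depends on what placement new leaves in the storage.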
Why does this print 32767 (or some other random number)? What is std::cout printing? Why is it not NULL (or 0)?
int main()
{
int a;
std::cout << a;
}
That is because variables with automatic storage duration are not automatically initialized to zero in C++. In C++, you don't pay for what you don't need, and automatically initializing a variable takes time (setting a memory location to zero ultimately reduces to machine instruction(s), which are then translated to electrical signals that control the physical bits).
A memory location is reserved for the variable, and some junk happens to be at that location. That junk is what cout prints.
As pointed out by @dwcanillas, it is undefined behaviour. Related: What happens to a declared, uninitialized variable in C? Does it have a value?
From the C++ standard (emphasis mine):
8.5 Initializers [dcl.init]
7) To default-initialize an object of type T means:
If T is a (possibly cv-qualified) class type (Clause 9), constructors are considered. The applicable constructors are enumerated (13.3.1.3), and the best one for the initializer () is chosen through overload resolution (13.3). The constructor thus selected is called, with an empty argument list, to initialize the object.
If T is an array type, each element is default-initialized.
Otherwise, no initialization is performed.
12) If no initializer is specified for an object, the object is default-initialized. When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced (5.18). [Note: Objects with static or thread storage duration are zero-initialized, see 3.6.2. — end note ] If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases:
— If an indeterminate value of unsigned narrow character type (3.9.1) is produced by the evaluation of:
— the second or third operand of a conditional expression (5.16),
— the right operand of a comma expression (5.19),
— the operand of a cast or conversion to an unsigned narrow character type (4.7, 5.2.3, 5.2.9, 5.4), or
— a discarded-value expression (Clause 5)
...
It's undefined behavior. You are printing whatever occupies the memory of a, which in this case happens to be 32767.
The behaviour is covered by C++14 (N3936) [dcl.init]/12:
If no initializer is specified for an object, the object is default-initialized. When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced.
[...] If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases:
and your code is not covered by any of the "following cases" which cover a few situations in which unsigned char indeterminate values are allowed to propagate.
Because "a" is not global/static. It's an automatic variable, for which initialization happens at run time. If it were global, initialization to zero would have happened at compile time, i.e.:
• static variables are initialized at compile-time, since their address is known and fixed. Initializing them to 0 does not incur a runtime cost.
• automatic variables can have different addresses for different calls and would have to be initialized at runtime each time the function is called, incurring a runtime cost that may not be needed. If you do need that initialization, then request it.
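"Requesting" the initialization can be as small as an empty pair of braces. The sketch below contrasts a static variable with two initialized automatic variables; all names in it (file_scope_counter and the read_* helpers) are made up for illustration.

```cpp
#include <cassert>

int file_scope_counter;  // static storage duration: zero-initialized

inline int read_static() { return file_scope_counter; }

inline int read_value_initialized() {
    int a{};  // value-initialization: a is 0
    return a;
}

inline int read_explicitly_initialized() {
    int a = 0;  // explicit initializer
    return a;
}

// Hypothetical usage: every read here is of an initialized object.
inline void check_requested_initialization() {
    assert(read_static() == 0);
    assert(read_value_initialized() == 0);
    assert(read_explicitly_initialized() == 0);
}
```

Only the plain `int a;` form from the question leaves the automatic variable with an indeterminate value.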
Today I've encountered some code that roughly looks like the following snippet. Both valgrind and UndefinedBehaviorSanitizer detected reads of uninitialized data.
template <typename T>
void foo(const T& x)
{
static_assert(std::is_pod_v<T> && sizeof(T) > 1);
auto p = reinterpret_cast<const char*>(&x);
std::size_t i = 1;
for(; i < sizeof(T); ++i)
{
if(p[i] != p[0]) { break; }
}
// ...
}
The aforementioned tools complained about the p[i] != p[0] comparison when an object containing padding bytes was passed to foo. Example:
struct obj { char c; int* i; };
foo(obj{'b', nullptr});
Is it undefined behavior to read padding bytes from a POD type and compare them to something else? I couldn't find a definitive answer either in the Standard or on StackOverflow.
The behaviour of your program is implementation defined on two counts:
1) Prior to C++14: Due to the possibility of a 1's complement or signed magnitude signed type for your char, you might return a surprising result due to comparing +0 and -0.
The truly watertight way would be to use a const unsigned char* pointer. This obviates any concerns with the now abolished (from C++14) 1's complement or signed magnitude char.
Since (i) you own the memory, (ii) you are taking a pointer to x, (iii) an unsigned char cannot contain a trap representation, and (iv) char, unsigned char, and signed char are exempt from the strict aliasing rules, the behaviour on using const unsigned char* to read uninitialised memory is perfectly well defined.
2) But since you don't know what is contained in that uninitialised memory, the behaviour on reading it is unspecified and that means the program behaviour is implementation defined since the char types cannot contain trap representations.
It depends on the conditions.
If x is zero-initialized, then padding has zero bits, so this case is well defined (8.5/6 of C++14):
To zero-initialize an object or reference of type T means:
— if T is a scalar type (3.9), the object is initialized to the value obtained by converting the integer literal 0 (zero) to T;
— if T is a (possibly cv-qualified) non-union class type, each non-static data member and each base-class subobject is zero-initialized and padding is initialized to zero bits;
— if T is a (possibly cv-qualified) union type, the object’s first non-static named data member is zero-initialized and padding is initialized to zero bits;
— if T is an array type, each element is zero-initialized;
— if T is a reference type, no initialization is performed.
However, if x is default-initialized, then padding isn't specified, so it has indeterminate value (inferred by the fact that there's no mention of padding here) (8.5/7):
To default-initialize an object of type T means:
— if T is a (possibly cv-qualified) class type (Clause 9), the default
constructor (12.1) for T is called (and the initialization is
ill-formed if T has no default constructor or overload resolution
(13.3) results in an ambiguity or in a function that is deleted or
inaccessible from the context of the initialization);
— if T is an array type, each element is default-initialized;
— otherwise, no initialization is performed.
And comparing indeterminate values is UB for this case, as none of the mentioned exceptions apply, as you compare the indeterminate value to something (8.5/12):
If no initializer is specified for an object, the object is
default-initialized. When storage for an object with automatic or
dynamic storage duration is obtained, the object has an indeterminate
value, and if no initialization is performed for the object, that
object retains an indeterminate value until that value is replaced
(5.17). [ Note: Objects with static or thread storage duration are
zero-initialized, see 3.6.2. — end note ] If an indeterminate value is
produced by an evaluation, the behavior is undefined except in the
following cases:
— If an indeterminate value of unsigned narrow character type (3.9.1) is produced by the evaluation of:
    — the second or third operand of a conditional expression (5.16),
    — the right operand of a comma expression (5.18),
    — the operand of a cast or conversion to an unsigned narrow character type (4.7, 5.2.3, 5.2.9, 5.4), or
    — a discarded-value expression (Clause 5),
  then the result of the operation is an indeterminate value.
— If an indeterminate value of unsigned narrow character type is produced by the evaluation of the right operand of a simple assignment operator (5.17) whose first operand is an lvalue of unsigned narrow character type, an indeterminate value replaces the value of the object referred to by the left operand.
— If an indeterminate value of unsigned narrow character type is produced by the evaluation of the initialization expression when initializing an object of unsigned narrow character type, that object is initialized to an indeterminate value.
Bathsheba's answer correctly describes the letter of the C++ standard.
The bad news is that the modern compilers I have tested (GCC, Clang, MSVC, and ICC) all ignore the letter of the standard on this point. They instead treat the bald statement in Annex J.2 to the C standard
[the behavior is undefined if] the value of an object with automatic storage duration is used while it is indeterminate
as if it were 100% normative, in both C and C++, even though Annex J is not normative. This applies to all possible read accesses to uninitialized storage, including those carefully performed through unsigned char *, and, yes, including read accesses to padding bytes.
Moreover, if you were to file a bug report, I am confident that you would be told that, to the extent the normative text of the standard does not agree with what they are doing, it is the standard that is defective.
The good news is that you will only incur UB upon access to padding bytes if you inspect the contents of the padding bytes. Copying them around is OK. In particular, if you initialize all the named fields of a POD structure, it will be safe to copy it around by structure assignment and by memcpy, but it will not be safe to compare it to another such structure using memcmp.
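One way to stay clear of padding bytes altogether is to compare the named fields rather than the raw storage. The sketch below reuses the obj struct from the question; same_fields is a hypothetical helper, not something the question or the answers define.

```cpp
#include <cassert>

struct obj { char c; int* i; };  // same layout as in the question

// Hypothetical helper: compares only the named members, so the padding
// between c and i is never inspected.
inline bool same_fields(const obj& a, const obj& b) {
    return a.c == b.c && a.i == b.i;
}

// Hypothetical usage.
inline void check_same_fields() {
    assert(same_fields(obj{'b', nullptr}, obj{'b', nullptr}));
    assert(!same_fields(obj{'b', nullptr}, obj{'c', nullptr}));
}
```

Unlike memcmp, this gives the same answer whatever the padding happens to contain, at the cost of having to be updated whenever a member is added.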
Is the following safe?
*(new int);
I get output as 0.
It’s undefined because you’re reading an object with an indeterminate value. The expression new int() uses value-initialisation, which zero-initialises an int and so guarantees a zero value, while new int (without parentheses) uses default-initialisation, giving you an indeterminate value. This is effectively the same as saying:
int x; // not initialised
cout << x << '\n'; // undefined behaviour: reads an indeterminate value
But in addition, since you are immediately dereferencing the pointer to the object you just allocated, and do not store the pointer anywhere, this constitutes a memory leak.
Note that the presence of such an expression does not necessarily make a program ill-formed; this is a perfectly valid program, because it sets the value of the object before reading it:
int& x = *(new int); // x is an alias for a nameless new int of undefined value
x = 42;
cout << x << '\n';
delete &x;
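For contrast, here is a small sketch of the forms that may be read immediately because they value-initialize the int. The function names are made up for illustration, and std::make_unique assumes a C++14 or later compiler.

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper: new int() value-initializes, so *p is 0.
inline int read_value_initialized_new() {
    int* p = new int();
    int v = *p;   // fine: the object holds 0
    delete p;     // no leak: the pointer was kept
    return v;
}

// Hypothetical helper: std::make_unique<int>() also value-initializes
// and releases the memory automatically.
inline int read_via_unique_ptr() {
    auto p = std::make_unique<int>();
    return *p;
}

// Hypothetical usage.
inline void check_value_initialized_new() {
    assert(read_value_initialized_new() == 0);
    assert(read_via_unique_ptr() == 0);
}
```

Both helpers also avoid the leak mentioned above, since the pointer is either deleted explicitly or owned by a smart pointer.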
This is undefined behavior (UB) since you are accessing an indeterminate value; C++14 clearly makes this undefined behavior. We can see that new without an initializer default-initializes the object, from the draft C++14 standard section 5.3.4 New, paragraph 17, which says (emphasis mine going forward):
If the new-initializer is omitted, the object is default-initialized
(8.5). [ Note: If no initialization is performed, the object has an
indeterminate value. —end note ]
for int this means an indeterminate value, from section 8.5 paragraph 7 which says:
To default-initialize an object of type T means:
— if T is a (possibly cv-qualified) class type (Clause 9), the default constructor (12.1) for T is called (and
the initialization is ill-formed if T has no default constructor or overload resolution (13.3) results in an
ambiguity or in a function that is deleted or inaccessible from the context of the initialization);
— if T is an array type, each element is default-initialized;
— otherwise, no initialization is performed.
we can see from section 8.5 that producing an indeterminate value is undefined:
If no initializer is specified for an object, the object is
default-initialized. When storage for an object with automatic or
dynamic storage duration is obtained, the object has an indeterminate
value, and if no initialization is performed for the object, that
object retains an indeterminate value until that value is replaced
(5.17). [ Note: Objects with static or thread storage duration are
zero-initialized, see 3.6.2. — end note ]
If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases
and all the exceptions have to do with unsigned narrow char which int is not.
Jon brings up an interesting example:
int& x = *(new int);
It may not be immediately obvious why this is not undefined behavior. The key point to notice is that it is undefined behavior to produce a value, but in this case no value is produced. We can see this by going to section 8.5.3 References, which covers initialization of references and says:
A reference to type “cv1 T1” is initialized by an expression of type “cv2 T2” as follows:
— If the reference is an lvalue reference and the initializer expression
— is an lvalue (but is not a bit-field), and “cv1 T1” is reference-compatible with “cv2 T2,” or
and goes on to say:
then the reference is bound to the initializer expression lvalue in
the first case [...][ Note: The usual lvalue-to-rvalue (4.1),
array-to-pointer (4.2), and function-to-pointer (4.3) standard
conversions are not needed, and therefore are suppressed, when such
direct bindings to lvalues are done. —end note ]
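The same reasoning applies to an automatic variable. The sketch below (bind_then_assign is a hypothetical name) binds a reference to an uninitialized int, which reads nothing, and only produces a value after the assignment.

```cpp
#include <cassert>

// Hypothetical helper: the reference binding does not read x, so no
// indeterminate value is produced before the assignment.
inline int bind_then_assign() {
    int x;        // indeterminate value
    int& r = x;   // direct binding to an lvalue: no lvalue-to-rvalue conversion
    r = 42;       // the indeterminate value is replaced
    return x;     // reads the value written through r
}

// Hypothetical usage.
inline void check_bind_then_assign() {
    assert(bind_then_assign() == 42);
}
```

Swapping the last two statements of bind_then_assign, so that x is returned before r is assigned, would bring back the undefined behavior.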
It is possible that a computer has "trapping" values of int: invalid values, such as a checksum bit which raises a hardware exception when it doesn't match its expected state.
In general, uninitialized values lead to undefined behavior. Initialize it first.
Otherwise, no, there's nothing wrong or really unusual about dereferencing a new-expression. Here is some odd, but entirely valid code using your construction:
int & ir = * ( new int ) = 0;
…
delete & ir;
First of all, Shafik Yaghmour gave references to the Standard in his answer. That is the best, most complete and authoritative answer. Nonetheless, let me try to give you specific examples that should illustrate the aforementioned points.
This code is safe, well-formed and meaningful:
int *p = new int; // i.e. this is a local variable (a pointer) that points
// to a heap-allocated block
You must not, however, read through the pointer before the object has been given a value, as that results in undefined behavior. That is, you may get 0x00, or 0xFFFFFFFF, or the instruction pointer (the RIP register on x86-64) may jump to a random location. The computer may crash.
int *p = new int;
std::cout << *p; // Very bad. Undefined behavior.
Run-time checkers such as Valgrind and MemorySanitizer (MSan) can catch the issue and report it with a clear error message. AddressSanitizer, by contrast, does not track uninitialized reads.
It is, however, perfectly fine to initialize the memory block you had allocated:
int *p = new int;
*p = 0;
Background info: this particular way of writing the specification is very useful for performance, as it is prohibitively expensive to implement the alternative.
Note, as per the Standard references, sometimes the initialization is cheap, so you can do the following:
// at the file scope
int global1; // zero-initialized
int global2 = 1; // explicitly initialized
void f()
{
std::cout << global1;
}
These things go into the executable's sections (.bss and .data) and are initialized by the OS loader.
As covered in Does initialization entail lvalue-to-rvalue conversion? Is int x = x; UB?, the C++ standard has a surprising example in section 3.3.2 Point of declaration in which an int is initialized with its own indeterminate value:
int x = 12;
{ int x = x; }
Here the second x is initialized with its own (indeterminate) value.
— end example ]
Johannes's answer to this question indicates that this is undefined behavior, since it requires an lvalue-to-rvalue conversion.
In the latest C++14 draft standard, N3936, this example has changed to:
unsigned char x = 12;
{ unsigned char x = x; }
Here the second x is initialized with its own (indeterminate) value.
— end example ]
Has something changed in C++14 with respect to indeterminate values and undefined behavior that has driven this change in the example?
Yes, this change was driven by changes in the language which make it undefined behavior if an indeterminate value is produced by an evaluation, with some exceptions for unsigned narrow characters.
Defect report 1787, whose proposed text can be found in N3914 [1], was accepted in 2014 and is incorporated in the latest working draft, N3936.
The most interesting change with respect to indeterminate values would be to section 8.5 paragraph 12 which goes from:
If no initializer is specified for an object, the object is default-initialized; if no initialization is performed, an object with automatic or dynamic storage duration has indeterminate value. [ Note: Objects with static or thread storage duration are zero-initialized, see 3.6.2. — end note ]
to (emphasis mine):
If no initializer is specified for an object, the object is
default-initialized. When storage for an object with automatic or
dynamic storage duration is obtained, the object has an indeterminate
value, and if no initialization is performed for the object, that
object retains an indeterminate value until that value is replaced
(5.17 [expr.ass]). [Note: Objects with static or thread storage
duration are zero-initialized, see 3.6.2 [basic.start.init]. —end
note] If an indeterminate value is produced by an evaluation, the
behavior is undefined except in the following cases:
— If an indeterminate value of unsigned narrow character type (3.9.1 [basic.fundamental]) is produced by the evaluation of:
    — the second or third operand of a conditional expression (5.16 [expr.cond]),
    — the right operand of a comma (5.18 [expr.comma]),
    — the operand of a cast or conversion to an unsigned narrow character type (4.7 [conv.integral], 5.2.3 [expr.type.conv], 5.2.9 [expr.static.cast], 5.4 [expr.cast]), or
    — a discarded-value expression (Clause 5 [expr]),
  then the result of the operation is an indeterminate value.
— If an indeterminate value of unsigned narrow character type (3.9.1 [basic.fundamental]) is produced by the evaluation of the right operand of a simple assignment operator (5.17 [expr.ass]) whose first operand is an lvalue of unsigned narrow character type, an indeterminate value replaces the value of the object referred to by the left operand.
— If an indeterminate value of unsigned narrow character type (3.9.1 [basic.fundamental]) is produced by the evaluation of the initialization expression when initializing an object of unsigned narrow character type, that object is initialized to an indeterminate value.
and included the following example:
[ Example:
int f(bool b) {
unsigned char c;
unsigned char d = c; // OK, d has an indeterminate value
int e = d; // undefined behavior
return b ? d : 0; // undefined behavior if b is true
}
— end example ]
This text can be found in N3936, the current working draft; N3937 is the C++14 DIS.
Prior to C++1y
It is interesting to note that, prior to this draft, C++ used the term indeterminate value without even defining it (assuming we cannot borrow the definition from C99), unlike C, which has always had a well-specified notion of which uses of indeterminate values are undefined; see also defect report 616. We had to rely on the underspecified lvalue-to-rvalue conversion, which the draft C++11 standard covers in section 4.1 Lvalue-to-rvalue conversion, paragraph 1:
[...]if the object is uninitialized, a program that necessitates this conversion has undefined behavior.[...]
Footnotes:
[1] Defect report 1787 is a revision of defect report 616; that information can be found in N3903.