std::start_lifetime_as and UB in C++23 multithreaded application

Assuming X and Y are suitable types for such usage, is it UB to use std::start_lifetime_as<X> on an area of memory in one thread as one type and use std::start_lifetime_as<Y> on the exact same memory in another thread? Does the standard say anything about this? If it doesn't, what is the correct interpretation?
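For concreteness, here is a minimal sketch of the scenario (X, Y, and buf are made up for illustration; note also that standard-library support for std::start_lifetime_as is still sparse at the time of writing):

#include <cstddef>   // std::max_align_t
#include <memory>    // std::start_lifetime_as (C++23)
#include <thread>

struct X { int a; };    // implicit-lifetime types, as required
struct Y { float b; };

alignas(std::max_align_t) unsigned char buf[64];  // storage shared by both threads

int main() {
    std::thread t1([] { X *px = std::start_lifetime_as<X>(buf); (void)px; });
    std::thread t2([] { Y *py = std::start_lifetime_as<Y>(buf); (void)py; });
    t1.join();
    t2.join();
}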

There is no data race from such calls, since none of them access any memory locations, but since (without synchronization) neither thread can know that the other has not ended the lifetime of its desired object by reusing its storage for an object of the other type, the objects created cannot be used. (There are not “even odds” that one thread can use them because it “went last”: there is an execution where it didn’t, so relying on that would have undefined behavior.)

Object lifetime is actually one of the more underspecified parts of the standard, especially when it comes to concurrency (and in some places the wording is outright defective IMO), but I think this specific question is answerable with what's there.
First, let's get data races out of the way.
[intro.races]/21:
The execution of a program contains a data race if it contains two potentially concurrent conflicting actions [...]
[intro.races]/2:
Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location.
[intro.memory]/3:
A memory location is either an object of scalar type that is not a bit-field or a maximal sequence of adjacent bit-fields all having nonzero width.
Two unrelated objects are definitely not the same 'memory location', so [intro.races]/21 doesn't apply.
However, [intro.object]/9 says:
Two objects with overlapping lifetimes that are not bit-fields may have the same address if one is nested within the other, or if at least one is a subobject of zero size and they are of different types; otherwise, they have distinct addresses and occupy disjoint bytes of storage.
This means that out of any two (unrelated) objects with overlapping storage, at most one can be within lifetime at any given point. [basic.life]/1.5 ensures this:
The lifetime of an object o of type T ends when: [...]
the storage which the object occupies is released, or is reused by an object that is not nested within o.
Accessing (reading or writing) an object outside its lifetime is not allowed ([basic.life]/4), and we've just established that X and Y can't both be within lifetime at the same time. So, if both threads proceed to access the created objects, the behavior is undefined: at least one will be accessing an object whose lifetime has ended.
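To illustrate the flip side of that reasoning, a hedged sketch (the mutex, types, and buffer are mine, not from the question): if the reuses are ordered by synchronization, each thread can safely use its object while it holds the lock, since it then knows that no later reuse has intervened:

#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>

struct X { int a; };
struct Y { float b; };

alignas(std::max_align_t) unsigned char buf[64];
std::mutex m;

int main() {
    std::thread t1([] {
        std::lock_guard<std::mutex> lk(m);
        X *px = std::start_lifetime_as<X>(buf);
        px->a = 1;      // OK while the lock is held: no later reuse intervenes
    });
    std::thread t2([] {
        std::lock_guard<std::mutex> lk(m);
        Y *py = std::start_lifetime_as<Y>(buf);
        py->b = 2.0f;   // likewise OK inside the lock
    });
    t1.join();
    t2.join();
}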

Related

Is object existence distinct from object lifetime?

It may sound philosophical, but it isn't: in C++, can objects of various kinds (class types, scalars) exist outside of their lifetime? What is the existence of an object? What is the creation of an object? Is an object created when its lifetime starts?
(Edited for clarity: the question is not specifically about class types.)
I'm extremely confused and need someone to explain the terminology and basic concepts to me.
Note: Existing is the fact of being a thing. It's the most fundamental philosophical concept. It isn't an attribute of an object and I don't know nor do I care about the number of occurrences of the word "exist" in the standard text. Textbooks probably very rarely say that things "exist". I don't remember ever reading that numbers "exist" in registers or that expressions "exist" in the source code. Numbers fit in registers and the source code has expressions in it.
If we can refer to an object, it means it exists. If a pointer points to an object, that object exists.
This is best understood with [intro.object]/1:
An object is created by a definition, by a new-expression, when implicitly changing the active member of a union, or when a temporary object is created ([conv.rval], [class.temporary]). An object occupies a region of storage in its period of construction ([class.cdtor]), throughout its lifetime, and in its period of destruction ([class.cdtor]).
The second sentence there answers your question. If an object is able to occupy a region of storage outside of the period described as "its lifetime" (like during its construction and destruction), then clearly an object must be able to exist outside of its lifetime.
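A short illustration (Widget is an invented type): during the constructor body the object's lifetime has not yet begun, yet the object already exists and occupies storage, which is why this is usable there ([class.cdtor]):

#include <iostream>

struct Widget {
    int value;
    Widget() {
        // Period of construction: the Widget's lifetime has not started
        // yet, but the object already exists and occupies storage, so
        // `this` is a usable pointer to it.
        std::cout << "constructing at " << this << '\n';
        value = 42;
    }
};

int main() {
    Widget w;
}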
An object in C++ is an abstract concept. To the machine code, memory is just a big virtual array of bytes.
The physical "object" is a bunch of bytes in memory that get allocated for some purpose, together with some functions that act on those bytes. The lifetime of that object runs from when those bytes start being used for the object's purpose to when they are no longer so used. The bytes still exist afterwards, but they're available to be used by some other object.
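A sketch of that byte-reuse view (A, B, and storage are invented for illustration):

#include <cstddef>
#include <new>

struct A { int x; };
struct B { float y; };

alignas(std::max_align_t) unsigned char storage[64];  // big enough for either

int main() {
    A *a = new (storage) A{1};     // the bytes now serve an A
    a->~A();                       // A's lifetime ends; the bytes remain
    B *b = new (storage) B{2.0f};  // the same bytes reused for a B
    b->~B();
}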

Is it legal to compare dangling pointers?

Is it legal to compare dangling pointers?
#include <iostream>

int main() {
    int *p, *q;
    {
        int a;
        p = &a;   // a's lifetime ends at the closing brace
    }
    {
        int b;
        q = &b;   // likewise for b
    }
    std::cout << (p == q) << '\n';  // both pointers now dangle
}
Note how both p and q point to objects that have already vanished. Is this legal?
Introduction: The first issue is whether it is legal to use the value of p at all.
After a has been destroyed, p acquires what is known as an invalid pointer value. Quote from N4430 (for discussion of N4430's status see the "Note" below):
When the end of the duration of a region of storage is reached, the values of all pointers representing the address of any part of the deallocated storage become invalid pointer values.
The behaviour when an invalid pointer value is used is also covered in the same section of N4430 (and almost identical text appears in C++14 [basic.stc.dynamic.deallocation]/4):
Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior.
[ Footnote: Some implementations might define that copying an invalid pointer value causes a system-generated runtime fault. — end footnote ]
So you will need to consult your implementation's documentation to find out what should happen here (since C++14).
The term use in the above quotes means necessitating lvalue-to-rvalue conversion, as in C++14 [conv.lval]/2:
When an lvalue-to-rvalue conversion is applied to an expression e, and [...] the object to which the glvalue refers contains an invalid pointer value, the behaviour is implementation-defined.
History: In C++11 this said undefined rather than implementation-defined; it was changed by DR1438. See the edit history of this post for the full quotes.
Application to p == q: Supposing we have accepted in C++14+N4430 that the result of evaluating p and q is implementation-defined, and that the implementation does not define that a hardware trap occurs; [expr.eq]/2 says:
Two pointers compare equal if they are both null, both point to the same function, or both represent the same address (3.9.2), otherwise they compare unequal.
Since it's implementation-defined what values are obtained when p and q are evaluated, we can't say for sure what will happen here. But it must be either implementation-defined or unspecified.
g++ appears to exhibit unspecified behaviour in this case; depending on the -O switch I was able to have it say either 1 or 0, corresponding to whether or not the same memory address was re-used for b after a had been destroyed.
Note about N4430: This is a proposed defect resolution to C++14 that has not yet been accepted. It cleans up a lot of wording surrounding object lifetime, invalid pointers, subobjects, unions, and array bounds access.
In the C++14 text, it is defined under [basic.stc.dynamic.deallocation]/4 and subsequent paragraphs that an invalid pointer value arises when delete is used. However it's not clearly stated whether or not the same principle applies to static or automatic storage.
There is a definition of "valid pointer" in [basic.compound]/3, but it is too vague to use sensibly. A footnote in [basic.life]/5 refers to the same text to define the behaviour of pointers to objects of static storage duration, which suggests that it was meant to apply to all storage durations.
In N4430 the text is moved from that section up one level so that it does clearly apply to all storage durations. There is a note attached:
Drafting note: this should apply to all storage durations that can end, not just to dynamic storage duration. On an implementation supporting threads or segmented stacks, thread and automatic storage may behave in the same way that dynamic storage does.
My opinion: I don't see any consistent way to interpret the standard (pre-N4430) other than to say that p acquires an invalid pointer value. The behaviour doesn't seem to be covered by any other section besides what we have already looked at. So I am happy to treat the N4430 wording as representing the intent of the standard in this case.
Historically, there have been some systems where using a pointer as an rvalue might cause the system to fetch some information identified by some bits in that pointer. For example, if a pointer could contain the address of an object's header along with an offset into the object, fetching a pointer could cause the system to also fetch some information from that header. If the object has ceased to exist, the attempt to fetch information from its header could fail with arbitrary consequences.
That having been said, in the vast majority of C implementations, all pointers that were alive at some particular moment in time will forever hold the same relationships with regard to the relational and subtraction operators as they had at that particular time. Indeed, in most implementations if one has char *p, one may determine whether it identifies part of an object identified by char *base; size_t size; by checking whether (size_t)(p-base) < size; such comparison will work even retrospectively if there is any overlap in the objects' lifetime.
Unfortunately, the Standard defines no means by which code can indicate that it requires any of the latter guarantees, nor is there a standard means by which code can ask whether a particular implementation can promise any of the latter behaviors and refuse compilation if it does not. Further, some hyper-modern implementations will regard any use of relational or subtraction operators on two pointers as a promise by the programmer that the pointers in question will always identify the same live object, and omit any code which would only be relevant if that assumption didn't hold. Consequently, even though many hardware platforms would be able to offer guarantees that would be useful to many algorithms, there's no safe way by which code can exploit any such guarantees even if code will never need to run on hardware which does not naturally provide them.
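For concreteness, the containment idiom described two paragraphs up might be written as follows (the function name is mine); as just noted, the Standard gives it no blessing when p does not point into the base object:

#include <cstddef>

// Test whether p points into [base, base + size). Most implementations
// happen to honour this even for unrelated or dead objects, but the
// Standard does not guarantee it.
bool points_into(char *p, char *base, std::size_t size) {
    return (std::size_t)(p - base) < size;
}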
The pointers contain the addresses of the variables they reference. The addresses are valid even when the variables that used to be stored there are released / destroyed / unavailable.
As long as you don't try to use the values at those addresses you are safe; dereferencing them (*p and *q) is undefined behaviour.
Obviously the result is implementation-defined, so this code example can be used to study the behaviour of your compiler if one doesn't want to dig into the assembly code.
Whether this is a meaningful practice is totally different discussion.

What is the meaning of this piece of Standardese about shared_ptr's use_count()?

While trying to wrap my head around the problem shown in this question I found myself stuck on the following sentence from [util.smartptr.shared]/4:
[...] Changes in use_count() do not reflect modifications that can introduce data races.
I don't understand how I should read that, and what conclusions shall I draw.
Here are a few interpretations:
Invoking use_count() does not introduce data races (but this should be guaranteed by the const-ness of that function alone, together with the corresponding library-wide guarantees)
The value returned by use_count() is not influenced by ("does not reflect"?) the outcome of operations that require atomicity or synchronization (but what are these relevant operations?)
use_count() is executed atomically, but without preventing reordering by the CPU or the compiler (i.e. without sequential consistency, but then why not mentioning the particular model?)
To me, none of the above seems to follow from that sentence, and I am at loss trying to interpret it.
The current wording derives from library issue 896, which also addressed the question of whether shared_ptr should be thread safe in the sense that distinct shared_ptrs owning the same object can be accessed (in particular, copied and destructed) from distinct threads simultaneously. The conclusion of that discussion was that shared_ptr should be thread-safe; the mechanism to guarantee this was to pretend that shared_ptr member functions only access the shared_ptr object itself and not its on-heap control block:
For purposes of determining the presence of a data race, member functions access and modify only the shared_ptr and weak_ptr objects themselves and not objects they refer to.
Here "objects they refer to" means the control block.
However this raises an issue; if we pretend that distinct shared_ptrs owning the same object don't access the control block, then surely use_count() cannot change? This is patched by making use_count() a magic function that produces a result out of thin air:
Changes in use_count() do not reflect modifications that can introduce data races.
That is, use_count() can change from one call to the next, but that does not mean that a data race (or potential data race) has occurred. This is perhaps clearer from the previous wording of that sentence:
[Note: This is true in spite of that fact that such functions often modify use_count() --end note]
It means that the code in use_count() is either lock-free or uses mutexes to lock critical sections. In other words, you can call it from multiple threads without worrying about race conditions.
I think when we add the previous sentence the intent becomes a little clearer:
For purposes of determining the presence of a data race, member functions shall access and modify only the shared_ptr and weak_ptr objects themselves and not objects they refer to. Changes in use_count() do not reflect modifications that can introduce data races.
So, the last sentence is just emphasizing the same point as the first sentence. For example, if I copy a shared_ptr, its use-count will be incremented to reflect the fact that the shared_ptr has been copied--therefore the result from use_count() will be changed--but this is not allowed to access (and, especially, not allowed to modify) the pointee object, so it can never introduce a data race in use of that pointee object.
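A minimal sketch of the practical upshot (the worker lambda and thread setup are mine):

#include <memory>
#include <thread>

int main() {
    auto sp = std::make_shared<int>(42);
    auto worker = [](std::shared_ptr<int> local) {
        // Copying and destroying this shared_ptr changes what use_count()
        // would return, but for data-race purposes it touches neither the
        // pointee nor (officially) the control block.
        int v = *local;  // reading the pointee is also race-free here
        (void)v;
    };
    std::thread a(worker, sp), b(worker, sp);  // each thread owns a copy
    a.join();
    b.join();
}   // use_count() changed throughout, yet no data race occurred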

Allowed compiler optimizations on loops in C++11

Is a C++11-compliant compiler allowed to optimize/transform this code from:
bool x = true; // *not* an atomic type, but suppose bool can be read/written atomically
/*...*/
{
while (x); // spins until another thread changes the value of x
}
to anything equivalent to an infinite loop:
{
while (true); // infinite loop
}
The above conversion is certainly valid from the point of view of a single-thread program, but this is not the general case.
Also, was that optimization allowed in pre-C++11?
Absolutely.
Since x is not marked as volatile and nothing in the code shown ever modifies it after initialization (a concurrent non-atomic write from another thread would be a data race), the two programs are equivalent.
In both C++03 and C++11 this is by the as-if rule, since accessing a non-volatile object is not considered to be a "side effect" of the program:
[C++11: 1.9/12]: Accessing an object designated by a volatile glvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression (or a sub-expression) in general includes
both value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and initiation of side effects. When a call to a library I/O function returns or an access to a volatile object is evaluated the side effect is considered complete, even though some external actions implied by the call (such as the I/O itself) or by the volatile access may not have completed yet.
C++11 does make room for a global object to have its value changed in one thread and for that new value to be read in another:
[C++11: 1.10/3]: The value of an object visible to a thread T at a particular point is the initial value of the object, a value assigned to the object by T, or a value assigned to the object by another thread, according to the rules below.
However, if you're doing this, since your object is not atomic:
[C++11: 1.10/21]: The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior.
And, when undefined behaviour is invoked, anything can happen.
Bootnote
[C++11: 1.10/25]: An implementation should ensure that the last value (in modification order) assigned by an atomic or synchronization operation will become visible to all other threads in a finite period of time.
Again, note that the object would have to be atomic (say, std::atomic<bool>) to obtain this guarantee.
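For comparison, a minimal sketch of the version that does get that guarantee (the memory order is my choice; relaxed suffices for a lone flag like this):

#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>

std::atomic<bool> x{true};   // atomic, unlike the bool in the question

int main() {
    std::thread waiter([] {
        while (x.load(std::memory_order_relaxed)) {
            // spin: an atomic load may not be hoisted out of the loop,
            // and the store below must become visible in finite time
        }
        std::cout << "released\n";
    });
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    x.store(false, std::memory_order_relaxed);
    waiter.join();
}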
The compiler is allowed to do anything to those two loops, including terminating the program, because an infinite loop has undefined behaviour if it does not perform a synchronization-like operation (something that requires synchronization with another thread, or I/O), according to the C++ memory model:
Note that it means that a program with endless recursion or endless loop (whether implemented as a for-statement or by looping goto or otherwise) has undefined behavior.

Why are global and static variables initialized to their default values?

In C/C++, why are globals and static variables initialized to default values?
Why not leave it with just garbage values? Are there any special
reasons for this?
Security: leaving memory alone would leak information from other processes or the kernel.
Efficiency: the values are useless until initialized to something, and it's more efficient to zero them in a block with unrolled loops. The OS can even zero freelist pages when the system is otherwise idle, rather than when some client or user is waiting for the program to start.
Reproducibility: leaving the values alone would make program behavior non-repeatable, making bugs really hard to find.
Elegance: it's cleaner if programs can start from 0 without having to clutter the code with default initializers.
One might then wonder why the auto storage class does start as garbage. The answer is two-fold:
It doesn't, in a sense. The very first use of each stack page (i.e., every new page added to the stack) does receive zero values. The "garbage" or "uninitialized" values that later function invocations at the same stack depth see are really the values left behind by earlier calls in your own program and its libraries.
There might be a quadratic (or whatever) runtime performance penalty associated with initializing auto (function locals) to anything. A function might not use any or all of a large array, say, on any given call, and it could be invoked thousands or millions of times. The initialization of statics and globals, OTOH, only needs to happen once.
Because with the proper cooperation of the OS, 0 initializing statics and globals can be implemented with no runtime overhead.
Section 6.7.8 (Initialization) of the C99 standard (N1256) answers this question:
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static storage duration is not initialized explicitly, then:
— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules;
— if it is a union, the first named member is initialized (recursively) according to these rules.
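The same defaults carry over to C++; a quick sanity check (variable names invented):

#include <cassert>

static int counter;      // arithmetic type: initialized to zero
static int *head;        // pointer type: initialized to a null pointer
static int table[16];    // aggregate: every element follows the same rules

int main() {
    assert(counter == 0);
    assert(head == nullptr);
    assert(table[15] == 0);
}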
Think about it: in the static realm you can't always tell for sure whether something has been initialized, or whether main has started. There are also static and dynamic initialization phases, with the static phase running first, before the dynamic phase, in which order matters.
If statics were not zeroed out, you would be completely unable to tell during this phase whether anything had been initialized at all; in short, the C++ world would fly apart, and basic things like singletons (or any sort of dynamic static init) would simply cease to work.
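For instance, a function-local static (the classic Meyers singleton) leans on exactly this: the compiler-generated guard that records "already constructed?" is itself a zero-initialized static, at least under the typical implementation technique (an assumption on my part):

struct Config {
    int verbosity = 1;   // dynamic (run-time) initialization
};

Config &config() {
    // The guard tracking whether `instance` has been constructed is a
    // zero-initialized static under the hood; zero reliably means
    // "not yet initialized" before any code has run.
    static Config instance;
    return instance;
}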
The answer with the bullet points is enthusiastic but a bit silly. Those reasons could all apply to nonstatic allocation too, but that isn't done (well, sometimes, but not usually).
In C, statically-allocated objects without an explicit initializer are initialized to zero (for arithmetic types) or a null pointer (for pointer types). Implementations of C typically represent zero values and null pointer values using a bit pattern consisting solely of zero-valued bits (though this is not required by the C standard). Hence, the bss section typically includes all uninitialized variables declared at file scope (i.e., outside of any function) as well as uninitialized local variables declared with the static keyword.
Source: Wikipedia