This question might seem naive (hell, I think it is) but I am unable to find an answer that satisfies me.
Take this simple C++ program:
#include <iostream>
using namespace std;

int main()
{
    bool b;
    cout << b;
    return 0;
}
When compiled and executed, it always prints 0.
The problem is that this is not what I expected it to do: as far as I know, a local variable has no initialization value, and I would have thought a random byte is far more likely to be nonzero than zero.
What am I missing?
That is undefined behavior, because you are using the value of an uninitialized variable. You cannot expect anything out of a program with undefined behavior.
In particular, your program necessitates a so-called lvalue-to-rvalue conversion when initializing the parameter of operator << from b. Paragraph 4.1/1 of the C++11 Standard specifies:
A glvalue (3.10) of a non-function, non-array type T can be converted to a prvalue. If T is an incomplete type, a program that necessitates this conversion is ill-formed. If the object to which the glvalue refers is not an object of type T and is not an object of a type derived from T, or if the object is uninitialized, a program that necessitates this conversion has undefined behavior. If T is a non-class type, the type of the prvalue is the cv-unqualified version of T. Otherwise, the type of the prvalue is T.
The behaviour is undefined; there is no requirement for it to be assigned a random value, and certainly not a uniformly-distributed one.
What is probably happening is that the memory allocated to the process is zero-initialised by the operating system, and this is the first time that that byte is used, so it still contains zero.
But, like all undefined behaviour, you can't rely on it and there's little point speculating about the details.
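That said, you can often watch this explanation in action: if the same stack region has already been written to, the uninitialized read tends to pick up whatever was left there. A small sketch (the dirty_stack helper is made up for illustration, and since the read is still UB, nothing here is guaranteed):

#include <iostream>

void dirty_stack() {
    // scribble a recognizable pattern over some stack memory
    volatile char buf[64];
    for (int i = 0; i < 64; ++i) buf[i] = 0x55;
}

int main() {
    dirty_stack();          // reuse the same stack region first
    bool b;                 // still uninitialized: reading it is UB
    std::cout << b << '\n'; // may now print a nonzero value, or it may not
}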
As Andy said, it's undefined behaviour. I think the fact that you are so lucky and always receive 0 is implementation defined. Probably the stack is empty and clean (initialized with zeros) when your program starts, so it happens that you get zero when a variable is allocated there.
This may be guaranteed to succeed in your current implementation (as others said, maybe it initializes the stack with zeros) but it's also guaranteed to fail in, say, a Visual C++ debug build (which initializes local variables to 0xCCCCCCCC).
C++ static and global variables are initialized by default, as C89 and C99 say, and arithmetic variables specifically are initialized to 0.
Auto variables are indeterminate, which per C99 (3.17.2) means they hold either an unspecified value or a trap representation. What a trap representation means for the bool type is compiler-specific.
Related
I am reading through the cppreference page on default initialization and I noticed a section that states something along these lines:
// UB
int x;
int y = x;

// Defined and OK
unsigned char c;
unsigned char d = c;
And the same rule that applies to unsigned char applies to std::byte as well.
My question is: why does every other non-class type (int, bool, char, etc.) result in UB if you try to use the value before assigning it (as in the example above), but not unsigned char? Why is unsigned char special?
The page I am reading for reference
The difference is not in initialisation behaviour. The value of an uninitialised int is indeterminate, and default-initialisation leaves it indeterminate. The value of an uninitialised unsigned char is indeterminate, and default-initialisation leaves it indeterminate. There is no difference there.
The difference is that producing an indeterminate value of type int (or any type other than the exceptional unsigned char and std::byte) is undefined behaviour, unless the value is discarded.
The exception for unsigned char (and later std::byte) was added to the language in C++14, when indeterminate value was properly defined (although, since the change was a defect resolution, to my understanding it applies to the official standard of the time, C++11).
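A minimal sketch of the distinction, with the UB cases commented out (std::byte needs C++17 and <cstddef>):

#include <cstddef>

int main() {
    int x;                // indeterminate
    // int y = x;         // UB: produces an indeterminate value of type int

    unsigned char c;      // indeterminate
    unsigned char d = c;  // OK: d simply becomes indeterminate as well
    // int n = c;         // UB: conversion to int is not among the exceptions

    std::byte b;          // indeterminate
    std::byte b2 = b;     // OK: the same exception applies to std::byte
}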
I could not find a documented rationale for that design choice. Here is a timeline of the definitions (all standard quotes are from drafts):
C89 - 1.6 DEFINITIONS OF TERMS
Undefined behavior --- behavior, upon use of ... indeterminately-valued objects
C89 - 3.5.7 Initialization - Semantics
... If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.
There are no exceptions for any type. You'll see why the C standard is relevant when reading the C++98 standard.
C++98 - [dcl.init]
... Otherwise, if no initializer is specified for an object, the object and its subobjects, if any, have an indeterminate initial value
There was no definition of what an indeterminate value means or what happens when you use it. The intended meaning was presumably the same as in C89, but it is underspecified.
C99 - 3. Terms, definitions, and symbols - 3.17.2
3.17.2 indeterminate value
either an unspecified value or a trap representation
3.17.3 unspecified value
valid value of the relevant type where this International Standard imposes no requirements on which value is chosen in any instance
NOTE An unspecified value cannot be a trap representation.
C99 - 6.2.6 Representations of types - 6.2.6.1 General
Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined. 41) Such a representation is called a trap representation.
C99 - J.2 Undefined behavior
The behavior is undefined in the following circumstances:
...
The value of an object with automatic storage duration is used while it is indeterminate
A trap representation is read by an lvalue expression that does not have character type
A trap representation is produced by a side effect that modifies any part of the object using an lvalue expression that does not have character type
...
C99 introduced the term trap representation, which also causes UB when used, just like indeterminate values. Character types (which are char, unsigned char and signed char) have no trap representations, and may be used to operate on trap representations of other types without UB.
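In C++ terms, the corresponding escape hatch looks like this sketch: the bytes of an uninitialized object may be copied through unsigned char lvalues, but doing anything further with them would still be UB:

#include <cstddef>

int main() {
    float f;                          // uninitialized; reading f as a float is UB
    unsigned char bytes[sizeof f];
    unsigned char* p = reinterpret_cast<unsigned char*>(&f);
    for (std::size_t i = 0; i < sizeof f; ++i)
        bytes[i] = p[i];              // OK: unsigned char assigned to unsigned char
    // bytes[] now holds indeterminate values; using them in arithmetic is UB
}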
C++ core language issue - 616. Definition of “indeterminate value”
The C++ Standard uses the phrase “indeterminate value” without defining it. C99 defines it as “either an unspecified value or a trap representation.” Should C++ follow suit?
Proposed resolution (October, 2012):
Change [dcl.init] paragraph 12 as follows:
If no initializer is specified for an object, the object is default-initialized. When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced (5.17 [expr.ass]). [Note: Objects with static or thread storage duration are zero-initialized, see 3.6.2 [basic.start.init]. —end note] If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases:
If an indeterminate value of unsigned narrow character type (3.9.1 [basic.fundamental]) is produced by the evaluation of:
the second or third operand of a conditional expression (5.16 [expr.cond]),
the right operand of a comma (5.18 [expr.comma]),
the operand of a cast or conversion to an unsigned narrow character type (4.7 [conv.integral], 5.2.3 [expr.type.conv], 5.2.9 [expr.static.cast], 5.4 [expr.cast]), or
a discarded-value expression (Clause 5 [expr]),
then the result of the operation is an indeterminate value.
If an indeterminate value of unsigned narrow character type (3.9.1 [basic.fundamental]) is produced by the evaluation of the right operand of a simple assignment operator (5.17 [expr.ass]) whose first operand is an lvalue of unsigned narrow character type, an indeterminate value replaces the value of the object referred to by the left operand.
If an indeterminate value of unsigned narrow character type (3.9.1 [basic.fundamental]) is produced by the evaluation of the initialization expression when initializing an object of unsigned narrow character type, that object is initialized to an indeterminate value.
The proposed change was accepted as a defect resolution with some further changes (issue 1213) but has remained mostly the same (similar enough for purposes of this question). This is where the exception for unsigned char seems to have been introduced into C++. The core language issue has no public comments or notes about the rationale for the exception as far as I could find.
Under C89 and C99, uninitialized values could have any bit pattern. If addressable locations have n bits, then unsigned char was guaranteed to have 2ⁿ possible values, so every possible bit pattern would be a valid value. Other types, however, would on some platforms be stored in ways where not all bit patterns were valid. The Standard imposed no requirements on what might happen if code attempted to read an object when the stored bit pattern didn't represent a valid value, so the question of whether reading an object of a type other than unsigned char would yield an Unspecified Value, or could trigger arbitrary behavior, would depend upon whether the implementation's specified representation of the type assigned valid values to all possible bit patterns.
The C11 Standard added an additional proviso: even implementations which specify that all objects, whether or not their address is taken, will always be stored in ways where all bit patterns represent valid values may opt to behave in completely arbitrary fashion if an attempt is made to access an uninitialized object, other than an unsigned char whose address is taken. Although no rationale document is published for C11 (unlike earlier versions), I think such changes stem from a lack of consensus about whether the Standard is supposed to describe only the behavior of 100% portable programs, or of a wider variety of practical programs. If a program is going to be run on a completely unspecified implementation, then it is impossible to know what the effect of reading an uninitialized object would be except in the case specified by the C11 Standard. If it is going to be run on a known implementation, then it will be processed however that implementation decides to process it, whether or not the Standard mandates the behavior, so there should be no need to mandate anything in particular. Unfortunately, the authors of a Gratuitously "Clever" Compiler believe that when the Standard characterizes an action as "non-portable or erroneous" it really means "non-portable, and therefore erroneous", excluding the possibility of "non-portable but correct on the intended target", despite the fact that such a notion directly contradicts the published Rationale documents for earlier versions of the Standard.
When a class member cannot have a sensible meaning at the moment of construction, I don't initialize it. Obviously that only applies to POD types; you cannot NOT initialize an object with constructors.
The advantage of that, apart from saving CPU cycles initializing something to a value that has no meaning, is that I can detect erroneous usage of these variables with valgrind; that would not be possible if I just gave those variables some arbitrary value.
For example,
struct MathProblem {
    bool finished;
    double answer;
    MathProblem() : finished(false) { }
};
Until the math problem is solved (finished), there is no answer. It makes no sense to initialize answer in advance (to, say, zero) because that might not be the answer. answer only has a meaning after finished has been set to true.
Usage of answer before it is initialized is therefore an error, and it is perfectly OK for that to be UB.
However, a trivial copy of answer before it is initialized is currently ALSO UB (if I understand the standard correctly), and that doesn't make sense: the default copy and move constructors should simply be able to make a trivial copy (i.e., as if by memcpy), initialized or not. I might want to move this object into a container:
v.push_back(MathProblem());
and then work with the copy inside the container.
Is moving an object with an uninitialized, trivially copyable member indeed defined as UB by the standard? And if so, why? It doesn't seem to make sense.
Is moving an object with an uninitialized, trivially copyable member indeed defined as UB by the standard?
Depends on the type of the member. Standard says:
[basic.indet]
When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced ([expr.ass]).
If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases:
If an indeterminate value of unsigned ordinary character type ([basic.fundamental]) or std::byte type ([cstddef.syn]) is produced by the evaluation of:
the second or third operand of a conditional expression,
the right operand of a comma expression,
the operand of a cast or conversion ([conv.integral], [expr.type.conv], [expr.static.cast], [expr.cast]) to an unsigned ordinary character type or std::byte type ([cstddef.syn]), or
a discarded-value expression,
then the result of the operation is an indeterminate value.
If an indeterminate value of unsigned ordinary character type or std::byte type is produced by the evaluation of the right operand of a simple assignment operator ([expr.ass]) whose first operand is an lvalue of unsigned ordinary character type or std::byte type, an indeterminate value replaces the value of the object referred to by the left operand.
If an indeterminate value of unsigned ordinary character type is produced by the evaluation of the initialization expression when initializing an object of unsigned ordinary character type, that object is initialized to an indeterminate value.
If an indeterminate value of unsigned ordinary character type or std::byte type is produced by the evaluation of the initialization expression when initializing an object of std::byte type, that object is initialized to an indeterminate value.
None of the exceptional cases apply to your example object, so UB applies.
with memcpy is UB?
It is not. std::memcpy interprets the object as an array of bytes, in which exceptional case there is no UB. You still have UB if you attempt to read the indeterminate copy (unless the exceptions above apply).
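A self-contained sketch of that reading, reusing the question's MathProblem (the commented-out line is the one that would be UB):

#include <cstring>

struct MathProblem {
    bool finished;
    double answer;
    MathProblem() : finished(false) { }
};

int main() {
    MathProblem a;
    MathProblem b;
    std::memcpy(&b, &a, sizeof a);  // OK: copies the object representation as bytes
    bool f = b.finished;            // OK: finished was initialized in a
    // double d = b.answer;         // UB: reads the indeterminate copy
    return f ? 1 : 0;
}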
why?
The C++ standard doesn't include a rationale for most rules. This particular rule has existed since the first standard. It is slightly stricter than the related C rule, which is about trap representations. To my understanding, there is no established convention for trap handling, and the authors didn't wish to restrict implementations by specifying one, so they opted to specify it as UB instead. This also has the effect of allowing the optimiser to deduce that indeterminate values will never be read.
I might want to move this object into a container:
Moving an uninitialised object into a container is typically a logic error. It is unclear why you might want to do such thing.
The design of the C++ Standard was heavily influenced by the C Standard, whose authors (according to the published Rationale) intended and expected that implementations would, on a quality-of-implementation basis, extend the semantics of the language by meaningfully processing programs in cases where it was clear that doing so would be useful, even if the Standard didn't "officially" define the behavior of those programs. Consequently, both standards place more priority upon ensuring that they don't mandate behaviors in cases where doing so might make some implementations less useful, than upon ensuring that they mandate everything that should be supported by quality general-purpose implementations.
There are many cases where it may be useful for an implementation to extend the semantics of the language by guaranteeing that using memcpy on any valid region of storage will, at worst, behave in a fashion consistent with populating the destination with some possibly-meaningless bit pattern with no outside side effects, and few if any where it would be either easier or more useful to have it do something else. The only situations where anyone should care about whether the behavior of memcpy is defined in a particular situation involving valid regions of storage would be those in which some alternative behavior would be genuinely more useful than the commonplace one. If such situations exist, compiler writers and their customers would be better placed than the Committee to judge which behavior would be most useful.
As an example of a situation where an alternative behavior might be more useful, consider code which uses memcpy to copy a partially-written structure, and then uses the result to make two more copies of that structure. In some cases, having the compiler write only the parts of the two destination structures which had been written in the original may improve efficiency, but that behavior would be observably different from having the first memcpy behave as though it stores some bit pattern to its destination. Note that while such a change would not adversely affect a program's overall behavior if no copies of the uninitialized parts of the structure are ever used in a way that would affect behavior, the Standard has no nice way of distinguishing scenarios that could or could not occur under such a model, and thus leaves all such scenarios undefined.
For example, in the following code:
int myarray[3];
int x = myarray[1];
Is the code guaranteed to execute successfully in constant time, with x having some integral value? Or can the compiler skip emitting code for this entirely / emit code to launch GNU Chess and still comply with the C++ standard?
This is useful in a data structure that's like an array, but can be initialized in constant time. (Sorry, don't have my copy of Aho, Hopcroft and Ullman handy so can't look up the name.)
It's undefined behavior.
According to the standard ([dcl.init] paragraph 12),
If no initializer is specified for an object, the object is default-initialized. When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced ... If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases
with all of the following cases addressing accesses of unsigned narrow character type or std::byte, which yield an indeterminate value instead of undefined behavior.
Accessing any uninitialized data is undefined behavior.
So far I can't find how to deduce that the following:
int* ptr;
*ptr = 0;
is undefined behavior.
First of all, there's 5.3.1/1, which states that * means indirection and converts T* to T. But it says nothing about UB.
Then there's the often-quoted 3.7.3.2/4, which says that using a deallocation function on a non-null pointer renders the pointer invalid, and that later use of the invalid pointer is UB. But in the code above there's nothing about deallocation.
How can UB be deduced in the code above?
Section 4.1 looks like a candidate (emphasis mine):
An lvalue (3.10) of a non-function, non-array type T can be converted to an rvalue. If T is an incomplete type, a program that necessitates this conversion is ill-formed. If the object to which the lvalue refers is not an object of type T and is not an object of a type derived from T, or if the object is uninitialized, a program that necessitates this conversion has undefined behavior. If T is a non-class type, the type of the rvalue is the cv-unqualified version of T. Otherwise, the type of the rvalue is T.
I'm sure just searching on "uninitial" in the spec can find you more candidates.
I found the answer to this question in an unexpected corner of the C++ draft standard: section 24.2 Iterator requirements, specifically section 24.2.1 In general, paragraphs 5 and 10, which respectively say (emphasis mine):
[...][ Example: After the declaration of an uninitialized pointer x (as with int* x;), x must always be assumed to have a singular value of a pointer. —end example ] [...] Dereferenceable values are always non-singular.
and:
An invalid iterator is an iterator that may be singular.268
and footnote 268 says:
This definition applies to pointers, since pointers are iterators. The effect of dereferencing an iterator that has been invalidated is undefined.
Although it does look like there is some controversy over whether a null pointer is singular or not and it looks like the term singular value needs to be properly defined in a more general manner.
The intent of singular seems to be summed up well in defect report 278, What does iterator validity mean?, under the rationale section, which says:
Why do we say "may be singular", instead of "is singular"? That's because a valid iterator is one that is known to be nonsingular. Invalidating an iterator means changing it in such a way that it's no longer known to be nonsingular. An example: inserting an element into the middle of a vector is correctly said to invalidate all iterators pointing into the vector. That doesn't necessarily mean they all become singular.
So invalidation and being uninitialized may create a value that is singular, but since we cannot prove it is nonsingular, we must assume it is singular.
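A sketch of that distinction in code (the comments describe what the standard permits, not what any particular implementation will do):

#include <vector>

int main() {
    std::vector<int> v = {1, 2, 3};
    std::vector<int>::iterator it = v.begin();

    v.insert(v.begin() + 1, 42);  // invalidates all iterators into v; `it` may
                                  // now be singular, and we can no longer
                                  // prove that it is nonsingular
    // *it;                       // UB: dereferencing a possibly-singular iterator

    int* x;                       // uninitialized: must be assumed singular
    // *x;                        // UB: dereferenceable values are nonsingular
}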
Update
An alternative common-sense approach would be to note that draft standard section 5.3.1 Unary operators, paragraph 1, says (emphasis mine):
The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points.[...]
and if we then go to section 3.10 Lvalues and rvalues, paragraph 1 says (emphasis mine):
An lvalue (so called, historically, because lvalues could appear on the left-hand side of an assignment expression) designates a function or an object. [...]
but ptr will not, except by chance, point to a valid object.
The OP's question is nonsense. There is no requirement that the Standard say certain behaviours are undefined, and indeed I would argue that all such wording be removed from the Standard because it confuses people and makes the Standard more verbose than necessary.
The Standard defines certain behaviour. The question is, does it specify any behaviour in this case? If it does not, the behaviour is undefined whether or not it says so explicitly.
In fact the specification that some things are undefined is left in the Standard primarily as a debugging aid for the Standards writers, the idea being to generate a contradiction if there is a requirement in one place which conflicts with an explicit statement of undefined behaviour in another: that's a way to prove a defect in the Standard. Without the explicit statement of undefined behaviour, the other clause prescribing behaviour would be normative and unchallenged.
Evaluating an uninitialized pointer causes undefined behaviour. Since dereferencing the pointer first requires evaluating it, this implies that dereferencing also causes undefined behaviour.
This was true in both C++11 and C++14, although the wording changed.
In C++14 it is fully covered by [dcl.init]/12:
When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced.
If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases:
where the "following cases" are particular operations on unsigned char.
In C++11, [conv.lval/2] covered this under the lvalue-to-rvalue conversion procedure (i.e. retrieving the pointer value from the storage area denoted by ptr):
A glvalue of a non-function, non-array type T can be converted to a prvalue. If T is an incomplete type, a program that necessitates this conversion is ill-formed. If the object to which the glvalue refers is not
an object of type T and is not an object of a type derived from T, or if the object is uninitialized, a program that necessitates this conversion has undefined behavior.
The bolded part ("or if the object is uninitialized") was removed for C++14 and replaced with the extra text in [dcl.init]/12.
I'm not going to pretend I know a lot about this, but some compilers would initialize the pointer to NULL, and dereferencing a NULL pointer is UB.
Also, considering that an uninitialized pointer could point to anything (which includes NULL), you could conclude that dereferencing it is UB.
A note in section 8.3.2 [dcl.ref]
[Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior. As described in 9.6, a reference cannot be bound directly to a bitfield. ]
—ISO/IEC 14882:1998(E), the ISO C++ standard, in section 8.3.2 [dcl.ref]
I think I should have written this as a comment instead; I'm not really that sure.
To dereference the pointer, you need to read from the pointer variable (we are not yet talking about the object it points to). Reading from an uninitialized variable is undefined behaviour.
What you do with the pointer's value after you have read it no longer matters at that point; the behaviour is already undefined, whether you then write to (as in your example) or read from the object it points to.
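In other words (a two-line sketch; both commented-out lines are already UB before any pointee is involved):

int main() {
    int* ptr;         // indeterminate value
    // int* q = ptr;  // UB already: merely evaluating ptr reads an indeterminate value
    // *ptr = 0;      // UB a fortiori: evaluating ptr is the first step of the dereference
}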
Even if the normal storage of something in memory would have no "room" for any trap bits or trap representations, implementations are not required to store automatic variables the same way as static-duration variables except when there is a possibility that user code might hold a pointer to them somewhere. This behavior is most visible with integer types. On a typical 32-bit system, given the code:
#include <stdint.h>

uint16_t foo(void);
uint16_t bar(void);

uint16_t blah(uint32_t q)
{
    uint16_t a;             /* uninitialized */
    if (q & 1) a = foo();   /* not taken when bit 0 of q is clear */
    if (q & 2) a = bar();   /* not taken when bit 1 of q is clear */
    return a;               /* a may never have been written */
}

unsigned short test(void)
{
    return blah(65540);     /* 65540 == 0x10004: bits 0 and 1 are both clear */
}
it would not be particularly surprising for test to yield 65540 even though that value is outside the representable range of uint16_t, a type which has no trap representations. If a local variable of type uint16_t holds Indeterminate Value, there is no requirement that reading it yield a value within the range of uint16_t. Since unexpected behaviors could result when using even unsigned integers in such fashion, there's no reason to expect that pointers couldn't behave in even worse fashion.
I'm in the middle of a discussion trying to figure out whether unaligned access is allowable in C++ through reinterpret_cast. I think not, but I'm having trouble finding the right part(s) of the standard which confirm or refute that. I have been looking at C++11, but I would be okay with another version if it is more clear.
Unaligned access is undefined in C11. The relevant part of the C11 standard (§ 6.3.2.3, paragraph 7):
A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined.
Since the behavior of an unaligned access is undefined, some compilers (at least GCC) take that to mean that it is okay to generate instructions which require aligned data. Most of the time the code still works for unaligned data because most x86 and ARM instructions these days work with unaligned data, but some don't. In particular, some vector instructions don't, which means that as the compiler gets better at generating optimized instructions, code which worked with older versions of the compiler may not work with newer ones. And, of course, some architectures (like MIPS) don't do as well with unaligned data.
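As an aside, not part of the original question: the usual portable way to sidestep the whole issue is to let memcpy do the potentially unaligned load or store; compilers typically lower it to a single unaligned load/store instruction where that is safe. A sketch:

#include <cstdint>
#include <cstring>

std::uint32_t load_u32(const void* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);  // no alignment requirement on p
    return v;
}

void store_u32(void* p, std::uint32_t v) {
    std::memcpy(p, &v, sizeof v);
}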
C++11 is, of course, more complicated. § 5.2.10, paragraph 7 says:
An object pointer can be explicitly converted to an object pointer of a different type. When a prvalue v of type “pointer to T1” is converted to the type “pointer to cv T2”, the result is static_cast<cv T2*>(static_cast<cv void*>(v)) if both T1 and T2 are standard-layout types (3.9) and the alignment requirements of T2 are no stricter than those of T1, or if either type is void. Converting a prvalue of type “pointer to T1” to the type “pointer to T2” (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value. The result of any other such pointer conversion is unspecified.
Note that the last word is "unspecified", not "undefined". § 1.3.25 defines "unspecified behavior" as:
behavior, for a well-formed program construct and correct data, that depends on the implementation
[Note: The implementation is not required to document which behavior occurs. The range of possible behaviors is usually delineated by this International Standard. — end note]
Unless I'm missing something, the standard doesn't actually delineate the range of possible behaviors in this case, which seems to indicate to me that one very reasonable behavior is that which is implemented for C (at least by GCC): not supporting them. That would mean the compiler is free to assume unaligned accesses do not occur and emit instructions which may not work with unaligned memory, just like it does for C.
The person I'm discussing this with, however, has a different interpretation. They cite § 1.9, paragraph 5:
A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input. However, if any such execution contains an undefined operation, this International Standard places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation).
Since there is no undefined behavior, they argue that the C++ compiler has no right to assume unaligned accesses don't occur.
So, are unaligned accesses through reinterpret_cast safe in C++? Where in the specification (any version) does it say?
Edit: By "access", I mean actually loading and storing. Something like
#include <cstdint>

void unaligned_cp(void* a, void* b) {
    *reinterpret_cast<volatile std::uint32_t*>(a) =
        *reinterpret_cast<volatile std::uint32_t*>(b);
}
How the memory is allocated is actually outside my scope (it is for a library which can be called with data from anywhere), but malloc and an array on the stack are both likely candidates. I don't want to place any restrictions on how the memory is allocated.
Edit 2: Please cite sources (i.e., the C++ standard, section and paragraph) in answers.
Looking at 3.11/1:
Object types have alignment requirements (3.9.1, 3.9.2) which place restrictions on the addresses at which an object of that type may be allocated.
There's some debate in comments about exactly what constitutes allocating an object of a type. However I believe the following argument works regardless of how that discussion is resolved:
Take *reinterpret_cast<uint32_t*>(a) for example. If this expression does not cause UB, then (according to the strict aliasing rule) there must be an object of type uint32_t (or int32_t) at the given location after this statement. Whether the object was already there, or this write created it, does not matter.
According to the above Standard quote, objects with alignment requirement can only exist in a correctly aligned state.
Therefore any attempt to create or write an object that is not correctly aligned causes UB.
EDIT This answers the OP's original question, which was "is accessing a misaligned pointer safe". The OP has since edited their question to "is dereferencing a misaligned pointer safe", a far more practical and less interesting question.
The round-trip cast result of the pointer value is unspecified under those circumstances. Under certain limited circumstances (involving alignment), converting a pointer to A to a pointer to B, and then back again, results in the original pointer, even if you didn't have a B in that location.
If the alignment requirements are not met, then that round trip (pointer-to-A to pointer-to-B back to pointer-to-A) results in a pointer with an unspecified value.
As there are invalid pointer values, dereferencing a pointer with an unspecified value can result in undefined behavior. It is no different than *(int*)0xDEADBEEF in a sense.
Simply storing that pointer is not, however, undefined behavior.
None of the above C++ quotes talk about actually using a pointer-to-A as a pointer-to-B. Using a pointer to the "wrong type" in all but a very limited number of circumstances is undefined behavior, period.
An example of this involves creating a std::aligned_storage_t<sizeof(T), alignof(T)>. You can construct your T in that spot, and it will live there happily, even though it "actually" is an aligned_storage_t<sizeof(T), alignof(T)>. (You may, however, have to use the pointer returned from the placement new for full standard compliance; I am uncertain. See strict aliasing.)
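A sketch of that pattern (using the pointer returned by placement new throughout, per the caveat above):

#include <new>
#include <type_traits>

struct T { int x; };

int main() {
    std::aligned_storage_t<sizeof(T), alignof(T)> storage;
    T* t = new (&storage) T{42};  // a T now lives inside storage
    int v = t->x;                 // fine: accessed through the pointer from new
    t->~T();                      // end the T's lifetime before storage goes away
    return v;
}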
Sadly, the standard is a bit lacking in terms of what object lifetime is. It refers to it, but does not define it well enough last I checked. You can only use a T at a particular location while a T lives there, but what that means is not made clear in all circumstances.
All of your quotes are about the pointer value, not the act of dereferencing.
5.2.10, paragraph 7 says that, assuming int has a stricter alignment than char, then the round trip of char* to int* to char* generates an unspecified value for the resulting char*.
On the other hand, if you convert int* to char* to int*, you are guaranteed to get back the exact same pointer as you started with.
It doesn't talk about what you get when you dereference said pointer. It simply states that in one case, you must be able to round trip. It washes its hands of the other way around.
Suppose you have some ints, and alignof(int) > 1:
int some_ints[3] = {0};
then you have an int pointer that is offset:
int* some_ptr = (int*)(((char*)&some_ints[0])+1);
We'll presume that copying this misaligned pointer doesn't cause undefined behavior for now.
The value of some_ptr is not specified by the standard. We'll be generous and presume it actually points to some chunk of bytes within some_ints.
Now we have an int* that points to somewhere an int cannot be allocated (3.11/1). Under (3.8) the use of a pointer to an int is restricted in a number of ways. Usual use is restricted to a pointer to a T that has been allocated properly and whose lifetime has begun (/3). Some limited use is permitted on a pointer to a T which has been allocated properly but whose lifetime has not begun (/5 and /6).
There is no way to create an int object that does not obey the alignment restrictions of int in the standard.
So the theoretical int* which claims to point to a misaligned int does not point to an int. Nothing defines the behavior of such a pointer when dereferenced; the usual dereferencing rules describe the behavior of a valid pointer to an object (including an int), and this is not one.
And now our other assumption: the standard places no restrictions on the value of some_ptr here: int* some_ptr = (int*)(((char*)&some_ints[0])+1);.
It is not a pointer to an int, much like (int*)nullptr is not a pointer to an int. Round-tripping it back to a char* results in a pointer with an unspecified value (it could be 0xbaadf00d or nullptr) explicitly in the standard.
The standard defines what you must do. There are (nearly? I suppose evaluating it in a boolean context must return a bool) no requirements placed on the behavior of some_ptr by the standard, other than that converting it back to char* results in an unspecified (pointer) value.