According to cppreference's section on core constant expressions, point 19), a subtraction operator between two pointers is not a legal constant expression until C++14. Can I assume that the following code is legal C++17, or is this interpretation an abuse?
int X, Y;

template <long long V>
struct S { };

int main() {
    S<&X - &Y> s;
    (void)s;
}
The question is moot. Pointer arithmetic is only defined on pointers belonging to the same array (or to the same object), which is certainly not the case here. So the code above is not legal C++, and in fact it fails to compile with the compilers available to me.
The quoted cppref article says
A core constant expression is any expression that does not have any one of the following...

7) An expression whose evaluation leads to any form of core language undefined behavior (including signed integer overflow, division by zero, pointer arithmetic outside array bounds, etc). Whether standard library undefined behavior is detected is unspecified. (since C++17)

19) a subtraction operator between two pointers (until C++14)
Likely only pointer arithmetic within array bounds was 'legalized' in C++14, not all pointer arithmetic.
Actually, a demo shows that in-array pointer arithmetic compiles fine even with C++11 (though not C++98).
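For instance, a minimal sketch (with hypothetical names) of the well-defined in-array case:

constexpr int arr[10] = {};
constexpr auto diff = &arr[7] - &arr[2]; // OK: both pointers are into the same array

template <long long V>
struct W { };

W<diff> w; // accepted, unlike S<&X - &Y> above

Here the subtraction is pointer arithmetic inside array bounds, so compilers accept it as a constant expression even in C++11 mode.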
Related
While reading the comments of a C++ Weekly video about the constexpr new support in C++20, I found a comment alleging that C++20 allows UB in a constexpr context.
At first I was convinced the comment was right, but the more I thought about it, the more I began to suspect that the C++20 wording contains some clever language that makes this defined behavior.
Either that all transient allocations return unique addresses, or maybe some more general notion in C++ that makes two distinct allocation pointers always compare unequal (even in a non-constexpr context), even though at runtime the allocator could in reality give you back the same address (since you deleted the first allocation).
As a bonus weirdness: you can only use == for the comparison; < and > fail...
Here is the program with alleged UB in constexpr:
#include <iostream>

static constexpr bool f()
{
    auto p = new int(1);
    delete p;
    auto q = new int(2);
    delete q;
    return p == q;
}

int main()
{
    constexpr bool res1 = f();
    std::cout << res1 << std::endl; // May output 0 or 1
}
godbolt
The result here is implementation-defined. res1 could be false, true, or ill-formed, based on how the implementation wants to define it. And this is just as true for equality comparison as it is for relational comparison.
Both [expr.eq] (for equality) and [expr.rel] (for relational) start by doing an lvalue-to-rvalue conversion on the pointers (because we have to actually read what the value is to do a comparison). [conv.lval]/3 says that the result of that conversion is:
Otherwise, if the object to which the glvalue refers contains an invalid pointer value ([basic.stc.dynamic.deallocation], [basic.stc.dynamic.safety]), the behavior is implementation-defined.
That is the case here: both pointers contain an invalid pointer value, as per [basic.stc.general]/4:
When the end of the duration of a region of storage is reached, the values of all pointers representing the address of any part of that region of storage become invalid pointer values. Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior.
with a footnote reading:
Some implementations might define that copying an invalid pointer value causes a system-generated runtime fault.
So the value we get out of the lvalue-to-rvalue conversion is... implementation-defined. It could be implementation-defined in a way that causes those two pointers to compare equal. It could be implementation-defined in a way that causes those two pointers to compare not equal (as apparently all implementations do). Or it could even be implementation-defined in a way that causes the comparison between those two pointers to be unspecified or undefined behavior.
Notably, [expr.const]/5 (the main rule governing constant expressions), despite rejecting undefined behavior and explicitly rejecting any comparison whose result is unspecified ([expr.const]/5.23), says nothing about a comparison whose result is implementation-defined.
There's no undefined behavior here. Anything goes. Which is admittedly very weird during constant evaluation, where we'd expect to see a stricter set of rules.
Notably, with p < q, it appears that gcc and clang reject the comparison as being not a constant expression (which is... an allowed result) while msvc considers both p < q and p > q to be constant expressions whose value is false (which is... also an allowed result).
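For reference, a minimal sketch of that relational variant (identical to f() above except for the operator; the names are mine):

static constexpr bool g()
{
    auto p = new int(1);
    delete p;
    auto q = new int(2);
    delete q;
    return p < q; // comparison of invalid pointer values
}

// constexpr bool res2 = g(); // rejected by gcc/clang, accepted (as false) by msvc

Both outcomes are permitted, since the result of the lvalue-to-rvalue conversion on the invalid pointers is implementation-defined.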
I am looking for a way to map types to numeric values at compile time, ideally without using a hash as proposed in this answer.
Since pointers can be constexpr, I tried this:
struct Base{};
template<typename T> struct instance : public Base{};
template<typename T>
constexpr auto type_instance = instance<T>{};
template<typename T>
constexpr const Base* type_pointer = &type_instance<T>;
constexpr auto x = type_pointer<int> - type_pointer<float>; // not a constant expression
Both gcc and clang reject this code because type_pointer<int> - type_pointer<float> is not a constant expression, see here, for instance.
Why though?
I can understand that the difference between both values is not going to be stable from one compilation to the next, but within one compilation, it should be constexpr, IMHO.
Subtraction of two non-null pointers which do not point into the same array or to the same object (including one past the array/object) is undefined behavior, see [expr.add] (in particular paragraphs 5 and 7) of the C++17 standard (final draft).
Expressions which would have core undefined behavior[1] if evaluated are never constant expressions, see [expr.const]/2.6.
Therefore type_pointer<int> - type_pointer<float> cannot be a constant expression, because the two pointers are to unrelated objects.
Since type_pointer<int> - type_pointer<float> is not a constant expression, it cannot be used to initialize a constexpr variable such as
constexpr auto x = type_pointer<int> - type_pointer<float>;
Trying to use a non-constant expression as the initializer of a constexpr variable makes the program ill-formed and requires the compiler to print a diagnostic message. That is the error message you are seeing.
Basically compilers are required to diagnose core undefined behavior when it appears in purely compile-time contexts.
You can see that there will be no error if the pointers are to the same object, e.g.:
constexpr auto x = type_pointer<int> - type_pointer<int>;
Here the subtraction is well-defined and the initializer is a constant expression. So the code will compile (and won't have undefined behavior). x will have a well-defined value of 0.
Be aware that if you make x non-constexpr the compiler won't be required to diagnose the undefined behavior and to print a diagnostic message anymore. It is therefore likely to compile.
Subtracting unrelated pointers is still undefined behavior though, not merely unspecified behavior. Therefore you lose any guarantee on what the resulting program will do. It does not just mean that you will get different values for x in each compilation/execution of the code.
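As a minimal sketch of that last point (reusing the question's declarations, with a hypothetical name):

// Compiles without a required diagnostic, but evaluating the initializer is UB:
const auto y = type_pointer<int> - type_pointer<float>;

The compiler may accept this without complaint, yet the program has no guaranteed behavior once the subtraction is evaluated.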
[1] Core undefined behavior here refers to undefined behavior in the core language, in contrast with undefined behavior due to use of the standard library. It is unspecified whether undefined behavior as specified for the library causes an (otherwise constant) expression to not be a constant expression, see final sentence before the example in [expr.const]/2.
First of all, I've seen this question about C99, and the accepted answer references the "operand is not evaluated" wording in the C99 Standard draft. I'm not sure this answer applies to C++03. There's also this question about C++ that has an accepted answer citing similar wording, and also the "In some contexts, unevaluated operands appear. An unevaluated operand is not evaluated." wording.
I have this code:
int* ptr = 0;
void* buffer = malloc( 10 * sizeof( *ptr ) );
The question is - is there a null pointer dereference (and so UB) inside sizeof()?
C++03 5.3.3/1 says: "The sizeof operator yields the number of bytes in the object representation of its operand. The operand is either an expression, which is not evaluated, or a parenthesized type-id."
The linked to answers cite this or similar wording and make use of "is not evaluated" part to deduce there's no UB.
However I cannot find where exactly the Standard links evaluation to having or not having UB in this case.
Does "not evaluating" the expression to which sizeof is applied make it legal to dereference a null or invalid pointer inside sizeof in C++?
I believe this is currently underspecified in the standard, like many issues such as What is the value category of the operands of C++ operators when unspecified?. I don't think it was intentional; as hvd points out, it is probably obvious to the committee.
In this specific case I think we have the evidence to show what the intention was. It comes from the GB 91 comment from the Rapperswil meeting, which says:
It is mildly distasteful to dereference a null pointer as part of our specification, as we are playing on the edges of undefined behaviour. With the addition of the declval function template, already used in these same expressions, this is no longer necessary.
and suggested an alternate expression. It refers to this expression, which is no longer in the standard but can be found in N3090:
noexcept(*(U*)0 = declval<U>())
The suggestion was rejected, since the original does not invoke undefined behavior because it is unevaluated:
There is no undefined behavior because the expression is an unevaluated operand. It's not at all clear that the proposed change would be clearer.
This rationale applies to sizeof as well, since its operands are unevaluated.
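A minimal sketch of that rationale applied directly to sizeof (a self-contained variant of the question's snippet):

#include <cstddef>

int* ptr = 0;
std::size_t n = sizeof(*ptr); // fine: *ptr is an unevaluated operand, n == sizeof(int)

The dereference only supplies the type int; no lvalue-to-rvalue conversion ever takes place.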
I say underspecified but I wonder if this is covered by section 4.1 [conv.lval] which says:
The value contained in the object indicated by the lvalue is the rvalue result. When an lvalue-to-rvalue conversion occurs
within the operand of sizeof (5.3.3) the value contained in the referenced object is not accessed, since that operator
does not evaluate its operand.
It says the value contained is not accessed, which if we follow the logic of issue 232 means there is no undefined behavior:
In other words, it is only the act of "fetching", of lvalue-to-rvalue conversion, that triggers the ill-formed or undefined behavior
This is somewhat speculative since the issue is not settled yet.
Since you explicitly asked for standard references - [expr.sizeof]/1:
The operand is either an expression, which is an unevaluated operand
(Clause 5), or a parenthesized type-id.
[expr]/8:
In some contexts, unevaluated operands appear (5.2.8, 5.3.3, 5.3.7,
7.1.6.2). An unevaluated operand is not evaluated.
Because the expression (i.e. the dereference) is never evaluated, it is not subject to some constraints that it would otherwise violate. Only the type is inspected. In fact, the standard uses null pointer dereferences itself in an example in [dcl.fct]/12:
A trailing-return-type is most useful for a type that would be more
complicated to specify before the
declarator-id:
template <class T, class U> auto add(T t, U u) -> decltype(t + u);
rather than
template <class T, class U> decltype((*(T*)0) + (*(U*)0)) add(T t, U u);
— end note]
The specification only says that dereferencing a pointer that is NULL is UB. Since sizeof() is not a real function, and it doesn't actually use the argument for anything other than getting its type, it never dereferences the pointer. That's WHY it works. Someone else can get all the points for looking up the spec wording that states that "the argument to sizeof doesn't get referenced".
Note that it's also entirely legal to do int arr[2]; size_t s = sizeof(arr[-111100000]); - it doesn't matter what the index is, because sizeof never actually "does anything" to the argument passed.
Another example to show how it's "not doing anything" would be something like this:
int func()
{
    int *ptr = reinterpret_cast<int*>(32);
    *ptr = 7;
    return 42;
}

size_t size = sizeof(func());
Again, this wouldn't crash, because func() is never called; the compiler just resolves the expression to the type that func() produces.
Equally, if sizeof actually "did something" with the argument, what would happen when you do this:
size_t s = sizeof(char[10000000000]);
Would it create a 10000000000-byte array on the stack, then give the size back after the code crashed because there aren't that many megabytes of stack? [On some systems, stack size is counted in bytes, not megabytes.] And while nobody writes code quite like that, you could easily come up with something similar using a typedef of a buffer_type as an array of char, or some kind of struct with large contents.
While checking the references for another question, I noticed an odd clause in C++11, at [expr.rel] ¶3:
Pointers to void (after pointer conversions) can be compared, with a result defined as follows: If both
pointers represent the same address or are both the null pointer value, the result is true if the operator is
<= or >= and false otherwise; otherwise the result is unspecified.
This seems to mean that, once two pointers have been cast to void *, their ordering relation is no longer guaranteed; for example, this:
int foo[] = {1, 2, 3, 4, 5};
void *a = &foo[0];
void *b = &foo[1];
std::cout<<(a < b);
would seem to be unspecified.
Interestingly, this clause wasn't there in C++03 and disappeared in C++14, so if we take the example above and apply the C++14 wording to it, I'd say that ¶3.1
If two pointers point to different elements of the same array, or to subobjects thereof, the pointer to the element with the higher subscript compares greater.
would apply, as a and b point to elements of the same array, even though they have been cast to void *. Notice that the wording of ¶3.1 was pretty much the same in C++11, but seemed to be overridden by the void * clause.
Am I right in my understanding? What was the point of that oddball clause added in C++11 and immediately removed? Or maybe it's still there, but moved to/implied by some other part of the standard?
TL;DR:
in C++98/03 the clause was not present, and the standard did not specify relational operators for void pointers (core issue 879, see end of this post);
the odd clause about comparing void pointers was added in C++11 to resolve it, but this in turn gave rise to two other core issues 583 & 1512 (see below);
the resolution of these issues required the clause to be removed and be replaced with the wording found in C++14 standard, which allows for "normal" void * comparison.
Core Issue 583: Relational pointer comparisons against the null pointer constant
Section: 8.9 [expr.rel]
In C, this is ill-formed (cf C99 6.5.8):
void f(char* s) {
    if (s < 0) { }
}

...but in C++, it's not. Why? Who would ever need to write (s > 0) when they could just as well write (s != 0)?
This has been in the language since the ARM (and possibly earlier);
apparently it's because the pointer conversions (7.11 [conv.ptr]) need
to be performed on both operands whenever one of the operands is of
pointer type. So it looks like the "null-ptr-to-real-pointer-type"
conversion is hitching a ride with the other pointer conversions.
Proposed resolution (April, 2013):
This issue is resolved by the resolution of issue 1512.
Core Issue 1512: Pointer comparison vs qualification conversions
Section: 8.9 [expr.rel]
According to 8.9 [expr.rel] paragraph 2, describing pointer
comparisons,
Pointer conversions (7.11 [conv.ptr]) and qualification conversions (7.5 [conv.qual]) are performed on pointer operands (or on a pointer operand and a null pointer constant, or on two null pointer constants, at least one of which is non-integral) to bring them to their composite pointer type.

This would appear to make the following example ill-formed,

bool foo(int** x, const int** y) {
    return x < y; // valid ?
}

because int** cannot be converted to const int**, according to the rules of 7.5 [conv.qual] paragraph 4.
This seems too strict for pointer comparison, and current
implementations accept the example.
Proposed resolution (November, 2012):
Relevant excerpts from resolution of the above issues are found in the paper: Pointer comparison vs qualification conversions (revision 3).
The following also resolves core issue 583.
Change in 5.9 expr.rel paragraphs 1 to 5:
In this section the following statement (the odd clause in C++11) has been expunged:
Pointers to void (after pointer conversions) can be compared, with a result defined as follows: If both pointers represent the same address or are both the null pointer value, the result is true if the operator is <= or >= and false otherwise; otherwise the result is unspecified
And the following statements have been added:
If two pointers point to different elements of the same array, or to subobjects thereof, the pointer to the element with the higher subscript compares greater.
If one pointer points to an element of an array, or to a subobject thereof, and another pointer points one past the last element of the array, the latter pointer compares greater.
So in the final working draft of C++14 (n4140) section [expr.rel]/3, the above statements are found as they were stated at the time of the resolution.
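So under the C++14 wording, the question's example is fully specified; here it is as a minimal complete program:

#include <iostream>

int main() {
    int foo[] = {1, 2, 3, 4, 5};
    void* a = &foo[0];
    void* b = &foo[1];
    std::cout << (a < b) << '\n'; // 1: same array, higher subscript compares greater
}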
Digging for the reason why this odd clause was added led me to a much earlier issue 879: Missing built-in comparison operators for pointer types.
The proposed resolution of this issue (in July, 2009) led to the addition of this clause which was voted into WP in October, 2009.
And that is how it came to be included in the C++11 standard.
What is the unary-& doing here?
int * a = 1990;
int result = &5[a];
If you were to print result you would get the value 2010.
You have to compile it with -fpermissive or it will stop due to errors.
In C, x [y] and y [x] are identical. So &5[a] is the same as &a[5].
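A minimal sketch of that identity, using a valid pointer so everything is well-defined:

#include <assert.h>

int main(void) {
    int a[6] = {0, 1, 2, 3, 4, 5};
    assert(&5[a] == &a[5]); // E1[E2] is defined as *((E1) + (E2)), and + commutes
    assert(5[a] == *(a + 5));
    return 0;
}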
&5[a] is the same as &a[5] and the same as a + 5. In your case it's undefined behavior because a points to nowhere.
C11 standard chapter 6.5.6 Additive operators/8 (the same in C++):
If both the pointer
operand and the result point to elements of the same array object, or one past the last
element of the array object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined.
"...unary & on numeric literal"?
Postfix operators in C always have higher priority than prefix ones. In the case of &5[a], the [] has higher priority than the &. This means that in &5[a] the unary & is not applied to a "numeric literal", as you seem to incorrectly believe. It is applied to the entire 5[a] subexpression. I.e. &5[a] is equivalent to &(5[a]).
As for what 5[a] means - this is a beaten-to-death FAQ. Look it up.
And no, you don't have "to compile it with -fpermissive" (my compiler tells me it doesn't even know what -fpermissive is). You have to figure out that this
int * a = 1990;
is not legal code in either C or C++. If anything, it requires an explicit cast
int * a = (int *) 1990;
not some obscure switch of whatever specific compiler you happen to be using at the moment. The same applies to the other illegal initialization, int result = &5[a].
Finally, even if we overlook the illegal code and the undefined behavior triggered by that 5[a], the behavior of this code will still be highly implementation-dependent. I.e. the answer is no, in the general case you will not get 2010 in result (the observed 2010 is just 1990 + 5 * sizeof(int) on an implementation with 4-byte int).
You cannot apply the unary & operator to an integer literal, because a literal is not an lvalue.
Due to operator precedence, your code doesn't do that. Since the indexing operator [] binds more tightly than unary &, &5[a] is equivalent to &(5[a]).
Here's a program similar to yours, except that it's valid code, not requiring -fpermissive to compile:
#include <stdio.h>

int main(void) {
    int arr[6];
    int *ptr1 = arr;
    int *ptr2 = &5[ptr1];
    printf("%p %p\n", (void *)ptr1, (void *)ptr2);
}
As explained in this question and my answer, the indexing operator is commutative (because it's defined in terms of addition, and addition is commutative), so 5[a] is equivalent to a[5]. So the expression &5[ptr1] computes the address of element 5 of arr.
In your program:
int * a = 1990;
int result = &5[a];
the initialization of a is invalid because a is of type int* and 1990 is of type int, and there is no implicit conversion from int to int*. Likewise, the initialization of result is invalid because &5[a] is of type int*. Apparently -fpermissive causes the compiler to violate the rules of the language and permit these invalid implicit conversions.
At least in the version of gcc I'm using, the -fpermissive option is valid only for C++ and Objective-C, not for C. In C, gcc permits such implicit conversions (with a warning) anyway. I strongly recommend not using this option. (Your question is tagged both C and C++. Keep in mind that C and C++ are two distinct, though closely related, languages. They happen to behave similarly in this case, but it's usually best to pick one language or the other.)